Failover

Failover routing enhances reliability by automatically switching to an alternate AI model if the primary model becomes unresponsive or encounters an error. This strategy ensures continuous service availability without manual intervention.
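The idea can be sketched in a few lines of Python. This is only an illustration of ordered fallback, not the gateway's implementation: the function `call_with_failover`, the `invoke` callback, and the model names `primary-model` and `fallback-model` are all hypothetical.

```python
from typing import Callable, Sequence


class AllModelsFailed(Exception):
    """Raised when the primary model and every fallback model fail."""


def call_with_failover(
    prompt: str,
    models: Sequence[str],
    invoke: Callable[[str, str], str],
) -> tuple[str, str]:
    """Try each model in order and return (model_used, response)."""
    errors = []
    for model in models:
        try:
            return model, invoke(model, prompt)
        except Exception as exc:  # in this sketch, any error triggers failover
            errors.append(f"{model}: {exc}")
    raise AllModelsFailed("; ".join(errors))


def fake_invoke(model: str, prompt: str) -> str:
    """Hypothetical backend in which the primary model never responds."""
    if model == "primary-model":
        raise TimeoutError("no response")
    return f"{model} answered: {prompt}"


used, reply = call_with_failover(
    "hello", ["primary-model", "fallback-model"], fake_invoke
)
print(used, "->", reply)  # fallback-model -> fallback-model answered: hello
```

The caller sees a normal response even though the first model failed, which is the behavior the policy aims to provide without manual intervention.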

Tip

You can configure more than one Failover policy.

Configure Failover

To configure failover for your AI API, attach the Model Failover policy by following these steps:

Log in to the Publisher Portal (https://<hostname>:9443/publisher).
Select the AI API for which you want to configure failover.
Navigate to API Configurations, and click Policies.
Find the Model Failover policy under the Common Policies section of the policy list, then drag and drop it onto the Request flow of the /chat/completions POST operation.

Fill in the requested details and click Save.

Section	Description
Production/Sandbox	Select the target model and target endpoint. If a request to this combination fails, fallback is triggered. Add any number of fallback models by clicking the Add Fallback Model button. For each fallback model, select a model and an endpoint from the respective dropdowns.
Request Timeout	Maximum time, in seconds, to wait for a response to a request.
Suspend Duration	Time, in seconds, for which a failed model-endpoint pair is suspended. If not configured, information about failed invocations is not persisted.

AWS Bedrock Configuration

If you are configuring a multi-model provider service, you must select both the Provider (model family) and the Model for each target and fallback model entry. The Provider dropdown lists the model families you have set up in the Admin Portal (such as Meta, Anthropic, and DeepSeek). Once a provider is selected, the Model dropdown displays the specific models available under that provider.

Finally, scroll to the bottom of the page and click Save and deploy.
For more information on how to work with API Policies, refer to the API Policies section.
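To make the Suspend Duration setting from the table above more concrete, here is a minimal Python sketch of one way a failed model-endpoint pair could be skipped for a fixed period. It is an interpretation of the description, not the gateway's actual code: the `SuspensionTracker` class, its methods, and the names `model-a` and `endpoint-1` are hypothetical, and the clock is injectable so the example runs without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Pair = Tuple[str, str]  # (model, endpoint)


class SuspensionTracker:
    """Remembers failed model-endpoint pairs for a fixed number of seconds."""

    def __init__(
        self,
        suspend_seconds: Optional[float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.suspend_seconds = suspend_seconds
        self.clock = clock
        self._suspended_until: Dict[Pair, float] = {}

    def record_failure(self, pair: Pair) -> None:
        # Assumption: with no duration configured, failures are not remembered.
        if self.suspend_seconds:
            self._suspended_until[pair] = self.clock() + self.suspend_seconds

    def is_available(self, pair: Pair) -> bool:
        until = self._suspended_until.get(pair)
        if until is None:
            return True
        if self.clock() >= until:
            # The suspension has expired, so the pair can be tried again.
            del self._suspended_until[pair]
            return True
        return False


# Usage with a fake clock that the example advances by hand.
now = [0.0]
tracker = SuspensionTracker(suspend_seconds=30, clock=lambda: now[0])
pair = ("model-a", "endpoint-1")

tracker.record_failure(pair)
suspended = not tracker.is_available(pair)  # still inside the 30 s window
now[0] = 31.0
recovered = tracker.is_available(pair)      # window has elapsed
print(suspended, recovered)  # True True
```

With `suspend_seconds=None`, `record_failure` stores nothing, so every request starts again from the primary target, mirroring the "not persisted" behavior described for an unconfigured Suspend Duration.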