You can configure failover by listing multiple models in a request. The gateway tries each model in order until one succeeds. Combine this with an access key configuration that allows every provider you list. Your app sends only the access key; the AI Gateway routes to each provider using the credentials from that configuration.

Basic example

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.ngrok.ai/v1",
    api_key="ng-xxxxx-g1-xxxxx",  # gateway access key, not a provider key
)

response = client.chat.completions.create(
    model="gpt-4o",  # primary model, tried first
    extra_body={
        # fallback models, tried in order if the primary fails
        "models": ["anthropic:claude-sonnet-4-6"],
    },
    messages=[{"role": "user", "content": "Hello!"}],
)
Or in raw JSON:
{
  "model": "gpt-4o",
  "models": ["anthropic:claude-sonnet-4-6"],
  "messages": [{"role": "user", "content": "Hello!"}]
}
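
If you assemble the request body yourself, a small helper can keep the primary model and the fallback list together. The sketch below is illustrative only: `build_failover_request` is a hypothetical name, not part of the OpenAI SDK or the gateway, and it only produces the JSON body shown above without sending it.

```python
import json


def build_failover_request(primary, fallbacks, prompt):
    """Build a chat completion body with a primary model and ordered fallbacks.

    Hypothetical helper for illustration; not part of any SDK.
    """
    return {
        "model": primary,            # tried first
        "models": list(fallbacks),   # tried in order after the primary
        "messages": [{"role": "user", "content": prompt}],
    }


body = build_failover_request("gpt-4o", ["anthropic:claude-sonnet-4-6"], "Hello!")
print(json.dumps(body, indent=2))
```

Keeping the fallbacks as an ordered list mirrors how the request describes them: position in `models` is the order in which they are attempted.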

How it works

  1. Request arrives with your access key
  2. Gateway tries gpt-4o (OpenAI)
  3. On failure, tries anthropic:claude-sonnet-4-6
  4. Returns the first successful response
Both providers must be allowed by the access key’s configuration, with routing rules that supply credentials (ngrok.ai inference or provider keys).
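
The ordering described above can be modeled in a few lines. This is a sketch of the documented behavior, not the gateway's implementation: `ProviderError`, `first_successful`, and `fake_call` are hypothetical names, the upstream call is simulated, and which failures actually trigger failover is not specified here.

```python
class ProviderError(Exception):
    """Stand-in for any failed upstream call (hypothetical)."""


def first_successful(models, call):
    """Try each model in order; return (model, response) for the first success."""
    errors = {}
    for model in models:
        try:
            return model, call(model)
        except ProviderError as exc:
            errors[model] = str(exc)  # remember why this model failed
    raise ProviderError(f"all models failed: {errors}")


def fake_call(model):
    # Simulated upstream: pretend the primary is down and the fallback is healthy.
    if model == "gpt-4o":
        raise ProviderError("simulated outage")
    return f"response from {model}"


used, response = first_successful(
    ["gpt-4o", "anthropic:claude-sonnet-4-6"], fake_call
)
print(used, "->", response)
```

In this simulation the caller receives a single response and never sees the failed attempt, which matches step 4.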

Configuration example

Allow OpenAI and Anthropic on one key:
{
  "name": "Multi-provider production",
  "access": {
    "providers": { "allow": ["openai", "anthropic"] }
  },
  "router": {
    "rules": [
      { "provider": "openai", "steps": [{ "type": "ngrok" }] },
      { "provider": "anthropic", "steps": [{ "type": "ngrok" }] }
    ]
  }
}
Use provider keys instead of ngrok steps when billing should go to your provider accounts.
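
Before sending a failover request, it can help to check locally that every model's provider is both allowed and covered by a routing rule. The sketch below assumes the configuration shape shown above. `provider_of` and `missing_providers` are hypothetical helpers, and attributing an unprefixed name such as `gpt-4o` to `openai` is an assumption made for this example, not a documented resolution rule.

```python
config = {
    "name": "Multi-provider production",
    "access": {"providers": {"allow": ["openai", "anthropic"]}},
    "router": {
        "rules": [
            {"provider": "openai", "steps": [{"type": "ngrok"}]},
            {"provider": "anthropic", "steps": [{"type": "ngrok"}]},
        ]
    },
}


def provider_of(model, unprefixed_default="openai"):
    # Assumption: "provider:model" names carry the provider as a prefix;
    # bare names are attributed to unprefixed_default in this sketch.
    prefix, sep, _ = model.partition(":")
    return prefix if sep else unprefixed_default


def missing_providers(config, models):
    """Return providers that are not allowed or have no routing rule."""
    allowed = set(config["access"]["providers"]["allow"])
    routed = {rule["provider"] for rule in config["router"]["rules"]}
    needed = {provider_of(m) for m in models}
    return sorted(needed - (allowed & routed))


print(missing_providers(config, ["gpt-4o", "anthropic:claude-sonnet-4-6"]))
```

An empty list means every model in the request maps to a provider the key allows and routes.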

Next steps