# Handle gateway errors

> Understand failover behavior and error recovery in the AI Gateway.

When a request to a provider fails, the AI Gateway automatically attempts failover to the next available candidate. This page explains how failover works and how to configure error behavior.

## Automatic failover

The gateway automatically tries the next candidate when a request fails due to:

* **Timeouts** - Request exceeded `per_request_timeout`
* **HTTP errors** - Any 4xx or 5xx response from providers
* **Connection errors** - Network failures, DNS issues, TLS errors

The gateway never retries the same model/key combination. It always moves to the next candidate.

## Failover order

When a request fails, the gateway follows this order:

### 1. Try another API key

If multiple API keys are configured for the current model's provider, the gateway tries the next key:

```yaml
providers:
  - id: "openai"
    api_keys:
      - value: ${secrets.get('openai', 'key-one')}   # Try first
      - value: ${secrets.get('openai', 'key-two')}   # Try if first fails
```

### 2. Try another model

After exhausting all keys for a model, the gateway moves to the next model candidate. Candidates come from:

* The client's `models` array in the request body
* Model selection strategies that return multiple models

```json
{
  "model": "gpt-4o",
  "models": ["anthropic:claude-3-5-sonnet-20241022", "google:gemini-2.0-flash"],
  "messages": [{"role": "user", "content": "Hello"}]
}
```

The `model` field is tried first, then entries in `models` as fallbacks.
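As a rough illustration of how a client might assemble such a body, the sketch below builds the same JSON in Python. The helper name `build_chat_request` is hypothetical and not part of any ngrok SDK; the sketch only constructs and prints the payload, since the endpoint URL and authentication depend on your own gateway setup.

```python
import json


def build_chat_request(primary_model, fallback_models, user_message):
    """Build a request body with a primary model and fallback models.

    Hypothetical helper for illustration; field names follow the JSON
    example above.
    """
    return {
        "model": primary_model,           # tried first
        "models": list(fallback_models),  # fallbacks if the primary fails
        "messages": [{"role": "user", "content": user_message}],
    }


body = build_chat_request(
    "gpt-4o",
    ["anthropic:claude-3-5-sonnet-20241022", "google:gemini-2.0-flash"],
    "Hello",
)
print(json.dumps(body, indent=2))
```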

<Note>
  Cross-provider failover requires the client to specify models from different providers, or model selection strategies that return candidates from multiple providers.
</Note>
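The failover order described above can be pictured as a nested loop: every key for the current model, then the next model. The following Python sketch is a conceptual model only, not the gateway's implementation. The names `candidate_order` and `first_success`, the `key-a` key, and the explicit `(provider, model)` pairs are assumptions made for illustration.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Candidate = Tuple[str, str, str]  # (provider, model, api_key_name)


def candidate_order(
    models: List[Tuple[str, str]],
    keys_by_provider: Dict[str, List[str]],
) -> Iterator[Candidate]:
    """Yield each model/key combination once: all keys for a model, then the next model."""
    for provider, model in models:
        for key in keys_by_provider.get(provider, []):
            yield (provider, model, key)


def first_success(
    candidates: Iterator[Candidate],
    attempt: Callable[[Candidate], bool],
) -> Tuple[Optional[Candidate], List[Candidate]]:
    """Try candidates in order and stop at the first one that succeeds."""
    tried: List[Candidate] = []
    for candidate in candidates:
        tried.append(candidate)  # a combination is never attempted twice
        if attempt(candidate):
            return candidate, tried
    return None, tried  # every candidate failed


# Hypothetical configuration: two OpenAI keys, one Anthropic key.
models = [("openai", "gpt-4o"), ("anthropic", "claude-3-5-sonnet-20241022")]
keys = {"openai": ["key-one", "key-two"], "anthropic": ["key-a"]}

# Pretend every OpenAI attempt fails (timeout, HTTP error, or connection error).
winner, tried = first_success(candidate_order(models, keys), lambda c: c[0] != "openai")
print(winner)
print(len(tried))
```

In this simulated run, both OpenAI keys are tried before the Anthropic candidate succeeds on the third attempt.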

## Error behavior configuration

### `on_error: "halt"` (default)

Stop processing and return the error to the client:

```yaml
on_http_request:
  - type: ai-gateway
    config:
      on_error: "halt"
```

### `on_error: "continue"`

Continue to the next action in the Traffic Policy, allowing custom error handling:

```yaml
on_http_request:
  - type: ai-gateway
    config:
      on_error: "continue"
  - type: custom-response
    config:
      status_code: 503
      body: "AI service temporarily unavailable"
```

When using `on_error: "continue"`, you can inspect the error details using [action result variables](/ai-gateway/guides/debugging#legacy-endpoint-setup).

## Timeout configuration

Control failover timing with these settings:

```yaml
on_http_request:
  - type: ai-gateway
    config:
      per_request_timeout: "30s"   # Each provider attempt
      total_timeout: "5m"          # All attempts combined
```

| Setting               | Default | Description                                     |
| --------------------- | ------- | ----------------------------------------------- |
| `per_request_timeout` | `30s`   | Maximum time for a single provider attempt      |
| `total_timeout`       | `5m`    | Maximum time for all failover attempts combined |

If `total_timeout` is reached, failover stops immediately even if more candidates remain.
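To see how the two settings interact, the sketch below simulates a sequence of attempts against both limits. It is a simplified model, not the gateway's actual scheduler: the function name `simulate_failover` is hypothetical, overhead between attempts is ignored, and it assumes an in-flight attempt is cut off by whichever limit is reached first.

```python
def simulate_failover(attempt_durations, per_request_timeout=30.0, total_timeout=300.0):
    """Return (attempts_started, elapsed_seconds) for a sequence of attempts.

    attempt_durations: how long each provider attempt would run if never cut off.
    Defaults mirror the documented 30s and 5m settings.
    """
    elapsed = 0.0
    attempts = 0
    for duration in attempt_durations:
        if elapsed >= total_timeout:
            break  # total budget spent: stop even if candidates remain
        remaining = total_timeout - elapsed
        # Assumption: an attempt ends at the earliest of its own completion,
        # the per-request timeout, or the remaining total budget.
        elapsed += min(duration, per_request_timeout, remaining)
        attempts += 1
    return attempts, elapsed


# Twelve candidates that would each hang for 60 seconds.
attempts, elapsed = simulate_failover([60.0] * 12)
print(attempts, elapsed)
```

Under these assumptions and the default settings, only ten of the twelve hanging candidates are attempted before the five-minute budget is spent, so a long candidate list may call for a shorter `per_request_timeout`.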

## Errors that skip failover

These errors return immediately without attempting failover:

| Error                 | Description                                                    |
| --------------------- | -------------------------------------------------------------- |
| Invalid request body  | Request JSON could not be parsed                               |
| No models available   | No models matched the gateway configuration and client request |
| Model selection empty | All model selection strategies returned empty results          |
| Configuration errors  | Invalid provider or model configuration                        |

Once failover begins, every provider error (including 4xx responses) triggers the next candidate until all candidates are exhausted.

<Note>
  Token limit and API key errors for a specific model trigger failover to the next model, not immediate failure.
</Note>

## Best practices

1. **Configure multiple API keys** per provider for key-level failover
2. **Use the `models` array** in client requests for cross-provider failover
3. **Set appropriate timeouts** based on your latency requirements
4. **Use `on_error: "continue"`** with custom responses for graceful degradation
5. **Monitor with log exports** to track failover patterns

## Next steps

<CardGroup cols={2}>
  <Card title="Troubleshooting" icon="circle-exclamation" href="/ai-gateway/reference/error-codes">
    Error codes, causes, and solutions
  </Card>

  <Card title="Debugging" icon="bug" href="/ai-gateway/guides/debugging">
    Inspect action results and diagnose issues
  </Card>
</CardGroup>