Custom providers are upstream endpoints you define, such as self-hosted Ollama or vLLM, a private deployment, or any OpenAI- or Anthropic-compatible API not in the built-in catalog. Use a custom provider when you want the AI Gateway to call a model running on your machine, private network, or cloud GPU. Traffic still authenticates to the gateway with an access key. Attach provider keys and routing rules through an access key configuration.

When to use one

Use a custom provider when:
  • You run models locally with Ollama, LM Studio, or vLLM.
  • You host models on your own cloud infrastructure.
  • You have a private OpenAI- or Anthropic-compatible API.
  • You need a provider that isn’t in the built-in catalog.

What a custom provider defines

A custom provider tells the gateway:
  • Which provider ID to use in requests, such as my-ollama.
  • Which base URL to call, such as https://my-ollama.internal.
  • Which API surface the endpoint supports.
  • Which model IDs are available.
You call custom provider models with the same provider:model format as built-in providers:
{
  "model": "my-ollama:llama3.2",
  "messages": [{"role": "user", "content": "Hello"}]
}
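
As a rough illustration of the provider:model format, the following Python sketch builds the request body shown above and splits a model reference back into its two parts. The helper names `build_chat_request` and `split_model_ref` are hypothetical and not part of any gateway SDK. Splitting on the first colon only is an assumption, made so that model IDs that themselves contain a colon stay intact.

```python
import json


def build_chat_request(provider_id: str, model_id: str, prompt: str) -> dict:
    """Build a chat request body that targets a custom provider model.

    Hypothetical helper: it only assembles the JSON body shown above.
    """
    return {
        "model": f"{provider_id}:{model_id}",
        "messages": [{"role": "user", "content": prompt}],
    }


def split_model_ref(model_ref: str) -> tuple[str, str]:
    """Split "provider:model" on the first colon (an assumption, see lead-in)."""
    provider_id, sep, model_id = model_ref.partition(":")
    if not sep or not provider_id or not model_id:
        raise ValueError(f"expected provider:model, got {model_ref!r}")
    return provider_id, model_id


if __name__ == "__main__":
    body = build_chat_request("my-ollama", "llama3.2", "Hello")
    print(json.dumps(body, indent=2))
```

Sending this body still requires authenticating to the gateway with an access key, which the sketch leaves out.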

URL requirements

URL type          Scheme                 Example
External          https:// only          https://api.example.com/v1
ngrok internal    http:// or https://    https://my-service.internal
HTTP is only allowed for ngrok .internal endpoints. External URLs must use HTTPS.
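
The scheme rule can be checked client-side before you register a base URL. The sketch below is one possible pre-check, not the gateway's own validation. The function name `is_allowed_base_url` is hypothetical, and treating any hostname that ends in `.internal` as an ngrok internal endpoint is an assumption.

```python
from urllib.parse import urlsplit


def is_allowed_base_url(url: str) -> bool:
    """Return True if the URL scheme fits the rules described above.

    Assumption: a hostname ending in ".internal" is an ngrok internal endpoint.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host:
        # No scheme or no host, for example "my-service.internal" on its own.
        return False
    if parts.scheme == "https":
        return True
    if parts.scheme == "http":
        # Plain HTTP is only acceptable for .internal endpoints.
        return host.endswith(".internal")
    return False


if __name__ == "__main__":
    for candidate in (
        "https://api.example.com/v1",
        "http://api.example.com/v1",
        "http://my-service.internal",
    ):
        print(candidate, is_allowed_base_url(candidate))
```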

Self-hosted on a local network

To reach models on your machine or private network, expose the service with an ngrok internal endpoint, then use that endpoint's URL as the custom provider’s baseUrl. Reaching internal endpoints may require an ngrok platform plan. See Credits. For the full setup flow, see Use a model you run yourself.
