Skip to main content
Providers are the AI services that host and serve models. The AI Gateway routes your requests to providers and handles authentication, failover, and observability. Manage providers at app.ngrok.ai. If you’re deciding which setup path to use, start with Choose how to reach providers. The dashboard separates built-in catalog providers from custom providers you define.

Built-in providers

Built-in providers are public AI APIs with a known model catalog: OpenAI, Anthropic, Google, Groq, and others listed in the model catalog. You can send requests with an access key. Your access key authenticates you to the gateway, not to the upstream provider. For OpenAI and Anthropic, the AI Gateway can supply upstream credentials when you have credits. You aren’t required to have your own provider account or provider API keys—the AI Gateway handles upstream authentication and bills model cost through credits. For other built-in providers, bring your own provider key. Upstream costs go to your provider account; credits still cover the gateway processing fee. This is generally referred to as BYOK (bring your own key).

OpenAI

GPT and o-series models. ngrok.ai inference available.

Anthropic

Claude models. ngrok.ai inference available. Supports both OpenAI and Anthropic SDK formats.

OpenRouter

Access hundreds of models from multiple providers through a single API.

Google

Gemini models from Google AI Studio.

Groq

LPU-accelerated inference for open-source models (Llama, Mixtral).

DeepSeek

High-performance reasoning and chat models.

Hyperbolic

Open-source model hosting with high-performance inference.

InceptionLabs

Diffusion-based language models for fast text generation.

Inference.net

Distributed inference network for AI models at scale.

Custom providers

Custom providers are upstream endpoints you define, such as self-hosted Ollama or vLLM, a private deployment, or any OpenAI- or Anthropic-compatible API that isn’t in the built-in catalog. See Use a model you run yourself. Custom providers require a provider key when the upstream needs authentication. See Custom providers for the concept.

Ollama

Run open-source models locally with Ollama.

vLLM

High-performance inference server.

LM Studio

Desktop app for local model inference.

Azure OpenAI

Microsoft’s OpenAI service on Azure.

How provider selection works

When a request arrives, the gateway determines which provider to use:
  1. Explicit provider prefix: if the model name includes a provider prefix (for example, openai:gpt-4o or openrouter:anthropic/claude-3.5-sonnet), that provider is used
  2. Catalog lookup: the gateway looks up the model ID in its catalog to find the default provider
  3. Request selection: model names and fallback lists can override the default

Next steps