Groq provides high-speed inference for open-source models (Llama, Mixtral, and others) on its custom LPU hardware.
Groq is a built-in provider, but ngrok.ai inference is not currently available for it, so you must bring your own Groq API key.

Setup

1. Create an access key

Follow the quickstart to create an access key in app.ngrok.ai.

2. Store your Groq API key

Sign up at console.groq.com and create an API key. Then add a provider key.

3. Create a configuration

Create an access key configuration with a groq routing rule and allow Groq in the access scope.

4. Assign and send requests

Assign the configuration to your access key and send requests with your access key:
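from openai import OpenAI

# Point the OpenAI SDK at the ngrok.ai gateway and authenticate with your access key.
client = OpenAI(
    base_url="https://gateway.ngrok.ai/v1",
    api_key="ng-xxxxx-g1-xxxxx",  # placeholder: replace with your own access key
)

response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the assistant's reply.
print(response.choices[0].message.content)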
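
If you would rather not install the OpenAI SDK, the same request can probably be sent with the Python standard library. The sketch below is an illustration, not the documented method. It assumes the gateway accepts the plain HTTP request the SDK would otherwise send: a POST to `/chat/completions` under the base URL, with the access key as a Bearer token. The helper names and the `NGROK_AI_ACCESS_KEY` environment variable are hypothetical.

```python
import json
import os
import urllib.request

# Base URL taken from the SDK example in this guide.
GATEWAY_BASE_URL = "https://gateway.ngrok.ai/v1"


def build_chat_request(access_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request.

    Assumption: the gateway accepts the same request shape the OpenAI SDK sends.
    """
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{GATEWAY_BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {access_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request and return the first choice's message text."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]


def main() -> None:
    # Hypothetical variable name; keeps the access key out of source code.
    access_key = os.environ["NGROK_AI_ACCESS_KEY"]
    request = build_chat_request(
        access_key, "meta-llama/llama-3.3-70b-versatile", "Hello!"
    )
    print(send_chat_request(request))
```

Calling `main()` with the environment variable set sends one request through the gateway. Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the network.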

Available models

See the model catalog.

Next steps