Groq - ngrok documentation

Groq provides high-speed inference for open-source models (Llama, Mixtral, and others) on its custom LPU hardware.

Groq is a built-in provider, but you must still bring your own Groq API key because ngrok.ai inference is not currently available for it.

Setup

Create an access key

Follow the quickstart to create an access key in app.ngrok.ai.

Store your Groq API key

Create a configuration

Create an access key configuration with a groq routing rule and allow Groq in the access scope.

Assign and send requests

Assign the configuration to your access key, then send requests authenticated with that key:

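from openai import OpenAI

# Point the OpenAI SDK at the ngrok gateway.
client = OpenAI(
    base_url="https://gateway.ngrok.ai/v1",
    api_key="ng-xxxxx-g1-xxxxx",  # your ngrok access key
)

# Send a chat completion request for a Groq-hosted model.
response = client.chat.completions.create(
    model="meta-llama/llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello!"}],
)

# Print the text of the first returned choice.
print(response.choices[0].message.content)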
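
To avoid hardcoding the access key, the request above could be wrapped as in the following minimal sketch. It is an illustration under stated assumptions, not an official ngrok pattern: the environment variable name `NGROK_AI_ACCESS_KEY` and the helper names `gateway_client_kwargs` and `ask` are hypothetical. Only the gateway URL and model name come from this page.

```python
import os

# Gateway URL taken from the example on this page.
GATEWAY_BASE_URL = "https://gateway.ngrok.ai/v1"


def gateway_client_kwargs(env=None):
    """Build OpenAI client arguments from the environment.

    NGROK_AI_ACCESS_KEY is a hypothetical variable name chosen for this
    sketch; use whatever name fits your deployment.
    """
    env = os.environ if env is None else env
    key = env.get("NGROK_AI_ACCESS_KEY")
    if not key:
        raise RuntimeError("NGROK_AI_ACCESS_KEY is not set")
    return {"base_url": GATEWAY_BASE_URL, "api_key": key}


def ask(prompt, model="meta-llama/llama-3.3-70b-versatile"):
    """Send a single user message through the gateway and return the reply text."""
    # Imported here so the helper above works without the SDK installed.
    from openai import OpenAI

    client = OpenAI(**gateway_client_kwargs())
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

With the variable exported in your shell, calling `ask("Hello!")` would send the same request as the example above and return the reply text.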

Available models

See the model catalog.

Next steps
