packages/docs/plugin-registry/llm/groq.md
The Groq plugin connects Eliza agents to Groq's inference API. Groq's Language Processing Unit (LPU) generates tokens significantly faster than typical GPU-based inference, which makes it well suited to latency-sensitive agent workflows.
Package: `@elizaos/plugin-groq`
eliza plugins install @elizaos/plugin-groq
The plugin auto-enables when GROQ_API_KEY is present:
export GROQ_API_KEY=gsk_...
| Environment Variable | Required | Description |
|---|---|---|
| `GROQ_API_KEY` | Yes | Groq API key from console.groq.com |
| `GROQ_BASE_URL` | No | Custom base URL for API requests |
| `GROQ_SMALL_MODEL` | No | Override the small model identifier |
| `GROQ_LARGE_MODEL` | No | Override the large model identifier |
| `GROQ_TTS_MODEL` | No | Override the text-to-speech model |
| `GROQ_TTS_VOICE` | No | Voice profile for text-to-speech output |
| `GROQ_TTS_RESPONSE_FORMAT` | No | Output format for text-to-speech audio |
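To make the table concrete, here is a minimal sketch of how these variables could be read into a connection object. `resolveGroqConnection`, `GroqConnection`, and `ASSUMED_BASE_URL` are hypothetical names used for illustration and are not part of `@elizaos/plugin-groq`. The fallback base URL is Groq's public OpenAI-compatible endpoint, which is an assumption about what the plugin does when `GROQ_BASE_URL` is unset.

```typescript
// Hypothetical helper for illustration; not part of @elizaos/plugin-groq.
type Env = Record<string, string | undefined>;

interface GroqConnection {
  apiKey: string;
  baseUrl: string;
  ttsModel: string | undefined;
  ttsVoice: string | undefined;
  ttsResponseFormat: string | undefined;
}

// Assumption: fall back to Groq's public OpenAI-compatible endpoint.
// The plugin's real default may differ.
const ASSUMED_BASE_URL = "https://api.groq.com/openai/v1";

// Returns the trimmed value, or undefined when the variable is unset or blank.
function read(env: Env, name: string): string | undefined {
  const value = env[name]?.trim();
  return value ? value : undefined;
}

// Returns null when GROQ_API_KEY is missing, mirroring the
// "auto-enables when GROQ_API_KEY is present" behavior described above.
function resolveGroqConnection(env: Env): GroqConnection | null {
  const apiKey = read(env, "GROQ_API_KEY");
  if (!apiKey) return null;
  return {
    apiKey,
    baseUrl: read(env, "GROQ_BASE_URL") ?? ASSUMED_BASE_URL,
    ttsModel: read(env, "GROQ_TTS_MODEL"),
    ttsVoice: read(env, "GROQ_TTS_VOICE"),
    ttsResponseFormat: read(env, "GROQ_TTS_RESPONSE_FORMAT"),
  };
}
```

In a Node.js process this could be called as `resolveGroqConnection(process.env)`. Passing the environment in as a plain object keeps the function easy to test.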
{
  "auth": {
    "profiles": {
      "default": {
        "provider": "groq",
        "model": "openai/gpt-oss-120b"
      }
    }
  }
}
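As an illustration of the shape of this configuration, the sketch below parses a config string and looks up a named profile. `AuthProfile` and `getAuthProfile` are hypothetical names, and the types cover only the fields shown above; the real config schema may contain more.

```typescript
// Hypothetical types covering only the fields shown in the example config.
interface AuthProfile {
  provider: string;
  model: string;
}

interface ParsedConfig {
  auth?: { profiles?: Record<string, AuthProfile | undefined> };
}

// Parses the JSON text and returns the requested profile,
// throwing a descriptive error when it is missing.
function getAuthProfile(configJson: string, name: string = "default"): AuthProfile {
  const config = JSON.parse(configJson) as ParsedConfig;
  const profile = config.auth?.profiles?.[name];
  if (!profile) {
    throw new Error(`Auth profile "${name}" not found`);
  }
  return profile;
}

const exampleConfig = `{
  "auth": {
    "profiles": {
      "default": { "provider": "groq", "model": "openai/gpt-oss-120b" }
    }
  }
}`;

const defaultProfile = getAuthProfile(exampleConfig);
console.log(defaultProfile.provider, defaultProfile.model);
```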
| Model | Context | Speed | Best For |
|---|---|---|---|
| `openai/gpt-oss-120b` | 128k | Fast | Default small and large text model |
| elizaOS Model Type | Groq Model |
|---|---|
| `TEXT_SMALL` | `openai/gpt-oss-120b` |
| `TEXT_LARGE` | `openai/gpt-oss-120b` |
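The mapping above, combined with the `GROQ_SMALL_MODEL` and `GROQ_LARGE_MODEL` overrides, can be sketched as a small lookup. `resolveGroqModel` is a hypothetical function for illustration, not the plugin's actual implementation. It assumes a non-blank override always wins over the default.

```typescript
// Hypothetical lookup for illustration; not the plugin's actual code.
type Env = Record<string, string | undefined>;
type TextModelType = "TEXT_SMALL" | "TEXT_LARGE";

// Defaults taken from the mapping table above.
const DEFAULT_MODELS: Record<TextModelType, string> = {
  TEXT_SMALL: "openai/gpt-oss-120b",
  TEXT_LARGE: "openai/gpt-oss-120b",
};

// Environment variable that overrides each model type.
const OVERRIDE_VARS: Record<TextModelType, string> = {
  TEXT_SMALL: "GROQ_SMALL_MODEL",
  TEXT_LARGE: "GROQ_LARGE_MODEL",
};

// Assumption: a non-blank override takes precedence over the default.
function resolveGroqModel(type: TextModelType, env: Env = {}): string {
  const override = env[OVERRIDE_VARS[type]]?.trim();
  return override || DEFAULT_MODELS[type];
}
```

For example, `resolveGroqModel("TEXT_SMALL", { GROQ_SMALL_MODEL: "my-model" })` returns the placeholder `"my-model"`, while `resolveGroqModel("TEXT_LARGE")` falls back to the default.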
Groq's LPU architecture excels at:
This makes Groq particularly well-suited for:
Groq enforces per-minute token limits that vary by model. Free-tier limits are lower than paid-tier limits, which scale with usage.
See console.groq.com/docs/rate-limits for current limits.
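When an agent hits a rate limit, a common approach is to retry with exponential backoff. The sketch below is one way to do that, not something the plugin is documented to provide. `RateLimitError`, `backoffDelayMs`, and `withRateLimitRetry` are hypothetical names, and the code assumes the caller converts an HTTP 429 response into a `RateLimitError`, optionally carrying a server-suggested wait in seconds.

```typescript
// Hypothetical error type: assumes the caller maps HTTP 429 to this class.
class RateLimitError extends Error {
  constructor(public readonly retryAfterSeconds?: number) {
    super("rate limited (HTTP 429)");
  }
}

// Delay before the next attempt. Uses the server hint when present,
// otherwise doubles the base delay per attempt, both capped at capMs.
function backoffDelayMs(
  attempt: number,
  retryAfterSeconds?: number,
  baseMs: number = 500,
  capMs: number = 30000,
): number {
  if (retryAfterSeconds !== undefined) {
    return Math.min(retryAfterSeconds * 1000, capMs);
  }
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Retries `call` on RateLimitError up to maxAttempts total attempts.
// `sleep` is injected so the wrapper can be tested without real timers.
async function withRateLimitRetry<T>(
  call: () => Promise<T>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts: number = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (!(err instanceof RateLimitError) || attempt + 1 >= maxAttempts) {
        throw err;
      }
      await sleep(backoffDelayMs(attempt, err.retryAfterSeconds));
    }
  }
}
```

In production, `sleep` would typically be `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`. The base delay, cap, and attempt count here are arbitrary starting points, not values recommended by Groq.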
Groq offers a free tier. Paid usage is billed per million tokens.
See groq.com/pricing for current rates.
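Because paid usage is billed per million tokens, a request's cost can be estimated from its token counts. The sketch below shows the arithmetic only. `estimateCostUsd` is a hypothetical helper, and the rates in the example are placeholders, not Groq's actual prices. It also assumes input and output tokens may be priced separately.

```typescript
// Per-million-token rates in USD. Assumption: input and output tokens
// are priced separately; check the pricing page for real values.
interface TokenRates {
  inputPerMillionUsd: number;
  outputPerMillionUsd: number;
}

// cost = (inputTokens / 1e6) * inputRate + (outputTokens / 1e6) * outputRate
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  rates: TokenRates,
): number {
  const inputCost = (inputTokens / 1_000_000) * rates.inputPerMillionUsd;
  const outputCost = (outputTokens / 1_000_000) * rates.outputPerMillionUsd;
  return inputCost + outputCost;
}

// Placeholder rates for illustration only; NOT Groq's real pricing.
const placeholderRates: TokenRates = {
  inputPerMillionUsd: 0.5,
  outputPerMillionUsd: 1.5,
};

// 2M input tokens and 1M output tokens at the placeholder rates.
const estimate = estimateCostUsd(2_000_000, 1_000_000, placeholderRates);
console.log(estimate); // 2.5
```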