# Cerebras Provider
The Cerebras provider offers access to powerful language models through the Cerebras API, including their high-speed inference capabilities powered by Wafer-Scale Engines and CS-3 systems.
API keys can be obtained from the Cerebras Platform.
## Setup

The Cerebras provider is available via the `@ai-sdk/cerebras` module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/cerebras" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/cerebras" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/cerebras" dark />
  </Tab>
  <Tab>
    <Snippet text="bun add @ai-sdk/cerebras" dark />
  </Tab>
</Tabs>

## Provider Instance

You can import the default provider instance `cerebras` from `@ai-sdk/cerebras`:
```ts
import { cerebras } from '@ai-sdk/cerebras';
```
For custom configuration, you can import `createCerebras` and create a provider instance with your settings:
```ts
import { createCerebras } from '@ai-sdk/cerebras';

const cerebras = createCerebras({
  apiKey: process.env.CEREBRAS_API_KEY ?? '',
});
```
You can use the following optional settings to customize the Cerebras provider instance:
- **baseURL** `string`

  Use a different URL prefix for API calls. The default prefix is `https://api.cerebras.ai/v1`.

- **apiKey** `string`

  API key that is sent using the `Authorization` header. It defaults to the `CEREBRAS_API_KEY` environment variable.

- **headers** `Record<string,string>`

  Custom headers to include in the requests.

- **fetch** `(input: RequestInfo, init?: RequestInit) => Promise<Response>`

  Custom fetch implementation.
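As a sketch, these settings can be combined when creating a custom provider instance. The base URL, header name, and fetch wrapper below are illustrative placeholders, not required values:

```ts
import { createCerebras } from '@ai-sdk/cerebras';

// All values below are placeholders for illustration.
const cerebras = createCerebras({
  // Defaults to https://api.cerebras.ai/v1 when omitted.
  baseURL: 'https://api.cerebras.ai/v1',
  // Defaults to the CEREBRAS_API_KEY environment variable.
  apiKey: process.env.CEREBRAS_API_KEY ?? '',
  // Custom headers sent with every request.
  headers: { 'X-Request-Source': 'docs-example' },
  // Custom fetch, e.g. to log requests during development.
  fetch: (input, init) => fetch(input, init),
});
```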
## Language Models

You can create language models using a provider instance:
```ts
import { cerebras } from '@ai-sdk/cerebras';
import { generateText } from 'ai';

const { text } = await generateText({
  model: cerebras('llama3.1-8b'),
  prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
```
Cerebras language models can also be used in the `streamText` function (see AI SDK Core).
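For example, a minimal streaming call might look like this (the model ID and prompt are just for illustration):

```ts
import { cerebras } from '@ai-sdk/cerebras';
import { streamText } from 'ai';

const result = streamText({
  model: cerebras('llama-3.3-70b'),
  prompt: 'Explain wafer-scale chips in one paragraph.',
});

// Write the generated text to stdout as it arrives.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```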
You can create Cerebras language models using a provider instance. The first argument is the model ID, e.g. `llama-3.3-70b`:
```ts
const model = cerebras('llama-3.3-70b');
```
You can also use the `.languageModel()` and `.chat()` methods:
```ts
const model = cerebras.languageModel('llama-3.3-70b');
```

```ts
const model = cerebras.chat('llama-3.3-70b');
```
### Reasoning Models

Cerebras offers several reasoning models, including `gpt-oss-120b`, `qwen-3-32b`, and `zai-glm-4.7`, that generate intermediate thinking tokens before their final response. The reasoning output is streamed through the standard AI SDK reasoning parts.
For `gpt-oss-120b`, you can control the reasoning depth using the `reasoningEffort` provider option:
```ts
import { cerebras } from '@ai-sdk/cerebras';
import { streamText } from 'ai';

const result = streamText({
  model: cerebras('gpt-oss-120b'),
  providerOptions: {
    cerebras: {
      reasoningEffort: 'medium',
    },
  },
  prompt: 'How many "r"s are in the word "strawberry"?',
});

for await (const part of result.fullStream) {
  if (part.type === 'reasoning-delta') {
    console.log('Reasoning:', part.text);
  } else if (part.type === 'text-delta') {
    process.stdout.write(part.text);
  }
}
```
See AI SDK UI: Chatbot for more details on how to integrate reasoning into your chatbot.
### Provider Options

The following optional provider options are available for Cerebras language models:
- **reasoningEffort** `'low' | 'medium' | 'high'`

  Controls the depth of reasoning for GPT-OSS models. Defaults to `'medium'`.

- **user** `string`

  A unique identifier representing your end user, which can help with monitoring and abuse detection.

- **strictJsonSchema** `boolean`

  Whether to use strict JSON schema validation. When `true`, the model uses constrained decoding to guarantee schema compliance. Defaults to `true`.
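As a sketch of how these options might be passed together, `strictJsonSchema` and `user` can be supplied via `providerOptions` when generating structured output. The schema and user identifier below are made up for illustration:

```ts
import { cerebras } from '@ai-sdk/cerebras';
import { generateObject } from 'ai';
import { z } from 'zod';

// Hypothetical recipe schema for illustration.
const { object } = await generateObject({
  model: cerebras('llama-3.3-70b'),
  schema: z.object({
    name: z.string(),
    ingredients: z.array(z.string()),
  }),
  providerOptions: {
    cerebras: {
      // Constrained decoding for schema compliance (defaults to true).
      strictJsonSchema: true,
      // Example end-user identifier for monitoring.
      user: 'user-1234',
    },
  },
  prompt: 'Generate a simple pasta recipe.',
});
```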
## Model Capabilities

| Model                            | Image Input         | Object Generation   | Tool Usage          | Tool Streaming      | Reasoning           |
| -------------------------------- | ------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `llama3.1-8b`                    | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `llama-3.3-70b`                  | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `gpt-oss-120b`                   | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `qwen-3-32b`                     | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `qwen-3-235b-a22b-instruct-2507` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `qwen-3-235b-a22b-thinking-2507` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `zai-glm-4.6`                    | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| `zai-glm-4.7`                    | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |