# Baseten Provider
Baseten is an inference platform for serving frontier, enterprise-grade open-source AI models via its API.
## Setup

The Baseten provider is available via the `@ai-sdk/baseten` module. You can install it with:

<Tabs items={['pnpm', 'npm', 'yarn']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/baseten" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/baseten" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/baseten" dark />
  </Tab>
</Tabs>
## Provider Instance

You can import the default provider instance `baseten` from `@ai-sdk/baseten`:

```ts
import { baseten } from '@ai-sdk/baseten';
```
If you need a customized setup, you can import `createBaseten` from `@ai-sdk/baseten` and create a provider instance with your settings:

```ts
import { createBaseten } from '@ai-sdk/baseten';

const baseten = createBaseten({
  apiKey: process.env.BASETEN_API_KEY ?? '',
});
```
You can use the following optional settings to customize the Baseten provider instance:

- **baseURL** `string`

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://inference.baseten.co/v1`.

- **apiKey** `string`

  API key that is sent using the `Authorization` header. It defaults to
  the `BASETEN_API_KEY` environment variable. Setting the environment variable
  (e.g. with `export`) is recommended so you do not need to pass the key on every call.
  You can create a Baseten API key in your Baseten account settings.

- **modelURL** `string`

  Custom model URL for specific models (chat or embeddings). If not provided, the default Model APIs will be used.

- **headers** `Record<string, string>`

  Custom headers to include in the requests.

- **fetch** `(input: RequestInfo, init?: RequestInit) => Promise<Response>`

  Custom fetch implementation.
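As a sketch of how these options combine, the following configuration routes requests through a proxy and attaches an extra header. The proxy URL and header name are hypothetical placeholders, not part of the Baseten API:

```ts
import { createBaseten } from '@ai-sdk/baseten';

// All values below are illustrative placeholders.
const baseten = createBaseten({
  baseURL: 'https://my-proxy.example.com/v1', // hypothetical proxy in front of Baseten
  apiKey: process.env.BASETEN_API_KEY ?? '',
  headers: {
    'X-Request-Source': 'my-app', // hypothetical custom header
  },
});
```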
## Language Models

You can select Baseten models using a provider instance.
The first argument is the model ID, e.g. `'moonshotai/Kimi-K2-Instruct-0905'`. The complete list of models supported under Model APIs can be found in the Baseten documentation.

```ts
const model = baseten('moonshotai/Kimi-K2-Instruct-0905');
```
You can use Baseten language models to generate text with the `generateText` function:

```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

const { text } = await generateText({
  model: baseten('moonshotai/Kimi-K2-Instruct-0905'),
  prompt: 'What is the meaning of life? Answer in one sentence.',
});
```
Baseten language models can also be used in the `streamText` function
(see AI SDK Core).
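As a minimal sketch using the standard `streamText` API from AI SDK Core (the prompt is illustrative):

```ts
import { baseten } from '@ai-sdk/baseten';
import { streamText } from 'ai';

const result = streamText({
  model: baseten('moonshotai/Kimi-K2-Instruct-0905'),
  prompt: 'Write a short poem about inference.',
});

// Print the response as it streams in.
for await (const textPart of result.textStream) {
  process.stdout.write(textPart);
}
```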
## Dedicated Model URLs

Baseten supports dedicated model URLs for both chat and embedding models. You must specify a `modelURL` when creating the provider.

### Chat Models (`/sync/v1`)

For models deployed with Baseten's OpenAI-compatible endpoints:
```ts
import { createBaseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync/v1',
});

// No model ID is needed because the modelURL already identifies the model
const model = baseten();

const { text } = await generateText({
  model,
  prompt: 'Say hello from a Baseten chat model!',
});
```
### `/predict` Endpoints

`/predict` endpoints are currently **not** supported for chat models. You must use `/sync/v1` endpoints for chat functionality.
## Embedding Models

You can create models that call the Baseten embeddings API using the `.embeddingModel()` factory method. The Baseten provider uses the high-performance `@basetenlabs/performance-client` for optimal embedding performance.
```ts
import { createBaseten } from '@ai-sdk/baseten';
import { embed, embedMany } from 'ai';

const baseten = createBaseten({
  modelURL: 'https://model-{MODEL_ID}.api.baseten.co/sync',
});

const embeddingModel = baseten.embeddingModel();

// Single embedding
const { embedding } = await embed({
  model: embeddingModel,
  value: 'sunny day at the beach',
});

// Batch embeddings
const { embeddings } = await embedMany({
  model: embeddingModel,
  values: [
    'sunny day at the beach',
    'rainy afternoon in the city',
    'snowy mountain peak',
  ],
});
```
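You can post-process the resulting vectors with standard AI SDK helpers. As a sketch that continues the snippet above (it reuses the `embedding` and `embeddings` values), `cosineSimilarity` from the `ai` package ranks the batch results against the single query embedding:

```ts
import { cosineSimilarity } from 'ai';

// Rank the batch embeddings by similarity to the single query embedding.
const ranked = embeddings
  .map((candidate, i) => ({
    index: i,
    similarity: cosineSimilarity(embedding, candidate),
  }))
  .sort((a, b) => b.similarity - a.similarity);

console.log(ranked);
```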
### Supported URL Formats

**Supported:**

- `/sync` endpoints (the Performance Client automatically appends `/v1/embeddings`)
- `/sync/v1` endpoints (the provider automatically strips `/v1` before passing the URL to the Performance Client)

**Not supported:**

- `/predict` endpoints (not compatible with the Performance Client)

The embedding implementation uses `@basetenlabs/performance-client` for optimal performance.

## Error Handling

The Baseten provider includes built-in error handling for common API errors:
```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText } from 'ai';

try {
  const { text } = await generateText({
    model: baseten('moonshotai/Kimi-K2-Instruct-0905'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  console.error('Baseten API error:', error.message);
}
```
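For finer-grained handling, the `ai` package exports `APICallError`, which exposes HTTP-level details of a failed call. A sketch (which fields you log is up to you):

```ts
import { baseten } from '@ai-sdk/baseten';
import { generateText, APICallError } from 'ai';

try {
  await generateText({
    model: baseten('moonshotai/Kimi-K2-Instruct-0905'),
    prompt: 'Hello, world!',
  });
} catch (error) {
  if (APICallError.isInstance(error)) {
    // Inspect HTTP-level details of the failed call.
    console.error('Status:', error.statusCode);
    console.error('Body:', error.responseBody);
  } else {
    throw error;
  }
}
```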
### Common Error Scenarios

```ts
import { baseten, createBaseten } from '@ai-sdk/baseten';

// Embeddings require a modelURL
try {
  baseten.embeddingModel();
} catch (error) {
  // Error: "No model URL provided for embeddings. Please set modelURL option for embeddings."
}

// /predict endpoints are not supported for chat models
try {
  const predictProvider = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  predictProvider(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync/v1 endpoint for chat models."
}

// /sync/v1 endpoints are supported for embeddings
const syncProvider = createBaseten({
  modelURL:
    'https://model-{MODEL_ID}.api.baseten.co/environments/production/sync/v1',
});
const embeddingModel = syncProvider.embeddingModel(); // This works fine!

// /predict endpoints are not supported for embeddings
try {
  const predictProvider = createBaseten({
    modelURL:
      'https://model-{MODEL_ID}.api.baseten.co/environments/production/predict',
  });
  predictProvider.embeddingModel(); // This will throw an error
} catch (error) {
  // Error: "Not supported. You must use a /sync or /sync/v1 endpoint for embeddings."
}

// Image models are not supported
try {
  baseten.imageModel('test-model');
} catch (error) {
  // Error: NoSuchModelError for imageModel
}
```