The AI SDK provides a Language Model Specification that enables you to create custom providers compatible with the AI SDK. This specification ensures consistency across different providers.
Please publish your custom provider in your own GitHub repository and as an NPM package. You are responsible for hosting and maintaining your provider. Once published, you can submit a PR to the AI SDK repository to add your provider to the Community Providers documentation section. Use the OpenRouter provider documentation as a template for your documentation.
<Note>
  If you open-source a provider, we'd love to promote it here. Please send us a
  PR to add it to the [Community Providers](/providers/community-providers)
  section.
</Note>

The Language Model Specification V4 is a standardized specification for interacting with language models that provides a unified abstraction layer across AI providers. By establishing a consistent interface with the same patterns and methods for every model, it keeps your application code provider-agnostic and future-ready while leaving room to support emerging LLM capabilities.
At its heart, the V4 specification defines three main interfaces:
### `ProviderV4`

The `ProviderV4` interface acts as the entry point:

```ts
interface ProviderV4 {
  languageModel(modelId: string): LanguageModelV4;
  embeddingModel(modelId: string): EmbeddingModelV4<string>;
  imageModel(modelId: string): ImageModelV4;
}
```
### `LanguageModelV4`

The `LanguageModelV4` interface defines the methods your provider must implement:

```ts
interface LanguageModelV4 {
  specificationVersion: 'V4';
  provider: string;
  modelId: string;
  supportedUrls: Record<string, RegExp[]>;

  doGenerate(options: LanguageModelV4CallOptions): Promise<GenerateResult>;
  doStream(options: LanguageModelV4CallOptions): Promise<StreamResult>;
}
```
Before diving into the details, it's important to understand the distinction between two key concepts in the V4 specification: `LanguageModelV4Content`, which describes the content a model generates, and `LanguageModelV4Prompt`, which describes the input messages you send to the model.

### `LanguageModelV4Content`

The V4 specification supports five distinct content types that models can generate, each designed for specific use cases:
#### Text

The fundamental building block for all text generation:

```ts
type LanguageModelV4Text = {
  type: 'text';
  text: string;
};
```
This is used for standard model responses, system messages, and any plain text output.
#### Tool Calls

Enable models to invoke functions with structured arguments:

```ts
type LanguageModelV4ToolCall = {
  type: 'tool-call';
  toolCallType: 'function';
  toolCallId: string;
  toolName: string;
  args: string;
};
```
The toolCallId is crucial for correlating tool results back to their calls, especially in streaming scenarios.
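For illustration, here is a hypothetical, type-elided sketch of correlating results back to calls by `toolCallId`; the sample data and pairing helper are invented for this example:

```typescript
// Invented sample data: two tool calls and their results, arriving out of order.
const toolCalls = [
  { type: 'tool-call', toolCallType: 'function', toolCallId: 'call_1', toolName: 'search', args: '{"query":"weather"}' },
  { type: 'tool-call', toolCallType: 'function', toolCallId: 'call_2', toolName: 'calculator', args: '{"expression":"2+2"}' },
];

const toolResults = [
  { type: 'tool-result', toolCallId: 'call_2', toolName: 'calculator', result: 4 },
  { type: 'tool-result', toolCallId: 'call_1', toolName: 'search', result: 'sunny' },
];

// Pair each call with its result by id, regardless of arrival order.
const pairs = toolCalls.map(call => ({
  call,
  result: toolResults.find(r => r.toolCallId === call.toolCallId),
}));

console.log(pairs[0].result?.result); // 'sunny'
console.log(pairs[1].result?.result); // 4
```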
#### Files

Support for multimodal output generation:

```ts
type LanguageModelV4File = {
  type: 'file';
  mediaType: string; // IANA media type (e.g., 'image/png', 'audio/mpeg')
  data: string | Uint8Array; // Generated file data as base64 encoded strings or binary data
};
```
This enables models to generate images, audio, documents, and other file types directly.
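As a sketch of how a consumer might handle the two `data` representations, here is a small normalizing helper; the `FileContent` type below is a local stand-in for `LanguageModelV4File`, not the specification type itself:

```typescript
import { Buffer } from 'node:buffer';

// Local stand-in type for illustration only.
type FileContent = { type: 'file'; mediaType: string; data: string | Uint8Array };

// Normalize the data field to raw bytes: decode base64 strings, pass binary through.
function toBytes(part: FileContent): Uint8Array {
  return typeof part.data === 'string'
    ? new Uint8Array(Buffer.from(part.data, 'base64'))
    : part.data;
}

// A fabricated two-byte "PNG" payload, base64-encoded.
const png: FileContent = {
  type: 'file',
  mediaType: 'image/png',
  data: Buffer.from([0x89, 0x50]).toString('base64'),
};

console.log(toBytes(png)[0]); // 137
```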
#### Reasoning

Dedicated support for chain-of-thought reasoning (essential for models like OpenAI's o1):

```ts
type LanguageModelV4Reasoning = {
  type: 'reasoning';
  text: string;

  /**
   * Optional provider-specific metadata for the reasoning part.
   */
  providerMetadata?: SharedV4ProviderMetadata;
};
```
Reasoning content is tracked separately from regular text, allowing for proper token accounting and UI presentation.
#### Sources

References to external sources used while generating a response:

```ts
type LanguageModelV4Source = {
  type: 'source';
  sourceType: 'url';
  id: string;
  url: string;
  title?: string;
  providerMetadata?: SharedV4ProviderMetadata;
};
```
### `LanguageModelV4Prompt`

The V4 prompt format (`LanguageModelV4Prompt`) is designed as a flexible message array that supports multimodal inputs. Each message has a specific role with allowed content types:

- **System**: model instructions (text only)

  ```ts
  { role: 'system', content: string }
  ```

- **User**: human inputs supporting text and files

  ```ts
  { role: 'user', content: Array<LanguageModelV4TextPart | LanguageModelV4FilePart> }
  ```

- **Assistant**: model outputs with full content type support

  ```ts
  { role: 'assistant', content: Array<LanguageModelV4TextPart | LanguageModelV4FilePart | LanguageModelV4ReasoningPart | LanguageModelV4ToolCallPart> }
  ```

- **Tool**: results from tool executions

  ```ts
  { role: 'tool', content: Array<LanguageModelV4ToolResultPart> }
  ```
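A hypothetical multi-turn prompt touching every role might look like this (the values are invented and the types are elided for brevity):

```typescript
// Invented example conversation: system instructions, a user question,
// an assistant tool call, and the tool's result.
const prompt = [
  { role: 'system', content: 'You are a helpful assistant.' },
  {
    role: 'user',
    content: [{ type: 'text', text: 'What is in this image?' }],
  },
  {
    role: 'assistant',
    content: [
      { type: 'tool-call', toolCallId: 'call_1', toolName: 'analyzeImage', args: { url: 'https://example.com/cat.png' } },
    ],
  },
  {
    role: 'tool',
    content: [
      { type: 'tool-result', toolCallId: 'call_1', toolName: 'analyzeImage', result: 'a cat' },
    ],
  },
];

console.log(prompt.length); // 4
```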
Prompt parts are the building blocks of messages in the prompt structure. While LanguageModelV4Content represents the model's output content, prompt parts are specifically designed for constructing input messages. Each message role supports different types of prompt parts:
Let's explore each prompt part type:
#### Text Parts

The most basic prompt part, containing plain text content:

```ts
interface LanguageModelV4TextPart {
  type: 'text';
  text: string;
  providerOptions?: SharedV4ProviderOptions;
}
```
#### Reasoning Parts

Used in assistant messages to capture the model's reasoning process:

```ts
interface LanguageModelV4ReasoningPart {
  type: 'reasoning';
  text: string;
  providerOptions?: SharedV4ProviderOptions;
}
```
#### File Parts

Enable multimodal inputs by including files in prompts:

```ts
interface LanguageModelV4FilePart {
  type: 'file';
  filename?: string;
  data: LanguageModelV4DataContent;
  mediaType: string;
  providerOptions?: SharedV4ProviderOptions;
}
```
The `data` field offers flexibility: it accepts binary data (`Uint8Array`), base64-encoded strings, or URLs (which the provider can fetch natively when they match its `supportedUrls`).

#### Tool Call Parts

Represent tool calls made by the assistant:
```ts
interface LanguageModelV4ToolCallPart {
  type: 'tool-call';
  toolCallId: string;
  toolName: string;
  args: unknown;
  providerOptions?: SharedV4ProviderOptions;
}
```
#### Tool Result Parts

Contain the results of executed tool calls:
```ts
interface LanguageModelV4ToolResultPart {
  type: 'tool-result';
  toolCallId: string;
  toolName: string;
  result: unknown;
  isError?: boolean;
  content?: Array<{
    type: 'text' | 'image';
    text?: string;
    data?: string; // base64 encoded image data
    mediaType?: string;
  }>;
  providerOptions?: SharedV4ProviderOptions;
}
```
The optional content field enables rich tool results including images, providing more flexibility than the basic result field.
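For example, a hypothetical tool result carrying both text and an image alongside the basic `result` field (all values invented for illustration):

```typescript
// A fabricated rich tool result: a screenshot tool returns a text summary
// plus the image itself via the optional content field.
const toolResult = {
  type: 'tool-result' as const,
  toolCallId: 'call_1',
  toolName: 'takeScreenshot',
  result: 'screenshot captured',
  content: [
    { type: 'text' as const, text: 'Screenshot of the page' },
    { type: 'image' as const, data: 'iVBORw0KGgo...', mediaType: 'image/png' },
  ],
};

console.log(toolResult.content.length); // 2
```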
The streaming system uses typed events for different stages:
**Stream Lifecycle Events:**

- `stream-start`: initial event with any warnings about unsupported features
- `response-metadata`: model information and response headers
- `finish`: final event with usage statistics and finish reason
- `error`: error events that can occur at any point

**Content Events:**

- Content parts (`text`, `file`, `reasoning`, `source`, `tool-call`) stream directly
- `tool-call-delta`: incremental updates for tool call arguments
- `reasoning-part-finish`: explicit marker for reasoning section completion

Example stream sequence:
```ts
{ type: 'stream-start', warnings: [] }
{ type: 'text', text: 'Hello' }
{ type: 'text', text: ' world' }
{ type: 'tool-call', toolCallId: '1', toolName: 'search', args: {...} }
{ type: 'response-metadata', modelId: 'gpt-4.1', ... }
{ type: 'finish', usage: { inputTokens: 10, outputTokens: 20 }, finishReason: 'stop' }
```
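As a sketch, a consumer of such a sequence might fold the parts into final text and usage. The `StreamPart` type below is a simplified local stand-in, not the full specification type:

```typescript
// Simplified stand-in covering only the part types used in this example.
type StreamPart =
  | { type: 'stream-start'; warnings: unknown[] }
  | { type: 'text'; text: string }
  | { type: 'finish'; usage: { inputTokens: number; outputTokens: number }; finishReason: string };

// Accumulate text deltas and capture final usage from the finish event.
function collect(parts: StreamPart[]) {
  let text = '';
  let usage: { inputTokens: number; outputTokens: number } | undefined;
  for (const part of parts) {
    if (part.type === 'text') text += part.text;
    if (part.type === 'finish') usage = part.usage;
  }
  return { text, usage };
}

const { text, usage } = collect([
  { type: 'stream-start', warnings: [] },
  { type: 'text', text: 'Hello' },
  { type: 'text', text: ' world' },
  { type: 'finish', usage: { inputTokens: 10, outputTokens: 20 }, finishReason: 'stop' },
]);

console.log(text); // 'Hello world'
```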
Enhanced usage information:
```ts
type LanguageModelV4Usage = {
  inputTokens: number | undefined;
  outputTokens: number | undefined;
  totalTokens: number | undefined;
  reasoningTokens?: number | undefined;
  cachedInputTokens?: number | undefined;
};
```
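Since every token field may be `undefined`, consumers need to handle partial usage data; here is a minimal sketch (the helper and its fallback behavior are invented for illustration):

```typescript
type Usage = { inputTokens?: number; outputTokens?: number; totalTokens?: number };

// Prefer the provider-reported total; derive it only when both parts are known.
function totalTokens(u: Usage): number | undefined {
  if (u.totalTokens !== undefined) return u.totalTokens;
  if (u.inputTokens !== undefined && u.outputTokens !== undefined) {
    return u.inputTokens + u.outputTokens;
  }
  return undefined;
}

console.log(totalTokens({ inputTokens: 10, outputTokens: 20 })); // 30
```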
The V4 specification supports two types of tools:
#### Function Tools

Standard user-defined functions with JSON Schema validation:

```ts
type LanguageModelV4FunctionTool = {
  type: 'function';
  name: string;
  description?: string;
  parameters: JSONSchema7; // Full JSON Schema support
};
```
#### Provider-Defined Tools

Native provider capabilities exposed as tools:

```ts
export type LanguageModelV4ProviderClientDefinedTool = {
  type: 'provider-defined-client';
  id: string; // e.g., 'anthropic.computer-use'
  name: string; // Human-readable name
  args: Record<string, unknown>;
};
```
Tool choice can be controlled via:
```ts
toolChoice: 'auto' | 'none' | 'required' | { type: 'tool', toolName: string };
```
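For example, hypothetical call options that force the model to use one specific tool (the tool name and schema are invented for illustration):

```typescript
// Force the model to call the (invented) getWeather tool.
const callOptions = {
  toolChoice: { type: 'tool' as const, toolName: 'getWeather' },
  tools: [
    {
      type: 'function' as const,
      name: 'getWeather',
      description: 'Get the current weather for a city',
      parameters: {
        type: 'object',
        properties: { city: { type: 'string' } },
        required: ['city'],
      },
    },
  ],
};

console.log(callOptions.toolChoice.toolName); // 'getWeather'
```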
Providers can declare URLs they can access directly:
```ts
supportedUrls: {
  'image/*': [/^https:\/\/cdn\.example\.com\/.*/],
  'application/pdf': [/^https:\/\/docs\.example\.com\/.*/],
  'audio/*': [/^https:\/\/media\.example\.com\/.*/],
}
```
The AI SDK checks these patterns before downloading any URL-based content.
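The matching logic can be sketched roughly as follows; this is an illustrative approximation, not the SDK's actual implementation:

```typescript
// Example declaration, matching the supportedUrls shape above.
const supportedUrls: Record<string, RegExp[]> = {
  'image/*': [/^https:\/\/cdn\.example\.com\/.*/],
};

// Check both the exact media type and its wildcard form (e.g. 'image/*').
function isUrlSupported(mediaType: string, url: string): boolean {
  const wildcard = mediaType.split('/')[0] + '/*';
  const patterns = [
    ...(supportedUrls[mediaType] ?? []),
    ...(supportedUrls[wildcard] ?? []),
  ];
  return patterns.some(pattern => pattern.test(url));
}

console.log(isUrlSupported('image/png', 'https://cdn.example.com/cat.png')); // true
console.log(isUrlSupported('application/pdf', 'https://cdn.example.com/doc.pdf')); // false
```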
The specification includes a flexible system for provider-specific features without breaking the standard interface:
```ts
providerOptions: {
  anthropic: {
    cacheControl: true,
    maxTokens: 4096,
  },
  openai: {
    parallelToolCalls: false,
    responseFormat: { type: 'json_object' },
  },
}
```
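Each provider is expected to read only its own namespace and ignore the rest; a minimal sketch (the helper is invented for illustration):

```typescript
// Namespaced options as in the example above.
const providerOptions: Record<string, Record<string, unknown>> = {
  anthropic: { cacheControl: true },
  openai: { parallelToolCalls: false },
};

// A provider looks up its own namespace and ignores everything else.
function optionsFor(provider: string): Record<string, unknown> {
  return providerOptions[provider] ?? {};
}

console.log(optionsFor('openai').parallelToolCalls); // false
```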
Provider options can be specified at multiple levels: on individual prompt parts (via `providerOptions`) and on the request as a whole (via `LanguageModelV4CallOptions`). This layered approach allows fine-grained control while maintaining compatibility.
The V4 specification emphasizes robust error handling:
- **Stream errors** are emitted as dedicated stream parts: `{ type: 'error', error: unknown }`
- **Warnings** about unsupported features are surfaced via `stream-start` and response objects
- **Finish reasons** report why generation stopped:
  - `'stop'`: natural completion
  - `'length'`: hit max tokens
  - `'content-filter'`: safety filtering
  - `'tool-calls'`: stopped to execute tools
  - `'error'`: generation failed
  - `'other'`: provider-specific reasons

To implement a custom language model provider, you'll need to install the required packages:
```bash
npm install @ai-sdk/provider @ai-sdk/provider-utils
```
Implementing a custom language model provider involves several steps:
Start by creating a `provider.ts` file that exports a factory function and a default instance:
```ts
import {
  generateId,
  loadApiKey,
  withoutTrailingSlash,
} from '@ai-sdk/provider-utils';
import { ProviderV4 } from '@ai-sdk/provider';
import {
  CustomChatLanguageModel,
  CustomChatSettings,
} from './custom-chat-language-model';

// Define your provider interface extending ProviderV4
interface CustomProvider extends ProviderV4 {
  (modelId: string, settings?: CustomChatSettings): CustomChatLanguageModel;

  // Add specific methods for different model types
  languageModel(
    modelId: string,
    settings?: CustomChatSettings,
  ): CustomChatLanguageModel;
}

// Provider settings
interface CustomProviderSettings {
  /**
   * Base URL for API calls
   */
  baseURL?: string;

  /**
   * API key for authentication
   */
  apiKey?: string;

  /**
   * Custom headers for requests
   */
  headers?: Record<string, string>;

  /**
   * Custom ID generator (defaults to the AI SDK's generateId)
   */
  generateId?: () => string;
}

// Factory function to create provider instance
export function createCustom(
  options: CustomProviderSettings = {},
): CustomProvider {
  const createChatModel = (
    modelId: string,
    settings: CustomChatSettings = {},
  ) =>
    new CustomChatLanguageModel(modelId, settings, {
      provider: 'custom',
      baseURL:
        withoutTrailingSlash(options.baseURL) ?? 'https://api.custom.ai/v1',
      headers: () => ({
        Authorization: `Bearer ${loadApiKey({
          apiKey: options.apiKey,
          environmentVariableName: 'CUSTOM_API_KEY',
          description: 'Custom Provider',
        })}`,
        ...options.headers,
      }),
      generateId: options.generateId ?? generateId,
    });

  const provider = function (modelId: string, settings?: CustomChatSettings) {
    if (new.target) {
      throw new Error(
        'The model factory function cannot be called with the new keyword.',
      );
    }
    return createChatModel(modelId, settings);
  };

  provider.languageModel = createChatModel;

  return provider as CustomProvider;
}

// Export default provider instance
export const custom = createCustom();
```
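The callable-provider pattern above (a function that also carries model-factory methods) can be sketched in isolation with stand-in types, independent of the AI SDK packages:

```typescript
// Stand-in types; the real provider returns full language model instances.
interface FakeModel {
  modelId: string;
}

interface FakeProvider {
  (modelId: string): FakeModel;
  languageModel(modelId: string): FakeModel;
}

function createFakeProvider(): FakeProvider {
  const createModel = (modelId: string): FakeModel => ({ modelId });

  // The provider is a plain function...
  const provider = ((modelId: string) => createModel(modelId)) as FakeProvider;

  // ...that also exposes named factory methods.
  provider.languageModel = createModel;

  return provider;
}

const fake = createFakeProvider();
console.log(fake('my-model').modelId); // 'my-model'
console.log(fake.languageModel('my-model').modelId); // 'my-model'
```

This shape lets users write both `custom('model-id')` and `custom.languageModel('model-id')` against the same instance.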
Create a `custom-chat-language-model.ts` file that implements `LanguageModelV4`:
```ts
import {
  LanguageModelV4,
  LanguageModelV4CallOptions,
  LanguageModelV4Content,
  SharedV4Warning,
} from '@ai-sdk/provider';
import { postJsonToApi } from '@ai-sdk/provider-utils';

export class CustomChatLanguageModel implements LanguageModelV4 {
  readonly specificationVersion = 'V4';
  readonly provider: string;
  readonly modelId: string;

  private readonly settings: CustomChatSettings;
  private readonly config: CustomChatConfig;

  constructor(
    modelId: string,
    settings: CustomChatSettings,
    config: CustomChatConfig,
  ) {
    this.provider = config.provider;
    this.modelId = modelId;
    // Store settings and config for use in doGenerate/doStream
    this.settings = settings;
    this.config = config;
  }

  // Convert AI SDK prompt to provider format
  private getArgs(options: LanguageModelV4CallOptions) {
    const warnings: SharedV4Warning[] = [];

    // Map messages to provider format
    const messages = this.convertToProviderMessages(options.prompt);

    // Handle tools if provided
    const tools = options.tools
      ? this.prepareTools(options.tools, options.toolChoice)
      : undefined;

    // Build request body
    const body = {
      model: this.modelId,
      messages,
      temperature: options.temperature,
      max_tokens: options.maxOutputTokens,
      stop: options.stopSequences,
      tools,
      // ... other parameters
    };

    return { args: body, warnings };
  }

  async doGenerate(options: LanguageModelV4CallOptions) {
    const { args, warnings } = this.getArgs(options);

    // Make API call
    const response = await postJsonToApi({
      url: `${this.config.baseURL}/chat/completions`,
      headers: this.config.headers(),
      body: args,
      abortSignal: options.abortSignal,
    });

    // Convert provider response to AI SDK format
    const content: LanguageModelV4Content[] = [];

    // Extract text content
    if (response.choices[0].message.content) {
      content.push({
        type: 'text',
        text: response.choices[0].message.content,
      });
    }

    // Extract tool calls
    if (response.choices[0].message.tool_calls) {
      for (const toolCall of response.choices[0].message.tool_calls) {
        content.push({
          type: 'tool-call',
          toolCallType: 'function',
          toolCallId: toolCall.id,
          toolName: toolCall.function.name,
          args: JSON.stringify(toolCall.function.arguments),
        });
      }
    }

    return {
      content,
      finishReason: this.mapFinishReason(response.choices[0].finish_reason),
      usage: {
        inputTokens: response.usage?.prompt_tokens,
        outputTokens: response.usage?.completion_tokens,
        totalTokens: response.usage?.total_tokens,
      },
      request: { body: args },
      response: { body: response },
      warnings,
    };
  }

  async doStream(options: LanguageModelV4CallOptions) {
    const { args, warnings } = this.getArgs(options);

    // Create streaming response
    const response = await fetch(`${this.config.baseURL}/chat/completions`, {
      method: 'POST',
      headers: {
        ...this.config.headers(),
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ ...args, stream: true }),
      signal: options.abortSignal,
    });

    // Transform stream to AI SDK format
    const stream = response
      .body!.pipeThrough(new TextDecoderStream())
      .pipeThrough(this.createParser())
      .pipeThrough(this.createTransformer(warnings));

    return { stream, warnings };
  }

  // Supported URL patterns for native file handling
  get supportedUrls() {
    return {
      'image/*': [/^https:\/\/example\.com\/images\/.*/],
    };
  }
}
```
Map AI SDK messages to your provider's format:
```ts
private convertToProviderMessages(prompt: LanguageModelV4Prompt) {
  return prompt.map(message => {
    switch (message.role) {
      case 'system':
        return { role: 'system', content: message.content };

      case 'user':
        return {
          role: 'user',
          content: message.content.map(part => {
            switch (part.type) {
              case 'text':
                return { type: 'text', text: part.text };
              case 'file':
                return {
                  type: 'image_url',
                  image_url: {
                    url: this.convertFileToUrl(part.data),
                  },
                };
              default:
                throw new Error(`Unsupported part type: ${part.type}`);
            }
          }),
        };

      case 'assistant':
        // Handle assistant messages with text, tool calls, etc.
        return this.convertAssistantMessage(message);

      case 'tool':
        // Handle tool results
        return this.convertToolMessage(message);

      default:
        throw new Error(`Unsupported message role: ${message.role}`);
    }
  });
}
```
Create a streaming transformer that converts provider chunks to AI SDK stream parts:
```ts
private createTransformer(warnings: SharedV4Warning[]) {
  let isFirstChunk = true;

  // Capture the mapper: inside the TransformStream callbacks, `this` no
  // longer refers to the language model instance.
  const mapFinishReason = this.mapFinishReason.bind(this);

  return new TransformStream<ParsedChunk, LanguageModelV4StreamPart>({
    async transform(chunk, controller) {
      // Send warnings with first chunk
      if (isFirstChunk) {
        controller.enqueue({ type: 'stream-start', warnings });
        isFirstChunk = false;
      }

      // Handle different chunk types
      if (chunk.choices?.[0]?.delta?.content) {
        controller.enqueue({
          type: 'text',
          text: chunk.choices[0].delta.content,
        });
      }

      if (chunk.choices?.[0]?.delta?.tool_calls) {
        for (const toolCall of chunk.choices[0].delta.tool_calls) {
          controller.enqueue({
            type: 'tool-call-delta',
            toolCallType: 'function',
            toolCallId: toolCall.id,
            toolName: toolCall.function.name,
            argsTextDelta: toolCall.function.arguments,
          });
        }
      }

      // Handle finish reason
      if (chunk.choices?.[0]?.finish_reason) {
        controller.enqueue({
          type: 'finish',
          finishReason: mapFinishReason(chunk.choices[0].finish_reason),
          usage: {
            inputTokens: chunk.usage?.prompt_tokens,
            outputTokens: chunk.usage?.completion_tokens,
            totalTokens: chunk.usage?.total_tokens,
          },
        });
      }
    },
  });
}
```
Use standardized AI SDK errors for consistent error handling:
```ts
import {
  APICallError,
  InvalidResponseDataError,
  TooManyRequestsError,
} from '@ai-sdk/provider';

private handleError(error: unknown): never {
  if (error instanceof Response) {
    const status = error.status;

    if (status === 429) {
      throw new TooManyRequestsError({
        cause: error,
        retryAfter: this.getRetryAfter(error),
      });
    }

    throw new APICallError({
      statusCode: status,
      statusText: error.statusText,
      cause: error,
      isRetryable: status >= 500 && status < 600,
    });
  }

  throw error;
}
```
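The `mapFinishReason` helper referenced in the model examples above is left to each provider to implement; here is a hedged, self-contained sketch that assumes OpenAI-style `finish_reason` strings from the upstream API:

```typescript
// The specification's finish reason values.
type FinishReason =
  | 'stop'
  | 'length'
  | 'content-filter'
  | 'tool-calls'
  | 'error'
  | 'other';

// Map assumed provider-specific strings onto the specification values;
// unknown or missing reasons fall back to 'other'.
function mapFinishReason(reason: string | null | undefined): FinishReason {
  switch (reason) {
    case 'stop':
      return 'stop';
    case 'length':
    case 'max_tokens':
      return 'length';
    case 'content_filter':
      return 'content-filter';
    case 'tool_calls':
      return 'tool-calls';
    default:
      return 'other';
  }
}

console.log(mapFinishReason('tool_calls')); // 'tool-calls'
```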