patelvivekdev/voyage-ai-provider is a community provider that uses Voyage AI to add embedding support to the AI SDK.
The Voyage provider is available in the voyage-ai-provider module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}> <Tab> <Snippet text="pnpm add voyage-ai-provider" dark /> </Tab> <Tab> <Snippet text="npm install voyage-ai-provider" dark /> </Tab> <Tab> <Snippet text="yarn add voyage-ai-provider" dark /> </Tab> <Tab> <Snippet text="bun add voyage-ai-provider" dark /> </Tab> </Tabs>
You can import the default provider instance voyage from voyage-ai-provider:
import { voyage } from 'voyage-ai-provider';
If you need a customized setup, you can import createVoyage from voyage-ai-provider and create a provider instance with your settings:
import { createVoyage } from 'voyage-ai-provider';
const voyage = createVoyage({
// custom settings
});
You can use the following optional settings to customize the Voyage provider instance:
baseURL string
The base URL of the Voyage API.
The default prefix is https://api.voyageai.com/v1.
apiKey string
API key that is being sent using the Authorization header.
It defaults to the VOYAGE_API_KEY environment variable.
headers Record<string,string>
Custom headers to include in the requests.
fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>
Custom fetch implementation.
Defaults to the global fetch function.
You can use it as a middleware to intercept requests,
or to provide a custom fetch implementation for e.g. testing.
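As a minimal sketch of the middleware idea, the wrapper below records one log entry per request and then returns the response unchanged. The names `createLoggingFetch` and `RequestLogEntry` are hypothetical and not part of voyage-ai-provider; the only assumption about the provider is that its `fetch` setting accepts a function compatible with the global `fetch`.

```typescript
// Type of the global fetch function.
type FetchFn = typeof fetch;

// Hypothetical shape of one log entry (not part of voyage-ai-provider).
interface RequestLogEntry {
  url: string;
  method: string;
  status: number;
  durationMs: number;
}

// Wraps a fetch implementation and reports every request to `onLog`.
function createLoggingFetch(
  baseFetch: FetchFn,
  onLog: (entry: RequestLogEntry) => void,
): FetchFn {
  return async (input, init) => {
    // `input` may be a string, a URL, or a Request object.
    const request =
      typeof input === 'string' || input instanceof URL ? undefined : input;
    const url = request ? request.url : String(input);
    const method = init?.method ?? request?.method ?? 'GET';

    const startedAt = Date.now();
    const response = await baseFetch(input, init);

    onLog({
      url,
      method,
      status: response.status,
      durationMs: Date.now() - startedAt,
    });
    return response;
  };
}
```

The result could then be passed as the `fetch` setting, for example `createVoyage({ fetch: createLoggingFetch(fetch, console.log) })`. In tests, `baseFetch` can be replaced by a stub that returns a canned `Response`.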
You can create models that call the Voyage embeddings API
using the .embeddingModel() factory method.
import { voyage } from 'voyage-ai-provider';
const embeddingModel = voyage.embeddingModel('voyage-3.5-lite');
You can use Voyage embedding models to generate embeddings with the embed or embedMany function:
import { voyage } from 'voyage-ai-provider';
import { embed } from 'ai';
const { embedding } = await embed({
model: voyage.embeddingModel('voyage-3.5-lite'),
value: 'sunny day at the beach',
providerOptions: {
voyage: {
inputType: 'document',
},
},
});
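Embeddings created with inputType 'document' are typically compared against a query embedding at search time. The sketch below shows one common way to do that, cosine similarity over plain number arrays. The helpers `cosineSimilarity` and `rankBySimilarity` are defined locally for illustration and are not part of voyage-ai-provider.

```typescript
// Cosine similarity between two equal-length vectors, in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as not similar to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns document indices ordered from most to least similar to the query.
function rankBySimilarity(query: number[], documents: number[][]): number[] {
  return documents
    .map((doc, index) => ({ index, score: cosineSimilarity(query, doc) }))
    .sort((x, y) => y.score - x.score)
    .map(({ index }) => index);
}
```

Here `query` would be the `embedding` returned by `embed` and `documents` the `embeddings` returned by `embedMany`. The AI SDK also exports a `cosineSimilarity` helper from `ai`, which may be preferable to a hand-written version.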
Voyage embedding models support additional provider options that can be passed via providerOptions.voyage:
import { voyage } from 'voyage-ai-provider';
import { embed } from 'ai';
const { embedding } = await embed({
model: voyage.embeddingModel('voyage-3.5-lite'),
value: 'sunny day at the beach',
providerOptions: {
voyage: {
inputType: 'query',
outputDimension: 512,
},
},
});
The following provider options are available:
inputType 'query' | 'document' | 'null'
Specifies the type of input passed to the model. Defaults to 'null'.
'null': When inputType is 'null', the embedding model directly converts the inputs into numerical vectors. For retrieval/search purposes it is recommended to use 'query' or 'document'.
'query': The input is a search query, e.g., "Represent the query for retrieving supporting documents: ...".
'document': The input is a document to be stored in a vector database, e.g., "Represent the document for retrieval: ...".
outputDimension number
The number of dimensions for the resulting output embeddings. Default is 'null'.
voyage-code-3 and voyage-3-large support: 2048, 1024 (default), 512, and 256.
outputDtype 'float' | 'int8' | 'uint8' | 'binary' | 'ubinary'
The data type for the output embeddings. Defaults to 'float'.
'float': 32-bit floating-point numbers (supported by all models).
'int8', 'uint8': 8-bit integer types (supported by voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3).
'binary', 'ubinary': Bit-packed, quantized single-bit embedding values (voyage-3-large, voyage-3.5, voyage-3.5-lite, and voyage-code-3). The returned list length is 1/8 of outputDimension. 'binary' uses offset binary encoding.
See FAQ: Output Data Types for more details.
truncation boolean
Whether to truncate the input texts to fit within the model's context length. If not specified, defaults to true.
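To make the bit-packed output types more concrete, here is a small sketch of how the returned values could be handled. It assumes that offset binary means the int8 value equals the uint8 value minus 128, and that bits are packed most significant bit first; both should be checked against the Voyage FAQ before relying on them. The helper names are hypothetical.

```typescript
// Number of values returned for 'binary' / 'ubinary' output:
// one 8-bit value per 8 embedding dimensions.
function packedLength(outputDimension: number): number {
  if (outputDimension % 8 !== 0) {
    throw new Error('outputDimension must be a multiple of 8');
  }
  return outputDimension / 8;
}

// Converts 'binary' (int8) values to 'ubinary' (uint8) values.
// ASSUMPTION: offset binary here means int8 = uint8 - 128.
function binaryToUbinary(values: number[]): number[] {
  return values.map((value) => value + 128);
}

// Expands uint8 values into individual bits (0 or 1).
// ASSUMPTION: the most significant bit of each byte comes first.
function unpackBits(ubinary: number[]): number[] {
  const bits: number[] = [];
  for (const byte of ubinary) {
    for (let shift = 7; shift >= 0; shift--) {
      bits.push((byte >> shift) & 1);
    }
  }
  return bits;
}
```

With outputDimension 1024, for example, `packedLength(1024)` gives 128 values per embedding, and `unpackBits` expands them back to 1024 single-bit entries.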
You can find more models on the Voyage Library homepage.
| Model | Embedding Dimensions | Context Length (tokens) |
|---|---|---|
| voyage-3.5 | 1024 (default), 256, 512, 2048 | 32,000 |
| voyage-3.5-lite | 1024 (default), 256, 512, 2048 | 32,000 |
| voyage-3-large | 1024 (default), 256, 512, 2048 | 32,000 |
| voyage-3 | 1024 | 32,000 |
| voyage-code-3 | 1024 (default), 256, 512, 2048 | 32,000 |
| voyage-3-lite | 512 | 32,000 |
| voyage-finance-2 | 1024 | 32,000 |
| voyage-multilingual-2 | 1024 | 32,000 |
| voyage-law-2 | 1024 | 32,000 |
| voyage-code-2 | 1024 | 16,000 |
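Since only some models accept a non-default outputDimension, it can help to check the value before sending a request. The lookup below is a sketch that copies the dimensions from the table above; the names `supportedDimensions` and `isSupportedDimension` are hypothetical, and the data would need updating whenever Voyage changes its model list.

```typescript
// Supported embedding dimensions per model, copied from the table above.
const supportedDimensions: Record<string, number[]> = {
  'voyage-3.5': [256, 512, 1024, 2048],
  'voyage-3.5-lite': [256, 512, 1024, 2048],
  'voyage-3-large': [256, 512, 1024, 2048],
  'voyage-3': [1024],
  'voyage-code-3': [256, 512, 1024, 2048],
  'voyage-3-lite': [512],
  'voyage-finance-2': [1024],
  'voyage-multilingual-2': [1024],
  'voyage-law-2': [1024],
  'voyage-code-2': [1024],
};

// Returns true when the table lists `outputDimension` for `modelId`.
// Unknown models return false rather than throwing.
function isSupportedDimension(
  modelId: string,
  outputDimension: number,
): boolean {
  const dimensions = supportedDimensions[modelId];
  return dimensions !== undefined && dimensions.indexOf(outputDimension) !== -1;
}
```

A caller might use this to fail early, for instance rejecting `outputDimension: 512` for voyage-3 before the API does.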
import { voyage, ImageEmbeddingInput } from 'voyage-ai-provider';
import { embedMany } from 'ai';
const imageModel = voyage.imageEmbeddingModel('voyage-multimodal-3');
const { embeddings } = await embedMany<ImageEmbeddingInput>({
model: imageModel,
values: [
{
image:
'https://raw.githubusercontent.com/voyage-ai/voyage-multimodal-3/refs/heads/main/images/banana_200_x_200.jpg',
},
{
image: 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...',
},
],
// or pass an array of image URLs and base64 data URL strings directly
// values: [
// 'https://raw.githubusercontent.com/voyage-ai/voyage-multimodal-3/refs/heads/main/images/banana_200_x_200.jpg',
// 'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...',
// ],
});
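The base64 image values above are data URLs of the form `data:image/jpeg;base64,...`. When an image is only available as raw bytes, a string in that form could be built with a helper like the one sketched here. `toDataUrl` is a hypothetical name, and the example relies on the global `btoa`, which is available in browsers and recent Node.js versions.

```typescript
// Builds a base64 data URL from raw image bytes.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" with one character per byte.
  let binary = '';
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}
```

The returned string can be used wherever the examples pass a `data:image/jpeg;base64,...` value, provided `mimeType` matches the actual image format.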
import { voyage, ImageEmbeddingInput } from 'voyage-ai-provider';
import { embedMany } from 'ai';
const imageModel = voyage.imageEmbeddingModel('voyage-multimodal-3');
const { embeddings } = await embedMany<ImageEmbeddingInput>({
model: imageModel,
values: [
{
image: [
'https://raw.githubusercontent.com/voyage-ai/voyage-multimodal-3/refs/heads/main/images/banana_200_x_200.jpg',
'data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...',
],
},
],
});
import { voyage, MultimodalEmbeddingInput } from 'voyage-ai-provider';
import { embedMany } from 'ai';
const multimodalModel = voyage.multimodalEmbeddingModel('voyage-multimodal-3');
const { embeddings } = await embedMany<MultimodalEmbeddingInput>({
model: multimodalModel,
values: [
{
text: ['Hello, world!', 'This is a banana'],
image: [
'https://raw.githubusercontent.com/voyage-ai/voyage-multimodal-3/refs/heads/main/images/banana_200_x_200.jpg',
],
},
{
text: ['Hello, coders!', 'This is a coding test'],
image: ['data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...'],
},
],
});
The following constraints apply to the values list:
Voyage multimodal embedding models support additional provider options that can be passed via providerOptions.voyage:
import { voyage, MultimodalEmbeddingInput } from 'voyage-ai-provider';
import { embedMany } from 'ai';
const multimodalModel = voyage.multimodalEmbeddingModel('voyage-multimodal-3');
const { embeddings } = await embedMany<MultimodalEmbeddingInput>({
model: multimodalModel,
values: [
{
text: ['Hello, world!'],
image: ['data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAA...'],
},
],
providerOptions: {
voyage: {
inputType: 'query',
outputEncoding: 'base64',
truncation: true,
},
},
});
The following provider options are available:
inputType 'query' | 'document'
Specifies the type of input passed to the model. Defaults to 'query'.
When inputType is specified as 'query' or 'document', Voyage automatically prepends a prompt to your inputs before vectorizing them, creating vectors tailored for retrieval/search tasks:
'query': Prepends "Represent the query for retrieving supporting documents: "
'document': Prepends "Represent the document for retrieval: "
outputEncoding 'base64'
The data encoding for the resulting output embeddings. Defaults to null (list of 32-bit floats).
null: Embeddings are returned as a list of floating-point numbers (float32).
'base64': Embeddings are returned as a Base64-encoded NumPy array of single-precision floats.
See FAQ: Output Data Types for more details.
truncation boolean
Whether to truncate the inputs to fit within the model's context length. If not specified, defaults to true.
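If a Base64-encoded embedding has to be turned back into numbers by hand, the decoding could look roughly like the sketch below. It assumes the payload is the raw float32 buffer in little-endian byte order, which is the usual NumPy layout but is not stated in this document. `decodeBase64Embedding` is a hypothetical helper, not part of voyage-ai-provider.

```typescript
// Decodes a Base64 string of packed float32 values into a number array.
// ASSUMPTION: values are stored as little-endian 32-bit floats.
function decodeBase64Embedding(encoded: string): number[] {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  if (bytes.length % 4 !== 0) {
    throw new Error('Byte length is not a multiple of 4');
  }

  const view = new DataView(bytes.buffer);
  const values: number[] = [];
  for (let offset = 0; offset < bytes.length; offset += 4) {
    // `true` selects little-endian byte order.
    values.push(view.getFloat32(offset, true));
  }
  return values;
}
```

For voyage-multimodal-3, a decoded embedding would be expected to contain 1024 values, which makes a simple sanity check on the result.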
| Model | Context Length (tokens) | Embedding Dimension |
|---|---|---|
| voyage-multimodal-3 | 32,000 | 1024 |