LlamaIndex Embeddings Integration: VoyageAI

llama-index-integrations/embeddings/llama-index-embeddings-voyageai/README.md

The llama-index-embeddings-voyageai package contains LlamaIndex integrations for building applications with VoyageAI's state-of-the-art embedding models. This integration provides support for text embeddings, multimodal embeddings, and contextual embeddings via the VoyageAI API.

Installation

shell
pip install llama-index-embeddings-voyageai

Setup

1. Get Your API Key

Sign up for a VoyageAI account and obtain your API key from the VoyageAI Dashboard.

2. Set Environment Variable

Export your API key as an environment variable:

bash
export VOYAGE_API_KEY="your-api-key-here"
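Before constructing the embedding model, it can help to fail fast when no key is available. The following is a minimal sketch, not part of the integration: `resolve_voyage_api_key` is a hypothetical helper that mirrors the fallback order described in this README (an explicitly passed key first, then the `VOYAGE_API_KEY` environment variable).

```python
import os
from typing import Mapping, Optional


def resolve_voyage_api_key(
    explicit_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper: return the key to use, or raise if none is set."""
    # Default to the real process environment; tests can pass a plain dict.
    env = os.environ if env is None else env
    key = explicit_key or env.get("VOYAGE_API_KEY")
    if not key:
        raise RuntimeError(
            "No VoyageAI API key found: pass one explicitly or set VOYAGE_API_KEY."
        )
    return key
```

The returned value could then be passed as `voyage_api_key` when creating `VoyageEmbedding`.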

Usage

Basic Usage

python
from llama_index.embeddings.voyageai import VoyageEmbedding

# Initialize the VoyageAI Embedding model
embedding_model = VoyageEmbedding(
    model_name="voyage-3.5",
    voyage_api_key="your-api-key",  # Optional if VOYAGE_API_KEY is set
)

# Get a single embedding
embedding = embedding_model.get_text_embedding("Your text here")
print(f"Embedding dimension: {len(embedding)}")

# Get embeddings for multiple texts
texts = ["Text 1", "Text 2", "Text 3"]
embeddings = embedding_model.get_text_embedding_batch(texts)
print(f"Number of embeddings: {len(embeddings)}")

Query vs Document Embeddings

VoyageAI embeddings distinguish between queries and documents for optimal retrieval performance:

python
from llama_index.embeddings.voyageai import VoyageEmbedding

embedding_model = VoyageEmbedding(model_name="voyage-3.5")

# Get query embedding (automatically uses input_type="query")
query_embedding = embedding_model.get_query_embedding(
    "What is machine learning?"
)

# Get document embedding (automatically uses input_type="document")
doc_embedding = embedding_model.get_text_embedding("Machine learning is...")
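Once a query embedding and several document embeddings are available, retrieval usually comes down to ranking documents by similarity to the query. The sketch below uses plain Python and toy 3-dimensional vectors in place of real embeddings; `cosine_similarity` and `rank_documents` are illustrative helpers, not functions provided by the integration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("Vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("Cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vec: Sequence[float], doc_vecs: Sequence[Sequence[float]]
) -> List[int]:
    """Return document indices sorted from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real query and document embeddings
query_vec = [1.0, 0.0, 0.0]
doc_vecs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0], [0.5, 0.5, 0.0]]
ranking = rank_documents(query_vec, doc_vecs)
print(ranking)  # [1, 2, 0]
```

In practice a vector store performs this ranking, so a helper like this is mainly useful for quick sanity checks.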

Advanced Parameters

python
from llama_index.embeddings.voyageai import VoyageEmbedding

embedding_model = VoyageEmbedding(
    model_name="voyage-3.5",
    voyage_api_key="your-api-key",
    truncation=True,  # Enable text truncation
    output_dtype="float",  # Options: "float", "int8", "uint8", "binary", "ubinary"
    output_dimension=512,  # Reduce dimensionality (256, 512, 1024, 2048)
    embed_batch_size=128,  # Batch size for processing
)

# Use general text embedding with custom input type
embedding = embedding_model.get_general_text_embedding(
    "Your text here", input_type="query"
)
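The `output_dtype` and `output_dimension` settings mostly matter for storage cost. The rough estimate below is a sketch under stated assumptions: "float" is stored as 32-bit values, "int8" and "uint8" take one byte per dimension, and "binary" and "ubinary" are bit-packed at eight dimensions per byte. `bytes_per_embedding` is a hypothetical helper, not part of the integration.

```python
def bytes_per_embedding(output_dimension: int, output_dtype: str = "float") -> int:
    """Estimate the raw storage size of one embedding vector in bytes."""
    # Assumed storage cost in bits per dimension for each output_dtype
    bits_per_dim = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}
    if output_dtype not in bits_per_dim:
        raise ValueError(f"Unknown output_dtype: {output_dtype!r}")
    if output_dimension <= 0:
        raise ValueError("output_dimension must be positive")
    total_bits = output_dimension * bits_per_dim[output_dtype]
    return (total_bits + 7) // 8  # round up to whole bytes


for dtype in ("float", "int8", "binary"):
    print(dtype, bytes_per_embedding(1024, dtype))
```

Under these assumptions, moving from 1024-dimensional float vectors to 512-dimensional int8 vectors cuts raw storage by a factor of eight; the effect on retrieval quality depends on the model and data and should be measured.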

Multimodal Embeddings

VoyageAI supports multimodal embeddings for text and images with voyage-multimodal-3, and text, images, and video with voyage-multimodal-3.5. Important: You must set truncation=True when using multimodal models.

python
from llama_index.embeddings.voyageai import VoyageEmbedding
from io import BytesIO

# Initialize with multimodal model (truncation=True is REQUIRED)
embedding_model = VoyageEmbedding(
    model_name="voyage-multimodal-3",  # or "voyage-multimodal-3.5" for video support
    truncation=True,  # Required for multimodal models
)

# Embed an image from file path (PNG, JPEG, JPG, WEBP, GIF supported)
image_embedding = embedding_model.get_image_embedding("path/to/image.jpg")
print(f"Image embedding dimension: {len(image_embedding)}")  # 1024

# Embed an image from BytesIO
with open("path/to/image.png", "rb") as f:
    image_data = BytesIO(f.read())
    image_embedding = embedding_model.get_image_embedding(image_data)

# The multimodal model also works with text
text_embedding = embedding_model.get_text_embedding("Description of the image")
query_embedding = embedding_model.get_query_embedding(
    "Find images with red color"
)

# Batch text embeddings
batch_embeddings = embedding_model.get_text_embedding_batch(
    ["Image description 1", "Image description 2", "Image description 3"]
)

Video Embeddings (voyage-multimodal-3.5 only)

python
from llama_index.embeddings.voyageai import VoyageEmbedding

# Initialize with voyage-multimodal-3.5 for video support
embedding_model = VoyageEmbedding(
    model_name="voyage-multimodal-3.5",
    truncation=True,
)

# Embed a single video (max 20MB, supports MP4, MPEG, MOV, AVI, FLV, MPG, WEBM, WMV, 3GP)
video_embedding = embedding_model.get_video_embedding("path/to/video.mp4")
print(f"Video embedding dimension: {len(video_embedding)}")  # 1024

# Embed multiple videos
video_embeddings = embedding_model.get_video_embeddings(
    ["video1.mp4", "video2.mp4", "video3.mp4"]
)

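# Async video embedding (await is only valid inside a coroutine)
import asyncio


async def embed_video_async():
    return await embedding_model.aget_video_embedding("path/to/video.mp4")


video_embedding = asyncio.run(embed_video_async())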
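Since video input is limited by size and format, a local check before uploading can save a failed request. This is a minimal sketch: `check_video_file` is a hypothetical helper, the format list is copied from the comment in the example above, and "20MB" is read here as 20 MiB, which is an assumption.

```python
from pathlib import Path

# Formats listed in the video example above
SUPPORTED_VIDEO_SUFFIXES = {
    ".mp4", ".mpeg", ".mov", ".avi", ".flv", ".mpg", ".webm", ".wmv", ".3gp",
}
MAX_VIDEO_BYTES = 20 * 1024 * 1024  # Assumption: "20MB" interpreted as 20 MiB


def check_video_file(path: str) -> Path:
    """Hypothetical pre-flight check: validate extension, existence, and size."""
    video = Path(path)
    if video.suffix.lower() not in SUPPORTED_VIDEO_SUFFIXES:
        raise ValueError(f"Unsupported video format: {video.suffix or '(none)'}")
    if not video.is_file():
        raise FileNotFoundError(path)
    size = video.stat().st_size
    if size > MAX_VIDEO_BYTES:
        raise ValueError(f"Video is {size} bytes; limit is {MAX_VIDEO_BYTES}")
    return video
```

A path that passes this check could then be handed to `get_video_embedding`; the API may still reject a file for reasons this check does not cover.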

Contextual Embeddings

Use the voyage-context-3 model for context-aware embeddings of document chunks:

python
from llama_index.embeddings.voyageai import VoyageEmbedding

# Initialize with contextual model
embedding_model = VoyageEmbedding(
    model_name="voyage-context-3", output_dtype="float", output_dimension=1024
)

# The model will use contextualized_embed internally
# providing enhanced embeddings with better context understanding
embeddings = embedding_model.get_text_embedding_batch(
    ["First document chunk", "Second document chunk", "Third document chunk"]
)

Async Usage

The integration supports async operations, so embedding calls can run without blocking the event loop:

python
import asyncio
from llama_index.embeddings.voyageai import VoyageEmbedding


async def get_embeddings_async():
    # Regular text embeddings
    embedding_model = VoyageEmbedding(model_name="voyage-3.5")

    # Get async query embedding
    query_embedding = await embedding_model.aget_query_embedding("Your query")

    # Get async text embeddings
    embeddings = await embedding_model.aget_text_embedding_batch(
        ["Text 1", "Text 2", "Text 3"]
    )

    # For multimodal image embeddings
    multimodal_model = VoyageEmbedding(
        model_name="voyage-multimodal-3",
        truncation=True,  # Required for multimodal
    )
    image_embedding = await multimodal_model.aget_image_embedding(
        "path/to/image.jpg"
    )

    return query_embedding, embeddings, image_embedding


# Run async function
results = asyncio.run(get_embeddings_async())
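Async methods pay off most when several independent calls run concurrently. The sketch below shows one common pattern, `asyncio.gather` with a semaphore to cap the number of in-flight requests. `embed_concurrently` is a hypothetical helper, and `fake_embed` is a stand-in coroutine so the example runs without an API key.

```python
import asyncio
from typing import Awaitable, Callable, List


async def embed_concurrently(
    embed_one: Callable[[str], Awaitable[List[float]]],
    texts: List[str],
    max_concurrency: int = 4,
) -> List[List[float]]:
    """Run embed_one over texts with at most max_concurrency calls in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(text: str) -> List[float]:
        async with semaphore:
            return await embed_one(text)

    # gather returns results in the same order as the inputs
    return await asyncio.gather(*(guarded(t) for t in texts))


async def fake_embed(text: str) -> List[float]:
    """Stand-in for a real async embedding call; returns a dummy vector."""
    await asyncio.sleep(0.01)
    return [float(len(text))]


results = asyncio.run(embed_concurrently(fake_embed, ["a", "bb", "ccc"]))
print(results)  # [[1.0], [2.0], [3.0]]
```

With a real model, a bound method such as `embedding_model.aget_query_embedding` could be passed as `embed_one`. For plain lists of texts, `aget_text_embedding_batch` is likely the simpler choice.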

Integration with LlamaIndex

python
from llama_index.core import VectorStoreIndex, Settings, Document
from llama_index.embeddings.voyageai import VoyageEmbedding
from llama_index.llms.openai import OpenAI

# Configure LlamaIndex settings
Settings.llm = OpenAI()
Settings.embed_model = VoyageEmbedding(
    model_name="voyage-3.5", voyage_api_key="your-api-key"
)

# Create documents
documents = [
    Document(text="LlamaIndex is a data framework for LLM applications."),
    Document(text="VoyageAI provides state-of-the-art embedding models."),
    Document(text="Embeddings convert text into numerical vectors."),
]

# Create vector index
index = VectorStoreIndex.from_documents(documents)

# Query the index
query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("What is LlamaIndex?")
print(response)

Available Models

VoyageAI offers several specialized embedding models:

Text Embeddings

  • voyage-4: General-purpose and multilingual retrieval with 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-4-lite: Cost and latency optimized with highest throughput, 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-4-large: Best retrieval quality in voyage-4 series, 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-3.5: Latest general-purpose model with 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-3.5-lite: Cost and latency optimized variant with 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-3-large: Best for general-purpose and multilingual retrieval, 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-code-3: Specialized for code retrieval, 1024 dimensions (supports 256, 512, 1024, 2048)
  • voyage-3: General-purpose model (1024 dimensions)
  • voyage-3-lite: Lightweight variant (512 dimensions)

Domain-Specific Models

  • voyage-finance-2: Optimized for financial documents (1024 dimensions)
  • voyage-law-2: Specialized for legal documents (1024 dimensions)
  • voyage-multilingual-2: Enhanced multilingual support (1024 dimensions)

Specialized Models

  • voyage-multimodal-3: Supports text and image embeddings (1024 dimensions)
  • voyage-multimodal-3.5: Supports text, image, and video embeddings (1024 dimensions, supports 256, 512, 2048). Currently in preview.
  • voyage-context-3: Enhanced contextual embeddings with 32K batch token limit (1024 dimensions)

Legacy Models

  • voyage-2: Earlier generation model (1024 dimensions)
  • voyage-large-2: Large variant (1536 dimensions)
  • voyage-large-2-instruct: Large instruct variant (1024 dimensions)
  • voyage-code-2: Code embedding model (1536 dimensions)

For the latest model information, visit the VoyageAI documentation.

Configuration Options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| model_name | str | Required | The embedding model to use |
| voyage_api_key | str | None | VoyageAI API key (falls back to VOYAGE_API_KEY env var) |
| embed_batch_size | int | 1000 | Batch size for embedding calls (max 1000) |
| truncation | bool | None | Enable text truncation for long inputs |
| output_dtype | str | None | Output format: "float", "int8", "uint8", "binary", "ubinary" |
| output_dimension | int | None | Reduce dimensionality (256, 512, 1024, 2048, model-dependent) |
| callback_manager | CallbackManager | None | LlamaIndex callback manager for observability |

Features

  • Dynamic Batching: Automatically batches requests based on token limits for each model
  • Token Management: Respects per-model token limits (ranging from 32K to 1M tokens)
  • Multimodal Support: Process text, images, and videos with multimodal models
  • Video Embeddings: Embed video content with voyage-multimodal-3.5 (requires voyageai>=0.3.6)
  • Contextual Embeddings: Enhanced context-aware embeddings with specialized models
  • Async Support: Full async/await support for better performance
  • Flexible Output: Support for various output data types and dimensions
  • Auto-truncation: Optional text truncation for inputs exceeding model limits

API Batch Token Limits

These limits represent the maximum total tokens that can be sent in a single API request (across all texts in the batch):

| Model | Batch Token Limit |
| --- | --- |
| voyage-4-lite | 1,000,000 |
| voyage-3.5-lite | 1,000,000 |
| voyage-4 | 320,000 |
| voyage-3.5 | 320,000 |
| voyage-multimodal-3 | 320,000 |
| voyage-multimodal-3.5 | 320,000 |
| voyage-2 | 320,000 |
| voyage-4-large | 120,000 |
| voyage-3-large | 120,000 |
| voyage-code-3 | 120,000 |
| voyage-large-2-instruct | 120,000 |
| voyage-finance-2 | 120,000 |
| voyage-multilingual-2 | 120,000 |
| voyage-law-2 | 120,000 |
| voyage-large-2 | 120,000 |
| voyage-3 | 120,000 |
| voyage-3-lite | 120,000 |
| voyage-code-2 | 120,000 |
| voyage-context-3 | 32,000 |

Note: The maximum batch size is 1,000 items per API request. The integration automatically handles batching based on both token limits and batch size.
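To make the two constraints concrete, here is a minimal sketch of greedy batching that respects both a token budget and a maximum item count. It is not the integration's actual implementation: `batch_by_tokens` is a hypothetical helper, and the default whitespace-based token counter is a crude stand-in for a real model tokenizer.

```python
from typing import Callable, Iterator, List


def batch_by_tokens(
    texts: List[str],
    token_limit: int,
    max_batch_size: int = 1000,
    count_tokens: Callable[[str], int] = lambda t: len(t.split()),
) -> Iterator[List[str]]:
    """Yield batches whose total tokens and item counts stay within limits."""
    batch: List[str] = []
    batch_tokens = 0
    for text in texts:
        n = count_tokens(text)
        if n > token_limit:
            raise ValueError("A single text exceeds the batch token limit")
        # Close the current batch if adding this text would break either limit
        if batch and (batch_tokens + n > token_limit or len(batch) >= max_batch_size):
            yield batch
            batch, batch_tokens = [], 0
        batch.append(text)
        batch_tokens += n
    if batch:
        yield batch


texts = ["a b c", "d e", "f g h i", "j"]
print(list(batch_by_tokens(texts, token_limit=5)))
# [['a b c', 'd e'], ['f g h i', 'j']]
```

A greedy pass keeps input order intact, which makes it easy to map results back to their texts; it does not try to pack batches optimally.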

Environment Variables

| Variable | Description |
| --- | --- |
| VOYAGE_API_KEY | VoyageAI API key (required unless voyage_api_key is passed to VoyageEmbedding) |

Error Handling

The integration includes proper error handling for:

  • Missing or invalid API keys
  • Unsupported image formats (for multimodal models)
  • Invalid model selection
  • Network errors and API failures
  • Token limit violations
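Transient failures such as network errors are often handled on the caller's side with retries. The sketch below shows a generic exponential backoff wrapper. `call_with_retries` is a hypothetical helper, and it catches `Exception` broadly only because this README does not name the exception types the integration raises; real code should catch narrower types and avoid retrying permanent errors such as an invalid API key.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on failure with delays of base_delay * 2**k seconds."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:  # Assumption: narrow this to retryable errors in real code
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

A call could then be wrapped as `call_with_retries(lambda: embedding_model.get_text_embedding("Your text here"))`.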

Additional Information

For more information about VoyageAI and its embedding models, see the VoyageAI documentation.

License

This project is licensed under the MIT License.