LiteLLM embeddings & speech-to-text - CocoIndex

The cocoindex.ops.litellm module provides integration with the LiteLLM library for text embeddings (LiteLLMEmbedder) and speech-to-text transcription (LiteLLMTranscriber).

python

from cocoindex.ops.litellm import LiteLLMEmbedder, LiteLLMTranscriber

:::note[Dependencies]
This module requires additional dependencies. Install with:

bash

# Quotes keep shells such as zsh from treating the brackets as a glob pattern
pip install "cocoindex[litellm]"

:::

Overview

The LiteLLMEmbedder class is a wrapper around LiteLLM's embedding API that:

Implements VectorSchemaProvider for seamless integration with CocoIndex connectors
Supports 100+ embedding providers (OpenAI, Azure, Vertex AI, Cohere, Bedrock, etc.) through a unified API
Provides a simple async embed() method
Passes through all additional arguments to the LiteLLM embedding API
Returns properly typed numpy arrays

Basic usage

Creating an embedder

All extra keyword arguments are passed through to every litellm.aembedding call. See Supported providers for provider-specific model strings and configuration.

python

from cocoindex.ops.litellm import LiteLLMEmbedder

embedder = LiteLLMEmbedder("text-embedding-3-small")

# With explicit API key and base URL
embedder = LiteLLMEmbedder("text-embedding-3-small", api_key="sk-...", api_base="https://my-proxy.example.com")

# With custom dimensions (OpenAI text-embedding-3 models)
embedder = LiteLLMEmbedder("text-embedding-3-small", dimensions=512)

# With a timeout (seconds)
embedder = LiteLLMEmbedder("text-embedding-3-small", timeout=30)

Embedding text

The embed() method converts text into a numpy.ndarray of float32. It's an async method — use await when calling it:

python

# In a CocoIndex function
embedding = await embedder.embed("Hello, world!")

# Use the embedding in a dataclass row, store in a vector database, etc.
table.declare_row(row=DocEmbedding(text="Hello, world!", embedding=embedding))
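Because embed() is a coroutine, several texts can be embedded concurrently with asyncio.gather. The sketch below shows only that pattern. StubEmbedder is a hypothetical stand-in that mimics the async embed() call shape without contacting any provider, and it returns a plain list rather than a numpy array. In a real pipeline you would pass a LiteLLMEmbedder instead.

```python
import asyncio


class StubEmbedder:
    """Hypothetical stand-in with the same async embed() call shape.

    It returns a tiny deterministic vector so the example runs offline.
    """

    async def embed(self, text: str) -> list[float]:
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return [float(len(text)), float(text.count(" "))]


async def embed_all(embedder, texts: list[str]) -> list[list[float]]:
    # gather() runs the embed() coroutines concurrently and returns
    # the results in the same order as the inputs.
    return await asyncio.gather(*(embedder.embed(t) for t in texts))


vectors = asyncio.run(embed_all(StubEmbedder(), ["Hello, world!", "Hi"]))
print(vectors)
```

With a real provider, unbounded concurrency may run into rate limits; wrapping each call in an asyncio.Semaphore is one common way to cap the number of in-flight requests.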

Using as a type annotation

LiteLLMEmbedder implements VectorSchemaProvider, so an instance can be used directly as metadata in Annotated type annotations. This is the recommended way to declare vector columns: CocoIndex connectors extract the vector dimension and dtype from the annotation when creating tables.

python

from dataclasses import dataclass
from typing import Annotated
from numpy.typing import NDArray

from cocoindex.ops.litellm import LiteLLMEmbedder

embedder = LiteLLMEmbedder("text-embedding-3-small")

@dataclass
class DocEmbedding:
    id: int
    filename: str
    text: str
    embedding: Annotated[NDArray, embedder]

When you pass this dataclass to a connector's TableSchema.from_class(), the connector automatically reads the embedder annotation to determine the vector column's dimension and dtype. For example, with Postgres:

python

from cocoindex.connectors import postgres

table_schema = await postgres.TableSchema.from_class(
    DocEmbedding,
    primary_key=["id"],
)
target_table = await postgres.mount_table_target(
    PG_DB,
    table_name="doc_embeddings",
    table_schema=table_schema,
    pg_schema_name="my_schema",
)

The connector creates the appropriate vector(N) column, where N is the embedding dimension read from the annotation. See the Connectors docs for other supported backends (LanceDB, Qdrant, SQLite).
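To make the annotation mechanism more concrete, here is a minimal sketch of how metadata attached through Annotated can be read back with the standard typing helpers. It is not CocoIndex's implementation: StubVectorInfo, its dimension and dtype attributes, and vector_columns() are hypothetical names chosen for illustration, and 1536 is just an example dimension.

```python
from dataclasses import dataclass
from typing import Annotated, get_args, get_origin, get_type_hints


class StubVectorInfo:
    """Hypothetical metadata object carrying a vector's dimension and dtype."""

    def __init__(self, dimension: int, dtype: str = "float32") -> None:
        self.dimension = dimension
        self.dtype = dtype


@dataclass
class Doc:
    id: int
    text: str
    embedding: Annotated[list[float], StubVectorInfo(1536)]


def vector_columns(cls) -> dict[str, tuple[int, str]]:
    """Map each annotated vector field to its (dimension, dtype)."""
    # include_extras=True keeps the Annotated[...] wrapper instead of stripping it.
    hints = get_type_hints(cls, include_extras=True)
    columns = {}
    for name, hint in hints.items():
        if get_origin(hint) is Annotated:
            # get_args() yields (base_type, metadata_1, metadata_2, ...).
            for meta in get_args(hint)[1:]:
                if isinstance(meta, StubVectorInfo):
                    columns[name] = (meta.dimension, meta.dtype)
    return columns


print(vector_columns(Doc))
```

A schema builder working along these lines could turn the discovered dimension into a column type such as vector(1536), which is why the embedder instance in the annotation is enough to describe the column.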

Speech-to-text

The LiteLLMTranscriber class is a wrapper around LiteLLM's transcription API that turns audio into text through any LiteLLM-supported speech-to-text provider (OpenAI Whisper, ElevenLabs, Groq, and more) using a unified interface.

Creating a transcriber

All extra keyword arguments are passed through to every litellm.atranscription call (e.g., api_key, api_base, language, extra_body).

python

from cocoindex.ops.litellm import LiteLLMTranscriber

transcriber = LiteLLMTranscriber("whisper-1")

# With an explicit API key
transcriber = LiteLLMTranscriber("whisper-1", api_key="sk-...")

# With a default language hint applied to every call
transcriber = LiteLLMTranscriber("whisper-1", language="en")

Transcribing audio

The transcribe() method takes a FileLike object (such as a localfs.File) containing audio data and returns the transcribed text as a str. It's an async method — use await when calling it. Per-call keyword arguments override the defaults provided at construction time:

python

# In a CocoIndex function, `file` is a FileLike (e.g. localfs.File)
transcript = await transcriber.transcribe(file)

# Override or add per-call options
transcript = await transcriber.transcribe(file, response_format="verbose_json")
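The precedence rule described above (per-call keyword arguments win over constructor defaults) behaves like a dictionary merge. The snippet below illustrates only that rule. merge_call_kwargs() is a hypothetical helper, not part of cocoindex or LiteLLM, and the library's internal implementation may differ.

```python
def merge_call_kwargs(defaults: dict, overrides: dict) -> dict:
    """Return the keyword arguments for one call; per-call values win."""
    # Later entries in a dict display replace earlier ones with the same key.
    return {**defaults, **overrides}


# Defaults as they might be given to the constructor
constructor_kwargs = {"language": "en", "api_key": "sk-..."}

# Options passed to a single transcribe() call
per_call_kwargs = {"language": "de", "response_format": "verbose_json"}

effective = merge_call_kwargs(constructor_kwargs, per_call_kwargs)
print(effective)
```

Here the call would run with language="de" while still using the constructor's api_key, and the constructor defaults remain unchanged for later calls.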

A complete pipeline that walks local audio files, transcribes each one, and stores the transcripts in Postgres is available in the audio_to_text example.

Supported transcription providers

LiteLLMTranscriber accepts any model string supported by LiteLLM's transcription API. A few common examples:

Model	Model string	Environment variables
OpenAI Whisper	`whisper-1`	`OPENAI_API_KEY`
ElevenLabs Scribe	`elevenlabs/scribe_v1`	`ELEVENLABS_API_KEY`
Groq Whisper Large V3	`groq/whisper-large-v3`	`GROQ_API_KEY`

See the LiteLLM audio transcription docs for the full list of providers and model strings.

Supported providers

Below are common embedding providers with their model strings and configuration. The litellm module is re-exported from cocoindex.ops.litellm so that you can set provider-specific variables on it. See the LiteLLM embedding docs for the full list.

Ollama

Model	Model string
Nomic Embed Text	`ollama/nomic-embed-text`
MXBai Embed Large	`ollama/mxbai-embed-large`
All MiniLM	`ollama/all-minilm`
Snowflake Arctic Embed	`ollama/snowflake-arctic-embed`
BGE M3	`ollama/bge-m3`

No API key required. Ollama must be running locally (default http://localhost:11434). Pull the model first with ollama pull <model-name>.

python

embedder = LiteLLMEmbedder("ollama/nomic-embed-text", api_base="http://localhost:11434")

OpenAI

Model	Model string
Text Embedding 3 Small	`text-embedding-3-small`
Text Embedding 3 Large	`text-embedding-3-large`
Text Embedding Ada 002	`text-embedding-ada-002`

Environment variables: OPENAI_API_KEY

python

embedder = LiteLLMEmbedder("text-embedding-3-small")

Azure OpenAI

Model	Model string
Text Embedding 3 Small	`azure/<your-deployment-name>`
Text Embedding Ada 002	`azure/<your-deployment-name>`

The model string uses your Azure deployment name, not the OpenAI model name.

Environment variables: AZURE_API_KEY, AZURE_API_BASE, AZURE_API_VERSION

python

embedder = LiteLLMEmbedder(
    "azure/my-deployment-name",
    api_key="your-azure-api-key",
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
)

Gemini (Google AI Studio)

Model	Model string
Text Embedding 004	`gemini/text-embedding-004`

Environment variables: GEMINI_API_KEY

python

embedder = LiteLLMEmbedder("gemini/text-embedding-004")

Vertex AI

Model	Model string
Text Embedding 004	`vertex_ai/text-embedding-004`
Text Multilingual Embedding 002	`vertex_ai/text-multilingual-embedding-002`
Textembedding Gecko	`vertex_ai/textembedding-gecko`

Environment variables: GOOGLE_APPLICATION_CREDENTIALS (path to service account JSON)

Additional configuration: Set the project and location either on the litellm module or through the environment variables VERTEXAI_PROJECT and VERTEXAI_LOCATION:

python

from cocoindex.ops.litellm import LiteLLMEmbedder, litellm

litellm.vertex_project = "my-gcp-project"
litellm.vertex_location = "us-central1"

embedder = LiteLLMEmbedder("vertex_ai/text-embedding-004")

AWS Bedrock

Model	Model string
Titan Text Embeddings V2	`bedrock/amazon.titan-embed-text-v2:0`
Titan Text Embeddings V1	`bedrock/amazon.titan-embed-text-v1`
Cohere Embed English	`bedrock/cohere.embed-english-v3`
Cohere Embed Multilingual	`bedrock/cohere.embed-multilingual-v3`

Environment variables: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION_NAME

python

embedder = LiteLLMEmbedder("bedrock/amazon.titan-embed-text-v2:0")

Mistral AI

Model	Model string
Mistral Embed	`mistral/mistral-embed`

Environment variables: MISTRAL_API_KEY

python

embedder = LiteLLMEmbedder("mistral/mistral-embed")

Voyage AI

Model	Model string
Voyage 3.5	`voyage/voyage-3.5`
Voyage 3.5 Lite	`voyage/voyage-3.5-lite`
Voyage Code 3	`voyage/voyage-code-3`

Environment variables: VOYAGE_API_KEY

python

embedder = LiteLLMEmbedder("voyage/voyage-3.5")

Cohere

Model	Model string
Embed English V3	`cohere/embed-english-v3.0`
Embed English Light V3	`cohere/embed-english-light-v3.0`
Embed Multilingual V3	`cohere/embed-multilingual-v3.0`

Environment variables: COHERE_API_KEY

Additional configuration: V3 models require an input_type parameter. When omitted, it defaults to "search_document"; use "search_query" when embedding search queries:

python

embedder = LiteLLMEmbedder("cohere/embed-english-v3.0", input_type="search_document")

Nebius AI

Model	Model string
BGE EN ICL	`nebius/BAAI/bge-en-icl`

Environment variables: NEBIUS_API_KEY

python

embedder = LiteLLMEmbedder("nebius/BAAI/bge-en-icl")
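Since each provider reads its credentials from different environment variables, a small preflight check can report what is missing before a pipeline starts. The sketch below uses only the standard library and copies the variable names from the provider sections above. REQUIRED_ENV and missing_env_vars() are hypothetical helpers, not part of cocoindex or LiteLLM, and treating an unprefixed model string as OpenAI is an assumption based on the examples in this page.

```python
import os
from typing import Mapping

# Provider prefix -> environment variables listed in the sections above.
# The empty prefix stands for model strings without a provider prefix,
# assumed here to be OpenAI models.
REQUIRED_ENV: dict[str, tuple[str, ...]] = {
    "": ("OPENAI_API_KEY",),
    "ollama": (),  # no API key required
    "azure": ("AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"),
    "gemini": ("GEMINI_API_KEY",),
    "vertex_ai": ("GOOGLE_APPLICATION_CREDENTIALS",),
    "bedrock": ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"),
    "mistral": ("MISTRAL_API_KEY",),
    "voyage": ("VOYAGE_API_KEY",),
    "cohere": ("COHERE_API_KEY",),
    "nebius": ("NEBIUS_API_KEY",),
}


def missing_env_vars(model: str, env: Mapping[str, str] = os.environ) -> list[str]:
    """List the variables for the model's provider that are unset or empty.

    Unknown provider prefixes yield an empty list, because this sketch has
    no information about them.
    """
    prefix = model.split("/", 1)[0] if "/" in model else ""
    required = REQUIRED_ENV.get(prefix, ())
    return [name for name in required if not env.get(name)]


# Explicit mappings are passed here so the output does not depend on your shell.
print(missing_env_vars("azure/my-deployment-name", {"AZURE_API_KEY": "k"}))
print(missing_env_vars("ollama/nomic-embed-text", {}))
```

A missing variable is not necessarily an error: as shown earlier, credentials such as api_key and api_base can also be passed as keyword arguments when constructing the embedder.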