Embeddings Overview

Mistral.rs can load embedding models alongside chat, multimodal, diffusion, and speech workloads. Embedding models produce dense vector representations that you can use for similarity search, clustering, reranking, and other semantic tasks.
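As a rough illustration of the similarity-search use case, the sketch below ranks a few documents against a query by cosine similarity. It does not call mistral.rs: the vectors are toy 3-dimensional stand-ins for real model output, and the helper names `cosine_similarity` and `rank_by_similarity` are hypothetical, chosen only for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, documents):
    """Return (index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query, d)) for i, d in enumerate(documents)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for embeddings (assumption: real ones are much longer).
query = [1.0, 0.0, 1.0]
documents = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [1.0, 0.0, 0.9],  # nearly parallel to the query
    [0.5, 0.5, 0.5],  # partially aligned with the query
]

ranking = rank_by_similarity(query, documents)
for index, score in ranking:
    print(f"document {index}: {score:.4f}")
```

With real embeddings the same ranking step applies unchanged; only the vector dimension and the source of the vectors differ.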

Supported models

Model	Notes	Documentation
EmbeddingGemma	Google’s multilingual embedding model.	EMBEDDINGGEMMA.md
Qwen3 Embedding	Qwen’s general-purpose embedding encoder.	QWEN3_EMBEDDING.md

Have another embedding model you would like supported? Open an issue with the model ID and configuration.

Usage overview

1. Choose a model from the table above.
2. Load it through one of our APIs:
   - CLI/HTTP
   - Python
   - Rust

Detailed examples for each model live in their dedicated documentation pages.
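For orientation before turning to those pages, here is a minimal sketch of what a request over the HTTP route might look like. It rests on several assumptions that are not stated on this page: that the server exposes an OpenAI-style `/v1/embeddings` endpoint, that it listens on `http://localhost:1234`, and that the response follows the OpenAI layout (`data[i].embedding` with an `index` field). The model ID is illustrative only, and the helper names are hypothetical. Check the model-specific documentation for the actual commands and parameters.

```python
import json
import urllib.request


def build_embedding_request(base_url, model, inputs):
    """Build (but do not send) a POST request for an OpenAI-style embeddings route.

    The "/v1/embeddings" path and the body fields are assumptions.
    """
    body = json.dumps({"model": model, "input": inputs}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_embeddings(response_json):
    """Extract vectors in input order from an OpenAI-style response (assumed shape)."""
    items = sorted(response_json["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(base_url, model, inputs):
    """Send the request and return one vector per input. Requires a running server."""
    request = build_embedding_request(base_url, model, inputs)
    with urllib.request.urlopen(request) as response:
        return parse_embeddings(json.load(response))


# Illustrative values only; the address and model ID are assumptions.
request = build_embedding_request(
    "http://localhost:1234",
    "google/embeddinggemma-300m",
    ["first sentence", "second sentence"],
)
print(request.get_method(), request.full_url)
```

Building the request separately from sending it keeps the payload easy to inspect and adjust if the server expects different fields.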