Quickstart
==========

Sentence Transformer
--------------------

Sentence Transformer (a.k.a. bi-encoder) models compute a fixed-size vector representation (embedding) for texts or images, placing semantically similar inputs close together in the embedding space.
Once you have `installed <installation.html>`_ Sentence Transformers, you can easily use Sentence Transformer models:
.. sidebar:: Documentation

   1. :class:`SentenceTransformer <sentence_transformers.sentence_transformer.model.SentenceTransformer>`
   2. :meth:`SentenceTransformer.encode <sentence_transformers.sentence_transformer.model.SentenceTransformer.encode>`
   3. :meth:`SentenceTransformer.encode_query <sentence_transformers.sentence_transformer.model.SentenceTransformer.encode_query>`
   4. :meth:`SentenceTransformer.encode_document <sentence_transformers.sentence_transformer.model.SentenceTransformer.encode_document>`
   5. :meth:`SentenceTransformer.similarity <sentence_transformers.sentence_transformer.model.SentenceTransformer.similarity>`

   Other useful methods and links:

   - :meth:`SentenceTransformer.similarity_pairwise <sentence_transformers.sentence_transformer.model.SentenceTransformer.similarity_pairwise>`
   - `SentenceTransformer > Usage <./sentence_transformer/usage/usage.html>`_
   - `SentenceTransformer > Usage > Speeding up Inference <./sentence_transformer/usage/efficiency.html>`_
   - `SentenceTransformer > Pretrained Models <./sentence_transformer/pretrained_models.html>`_
   - `SentenceTransformer > Training Overview <./sentence_transformer/training_overview.html>`_
   - `SentenceTransformer > Dataset Overview <./sentence_transformer/dataset_overview.html>`_
   - `SentenceTransformer > Loss Overview <./sentence_transformer/loss_overview.html>`_
   - `SentenceTransformer > Training Examples <./sentence_transformer/training/examples.html>`_

.. tab:: Text
   .. code-block:: python

      from sentence_transformers import SentenceTransformer

      # 1. Load a pretrained Sentence Transformer model
      model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

      # The sentences to encode
      sentences = [
          "The weather is lovely today.",
          "It's so sunny outside!",
          "He drove to the stadium.",
      ]

      # 2. Calculate embeddings by calling model.encode()
      embeddings = model.encode(sentences)
      print(embeddings.shape)
      # [3, 384]

      # 3. Calculate the embedding similarities
      similarities = model.similarity(embeddings, embeddings)
      print(similarities)
      # tensor([[1.0000, 0.6660, 0.1046],
      #         [0.6660, 1.0000, 0.1411],
      #         [0.1046, 0.1411, 1.0000]])
.. tab:: Multimodal

   .. tip::

      Multimodal models require additional dependencies. Install them with e.g. ``pip install -U "sentence-transformers[image]"`` for image support. See `Installation <installation.html>`_ for all options.
   .. code-block:: python

      from sentence_transformers import SentenceTransformer

      # 1. Load a model that supports both text and images
      model = SentenceTransformer("Qwen/Qwen3-VL-Embedding-2B", revision="refs/pr/23")

      # 2. Encode images from URLs
      img_embeddings = model.encode([
          "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg",
          "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg",
      ])

      # 3. Encode text queries (one matching + one hard negative per image)
      text_embeddings = model.encode([
          "A green car parked in front of a yellow building",
          "A red car driving on a highway",
          "A bee on a pink flower",
          "A wasp on a wooden table",
      ])

      # 4. Compute cross-modal similarities
      similarities = model.similarity(text_embeddings, img_embeddings)
      print(similarities)
      # tensor([[0.5115, 0.1078],
      #         [0.1999, 0.1108],
      #         [0.1255, 0.6749],
      #         [0.1283, 0.2704]])
With ``SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")`` we pick which `Sentence Transformer model <https://huggingface.co/models?library=sentence-transformers>`_ we load. In this example, we load `sentence-transformers/all-MiniLM-L6-v2 <https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2>`_, which is a MiniLM model finetuned on a large dataset of over 1 billion training pairs. Using :meth:`SentenceTransformer.similarity() <sentence_transformers.sentence_transformer.model.SentenceTransformer.similarity>`, we compute the similarity between all pairs of sentences. As expected, the similarity between semantically related inputs is higher than between unrelated ones. Multimodal models like `Qwen/Qwen3-VL-Embedding-2B <https://huggingface.co/Qwen/Qwen3-VL-Embedding-2B>`_ can also encode images, audio, or video into the same embedding space.
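Unless the model configuration specifies otherwise, the similarity function used here is cosine similarity. To make concrete what that score measures, here is a standalone NumPy sketch of pairwise cosine similarity (an illustration with made-up vectors, not the library's actual implementation):

```python
import numpy as np

def cosine_similarity_matrix(embeddings: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between the rows of an (n, d) matrix."""
    # Normalize each row to unit length, then take all pairwise dot products
    normalized = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return normalized @ normalized.T

# Tiny made-up "embeddings": 3 vectors of dimension 4
embeddings = np.array([
    [1.0, 0.0, 1.0, 0.0],
    [0.9, 0.1, 1.0, 0.0],  # close in direction to the first vector
    [0.0, 1.0, 0.0, 1.0],  # orthogonal to the first vector
])
similarities = cosine_similarity_matrix(embeddings)
print(np.round(similarities, 4))
```

Just as in the tensor printed above, the diagonal is 1.0 (every vector is maximally similar to itself) and vectors pointing in similar directions score higher than unrelated ones.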
Finetuning Sentence Transformer models is easy and requires only a few lines of code. For more information, see the `Training Overview <./sentence_transformer/training_overview.html>`_ section.
.. tip::

   Read `Sentence Transformer > Usage > Speeding up Inference <sentence_transformer/usage/efficiency.html>`_ for tips on how to speed up inference of models by up to 2x-3x.
Cross Encoder
-------------

Cross Encoder (a.k.a. reranker) models compute a similarity score directly for a pair of texts rather than producing independent embeddings. This generally gives more accurate scores than a bi-encoder, at the cost of speed, which is why Cross Encoders are often used to re-rank the top results of a Sentence Transformer model.
The usage for Cross Encoder (a.k.a. reranker) models is similar to Sentence Transformers:
.. sidebar:: Documentation

   1. :class:`CrossEncoder <sentence_transformers.CrossEncoder>`
   2. :meth:`CrossEncoder.rank <sentence_transformers.CrossEncoder.rank>`
   3. :meth:`CrossEncoder.predict <sentence_transformers.CrossEncoder.predict>`

   Other useful methods and links:

   - `CrossEncoder > Usage <./cross_encoder/usage/usage.html>`_
   - `CrossEncoder > Pretrained Models <./cross_encoder/pretrained_models.html>`_
   - `CrossEncoder > Training Overview <./cross_encoder/training_overview.html>`_
   - `CrossEncoder > Dataset Overview <./cross_encoder/dataset_overview.html>`_
   - `CrossEncoder > Loss Overview <./cross_encoder/loss_overview.html>`_
   - `CrossEncoder > Training Examples <./cross_encoder/training/examples.html>`_

.. tab:: Text
   .. code-block:: python

      from sentence_transformers import CrossEncoder

      # 1. Load a pretrained CrossEncoder model
      model = CrossEncoder("cross-encoder/stsb-distilroberta-base")

      # We want to compute the similarity between the query sentence...
      query = "A man is eating pasta."

      # ... and all sentences in the corpus
      corpus = [
          "A man is eating food.",
          "A man is eating a piece of bread.",
          "The girl is carrying a baby.",
          "A man is riding a horse.",
          "A woman is playing violin.",
          "Two men pushed carts through the woods.",
          "A man is riding a white horse on an enclosed ground.",
          "A monkey is playing drums.",
          "A cheetah is running behind its prey.",
      ]

      # 2. We rank all sentences in the corpus for the query
      ranks = model.rank(query, corpus)

      # Print the scores
      print("Query: ", query)
      for rank in ranks:
          print(f"{rank['score']:.2f}\t{corpus[rank['corpus_id']]}")
      """
      Query: A man is eating pasta.
      0.67 A man is eating food.
      0.34 A man is eating a piece of bread.
      0.08 A man is riding a horse.
      0.07 A man is riding a white horse on an enclosed ground.
      0.01 The girl is carrying a baby.
      0.01 Two men pushed carts through the woods.
      0.01 A monkey is playing drums.
      0.01 A woman is playing violin.
      0.01 A cheetah is running behind its prey.
      """

      # 3. Alternatively, you can also manually compute the score between two sentences
      import numpy as np

      sentence_combinations = [[query, sentence] for sentence in corpus]
      scores = model.predict(sentence_combinations)

      # Sort the scores in decreasing order to get the corpus indices
      ranked_indices = np.argsort(scores)[::-1]
      print("Scores:", scores)
      print("Indices:", ranked_indices)
      """
      Scores: [0.6732372, 0.34102544, 0.00542465, 0.07569341, 0.00525378, 0.00536814, 0.06676237, 0.00534825, 0.00516717]
      Indices: [0 1 3 6 2 5 7 4 8]
      """
.. tab:: Multimodal

   .. code-block:: python

      from sentence_transformers import CrossEncoder

      model = CrossEncoder("Qwen/Qwen3-VL-Reranker-2B", revision="refs/pr/11")

      query = "A green car parked in front of a yellow building"
      documents = [
          # Image documents (URL or local file path)
          "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg",
          "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bee.jpg",
          # Text document
          "A vintage Volkswagen Beetle painted in bright green sits in a driveway.",
          # Combined text + image document
          {
              "text": "A car in a European city",
              "image": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg",
          },
      ]

      rankings = model.rank(query, documents)
      for rank in rankings:
          print(f"{rank['score']:.4f}\t(document {rank['corpus_id']})")
      """
      0.9375 (document 0)
      0.5000 (document 3)
      -1.2500 (document 2)
      -2.4375 (document 1)
      """
With ``CrossEncoder("cross-encoder/stsb-distilroberta-base")`` we pick which `CrossEncoder model <./cross_encoder/pretrained_models.html>`_ we load. CrossEncoder models can also work with multimodal inputs: `Qwen/Qwen3-VL-Reranker-2B <https://huggingface.co/Qwen/Qwen3-VL-Reranker-2B>`_ can rank images and text by relevance to a query.
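Note that, depending on the model, reranker scores may be raw logits rather than probabilities, as the negative values in the multimodal example suggest. If you want scores in the (0, 1) range, one option is to map the logits through a sigmoid yourself. A small standalone sketch, reusing the four scores from the example above as hypothetical logits (the ranking is unchanged, since the sigmoid is monotonically increasing):

```python
import numpy as np

def sigmoid(x: np.ndarray) -> np.ndarray:
    # Map raw logits to the (0, 1) range
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical raw logits, indexed by document id (values taken
# from the multimodal reranking example above)
logits = np.array([0.9375, -2.4375, -1.2500, 0.5000])
probabilities = sigmoid(logits)

# Ranking by probability is identical to ranking by raw logit
order = np.argsort(-probabilities)
print(probabilities.round(4))
print(order)
# [0 3 2 1]
```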
Finetuning CrossEncoder models is easy and requires only a few lines of code. For more information, see the `Training Overview <./cross_encoder/training_overview.html>`_ section.
.. tip::

   Read `CrossEncoder > Usage > Speeding up Inference <cross_encoder/usage/efficiency.html>`_ for tips on how to speed up inference of models by up to 2x-3x.
Sparse Encoder
--------------

Sparse Encoder models compute high-dimensional sparse vector representations, in which most dimensions are zero and the active dimensions typically correspond to vocabulary tokens, making the embeddings both efficient to index and interpretable.
The usage for Sparse Encoder models follows a similar pattern to Sentence Transformers:
.. sidebar:: Documentation

   1. :class:`SparseEncoder <sentence_transformers.SparseEncoder>`
   2. :meth:`SparseEncoder.encode <sentence_transformers.SparseEncoder.encode>`
   3. :meth:`SparseEncoder.similarity <sentence_transformers.SparseEncoder.similarity>`
   4. :meth:`SparseEncoder.sparsity <sentence_transformers.SparseEncoder.sparsity>`

   Other useful methods and links:

   - `SparseEncoder > Usage <./sparse_encoder/usage/usage.html>`_
   - `SparseEncoder > Pretrained Models <./sparse_encoder/pretrained_models.html>`_
   - `SparseEncoder > Training Overview <./sparse_encoder/training_overview.html>`_
   - `SparseEncoder > Loss Overview <./sparse_encoder/loss_overview.html>`_
   - `Sparse Encoder > Vector Database Integration <../examples/sparse_encoder/applications/semantic_search/README.html#vector-database-search>`_

.. code-block:: python

   from sentence_transformers import SparseEncoder

   # 1. Load a pretrained Sparse Encoder model
   model = SparseEncoder("naver/splade-cocondenser-ensembledistil")

   # The sentences to encode
   sentences = [
       "The weather is lovely today.",
       "It's so sunny outside!",
       "He drove to the stadium.",
   ]

   # 2. Calculate sparse embeddings by calling model.encode()
   embeddings = model.encode(sentences)
   print(embeddings.shape)

   # 3. Calculate the embedding similarities
   similarities = model.similarity(embeddings, embeddings)
   print(similarities)

   # 4. Inspect the sparsity statistics of the embeddings
   stats = SparseEncoder.sparsity(embeddings)
   print(f"Sparsity: {stats['sparsity_ratio']:.2%}")  # Typically >99% zeros
   print(f"Avg non-zero dimensions per embedding: {stats['active_dims']:.2f}")
With SparseEncoder("naver/splade-cocondenser-ensembledistil") we load a pretrained SPLADE model that generates sparse embeddings. SPLADE (SParse Lexical AnD Expansion) models use MLM prediction mechanisms to create sparse representations that are particularly effective for information retrieval tasks.
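Because most dimensions of a sparse embedding are zero, only the dimensions shared by two vectors contribute to their dot product, which is what makes sparse embeddings a natural fit for inverted-index retrieval. A toy sketch of that idea, using plain dicts of ``{dimension_id: weight}`` as hypothetical stand-ins for real SPLADE vectors (the ids and weights below are made up for illustration):

```python
def sparse_dot(a: dict[int, float], b: dict[int, float]) -> float:
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller vector and only look up shared dimensions
    if len(a) > len(b):
        a, b = b, a
    return sum(weight * b[dim] for dim, weight in a.items() if dim in b)

# Hypothetical sparse vectors (dimension ids would map to vocabulary tokens)
query = {101: 1.2, 2054: 0.8, 3000: 0.5}
doc_relevant = {101: 0.9, 2054: 1.1, 7777: 0.3}  # shares two dimensions
doc_unrelated = {500: 1.0, 9999: 0.7}            # shares no dimensions

print(sparse_dot(query, doc_relevant))   # 1.2*0.9 + 0.8*1.1 = 1.96
print(sparse_dot(query, doc_unrelated))  # 0.0
```

Documents that share no active dimensions with the query score exactly zero and can be skipped entirely, which is why sparse retrieval scales well to large corpora.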
Finetuning Sparse Encoder models is easy and requires only a few lines of code. For more information, see the `Training Overview <./sparse_encoder/training_overview.html>`_ section.
.. tip::

   Read `Sparse Encoder > Usage > Speeding up Inference <sparse_encoder/usage/efficiency.html>`_ for tips on how to speed up inference of models by up to 2x-3x.
Consider reading one of the following sections next:

- `Sentence Transformers > Usage <./sentence_transformer/usage/usage.html>`_
- `Sentence Transformers > Pretrained Models <./sentence_transformer/pretrained_models.html>`_
- `Sentence Transformers > Training Overview <./sentence_transformer/training_overview.html>`_
- `Sentence Transformers > Training Examples > Multilingual Models <../examples/sentence_transformer/training/multilingual/README.html>`_
- `Sentence Transformers > Training Examples > Multimodal Models <../examples/sentence_transformer/training/multimodal/README.html>`_
- `Cross Encoder > Usage <./cross_encoder/usage/usage.html>`_
- `Cross Encoder > Pretrained Models <./cross_encoder/pretrained_models.html>`_
- `Cross Encoder > Training Examples > Multimodal Models <../examples/cross_encoder/training/multimodal/README.html>`_
- `Sparse Encoder > Usage <./sparse_encoder/usage/usage.html>`_
- `Sparse Encoder > Pretrained Models <./sparse_encoder/pretrained_models.html>`_
- `Sparse Encoder > Vector Database Integration <../examples/sparse_encoder/applications/semantic_search/README.html#vector-database-search>`_