docs/components/rerankers/models/sentence_transformer.mdx
The Sentence Transformer reranker provides local reranking using HuggingFace cross-encoder models, making it well suited for privacy-focused deployments where data must stay on-premises.
Any HuggingFace cross-encoder model can be used. Popular choices include:
- `cross-encoder/ms-marco-MiniLM-L-6-v2`: Default, good balance of speed and accuracy
- `cross-encoder/ms-marco-TinyBERT-L-2-v2`: Fastest, smaller model size
- `cross-encoder/ms-marco-electra-base`: Higher accuracy, larger model
- `cross-encoder/stsb-distilroberta-base`: Good for semantic similarity tasks

Install the dependency first:

```bash
pip install sentence-transformers
```
```python
from mem0 import Memory

config = {
    "vector_store": {
        "provider": "chroma",
        "config": {
            "collection_name": "my_memories",
            "path": "./chroma_db"
        }
    },
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini"
        }
    },
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cpu",  # or "cuda" for GPU
            "batch_size": 32,
            "show_progress_bar": False,
            "top_k": 5
        }
    }
}

memory = Memory.from_config(config)
```
For better performance, use GPU acceleration:
```python
config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cuda",  # use GPU
            "batch_size": 64   # larger batches suit GPUs with more memory
        }
    }
}
```
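If the same config needs to run on machines with and without a GPU, you can pick the device at runtime instead of hardcoding it. A minimal sketch (`pick_device` is a hypothetical helper, not part of mem0; it assumes `torch`, which `sentence-transformers` installs, and falls back to CPU when torch or CUDA is absent):

```python
import importlib.util


def pick_device():
    """Return "cuda" when torch reports an available GPU, else "cpu"."""
    if importlib.util.find_spec("torch") is not None:
        import torch
        if torch.cuda.is_available():
            return "cuda"
    return "cpu"


config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": pick_device(),
        }
    }
}
```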
```python
from mem0 import Memory

# Initialize memory with local reranker
config = {
    "vector_store": {"provider": "chroma"},
    "llm": {"provider": "openai", "config": {"model": "gpt-4o-mini"}},
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/ms-marco-MiniLM-L-6-v2",
            "device": "cpu"
        }
    }
}
memory = Memory.from_config(config)

# Add memories
messages = [
    {"role": "user", "content": "I love reading science fiction novels"},
    {"role": "user", "content": "My favorite author is Isaac Asimov"},
    {"role": "user", "content": "I also enjoy watching sci-fi movies"}
]
memory.add(messages, user_id="charlie")

# Search with local reranking
results = memory.search("What books does the user like?", filters={"user_id": "charlie"})

for result in results["results"]:
    print(f"Memory: {result['memory']}")
    print(f"Vector Score: {result['score']:.3f}")
    print(f"Rerank Score: {result['rerank_score']:.3f}")
    print()
```
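Conceptually, the reranker re-orders the vector-search hits by their cross-encoder score before returning them. A toy sketch of that post-processing step (`rerank_results` is a hypothetical helper for illustration, not mem0's internal API):

```python
def rerank_results(results, top_k=None):
    """Sort hits by cross-encoder score (descending), optionally keep top_k."""
    ranked = sorted(results, key=lambda r: r["rerank_score"], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Note how the rerank score can disagree with the raw vector score:
hits = [
    {"memory": "I also enjoy watching sci-fi movies", "score": 0.74, "rerank_score": 0.55},
    {"memory": "I love reading science fiction novels", "score": 0.71, "rerank_score": 0.94},
]
top = rerank_results(hits, top_k=1)
# → the novels memory wins despite its lower vector score
```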
You can use any HuggingFace cross-encoder model:
```python
# Using a different model
config = {
    "rerank": {
        "provider": "sentence_transformer",
        "config": {
            "model": "cross-encoder/stsb-distilroberta-base",
            "device": "cpu"
        }
    }
}
```
| Parameter | Description | Type | Default |
|---|---|---|---|
| `model` | HuggingFace cross-encoder model name | `str` | `"cross-encoder/ms-marco-MiniLM-L-6-v2"` |
| `device` | Device to run the model on (`"cpu"`, `"cuda"`, etc.) | `str` | `None` |
| `batch_size` | Batch size for scoring documents | `int` | `32` |
| `show_progress_bar` | Show a progress bar during scoring | `bool` | `False` |
| `top_k` | Maximum number of documents to return | `int` | `None` |
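`batch_size` determines how many (query, document) pairs are scored per forward pass: larger batches improve throughput (especially on GPU) at the cost of memory. A toy sketch of the chunking behaviour (illustrative only, not the library's internals):

```python
def batches(items, batch_size=32):
    """Yield successive chunks of at most batch_size items."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]


# 70 candidate documents with batch_size=32 → 3 forward passes (32 + 32 + 6)
chunks = list(batches(list(range(70)), batch_size=32))
```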