# mixedbread.ai Rerankers

<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/cookbooks/mixedbread_reranker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
mixedbread.ai has released three fully open-source reranker models under the Apache 2.0 license. For more in-depth information, you can check out their detailed blog post. The three models are:

1. `mxbai-rerank-xsmall-v1`
2. `mxbai-rerank-base-v1`
3. `mxbai-rerank-large-v1`

In this notebook, we'll demonstrate how to use the `mxbai-rerank-base-v1` model with the `SentenceTransformerRerank` module in LlamaIndex. This setup allows you to seamlessly swap in any reranker model of your choice using the `SentenceTransformerRerank` module to enhance your RAG pipeline.
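Before wiring the model into LlamaIndex, it helps to see what a reranker does conceptually: it scores each (query, document) pair jointly and keeps only the highest-scoring `top_n` documents. Here is a minimal pure-Python sketch, where a toy word-overlap scorer stands in for the actual cross-encoder model:

```python
def toy_score(query: str, document: str) -> int:
    # Stand-in for a cross-encoder score: number of shared unique words.
    return len(set(query.lower().split()) & set(document.lower().split()))


def rerank(query: str, documents: list[str], top_n: int = 2) -> list[str]:
    # Score every (query, document) pair, then keep the best top_n.
    ranked = sorted(documents, key=lambda doc: toy_score(query, doc), reverse=True)
    return ranked[:top_n]


docs = [
    "Paul Graham founded Y Combinator",
    "Sam Altman became president of Y Combinator",
    "The essay discusses painting and Lisp",
]
print(rerank("who founded y combinator", docs))
# → ['Paul Graham founded Y Combinator', 'Sam Altman became president of Y Combinator']
```

The real models replace `toy_score` with a learned scorer that reads the query and document together, which is what makes reranking more accurate (and slower) than embedding similarity alone.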
!pip install llama-index
!pip install sentence-transformers
import os
os.environ["OPENAI_API_KEY"] = "YOUR OPENAI API KEY"
from llama_index.core import (
VectorStoreIndex,
SimpleDirectoryReader,
)
from llama_index.core.postprocessor import SentenceTransformerRerank
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
index = VectorStoreIndex.from_documents(documents=documents)
#### mxbai-rerank-base-v1 reranker

from llama_index.core.postprocessor import SentenceTransformerRerank
postprocessor = SentenceTransformerRerank(
model="mixedbread-ai/mxbai-rerank-base-v1", top_n=2
)
We will first retrieve the 10 most relevant nodes and then pick the top-2 nodes using the defined postprocessor.
query_engine = index.as_query_engine(
similarity_top_k=10,
node_postprocessors=[postprocessor],
)
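The shape of this two-stage pipeline (a cheap retriever narrows the corpus to `similarity_top_k` candidates, then a more careful reranker keeps `top_n`) can be sketched in plain Python. The two scorers below are toys standing in for the embedding model and the mxbai cross-encoder:

```python
def cheap_score(query: str, doc: str) -> int:
    # Stage 1 stand-in for embedding similarity: shared unique words.
    return len(set(query.split()) & set(doc.split()))


def careful_score(query: str, doc: str) -> int:
    # Stage 2 stand-in for a cross-encoder: shared adjacent word pairs,
    # a (slightly) stricter notion of similarity than bag-of-words overlap.
    pairs = lambda s: set(zip(s.split(), s.split()[1:]))
    return len(pairs(query) & pairs(doc))


def retrieve_then_rerank(query, corpus, similarity_top_k=10, top_n=2):
    # Stage 1: cheap scoring over the whole corpus, keep top candidates.
    candidates = sorted(corpus, key=lambda d: cheap_score(query, d), reverse=True)
    candidates = candidates[:similarity_top_k]
    # Stage 2: careful scoring over just those candidates, keep top_n.
    reranked = sorted(candidates, key=lambda d: careful_score(query, d), reverse=True)
    return reranked[:top_n]


corpus = [
    "paul graham started y combinator",
    "y combinator funds startups",
    "sam altman ran y combinator later",
    "an unrelated note about lisp",
]
print(retrieve_then_rerank("who started y combinator", corpus))
```

The expensive scorer only ever sees `similarity_top_k` documents, which is why retrieving 10 and reranking down to 2 stays fast even over a large index.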
response = query_engine.query(
"Why did Sam Altman decline the offer of becoming president of Y Combinator?",
)
print(response)
response = query_engine.query(
"Why did Paul Graham start YC?",
)
print(response)