
<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/examples/cookbooks/mixedbread_reranker.ipynb" target="_parent">Open In Colab</a>

mixedbread Rerank Cookbook

mixedbread.ai has released three fully open-source reranker models under the Apache 2.0 license; for more in-depth information, check out their blog post. The three models are:

  1. mxbai-rerank-xsmall-v1
  2. mxbai-rerank-base-v1
  3. mxbai-rerank-large-v1

In this notebook, we'll demonstrate how to use the mxbai-rerank-base-v1 model with the SentenceTransformerRerank module in LlamaIndex. This setup allows you to seamlessly swap in any reranker model of your choice using the SentenceTransformerRerank module to enhance your RAG pipeline.

Installation

```python
!pip install llama-index
!pip install sentence-transformers
```

Set API Keys

```python
import os

os.environ["OPENAI_API_KEY"] = "YOUR OPENAI API KEY"
```

```python
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
)
from llama_index.core.postprocessor import SentenceTransformerRerank
```

Download Data

```python
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'
```

Load Documents

```python
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()
```

Build Index

```python
index = VectorStoreIndex.from_documents(documents=documents)
```

Define the postprocessor for the mxbai-rerank-base-v1 reranker

```python
from llama_index.core.postprocessor import SentenceTransformerRerank

postprocessor = SentenceTransformerRerank(
    model="mixedbread-ai/mxbai-rerank-base-v1", top_n=2
)
```

Create Query Engine

We first retrieve the 10 most similar nodes from the index, then rerank them with the postprocessor and keep the top 2.

```python
query_engine = index.as_query_engine(
    similarity_top_k=10,
    node_postprocessors=[postprocessor],
)
```
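Conceptually, this query engine runs two stages: the vector index returns `similarity_top_k` candidate nodes, and the reranker rescores each candidate against the query and keeps only `top_n`. Here is a minimal pure-Python sketch of that retrieve-then-rerank flow. Note that the scorer below is a toy keyword-overlap function standing in for the mxbai cross-encoder, and all names and passages are illustrative, not part of the LlamaIndex API:

```python
def toy_rerank_score(query: str, passage: str) -> float:
    # Stand-in for the cross-encoder: fraction of query words found
    # in the passage. The real model scores (query, passage) pairs jointly.
    q = set(query.lower().split())
    p = set(passage.lower().split())
    return len(q & p) / max(len(q), 1)


def retrieve_then_rerank(query, passages, similarity_top_k=10, top_n=2):
    # Stage 1: pretend the vector index returned the top-k candidates.
    candidates = passages[:similarity_top_k]
    # Stage 2: rescore every candidate and keep only the top_n best.
    scored = [(toy_rerank_score(query, p), p) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for _, p in scored[:top_n]]


passages = [
    "Paul Graham started Y Combinator with Jessica Livingston.",
    "Sam Altman later became president of Y Combinator.",
    "The essay discusses painting and programming.",
]
print(retrieve_then_rerank("Who started Y Combinator?", passages))
```

The real pipeline behaves the same way, except stage 1 is embedding similarity over the index and stage 2 is the mxbai cross-encoder, which is far more accurate than keyword overlap because it reads the query and passage together.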

Test Queries

```python
response = query_engine.query(
    "Why did Sam Altman decline the offer of becoming president of Y Combinator?",
)

print(response)
```

```python
response = query_engine.query(
    "Why did Paul Graham start YC?",
)

print(response)
```