AIMon Rerank

If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙.

python

%%capture
!pip install llama-index
!pip install llama-index-postprocessor-aimon-rerank

python

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.response.pprint_utils import pprint_response

This notebook requires both an OpenAI API key and an AIMon API key. Import the two keys from Colab Secrets.

python

import os

# Import Colab Secrets userdata module.
from google.colab import userdata

os.environ["AIMON_API_KEY"] = userdata.get("AIMON_API_KEY")
os.environ["OPENAI_API_KEY"] = userdata.get("OPENAI_API_KEY")
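Outside Colab, `google.colab` cannot be imported, so the cell above only works there. A minimal sketch of a fallback, assuming the keys may already be present as environment variables; the helper name `load_api_key` is hypothetical and not part of LlamaIndex or AIMon:

```python
import os
from getpass import getpass


def load_api_key(name: str) -> str:
    """Return the API key `name` and cache it in os.environ.

    Lookup order: existing environment variable, Colab Secrets (when
    running in Colab), then an interactive prompt. Hypothetical helper.
    """
    value = os.environ.get(name)
    if not value:
        try:
            # Only importable inside Google Colab.
            from google.colab import userdata

            value = userdata.get(name)
        except Exception:
            # Not in Colab, or the secret is missing or not accessible.
            value = None
    if not value:
        value = getpass(f"Enter {name}: ")
    os.environ[name] = value
    return value
```

Calling `load_api_key("AIMON_API_KEY")` and `load_api_key("OPENAI_API_KEY")` would then populate the same environment variables as the Colab-only assignments.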

Download data

python

!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

Generate documents and build an index

python

# load documents
documents = SimpleDirectoryReader("./data/paul_graham/").load_data()

# build index
index = VectorStoreIndex.from_documents(documents=documents)

Write a task definition for the AIMon reranker and create an instance of the AIMonRerank class. The task definition is an explicit instruction that tells the reranker what its relevance evaluation should focus on.

python

import os
from llama_index.postprocessor.aimon_rerank import AIMonRerank

# Instruction that tells the reranker what "relevant" means for this task.
task_definition = "Your task is to assess the actions of an individual specified in the user query against the context documents supplied."

aimon_rerank = AIMonRerank(
    top_n=2,  # number of nodes kept after reranking
    api_key=os.environ["AIMON_API_KEY"],  # set from Colab Secrets above
    task_definition=task_definition,
)

Directly retrieve the top 2 most similar nodes (i.e., without using a reranker)

python

query_engine = index.as_query_engine(similarity_top_k=2)
response = query_engine.query("What did Sam Altman do in this essay?")

python

pprint_response(response, show_source=True)
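To compare results before and after reranking, it can help to reduce each response to a short list of scores and snippets. The sketch below is one way to do that. It assumes each item in `response.source_nodes` exposes `.score` and `.node.get_content()`, as LlamaIndex's `NodeWithScore` does; the helper `summarize_sources` and the stand-in objects are hypothetical and exist only so the example runs without an index or API keys.

```python
from types import SimpleNamespace


def summarize_sources(response, max_chars: int = 80):
    """Return (score, snippet) pairs for each source node of a response.

    Assumes each item in `response.source_nodes` exposes `.score` and
    `.node.get_content()`. Hypothetical helper, not a LlamaIndex API.
    """
    rows = []
    for item in response.source_nodes:
        # Collapse runs of whitespace so the snippet fits on one line.
        text = " ".join(item.node.get_content().split())
        rows.append((item.score, text[:max_chars]))
    return rows


def _fake_node(score, text):
    """Build a stand-in for a scored node (made-up data for illustration)."""
    return SimpleNamespace(
        score=score, node=SimpleNamespace(get_content=lambda: text)
    )


demo = SimpleNamespace(
    source_nodes=[
        _fake_node(0.83, "first   chunk\nof text"),
        _fake_node(0.81, "second chunk"),
    ]
)
for score, snippet in summarize_sources(demo):
    print(f"{score:.3f}  {snippet}")
```

In the notebook, passing the real `response` object instead of `demo` would list the retrieved nodes in the same compact form.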

Retrieve the top 10 most similar nodes, then rerank them with the AIMon Reranker

Explanation of the reranking process:

The diagram illustrates how a reranker refines document retrieval for a more accurate response.

Initial Retrieval (Vector DB):
- A query is sent to the vector database.
- The system retrieves the 10 records with the highest similarity scores (similarity_top_k = 10).

Reranking with AIMon:
- Instead of passing only the highest-scoring records straight to the LLM, the AIMon Reranker reorders all 10 records.
- The reranker scores each record by its relevance to the query, rather than by raw similarity alone.
- During this step, a task definition is applied, serving as an explicit instruction that defines what the reranking evaluation should focus on.
- As a result, the selected records are not just statistically similar to the query but also contextually relevant to the intended task.

Final Selection (top_n = 2):
- After reranking, the system selects the top 2 most contextually relevant records for response generation.
- The task definition helps align these records with the query’s intent, leading to a more precise and informative response.
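The two-stage flow described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not AIMon's implementation: the corpus and its scores are invented, and `toy_relevance` is a hypothetical stand-in that simply counts task-related words, whereas the real reranker calls the AIMon service.

```python
from typing import Callable, List, Tuple

Doc = Tuple[str, float]  # (text, similarity score from the vector store)


def retrieve(docs: List[Doc], top_k: int) -> List[Doc]:
    """Stage 1: keep the top_k documents by raw similarity score."""
    return sorted(docs, key=lambda d: d[1], reverse=True)[:top_k]


def rerank(
    candidates: List[Doc], relevance: Callable[[str], float], top_n: int
) -> List[Doc]:
    """Stage 2: reorder candidates by a relevance score and keep top_n."""
    return sorted(candidates, key=lambda d: relevance(d[0]), reverse=True)[:top_n]


def toy_relevance(text: str) -> float:
    """Stand-in scorer that counts task-related words (NOT AIMon's method)."""
    task_terms = {"altman", "agreed", "asked"}
    return float(
        sum(word.strip(".,").lower() in task_terms for word in text.split())
    )


# Invented toy corpus; the similarity scores are made up for illustration.
corpus: List[Doc] = [
    ("YC changed its leadership.", 0.90),
    ("The essay discusses painting.", 0.85),
    ("We asked Altman and he agreed.", 0.80),
    ("Altman was asked again later.", 0.75),
]

# Similarity only: the two highest-scoring documents win.
baseline = retrieve(corpus, top_k=2)

# Wider retrieval, then rerank: lower-similarity documents can surface.
reranked = rerank(retrieve(corpus, top_k=4), toy_relevance, top_n=2)

print("baseline:", [text for text, _ in baseline])
print("reranked:", [text for text, _ in reranked])
```

Retrieving more candidates than are finally needed gives the reranker room to promote documents that the similarity score alone would have cut.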

python

query_engine = index.as_query_engine(
    similarity_top_k=10, node_postprocessors=[aimon_rerank]
)
response = query_engine.query("What did Sam Altman do in this essay?")

python

pprint_response(response, show_source=True)

Conclusion

Guided by the task definition, the AIMon reranker shifted the retrieval focus from general YC leadership changes to Sam Altman’s specific actions. Without reranking, the two highest-similarity nodes lacked details about his decision-making. After reranking, nodes with lower similarity scores but greater contextual relevance surfaced his reluctance and the timeline, producing a more accurate, task-aligned response than purely similarity-based retrieval.