Back to Llama Index

Composable Objects

docs/examples/retrievers/composable_retrievers.ipynb

0.14.215.4 KB
Original Source

Composable Objects

In this notebook, we show how you can combine multiple objects into a single top-level index.

This approach works by setting up IndexNode objects, with an obj field that points to a:

  • query engine
  • retriever
  • query pipeline
  • another node!
python
object = IndexNode(index_id="my_object", obj=query_engine, text="some text about this object")

Data Setup

python
%pip install llama-index-storage-docstore-mongodb
%pip install llama-index-vector-stores-qdrant
%pip install llama-index-storage-docstore-firestore
%pip install llama-index-retrievers-bm25
%pip install llama-index-storage-docstore-redis
%pip install llama-index-storage-docstore-dynamodb
%pip install llama-index-readers-file pymupdf
python
!wget --user-agent "Mozilla" "https://arxiv.org/pdf/2307.09288.pdf" -O "./llama2.pdf"
!wget --user-agent "Mozilla" "https://arxiv.org/pdf/1706.03762.pdf" -O "./attention.pdf"
python
from llama_index.core import download_loader

from llama_index.readers.file import PyMuPDFReader

llama2_docs = PyMuPDFReader().load_data(
    file_path="./llama2.pdf", metadata=True
)
attention_docs = PyMuPDFReader().load_data(
    file_path="./attention.pdf", metadata=True
)

Retriever Setup

python
import os

os.environ["OPENAI_API_KEY"] = "sk-..."
python
from llama_index.core.node_parser import TokenTextSplitter

nodes = TokenTextSplitter(
    chunk_size=1024, chunk_overlap=128
).get_nodes_from_documents(llama2_docs + attention_docs)
python
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.storage.docstore.redis import RedisDocumentStore
from llama_index.storage.docstore.mongodb import MongoDocumentStore
from llama_index.storage.docstore.firestore import FirestoreDocumentStore
from llama_index.storage.docstore.dynamodb import DynamoDBDocumentStore

docstore = SimpleDocumentStore()
docstore.add_documents(nodes)
python
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.retrievers.bm25 import BM25Retriever
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

client = QdrantClient(path="./qdrant_data")
vector_store = QdrantVectorStore("composable", client=client)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

index = VectorStoreIndex(nodes=nodes)
vector_retriever = index.as_retriever(similarity_top_k=2)
bm25_retriever = BM25Retriever.from_defaults(
    docstore=docstore, similarity_top_k=2
)

Composing Objects

Here, we construct the IndexNodes. Note that the text is what is used to index the node by the top-level index.

For a vector index, the text is embedded, for a keyword index, the text is used for keywords.

In this example, the SummaryIndex is used, which does not technically need the text for retrieval, since it always retrieves all nodes.

python
from llama_index.core.schema import IndexNode

vector_obj = IndexNode(
    index_id="vector", obj=vector_retriever, text="Vector Retriever"
)
bm25_obj = IndexNode(
    index_id="bm25", obj=bm25_retriever, text="BM25 Retriever"
)
python
from llama_index.core import SummaryIndex

summary_index = SummaryIndex(objects=[vector_obj, bm25_obj])

Querying

When we query, all objects will be retrieved and used to generate the nodes to get a final answer.

Using tree_summarize with aquery() ensures concurrent execution and faster responses.

python
query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize", verbose=True
)
python
response = await query_engine.aquery(
    "How does attention work in transformers?"
)
python
print(str(response))
python
response = await query_engine.aquery(
    "What is the architecture of Llama2 based on?"
)
python
print(str(response))
python
response = await query_engine.aquery(
    "What was used before attention in transformers?"
)
python
print(str(response))

Note on Saving and Loading

Since objects aren't technically serializable, when saving and loading, then need to be provided at load time as well.

Here's an example of how I might save/load this setup.

Save

python
# qdrant is already saved automatically!
# we only need to save the docstore here

# save our docstore nodes for bm25
docstore.persist("./docstore.json")

Load

python
from llama_index.core.storage.docstore import SimpleDocumentStore
from llama_index.vector_stores.qdrant import QdrantVectorStore
from qdrant_client import QdrantClient

docstore = SimpleDocumentStore.from_persist_path("./docstore.json")

client = QdrantClient(path="./qdrant_data")
vector_store = QdrantVectorStore("composable", client=client)
python
index = VectorStoreIndex.from_vector_store(vector_store)
vector_retriever = index.as_retriever(similarity_top_k=2)
bm25_retriever = BM25Retriever.from_defaults(
    docstore=docstore, similarity_top_k=2
)
python
from llama_index.core.schema import IndexNode

vector_obj = IndexNode(
    index_id="vector", obj=vector_retriever, text="Vector Retriever"
)
bm25_obj = IndexNode(
    index_id="bm25", obj=bm25_retriever, text="BM25 Retriever"
)
python
# if we had added regular nodes to the summary index, we could save/load that as well
# summary_index.persist("./summary_index.json")
# summary_index = load_index_from_storage(storage_context, objects=objects)

from llama_index.core import SummaryIndex

summary_index = SummaryIndex(objects=[vector_obj, bm25_obj])