Graph RAG is a knowledge-enabled RAG approach that retrieves information from a Knowledge Graph for a given task. Typically, it builds context from the SubGraph of the entities related to the task.
In LlamaIndex, Graph RAG applies in two scenarios: building a Knowledge Graph from documents, which is done with KnowledgeGraphIndex, and leveraging an existing Knowledge Graph, which is done with KnowledgeGraphRAGQueryEngine.
Note: a third KG-related query engine in LlamaIndex is KnowledgeGraphQueryEngine, which performs NL2GraphQuery (also called Text2Cypher) and can be used whether or not the KG already exists.
Before we start the KnowledgeGraphRAGQueryEngine demo, let's first set up LlamaIndex.
If you're opening this Notebook on Colab, you will probably need to install LlamaIndex 🦙.
%pip install llama-index-llms-azure-openai
%pip install llama-index-graph-stores-nebula
%pip install llama-index-llms-openai
%pip install llama-index-embeddings-azure-openai
!pip install llama-index
# For OpenAI
import os
os.environ["OPENAI_API_KEY"] = "sk-..."
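Hardcoding the key in a notebook makes it easy to leak when the notebook is shared. As a minimal sketch, you could export the key in your shell instead and fail early if it is missing. The helper name `require_env` is hypothetical and not part of LlamaIndex.

```python
import os


def require_env(name: str) -> str:
    """Return the value of environment variable `name`, or raise if unset or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes OPENAI_API_KEY was exported before starting the notebook):
# openai_api_key = require_env("OPENAI_API_KEY")
```

Failing at this point gives a clearer error than a rejected request later in the notebook.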
import logging
import sys
logging.basicConfig(
    stream=sys.stdout, level=logging.INFO
)  # logging.DEBUG for more verbose output
# define LLM
from llama_index.llms.openai import OpenAI
from llama_index.core import Settings
Settings.llm = OpenAI(temperature=0, model="gpt-3.5-turbo")
Settings.chunk_size = 512
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
# For Azure OpenAI
api_key = "<api-key>"
azure_endpoint = "https://<your-resource-name>.openai.azure.com/"
api_version = "2023-07-01-preview"
llm = AzureOpenAI(
    model="gpt-35-turbo-16k",
    deployment_name="my-custom-llm",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
# You need to deploy your own embedding model as well as your own chat completion model
embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="my-custom-embedding",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)
from llama_index.core import Settings
Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 512
This demo uses NebulaGraphStore as an example, so before performing Graph RAG on an existing KG, let's make sure we have a running NebulaGraph with a defined data schema.
This step installs the NebulaGraph clients and prepares the context that defines a NebulaGraph Graph Space.
# Create a NebulaGraph (version 3.5.0 or newer) cluster with:
# Option 0 for machines with Docker: `curl -fsSL nebula-up.siwei.io/install.sh | bash`
# Option 1 for Desktop: NebulaGraph Docker Extension https://hub.docker.com/extensions/weygu/nebulagraph-dd-ext
# If not, create it with the following commands from NebulaGraph's console:
# CREATE SPACE llamaindex(vid_type=FIXED_STRING(256), partition_num=1, replica_factor=1);
# :sleep 10;
# USE llamaindex;
# CREATE TAG entity(name string);
# CREATE EDGE relationship(relationship string);
# :sleep 10;
# CREATE TAG INDEX entity_index ON entity(name(256));
%pip install ipython-ngql nebula3-python
os.environ["NEBULA_USER"] = "root"
os.environ["NEBULA_PASSWORD"] = "nebula"  # default is "nebula"
os.environ["NEBULA_ADDRESS"] = "127.0.0.1:9669"  # assumes NebulaGraph is installed locally
space_name = "llamaindex"
# defaults; can be omitted if creating from an empty KG
edge_types, rel_prop_names = ["relationship"], ["relationship"]
tags = ["entity"]  # default; can be omitted if creating from an empty KG
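The connection settings above are plain environment variables, and the address is given as host:port. A small sanity check before constructing the store can catch typos early. The two helpers below are hypothetical and use only the standard library. They are not part of LlamaIndex or the NebulaGraph client.

```python
import os
from typing import List, Mapping, Tuple


def parse_nebula_address(address: str) -> Tuple[str, int]:
    """Split a 'host:port' string into (host, port)."""
    host, sep, port = address.rpartition(":")
    if not sep or not host or not port.isdigit():
        raise ValueError(f"Expected 'host:port', got {address!r}")
    return host, int(port)


def missing_nebula_env(env: Mapping[str, str] = os.environ) -> List[str]:
    """List the connection variables used in this demo that are absent or empty."""
    required = ("NEBULA_USER", "NEBULA_PASSWORD", "NEBULA_ADDRESS")
    return [name for name in required if not env.get(name)]


# Usage, once the variables above are set:
# assert not missing_nebula_env()
# host, port = parse_nebula_address(os.environ["NEBULA_ADDRESS"])
```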
Then we can instantiate a NebulaGraphStore and use it as the graph_store of a StorageContext.
from llama_index.core import StorageContext
from llama_index.graph_stores.nebula import NebulaGraphStore
graph_store = NebulaGraphStore(
    space_name=space_name,
    edge_types=edge_types,
    rel_prop_names=rel_prop_names,
    tags=tags,
)
storage_context = StorageContext.from_defaults(graph_store=graph_store)
Here, we assume we have the same Knowledge Graph as in this tutorial.
Finally, let's demo how to do Graph RAG on an existing Knowledge Graph.
All we need to do is use RetrieverQueryEngine and configure its retriever to be KnowledgeGraphRAGRetriever.
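The KnowledgeGraphRAGRetriever performs the following steps: it searches for entities related to the question or task, gets the SubGraph of those entities from the KG, and builds context based on that SubGraph.
Please note, the search for related entities can be either keyword-extraction based or embedding based. This is controlled by the retriever_mode argument of the KnowledgeGraphRAGRetriever, and the supported options are "keyword", "embedding" (not yet implemented), and "keyword_embedding" (not yet implemented).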
Here is an example of how to use RetrieverQueryEngine and KnowledgeGraphRAGRetriever:
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import KnowledgeGraphRAGRetriever
graph_rag_retriever = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    verbose=True,
)
query_engine = RetrieverQueryEngine.from_args(
    graph_rag_retriever,
)
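To make the retrieval idea concrete, here is a toy, in-memory sketch of the SubGraph approach: find entities mentioned in the question, collect the triplets within a few hops of them, and flatten those into text. It is not the KnowledgeGraphRAGRetriever implementation. The triplets are made up for illustration, the naive substring match stands in for real keyword extraction, and the depth of 2 is an arbitrary choice for this sketch.

```python
from collections import deque

# Toy knowledge graph as (subject, relation, object) triplets.
# Illustrative data only, not the contents of the KG used in this demo.
TRIPLETS = [
    ("Peter Quill", "is leader of", "Guardians of the Galaxy"),
    ("Guardians of the Galaxy", "includes", "Rocket"),
    ("Rocket", "is a", "raccoon"),
    ("Thor", "is from", "Asgard"),
]


def find_entities(question, triplets):
    """Step 1: naive substring match, a stand-in for keyword extraction."""
    names = {s for s, _, _ in triplets} | {o for _, _, o in triplets}
    q = question.lower()
    return sorted(n for n in names if n.lower() in q)


def get_subgraph(entities, triplets, depth=2):
    """Step 2: breadth-first traversal collecting triplets within `depth` hops."""
    seen = set(entities)
    frontier = deque((e, 0) for e in entities)
    found = []
    while frontier:
        node, dist = frontier.popleft()
        if dist == depth:
            continue  # do not expand beyond the requested depth
        for triplet in triplets:
            s, _, o = triplet
            if node not in (s, o):
                continue
            if triplet not in found:
                found.append(triplet)
            neighbor = o if node == s else s
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, dist + 1))
    return found


def build_context(subgraph):
    """Step 3: flatten the subgraph into text an LLM can read."""
    return "\n".join(f"{s} -[{r}]-> {o}" for s, r, o in subgraph)


entities = find_entities("Tell me about Peter Quill?", TRIPLETS)
context = build_context(get_subgraph(entities, TRIPLETS, depth=2))
```

With this toy data, the context contains the two triplets within two hops of Peter Quill and leaves out the unrelated Thor triplet.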
Then we can query it like:
from IPython.display import display, Markdown
response = query_engine.query(
    "Tell me about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))
response = await query_engine.aquery(
    "Tell me about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))
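The top-level await above works because notebooks already run an event loop. In a plain Python script there is none, so the coroutine has to be driven with asyncio.run. The sketch below shows that pattern with a hypothetical stand-in class, `FakeQueryEngine`, so that it runs without an LLM or a graph store. Only the `aquery` coroutine shape is taken from the cell above.

```python
import asyncio


class FakeQueryEngine:
    """Hypothetical stand-in exposing an aquery() coroutine like the engine above."""

    async def aquery(self, question: str) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"(answer to: {question})"


async def ask_all(engine, questions):
    """Run several queries concurrently and return the answers in input order."""
    return await asyncio.gather(*(engine.aquery(q) for q in questions))


def run_queries(engine, questions):
    """Entry point for a plain script, where no event loop is running yet."""
    return asyncio.run(ask_all(engine, questions))
```

Inside a notebook, call `await ask_all(engine, questions)` directly instead of `run_queries`, since asyncio.run cannot be called while an event loop is already running.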
(Sub)Graph RAG and nl2graphquery are different in nature. Neither is better than the other; each fits certain types of questions better. To understand how they differ, see this demo comparing the two.
<video width="938" height="800" src="https://github.com/siwei-io/talks/assets/1651790/05d01e53-d819-4f43-9bf1-75549f7f2be9" controls> </video>
In real-world cases, however, we may not always know which approach works better. One way to best leverage a KG in RAG is therefore to fetch both retrieval results as context and let the LLM and prompt generate the answer from both.
So, optionally, we can synthesize the answer from two pieces of context retrieved from the KG: the SubGraph context from Graph RAG, and the result of NL2GraphQuery, which has to be enabled explicitly.
We can set with_nl2graphquery=True to enable it:
graph_rag_retriever_with_nl2graphquery = KnowledgeGraphRAGRetriever(
    storage_context=storage_context,
    verbose=True,
    with_nl2graphquery=True,
)
query_engine_with_nl2graphquery = RetrieverQueryEngine.from_args(
    graph_rag_retriever_with_nl2graphquery,
)
response = query_engine_with_nl2graphquery.query(
    "What do you know about Peter Quill?",
)
display(Markdown(f"<b>{response}</b>"))
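Conceptually, using both retrieval results means handing the LLM two pieces of context in one prompt. The sketch below illustrates that idea only. The function name, section labels, and prompt wording are all assumptions made for this example, and this is not how RetrieverQueryEngine builds its prompt internally.

```python
def combine_contexts(question, subgraph_context, graph_query_context):
    """Merge both retrieved contexts into a single prompt, skipping empty ones."""
    sections = []
    if subgraph_context:
        sections.append("SubGraph context:\n" + subgraph_context)
    if graph_query_context:
        sections.append("Graph query result:\n" + graph_query_context)
    context = "\n\n".join(sections) or "(no context retrieved)"
    return (
        "Answer the question using only the context below.\n\n"
        f"{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Skipping an empty section matters in practice: a generated graph query may return nothing, and the answer should then rest on the SubGraph context alone.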
Let's check the response's metadata, by inspecting response.metadata, to learn more about the retrieval of Graph RAG with nl2graphquery.
NL2GraphQuery generates a graph query and uses its result as context:
Graph Store Query: MATCH (e:`entity`)-[r:`relationship`]->(e2:`entity`)
WHERE e.`entity`.`name` == 'Peter Quill'
RETURN e2.`entity`.`name`
SubGraph RAG gets the SubGraph of 'Peter Quill' to build the context.
Finally, the two nodes of context are combined to synthesize the answer.
import pprint
pp = pprint.PrettyPrinter()
pp.pprint(response.metadata)
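Response metadata can be deeply nested, which makes the full dump hard to scan. As a small sketch, the standard library's pprint can collapse deeper levels. The helper name `summarize_metadata` and the sample dictionary in the usage comment are hypothetical and do not reflect the actual shape of response.metadata.

```python
import pprint


def summarize_metadata(metadata, depth=1, width=100):
    """Format nested metadata, collapsing levels deeper than `depth` into '...'."""
    return pprint.pformat(metadata, depth=depth, width=width)


# Usage with a made-up nested dict:
# print(summarize_metadata({"node-1": {"kg_rel_map": {"Peter Quill": []}}}))
# In the notebook: print(summarize_metadata(response.metadata, depth=2))
```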