docs/guides/dataset/advanced/construct_knowledge_graph.md
Generate a knowledge graph for your dataset.
To enhance multi-hop question-answering, RAGFlow adds a knowledge graph construction step between data extraction and indexing, as illustrated below. This step creates additional chunks from existing ones generated by your specified chunking method.
From v0.16.0 onward, RAGFlow supports constructing a knowledge graph at the dataset level, so a single unified graph can span multiple files within your dataset. When a newly uploaded file starts parsing, the generated graph updates automatically.
:::danger WARNING
Constructing a knowledge graph requires significant memory, computational resources, and tokens.
:::
Knowledge graphs are especially useful for multi-hop question-answering involving nested logic. They outperform traditional extraction approaches when you are performing question answering on books or works with complex entities and relationships.
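To make "multi-hop" concrete, here is a minimal sketch of answering a question that needs two linked facts. It is only an illustration of the idea, not RAGFlow's retrieval code: the triples, the relation names, and the `find_path` helper are all hypothetical.

```python
from collections import deque

# Hypothetical toy graph: (head entity, relation, tail entity) triples.
TRIPLES = [
    ("Marie Curie", "married_to", "Pierre Curie"),
    ("Pierre Curie", "born_in", "Paris"),
    ("Marie Curie", "born_in", "Warsaw"),
]


def build_adjacency(triples):
    """Map each entity to its outgoing (relation, neighbor) edges."""
    adjacency = {}
    for head, relation, tail in triples:
        adjacency.setdefault(head, []).append((relation, tail))
    return adjacency


def find_path(triples, start, goal, max_hops=3):
    """Breadth-first search for the shortest chain of triples from start to goal."""
    adjacency = build_adjacency(triples)
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        entity, path = queue.popleft()
        if entity == goal:
            return path
        if len(path) >= max_hops:
            continue  # Do not expand beyond the hop budget.
        for relation, neighbor in adjacency.get(entity, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [(entity, relation, neighbor)]))
    return None


# "Where was Marie Curie's spouse born?" needs two hops through the graph.
path = find_path(TRIPLES, "Marie Curie", "Paris")
for hop in path:
    print(hop)
```

No single triple answers the question; the answer only appears once the two relationships are chained, which is the kind of query a graph of entities and relationships is meant to support.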
:::tip NOTE
RAPTOR (Recursive Abstractive Processing for Tree Organized Retrieval) can also be used for multi-hop question-answering tasks. See Enable RAPTOR for details. You may use either approach or both, but ensure you understand the memory, computational, and token costs involved.
:::
The system's default chat model is used to generate the knowledge graph. Before proceeding, ensure that you have a chat model properly configured:
The types of the entities to extract from your dataset. The default types are: organization, person, event, and category. Add or remove types to suit your specific dataset.
The method used to construct the knowledge graph:
Whether to enable entity resolution. You can think of this as an entity deduplication switch. When enabled, the LLM will combine similar entities - e.g., '2025' and 'the year of 2025', or 'IT' and 'Information Technology' - to construct a more effective graph.
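The deduplication idea can be sketched as follows. This is an illustrative assumption, not RAGFlow's implementation: in RAGFlow the LLM decides which entities are similar, whereas here those decisions are hardcoded as `pairs`, and the `resolve_entities` helper and the "keep the shorter name" rule are hypothetical.

```python
def resolve_entities(names, equivalent_pairs):
    """Map every name to a canonical name, given pairs judged to be the same entity.

    Every name used in equivalent_pairs must also appear in names.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for left, right in equivalent_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            # Keep the shorter name as the canonical form (an arbitrary choice).
            keep, drop = sorted((root_left, root_right), key=lambda n: (len(n), n))
            parent[drop] = keep

    return {name: find(name) for name in names}


names = ["2025", "the year of 2025", "IT", "Information Technology", "RAGFlow"]
# Stand-in for the LLM's similarity judgments.
pairs = [("2025", "the year of 2025"), ("IT", "Information Technology")]

canonical = resolve_entities(names, pairs)
for name, merged in canonical.items():
    print(f"{name!r} -> {merged!r}")
```

After merging, relationships attached to 'the year of 2025' and '2025' point at one node instead of two, which is what makes the resulting graph denser and more useful.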
In a knowledge graph, a community is a cluster of entities linked by relationships. You can have the LLM generate an abstract for each community, known as a community report. See here for more information. This indicates whether to generate community reports:
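As a rough sketch of what "a cluster of entities linked by relationships" means, the example below groups entities into connected components. This is a deliberate simplification under stated assumptions: production graph pipelines typically use more refined community detection, and the `find_communities` helper and sample data are hypothetical.

```python
def find_communities(relationships):
    """Return clusters of entities connected by relationships (connected components)."""
    neighbors = {}
    for source, target in relationships:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    communities, seen = [], set()
    for entity in sorted(neighbors):
        if entity in seen:
            continue
        # Depth-first traversal collects everything reachable from this entity.
        stack, members = [entity], set()
        while stack:
            current = stack.pop()
            if current in members:
                continue
            members.add(current)
            stack.extend(neighbors[current] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


# Hypothetical (entity, entity) relationships extracted from a dataset.
relationships = [
    ("Alice", "Acme Corp"),
    ("Bob", "Acme Corp"),
    ("Carol", "Globex"),
]

communities = find_communities(relationships)
for members in communities:
    print(members)
```

A community report would then be one LLM-written abstract per cluster, for example one summarizing everything known about the entities grouped around "Acme Corp".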
Navigate to the Configuration page of your dataset and update:
Navigate to the Files page of your dataset, click the Generate button in the top-right corner of the page, then select Knowledge graph from the dropdown to start generating the knowledge graph.
You can click the pause button in the dropdown to halt the build process when necessary.
Go back to the Configuration page:
Once a knowledge graph is generated, the Knowledge graph field changes from Not generated to Generated at a specific timestamp. You can delete it by clicking the recycle bin button to the right of the field.
To use the created knowledge graph, do either of the following:
Nope. The knowledge graph does not update until you regenerate a knowledge graph for your dataset.
On the Configuration page of your dataset, find the Knowledge graph field and click the recycle bin button to the right of the field.
All chunks of the created knowledge graph are stored in RAGFlow's document engine: either Elasticsearch or Infinity.
Nope. Exporting a created knowledge graph is not supported. If you still consider this feature essential, please raise an issue explaining your use case and its importance.