docs/AdvancedFeatures.md
## RAG-Anything Integration

LightRAG integrates with RAG-Anything, an All-in-One Multimodal Document Processing RAG system that enables advanced parsing and RAG capabilities across diverse document formats including PDFs, images, Office documents, tables, and formulas.
To use it, install RAG-Anything:

```bash
pip install raganything
```

The example below loads an existing LightRAG instance, wires it into RAG-Anything, queries the existing knowledge base, and then ingests a new multimodal document:
```python
import asyncio
import os
from functools import partial

from raganything import RAGAnything
from lightrag import LightRAG
from lightrag.llm.openai import openai_complete_if_cache, openai_embed
from lightrag.utils import EmbeddingFunc

api_key = os.getenv("OPENAI_API_KEY", "your-api-key")


async def load_existing_lightrag():
    # Reuse the storage directory of an existing LightRAG instance
    lightrag_working_dir = "./existing_lightrag_storage"

    lightrag_instance = LightRAG(
        working_dir=lightrag_working_dir,
        llm_model_func=lambda prompt, system_prompt=None, history_messages=[], **kwargs: openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key=api_key,
            **kwargs,
        ),
        embedding_func=EmbeddingFunc(
            embedding_dim=3072,
            max_token_size=8192,
            func=partial(
                openai_embed.func,
                model="text-embedding-3-large",
                api_key=api_key,
            ),
        ),
    )
    await lightrag_instance.initialize_storages()

    # Wrap the existing LightRAG instance with RAG-Anything. The vision model
    # handles image content; plain text falls back to the regular LLM.
    rag = RAGAnything(
        lightrag=lightrag_instance,
        vision_model_func=lambda prompt, system_prompt=None, history_messages=[], image_data=None, **kwargs: openai_complete_if_cache(
            "gpt-4o",
            "",
            system_prompt=None,
            history_messages=[],
            # When an image is present, send a full messages list instead of a
            # bare prompt; omit the system message if none was given
            messages=[
                msg
                for msg in (
                    {"role": "system", "content": system_prompt} if system_prompt else None,
                    {
                        "role": "user",
                        "content": [
                            {"type": "text", "text": prompt},
                            {
                                "type": "image_url",
                                "image_url": {"url": f"data:image/jpeg;base64,{image_data}"},
                            },
                        ],
                    },
                )
                if msg is not None
            ],
            api_key=api_key,
            **kwargs,
        )
        if image_data
        else openai_complete_if_cache(
            "gpt-4o-mini",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            api_key=api_key,
            **kwargs,
        ),
    )

    # Query content that is already indexed in the existing storage
    result = await rag.query_with_multimodal(
        "What data has been processed in this LightRAG instance?",
        mode="hybrid",
    )
    print("Query result:", result)

    # Ingest an additional multimodal document into the same instance
    await rag.process_document_complete(
        file_path="path/to/new/multimodal_document.pdf",
        output_dir="./output",
    )


if __name__ == "__main__":
    asyncio.run(load_existing_lightrag())
```
## Token Usage Tracking

### Overview and Usage
LightRAG provides a TokenTracker tool to monitor token consumption reported by supported LLM providers. This feature is useful for controlling API costs and optimizing performance.
TokenTracker does not automatically inject itself into LLM calls. Pass it to the provider binding directly, bind it through `llm_model_kwargs`, or capture it in a custom LLM function.
### Method 1: Track direct LLM calls
```python
import asyncio

from lightrag.llm.openai import openai_complete_if_cache
from lightrag.utils import TokenTracker

token_tracker = TokenTracker()


async def main():
    # Entering the context resets the tracker; leaving it prints total usage
    with token_tracker:
        result1 = await openai_complete_if_cache(
            "gpt-4o-mini",
            "your question 1",
            token_tracker=token_tracker,
        )
        result2 = await openai_complete_if_cache(
            "gpt-4o-mini",
            "your question 2",
            token_tracker=token_tracker,
        )


asyncio.run(main())
```
The context manager resets the tracker when entering the block and prints usage when leaving it. The `token_tracker=token_tracker` argument is still required on each call.
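If you want explicit control instead of the context manager, the same `reset()` and `get_usage()` methods used in Method 2 below work directly:

```python
token_tracker.reset()  # start a fresh measurement window

await openai_complete_if_cache(
    "gpt-4o-mini",
    "your question",
    token_tracker=token_tracker,
)

# Read the accumulated usage instead of relying on the context manager's printout
print("Token usage:", token_tracker.get_usage())
```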
### Method 2: Track LightRAG calls
```python
from lightrag import LightRAG, QueryParam
from lightrag.llm.openai import gpt_4o_mini_complete
from lightrag.utils import TokenTracker

token_tracker = TokenTracker()

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=gpt_4o_mini_complete,
    # Forwarded to the default LLM wrappers so every call is tracked
    llm_model_kwargs={"token_tracker": token_tracker},
    embedding_func=embedding_func,  # your EmbeddingFunc instance
)
await rag.initialize_storages()

token_tracker.reset()

await rag.ainsert(["document one", "document two"])
await rag.aquery("your question 1", param=QueryParam(mode="naive"))
await rag.aquery("your question 2", param=QueryParam(mode="mix"))

print("Token usage:", token_tracker.get_usage())
```
`llm_model_kwargs={"token_tracker": token_tracker}` is forwarded to the default LLM wrappers used for extraction, keyword generation, querying, and VLM calls. If you configure role-specific LLM kwargs, add `token_tracker` to those role kwargs as well, or use the closure pattern below.
### Robust custom wrapper pattern
```python
from lightrag import LightRAG
from lightrag.llm.gemini import gemini_complete_if_cache
from lightrag.utils import TokenTracker


def make_llm_func(token_tracker: TokenTracker):
    # Capture the tracker in a closure so every call is tracked,
    # regardless of which role kwargs LightRAG applies
    async def _llm_model_func(
        prompt,
        system_prompt=None,
        history_messages=None,
        **kwargs,
    ):
        return await gemini_complete_if_cache(
            "gemini-2.5-flash-lite",
            prompt,
            system_prompt=system_prompt,
            history_messages=history_messages,
            token_tracker=token_tracker,
            **kwargs,
        )

    return _llm_model_func


token_tracker = TokenTracker()

rag = LightRAG(
    working_dir="./rag_storage",
    llm_model_func=make_llm_func(token_tracker),
    embedding_func=embedding_func,  # your EmbeddingFunc instance
)
await rag.initialize_storages()

token_tracker.reset()

await rag.ainsert(["document one", "document two"])

print("Token usage:", token_tracker.get_usage())
```
### Usage Tips

- Call `reset()` before each indexing or query phase so that usage numbers are attributed to the right phase.

## Data Export Functions

LightRAG allows you to export your knowledge graph data in various formats for analysis, sharing, and backup.
### Basic Usage
```python
# Basic CSV export (default format)
rag.export_data("knowledge_graph.csv")

# Specify any supported format
rag.export_data("output.xlsx", file_format="excel")
```
### Supported File Formats
rag.export_data("graph_data.csv", file_format="csv")
rag.export_data("graph_data.xlsx", file_format="excel")
rag.export_data("graph_data.md", file_format="md")
rag.export_data("graph_data.txt", file_format="txt")
### Additional Options
Include vector embeddings in the export (optional):
rag.export_data("complete_data.csv", include_vector_data=True)
All exports include entity information (names, IDs, metadata), relation data (connections between entities), and relationship information from the vector database.
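To sanity-check an export, you can load it back with a standard data library. A minimal sketch using pandas, assuming the CSV export is a single flat table (inspect the file first if your LightRAG version writes multiple sections):

```python
import pandas as pd

# Load the exported knowledge graph data
df = pd.read_csv("knowledge_graph.csv")

# Inspect the available columns before relying on specific field names
print(df.columns.tolist())
print(df.head())
```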
## Clear Cache
`aclear_cache()` clears all cached entries in `llm_response_cache`. It does not support selective cleanup by mode or cache type.
```python
# Asynchronous
await rag.aclear_cache()

# Synchronous
rag.clear_cache()
```
For selective cleanup of query-related caches, use the `lightrag.tools.clean_llm_query_cache` tool and see the guide in `lightrag/tools/README_CLEAN_LLM_QUERY_CACHE.md`. It manages query caches and keyword caches for the mix, hybrid, local, and global modes. It does not clean extraction caches such as `default:extract:*` and `default:summary:*`.
## Langfuse Observability

Langfuse provides a drop-in replacement for the OpenAI client that automatically tracks all LLM interactions, enabling developers to monitor, debug, and optimize their RAG systems.
```bash
pip install lightrag-hku[observability]

# Or from source:
pip install -e ".[observability]"
```
Add the following to your `.env` file:

```bash
## Langfuse Observability (Optional)
LANGFUSE_SECRET_KEY=""
LANGFUSE_PUBLIC_KEY=""
LANGFUSE_HOST="https://cloud.langfuse.com"  # or your self-hosted instance
LANGFUSE_ENABLE_TRACE=true
```
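The tracing works through Langfuse's drop-in replacement for the OpenAI client, which LightRAG wires up internally when the observability extra is installed. As an illustration of the pattern (this shows Langfuse's public SDK, not LightRAG internals):

```python
# Langfuse ships drop-in replacements for the OpenAI client classes;
# importing from langfuse.openai instead of openai enables automatic tracing
from langfuse.openai import AsyncOpenAI

# Credentials are read from OPENAI_API_KEY and the LANGFUSE_* variables above
client = AsyncOpenAI()
```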
Once installed and configured, Langfuse automatically traces all OpenAI LLM calls and surfaces them in its dashboard for inspection.
Note: LightRAG currently only integrates OpenAI-compatible API calls with Langfuse. APIs such as Ollama, Azure, and AWS Bedrock are not yet supported for Langfuse observability.
## RAGAS Evaluation

RAGAS (Retrieval Augmented Generation Assessment) is a framework for reference-free evaluation of RAG systems using LLMs. LightRAG provides an evaluation script based on RAGAS. For detailed information, see the RAGAS-based Evaluation Framework guide.
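As a rough sketch of what a RAGAS evaluation looks like (generic `ragas` usage, not LightRAG's bundled script; the exact metric set and API depend on your `ragas` version):

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# Collect LightRAG answers plus the retrieved contexts for each test question
data = Dataset.from_dict({
    "question": ["your question 1"],
    "answer": ["answer produced by LightRAG"],
    "contexts": [["retrieved context chunk 1", "retrieved context chunk 2"]],
})

# Reference-free, LLM-based scoring of answer quality and grounding
result = evaluate(data, metrics=[faithfulness, answer_relevancy])
print(result)
```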