# Offline Deployment
This guide provides comprehensive instructions for deploying LightRAG in offline environments where internet access is limited or unavailable.
If you deploy LightRAG using Docker, there is no need to refer to this document, as the LightRAG Docker image is pre-configured for offline operation.
Software packages requiring `transformers`, `torch`, or `cuda` will not be included in the offline dependency group. Consequently, document extraction tools such as Docling, as well as local LLM frameworks like Hugging Face Transformers and LMDeploy, are outside the scope of offline installation support. Such compute-intensive services should not be integrated directly into LightRAG; Docling will be decoupled and deployed as a standalone service.
LightRAG uses dynamic package installation (via `pipmaster`) for optional features, selected at runtime based on file types and configuration. In offline environments these dynamic installations fail, so this guide shows you how to pre-install all necessary dependencies and cache files.
LightRAG dynamically installs packages for:

- Storage backends: `redis`, `neo4j`, `pymilvus`, `pymongo`, `asyncpg`, `qdrant-client`
- LLM providers: `openai`, `anthropic`, `ollama`, `zhipuai`, `aioboto3`, `voyageai`, `llama-index`, `lmdeploy`, `transformers`, `torch`

Note: Document processing dependencies (`pypdf`, `python-docx`, `python-pptx`, `openpyxl`) are now pre-installed with the `api` extras group and no longer require dynamic installation.
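To see why offline environments break this, the dynamic-install mechanism can be pictured as a check-then-install loop. The sketch below is a hypothetical helper, not LightRAG's actual `pipmaster` code: it only shells out to pip when the module is not already importable, and that pip step is exactly what fails without internet access.

```python
import importlib.util
import subprocess
import sys

def ensure_package(module_name, pip_name=None):
    """Return True if the module is importable, installing via pip if needed.

    Conceptual sketch of dynamic installation. In an offline environment
    the pip fallback fails, which is why every optional package must be
    pre-installed before deployment.
    """
    if importlib.util.find_spec(module_name) is not None:
        return True  # already installed: no network access needed
    # Online-only fallback -- this is the step that breaks offline:
    result = subprocess.run(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    return result.returncode == 0

# A pre-installed module resolves without touching the network:
print(ensure_package("json"))  # → True
```

Pre-installing the extras groups described below keeps every call on the fast `find_spec` path.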
```bash
# Online environment: Install all offline dependencies
pip install lightrag-hku[offline]

# Download tiktoken cache
lightrag-download-cache

# Create offline package
pip download lightrag-hku[offline] -d ./offline-packages
tar -czf lightrag-offline.tar.gz ./offline-packages ~/.tiktoken_cache

# Transfer to offline server
scp lightrag-offline.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf lightrag-offline.tar.gz
pip install --no-index --find-links=./offline-packages lightrag-hku[offline]
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache
```
```bash
# Online environment: Download packages
pip download -r requirements-offline.txt -d ./packages

# Transfer to offline server
tar -czf packages.tar.gz ./packages
scp packages.tar.gz user@offline-server:/path/to/

# Offline environment: Install
tar -xzf packages.tar.gz
pip install --no-index --find-links=./packages -r requirements-offline.txt
```
LightRAG provides flexible dependency groups for different use cases:
| Group | Description | Use Case |
|---|---|---|
| `api` | API server + document processing | FastAPI server with PDF, DOCX, PPTX, XLSX support |
| `offline-storage` | Storage backends | Redis, Neo4j, MongoDB, PostgreSQL, etc. |
| `offline-llm` | LLM providers | OpenAI, Anthropic, Ollama, etc. |
| `offline` | Complete offline package | API + Storage + LLM (all features) |

Note: Document processing (PDF, DOCX, PPTX, XLSX) is included in the `api` extras group. The previous `offline-docs` group has been merged into `api` for better integration.
Software packages requiring `transformers`, `torch`, or `cuda` will not be included in the offline dependency group.
```bash
# Install API with document processing
pip install lightrag-hku[api]

# Install API and storage backends
pip install lightrag-hku[api,offline-storage]

# Install all offline dependencies (recommended for offline deployment)
pip install lightrag-hku[offline]
```
```bash
# Storage backends only
pip install -r requirements-offline-storage.txt

# LLM providers only
pip install -r requirements-offline-llm.txt

# All offline dependencies
pip install -r requirements-offline.txt
```
Tiktoken downloads BPE encoding models on first use. In offline environments, you must pre-download these models.
After installing LightRAG, use the built-in command:
```bash
# Download to default location (see output for exact path)
lightrag-download-cache

# Download to specific directory
lightrag-download-cache --cache-dir ./tiktoken_cache

# Download specific models only
lightrag-download-cache --models gpt-4o-mini gpt-4
```
The following models are cached by default:

- `gpt-4o-mini` (LightRAG default)
- `gpt-4o`
- `gpt-4`
- `gpt-3.5-turbo`
- `text-embedding-ada-002`
- `text-embedding-3-small`
- `text-embedding-3-large`

```bash
# Option 1: Environment variable (temporary)
export TIKTOKEN_CACHE_DIR=/path/to/tiktoken_cache

# Option 2: Add to ~/.bashrc or ~/.zshrc (persistent)
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
source ~/.bashrc

# Option 3: Copy to default location
cp -r /path/to/tiktoken_cache ~/.tiktoken_cache/
```
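If you are wondering why the cached files have opaque 40-character names, recent versions of tiktoken's loader resolve the cache directory from `TIKTOKEN_CACHE_DIR` (then `DATA_GYM_CACHE_DIR`, then a temp-dir fallback) and name each cached file by the SHA-1 of its download URL. The sketch below mirrors that behavior with the standard library only; treat the exact fallback order as an assumption to verify against your installed tiktoken version.

```python
import hashlib
import os
import tempfile

def tiktoken_cache_dir():
    """Mirror tiktoken's cache-dir resolution order (assumed, per recent
    tiktoken releases): env vars first, then a temp-dir fallback."""
    if "TIKTOKEN_CACHE_DIR" in os.environ:
        return os.environ["TIKTOKEN_CACHE_DIR"]
    if "DATA_GYM_CACHE_DIR" in os.environ:
        return os.environ["DATA_GYM_CACHE_DIR"]
    return os.path.join(tempfile.gettempdir(), "data-gym-cache")

def cache_filename(blob_url):
    """tiktoken names cached files by the SHA-1 hex digest of the source URL."""
    return hashlib.sha1(blob_url.encode()).hexdigest()

# Example: the cl100k_base BPE file used by gpt-4 / gpt-3.5-turbo
url = "https://openaipublic.blob.core.windows.net/encodings/cl100k_base.tiktoken"
print(os.path.join(tiktoken_cache_dir(), cache_filename(url)))
```

This also explains the troubleshooting step later in this guide: if `ls ~/.tiktoken_cache/` shows no such hash-named files, the cache was never populated.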
```bash
# 1. Install LightRAG with offline dependencies
pip install lightrag-hku[offline]

# 2. Download tiktoken cache
lightrag-download-cache --cache-dir ./offline_cache/tiktoken

# 3. Download all Python packages
pip download lightrag-hku[offline] -d ./offline_cache/packages

# 4. Create archive for transfer
tar -czf lightrag-offline-complete.tar.gz ./offline_cache

# 5. Verify contents
tar -tzf lightrag-offline-complete.tar.gz | head -20
```
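Eyeballing `tar -tzf` output is easy to get wrong for large bundles. A minimal sketch of a stricter check (hypothetical helper, using only the standard library and the `offline_cache/packages` and `offline_cache/tiktoken` paths from the steps above): it lists the archive and reports which required directories are missing before you ship it.

```python
import tarfile

def verify_bundle(archive_path,
                  required_prefixes=("offline_cache/packages",
                                     "offline_cache/tiktoken")):
    """Return the list of required path prefixes missing from the archive.

    An empty list means both the wheel cache and the tiktoken cache
    made it into the bundle.
    """
    with tarfile.open(archive_path, "r:gz") as tar:
        names = [n.lstrip("./") for n in tar.getnames()]
    return [prefix for prefix in required_prefixes
            if not any(n.startswith(prefix) for n in names)]
```

Run it on `lightrag-offline-complete.tar.gz` before the transfer step; a non-empty result means a download step above was skipped.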
```bash
# Using scp
scp lightrag-offline-complete.tar.gz user@offline-server:/tmp/

# Or using USB/physical media
# Copy lightrag-offline-complete.tar.gz to USB drive
```
```bash
# 1. Extract archive
cd /tmp
tar -xzf lightrag-offline-complete.tar.gz

# 2. Install Python packages
pip install --no-index \
  --find-links=/tmp/offline_cache/packages \
  lightrag-hku[offline]

# 3. Set up tiktoken cache
mkdir -p ~/.tiktoken_cache
cp -r /tmp/offline_cache/tiktoken/* ~/.tiktoken_cache/
export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache

# 4. Add to shell profile for persistence
echo 'export TIKTOKEN_CACHE_DIR=~/.tiktoken_cache' >> ~/.bashrc
```
```bash
# Test Python import
python -c "from lightrag import LightRAG; print('✓ LightRAG imported')"

# Test tiktoken
python -c "from lightrag.utils import TiktokenTokenizer; t = TiktokenTokenizer(); print('✓ Tiktoken working')"

# Test optional dependencies (if installed)
python -c "import docling; print('✓ Docling available')"
python -c "import redis; print('✓ Redis available')"
```
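The one-liner spot checks above can be folded into a single report. A minimal sketch (hypothetical script, standard library only): the module names passed in are examples drawn from the optional groups earlier in this guide, and any of them may legitimately be absent in a minimal install.

```python
import importlib

def check_modules(names):
    """Return a dict mapping each module name to whether it imports cleanly."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:  # also catches ModuleNotFoundError
            status[name] = False
    return status

# Example set: a few optional backends/providers from earlier sections
for mod, ok in check_modules(["redis", "neo4j", "openai", "tiktoken"]).items():
    print(f"{'✓' if ok else '✗'} {mod}")
```

A `✗` here is only a problem if that extras group is one you intended to install.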
Problem: `Unable to load tokenizer for model gpt-4o-mini`

Solution:

```bash
# Ensure TIKTOKEN_CACHE_DIR is set
echo $TIKTOKEN_CACHE_DIR

# Verify cache files exist
ls -la ~/.tiktoken_cache/

# If empty, you need to download cache in online environment first
```
Problem: `Error installing package xxx`

Solution:

```bash
# Pre-install the specific package you need
# For API with document processing:
pip install lightrag-hku[api]

# For storage backends:
pip install lightrag-hku[offline-storage]

# For LLM providers:
pip install lightrag-hku[offline-llm]
```
Problem: `ModuleNotFoundError: No module named 'xxx'`

Solution:

```bash
# Check what you have installed
pip list | grep -i xxx

# Install missing component
pip install lightrag-hku[offline]  # Install all offline deps
```
Problem: `PermissionError: [Errno 13] Permission denied`

Solution:

```bash
# Ensure cache directory has correct permissions
chmod 755 ~/.tiktoken_cache
chmod 644 ~/.tiktoken_cache/*

# Or use a user-writable directory
export TIKTOKEN_CACHE_DIR=~/my_tiktoken_cache
mkdir -p ~/my_tiktoken_cache
```
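Rather than waiting for tiktoken to hit the permission error at runtime, you can pre-check writability. A minimal sketch (hypothetical helper, standard library only; the actual failure mode depends on your tiktoken version): it attempts to create the configured cache directory and write a throwaway file into it.

```python
import os
import tempfile

def cache_dir_writable(path):
    """Return True if the directory can be created and written to."""
    try:
        os.makedirs(path, exist_ok=True)
        # Creating and deleting a temp file proves write permission:
        with tempfile.NamedTemporaryFile(dir=path, delete=True):
            pass
        return True
    except OSError:
        return False

cache = os.environ.get("TIKTOKEN_CACHE_DIR",
                       os.path.expanduser("~/.tiktoken_cache"))
print("writable" if cache_dir_writable(cache)
      else "not writable: fix permissions or pick another directory")
```

Run this once after the cache-setup step; "not writable" means the `chmod` or alternative-directory fix above is needed.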
1. **Test in Online Environment First**: Always test your complete setup in an online environment before going offline.
2. **Keep Cache Updated**: Periodically update your offline cache when new models are released.
3. **Document Your Setup**: Keep notes on which optional dependencies you actually need.
4. **Version Pinning**: Consider pinning specific versions in production:

   ```bash
   pip freeze > requirements-production.txt
   ```

5. **Minimal Installation**: Only install what you need:

   ```bash
   # If you only need API with document processing
   pip install lightrag-hku[api]
   # Then manually add a specific LLM provider: pip install openai
   ```
If you encounter issues not covered in this guide: