Multi-Source Enhanced Retrieval-Augmented Generation Framework (MS-RAG)
Large Language Models (LLMs) are powerful, but they can only answer based on the data they were trained on. When users need up-to-date or domain-specific information — such as internal documents, proprietary databases, or the latest reports — LLMs alone fall short.
Retrieval-Augmented Generation (RAG) bridges this gap by retrieving relevant information from external knowledge sources and feeding it as context to the LLM before generating a response. This ensures answers are grounded in real data rather than memorized patterns.
DB-GPT implements a Multi-Source RAG (MS-RAG) framework that goes beyond basic document Q&A. It supports multiple knowledge sources (documents, URLs, databases, knowledge graphs), multiple retrieval strategies (vector, keyword, graph, hybrid), and integrates deeply with the DB-GPT agent and workflow ecosystem.
The MS-RAG pipeline consists of four stages:
Knowledge Source → Chunking → Indexing → Retrieval → LLM Generation
1. KnowledgeFactory automatically routes data sources (files, URLs, text) to the appropriate Knowledge implementation based on type and file extension.
2. ChunkManager splits loaded documents into manageable chunks using configurable strategies (by size, page, paragraph, separator, or markdown headers).
3. Assembler classes (Embedding, BM25, Summary, DBSchema) persist chunks into the appropriate index store (vector database, full-text engine, or knowledge graph).
4. Retriever fetches relevant chunks, an optional QueryRewrite step expands the query, and Ranker re-ranks results before the LLM generates the final answer.

The BaseAssembler defines a unified pipeline that connects all stages:
Knowledge.load() → ChunkManager.split() → Assembler.persist() → Assembler.as_retriever()
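
To make the four calls concrete, here is a minimal in-memory sketch of the same load, split, persist, retrieve flow. `ToyAssembler`, its fixed-size splitter, and its word-overlap scoring are hypothetical stand-ins written for illustration. They are not DB-GPT classes and do not reflect how BaseAssembler is implemented.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAssembler:
    """Hypothetical stand-in for an assembler; not a DB-GPT class."""

    text: str                 # stands in for the output of Knowledge.load()
    chunk_size: int = 40      # characters per chunk (illustrative value)
    chunks: list[str] = field(default_factory=list)
    index: list[tuple[set[str], str]] = field(default_factory=list)

    def split(self) -> "ToyAssembler":
        # Stand-in for ChunkManager.split(): fixed-size character windows.
        self.chunks = [
            self.text[i : i + self.chunk_size]
            for i in range(0, len(self.text), self.chunk_size)
        ]
        return self

    def persist(self) -> "ToyAssembler":
        # Stand-in for Assembler.persist(): index each chunk by its word set.
        self.index = [(set(c.lower().split()), c) for c in self.chunks]
        return self

    def retrieve(self, query: str, top_k: int = 1) -> list[str]:
        # Stand-in for a retriever: rank chunks by number of shared words.
        words = set(query.lower().split())
        ranked = sorted(
            self.index, key=lambda item: len(words & item[0]), reverse=True
        )
        return [chunk for _, chunk in ranked[:top_k]]


doc = "Chroma is the default vector store. Milvus targets production scale."
assembler = ToyAssembler(text=doc).split().persist()
top_chunks = assembler.retrieve("default vector store")
```

A real assembler replaces each stand-in with a pluggable component (a splitter, an index store, a retriever), but the order of the calls is the same.
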
DB-GPT provides four specialized assemblers:
| Assembler | Purpose | Index Backend |
|---|---|---|
| EmbeddingAssembler | Vector similarity RAG (most common) | Vector Store (Chroma, Milvus, etc.) |
| BM25Assembler | Keyword-based full-text retrieval | Elasticsearch |
| SummaryAssembler | Summary-based RAG for long documents | Vector Store |
| DBSchemaAssembler | Database schema retrieval for Text2SQL | Vector Store |
DB-GPT supports loading knowledge from multiple source types. In the Web UI, you can select a datasource type when uploading:
| Type | Description | Example |
|---|---|---|
| Document | Upload files in various formats | PDF, Word, Excel, CSV, Markdown, PowerPoint, TXT, HTML, JSON, ZIP |
| URL | Fetch and index web page content | Any accessible HTTP/HTTPS URL |
| Text | Directly input raw text | Paste text content in the UI |
| Yuque | Import from Yuque documentation platform | Yuque document links |

For the Document type, the file extension determines which Knowledge class loads the file:
| Format | Extension | Knowledge Class |
|---|---|---|
| PDF | .pdf | PDFKnowledge |
| CSV | .csv | CSVKnowledge |
| Markdown | .md | MarkdownKnowledge |
| Word (docx) | .docx | DocxKnowledge |
| Word (legacy) | .doc | Word97DocKnowledge |
| Excel | .xlsx | ExcelKnowledge |
| PowerPoint | .pptx | PPTXKnowledge |
| Plain Text | .txt | TXTKnowledge |
| HTML | .html | HTMLKnowledge |
| JSON | .json | JSONKnowledge |
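
Extension-based routing of this kind can be sketched as a plain lookup. The mapping below copies the extensions and class names from the table above. The function name `knowledge_class_for` and the error behaviour are assumptions made for illustration, not KnowledgeFactory's actual code.

```python
from pathlib import Path

# Extension -> Knowledge class name, copied from the table above.
KNOWLEDGE_BY_EXTENSION = {
    ".pdf": "PDFKnowledge",
    ".csv": "CSVKnowledge",
    ".md": "MarkdownKnowledge",
    ".docx": "DocxKnowledge",
    ".doc": "Word97DocKnowledge",
    ".xlsx": "ExcelKnowledge",
    ".pptx": "PPTXKnowledge",
    ".txt": "TXTKnowledge",
    ".html": "HTMLKnowledge",
    ".json": "JSONKnowledge",
}


def knowledge_class_for(file_path: str) -> str:
    """Return the Knowledge class name for a file, keyed by its extension."""
    suffix = Path(file_path).suffix.lower()  # case-insensitive match
    try:
        return KNOWLEDGE_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"Unsupported file extension: {suffix!r}") from None
```

For example, `knowledge_class_for("report.PDF")` resolves to PDFKnowledge because the suffix is lowercased before the lookup.
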
When creating a knowledge base, you can choose from three storage types:
| Storage Type | Description | Best For |
|---|---|---|
| Vector Store | Stores document embeddings for semantic similarity search | General-purpose document Q&A |
| Knowledge Graph | Stores entities and relationships as a graph structure | Domain knowledge with complex entity relationships |
| Full Text | Full-text index for keyword-based retrieval | Exact term matching and keyword search |

Vector store backends:
| Backend | Description | Install Extra |
|---|---|---|
| ChromaDB | Default embedded vector database, zero setup | storage_chromadb |
| Milvus | Distributed vector database for production scale | storage_milvus |
| PGVector | PostgreSQL extension for vector operations | storage_pgvector |
| Valkey | High-performance in-memory vector store with HNSW/FLAT indexing | storage_valkey |
| Weaviate | Cloud-native vector search engine | storage_weaviate |
| Elasticsearch | Full-text + vector hybrid search | storage_elasticsearch |
| OceanBase | Cloud-native distributed database | storage_oceanbase |

Knowledge graph backends:
| Backend | Description |
|---|---|
| TuGraph | High-performance graph database by Ant Group |
| Neo4j | Popular open-source graph database |
| MemGraph | In-memory graph database for low-latency queries |

Full-text backends:
| Backend | Description |
|---|---|
| Elasticsearch | Industry-standard full-text search engine |
| OpenSearch | AWS-managed search and analytics suite |
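
Keyword retrieval over a full-text index is commonly scored with BM25. The sketch below is a small pure-Python version of Okapi BM25 with a Lucene-style IDF, included only to show what the score rewards (rare terms, repeated terms, shorter documents). The function name, the whitespace tokenizer, and the `k1` and `b` values are assumptions for illustration. A search engine computes this internally, so none of this code is needed to use DB-GPT.

```python
import math
from collections import Counter


def bm25_scores(
    query: str, docs: list[str], k1: float = 1.5, b: float = 0.75
) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs if n_docs else 0.0
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document
            idf = math.log(
                (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1.0
            )
            # Length normalisation: longer-than-average documents are penalised.
            norm = tf + k1 * (1.0 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1.0) / norm
        scores.append(score)
    return scores


corpus = [
    "milvus is a distributed vector database",
    "chroma is an embedded vector database",
    "tugraph is a graph database",
]
scores = bm25_scores("graph database", corpus)
```

Here the third document scores highest: it is the only one containing "graph", and "database" contributes little because it appears in every document.
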
DB-GPT offers multiple retrieval modes. You can configure the retrieval mode in the knowledge base settings:
| Strategy | Description | Backend Required |
|---|---|---|
| Semantic | Vector similarity search using embeddings | Vector Store |
| Keyword | BM25-based keyword matching | Elasticsearch |
| Hybrid | Combines vector + keyword search with Reciprocal Rank Fusion (RRF) | Vector Store + Elasticsearch |
| Tree | Tree-structured retrieval for hierarchical documents | Vector Store |
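
Reciprocal Rank Fusion merges several ranked lists by summing `1 / (k + rank)` for each item across the lists, so it only needs ranks and never has to compare raw scores from different backends. A minimal sketch follows. The function name and the chunk IDs are made up, and `k = 60` is a value commonly used for RRF; whether DB-GPT's RRFRanker uses the same constant is not stated here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    rankings: list[list[str]], k: int = 60
) -> list[tuple[str, float]]:
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):  # ranks start at 1
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


vector_hits = ["c1", "c2", "c3"]   # hypothetical semantic results
keyword_hits = ["c3", "c1", "c4"]  # hypothetical keyword results
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Chunks that appear near the top of both lists (here `c1` and `c3`) rise above chunks that only one retriever found.
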
Beyond basic retrieval, DB-GPT provides advanced query processing:
| Reranker | Type | Description |
|---|---|---|
| CrossEncoderRanker | Local | Uses sentence-transformers CrossEncoder models |
| QwenRerankEmbeddings | Local | Qwen3-Reranker via transformers |
| OpenAPIRerankEmbeddings | API | Compatible with OpenAI-style rerank APIs |
| RRFRanker | Algorithm | Reciprocal Rank Fusion for merging multi-source results |
| DefaultRanker | Algorithm | Simple score-based sorting |
Document chunking is a critical step in RAG quality. DB-GPT supports multiple chunking strategies:
| Strategy | Splitter | Description |
|---|---|---|
| Chunk by Size | RecursiveCharacterTextSplitter | Split by character count with configurable size and overlap (default: 512 / 50) |
| Chunk by Page | PageTextSplitter | Split at page boundaries (useful for PDFs) |
| Chunk by Paragraph | ParagraphTextSplitter | Split at paragraph boundaries |
| Chunk by Separator | SeparatorTextSplitter | Split at custom separator strings |
| Chunk by Markdown Header | MarkdownHeaderTextSplitter | Split at markdown heading levels |
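
To show what size and overlap mean in practice, here is a simplified sliding-window splitter. It is a sketch under stated assumptions: `chunk_by_size` is a hypothetical helper, and it cuts purely on character positions, whereas a recursive splitter also tries to break on separators such as paragraphs and sentences first.

```python
def chunk_by_size(
    text: str, chunk_size: int = 512, chunk_overlap: int = 50
) -> list[str]:
    """Split text into chunk_size-character windows overlapping by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start : start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
pieces = chunk_by_size(sample, chunk_size=10, chunk_overlap=3)
```

With a size of 10 and an overlap of 3, each chunk starts 7 characters after the previous one, so the last 3 characters of one chunk reappear at the start of the next. The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk.
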

Chunking and retrieval parameters:
| Parameter | Description | Default |
|---|---|---|
| chunk_size | Maximum characters per chunk | 512 |
| chunk_overlap | Overlapping characters between adjacent chunks | 50 |
| topk | Number of chunks to retrieve per query | 5 |
| recall_score | Minimum relevance score threshold | 0 |
| recall_type | Recall strategy (TopK) | TopK |
| model | Embedding model to use | Depends on configuration |
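
The interaction between topk and recall_score can be sketched as a filter followed by a cut. The helper below is hypothetical, and it assumes the threshold is inclusive and applied before the top-k cut. The document does not specify either detail, so treat both as assumptions.

```python
def select_chunks(
    scored: list[tuple[str, float]], topk: int = 5, recall_score: float = 0.0
) -> list[tuple[str, float]]:
    """Drop candidates scoring below recall_score, then keep the topk best."""
    kept = [item for item in scored if item[1] >= recall_score]
    kept.sort(key=lambda item: item[1], reverse=True)  # best score first
    return kept[:topk]


# Hypothetical (chunk_id, relevance score) pairs.
candidates = [("a", 0.91), ("b", 0.42), ("c", 0.77), ("d", 0.15)]
selected = select_chunks(candidates, topk=2, recall_score=0.3)
```

With the default recall_score of 0, no candidate is filtered out and topk alone decides how many chunks reach the LLM.
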
DB-GPT supports a wide range of embedding models for converting text into vector representations:
| Model | Class | Description |
|---|---|---|
| HuggingFace | HuggingFaceEmbeddings | General-purpose HuggingFace models |
| BGE Series | HuggingFaceBgeEmbeddings | BAAI BGE models with instruction support (Chinese/English) |
| Instructor | HuggingFaceInstructEmbeddings | Instruction-following embedding models |

Embedding services reached through an API:
| Provider | Class | Description |
|---|---|---|
| OpenAI-compatible | OpenAPIEmbeddings | Any OpenAI-compatible embedding API |
| Jina | JinaEmbeddings | Jina AI embedding service |
| Ollama | OllamaEmbeddings | Local Ollama embedding server |
| Tongyi (Aliyun) | TongyiEmbeddings | Alibaba Cloud DashScope |
| Qianfan (Baidu) | QianfanEmbeddings | Baidu Wenxin platform |
| SiliconFlow | SiliconFlowEmbeddings | SiliconFlow embedding service |
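
Whichever model produces the vectors, semantic retrieval then compares the query vector with the stored chunk vectors. Cosine similarity is one widely used measure. The sketch below uses made-up two-dimensional vectors and hypothetical helper names, and it does not call any of the embedding classes listed above.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def nearest(query: list[float], vectors: dict[str, list[float]]) -> str:
    """Return the key whose vector is most similar to the query."""
    return max(vectors, key=lambda name: cosine_similarity(query, vectors[name]))


toy_embeddings = {"cats": [1.0, 0.0], "finance": [0.0, 1.0]}  # made-up vectors
best = nearest([0.9, 0.1], toy_embeddings)
```

Real embeddings have hundreds or thousands of dimensions, and a vector store uses an approximate index rather than a full scan, but the comparison it approximates is of this form.
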
Beyond traditional vector-based RAG, DB-GPT supports Knowledge Graph RAG for structured knowledge retrieval.
GraphRetriever combines four sub-strategies:

To build and query a knowledge base in the Web UI:

1. Navigate to the Knowledge section in the sidebar.
2. Select a datasource type and upload your content. Supported types include Document (PDF, Word, Excel, CSV, etc.), URL, Text, and Yuque.
3. Choose a chunking strategy and set its parameters.
4. Configure the retrieval strategy for your knowledge base. DB-GPT supports multiple retrieval modes (Semantic, Keyword, Hybrid, and Tree) to suit different query scenarios. Select the mode that best fits your use case in the knowledge base settings.
5. Go to Chat, click the knowledge base icon in the chat input toolbar, select your knowledge base from the dropdown, and start asking questions.

The same pipeline can be driven from Python:

```python
from dbgpt.rag import Chunk
from dbgpt_ext.rag.assembler import EmbeddingAssembler
from dbgpt_ext.rag.knowledge import KnowledgeFactory


async def build_and_query(your_vector_store, your_embedding_model) -> list[Chunk]:
    # Load knowledge from a file
    knowledge = KnowledgeFactory.create(file_path="your_document.pdf")

    # Build the embedding index
    assembler = await EmbeddingAssembler.aload_from_knowledge(
        knowledge=knowledge,
        index_store=your_vector_store,
        embedding_model=your_embedding_model,
    )
    assembler.persist()

    # Retrieve relevant chunks
    retriever = assembler.as_retriever(top_k=5)
    return await retriever.aretrieve("What is the main topic?")


# The calls above are awaited, so run the coroutine from an event loop, e.g.
# asyncio.run(build_and_query(store, model)) with your own store and model.
```

| Topic | Link |
|---|---|
| Knowledge Base Web UI Guide | Knowledge Base |
| RAG Concepts | RAG |
| Graph RAG Setup | Graph RAG |
| AWEL RAG Operators | AWEL |
| Source Code | GitHub |