docs/docs/getting-started/web-ui/knowledge-base.md
Build and manage knowledge bases for Retrieval-Augmented Generation (RAG). Upload documents, configure retrieval, and use them in chat.
Click Knowledge in the sidebar to open the knowledge management page.
:::info Supported file formats
| Format | Extensions |
|---|---|
| Documents | .pdf, .docx, .doc, .txt, .md |
| Spreadsheets | .xlsx, .xls, .csv |
| Web | .html, .htm |
| Data | .json |
| Code | .py, .java, .js, .ts, etc. |
:::
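
As a rough illustration of the format table above, a pre-upload check might map file extensions to these categories. This is a minimal sketch, not DB-GPT's own validation logic: `SUPPORTED_EXTENSIONS` and `classify_upload` are hypothetical names, and because the "Code" row ends in "etc.", the code list here is deliberately incomplete.

```python
from pathlib import Path
from typing import Optional

# Extensions copied from the table above. The "Code" row ends in "etc.",
# so this mapping is an incomplete, illustrative subset.
SUPPORTED_EXTENSIONS = {
    "Documents": {".pdf", ".docx", ".doc", ".txt", ".md"},
    "Spreadsheets": {".xlsx", ".xls", ".csv"},
    "Web": {".html", ".htm"},
    "Data": {".json"},
    "Code": {".py", ".java", ".js", ".ts"},
}


def classify_upload(filename: str) -> Optional[str]:
    """Return the format category for a filename, or None if it is not listed."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    for category, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return category
    return None
```

A caller could use the `None` result to warn the user before attempting an upload.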
Each knowledge base has configurable settings:
| Setting | Description | Default |
|---|---|---|
| Chunk Size | Maximum characters per chunk | 512 |
| Chunk Overlap | Overlap between consecutive chunks | 50 |
| Top K | Number of chunks to retrieve per query | 5 |
| Score Threshold | Minimum relevance score for retrieval | 0.3 |
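
To make the four settings concrete, the sketch below shows one plausible way they could interact. It is an illustration under stated assumptions, not DB-GPT's implementation: `chunk_text` and `select_chunks` are hypothetical helpers, the overlap is assumed to be measured in characters like Chunk Size, and higher scores are assumed to mean more relevant.

```python
from typing import List, Tuple


def chunk_text(text: str, chunk_size: int = 512, chunk_overlap: int = 50) -> List[str]:
    """Split text into character windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reaches the end of the text
        start += step
    return chunks


def select_chunks(
    scored: List[Tuple[str, float]], top_k: int = 5, score_threshold: float = 0.3
) -> List[Tuple[str, float]]:
    """Drop chunks scoring below the threshold, then keep the top_k best."""
    kept = [(chunk, score) for chunk, score in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]
```

With the defaults, a 1,000-character document yields three chunks (512, 512, and 76 characters), and the last 50 characters of each chunk repeat at the start of the next.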
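:::tip Tuning retrieval
Raising Top K returns more chunks per query, and raising Score Threshold drops chunks whose relevance score falls below the new minimum.
:::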
DB-GPT supports multiple vector storage backends:
| Backend | Description | Install Extra |
|---|---|---|
| ChromaDB | Default, embedded, no setup needed | storage_chromadb |
| Milvus | Distributed vector database for production | storage_milvus |
| OceanBase | Cloud-native distributed database | storage_oceanbase |
To use a non-default backend, add the corresponding extra to your install command:
```bash
uv sync --all-packages --extra "storage_milvus" ...
```
<details>
<summary><strong>Knowledge graph</strong></summary>

DB-GPT supports knowledge graphs for structured retrieval. See Graph RAG for setup instructions.

</details>

<details>
<summary><strong>Keyword retrieval (BM25)</strong></summary>

For hybrid retrieval combining vector and keyword search:
```bash
uv sync --all-packages --extra "rag_bm25" ...
```
This enables BM25 indexing alongside vector embeddings for improved recall.
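
For readers unfamiliar with BM25, the following is a minimal, self-contained sketch of the scoring function using whitespace tokenization. It is not the code that the `rag_bm25` extra installs: `bm25_scores` is a hypothetical helper, and `k1=1.5` and `b=0.75` are commonly used defaults rather than values taken from DB-GPT.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: str, documents: List[str], k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document
            # Non-negative IDF variant: rarer terms contribute more.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that share no terms with the query score exactly zero, which is why keyword scores are usually combined with vector similarity rather than used alone.
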
</details>

| Action | How |
|---|---|
| View | Click on a knowledge base to see its documents and settings |
| Add documents | Use the Upload button within the knowledge base |
| Delete documents | Select documents and click Delete |
| Delete knowledge base | Use the Delete button on the knowledge base card |

:::warning Deleting is permanent
Deleting a knowledge base removes all associated vector embeddings and indexed data. The original uploaded files are not recoverable.
:::

| Topic | Link |
|---|---|
| Use knowledge in chat | Chat |
| RAG concepts | RAG |
| Advanced RAG configuration | RAG Tutorial |