Documentation/architecture_overview.md
Last updated: 2025-07-06
This document explains how data and control flow through the Advanced RAG System, from a user's browser to model inference and back. It is intended as the ground-truth reference for engineers and integrators.
```mermaid
flowchart LR
    subgraph Client
        U["👤 User (Browser)"]
        FE["Next.js Front-end\nReact Components"]
        U --> FE
    end

    subgraph Network
        FE -->|HTTP/JSON| BE["Python HTTP Server\nbackend/server.py"]
    end

    subgraph Core["rag_system core package"]
        BE --> LOOP["Agent Loop\n(rag_system/agent/loop.py)"]
        BE --> IDX["Indexing Pipeline\n(pipelines/indexing_pipeline.py)"]
        LOOP --> RP["Retrieval Pipeline\n(pipelines/retrieval_pipeline.py)"]
        LOOP --> VER["Verifier (Grounding Check)"]
        RP --> RET["Retrievers\nBM25 | Dense | Hybrid"]
        RP --> RER["AI Reranker"]
        RP --> SYNT["Answer Synthesiser"]
    end

    subgraph Storage
        LDB[("LanceDB Vector Tables")]
        SQL[("SQLite – chat & metadata")]
    end

    subgraph Models
        OLLAMA["Ollama Server\n(qwen3, etc.)"]
        HF["HuggingFace Hosted\nEmbedding/Reranker Models"]
    end

    %% data edges
    IDX -->|chunks & embeddings| LDB
    RET -->|vector search| LDB
    LOOP -->|LLM calls| OLLAMA
    RP -->|LLM calls| OLLAMA
    VER -->|LLM calls| OLLAMA
    RP -->|rerank| HF
    BE -->|CRUD| SQL
```
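The control flow in the diagram (Agent Loop calls the Retrieval Pipeline, which chains retrievers, reranker, and synthesiser, after which the Verifier checks grounding) can be illustrated with a minimal Python sketch. Every name here (`Chunk`, `run_retrieval_pipeline`, `agent_step`, and the `toy_*` stand-ins) is hypothetical and chosen for illustration only; the actual interfaces in `rag_system` may look quite different, and the toy stages replace LanceDB, the HuggingFace reranker, and Ollama with plain in-memory functions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Chunk:
    """A retrieved passage with a relevance score (hypothetical shape)."""
    text: str
    score: float


def run_retrieval_pipeline(
    query: str,
    retrieve: Callable[[str], List[Chunk]],
    rerank: Callable[[str, List[Chunk]], List[Chunk]],
    synthesise: Callable[[str, List[Chunk]], str],
    top_k: int = 2,
) -> Tuple[str, List[Chunk]]:
    """Retrieve candidates, rerank them, then synthesise an answer."""
    candidates = retrieve(query)
    ranked = rerank(query, candidates)[:top_k]
    return synthesise(query, ranked), ranked


def agent_step(
    query: str,
    pipeline: Callable[[str], Tuple[str, List[Chunk]]],
    verify: Callable[[str, List[Chunk]], bool],
) -> Dict[str, object]:
    """One agent-loop turn: run the pipeline, then check grounding."""
    answer, sources = pipeline(query)
    return {
        "answer": answer,
        "grounded": verify(answer, sources),
        "sources": [c.text for c in sources],
    }


# --- Toy stand-ins for the real stages (assumptions, not the real code) ---
CORPUS = [
    "LanceDB stores the vector tables.",
    "SQLite stores chat history and metadata.",
    "Ollama serves the language model.",
]


def _tokens(text: str) -> set:
    return set(text.lower().replace(".", "").split())


def toy_retrieve(query: str) -> List[Chunk]:
    # Score each passage by word overlap instead of a vector search.
    q = _tokens(query)
    return [Chunk(t, float(len(q & _tokens(t)))) for t in CORPUS]


def toy_rerank(query: str, chunks: List[Chunk]) -> List[Chunk]:
    # Sort by score; a real reranker would call a model here.
    return sorted(chunks, key=lambda c: c.score, reverse=True)


def toy_synthesise(query: str, chunks: List[Chunk]) -> str:
    # Return the best passage verbatim; a real synthesiser would call an LLM.
    return chunks[0].text if chunks else "No answer found."


def toy_verify(answer: str, chunks: List[Chunk]) -> bool:
    # Treat the answer as grounded only if it matches a source passage.
    return any(answer == c.text for c in chunks)


if __name__ == "__main__":
    result = agent_step(
        "Where is chat history stored?",
        lambda q: run_retrieval_pipeline(q, toy_retrieve, toy_rerank, toy_synthesise),
        toy_verify,
    )
    print(result)
```

Passing the stages in as callables keeps the orchestration separate from the storage and model back-ends, which mirrors how the diagram places Storage and Models outside the core package.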
The table below links to deep-dives for each major component.
| Component | Documentation |
|---|---|
| Agent Loop | system_overview.md |
| Indexing Pipeline | indexing_pipeline.md |
| Retrieval Pipeline | retrieval_pipeline.md |
| Verifier | verifier.md |
| Triage System | triage_system.md |
Change management: whenever the architecture changes (new micro-service, different DB, etc.), update this overview diagram first, then the individual component docs.