# AI Assistant Architecture
The AI Assistant provides conversational access to portfolio data through natural language queries. It uses LLM orchestration with tool calling to fetch and analyze financial data, presenting results through a streaming chat interface.
```text
┌─────────────────────────────────────────────────────────────────────────────┐
│ Frontend (React) │
│ │
│ ┌───────────────────┐ ┌────────────────────┐ ┌───────────────────────┐ │
│ │ Thread List │ │ Chat Shell │ │ Tool Result Cards │ │
│ │ - Pinned │ │ - Messages │ │ - Holdings table │ │
│ │ - Recent │ │ - Streaming │ │ - Performance chart │ │
│ │ - Search │ │ - Tool calls │ │ - Account summary │ │
│ └───────────────────┘ └────────────────────┘ └───────────────────────┘ │
│ │ │
└────────────────────────────────────┼─────────────────────────────────────────┘
│
│ NDJSON Stream (AiStreamEvent)
│ POST /api/v1/ai/chat/stream
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ Transport Layer │
│ │
│ ┌─────────────────────────────┐ ┌─────────────────────────────────────┐ │
│ │ Tauri (Desktop) │ │ Axum (Web Server) │ │
│ │ - IPC Channel streaming │ │ - NDJSON HTTP streaming │ │
│ │ - TauriAiEnvironment │ │ - ServerAiEnvironment │ │
│ └─────────────────────────────┘ └─────────────────────────────────────┘ │
│ │ │
└────────────────────────────────────┼─────────────────────────────────────────┘
│
│ AiEnvironment trait
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ wealthfolio-ai crate │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ ChatService<E> │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ Thread Cache │ │ rig-core │ │ Tool Registry │ │ │
│ │ │ (LRU, 100) │ │ Agent │ │ - get_holdings │ │ │
│ │ │ │ │ - streaming │ │ - get_accounts │ │ │
│ │ │ Fast lookups │ │ - multi-turn │ │ - search_activity │ │ │
│ │ │ for recent │ │ - tool calls │ │ - get_performance │ │ │
│ │ │ threads │ │ │ │ - get_goals │ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────┘ │ │
│ │ │ │ │ │ │
│ │ │ Stream completes │ │ │
│ │ │ │ │ │ │
│ │ ▼ ▼ ▼ │ │
│ │ ┌────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Persistence Actor (background tokio task) │ │ │
│ │ │ │ │ │
│ │ │ - Receives SaveThread/SaveMessage commands via channel │ │ │
│ │ │ - Batches writes for efficiency (500ms or 10 items) │ │ │
│ │ │ - Never blocks the streaming response │ │ │
│ │ │ - Retries on transient failures │ │ │
│ │ └────────────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │ │
└─────────────────────────────────────┼────────────────────────────────────────┘
│
│ AiChatRepositoryTrait (async)
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ wealthfolio-core │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ Domain Types (ai module) │ │
│ │ │ │
│ │ AiThread AiMessage AiMessageContent │ │
│ │ ├─ id ├─ id ├─ schema_version │ │
│ │ ├─ title ├─ thread_id ├─ parts[] │ │
│ │ ├─ is_pinned ├─ role │ ├─ Text │ │
│ │ ├─ tags[] ├─ content │ ├─ Reasoning │ │
│ │ ├─ config ├─ created_at │ ├─ ToolCall │ │
│ │ ├─ created_at └───────────── │ ├─ ToolResult │ │
│ │ └─ updated_at │ └─ Error │ │
│ │ └─────────────── │ │
│ │ AiChatRepositoryTrait │ │
│ │ ├─ create_thread() ├─ create_message() ├─ add_tag() │ │
│ │ ├─ get_thread() ├─ get_message() ├─ remove_tag() │ │
│ │ ├─ list_threads() ├─ get_messages_by_thread() │ │
│ │ ├─ update_thread() ├─ update_message() │ │
│ │ └─ delete_thread() │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────┬────────────────────────────────────────┘
│
│ Implements trait
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ wealthfolio-storage-sqlite │
│ │
│ ┌───────────────────────────────────────────────────────────────────────┐ │
│ │ ai_chat module │ │
│ │ │ │
│ │ AiChatRepository implements AiChatRepositoryTrait │ │
│ │ ├─ pool: Arc<Pool<SqliteConnection>> │ │
│ │ └─ writer: WriteHandle (serialized writes) │ │
│ │ │ │
│ │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────────┐ │ │
│ │ │ ai_threads │ │ ai_messages │ │ ai_thread_tags │ │ │
│ │ │ ─────────── │ │ ─────────── │ │ ────────────── │ │ │
│ │ │ id PK │ │ id PK │ │ id PK │ │ │
│ │ │ title │ │ thread_id FK │ │ thread_id FK │ │ │
│ │ │ is_pinned │ │ role │ │ tag │ │ │
│ │ │ config_json │ │ content_json │ │ created_at │ │ │
│ │ │ created_at │ │ created_at │ │ │ │ │
│ │ │ updated_at │ │ │ │ UNIQUE(thread,tag)│ │ │
│ │ └─────────────────┘ └─────────────────┘ └─────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────────────┘
```
The streaming response must never be blocked by database operations: all writes are handed off to a background persistence actor, and the thread cache keeps reads off the hot path. Each layer also owns its own types, keeping streaming and persistence concerns separate:
| Layer | Types | Purpose |
|---|---|---|
| Streaming | `AiStreamEvent`, `ToolResult`, `SendMessageRequest` | Wire format for real-time updates |
| Domain | `AiThread`, `AiMessage`, `AiMessageContent` | Persistence and business logic |
| Storage | `AiThreadDB`, `AiMessageDB` | Database models (Diesel) |
Following rig-core's design, conversation history is passed per request rather than stored on the agent:

```rust
// rig-core API: history is supplied with each call, not kept on the agent.
agent.stream_chat(prompt, history).multi_turn(6) // history: Vec<Message>
```

This keeps the agent stateless: each run can rebuild or truncate history independently, and a single agent can serve many threads.
`ChatService<E>` is the main orchestrator: it drives the rig-core agent, caches threads, and hands writes to the persistence actor.

```rust
pub struct ChatService<E: AiEnvironment> {
    env: Arc<E>,
    tool_registry: ToolRegistry,
    config: ChatConfig,
    // LRU cache for fast thread lookups
    thread_cache: Arc<RwLock<LruCache<String, AiThread>>>,
    // Channel to the persistence actor
    persistence_tx: mpsc::Sender<PersistenceCommand>,
}
```
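The thread cache follows a standard LRU policy. A minimal TypeScript sketch of that policy for illustration (the real cache is the Rust `LruCache`; the capacity of 100 comes from the diagram above):

```typescript
// Minimal LRU sketch: a Map preserves insertion order, so the first key is
// always the least recently used entry.
class LruCache<K, V> {
  private map = new Map<K, V>();
  constructor(private capacity = 100) {}

  get(key: K): V | undefined {
    const value = this.map.get(key);
    if (value !== undefined) {
      // Re-insert to mark the entry as most recently used.
      this.map.delete(key);
      this.map.set(key, value);
    }
    return value;
  }

  set(key: K, value: V): void {
    if (this.map.has(key)) {
      this.map.delete(key);
    } else if (this.map.size >= this.capacity) {
      // Evict the least recently used entry (first key in insertion order).
      this.map.delete(this.map.keys().next().value as K);
    }
    this.map.set(key, value);
  }

  has(key: K): boolean {
    return this.map.has(key);
  }
}
```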
The persistence actor is a background tokio task that batches and executes DB writes:

```rust
enum PersistenceCommand {
    SaveThread(AiThread),
    SaveMessage(AiMessage),
    UpdateThreadTitle { thread_id: String, title: String },
    DeleteThread(String),
}

async fn persistence_actor(
    rx: mpsc::Receiver<PersistenceCommand>,
    repository: Arc<dyn AiChatRepositoryTrait>,
) {
    // Batch writes every 500ms or when the batch reaches 10 items.
    // Retry transient failures with exponential backoff.
}
```
`AiEnvironment` is the dependency-injection interface implemented by the Tauri and Axum transports:

```rust
pub trait AiEnvironment: Send + Sync {
    // Currency for formatting
    fn base_currency(&self) -> String;

    // Services for tool execution
    fn account_service(&self) -> Arc<dyn AccountServiceTrait>;
    fn activity_service(&self) -> Arc<dyn ActivityServiceTrait>;
    fn holdings_service(&self) -> Arc<dyn HoldingsServiceTrait>;
    fn valuation_service(&self) -> Arc<dyn ValuationServiceTrait>;
    fn goal_service(&self) -> Arc<dyn GoalServiceTrait>;

    // Settings and secrets
    fn settings_service(&self) -> Arc<dyn SettingsServiceTrait>;
    fn secret_store(&self) -> Arc<dyn SecretStore>;

    // Chat persistence
    fn chat_repository(&self) -> Arc<dyn AiChatRepositoryTrait>;
}
```
`ToolRegistry` manages the available tools with allowlist support:

```rust
pub struct ToolRegistry {
    tools: HashMap<String, Arc<dyn Tool>>,
}

impl ToolRegistry {
    // Filter tool definitions by allowlist for thread-specific restrictions.
    pub fn get_definitions(&self, allowlist: Option<&[String]>) -> Vec<ToolDefinition>;

    // Execute a tool, enforcing the allowlist.
    pub async fn execute(
        &self,
        name: &str,
        args: Value,
        ctx: &ToolContext,
        allowlist: Option<&[String]>,
    ) -> Result<ToolResult, AiError>;
}
```
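The allowlist check itself is plain filtering. A hypothetical TypeScript rendering of `get_definitions` (the real implementation is the Rust method above; tool names are taken from the architecture diagram):

```typescript
// Return the tool definitions visible to a thread, honoring its allowlist.
interface ToolDefinition {
  name: string;
  description: string;
}

function getDefinitions(
  tools: Map<string, ToolDefinition>,
  allowlist?: string[],
): ToolDefinition[] {
  const all = Array.from(tools.values());
  if (!allowlist) return all; // no allowlist: every registered tool is available
  return all.filter((def) => allowlist.includes(def.name));
}
```

An absent allowlist means "all tools", while an empty allowlist means "no tools" — the two cases are deliberately distinct.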
Every NDJSON line on the wire is one `AiStreamEvent` (TypeScript wire type):

```typescript
type AiStreamEvent =
  | { type: "system"; threadId: string; runId: string; messageId: string }
  | { type: "textDelta"; threadId: string; runId: string; messageId: string; delta: string }
  | { type: "reasoningDelta"; threadId: string; runId: string; messageId: string; delta: string }
  | { type: "toolCall"; threadId: string; runId: string; messageId: string; toolCall: ToolCall }
  | { type: "toolResult"; threadId: string; runId: string; messageId: string; result: ToolResultData }
  | { type: "error"; threadId: string; runId: string; messageId?: string; code: string; message: string }
  | { type: "done"; threadId: string; runId: string; messageId: string; message: AiMessage; usage?: UsageStats };
```
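Consumers narrow on the `type` discriminant. A reduced sketch of a client-side reducer (the event shapes are trimmed to the fields the reducer touches; error handling is simplified):

```typescript
// Accumulate the assistant's text from a finished stream of events.
type StreamEvent =
  | { type: "textDelta"; delta: string }
  | { type: "error"; code: string; message: string }
  | { type: "done" };

function collectText(events: StreamEvent[]): string {
  let text = "";
  for (const event of events) {
    switch (event.type) {
      case "textDelta":
        text += event.delta;
        break;
      case "error":
        throw new Error(`${event.code}: ${event.message}`);
      case "done":
        return text;
    }
  }
  return text;
}
```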
A typical streamed exchange:

```text
Client Server
│ │
│ POST /ai/chat/stream │
│ { content: "Show holdings" } │
│ ─────────────────────────────>│
│ │
│ { type: "system", ... } │ ← Stream starts
│ <─────────────────────────────│
│ │
│ { type: "textDelta", ... } │ ← "Let me look up..."
│ <─────────────────────────────│
│ │
│ { type: "toolCall", ... } │ ← get_holdings called
│ <─────────────────────────────│
│ │
│ { type: "toolResult", ...} │ ← Holdings data + metadata
│ <─────────────────────────────│
│ │
│ { type: "textDelta", ... } │ ← "You have 15 holdings..."
│ <─────────────────────────────│
│ │
│ { type: "done", ... } │ ← Final message, stream ends
│ <─────────────────────────────│
│ │
```
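On the wire, each event is one JSON object per line. A sketch of client-side NDJSON framing (the helper name is ours; the carried-over remainder handles a line split across HTTP chunks):

```typescript
// Split a stream of text chunks into parsed NDJSON events, carrying any
// trailing partial line over to the next chunk.
function parseNdjsonChunk(
  carry: string,
  chunk: string,
): { events: unknown[]; carry: string } {
  const lines = (carry + chunk).split("\n");
  const rest = lines.pop() ?? ""; // the last piece may be an incomplete line
  const events = lines
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
  return { events, carry: rest };
}
```

Only complete lines are parsed, so a JSON object split across two network chunks is reassembled rather than rejected.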
All tool outputs use a consistent envelope for rich frontend rendering:

```rust
pub struct ToolResult {
    pub data: serde_json::Value,      // Structured result data
    pub meta: HashMap<String, Value>, // Metadata for UI rendering
}

// Metadata keys include:
// - count:          number of items returned
// - originalCount:  total items before truncation
// - returnedCount:  items actually returned
// - truncated:      whether results were truncated
// - durationMs:     execution time
// - accountScope:   which account(s) were queried
```
Tools enforce maximum output sizes to prevent context overflow:
| Tool | Limit | Constant |
|---|---|---|
| `get_holdings` | 100 items | `MAX_HOLDINGS` |
| `search_activities` | 200 rows | `MAX_ACTIVITIES_ROWS` |
| `get_valuations` | 400 points | `MAX_VALUATIONS_POINTS` |
| `get_income` | 50 records | `MAX_INCOME_RECORDS` |
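Combining the envelope and the limits: a tool caps its output and records what was cut. A sketch under the meta key names listed above (the helper itself is illustrative, not the crate's API):

```typescript
interface ToolResult {
  data: unknown;
  meta: Record<string, unknown>;
}

// Cap a result list at `limit` items and record truncation metadata so the
// UI can say "showing 100 of 150".
function capItems<T>(items: T[], limit: number): ToolResult {
  const returned = items.slice(0, limit);
  return {
    data: returned,
    meta: {
      originalCount: items.length,
      returnedCount: returned.length,
      truncated: items.length > limit,
    },
  };
}
```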
Messages store structured content with versioning for forward compatibility:
```json
{
  "schemaVersion": 1,
  "parts": [
    { "type": "text", "content": "Here are your holdings:" },
    {
      "type": "toolCall",
      "toolCallId": "tc-123",
      "name": "get_holdings",
      "arguments": { "accountId": "all" }
    },
    {
      "type": "toolResult",
      "toolCallId": "tc-123",
      "success": true,
      "data": { "holdings": [...] },
      "meta": { "count": 15, "truncated": false }
    },
    { "type": "text", "content": "You have 15 holdings worth $125,000." }
  ],
  "truncated": false
}
```
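The `schemaVersion` plus per-part discriminants are what make the format forward compatible: a reader skips part types it does not understand instead of rejecting the message. A sketch of that read (part shapes assumed from the example above):

```typescript
// Extract only the plain-text parts of a message; unknown part types are
// ignored rather than rejected, so new part kinds don't break old readers.
interface MessageContent {
  schemaVersion: number;
  parts: Array<{ type: string; [key: string]: unknown }>;
}

function textOf(content: MessageContent): string {
  return content.parts
    .filter((part) => part.type === "text")
    .map((part) => String(part.content ?? ""))
    .join(" ");
}
```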
Error codes map to HTTP statuses as follows:

| Code | HTTP Status | Description |
|---|---|---|
| `invalid_input` | 400 | Malformed request |
| `missing_api_key` | 400 | Provider API key not configured |
| `provider_error` | 502 | LLM provider returned an error |
| `tool_not_found` | 400 | Unknown tool requested |
| `tool_not_allowed` | 403 | Tool not in allowlist |
| `tool_execution_failed` | 500 | Tool threw an error |
| `thread_not_found` | 404 | Thread ID doesn't exist |
| `internal_error` | 500 | Unexpected server error |
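For the web transport, the code-to-status mapping above fits in one lookup table. A sketch (values copied from the table; the default for unknown codes is our assumption):

```typescript
// Map a stream error code to the HTTP status the server layer should return.
const ERROR_STATUS: Record<string, number> = {
  invalid_input: 400,
  missing_api_key: 400,
  provider_error: 502,
  tool_not_found: 400,
  tool_not_allowed: 403,
  tool_execution_failed: 500,
  thread_not_found: 404,
  internal_error: 500,
};

function statusFor(code: string): number {
  return ERROR_STATUS[code] ?? 500; // unknown codes default to 500 (assumption)
}
```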
| Cache | Size | Eviction | Purpose |
|---|---|---|---|
| Thread cache | 100 entries | LRU | Fast thread lookups |
| Provider catalog | Static | None (compile-time) | Provider/model metadata |
Indexes:

- `ai_threads(updated_at DESC)` for thread listing
- `ai_messages(thread_id, created_at)` for history loading
- `ai_thread_tags(thread_id, tag)` for tag filtering

Extension points:

- `AiThreadConfig.tools_allowlist` for restricting tools per thread
- `Tool` trait for adding new tools
- `AiEnvironment` trait for new service integrations
- `AiStreamEvent` variants for new event types