docs/public/architecture/search-architecture.mdx
Claude-mem uses an MCP-based search architecture that provides intelligent memory retrieval through 4 streamlined tools following a 3-layer workflow pattern.
Architecture: MCP Tools → MCP Protocol → HTTP API → Worker Service
Key Components:
- MCP Tools: search, timeline, get_observations, __IMPORTANT
- MCP Server (plugin/scripts/mcp-server.cjs) - Thin wrapper over HTTP API
- Token Efficiency: ~10x savings through 3-layer workflow pattern
Claude has access to 4 MCP tools. When searching memory, Claude follows the 3-layer workflow:
Step 1: search(query="authentication bug", type="bugfix", limit=10)
Step 2: timeline(anchor=<observation_id>, depth_before=3, depth_after=3)
Step 3: get_observations(ids=[123, 456, 789])
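The three steps can be sketched as one function that searches, filters, and only then fetches details. This is a minimal sketch, not the plugin's code: `ToolCaller`, `LookupResult`, `threeLayerLookup`, and `selectIds` are hypothetical names introduced for illustration, and tool results are assumed to arrive as plain text.

```typescript
// Hypothetical caller signature: sends one tool call and resolves with its text result.
type ToolCaller = (toolName: string, args: Record<string, unknown>) => Promise<string>;

interface LookupResult {
  index: string;    // Step 1 output: compact index with IDs
  context?: string; // Step 2 output: timeline around the first selected ID
  details?: string; // Step 3 output: full observations for the selected IDs
}

async function threeLayerLookup(
  callTool: ToolCaller,
  query: string,
  selectIds: (index: string) => number[],
): Promise<LookupResult> {
  // Step 1: cheap index search
  const index = await callTool("search", { query, limit: 10 });

  // Filter before fetching anything expensive
  const ids = selectIds(index);
  if (ids.length === 0) return { index };

  // Step 2: context around the first selected observation
  const context = await callTool("timeline", {
    anchor: ids[0],
    depth_before: 3,
    depth_after: 3,
  });

  // Step 3: full details only for the filtered IDs
  const details = await callTool("get_observations", { ids });
  return { index, context, details };
}
```

If `selectIds` returns an empty array, the function stops after step 1, so nothing beyond the index is fetched.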
MCP server receives tool call via JSON-RPC over stdio:
{
  "method": "tools/call",
  "params": {
    "name": "search",
    "arguments": {
      "query": "authentication bug",
      "type": "bugfix",
      "limit": 10
    }
  }
}
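The message above omits the JSON-RPC envelope fields. A complete JSON-RPC 2.0 request also carries `jsonrpc` and `id`, and the MCP stdio transport sends one JSON message per line. The sketch below builds such a request; `buildToolCall` and `frameMessage` are hypothetical helper names, not part of the plugin.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a JSON-RPC 2.0 request for an MCP tool call.
function buildToolCall(
  id: number,
  toolName: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

// Serialize as a single line: JSON.stringify emits no raw newlines,
// so the trailing "\n" is the only message delimiter.
function frameMessage(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}

const searchRequest = buildToolCall(1, "search", {
  query: "authentication bug",
  type: "bugfix",
  limit: 10,
});
```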
MCP server translates to HTTP request:
const url = `http://localhost:37777/api/search?query=authentication%20bug&type=bugfix&limit=10`;
const response = await fetch(url);
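The translation from tool arguments to a query string can be sketched with `URLSearchParams`. `buildSearchUrl` and `WORKER_BASE` are illustrative names, and the host and port are copied from the URL above. `URLSearchParams` encodes a space as `+` rather than `%20`; both decode to the same value.

```typescript
const WORKER_BASE = "http://localhost:37777"; // host and port as shown in this document

// Turn tool arguments into a GET /api/search URL, skipping unset values.
function buildSearchUrl(args: Record<string, string | number | undefined>): string {
  const params = new URLSearchParams();
  for (const [key, value] of Object.entries(args)) {
    if (value !== undefined) params.append(key, String(value));
  }
  return `${WORKER_BASE}/api/search?${params.toString()}`;
}

const searchUrl = buildSearchUrl({ query: "authentication bug", type: "bugfix", limit: 10 });
// searchUrl: http://localhost:37777/api/search?query=authentication+bug&type=bugfix&limit=10
```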
Worker service executes FTS5 query:
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = 'bugfix'
ORDER BY rank
LIMIT 10
Worker returns structured data → MCP server → Claude:
{
  "content": [{
    "type": "text",
    "text": "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |"
  }]
}
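To show how the index feeds the next step, the sketch below pulls observation IDs out of the table text. It assumes the first cell of each data row has the `#123` form shown above; `extractObservationIds` is a hypothetical helper, not part of the plugin.

```typescript
// Collect numeric IDs from table rows whose first cell looks like "#123".
function extractObservationIds(indexText: string): number[] {
  const ids: number[] = [];
  for (const line of indexText.split("\n")) {
    const match = /^\|\s*#(\d+)\s*\|/.exec(line);
    if (match) ids.push(Number(match[1]));
  }
  return ids;
}

const indexText =
  "| ID | Time | Title | Type |\n|---|---|---|---|\n| #123 | 2:15 PM | Fixed auth token expiry | bugfix |";
const foundIds = extractObservationIds(indexText); // [123]
```

The header and separator rows are skipped because their first cell does not start with `#`.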
Claude reviews the index, decides which observations are relevant, and can:
- call timeline to get context
- call get_observations to fetch full details for selected IDs

__IMPORTANT - Workflow Documentation
Always visible to Claude. Explains the 3-layer workflow pattern.
Description:
3-LAYER WORKFLOW (ALWAYS FOLLOW):
1. search(query) → Get index with IDs (~50-100 tokens/result)
2. timeline(anchor=ID) → Get context around interesting results
3. get_observations([IDs]) → Fetch full details ONLY for filtered IDs
NEVER fetch full details without filtering first. 10x token savings.
Purpose: Ensures Claude follows token-efficient pattern
search - Search Memory Index
Tool Definition:
{
  name: 'search',
  description: 'Step 1: Search memory. Returns index with IDs. Params: query, limit, project, type, obs_type, dateStart, dateEnd, offset, orderBy',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true // Accepts any parameters
  }
}
HTTP Endpoint: GET /api/search
Parameters:
- query - Full-text search query
- limit - Maximum results (default: 20)
- type - Filter by observation type
- project - Filter by project name
- dateStart, dateEnd - Date range filters
- offset - Pagination offset
- orderBy - Sort order

Returns: Compact index with IDs, titles, dates, types (~50-100 tokens per result)

timeline - Get Chronological Context
Tool Definition:
{
  name: 'timeline',
  description: 'Step 2: Get context around results. Params: anchor (observation ID) OR query (finds anchor automatically), depth_before, depth_after, project',
  inputSchema: {
    type: 'object',
    properties: {},
    additionalProperties: true
  }
}
HTTP Endpoint: GET /api/timeline
Parameters:
- anchor - Observation ID to center timeline around (optional if query provided)
- query - Search query to find anchor automatically (optional if anchor provided)
- depth_before - Number of observations before anchor (default: 3)
- depth_after - Number of observations after anchor (default: 3)
- project - Filter by project name

Returns: Chronological view showing what happened before/during/after
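The anchor-or-query rule and the documented defaults can be sketched as a small normalizer. The type and function names here are hypothetical, and the worker may validate differently.

```typescript
interface TimelineArgs {
  anchor?: number;
  query?: string;
  depth_before?: number;
  depth_after?: number;
  project?: string;
}

interface ResolvedTimelineArgs extends TimelineArgs {
  depth_before: number;
  depth_after: number;
}

// Require anchor or query, then fill in the documented depth defaults (3 and 3).
function normalizeTimelineArgs(args: TimelineArgs): ResolvedTimelineArgs {
  if (args.anchor === undefined && !args.query) {
    throw new Error("timeline requires either anchor or query");
  }
  return {
    ...args,
    depth_before: args.depth_before ?? 3,
    depth_after: args.depth_after ?? 3,
  };
}
```

For example, `normalizeTimelineArgs({ anchor: 123 })` yields a depth of 3 on both sides, while an empty argument object is rejected.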
get_observations - Fetch Full Details
Tool Definition:
{
  name: 'get_observations',
  description: 'Step 3: Fetch full details for filtered IDs. Params: ids (array of observation IDs, required), orderBy, limit, project',
  inputSchema: {
    type: 'object',
    properties: {
      ids: {
        type: 'array',
        items: { type: 'number' },
        description: 'Array of observation IDs to fetch (required)'
      }
    },
    required: ['ids'],
    additionalProperties: true
  }
}
HTTP Endpoint: POST /api/observations/batch
Body:
{
  "ids": [123, 456, 789],
  "orderBy": "date_desc",
  "project": "my-app"
}
Returns: Complete observation details (~500-1,000 tokens per observation)
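A minimal sketch of building the batch body from unvalidated input, following the `ids` schema above. Rejecting empty arrays and non-integer values, and de-duplicating IDs, are assumptions made for this example; `parseObservationIds` and `buildBatchBody` are hypothetical helpers.

```typescript
// Validate that the input is a non-empty array of integer IDs; drop duplicates.
function parseObservationIds(input: unknown): number[] {
  if (!Array.isArray(input) || input.length === 0) {
    throw new Error("ids must be a non-empty array");
  }
  const ids = input.map((value: unknown): number => {
    if (typeof value !== "number" || !Number.isInteger(value)) {
      throw new Error(`invalid observation ID: ${String(value)}`);
    }
    return value;
  });
  return Array.from(new Set(ids)); // Set preserves first-seen order
}

// Serialize the body for POST /api/observations/batch.
function buildBatchBody(
  ids: unknown,
  options: { orderBy?: string; project?: string } = {},
): string {
  return JSON.stringify({ ids: parseObservationIds(ids), ...options });
}
```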
Location: /Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs
Role: Thin wrapper that translates MCP protocol to HTTP API calls
Key Characteristics:
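- additionalProperties: true

Handler Example:
{
  name: 'search',
  handler: async (args: Record<string, unknown>) => {
    const endpoint = '/api/search';
    const searchParams = new URLSearchParams();
    // Forward every provided argument as a query parameter, skipping unset values
    for (const [key, value] of Object.entries(args)) {
      if (value === undefined || value === null) continue;
      searchParams.append(key, String(value));
    }
    const url = `http://localhost:37777${endpoint}?${searchParams}`;
    const response = await fetch(url);
    // Surface worker errors instead of parsing an error body as a result
    if (!response.ok) {
      throw new Error(`Worker request failed: ${response.status}`);
    }
    return await response.json();
  }
}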
Location: src/services/worker-service.ts
Port: 37777
Search Endpoints:
GET /api/search # Main search (used by MCP search tool)
GET /api/timeline # Timeline context (used by MCP timeline tool)
POST /api/observations/batch # Fetch by IDs (used by MCP get_observations tool)
GET /api/health # Health check
Database Access:
- SessionSearch service for FTS5 queries
- SessionStore for structured queries

FTS5 Full-Text Search:
-- search tool → HTTP GET → FTS5 query
SELECT * FROM observations_fts
WHERE observations_fts MATCH ?
AND type = ?
AND date >= ? AND date <= ?
ORDER BY rank
LIMIT ? OFFSET ?
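Since the filters are optional, the WHERE clause has to be assembled per request. The sketch below builds the statement and its bound parameters together. Table and column names come from the query above; the `SearchFilters` shape, the string date format, and the default offset of 0 are assumptions, and the real SessionSearch implementation may differ.

```typescript
interface SearchFilters {
  query: string;
  type?: string;
  dateStart?: string; // assumed to be comparable date strings
  dateEnd?: string;
  limit?: number;
  offset?: number;
}

// Build a parameterized FTS5 query; values are bound, never interpolated.
function buildSearchSql(f: SearchFilters): { sql: string; params: (string | number)[] } {
  const where: string[] = ["observations_fts MATCH ?"];
  const params: (string | number)[] = [f.query];

  if (f.type !== undefined) { where.push("type = ?"); params.push(f.type); }
  if (f.dateStart !== undefined) { where.push("date >= ?"); params.push(f.dateStart); }
  if (f.dateEnd !== undefined) { where.push("date <= ?"); params.push(f.dateEnd); }

  // Default limit of 20 matches the search tool's documented default.
  params.push(f.limit ?? 20, f.offset ?? 0);

  const sql =
    `SELECT * FROM observations_fts WHERE ${where.join(" AND ")} ` +
    "ORDER BY rank LIMIT ? OFFSET ?";
  return { sql, params };
}
```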
The 3-layer workflow embodies progressive disclosure - a core principle of claude-mem's architecture.
Layer 1: Index (Search)
Layer 2: Context (Timeline)
Layer 3: Details (Get Observations)
Traditional RAG Approach:
Fetch 20 observations upfront: 10,000-20,000 tokens
Relevance: ~10% (only 2 observations actually useful)
Waste: up to 18,000 tokens on irrelevant context
3-Layer Workflow:
Step 1: search (20 results) ~1,000-2,000 tokens
Step 2: Review index, filter to 3 relevant IDs
Step 3: get_observations (3 IDs) ~1,500-3,000 tokens
Total: 2,500-5,000 tokens (50-75% savings)
10x Savings: By filtering at index level before fetching full details
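The arithmetic behind these figures can be reproduced from the per-item estimates quoted in this document (~50-100 tokens per index row, ~500-1,000 per full observation). This is an illustrative calculation, not a measurement, and all names in it are hypothetical.

```typescript
interface TokenRange { low: number; high: number }

const INDEX_TOKENS: TokenRange = { low: 50, high: 100 };    // per search result
const DETAIL_TOKENS: TokenRange = { low: 500, high: 1000 }; // per full observation

function scaleRange(r: TokenRange, count: number): TokenRange {
  return { low: r.low * count, high: r.high * count };
}

// Compare fetching every result in full against index-then-fetch for a subset.
function compareApproaches(results: number, selected: number) {
  const upfront = scaleRange(DETAIL_TOKENS, results);
  const index = scaleRange(INDEX_TOKENS, results);
  const details = scaleRange(DETAIL_TOKENS, selected);
  const layered: TokenRange = {
    low: index.low + details.low,
    high: index.high + details.high,
  };
  return {
    upfront,
    layered,
    savingsLow: 1 - layered.low / upfront.low,
    savingsHigh: 1 - layered.high / upfront.high,
  };
}

const comparison = compareApproaches(20, 3);
// upfront: 10,000-20,000 tokens; layered: 2,500-5,000 tokens
```

With 20 results and 3 selected, both ends of the range come out at a 75% reduction. The saving grows as fewer of the indexed results are expanded.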
Approach: 9 MCP tools with detailed parameter schemas
Token Cost: ~2,500 tokens in tool definitions per session
- search_observations - Full-text search
- find_by_type - Filter by type
- find_by_file - Filter by file
- find_by_concept - Filter by concept
- get_recent_context - Recent sessions
- get_observation - Fetch single observation
- get_session - Fetch session
- get_prompt - Fetch prompt
- help - API documentation

Problems:
Code Size: ~2,718 lines in mcp-server.ts
Approach: 4 MCP tools following 3-layer workflow
Token Cost: ~312 lines of code, simplified tool definitions
Tools:
- __IMPORTANT - Workflow guidance (always visible)
- search - Step 1 (index)
- timeline - Step 2 (context)
- get_observations - Step 3 (details)

Benefits:
- Simplified tool definitions (additionalProperties: true)

Code Size: ~312 lines in mcp-server.ts (88% reduction)
Before: Progressive disclosure was something Claude had to remember
After: Progressive disclosure is enforced by tool design itself
The 3-layer workflow pattern makes it structurally difficult to waste tokens:
- Workflow guidance is always visible (__IMPORTANT)

Claude Desktop: add to claude_desktop_config.json:
{
  "mcpServers": {
    "mcp-search": {
      "command": "node",
      "args": [
        "/Users/YOUR_USERNAME/.claude/plugins/marketplaces/thedotmack/plugin/scripts/mcp-server.cjs"
      ]
    }
  }
}
Claude Code: the MCP server is automatically configured via plugin installation. No manual setup required.
Both clients use the same MCP tools - the architecture works identically for Claude Desktop and Claude Code.
All search queries are escaped before FTS5 processing:
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}
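In FTS5 query syntax, doubling a quote only escapes it inside a double-quoted string, so the escaped text has to be wrapped in quotes before operators such as OR or NEAR are treated as literal text. The sketch below shows that wrapping step. `toFTS5Phrase` is a hypothetical helper, and whether claude-mem wraps queries this way is an assumption not confirmed by this document.

```typescript
// Same escaping as above: double every embedded double quote.
function escapeFTS5Query(query: string): string {
  return query.replace(/"/g, '""');
}

// Hypothetical helper: wrap the escaped text so FTS5 parses it as one quoted string.
function toFTS5Phrase(query: string): string {
  return `"${escapeFTS5Query(query)}"`;
}

const phrase = toFTS5Phrase('auth "token" OR 1=1');
// phrase: "auth ""token"" OR 1=1"
```

The resulting string is still passed as a bound parameter to `MATCH ?`, so SQL-level injection is handled by parameter binding and FTS5-level syntax by the quoting.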
Testing: 332 injection attack tests covering special characters, SQL keywords, quote escaping, and boolean operators.
FTS5 Full-Text Search: Sub-10ms for typical queries
MCP Overhead: Minimal - simple protocol translation
Caching: HTTP layer allows response caching (future enhancement)
Pagination: Efficient with offset/limit
Batching: get_observations accepts multiple IDs in single call
Traditional RAG:
3-Layer MCP:
Previous (9 tools):
Current (4 tools):
Skill approach:
MCP approach:
Migration: Skill-based search was removed in favor of streamlined MCP architecture.
Symptoms: Tools not appearing in Claude
Solution:
curl http://localhost:37777/api/health

Symptoms: MCP tools fail with connection errors
Solution:
npm run worker:status # Check status
npm run worker:restart # Restart worker
npm run worker:logs # View logs
Symptoms: search() returns no results
Troubleshooting:
curl "http://localhost:37777/api/search?query=test"
ls ~/.claude-mem/claude-mem.db
curl "http://localhost:37777/api/health"