doc/multi_agent_database_discovery.md
This document describes a multi-agent database discovery system implemented using Claude Code's autonomous agent capabilities. The system uses 4 specialized subagents that collaborate via the MCP (Model Context Protocol) catalog to perform comprehensive database analysis.
┌─────────────────────────────────────────────────────────────────────┐
│ Main Agent (Orchestrator) │
│ - Launches 4 specialized subagents in parallel │
│ - Coordinates via MCP catalog │
│ - Synthesizes final report │
└────────────────┬────────────────────────────────────────────────────┘
│
┌────────────┼────────────┬────────────┬────────────┐
│ │ │ │ │
▼ ▼ ▼ ▼ ▼
┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐
│Struct. │ │Statist.│ │Semantic│ │Query │ │ MCP │
│ Agent │ │ Agent │ │ Agent │ │ Agent │ │Catalog │
└────────┘ └────────┘ └────────┘ └────────┘ └────────┘
│ │ │ │ │
└────────────┴────────────┴────────────┴────────────┘
│
▼ ▼
┌─────────┐ ┌─────────────┐
│ Database│ │ Catalog │
│ (testdb)│ │ (Shared Mem)│
└─────────┘ └─────────────┘
Mission: Map tables, relationships, indexes, and constraints
Responsibilities:
Catalog Entries: structural_discovery
Key Deliverables:
Mission: Profile data distributions, patterns, and anomalies
Responsibilities:
Catalog Entries: statistical_discovery
Key Deliverables:
Mission: Infer business domain and entity types
Responsibilities:
Catalog Entries: semantic_discovery
Key Deliverables:
Mission: Analyze access patterns and optimization opportunities
Responsibilities:
Catalog Entries: query_discovery
Key Deliverables:
Each agent runs 4 rounds of analysis:
# Agent writes findings
catalog_upsert(
kind="structural_discovery",
key="table_customers",
document="...",
tags="structural,table,schema"
)
# Agent reads other agents' findings
findings = catalog_list(kind="statistical_discovery")
| Metric | Current | Actual |
|---|---|---|
| Customers | 15 | 5 |
| Products | 15 | 5 |
| Orders | 15 | 5 |
| Order Items | 27 | 9 |
| Revenue | $10,886.67 | $3,628.85 |
# In Claude Code, launch 4 agents in parallel:
Task(
description="Structural Discovery",
prompt=STRUCTURAL_AGENT_PROMPT,
subagent_type="general-purpose"
)
Task(
description="Statistical Discovery",
prompt=STATISTICAL_AGENT_PROMPT,
subagent_type="general-purpose"
)
Task(
description="Semantic Discovery",
prompt=SEMANTIC_AGENT_PROMPT,
subagent_type="general-purpose"
)
Task(
description="Query Discovery",
prompt=QUERY_AGENT_PROMPT,
subagent_type="general-purpose"
)
The agents use these MCP tools for database analysis:
list_schemas - List all databaseslist_tables - List tables in a schemadescribe_table - Get table schemasample_rows - Get sample data from tablecolumn_profile - Get column statisticsrun_sql_readonly - Execute read-only queriescatalog_upsert - Store findings in catalogcatalog_list / catalog_get - Retrieve findings from catalogDiscovery and catalog/LLM tools are target-scoped. Always pass target_id:
discovery.run_static(target_id=..., schema_filter=...)catalog.*(target_id=..., run_id=...)agent.run_start(target_id=..., run_id=...)llm.*(target_id=..., run_id=...)run_id resolution is no longer global. The same schema name can exist on multiple targets, so target_id is required to resolve the correct discovery run.
The system produces:
simple_discovery.py - Simplified demo of multi-agent patternmcp_catalog.db - Catalog database for storing findings