v3/docs/adr/ADR-099-dossier-investigator-recursive-parallel-research.md
ruflo-goals
Status: Proposed
Date: 2026-05-03
Version: target v3.6.x (additive — plugins/ruflo-goals minor bump)
Supersedes: nothing
Related: ADR-098 (plugin capability sync), plugins/ruflo-goals/agents/deep-researcher.md, plugins/ruflo-knowledge-graph, plugins/ruflo-rag-memory
The ruflo-goals plugin currently ships three agents:
| Agent | Pattern | Output |
|---|---|---|
| goal-planner | GOAP / A* over state space | Action plan |
| deep-researcher | Linear multi-source synthesis with evidence grading | Graded findings document |
| horizon-tracker | Long-horizon objective tracking with drift detection | Milestone state |
Inspired by maigret (3,000+ source parallel username enumeration with recursive expansion and structured dossier reporting), we identified a structurally distinct research pattern that none of the three existing agents implements as a first-class loop:
deep-researcher does evidence-graded synthesis but expects a human-curated source list and runs essentially linearly. It has no recursive seeding loop and no parallel fan-out primitive.
Ruflo already ships every primitive needed to assemble a maigret-style investigator without adding new external dependencies:
| Capability | Existing tool |
|---|---|
| Hybrid sparse+dense semantic search | mcp__claude-flow__memory_search_unified, ruflo-rag-memory:memory-search |
| Vector search (HNSW, RaBitQ) | mcp__claude-flow__embeddings_search, embeddings_rabitq_search |
| Pattern recall | mcp__claude-flow__agentdb_pattern-search, agentdb_hierarchical-recall |
| Knowledge-graph traversal + extraction | ruflo-knowledge-graph:kg-traverse, kg-extract |
| Web search & fetch | WebSearch, WebFetch |
| Codebase queries | Grep, Glob, Read |
| ADR index lookup | ruflo-adr:adr-index |
| Git intelligence | ruflo-jujutsu:diff-analyze |
| Parallel agent fan-out | ruflo-swarm:swarm-init (mesh topology) |
| Trajectory recording | mcp__claude-flow__hooks_intelligence_trajectory-* |
Add a new agent dossier-investigator and a companion skill dossier-collect to plugins/ruflo-goals.
dossier-investigator
Model: sonnet (matches sibling agents; structured orchestration, not creative writing)
Input: { seed: string, sources?: string[], maxDepth?: number=2, maxBreadth?: number=8, budget?: { tokens?, usd? } }
Output: dossier namespace memory entry + markdown report + optional kg-extract ingest.

dossier-collect
User-facing slash skill that drives the agent. Steps:
- [...] --sources.
- [...] ruflo-knowledge-graph:kg-extract (or a lightweight regex pass for obvious cases) to surface new entities.
- [...] maxDepth or budget is hit. Apply de-duplication via embedding similarity (threshold 0.92).
- [...] kg-extract ingest.
- [...] dossier namespace and record trajectory.

| Option | Why rejected |
|---|---|
| Extend deep-researcher | Would couple two structurally different loops (linear-graded vs parallel-recursive) into one prompt; would push the prompt past the 80-line guideline flagged in ADR-098. |
| Bundle Python maigret as an MCP wrapper | Adds a Python runtime dependency, network egress to 3,000+ sites, and a privacy/abuse posture that's out of scope for a developer-research tool. We want maigret's pattern, not its target list. |
| Build into ruflo-knowledge-graph | KG plugin is about graph operations on already-extracted data; investigator is about acquisition. Keeping it in ruflo-goals puts it next to its peers (deep-researcher, goal-planner). |
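
The collection loop described in the steps above (fan out across sources, surface new entities, recurse until maxDepth is hit, de-duplicate by embedding similarity at 0.92) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the agent's implementation: `QueryFn`, `EmbedFn`, and `collect` are hypothetical names, sources are modeled as plain async functions rather than real tool calls, and reading maxBreadth as a cap on new entities per round is an assumption. Budget handling is left out here.

```typescript
// Input shape from the agent spec; defaults (maxDepth 2, maxBreadth 8) follow the spec.
interface DossierInput {
  seed: string;
  sources?: string[];
  maxDepth?: number;
  maxBreadth?: number;
}

// Hypothetical source adapter: returns entities related to the given one.
type QueryFn = (entity: string) => Promise<string[]>;
// Hypothetical embedder: maps an entity name to a vector.
type EmbedFn = (entity: string) => number[];

// Cosine similarity of two equal-length vectors; 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Returns each discovered entity mapped to the depth at which it was first found.
async function collect(
  input: DossierInput,
  queries: QueryFn[],
  embed: EmbedFn,
  threshold = 0.92,
): Promise<Map<string, number>> {
  const maxDepth = input.maxDepth ?? 2;
  const maxBreadth = input.maxBreadth ?? 8;
  const found = new Map<string, number>([[input.seed, 0]]);
  const kept: number[][] = [embed(input.seed)];
  let frontier: string[] = [input.seed];

  for (let depth = 1; depth <= maxDepth && frontier.length > 0; depth++) {
    // Fan out: every frontier entity against every source, in parallel.
    const settled = await Promise.allSettled(
      frontier.flatMap((entity) => queries.map((q) => q(entity))),
    );
    const next: string[] = [];
    for (const r of settled) {
      if (r.status !== "fulfilled") continue; // one failed source does not abort the round
      for (const candidate of r.value) {
        if (next.length >= maxBreadth) break; // assumed meaning of maxBreadth
        const v = embed(candidate);
        // Skip near-duplicates of anything already kept.
        if (kept.some((k) => cosine(k, v) >= threshold)) continue;
        kept.push(v);
        found.set(candidate, depth);
        next.push(candidate);
      }
    }
    frontier = next; // newly found entities seed the next round
  }
  return found;
}
```

Using `Promise.allSettled` rather than `Promise.all` keeps a single failing source from discarding the rest of a round, which seems to fit a many-source fan-out; whether the real agent tolerates partial failures this way is not stated in this ADR.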
deep-researcher and dossier-investigator will share the "query memory + KG + web" surface area. We accept this redundancy because:
- [...] use deep-researcher when you have a question; use dossier-investigator when you have a seed and want to expand outward.

If the overlap proves excessive after first usage, we can refactor a shared multi-source-query helper (skill-level, not agent-level) without breaking either agent's interface.
- [...] investigate this symbol / module / dependency / ADR) that today require manual fan-out.
- [...] budget.tokens and budget.usd and abort cleanly when hit, not silently truncate.
- [...] --exact mode for entity-identity-sensitive runs.
- [...] /ruflo-goals:dossier-collect is the only new entry point).
- [...] dossier namespace under AgentDB memory.

| Step | File | Owner |
|---|---|---|
| 1. Agent prompt | plugins/ruflo-goals/agents/dossier-investigator.md | coder |
| 2. Skill markdown | plugins/ruflo-goals/skills/dossier-collect/SKILL.md | coder |
| 3. Slash command | plugins/ruflo-goals/commands/goals.md (add dossier subcommand) | coder |
| 4. Plugin manifest bump | plugins/ruflo-goals/.claude-plugin/plugin.json (0.1.0 → 0.2.0) | coder |
| 5. README update | plugins/ruflo-goals/README.md | coder |
| 6. Smoke test | tests/plugins/ruflo-goals/dossier.spec.ts | tester |
| 7. Ship behind a flag | dossierInvestigator.enabled defaulting true for first release | coder |
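
The requirement that a run honor budget.tokens and budget.usd and abort cleanly rather than silently truncate might look like the sketch below. It is illustrative only: `runWithBudget`, `StepCost`, and `RunOutcome` are hypothetical names, steps are synchronous for brevity, and the choice to keep the result of the step that crossed the limit (since it was already paid for) is an assumption rather than something this ADR specifies.

```typescript
// Budget shape from the agent input; both limits are optional.
interface Budget {
  tokens?: number;
  usd?: number;
}

// Hypothetical per-step report: the step's result plus what it cost.
interface StepCost<T> {
  result: T;
  tokens: number;
  usd: number;
}

// Outcome carries an explicit abort flag so callers can tell a capped run from a full one.
interface RunOutcome<T> {
  results: T[];
  aborted: boolean;
  exceeded?: "tokens" | "usd";
}

// Returns which limit (if any) the running totals have crossed.
function exceededLimit(
  budget: Budget,
  tokens: number,
  usd: number,
): "tokens" | "usd" | undefined {
  if (budget.tokens !== undefined && tokens > budget.tokens) return "tokens";
  if (budget.usd !== undefined && usd > budget.usd) return "usd";
  return undefined;
}

function runWithBudget<T>(
  steps: Array<() => StepCost<T>>,
  budget: Budget = {},
): RunOutcome<T> {
  const results: T[] = [];
  let tokens = 0;
  let usd = 0;
  for (const step of steps) {
    const cost = step();
    tokens += cost.tokens;
    usd += cost.usd;
    results.push(cost.result); // work already paid for is kept
    const hit = exceededLimit(budget, tokens, usd);
    if (hit !== undefined) {
      // Stop here and say so, instead of returning a shorter list with no explanation.
      return { results, aborted: true, exceeded: hit };
    }
  }
  return { results, aborted: false };
}
```

A caller could then surface `aborted` and `exceeded` in the markdown report, so a reader knows the dossier is partial and which limit ended the run.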
Acceptance criteria:
- [...] --max-depth and --budget.
- [...] dossier namespace with valid JSON schema.
- [...] ADR-097, expected entities include federation, circuit-breaker, budget.

plugins/ruflo-goals/agents/deep-researcher.md (sibling agent)