v3/docs/adr/ADR-095-architectural-gaps-from-april-audit.md
- Status: Proposed (tracking only; no decisions yet on individual rows)
- Date: 2026-05-03
- Version: targets v3.7.x and beyond
- Supersedes: nothing
- Related: ADR-093 (May audit remediation), ADR-094 (transformers migration), public audit gist by @roman-rr (2026-04-04)
The April 2026 audit by @roman-rr documented architectural gaps that ADR-093's "honesty patches" did not address. ADR-093 fixed the contract of the affected MCP tools (no more silent lies, no more bare hardcoded labels, schemas that round-trip what callers pass), but the execution layer underneath several of these tools is still missing.
This ADR is the canonical tracking record for those gaps. Each row below is a candidate for its own follow-up ADR with its own decision, scope, and validation plan. We are not deciding how to close any of them here — only naming them precisely so they cannot quietly fall off the backlog.
`agent_spawn` does not fork a subprocess

Current state. `agent_spawn` writes a JSON record into an in-memory `Map`: `{ agentId, status: 'idle', taskCount: 0, lastResult: null }`. No subprocess. No `fork()`. No LLM call. The `status` field never advances on its own. The schema-honesty work in ADR-093 made the lifecycle observable (the audit's `taskCount: 0` forever is now reachable as the genuine state) but did not wire up an executor.
Wire that exists, unused. The AnthropicProvider class in v3/@claude-flow/providers/ makes real fetch calls to api.anthropic.com. The ProviderManager does round-robin and latency-based routing. Neither is imported by the agent spawn / task / swarm code paths.
What a real fix requires.
- […] `task_assign` events and runs them against `ProviderManager`.
- […] `agent_status` and `task_status`.

Why deferred. This is not a 5-minute fix. It is the missing wire between the registry and the LLM layer that the audit correctly identified.
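As a rough illustration of the missing wire, the sketch below connects a registry record of the shape described above to an executor that advances `status` and `taskCount`. `Provider`, `AgentRegistry`, and `runTask` are hypothetical names used only for illustration; the real `ProviderManager` API may differ, and the `'running'` and `'failed'` states are assumptions.

```typescript
// Hypothetical sketch: only the record fields come from this ADR.
type AgentStatus = 'idle' | 'running' | 'failed';

interface AgentRecord {
  agentId: string;
  status: AgentStatus;
  taskCount: number;
  lastResult: string | null;
}

// Stand-in for the provider layer: anything that turns a prompt into text.
interface Provider {
  complete(prompt: string): Promise<string>;
}

class AgentRegistry {
  private agents = new Map<string, AgentRecord>();

  // Mirrors the registry-only write: no execution happens here.
  spawn(agentId: string): AgentRecord {
    const record: AgentRecord = { agentId, status: 'idle', taskCount: 0, lastResult: null };
    this.agents.set(agentId, record);
    return record;
  }

  get(agentId: string): AgentRecord | undefined {
    return this.agents.get(agentId);
  }
}

// Runs one task for an agent and advances the observable lifecycle fields.
async function runTask(
  registry: AgentRegistry,
  provider: Provider,
  agentId: string,
  prompt: string,
): Promise<AgentRecord> {
  const agent = registry.get(agentId);
  if (!agent) throw new Error(`Unknown agent: ${agentId}`);
  agent.status = 'running';
  try {
    agent.lastResult = await provider.complete(prompt);
    agent.taskCount += 1;
    agent.status = 'idle';
  } catch (err) {
    agent.status = 'failed';
    agent.lastResult = err instanceof Error ? err.message : String(err);
  }
  return agent;
}
```

With a wire like this in place, `taskCount` stops being permanently zero, and a failed provider call becomes visible in the record instead of being swallowed.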
Current state. ADR-093 F3 made hive-mind_init accept consensus: 'raft' | 'byzantine' | 'gossip' | 'crdt' | 'quorum', persist consensusStrategy to state, and round-trip it through hive-mind_status. So the parameter is honest now.
The handler underneath is still EventEmitter-based and runs in a single Node process. byzantine-coordinator.ts's verifySignature() returns true unconditionally. RaftConsensus.requestVotes() does this.emit('vote_request') against a local emitter. There are no sockets, no gRPC, no inter-node transport.
What a real fix requires.
- […] `@noble/ed25519` keypairs already in tree.

Why deferred. Distributed consensus is its own ADR: the security and correctness implications cannot be slotted into a /loop iteration.
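For contrast with an unconditional `return true`, here is a minimal sketch of real signature verification. It uses Node's built-in `node:crypto` Ed25519 support rather than `@noble/ed25519`, and `canonicalize`, `signPayload`, and `verifyPayload` are hypothetical helper names, not the repository's API. It assumes payloads contain only JSON-safe values.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

// Serializes a JSON-safe value with object keys sorted at every depth,
// so two hosts produce identical bytes for the same logical message.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalize).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const body = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalize(obj[key])}`)
      .join(',');
    return `{${body}}`;
  }
  return JSON.stringify(value);
}

// Ed25519 takes no separate digest algorithm, hence the null first argument.
function signPayload(payload: unknown, privateKey: KeyObject): Buffer {
  return sign(null, Buffer.from(canonicalize(payload), 'utf8'), privateKey);
}

function verifyPayload(payload: unknown, signature: Buffer, publicKey: KeyObject): boolean {
  return verify(null, Buffer.from(canonicalize(payload), 'utf8'), publicKey, signature);
}

const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const signature = signPayload({ term: 3, candidateId: 'node-a' }, privateKey);

// Key order does not matter; content does.
console.log(verifyPayload({ candidateId: 'node-a', term: 3 }, signature, publicKey)); // true
console.log(verifyPayload({ candidateId: 'node-a', term: 4 }, signature, publicKey)); // false
```

A verifier of this kind fails closed: any change to the payload or the key makes `verifyPayload` return `false`.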
Current state. workflow_create persists a workflow record to .claude-flow/workflows/store.json. workflow_execute returns {error: "Workflow not found"} even when called with a workflow ID that DOES exist in the store. The state machine definition (steps, conditions, deps) is present but no executor walks it.
What a real fix requires.
Why deferred. Depends on G1.
Current state. wasm_agent_prompt(input: "List 3 advantages of backtesting") returns "echo: List 3 advantages of backtesting". There is no WASM runtime, no LLM call, no sandbox. The MCP tool registers the agent definition and prints back what the user sent.
What a real fix requires.
- […] `wasmtime`, `wasmer`, or browser WASM via Node's built-in support).

Why deferred. Depends on G1 plus a WASM runtime decision.
`@xenova/transformers` → `protobufjs` critical RCE chain

Current state. `@xenova/transformers` is the deprecated predecessor of `@huggingface/transformers`. It pins `onnxruntime-web` versions that depend on `protobufjs` <7.5.5, which has a critical RCE CVE (GHSA-h755-8qp9-cq85). npm `overrides` cannot resolve this because the version range required by xenova's manifests forbids the safer `protobufjs`.
Plan documented. ADR-094 — try-prefer-fallback loader (@huggingface/transformers → @xenova/transformers).
Status. Implementation landed on branch in iteration #14; verification queued for next /loop publish.
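The try-prefer-fallback idea can be sketched generically as below. This is an illustration only, not the loader from ADR-094: `loadFirstAvailable`, `Importer`, and `LoadedModule` are hypothetical names, and the importer is injected so the sketch does not depend on either package being installed.

```typescript
// Hypothetical helper: tries each module specifier in order
// and returns the first one that loads.
type Importer = (specifier: string) => Promise<unknown>;

interface LoadedModule {
  specifier: string;
  module: unknown;
}

async function loadFirstAvailable(
  candidates: string[],
  importer: Importer,
): Promise<LoadedModule> {
  const failures: string[] = [];
  for (const specifier of candidates) {
    try {
      return { specifier, module: await importer(specifier) };
    } catch (err) {
      // Remember why this candidate failed, then try the next one.
      failures.push(`${specifier}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`No candidate could be loaded:\n${failures.join('\n')}`);
}
```

A caller would pass `(s) => import(s)` as the importer with the candidates `['@huggingface/transformers', '@xenova/transformers']`. The preferred package wins when it is installed, and the deprecated one is used only as a fallback.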
Current state. The auto-memory-hook.mjs reads MEMORY.md files from ~/.claude/projects/*/memory/, parses each section as a separate entry, and stores them in auto-memory-store.json. Then it builds a similarity graph using character-trigram Jaccard, runs PageRank for 30 iterations, and writes graph-state.json and ranked-context.json.
The audit measured: 5,706 entries, ~20 unique (5,686 are the same MEMORY.md sections duplicated across project directories). graph-state.json is 100 MB. ranked-context.json is 8.7 MB. The PageRank result is uniform (~0.02 across nodes) — meaningless because the graph is near-complete between near-identical duplicates. Trigram Jaccard isn't semantic — it scores character overlap, not meaning. The same entry is injected into Claude's context 5 times per message.
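To make the "character overlap, not meaning" point concrete, here is a minimal sketch of character-trigram Jaccard similarity. The function names and the whitespace and case normalization are assumptions; the hook's actual implementation may differ.

```typescript
// Splits text into the set of its overlapping 3-character windows.
function trigrams(text: string): Set<string> {
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim();
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    grams.add(normalized.slice(i, i + 3));
  }
  return grams;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let intersection = 0;
  a.forEach((gram) => {
    if (b.has(gram)) intersection++;
  });
  return intersection / (a.size + b.size - intersection);
}

const entry = 'Always run the test suite before publishing';

// A duplicated section is a perfect match, so duplicates link to each other.
console.log(jaccard(trigrams(entry), trigrams(entry))); // 1

// Synonyms share no trigrams, so related meanings score zero.
console.log(jaccard(trigrams('car'), trigrams('automobile'))); // 0
```

Identical duplicates always score 1 against each other, which is how a store dominated by copies ends up as a near-complete graph.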
What a real fix requires.
Why deferred. This is its own cleanup track; touches the auto-memory hook, the trigram graph builder, and the runtime injection path. Worth its own ADR.
Current state. ADR-093 F9 probed and wired `semanticRouter` (when present in agentdb), and made the error for `bridgeSemanticRoute` more actionable. The other 6 controllers ship disabled because each constructor needs something the registry doesn't currently expose:
| Controller | Why disabled |
|---|---|
| `mutationGuard` | Needs write-policy config; turning on without config could break writes |
| `attestationLog` | Needs a sqlite db handle the registry doesn't expose; constructor throws otherwise |
| `gnnService` | Needs heavy deps (CUDA / WASM); not always available |
| `guardedVectorBackend` | Needs key material for at-rest encryption |
| `rvfOptimizer` | Needs RVF format storage configured |
| `graphAdapter` | Needs a graph DB connection |
What a real fix requires.
- […] `controllers-config.json` schema or env-var convention so users can opt in deliberately.

Why deferred. Each controller activation is a security decision: turning them on in bulk would silently widen the attack surface.
Track each gap as a candidate ADR rather than letting them dilute through the issue tracker:
Numbers are reservations only; no decisions yet on any of them. The point of this ADR is to ensure the gaps are visible from the decisions log, not buried in PR comments.
This ADR closes when each row above has either landed in its own ADR (proposed/accepted) or been explicitly de-scoped with a recorded reason. The April audit gist (link) is the source of truth for what the audit named — re-read it before claiming any G# is done.
Independent re-audit by AlphaSignal AI (May 7, targeting v3.6.30) re-surfaced these gaps publicly. Verification pass on current main + the work in v3.7.0-alpha series shows the following status changes:
`agent_execute` wire

The execution wire shipped as a sibling MCP tool, not by modifying `agent_spawn` (which intentionally remains a registry-only write for cost attribution + swarm coordination). File: `v3/@claude-flow/cli/src/mcp-tools/agent-execute-core.ts:117`, which calls `fetch('https://api.anthropic.com/v1/messages', ...)`. Workflow steps go through this wire (see G3).
v3/@claude-flow/cli/src/mcp-tools/workflow-tools.ts:308+ contains the step-walking executor with variable interpolation, step-output binding, pause/cancel, and persistence-after-each-step. The 'Workflow not found' error only fires when store.workflows[workflowId] is undefined (correct missing-ID handling, not a stub).
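A step-walking executor with variable interpolation and step-output binding could look roughly like the sketch below. It is not the code in `workflow-tools.ts`: the `{{name}}` placeholder syntax, the `Step` shape, and the function names are assumptions, and pause/cancel and persistence are left out.

```typescript
// Hypothetical step shape: each step has an id and a prompt template.
interface Step {
  id: string;
  template: string;
}

type StepRunner = (input: string) => Promise<string>;

// Replaces {{name}} placeholders with values from scope;
// unknown names are left untouched so the gap stays visible.
function interpolate(template: string, scope: Record<string, string>): string {
  return template.replace(/\{\{\s*([\w.-]+)\s*\}\}/g, (match: string, name: string) =>
    Object.prototype.hasOwnProperty.call(scope, name) ? scope[name] : match,
  );
}

// Walks the steps in order, binding each step's output under its id
// so later steps can reference it.
async function runWorkflow(
  steps: Step[],
  run: StepRunner,
  variables: Record<string, string> = {},
): Promise<Record<string, string>> {
  const scope: Record<string, string> = { ...variables };
  for (const step of steps) {
    const input = interpolate(step.template, scope);
    scope[step.id] = await run(input);
  }
  return scope;
}
```

Binding outputs by step id keeps the data flow explicit: a later step such as `Summarize: {{draft}}` can only see what an earlier step named `draft` produced.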
v3/@claude-flow/cli/src/ruvector/agent-wasm.ts:154 (promptWasmAgent) detects the bundled WASM agent's echo: <input> stub and routes through callAnthropicMessages when ANTHROPIC_API_KEY is set. When unset, surfaces the stub honestly with a [NOTE: …set ANTHROPIC_API_KEY to enable real responses] hint.
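The detect-and-route behavior described above might be sketched as follows. `isEchoStub`, `promptWithFallback`, and the injected `Completer` are hypothetical names standing in for the real `promptWasmAgent` and `callAnthropicMessages`, and the note wording here is a placeholder rather than the tool's actual text.

```typescript
// Stand-in for the real LLM call; injected so the sketch stays self-contained.
type Completer = (prompt: string) => Promise<string>;

// True when the WASM agent merely echoed the input back.
function isEchoStub(prompt: string, reply: string): boolean {
  return reply.trim() === `echo: ${prompt}`.trim();
}

async function promptWithFallback(
  prompt: string,
  wasmReply: string,
  apiKey: string | undefined,
  complete: Completer,
): Promise<string> {
  // A genuine reply passes through unchanged.
  if (!isEchoStub(prompt, wasmReply)) return wasmReply;
  // Stub detected and a key is available: route to the real provider.
  if (apiKey) return complete(prompt);
  // No key: surface the stub, but say so (placeholder wording).
  return `${wasmReply}\n[NOTE: stub response; set ANTHROPIC_API_KEY to enable real responses]`;
}
```

The third branch is the important one: without a key the caller still gets the echo, but labeled as a stub rather than presented as an answer.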
The 5,706-entries-per-message claim does not match current behavior:
- `~/.claude/settings.json` UserPromptSubmit hook is `[ -n "$PROMPT" ] && npx @claude-flow/cli@latest hooks route --task "$PROMPT" || true`: single routing call, no bulk injection.
- `plugins/ruflo-core/hooks/hooks.json` defines only `PreToolUse`, `PostToolUse`, `PreCompact`, `Stop`: no `UserPromptSubmit` / `SessionStart` context injection.
- […] `trigram` / `jaccard` symbols in plugin/helper hooks (only one reference in `plugins/ruflo-rag-memory/README.md` documenting MMR diversity reranking, a different code path).
- […] `[AutoMemory] ✓ Imported 0 entries (0 skipped) Backend entries: 8`.

The 100 MB `graph-state.json` artifact may still exist on machines with long-lived `~/.claude/projects/` histories, but the runtime injection path no longer reads it. Recommend a separate one-shot cleanup script for users with bloated state files from earlier versions.
- `find . -name 'simulate_benchmarks*'` returns zero results in current main.
- `git grep "84.8.*SWE"` in tracked `.md` files returns zero hits.

Both specific artifacts called out in the AlphaSignal article have been cleaned up. Downstream marketing materials (gists, social posts) are outside this ADR's scope.
The first piece landed: v3/@claude-flow/swarm/src/consensus/transport.ts introduces a ConsensusTransport interface that separates the inter-node-message dimension from the observability-event dimension. The consensus protocols (raft/byzantine/gossip) historically used a local EventEmitter for both — the inter-node side never crossed a process boundary (a node "sent" a message by emitting it locally and synthesizing the peer's reply inline).
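The separation can be pictured with a small in-process sketch. The method names (`send`, `broadcast`, `onMessage`, `peers`, `close`) follow the interface as listed in this update; the parameter and return types, the `Message` shape, and the shared `Map` registry are assumptions, and signing and replay defense are left out.

```typescript
interface Message {
  from: string;
  to: string;
  type: string;
  payload: unknown;
}

type Handler = (msg: Message) => Promise<unknown>;

// Inter-node messaging only; observability events would stay on an EventEmitter.
interface ConsensusTransport {
  send(to: string, type: string, payload: unknown): Promise<unknown>;
  broadcast(type: string, payload: unknown): Promise<unknown[]>;
  onMessage(handler: Handler): void;
  peers(): string[];
  close(): void;
}

// In-process transport: nodes find each other through a shared registry.
class LocalTransport implements ConsensusTransport {
  private handler: Handler | null = null;

  constructor(
    private readonly nodeId: string,
    private readonly network: Map<string, LocalTransport>,
  ) {
    network.set(nodeId, this);
  }

  // Request-response: the peer's handler produces the reply.
  async send(to: string, type: string, payload: unknown): Promise<unknown> {
    const peer = this.network.get(to);
    if (!peer || !peer.handler) throw new Error(`No reachable peer: ${to}`);
    return peer.handler({ from: this.nodeId, to, type, payload });
  }

  broadcast(type: string, payload: unknown): Promise<unknown[]> {
    return Promise.all(this.peers().map((peer) => this.send(peer, type, payload)));
  }

  onMessage(handler: Handler): void {
    this.handler = handler;
  }

  peers(): string[] {
    return Array.from(this.network.keys()).filter((id) => id !== this.nodeId);
  }

  close(): void {
    this.network.delete(this.nodeId);
    this.handler = null;
  }
}
```

Because the reply comes from the peer's own handler instead of being synthesized by the sender, the same protocol code can later run over a transport that does cross a process boundary.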
Landed (PR #1905, branch feat/adr-095-g2-hive-mind-ws-transport):
- `ConsensusTransport` interface: `send` (request-response), `broadcast`, `onMessage`, `peers`, `close`. Separates inter-node messaging from observability events.
- `LocalTransport`: in-process registry; the default, matches current single-process behavior. Optional Ed25519 signing + per-sender monotonic-seq replay defense.
- […] `crypto` (no new deps): `generateNodeKeyPair`, `signMessage`, `verifyMessage`, `canonicalizeForSigning` (deep-sorted-key JSON for cross-host determinism), `messageDigest`.
- `FederationTransport`: `ConsensusTransport` over the federation plugin's ADR-104 WS wire (`agentic-flow/transport/loader`). Structural dep (swarm stays zero-dep; the wiring layer supplies the transport instance). Request-response layered on fire-and-forget WS via correlation ids; ADR-104 stream-mux; Ed25519 + replay defense; fail-closed.
- […] `transport`: `RaftConsensus` (real RequestVote/AppendEntries RPCs with proper receiver rules: term comparison, vote-once-per-term, log-up-to-date check, `commitIndex` advance), `ByzantineConsensus` (PBFT messages over the transport, sha256 digests replacing the 32-bit toy hash, inbound routing), `GossipConsensus` (gossip messages to neighbors over the transport, inbound merge with dedup-by-id). Legacy no-transport path preserved in all three.
- […] `f` now derived from the actual cluster size (`floor((n-1)/3)`, clamped to ≥1) rather than hardcoded to 1; `config.maxFaultyNodes` (when set) acts as an upper cap. Quorum `2f+1` checks now use the cluster-derived `f`.
- `plugins/ruflo-core/scripts/test-consensus-transport.mjs` (in the `mcp-roundtrip-smoke` job): asserts exports present (incl. `FederationTransport`), `LocalTransport` round-trips, Ed25519 verify is real (no `return true` stub regression).

Remaining for G2:
- […] `LocalTransport` cluster with simulated faulty/silent nodes; assert correct commits below threshold, no incorrect commits above.
- […] `FederationTransport` is wired into a real hive-mind + federation setup.

Tracked in #1872 and PR #1905.
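The cluster-derived fault tolerance from the landed items reduces to a few lines of arithmetic. The sketch below assumes the cap is applied after the clamp; `faultTolerance` and `quorumSize` are hypothetical names, not the functions in the swarm package.

```typescript
// f = floor((n - 1) / 3), clamped to at least 1,
// then capped by maxFaultyNodes when one is configured.
function faultTolerance(clusterSize: number, maxFaultyNodes?: number): number {
  const derived = Math.max(1, Math.floor((clusterSize - 1) / 3));
  return maxFaultyNodes === undefined ? derived : Math.min(derived, maxFaultyNodes);
}

// PBFT-style quorum: 2f + 1 matching messages.
function quorumSize(f: number): number {
  return 2 * f + 1;
}

console.log(faultTolerance(4), quorumSize(faultTolerance(4)));   // 1 3
console.log(faultTolerance(7), quorumSize(faultTolerance(7)));   // 2 5
console.log(faultTolerance(10), quorumSize(faultTolerance(10))); // 3 7
console.log(faultTolerance(10, 2));                              // 2
```

One consequence of the clamp: a 3-node cluster gets `f = 1` and a quorum of 3, so every node must agree. That is stricter than the unclamped formula would give.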
This update does NOT supersede the original ADR-095 problem statement; it records that 4 of the 7 originally-named gaps have been quietly fixed by the v3.7.0-alpha work that landed without ceremony.