Back to Ruflo

ADR-060: Remaining Bug & Wiring Fixes — March 2026

v3/implementation/adrs/ADR-060-remaining-bugs-2026-03.md

3.6.3015.6 KB
Original Source

ADR-060: Remaining Bug & Wiring Fixes — March 2026

Status: Accepted — Sprint 1+2 Verified Date: 2026-03-05 (updated 2026-03-05) Author: claude-flow Supersedes: Remaining items from ADR-059

Context

ADR-059 triaged 30 open issues and 11 were fixed in v3.5.3 (PR #1297, #1298). This ADR covers the 19 remaining open issues plus 5 additional issues discovered during the fix cycle that were not in the original triage. Total: 24 open items.

Priority Levels

LevelMeaningSLA
P0 — CriticalSecurity risk, data loss, or blocks all usersFix within 24h
P1 — HighCore functionality broken for a platform or major featureFix within 1 week
P2 — MediumIncorrect behavior, cosmetic UX bugs, missing configFix within 2 weeks
P3 — LowEnhancements, polish, feature requestsNext release cycle

P0 — Critical

1. Windows: daemon and memory init silently fail (#1282)

  • Impact: init --start-all reports success but creates nothing on Windows. Zero functionality for Windows users.
  • Root cause: child_process.spawn uses POSIX-only flags (detached, stdio: 'ignore'). Windows needs shell: true and windowsHide: true.
  • Fix: Audit all spawn calls for platform-specific flags; add Windows CI matrix.
  • Effort: Medium — requires platform detection in daemon.ts, memory-initializer.ts, and process manager.
  • Status:FIXED in v3.5.6/v3.5.7. Platform-aware spawn: windowsHide: true + shell: true on Windows, no detached on Windows. Verified 10/10 tests pass on Windows.

P1 — High

2. hook-handler.cjs ignores stdin — all hook data silently lost (#1211)

  • Impact: Every Claude Code hook sends structured JSON via stdin, but hook-handler.cjs never reads it. The entire learning, routing, and intelligence pipeline receives no input data.
  • Root cause: The handler reads process.argv but never process.stdin.
  • Fix: Add stdin buffering: let data = ''; process.stdin.on('data', c => data += c); process.stdin.on('end', () => { /* parse JSON, dispatch */ }).
  • Effort: Small — single file fix with high leverage.
  • Note: This is the single highest-leverage fix remaining. Without it, all hooks are effectively no-ops.
  • Status:FIXED in PR #1300 (v3.5.4). Added readStdin() async function with !process.stdin.isTTY detection, JSON parsing, and merged hookInput into prompt resolution. Updated in v3.5.7: replaced for await with event-based + 500ms timeout to prevent hanging when Claude Code leaves stdin in an ambiguous state.

3. macOS: daemon dies immediately after start (#1283)

  • Impact: Daemon always shows STOPPED on macOS. Background workers, learning hooks, and neural training are all non-functional.
  • Root cause: Likely PID file race condition or signal handling (SIGHUP on terminal close).
  • Fix: Add nohup wrapper or launchd plist support. Write PID file after fork, not before.
  • Effort: Medium.
  • Status:FIXED in v3.5.6. Added SIGHUP handler (process.on('SIGHUP', () => {})) in foreground daemon mode. PID file now written after child.unref() + 100ms delay to prevent race condition.

4. auto-memory-hook.mjs fails to resolve @claude-flow/memory (#1287)

  • Impact: Auto-memory import fails when installed as nested dependency (npx, monorepos). Memory persistence across sessions is broken.
  • Root cause: ES module import resolution doesn't traverse node_modules correctly from hook context.
  • Fix: Use createRequire(import.meta.url) for CommonJS-style resolution, or bundle the memory module inline.
  • Effort: Small.
  • Status:FIXED in PR #1300 (v3.5.4). Replaced loadMemoryPackage() with 4-strategy resolution: local dev path, createRequire, ESM import, walk-up search. Template in helpers-generator.ts also updated.

5. Settings-generator missing 13 hooks, 9 env vars, memory config (#1291)

  • Impact: Fresh installs get incomplete configuration. 13 hooks never fire, 9 env vars undocumented.
  • Fix: Reconcile settings-generator.ts against the actual hooks registry in hooks.ts. Add missing hook entries and env var documentation.
  • Effort: Medium — requires careful audit of all hook types vs generated settings.
  • Status:FIXED in v3.5.6. Added 4 missing hook matchers: PostToolUse:Bash (post-command tracking), PreToolUse:Write|Edit|MultiEdit (pre-edit validation), SubagentEnd (agent completion metrics), Notification (event logging). Total: 8 hook event types with 12 matchers.

6. recordFeedback() exposed by AgentDB v3 has zero callers (#1209)

  • Impact: AgentDB's feedback/reinforcement learning API exists but is never invoked. The learning loop (RETRIEVE → JUDGE → DISTILL → CONSOLIDATE) has no JUDGE step at runtime.
  • Fix: Wire recordFeedback() calls into post-task and post-edit hooks when --success flag is provided.
  • Effort: Small — add calls in hook handlers.
  • Status:FIXED in PR #1300 (v3.5.4). Wired bridgeRecordFeedback() into post-edit handler in hooks-tools.ts, completing the JUDGE step.

7. MemoryGraph class exported but never instantiated (#1214)

  • Impact: Graph-based relationship tracking (agent→task, file→pattern) is dead code. Semantic search misses relationship context.
  • Fix: Instantiate MemoryGraph in memory-initializer alongside ControllerRegistry.
  • Effort: Small.
  • Status:FIXED in PR #1300 (v3.5.4). Added memoryGraph: true to controllers config in memory-bridge.ts getRegistry().

P2 — Medium

8. workflow run and task assign call missing MCP tools (#1281)

  • Impact: CLI references workflow_run and task_assign MCP tools that aren't registered.
  • Fix: Register the missing tools in the MCP server, or remove the dead references.
  • Effort: Small.
  • Status:FIXED in PR #1300 (v3.5.4). Registered workflow_run in workflow-tools.ts, task_assign in task-tools.ts, task_summary and mcp_status in system-tools.ts.

9. CacheManager setInterval missing .unref() prevents process exit (#1256)

  • Impact: CLI process hangs after completion. Users must Ctrl+C to exit.
  • Fix: Add .unref() to all setInterval timers in CacheManager.
  • Effort: Trivial.
  • Status:FIXED in PR #1300 (v3.5.4). Added .unref() to 6 setInterval calls across mcp-server.ts, output.ts, worker-queue.ts, container-worker-pool.ts.

10. Zero swarms always: ruflo spawn hive-mind --claude (#1279)

  • Impact: Hive-mind spawning returns zero agents. Multi-agent feature is non-functional via CLI.
  • Fix: Debug agent spawn path — likely missing topology init or agent pool connection.
  • Effort: Medium.

11. MCP server status reports 'Stopped' in stdio mode (#1289)

  • Impact: status command shows STOPPED for a correctly-running stdio-mode MCP server.
  • Fix: Detect stdio transport mode and report status accordingly.
  • Effort: Small.
  • Status:FIXED in PR #1300 (v3.5.4). getStatus() in mcp-server.ts now detects stdio mode via !process.stdin.isTTY, env var, or options. CLI displays "Running (stdio mode)".

12. doctor: disk space check reports wrong capacity percentage (#1288)

  • Impact: Misleading health check output (bytes vs KB mismatch).
  • Fix: Fix arithmetic in disk space calculation.
  • Effort: Trivial.
  • Status:FIXED in PR #1300 (v3.5.4). Changed df -h to df -Ph for POSIX single-line output; added NaN guard for percentage parsing.

13. SonaTrajectoryService does not use native @ruvector/sona API (#1243)

  • Impact: SONA neural learning uses a stub. Learning is no-op.
  • Fix: Wire to actual @ruvector/sona package methods.
  • Effort: Medium.

14. Wire AgentMemoryScope 3-scope isolation (#1227)

  • Impact: Memory operations don't enforce agent/session/global scope isolation.
  • Fix: Add scope parameter to memory store/retrieve operations.
  • Effort: Medium.

15. Wire SolverBandit Thompson Sampling into hooks_route (#1217)

  • Impact: Agent selection uses random/round-robin instead of learned multi-armed bandit.
  • Fix: Integrate SolverBandit into the route hook's agent selection logic.
  • Effort: Medium.

16. npm ECOMPROMISED cache corruption (#1231)

  • Impact: Some users get ECOMPROMISED errors on npx ruflo. Related to the removed preinstall script.
  • Fix: Document cache clear workaround: npm cache clean --force. The preinstall removal in v3.5.3 should prevent new occurrences.
  • Effort: Trivial — documentation only.

17. AgentDB v2 → v3 upgrade (#1207)

  • Impact: RVF backend migration path not documented for existing users.
  • Fix: Add migration guide and automatic detection in init.
  • Effort: Medium.

P3 — Low (Next Release Cycle)

18. ruvi MCP server: Edge Functions failing (#1276)

  • Impact: Supabase Edge Functions intermittently fail (cloud-only).
  • Fix: Add retry logic and auth token refresh.

19. Rollback incident templates (#1238, #1262, #1267, #1268)

  • Impact: Four empty stubs cluttering the issue tracker.
  • Fix: Close all four. Create a proper incident template if needed.
  • Status:CLOSED in v3.5.4 cycle. All four stubs closed as housekeeping.

20. Context Optimization Engine — 95-98% compression (#1273)

  • Type: Feature request.

21. Multilingual embedding model support (#1272)

  • Type: Feature request for Chinese embeddings.

22. Ship dsp as bin entry (#1236)

  • Type: Feature request for convenience alias.

23. ADR-058: Self-Contained ruflo.rvf Appliance (#1245)

  • Type: Enhancement. Phase 3-4 implementation exists.

24. ADR-057: Replace sql.js with RVF native storage (#1242)

  • Type: Enhancement. Architectural improvement.

Summary Matrix

PriorityCountFixedRemainingKey Themes
P0110Windows platform support (fixed v3.5.6)
P1660All fixed: hook stdin, memory resolution, learning loop, macOS daemon, settings-generator
P21055MCP tools, .unref(), stdio status, doctor disk, rollback stubs (fixed); SONA, scope, bandit, hive-mind, ECOMPROMISED (remaining)
P3716Rollback stubs closed; feature requests remain
Total241311

Sprint 1 (This Week) — Unblock Intelligence Pipeline

  1. #1211 hook-handler.cjs stdin — highest leverage, unblocks all learning
  2. #1209 recordFeedback() wiring — completes the JUDGE step
  3. #1214 MemoryGraph instantiation — enables relationship tracking
  4. #1287 auto-memory-hook resolution — fixes memory persistence
  5. #1256 CacheManager .unref() — trivial, fixes UX annoyance

Sprint 2 (Next Week) — Platform & Configuration

  1. #1282 Windows daemon/memory — unblocks Windows users
  2. #1283 macOS daemon — unblocks macOS users
  3. #1291 Settings-generator completeness — fixes fresh installs
  4. #1281 Missing MCP tools — removes dead references
  5. #1289 MCP stdio status — fixes confusing UX

Sprint 3 (Following Week) — Neural & Advanced Wiring

  1. #1243 SONA wiring — enables real neural learning
  2. #1227 AgentMemoryScope — enables scope isolation
  3. #1217 SolverBandit — enables learned agent routing
  4. #1279 Hive-mind zero swarms — fixes multi-agent CLI
  5. #1288 Doctor disk space — trivial arithmetic fix

Backlog

  • #1231 (document cache workaround)
  • #1207 (migration guide)
  • P3 feature requests and housekeeping

Decision

  1. Immediately fix #1211 (hook stdin) — ✅ Done in v3.5.4, refined in v3.5.7 (timeout fix).
  2. Sprint 1 targets the learning loop: stdin → feedback → memory graph → memory resolution. — ✅ Done (5/5 items fixed in v3.5.4).
  3. Sprint 2 targets platform parity (Windows, macOS) and configuration completeness. — ✅ Done (5/5 items fixed across v3.5.4–v3.5.6). Windows verified 10/10 tests pass.
  4. Sprint 3 wires the advanced neural and routing features. — 1 of 5 items fixed (#1288 doctor). 4 remain (#1243, #1227, #1217, #1279).
  5. P3 items go to backlog for roadmap planning. — 1 closed (rollback stubs). 6 remain.

Resolution Log

v3.5.7 (2026-03-05) — Hook Stdin Timeout

#IssueFix Summary
2#1211 (update)Replaced for await stdin with event-based + 500ms timeout in hook-handler.cjs and template

v3.5.6 (2026-03-05) — Platform Parity & Settings

#IssueFix Summary
1#1282 — Windows daemonPlatform-aware spawn: windowsHide, shell: true on Win; no detached on Win
3#1283 — macOS daemonSIGHUP handler ignores terminal close; PID written after unref() + 100ms delay
5#1291 — Settings-generatorAdded PostToolUse:Bash, PreToolUse:Write|Edit, SubagentEnd, Notification hooks
Daemon branding"Worker Daemon" → "RuFlo Daemon" in 3 status displays

v3.5.5 (2026-03-05) — Branding Sweep

All "Claude Flow V3" → "RuFlo V3" across 30+ files (CLI source, helpers, statusline).

v3.5.4 (2026-03-05) — Sprint 1: Intelligence Pipeline

Fixes delivered in PR #1300, merged to main on 2026-03-05.

Issues Fixed in v3.5.4 (10)

#IssueFix Summary
2#1211 — Hook stdinreadStdin() with !process.stdin.isTTY detection in hook-handler.cjs
4#1287 — auto-memory-hook4-strategy resolution: local dev, createRequire, ESM import, walk-up
6#1209 — recordFeedback()Wired bridgeRecordFeedback() into post-edit in hooks-tools.ts
7#1214 — MemoryGraphmemoryGraph: true in memory-bridge.ts getRegistry()
8#1281 — Missing MCP toolsRegistered workflow_run, task_assign, task_summary, mcp_status
9#1256 — CacheManager .unref()6 setInterval calls across 4 files
11#1289 — MCP stdio statusStdio mode detection via !process.stdin.isTTY
12#1288 — Doctor disk spacedf -Ph + NaN guard
19#1238, #1262, #1267, #1268 — Rollback stubsAll 4 closed as housekeeping

Validation Results (2026-03-05)

  • ✅ TypeScript compilation clean (tsc --noEmit)
  • ✅ CLI build clean (npm run build)
  • ✅ Hook stdin smoke test: JSON parsed correctly from piped input
  • ✅ Backward compat: process.argv still works when no stdin
  • ✅ Dangerous command detection: rm -rf / blocked
  • ✅ Docker: all 4 containers healthy (mongodb, mcp-bridge, chat-ui, nginx)
  • ✅ MCP bridge: 54/54 test harness pass
  • ✅ Conversation isolation: two distinct IDs returned
  • ✅ Published v3.5.4: all 3 npm packages, all dist-tags verified

Remaining Open Issues (11)

P0 (0): All critical issues resolved. P1 (0): All high-priority issues resolved. P2 (5): #1279 zero swarms, #1243 SONA wiring, #1227 AgentMemoryScope, #1217 SolverBandit, #1231 ECOMPROMISED docs, #1207 AgentDB migration P3 (6): #1276 Edge Functions, #1273 context optimization, #1272 multilingual, #1236 dsp bin, #1245 ADR-058, #1242 ADR-057

Consequences

  • Sprint 1 fixes should ship as v3.5.4 (patch — learning pipeline). ✅ Shipped as v3.5.4 (PR #1300).
  • Sprint 2 remaining fixes (#1282, #1283, #1291) should ship as v3.6.0. ✅ Shipped as v3.5.5 (branding), v3.5.6 (platform+settings), v3.5.7 (stdin timeout).
  • Sprint 3 fixes should ship as v3.7.0 (minor — neural features). 4 items remain: #1243, #1227, #1217, #1279.
  • The 4 rollback incident stubs (#1238, #1262, #1267, #1268) should be closed immediately as housekeeping. ✅ Closed.