v3/docs/adr/ADR-172-fable-advisor-harness.md
claude -p (Cost-Disciplined Judge + Reflector)
ID: ADR-172
Status: Proposed — implemented on feat/agenticow-integration (ships in 3.21.0)
Date: 2026-07-04
Authors: rUv (drafted with Claude Code)
Related ADRs:
Two SOTA loops want a frontier model in the loop, not just at the endpoints:
Both are the same primitive: a headless frontier-model advisor. Fable 5 (claude-fable-5) via claude -p provides it.
The load-bearing constraint is cost. Measured: a trivial claude -p --model claude-fable-5 call from the project directory costs $1.56 — it loads ruflo's CLAUDE.md context (56k cache-creation tokens). At a 100-trajectory corpus that is ~$150. Naive integration is unusable.
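The corpus-level figure follows directly from the per-call measurement. As a worked check, with N for the number of trajectories and c for the cost per call (both symbols are notation introduced here, not from the harness):

```latex
C_{\text{naive}} = N \cdot c_{\text{naive}} = 100 \times \$1.56 = \$156 \approx \$150
```

This assumes one call per trajectory and no cache reuse between calls, which matches the naive integration described above.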
A single cost-disciplined fable-harness.ts service wrapping claude -p, with two entry points and three mandatory cost controls.
- claude -p from a fresh empty temp directory so no project CLAUDE.md / settings load. Measured effect: $1.56 → $0.34 (cache-creation 56k → 3.7k tokens).
- --append-system-prompt for the role + --max-budget-usd cap + --output-format json.
- Opt-in, off by default.

Combined: $1.56/item naive → ~$0.02/item disciplined, the difference between a demo and a usable loop.
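The controls above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of fable-harness.ts: the names `AdvisorOptions`, `buildArgs`, and `runAdvisor` are hypothetical, and the only CLI flags used are the ones named in this ADR.

```typescript
import { spawn } from "node:child_process";
import { mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical options shape for illustration; not taken from fable-harness.ts.
export interface AdvisorOptions {
  model: string;        // e.g. "claude-fable-5"
  rolePrompt: string;   // role text passed via --append-system-prompt
  maxBudgetUsd: number; // hard spend cap for this single call
}

// Build the argument list for one headless call, using only flags named in this ADR.
export function buildArgs(prompt: string, opts: AdvisorOptions): string[] {
  return [
    "-p", prompt,
    "--model", opts.model,
    "--append-system-prompt", opts.rolePrompt,
    "--max-budget-usd", String(opts.maxBudgetUsd),
    "--output-format", "json",
  ];
}

// Run claude -p from a fresh empty temp directory so no project context is loaded.
// Resolves with raw stdout; rejects only if the process cannot be spawned.
export function runAdvisor(prompt: string, opts: AdvisorOptions): Promise<string> {
  const cwd = mkdtempSync(join(tmpdir(), "fable-"));
  return new Promise<string>((resolve, reject) => {
    const child = spawn("claude", buildArgs(prompt, opts), { cwd });
    let out = "";
    child.stdin.end(); // the prompt is passed as an argument, so stdin is closed
    child.stdout.on("data", (chunk: Buffer) => { out += chunk.toString(); });
    child.on("error", reject);
    child.on("close", () => resolve(out));
  }).finally(() => rmSync(cwd, { recursive: true, force: true }));
}
```

Keeping argument construction separate from the spawn makes the flag set checkable without spending: a test can assert on `buildArgs` alone. A caller would still need to wrap `runAdvisor` so that a spawn failure becomes a degraded result rather than a rejection.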
- judgeCompletions(items[]) → [{id, resolved, confidence, reason}]: ADR-171 Tier 2. (Probe: correctly returned resolved:true for an applied patch, false for "I am not sure how to do this.")
- reflectFailures(items[]) → [{failureClass, diagnosis, mutationHint}]: GEPA reflective-mutation feed for evolve/optimization.

claude binary / budget exhausted / non-JSON reply → structured degraded result, never a throw.

claude -p (child_process): CI never spends. One live smoke behind RUFLO_FABLE_LIVE=1 for humans.
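The "structured degraded result, never a throw" rule can be illustrated with a small parser for the judge reply. This is a sketch under assumptions, not the harness's actual code: `Verdict`, `JudgeOutcome`, `parseJudgeReply`, and the `cause` strings are hypothetical. It also assumes the `--output-format json` envelope carries the model's text in a `result` field, and that the model was asked to reply with a JSON array.

```typescript
// Verdict fields follow the judgeCompletions return shape listed above.
export interface Verdict {
  id: string;
  resolved: boolean;
  confidence: number;
  reason: string;
}

// Hypothetical outcome type: either parsed verdicts or a degraded marker with a cause.
export type JudgeOutcome =
  | { ok: true; verdicts: Verdict[] }
  | { ok: false; degraded: true; cause: string };

// Runtime check that an unknown value has the Verdict shape.
function isVerdict(v: unknown): v is Verdict {
  if (typeof v !== "object" || v === null) return false;
  const r = v as Record<string, unknown>;
  return typeof r.id === "string" && typeof r.resolved === "boolean" &&
    typeof r.confidence === "number" && typeof r.reason === "string";
}

// Parse raw stdout into verdicts. Every failure path returns a degraded result.
export function parseJudgeReply(stdout: string): JudgeOutcome {
  try {
    const envelope: unknown = JSON.parse(stdout);
    // Assumption: the envelope's `result` field holds the model's reply text.
    const text = (envelope as { result?: unknown } | null)?.result;
    if (typeof text !== "string") {
      return { ok: false, degraded: true, cause: "no-result-field" };
    }
    const parsed: unknown = JSON.parse(text);
    if (!Array.isArray(parsed) || !parsed.every(isVerdict)) {
      return { ok: false, degraded: true, cause: "bad-shape" };
    }
    return { ok: true, verdicts: parsed };
  } catch {
    // Either the envelope or the inner reply was not valid JSON.
    return { ok: false, degraded: true, cause: "non-json-reply" };
  }
}
```

Returning a tagged union forces callers to check `ok` before reading `verdicts`, so a missing binary, an exhausted budget, or a prose reply all take the same degraded path instead of surfacing as exceptions.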