plugins/ruflo-metaharness/skills/harness-evolve/SKILL.md
Surfaces the upstream metaharness-darwin evolve CLI as a ruflo skill. The
write layer that pairs with ADR-150's read layer (score / genome /
mcp-scan / threat-model / oia-audit). Use when you have a harness whose
readiness scores are flat and you want to discover which surface mutation
moves them — without retraining the foundation model.
harness-score result is below target and you don't know which policy
surface is responsible.Implementation: scripts/evolve.mjs.
--repo exists, caps on --generations ≤ 50, --children
≤ 20, --concurrency ≤ 8, sandbox/selection/mutator are known values).--confirm: print plan + exit 0 (mirrors harness-mint safety
convention; defense in depth over the upstream safety.ts checks).--confirm: shell to npx -y @metaharness/darwin@~0.3.1 metaharness-darwin evolve <repo> ...
via the shared _darwin.mjs async helper. Per-generation progress is
forwarded to stderr; final champion JSON is captured from stdout.generations × children × per-variant (per-variant
≈ 60s real, ≈ 2s mock). Caller may override with --timeout-ms.inspectVariant
for secrets / shell-out / network / dynamic-eval). See ADR-153 §"Safety model".--alert-on-no-improvement: exit 1 when champion ≤ parent.| Surface | What it owns |
|---|---|
planner | task decomposition / step ordering |
contextBuilder | what gets fed into the prompt |
reviewer | self-critique / output verification |
retryPolicy | when + how to retry on failure |
toolPolicy | which tools the agent may use, under which conditions |
memoryPolicy | what to persist, recall, forget |
scorePolicy | how the agent grades its own output |
One mutation per variant. Multi-surface mutations are not allowed (causal attribution stays clean).
Reports land under <repo>/.metaharness/:
.metaharness/
archive.json # full lineage tree (sampling next gen draws from this)
lineage.json # parent→child edges only
variants/<id>/ # per-variant code (kept for audit)
runs/<id>/ # per-variant sandbox test output
reports/winner.json # final champion + score delta vs parent
Skill stdout = JSON {success, data: {champion, plan, durationMs, improved}}.
| Code | Meaning |
|---|---|
| 0 | Evolved OK, or dry-run, or degraded (Darwin absent) |
| 1 | --alert-on-no-improvement and champion did not beat parent |
| 2 | Config error or evolution infrastructure failure |
| 99 | Upstream "safety-disqualified" (PROPAGATED, not remapped) |
When @metaharness/darwin is not installed, the script emits
{degraded: true, reason: 'metaharness-darwin-not-available', hint: ...}
and exits 0. ruflo continues to function. CI's
no-metaharness-smoke.yml-style job asserts this path.