MetaHarness User Guide (ADR-150)

MetaHarness integration in ruflo 3.12.1+. Ten CLI subcommands, nine MCP tools, three CI workflows, and a dedicated ruflo eject command — all wired to the upstream metaharness / @metaharness/* ecosystem with graceful degradation when those optional packages aren't installed.

Quick links: Quick start · 10 CLI subcommands · 9 MCP tools · Architectural constraints · Workflows · Troubleshooting · ADR-152 similarity search · Eject

What is MetaHarness?

metaharness is a sibling agent-harness scaffolding system designed by the same author as ruflo. Where ruflo is a harness, metaharness analyzes harnesses — scoring readiness, mapping MCP surfaces, threat-modeling, fingerprinting genome characteristics, and detecting drift over time. ADR-150 integrates it as a first-class subsystem so you can audit and characterize ruflo (or any harness) from the same CLI.

The integration is strictly optional. Per ADR-150 constraint #3, ruflo remains fully operational even when every @metaharness/* package is uninstalled: every command degrades gracefully with a clear degraded: true payload instead of crashing.

Quick start

bash

# Install (metaharness ships bundled in @claude-flow/cli's plugins/)
npm i ruflo@latest

# Score the current repo's harness readiness
npx ruflo metaharness score --path .

# 7-section categorical genome report
npx ruflo metaharness genome --path .

# Static security scan of the declared MCP surface
npx ruflo metaharness mcp-scan --path . --fail-on high

# Composite audit (oia-manifest + threat-model + mcp-scan + score + genome)
npx ruflo metaharness oia-audit --path . --alert-on-worst high

# Detect drift from the last audit
npx ruflo metaharness drift-from-history --threshold 0.95

# Score two harnesses' similarity (ADR-152 §3.1)
npx ruflo metaharness similarity --a harnessA.json --b harnessB.json

All commands accept --format json|table and --help.

CLI subcommands

npx ruflo metaharness <subcommand> [flags]

#	Subcommand	One-line	Output shape
1	`score`	5-dim readiness scorecard	`{harnessFit, compileConfidence, taskCoverage, toolSafety, memoryUsefulness, estCostPerRunUsd, recommendedMode, archetype, template}`
2	`genome`	7-section categorical report	`{repo_type, agent_topology, risk_score, mcp_surface, test_confidence, publish_readiness}`
3	`mcp-scan`	Static MCP security findings	`{findings: [{severity, message, ...}], summary, alert}`
4	`threat-model`	Enterprise threat report	`{worst, findings: [{category, severity, ...}]}`
5	`oia-audit`	Composite audit → memory	`{timing, composite: {worst}, components, fingerprint, alert, persisted}`
6	`audit-list`	Enumerate audit records	`{namespace, filters, records: [{key, startedAt, ...}], generatedAt}`
7	`audit-trend`	Diff two audits (drift)	`{verdict, structuralDistance, introduced, cleared, alert}`
8	`similarity`	ADR-152 §3.1 weighted similarity	`{overall, components: {cosine, categorical, jaccard}, perDimension?}`
9	`drift-from-history`	One-command drift detection	`{timing, baseline, current, drift, alert}`
10	`mint`	Scaffold a custom harness	dry-run by default; refuses in-repo target

`score` — 5-dimension readiness

bash

npx ruflo metaharness score --path . --format json
npx ruflo metaharness score --path . --alert-on-fit-below 70

Returns five numeric dimensions (0–100):

harnessFit — overall readiness composite
compileConfidence — build/test signal strength
taskCoverage — breadth of declared agent roles
toolSafety — MCP policy posture
memoryUsefulness — persistence + retrieval characteristics

Plus estCostPerRunUsd, recommendedMode (CLI / CLI + MCP), archetype, template.

`genome` — 7-section categorical

bash

npx ruflo metaharness genome --path . --alert-on-risk-above 0.5

Returns categorical (string/enum) classifications that complement score's numerics. Pair them: score is how ready, genome is what kind.

`mcp-scan` — MCP security

bash

npx ruflo metaharness mcp-scan --path . --fail-on high

Reads .mcp/servers.json + .harness/claims.json and runs static analysis. Finding shape is normalized to {severity, message, title?, detail?, id?} — same fields whether upstream emitted JSON or our text-parser fell back.

--fail-on {low|medium|high} sets the alert.triggered floor.
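The floor can be read as an ordered comparison over the normalized findings. The following is a minimal sketch, not the plugin's source: it assumes severities rank low < medium < high < critical (the `critical` level is inferred from the drift section), and the names `SEVERITY_ORDER`, `worstSeverity`, and `alertTriggered` are hypothetical.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Normalized finding shape described above; optional fields omitted for brevity.
interface Finding {
  severity: Severity;
  message: string;
}

// Assumed ranking; the real scanner may order or name severities differently.
const SEVERITY_ORDER: Severity[] = ["low", "medium", "high", "critical"];

function rank(s: Severity): number {
  return SEVERITY_ORDER.indexOf(s);
}

// Highest severity present, or null when there are no findings.
function worstSeverity(findings: Finding[]): Severity | null {
  let worst: Severity | null = null;
  for (const f of findings) {
    if (worst === null || rank(f.severity) > rank(worst)) worst = f.severity;
  }
  return worst;
}

// True when at least one finding sits at or above the requested floor.
function alertTriggered(findings: Finding[], floor: Severity): boolean {
  return findings.some((f) => rank(f.severity) >= rank(floor));
}
```

With this reading, a scan that only produces `low` and `medium` findings would not trip `--fail-on high`, but would trip `--fail-on medium`.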

`threat-model` — Enterprise threat report

bash

npx ruflo metaharness threat-model --path . --fail-on high

Returns {worst, findings: [...]} suitable for sharing with infosec. Findings are categorized; the worst-severity rollup is the operationally-useful summary.

`oia-audit` — Composite audit → memory

bash

npx ruflo metaharness oia-audit --path . \
  --alert-on-worst high \
  --format json

Bundles 5 sub-audits in parallel (oia-manifest + threat-model + mcp-scan + score + genome) into one timestamped record. Persists to the metaharness-audit memory namespace by default, or pass --dry-run to skip persistence.

Output includes a denormalized fingerprint: {score, genome} field designed for downstream similarity() and audit-trend consumption.

`audit-list` — Enumerate records

bash

npx ruflo metaharness audit-list --limit 20 --since 30d --format json

Discover which audit keys exist before running audit-trend or drift-from-history --baseline-key <k>.

`audit-trend` — Diff two audits

bash

npx ruflo metaharness audit-trend \
  --baseline-key audit-2026-06-01... \
  --current-key  audit-2026-06-15... \
  --alert-on-distance-below 0.85

Returns the composite worst-severity delta, per-component status changes, introduced and cleared findings, and (per ADR-152 §3.1) the structural distance when both records carry a fingerprint.

Accepts memory keys OR direct file paths (--baseline /path/to/json.json) — useful for diffing CI artifacts.
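The introduced/cleared lists amount to a set difference between the findings of the two records. A rough sketch under stated assumptions: findings are matched by `id` when present and otherwise by severity plus message. That matching rule, and the names `findingKey` and `diffFindings`, are illustrative and not taken from the plugin.

```typescript
interface Finding {
  severity: string;
  message: string;
  id?: string;
}

// Hypothetical identity key: prefer the optional id, else severity + message.
function findingKey(f: Finding): string {
  return f.id ?? `${f.severity}:${f.message}`;
}

// Findings only in `current` are introduced; findings only in `baseline` are cleared.
function diffFindings(baseline: Finding[], current: Finding[]) {
  const baseKeys = new Set(baseline.map(findingKey));
  const currKeys = new Set(current.map(findingKey));
  return {
    introduced: current.filter((f) => !baseKeys.has(findingKey(f))),
    cleared: baseline.filter((f) => !currKeys.has(findingKey(f))),
  };
}
```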

`similarity` — ADR-152 §3.1 weighted similarity

bash

npx ruflo metaharness similarity \
  --a harnessA.json --b harnessB.json \
  --per-dimension \
  --alert-below 0.5

Returns overall ∈ [0,1] plus per-component breakdown:

cosine over 9 numerics (harnessFit, risk_score, etc.)
categorical over 4 enums (repo_type, recommendedMode, archetype, template)
jaccard over agent_topology (set of declared roles)

See ADR-152 §3.1 below for math + use cases.

`drift-from-history` — One-command drift

bash

# Slowest path — discovers the most recent audit in memory
npx ruflo metaharness drift-from-history --threshold 0.95

# Fast path — skip audit-list (~14× faster)
npx ruflo metaharness drift-from-history \
  --baseline-key audit-2026-06-15T... \
  --threshold 0.95

# Fastest path — skip memory entirely (~19× faster)
npx ruflo metaharness drift-from-history \
  --baseline-file /tmp/last-audit.json \
  --threshold 0.95 \
  --alert-on-new-severity high \
  --dry-run

Composes audit-list + oia-audit + audit-trend into one structured report. Three tiers of execution speed:

Tier	Flag	Wall time	When to use
Slow	(none)	~26 s	Interactive — let it discover the baseline
Fast	`--baseline-key`	~1.8 s	When you already know the key (e.g., from `audit-list`)
Fastest	`--baseline-file`	~1.4 s	CI artifact pipelines (diff this run vs downloaded prior artifact)

--alert-on-new-severity is orthogonal to --threshold: a CRITICAL finding triggers even if structural similarity stays above the threshold.

`mint` — Scaffold a harness

bash

npx ruflo metaharness mint --name foo --template vertical:coding --confirm

Dry-run by default. Pass --confirm to actually write.

MCP tools

Nine MCP tools registered under the metaharness category, callable by Claude Code / any MCP-aware agent:

mcp__claude-flow__metaharness_score
mcp__claude-flow__metaharness_genome
mcp__claude-flow__metaharness_mcp_scan
mcp__claude-flow__metaharness_threat_model
mcp__claude-flow__metaharness_oia_audit
mcp__claude-flow__metaharness_audit_list
mcp__claude-flow__metaharness_audit_trend
mcp__claude-flow__metaharness_similarity
mcp__claude-flow__metaharness_drift_from_history

Every handler returns the {success, data, degraded, exitCode} contract:

type MCPHandlerResult = {
  success: boolean;   // false on alert.triggered OR exitCode != 0
  data: any;          // the wrapped JSON payload
  degraded: boolean;  // true when metaharness is uninstalled
  exitCode: number;   // mirrors the CLI exit code
}

success === false is the source of truth for "this should block downstream action" — exitCode is also surfaced for shell-script consumers but the MCP layer uses success.
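A consumer of this contract might gate on it roughly as follows. This is a sketch only: the `Gate` type and `gate` function are hypothetical, and reporting a degraded result as "skipped" rather than as a pass is a design choice made for this example, not something the contract prescribes.

```typescript
interface MCPHandlerResult {
  success: boolean;
  data: unknown;
  degraded: boolean;
  exitCode: number;
}

type Gate = "block" | "skipped" | "proceed";

function gate(result: MCPHandlerResult): Gate {
  // success === false is the blocking signal (alert fired or non-zero exit).
  if (!result.success) return "block";
  // Degraded means metaharness was absent, so nothing was actually checked.
  if (result.degraded) return "skipped";
  return "proceed";
}
```

Keeping "skipped" separate from "proceed" lets a caller avoid reading a degraded run as a clean audit.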

Each tool description includes Use when ... guidance per ADR-112 so a model can pick the right one without reading source.

Architectural constraints (ADR-150)

The integration enforces four constraints as load-bearing invariants:

#	Constraint	Enforced by
1	Removable	`npm ls --without @metaharness/*` produces a working CLI
2	Optional in `package.json`	`@metaharness/*` packages MUST be in `optionalDependencies`, never `dependencies`
3	Graceful degradation	Every code path catches `MODULE_NOT_FOUND` and falls back to a `degraded: true` payload
4	CI gate	`.github/workflows/no-metaharness-smoke.yml` enforces 1–3 by static grep + runtime drill on every PR

If @metaharness/router, metaharness, or @metaharness/kernel are absent, every command emits:

json

{
  "degraded": true,
  "reason": "metaharness-not-installed",
  "hint": "Install metaharness manually with `npm i -D metaharness` or run `npx metaharness@latest --version` to verify network access.",
  "generatedAt": "2026-06-17T..."
}

…and exits 0. Downstream tooling can branch on degraded to fall back or skip.
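Branching on degraded could look like the sketch below, which parses a command's JSON output and separates the degraded payload from a real result. It assumes the command was run with --format json and that its stdout is a single JSON document; `isDegraded` and `readCliOutput` are hypothetical helper names.

```typescript
// Shape of the degraded payload shown above.
interface DegradedPayload {
  degraded: true;
  reason: string;
  hint: string;
  generatedAt: string;
}

type CliOutput =
  | { kind: "degraded"; reason: string }
  | { kind: "result"; data: unknown };

// Type guard: only checks the discriminating `degraded: true` field.
function isDegraded(payload: unknown): payload is DegradedPayload {
  return (
    typeof payload === "object" &&
    payload !== null &&
    (payload as { degraded?: unknown }).degraded === true
  );
}

// Parses stdout and tags it so callers can fall back or skip on degraded runs.
function readCliOutput(stdout: string): CliOutput {
  const payload: unknown = JSON.parse(stdout);
  return isDegraded(payload)
    ? { kind: "degraded", reason: payload.reason }
    : { kind: "result", data: payload };
}
```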

Common workflows

Daily drift check

bash

# Once: seed with a baseline audit
npx ruflo metaharness oia-audit --path . --alert-on-worst high

# Daily: detect drift vs the last baseline
npx ruflo metaharness drift-from-history --threshold 0.95 \
  --alert-on-new-severity high

The composite audit writes a record keyed by ISO timestamp. drift-from-history discovers it via audit-list, runs a fresh audit, diffs the fingerprints via ADR-152 §3.1 similarity, and alerts when:

Structural similarity falls below --threshold OR
Any introduced finding meets --alert-on-new-severity (orthogonal gate)
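The two gates can be sketched as independent checks joined by OR. This is an illustration under assumptions: the severity ranking (low < medium < high < critical) and the names `DriftInput` and `driftAlert` are hypothetical, and the real report shape may differ.

```typescript
type Severity = "low" | "medium" | "high" | "critical";

// Assumed ranking of severities, lowest first.
const ORDER: Severity[] = ["low", "medium", "high", "critical"];

interface DriftInput {
  overall: number; // structural similarity in [0, 1]
  introduced: { severity: Severity }[]; // findings new since the baseline
}

function driftAlert(input: DriftInput, threshold: number, newSeverityFloor?: Severity) {
  // Gate 1: structural similarity dropped below the threshold.
  const structural = input.overall < threshold;
  // Gate 2: an introduced finding meets the severity floor (only if a floor is set).
  const severity =
    newSeverityFloor !== undefined &&
    input.introduced.some((f) => ORDER.indexOf(f.severity) >= ORDER.indexOf(newSeverityFloor));
  return { triggered: structural || severity, structural, severity };
}
```

Under this reading, a run with similarity 0.99 and one newly introduced `critical` finding still alerts when the floor is `high`, even though the structural gate passes.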

Weekly cron (CI)

The repo ships .github/workflows/oia-audit-weekly.yml which runs the composite audit every Sunday 04:17 UTC, uploads the result as a 90-day-retained artifact, and diffs against the previous week's artifact using the fastest --baseline-file path.

Adapt for your repo:

yaml

- name: composite audit
  run: |
    npx ruflo metaharness oia-audit --path . --dry-run \
      --alert-on-worst high --format json > /tmp/audit.json
- uses: actions/upload-artifact@v4
  with:
    name: oia-audit-${{ github.run_id }}
    path: /tmp/audit.json
    retention-days: 90

- name: drift vs prior week
  if: always() && steps.prior-artifact.outputs.has_prior == 'true'
  run: |
    npx ruflo metaharness drift-from-history \
      --baseline-file /tmp/prior/audit.json \
      --threshold 0.95 \
      --alert-on-new-severity high \
      --format json > /tmp/drift.json

PR audit gate

bash

# In .github/workflows/metaharness-ci.yml
npx ruflo metaharness score --path . --alert-on-fit-below 70
npx ruflo metaharness mcp-scan --path . --fail-on high
npx ruflo metaharness threat-model --path . --fail-on high

Any of these exits 1 when the alert fires; standard CI failure semantics.

Template ranking (ADR-151 §3.2)

bash

# Compare current repo against N candidate templates
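for t in templates/*.json; do
  # --arg passes the filename as a jq string, so quotes or spaces in $t cannot break the filter
  npx ruflo metaharness similarity \
    --a current-genome.json --b "$t" --format json \
    | jq --arg t "$t" '{template: $t, overall: .overall}'
done | jq -s 'sort_by(-.overall)'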

The Recommender surfaces the closest-fit templates for a given target repo.

ADR-152 §3.1 Genome Similarity Search

A pure-TS, zero-@metaharness/*-dep similarity engine. Weighted blend:

Component	Weight	What it compares
cosine	0.4	9 numerics: `harnessFit`, `compileConfidence`, `taskCoverage`, `toolSafety`, `memoryUsefulness`, `risk_score`, `test_confidence`, `publish_readiness`, `estCostPerRunUsd`
categorical	0.3	4 enums: `repo_type`, `recommendedMode`, `archetype`, `template`
jaccard	0.3	`agent_topology` (set of declared roles)

overall = w_c · cosine + w_k · categorical + w_j · jaccard, all in [0, 1].
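The blend can be illustrated with a small sketch. It is not the engine's source and rests on several assumptions: categorical similarity is taken as the fraction of matching enum fields, two empty role sets count as identical, numeric inputs are non-negative and used as-is (any rescaling the real engine applies to fields such as estCostPerRunUsd is unknown), and the `Fingerprint` layout and function names are hypothetical.

```typescript
interface Fingerprint {
  numerics: number[]; // the numeric fields, in a fixed order
  enums: string[]; // the categorical fields, in a fixed order
  roles: string[]; // agent_topology as a list of declared roles
}

const WEIGHTS = { cosine: 0.4, categorical: 0.3, jaccard: 0.3 };

function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  // Assumption: two all-zero vectors are identical; one zero vector matches nothing.
  if (na === 0 || nb === 0) return na === nb ? 1 : 0;
  // Clamp to [0, 1] to absorb floating-point overshoot.
  return Math.min(1, Math.max(0, dot / Math.sqrt(na * nb)));
}

// Fraction of enum positions that hold the same value.
function categorical(a: string[], b: string[]): number {
  if (a.length === 0) return 1;
  let same = 0;
  for (let i = 0; i < a.length; i++) if (a[i] === b[i]) same++;
  return same / a.length;
}

// |A ∩ B| / |A ∪ B| over the declared roles.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  if (setA.size === 0 && setB.size === 0) return 1;
  let shared = 0;
  setA.forEach((role) => {
    if (setB.has(role)) shared++;
  });
  return shared / (setA.size + setB.size - shared);
}

function similarity(a: Fingerprint, b: Fingerprint) {
  const components = {
    cosine: cosine(a.numerics, b.numerics),
    categorical: categorical(a.enums, b.enums),
    jaccard: jaccard(a.roles, b.roles),
  };
  const overall =
    WEIGHTS.cosine * components.cosine +
    WEIGHTS.categorical * components.categorical +
    WEIGHTS.jaccard * components.jaccard;
  return { overall, components };
}
```

Because the weights sum to 1 and each component lies in [0, 1], overall stays in [0, 1] as well.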

Verdict thresholds:

overall	verdict
≥ 0.95	`near-identical`
≥ 0.85	`minor-drift`
≥ 0.5	`moderate-drift`
< 0.5	`major-drift`

These are the structural-distance verdicts surfaced by audit-trend and drift-from-history.
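The threshold table maps onto a simple cascade. A minimal sketch, with the function name `verdictFor` chosen for illustration:

```typescript
type Verdict = "near-identical" | "minor-drift" | "moderate-drift" | "major-drift";

// Checks the thresholds from highest to lowest; each bound is inclusive.
function verdictFor(overall: number): Verdict {
  if (overall >= 0.95) return "near-identical";
  if (overall >= 0.85) return "minor-drift";
  if (overall >= 0.5) return "moderate-drift";
  return "major-drift";
}
```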

Router integration (ADR-148/149)

@metaharness/router@~0.3.2 is wired as the cost-optimal model router behind the CLAUDE_FLOW_ROUTER_NEURAL=1 triple-gate. When the neural path is active, the routedBy field carries 'metaharness-knn' | 'metaharness-krr' | 'fastgrnn' so you can audit which engine made each decision.

Parallel-logging (ADR-150 Phase 2)

bash

export CLAUDE_FLOW_ROUTER_PARALLEL_LOG=1
# … run your normal workload …
node plugins/ruflo-metaharness/scripts/router-parallel-analyze.mjs \
  --input .swarm/router-parallel.jsonl --strict

Every route() call writes a paired-decision row (bandit pick + neural-augmented pick + outcome). The analyzer enforces the 3-criteria AND-gate from ADR-150 review-round-1:

quality > 2%   AND   cost < 1%   AND   latency < 5%

With --strict, the analyzer exits 1 if any criterion fails. This is the promotion gate before swapping the bandit out for the neural router in production.
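The AND-gate can be sketched as three literal comparisons. This example makes no claim about how the analyzer derives its percentages from the JSONL rows: it takes three already-computed values in percent, and the names `GateInput` and `promotionGate`, as well as the exit code 0 when not in strict mode, are assumptions.

```typescript
interface GateInput {
  qualityPct: number; // quality figure in percent, as reported by the analyzer
  costPct: number; // cost figure in percent
  latencyPct: number; // latency figure in percent
}

function promotionGate(m: GateInput, strict: boolean) {
  const failed: string[] = [];
  // Literal reading of: quality > 2% AND cost < 1% AND latency < 5%.
  if (!(m.qualityPct > 2)) failed.push("quality");
  if (!(m.costPct < 1)) failed.push("cost");
  if (!(m.latencyPct < 5)) failed.push("latency");
  const pass = failed.length === 0;
  // Assumption: only strict mode turns a failed gate into a non-zero exit code.
  return { pass, failed, exitCode: strict && !pass ? 1 : 0 };
}
```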

`ruflo eject`

A dedicated CLI command (not under metaharness) that lifts a ruflo project into a renamed standalone harness via metaharness --from-existing.

bash

# Dry-run (default) — prints the plan and exits without writing
npx ruflo eject --name my-harness

# Eject for real
npx ruflo eject --name my-harness --confirm

# Eject to a specific dir (must be OUTSIDE the calling repo)
npx ruflo eject --name my-harness --target /abs/path --confirm

Safety gate: refuses any --target inside the calling repo. The default target is /tmp/ruflo-eject-<ts>-<name>/ — a fresh location to prevent eject-on-top-of-source accidents.
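A containment check of this kind is commonly written with path.relative. The sketch below shows that general technique and is not the code in eject.ts; `isInsideRepo` is a hypothetical name, and the sketch does not resolve symlinks, which a production check would likely need.

```typescript
import * as path from "node:path";

// True when `target` is the repo root itself or lies anywhere beneath it.
function isInsideRepo(repoRoot: string, target: string): boolean {
  const rel = path.relative(path.resolve(repoRoot), path.resolve(target));
  if (rel === "") return true; // target is the repo root
  if (path.isAbsolute(rel)) return false; // e.g. a different drive on Windows
  // Outside only when the relative path climbs out via a ".." segment.
  return rel !== ".." && !rel.startsWith(".." + path.sep);
}
```

Comparing on a whole ".." segment rather than a string prefix keeps a sibling such as /work/repo-other from being mistaken for a path inside /work/repo.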

Use case: you've prototyped agent workflows on top of ruflo and want a renamed harness with its own identity, ready to publish or distribute independently.

`ruflo doctor`

Verify metaharness availability:

bash

npx ruflo doctor --component metaharness

Reports installed/missing status for @metaharness/router, metaharness, @metaharness/kernel, plus the plugin script directory location. Always exits 0 — doctor reports state, never blocks.

Troubleshooting

"metaharness: plugins/ruflo-metaharness/scripts/ not found"

Fixed in ruflo 3.12.1+. The CLI dispatcher locates its plugin scripts under node_modules/@claude-flow/cli/plugins/ruflo-metaharness/scripts/. If you're on 3.12.0, upgrade:

bash

npm install ruflo@latest

"degraded: true, reason: metaharness-not-installed"

The optional metaharness / @metaharness/* packages aren't in node_modules. Per ADR-150 constraint #3 this is a valid degraded mode — ruflo still works, you just won't get score/genome/etc. results. To enable them:

bash

npm install -D metaharness@latest @metaharness/router@latest

(Or accept the degraded mode — ruflo doesn't require metaharness for any non-metaharness command.)

Drift report exits 2 with "no audit records found"

You haven't seeded a baseline yet. Run one composite audit first:

bash

npx ruflo metaharness oia-audit --path .
# Then drift detection becomes meaningful
npx ruflo metaharness drift-from-history --threshold 0.95

`audit-list` shows zero records but I ran audits

Check the namespace: oia-audit persists to metaharness-audit by default. If your audits were written under a namespace overridden via AUDIT_LIST_NAMESPACE, set the same value when running audit-list:

bash

AUDIT_LIST_NAMESPACE=my-custom-ns npx ruflo metaharness audit-list

Composite audit takes 30+ seconds on CI

Expected — oia-audit spawns 5 sub-audits in parallel and each shells out to npx metaharness <cmd>. Cold-cache npx warmup is ~25 s per process. Mitigations:

Pre-install metaharness in the runner (skips npx fetch)
Use --dry-run to skip the memory-store roundtrip
Pin a CI cache for the npm/npx store

"ELIFECYCLE Command failed with exit code 1" on `pnpm install`

Usually transient network ECONNRESET on sharp / onnxruntime-node postinstall. Retry the install — the cron-fire workflows ship with npm_config_fetch_retries=5 so most flakes auto-recover.

Internals

Source: plugins/ruflo-metaharness/ in the repo
Bundled location at runtime: node_modules/@claude-flow/cli/plugins/ruflo-metaharness/scripts/
CLI dispatcher: v3/@claude-flow/cli/src/commands/metaharness.ts
MCP tools: v3/@claude-flow/cli/src/mcp-tools/metaharness-tools.ts
Eject command: v3/@claude-flow/cli/src/commands/eject.ts
ADR: v3/docs/adr/ADR-150-metaharness-integration-surfaces.md
ADR-152 §3.1 similarity: v3/docs/adr/ADR-152-genome-similarity-search.md
Tracking issue: #2399
Upstream: github.com/ruvnet/agent-harness-generator

Cross-references

Filed upstream issues (open):

ruvnet/agent-harness-generator#15 — CLI schema mismatch (downstream workaround via runMetaharness routing in place)
ruvnet/agent-harness-generator#16 — mcp-scan text-only output (downstream parseMcpScanText parser donated as MIT contribution)

Both are tracked in ADR-150 §"Cross-references".
