Back to Ruflo

Ruflo Federation — User Guide

docs/federation/README.md

3.10.012.5 KB
Original Source

Ruflo Federation — User Guide

Cross-installation agent peering with built-in cost limits, circuit breaker, signed envelopes, and (as of alpha.14) opt-in WireGuard mesh layer governed by federation trust.

This guide walks through what federation is, when to use it, and how to set it up. For the architectural backstory and per-phase release notes see the companion gist.

What federation does

Federation lets two or more Ruflo installations — your mac, a server, a teammate's laptop — discover each other, exchange signed manifests, and send messages between them with bounded cost and per-peer trust gates. Key properties:

  • Ed25519 identity — each node holds a private key; peers exchange Ed25519-signed manifests. No central directory.
  • Five-level trust ladderUNTRUSTED → VERIFIED → ATTESTED → TRUSTED → PRIVILEGED. Each level unlocks a wider set of operations (discovery, send, share-context, remote-spawn, …).
  • Per-peer budget + circuit breaker — bounded tokens/USD per peer. Sustained failures auto-SUSPEND; further failures EVICT. ADR-097.
  • PII pipeline + audit trail — every cross-peer envelope passes through PII detection. Every state transition is auditable.
  • Real wire transport — WSS with permessage-deflate compression, optional cert pinning, stream multiplexing. ADR-104.
  • Optional WG mesh layer — opt-in opaque packet-layer reachability that follows federation trust changes. Compromised peer auto-isolated at L3 when the breaker fires. ADR-111.

When to use federation

Use caseFit
Two laptops collaborating on a project, want bounded cost sharing + audit
Personal home server agent ↔ travel laptop
Team of 5 engineers sharing memory/skills across machines
Mobile / Windows / sub-50 peers with NAT issues✅ over Tailscale + federation
Public-internet exposed agent endpoints✅ with TLS cert pinning (ADR-107)
Internal HR/finance multi-agent workflow with strict access tiers✅ with PRIVILEGED gating + audit
Replacing Slack/Discord — no❌ federation is for agent-to-agent, not human chat
Untrusted-internet messaging without identity vetting❌ — trust ladder must be bootstrapped out-of-band

Quick start

1. Install the plugin

bash
npx ruflo@latest                                     # if you don't have it yet
npx ruflo plugins install @claude-flow/plugin-agent-federation

Or directly via npm:

bash
npm i @claude-flow/plugin-agent-federation@latest    # currently 1.0.0-alpha.14

2. Initialize a node

bash
npx claude-flow@v3alpha agent spawn -t federation --name fed-1

Or via the MCP tool federation_init:

json
{
  "tool": "federation_init",
  "params": {
    "nodeId": "my-mac",
    "endpoint": "ws://my-mac.tailnet:9100",
    "agentTypes": ["coder", "tester"]
  }
}

This generates an Ed25519 keypair (persisted to .claude-flow/federation/keys-<nodeId>.json, mode 0600), publishes a signed manifest, and starts the discovery service.

3. Join a peer

json
{
  "tool": "federation_join",
  "params": {
    "endpoint": "ws://other-host.tailnet:9100"
  }
}

The handshake exchanges manifests, verifies Ed25519 signatures, and establishes the initial trust level (UNTRUSTED until the trust-evaluator records enough successful interactions to promote).

4. Send a message

json
{
  "tool": "federation_send",
  "params": {
    "targetNodeId": "other-host",
    "messageType": "agent-handoff",
    "payload": { "task": "Investigate the failing integration test" }
  }
}

Budget and trust gates apply: if the peer is below ATTESTED, only discovery/status/ping go through. If the breaker has the peer SUSPENDED, the send short-circuits with PEER_SUSPENDED.

MCP tools at a glance

ToolWhat it doesTrust gate
federation_initInitialize this node
federation_joinJoin a peer by endpoint
federation_peersList discovered peers
federation_sendSend a typed message to a peerper-peer (varies)
federation_querySynchronous query → responseATTESTED+
federation_statusCurrent node + peer trust summary
federation_trustView / adjust trust levelsoperator
federation_auditRead audit logoperator
federation_breaker_statusPer-peer state, when changed, why
federation_evictOperator manual evictoperator
federation_reactivateOperator manual reactivateoperator
federation_report_spendReport cost of a completed callintegrator
federation_consensusFederated proposal across peersvaries
federation_wg_status(ADR-111) Per-peer mesh state
federation_wg_attest(ADR-111) Operator-signed witness entryoperator
federation_wg_keyrotate(ADR-111) Rotate WG keypairoperator + confirm:true

Trust levels — what each unlocks

LevelCapabilities (federation)WG reachability (if ADR-111 active)
UNTRUSTEDdiscoveryExcluded from mesh — drop all
VERIFIED+ status, pingDiscovery port (9100) only
ATTESTED+ send, receive, query-redacted+ federation messaging (9101-9199)
TRUSTED+ share-context, collaborative-task+ ssh (22), services (80/443)
PRIVILEGED+ full-memory, remote-spawnFull mesh

Trust is earned via repeated successful interactions (the TrustEvaluator tracks score + interaction count). Promotion thresholds are documented in domain/entities/trust-level.ts.

Circuit breaker

If a peer's failure ratio or cost spend exceeds the policy:

ACTIVE → SUSPENDED → EVICTED
   ↑          │
   └─────────operator-only─────────┘
  • SUSPEND: peer's outbound sends short-circuit (PEER_SUSPENDED). Existing sessions continue; new sends rejected. Auto-eviction on continued failures.
  • EVICT: peer's outbound sends short-circuit (PEER_EVICTED). Session terminated. Operator must explicitly reactivate.
  • The breaker does not auto-reactivate. The integrator's health probe is responsible for confirming the peer is healthy and calling federation_reactivate.

Policy is tunable; defaults are conservative (50+ samples needed before suspend, high failure ratio threshold).

ADR-111 — WireGuard mesh (opt-in, since alpha.14)

Federation today treats network connectivity as the integrator's problem (Tailscale, LAN, wss://+pinning). That works but trust changes don't propagate to the L3 layer — a peer EVICTED in federation stays in the tailnet until an admin manually removes it.

ADR-111 closes that gap with an optional in-tree WG mesh:

  • WG keypair generated alongside the federation key
  • Mesh IP derived deterministically from nodeId (sha256 → 10.50.0.0/16 host portion, with collision-handling probe loop)
  • WG identity published inside the same Ed25519-signed manifest
  • Federation breaker SUSPEND → wg set ... allowed-ips "" (soft-block at L3)
  • Federation breaker EVICT → wg set ... remove (terminal)
  • Operator reactivate → AllowedIPs restored
  • Each mutation entered into an append-only Ed25519-signed witness chain

Phases shipped in alpha.14:

  • Phase 1 — Manifest extension + key generation
  • Phase 2 — WgMeshService (no shell — emits configs + commands)
  • Phase 3 — Coordinator/breaker wiring
  • Phase 4 — Firewall projection (nftables / pf) — PR #1895
  • Phase 5 — Witness attestation chain — PR #1895
  • Phase 6 — Operator MCP tools — PR #1895

Phase 7 — operator-mediated: See docs/federation/phase7-mesh-bringup.md for the cross-OS bringup procedure (mac ↔ ruvultra over Tailscale).

Enabling ADR-111

Set config.wgMesh: true in your federation plugin config, then run the staging helper:

bash
node v3/@claude-flow/plugin-agent-federation/scripts/phase7-stage.mjs \
  <localNodeId> <peerNodeId> <peerPubkey> <peerMeshIP> <peerEndpoint>

The script generates /tmp/adr-111-stage/:

  • wg-key-<nodeId>.json (mode 0600 — your private WG key)
  • ruflo-fed.conf (the wg-quick interface config)
  • ruflo-fed.nft or ruflo-fed.pf (firewall projection)

After review, the operator manually activates:

bash
sudo install -m 0600 /tmp/adr-111-stage/ruflo-fed.conf /etc/wireguard/ruflo-fed.conf
# Linux:
sudo nft -f /tmp/adr-111-stage/ruflo-fed.nft
# macOS:
sudo pfctl -a ruflo-fed -f /tmp/adr-111-stage/ruflo-fed.pf
sudo wg-quick up ruflo-fed

Using claude -p headless mode

claude -p (print/pipe mode) can drive federation MCP tools non-interactively. Each invocation processes its prompt and exits — for a persistent federation listener, run a long-lived MCP server instead.

On the originating host (mac mini):

bash
claude -p --model haiku --max-budget-usd 0.20 --output-format text \
  "Federation MCP tools to verify cross-machine peering health: name 3. One line."
# → Three Federation MCP tools for peering health verification:
#   federation-init (keypair generation), federation-status (peers/trust/metrics),
#   federation-audit (compliance filtering).

On the peer host (ruvultra) — requires interactive /login first since claude -p reads stored credentials:

bash
# First time only — operator-mediated:
claude   # then /login

# After that:
claude -p --model haiku --max-budget-usd 0.20 \
  "Run federation_status MCP and report peer count."

Workflow for cross-machine task handoff:

bash
# Host A — kick off a federated task:
claude -p --model sonnet --output-format json --resume <session> \
  "Use federation_send to dispatch this task to ruvultra: analyze the failing test"

# Host B (ruvultra) — receive + work:
claude -p --resume <session> "Continue handling the federated task"

The federation plugin handles signing, PII gating, breaker, and audit on every send. The two claude -p invocations don't share state directly — they communicate exclusively through the federation envelope channel.

Anti-goals — when NOT to use federation

  • Replacement for Slack/Discord. Federation moves agent envelopes, not human chat.
  • Public internet without identity vetting. Trust ladder bootstrapping is your responsibility.
  • NAT traversal magic. Use Tailscale (or Headscale) for that; federation rides on top.
  • A general-purpose RPC framework. It's specifically for AI agents with cost-aware budgeting.

Where things live

WhatPath
Plugin sourcev3/@claude-flow/plugin-agent-federation/src/
Testsv3/@claude-flow/plugin-agent-federation/__tests__/
ADRsv3/docs/adr/ADR-{097,104,105,106,107,109,110,111}-*.md
Phase 7 staging scriptv3/@claude-flow/plugin-agent-federation/scripts/phase7-stage.mjs
Witness signingplugins/ruflo-core/scripts/witness/

Releases

VersionWhat landed
1.0.0-alpha.9First user-visible release — see announcement gist
1.0.0-alpha.10ADR-097 Phases 2.a-4 + ADR-104 transport + ADR-109 inbound dispatcher
1.0.0-alpha.11-12ADR-109 sig verify, ADR-104 compression, ADR-107 TLS cert pinning
1.0.0-alpha.13ADR-104 stream multiplexing + ADR-110 MemorySpendReporter
1.0.0-alpha.14ADR-111 Phases 1-3 (WG mesh foundation)
1.0.0-alpha.15 (in flight)ADR-111 Phases 4-6 (firewall + witness + MCP tools) — PR #1895
  • ADR-097 — budget + circuit breaker
  • ADR-104 — WSS transport + multiplexing
  • ADR-105 — state snapshot/replay
  • ADR-106 — discovery mechanisms
  • ADR-107 — TLS + cert pinning
  • ADR-109 — inbound dispatch + sig verify
  • ADR-110 — production SpendReporter
  • ADR-111WG mesh layer

Support