docs/adr/ADR-105-federated-csi-training.md
Status: Proposed · Date: 2026-05-22 · Author: SOTA research loop tick-13 · Supersedes: none
RuView's per-occupant features (R14 empathic appliances, R3 cross-room re-ID, R8 per-person counting) require personalised models that learn the household's specific subjects, motion patterns, and environmental quirks. Personalisation requires training data, but the privacy framework from R14 + R3 explicitly forbids sending raw CSI off-device:
These constraints rule out centralised training on user CSI. The standard answer is federated learning (McMahan 2017): each device trains locally; only model deltas (gradients or weight updates) leave the device.
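To make "only model deltas leave the device" concrete, here is a minimal sketch, assuming a model can be treated as a flat slice of `f32` weights. The names `compute_delta` and `fedavg_step` are hypothetical and not part of any RuView crate. The sample-weighted mean shown is the plain FedAvg baseline, not the byzantine-robust rule this ADR adopts.

```rust
/// Delta a node would upload: local weights minus the global checkpoint.
/// Illustrative only; a flat f32 slice stands in for the real model.
pub fn compute_delta(global: &[f32], local: &[f32]) -> Vec<f32> {
    assert_eq!(global.len(), local.len(), "weight vectors must match");
    local.iter().zip(global).map(|(l, g)| l - g).collect()
}

/// Baseline FedAvg step: add the sample-weighted mean of the node deltas
/// to the global weights. Each entry is (delta, local sample count).
pub fn fedavg_step(global: &[f32], deltas: &[(Vec<f32>, usize)]) -> Vec<f32> {
    let total: usize = deltas.iter().map(|(_, n)| *n).sum();
    assert!(total > 0, "at least one local sample is required");
    let mut out = global.to_vec();
    for (delta, n) in deltas {
        assert_eq!(delta.len(), global.len(), "delta length must match");
        let w = *n as f32 / total as f32;
        for (o, d) in out.iter_mut().zip(delta) {
            *o += w * d;
        }
    }
    out
}
```

Raw CSI never appears in either function's inputs: the coordinator only ever sees the difference between two weight vectors.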
CSI has three properties that change the standard FedAvg recipe:
This ADR specifies the federation protocol.
Adopt MERIDIAN-FedAvg with byzantine-robust aggregation as the RuView federated training protocol.
local_epochs epochs. Local data is never transmitted off-device.

| Cog | Suggested federation frequency | Reason |
|---|---|---|
| cog-person-count (R8/R5 work) | Weekly | Counting model is well-trained; only need updates when household composition shifts |
| AETHER re-ID head (R3) | Daily | Re-ID drifts with seasonal multipath changes |
| cog-pose-estimation | Monthly | Base pose is stable; finetune only for new room geometries |
| cog-maritime-watch (R11) | Per-vessel-deployment | Vessel motion regimes vary; ship-specific fine-tune |
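The cadence table could be encoded as configuration along the following lines. This is a sketch under stated assumptions: the `Cadence` type and `suggested_cadence` function are hypothetical, `"aether-reid-head"` is an assumed identifier for the AETHER re-ID head, and "monthly" is approximated as 30 days.

```rust
use std::time::Duration;

/// How often a cog federates. Hypothetical type, not part of any RuView crate.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Cadence {
    /// Federate on a fixed interval.
    Every(Duration),
    /// Federate once per vessel deployment rather than on a timer.
    PerDeployment,
}

const DAY_SECS: u64 = 24 * 60 * 60;

/// Maps a cog name to the suggested cadence from the table.
/// "aether-reid-head" is an assumed identifier; monthly is taken as 30 days.
pub fn suggested_cadence(cog: &str) -> Option<Cadence> {
    match cog {
        "cog-person-count" => Some(Cadence::Every(Duration::from_secs(7 * DAY_SECS))),
        "aether-reid-head" => Some(Cadence::Every(Duration::from_secs(DAY_SECS))),
        "cog-pose-estimation" => Some(Cadence::Every(Duration::from_secs(30 * DAY_SECS))),
        "cog-maritime-watch" => Some(Cadence::PerDeployment),
        _ => None, // unknown cogs have no suggested cadence
    }
}
```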
Per round (typical RuView 4-seed installation):
| Phase | Bytes per node | Total |
|---|---|---|
| Coordinator → node: global checkpoint | 8 MB | 4 × 8 = 32 MB (multicast: 8 MB) |
| Local training (no transmission) | 0 | 0 |
| Node → coordinator: int8+LoRA delta | 1 MB | 4 × 1 = 4 MB |
| Aggregation + push: new global checkpoint | 8 MB | 8 MB |
| Total per round | ~ 5 MB / node | ~12-44 MB |
At a weekly cadence (4 rounds per month), that is ~50-180 MB / month / installation. This is well under typical home broadband caps: even the upper figure is about 0.06% of a standard 300 GB/month cap.
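The bandwidth arithmetic can be checked with a short sketch. The struct and function names are hypothetical. The lower bound is assumed to mean one multicast checkpoint plus all node deltas, the upper bound is assumed to mean unicast checkpoints plus deltas plus the pushed new checkpoint, and 1 GB is taken as 1000 MB.

```rust
/// Per-round traffic figures for one installation, in MB (from the table above).
pub struct RoundTraffic {
    pub nodes: u32,
    pub checkpoint_mb: u32,
    pub delta_mb: u32,
}

impl RoundTraffic {
    /// Assumed lower bound: one multicast checkpoint plus every node's delta.
    pub fn min_mb(&self) -> u32 {
        self.checkpoint_mb + self.nodes * self.delta_mb
    }

    /// Assumed upper bound: unicast checkpoint to every node, every delta,
    /// plus the pushed new global checkpoint.
    pub fn max_mb(&self) -> u32 {
        self.nodes * self.checkpoint_mb + self.nodes * self.delta_mb + self.checkpoint_mb
    }
}

/// Monthly traffic for a given number of rounds per month.
pub fn monthly_mb(per_round_mb: u32, rounds_per_month: u32) -> u32 {
    per_round_mb * rounds_per_month
}

/// Share of a monthly data cap, in percent (decimal units: 1 GB = 1000 MB).
pub fn percent_of_cap(mb: u32, cap_gb: u32) -> f64 {
    f64::from(mb) / (f64::from(cap_gb) * 1000.0) * 100.0
}
```

With 4 nodes, an 8 MB checkpoint and a 1 MB delta, this gives 12 to 44 MB per round and 48 to 176 MB per month at 4 rounds, which rounds to the ~50-180 MB quoted above.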
ruview-fed crate: protocol implementation, ~500 lines Rust, library only (no daemon).

Status: rejected. Violates R14 (data stays on-device) and R3 (no cross-installation linkage).
Status: rejected. A single compromised seed can shift the global model arbitrarily. R7 mincut adversarial work showed this is a real attack surface; Krum (or any byzantine-robust replacement) is required.
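For readers unfamiliar with Krum, the selection rule can be sketched as follows. This is a minimal illustration, not the `ruview-fed` implementation: updates are assumed to be flat `f32` vectors, and `krum_select` is a hypothetical name. It follows the rule from Blanchard et al. (2017): score each update by the sum of squared distances to its n - f - 2 nearest neighbours and keep the lowest-scoring update, subject to the precondition n >= 2f + 3.

```rust
/// Squared Euclidean distance between two equally sized updates.
fn sq_dist(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "updates must have equal length");
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Returns the index of the update selected by Krum, or `None` when the
/// precondition n >= 2f + 3 does not hold for the assumed number `f`
/// of byzantine nodes.
pub fn krum_select(updates: &[Vec<f32>], f: usize) -> Option<usize> {
    let n = updates.len();
    if n < 2 * f + 3 {
        return None;
    }
    let neighbours = n - f - 2;
    let mut best: Option<(usize, f32)> = None;
    for i in 0..n {
        // Distances from update i to every other update, ascending.
        let mut dists: Vec<f32> = (0..n)
            .filter(|&j| j != i)
            .map(|j| sq_dist(&updates[i], &updates[j]))
            .collect();
        dists.sort_by(|a, b| a.total_cmp(b));
        // Score: sum over the closest n - f - 2 neighbours.
        let score: f32 = dists.iter().take(neighbours).sum();
        if best.map_or(true, |(_, s)| score < s) {
            best = Some((i, score));
        }
    }
    best.map(|(i, _)| i)
}
```

An update far from the honest cluster gets a large score and is never selected, which is why one poisoned delta cannot drag the global model the way it can under plain averaging.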
Status: deferred to a future ADR. Cross-installation federation requires:
A worked design needs ~6 person-months of legal + crypto work. Not in scope for this ADR.
Status: alternative path for small deployments. A single-seed installation has no peers to federate with, so it uses on-device-only fine-tuning of the pre-trained base model. The federation protocol degrades gracefully to "no federation = local training only".
| Threat | Mitigation (within this ADR) |
|---|---|
| Compromised seed poisons global model | Krum aggregation + mincut consistency check (R7) |
| Coordinator (cognitum-v0) compromised | Multi-coordinator fallback; signed model checkpoints (Ed25519, ADR-100 pattern) |
| Eavesdropper recovers training data from deltas | LoRA rank-8 + int8 quantisation is information-theoretically lossy; differential privacy noise (σ=0.01) on deltas if higher assurance needed |
| Adversarial training signal injection (via crafted CSI) | R7 multi-link consistency (across antennas in same seed) catches this; federated mincut adds inter-seed consistency layer |
| Membership inference attack on the trained model | LoRA + DP-SGD on local training; see future ADR-106 for the formal DP budget |
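The int8 step in the eavesdropper row can be illustrated with a symmetric per-tensor quantiser. This is a sketch under assumptions: `quantize_i8` and `dequantize_i8` are hypothetical names, and the actual codec (which the plan below says will reuse ruvllm-microlora) may differ. It shows only that the round trip discards information; it is not a privacy guarantee in itself, and the optional DP noise is not modelled here.

```rust
/// Symmetric per-tensor int8 quantisation of a delta.
/// Returns the quantised values and the scale needed to decode them.
pub fn quantize_i8(delta: &[f32]) -> (Vec<i8>, f32) {
    let max_abs = delta.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    // An all-zero delta gets an arbitrary scale of 1.0 to avoid dividing by zero.
    let scale = if max_abs > 0.0 { max_abs / 127.0 } else { 1.0 };
    let q = delta
        .iter()
        .map(|v| (v / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    (q, scale)
}

/// Decodes quantised values back to f32; the result only approximates the input.
pub fn dequantize_i8(q: &[i8], scale: f32) -> Vec<f32> {
    q.iter().map(|&v| f32::from(v) * scale).collect()
}
```

Each decoded value lies within half a quantisation step (`scale / 2`) of the original, so small components of the delta collapse to zero on the wire.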
cog-pose, cog-count, AETHER head, future cogs all use the same federation surface.

ruview-fed crate).

ruview-fed ships.

ruview_fed_status, ruview_fed_pause), out of scope for this ADR but in the natural MCP roadmap.

| Step | Owner | LOC | Notes |
|---|---|---|---|
| 1. ruview-fed crate scaffold | TBD | 100 | Workspace member, no external deps initially |
| 2. Krum aggregator | TBD | 80 | Pure Rust, no GPU |
| 3. LoRA+int8 delta codec | TBD | 120 | Reuse ruvllm-microlora |
| 4. MERIDIAN centroid hook | TBD | 50 | Extend AgentDB hierarchical store |
| 5. Inter-seed mincut consistency | TBD | 100 | Reuse ruvector-mincut |
| 6. CLI surface (wifi-densepose-cli fed status / fed pause) | TBD | 80 | Add to existing CLI |
| 7. End-to-end test on 4-seed cognitum-cluster (the Pi+Hailo fleet from CLAUDE.local.md) | TBD | — | Real-hardware test |
Total ~500 lines + tests. A reasonable 2-week effort once ruview-fed is unblocked.
local_train(); ADR-105 only specifies the wire format and aggregation rules.

This ADR's threat model and update-level mincut design are direct outputs of the loop's two negative results:
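ruview-fed ships.

f < (K-2)/2 byzantine nodes. For K=4, the strict inequality gives f < 1, so no byzantine node is provably tolerated, and K=5 is the smallest installation that tolerates 1; for K=10, f < 4, so at most 3. RuView installations rarely have K>10 seeds, so the practical bound is ~3 byzantine.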