docs/research/sota-2026-05-22/R3-crossroom-reid.md
Status: simulation + ADR-024/027 synthesis + privacy framing · 2026-05-22
AETHER (ADR-024) gives us contrastive CSI embeddings that achieve ~95% within-room 1-shot re-identification on MM-Fi. Can the same embeddings re-identify a person in a different room?
This question has two answers — a technical one and an ethical one. R3 takes both seriously.
A CSI embedding from any frame is approximately:
embedding = person_signature + environment_signature + noise
The environment signature includes multipath geometry, AP placement, furniture, walls. It is constant per (room, antenna placement), and changes by O(1) between rooms — empirically larger than the per-person signature variation. This is exactly the structure that ADR-027 (MERIDIAN) targets.
examples/research-sota/r3_crossroom_reid.py simulates the problem with physics-realistic parameters: 10 subjects, 3 rooms, 128-dim embeddings, person-signature scale 0.35, environment scale 1.5 (env ≈ 4.3× person), noise 0.3.
| Configuration | 1-shot accuracy | Δ from baseline |
|---|---|---|
| Within-room baseline | 100.0% | (matches AETHER ~95% target) |
| Cross-room, raw cosine K-NN | 70.0% | -30 pp |
| Cross-room, MERIDIAN 100% env subtraction | 100.0% | recovered |
| Cross-room, MERIDIAN 70% env subtraction (realistic) | 100.0% | recovered |
| Chance | 10.0% | floor |
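The setup behind the table can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the contents of r3_crossroom_reid.py: the function names `simulate` and `one_shot_accuracy` are hypothetical, every component is drawn as an independent Gaussian with the scales quoted above, and the exact accuracies depend on the random seed.

```python
import numpy as np

def simulate(n_subjects=10, n_rooms=3, n_samples=2, dim=128,
             person_scale=0.35, env_scale=1.5, noise_scale=0.3, seed=0):
    """Return embeddings of shape (n_rooms, n_subjects, n_samples, dim).

    Assumption: person, environment and noise terms are i.i.d. Gaussian
    and combine additively, as in the decomposition stated earlier.
    """
    rng = np.random.default_rng(seed)
    person = rng.normal(0.0, person_scale, (1, n_subjects, 1, dim))
    env = rng.normal(0.0, env_scale, (n_rooms, 1, 1, dim))
    noise = rng.normal(0.0, noise_scale, (n_rooms, n_subjects, n_samples, dim))
    # Broadcasting expands person over rooms/samples and env over subjects/samples.
    return person + env + noise

def one_shot_accuracy(gallery, queries):
    """Cosine nearest neighbour; row i of `queries` should match row i of `gallery`."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    predicted = np.argmax(q @ g.T, axis=1)
    return float(np.mean(predicted == np.arange(len(queries))))

emb = simulate()
# Gallery: first sample of every subject in room 0 (the single enrolment shot).
gallery = emb[0, :, 0]
within_room = one_shot_accuracy(gallery, emb[0, :, 1])  # same room, new sample
cross_room = one_shot_accuracy(gallery, emb[1, :, 0])   # different room
print(f"within-room: {within_room:.1%}, cross-room raw: {cross_room:.1%}")
```

Within a room the shared environment term largely cancels under cosine normalisation, so the person signature decides the match. Across rooms, the query's environment term correlates differently with each gallery entry and swamps the smaller person signature.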
Three observations:
MERIDIAN's core idea (ADR-027) is to estimate environment_signature from labelled samples in the new room and subtract it. The estimator works because:
mean(embeddings in room R) ≈ environment_signature[R]
Subtracting the per-room centroid gives embedding_clean ≈ person_signature + noise, which is the room-invariant signature.
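The centroid-subtraction step can be sketched as follows. This is an illustration of the idea only, not the ADR-027 implementation: `subtract_room_centroid`, `cosine_one_shot` and the `alpha` parameter are hypothetical names, and `alpha=0.7` is one assumed reading of "70% env subtraction" (only 70% of the estimated centroid is removed). The synthetic data reuses the scales from the simulation description.

```python
import numpy as np

def subtract_room_centroid(embeddings, alpha=1.0):
    """Remove a fraction `alpha` of the room mean from each embedding.

    embeddings: (n_samples, dim), all captured in the same room.
    alpha=1.0 is full subtraction; alpha<1.0 models an imperfect estimate
    (an assumption about what partial subtraction means).
    """
    centroid = embeddings.mean(axis=0, keepdims=True)
    return embeddings - alpha * centroid

def cosine_one_shot(gallery, queries):
    """Cosine nearest neighbour; row i of `queries` should match row i of `gallery`."""
    g = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    return float(np.mean(np.argmax(q @ g.T, axis=1) == np.arange(len(q))))

# Synthetic additive data: 10 subjects seen once in each of two rooms.
rng = np.random.default_rng(0)
person = rng.normal(0.0, 0.35, (10, 128))
env = rng.normal(0.0, 1.5, (2, 128))
room_a = person + env[0] + rng.normal(0.0, 0.3, person.shape)
room_b = person + env[1] + rng.normal(0.0, 0.3, person.shape)

raw = cosine_one_shot(room_a, room_b)
full = cosine_one_shot(subtract_room_centroid(room_a),
                       subtract_room_centroid(room_b))
partial = cosine_one_shot(subtract_room_centroid(room_a, alpha=0.7),
                          subtract_room_centroid(room_b, alpha=0.7))
print(f"raw: {raw:.1%}, full subtraction: {full:.1%}, 70% subtraction: {partial:.1%}")
```

Note that the centroid here is computed from the room's own samples without using identity labels. It also absorbs the mean person signature, which is harmless only when roughly the same population appears in both rooms.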
Trade-off: MERIDIAN needs labelled (or at least clustered) examples in the new room to estimate its centroid. Pure zero-shot transfer to an unobserved room is much harder — without any anchor, you can't distinguish "person A in new room" from "person B in old room" robustly.
R6's Fresnel forward model tells us where the env_sig lives in the embedding: it's the contribution from the multipath / reflector geometry. A 5 m bedroom has 4-6 dominant reflector positions; the env_sig is a function of those.
If we could predict the env_sig from the forward model + a room geometry (R6's A matrix + a coarse map of the room), we wouldn't need labelled examples. This is the next-tier sophistication: physics-informed domain invariance rather than statistically estimated.
This isn't built. It's the right next step in the AETHER + MERIDIAN line.
The same primitive that enables "RuView greets you by name in your bedroom" enables a building-level adversary to track every individual's movement through every WiFi-CSI-sensing surface. This is a stronger surveillance primitive than face recognition because:
The R14 ethical framework (opt-in by default, data stays on-device, override is one tap) applies, but with additional constraints specific to re-ID:
These constraints make some use cases impossible (e.g. "automatic global biometric ID" — yes, that's the point) and some clearly aligned with the user (e.g. "remember which family member is in which room").
The simulation assumes an additive person + env + noise decomposition. Real CSI has multiplicative environment effects in the multipath domain: env modulates person signature amplitude in subcarrier-specific ways. A more realistic forward model would multiply the per-subcarrier slot transfer function with the person signature, which makes env-removal harder (not just subtraction).
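The difference between the two models can be written out explicitly. The derivation below is a sketch under assumed notation that does not appear in the ADRs: $e_{i,R}[k]$ is component $k$ of the embedding for subject $i$ in room $R$, $p_i$ is the person signature, $v_R$ the additive environment signature, $h_R[k]$ a hypothetical per-component room gain, $n$ the noise, and bars denote averages over the samples in room $R$.

```latex
\begin{align}
% Additive model (as simulated)
e_{i,R}[k] &= p_i[k] + v_R[k] + n[k] \\
\mu_R[k]   &= \bar{p}[k] + v_R[k] + \bar{n}[k] \\
e_{i,R}[k] - \mu_R[k] &= \bigl(p_i[k] - \bar{p}[k]\bigr) + \bigl(n[k] - \bar{n}[k]\bigr) \\[6pt]
% Multiplicative model (assumed form)
e_{i,R}[k] &= h_R[k]\, p_i[k] + n[k] \\
\mu_R[k]   &= h_R[k]\, \bar{p}[k] + \bar{n}[k] \\
e_{i,R}[k] - \mu_R[k] &= h_R[k]\,\bigl(p_i[k] - \bar{p}[k]\bigr) + \bigl(n[k] - \bar{n}[k]\bigr)
\end{align}
```

In the additive case the room term $v_R$ cancels exactly, so the residual is room-independent. In the multiplicative case the residual is still scaled by $h_R[k]$, so centroid subtraction alone leaves a room-dependent distortion of the person signature.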