docs/adr/ADR-037-multi-person-pose-detection.md
The current signal-derived pose estimation pipeline (derive_pose_from_sensing() in the sensing server) generates at most one skeleton per frame from aggregate CSI features. When multiple people are present, it produces a single blended skeleton. Live testing with ESP32 hardware confirmed this: with 2 people in the room, the pipeline detected 1 person.
A single ESP32 node provides 1 TX × 1 RX × 56 subcarriers of CSI data per frame. While this is limited spatial resolution compared to camera-based systems, the signal contains composite reflections from all scatterers in the environment. The challenge is decomposing these composite signals into per-person contributions.
Implement multi-person pose detection in four phases, progressively improving accuracy from heuristic to neural approaches.
Estimate occupancy count from CSI signal statistics without decomposition.
Approach: Eigenvalue analysis of the CSI covariance matrix across subcarriers.
Accuracy target: > 80% for 0-3 people with a single ESP32 node.
Integration point: signal/src/ruvsense/field_model.rs already computes SVD eigenstructure. Extend it with an estimate_occupancy() method.
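
The eigenvalue approach can be illustrated with a minimal, dependency-free sketch. It assumes that each moving person contributes roughly one dominant eigen-direction to the subcarrier covariance, which is a heuristic and not something this ADR establishes. The free function `estimate_occupancy`, its parameters `noise_floor` and `max_people`, and the helper functions are hypothetical; the real method would live on the existing field model and reuse its SVD.

```rust
/// Sample covariance across subcarriers.
/// `frames[t][s]` is the CSI amplitude of subcarrier `s` at frame `t`.
fn covariance(frames: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let t = frames.len();
    let n = frames.first().map_or(0, |f| f.len());
    let mut mean = vec![0.0; n];
    for f in frames {
        for (m, x) in mean.iter_mut().zip(f) {
            *m += x / t as f64;
        }
    }
    let denom = (t.max(2) - 1) as f64;
    let mut cov = vec![vec![0.0; n]; n];
    for f in frames {
        for i in 0..n {
            for j in 0..n {
                cov[i][j] += (f[i] - mean[i]) * (f[j] - mean[j]) / denom;
            }
        }
    }
    cov
}

fn mat_vec(a: &[Vec<f64>], v: &[f64]) -> Vec<f64> {
    a.iter()
        .map(|row| row.iter().zip(v).map(|(x, y)| x * y).sum::<f64>())
        .collect()
}

/// Largest `k` eigenvalues of a symmetric PSD matrix via power iteration
/// with deflation (a stand-in for a proper SVD/eigensolver).
fn top_eigenvalues(mut a: Vec<Vec<f64>>, k: usize, iters: usize) -> Vec<f64> {
    let n = a.len();
    let mut eigs = Vec::new();
    for c in 0..k.min(n) {
        // Deterministic, non-uniform start vector.
        let mut v: Vec<f64> = (0..n).map(|i| 1.0 + 0.1 * ((i + c) % 7) as f64).collect();
        for _ in 0..iters {
            let w = mat_vec(&a, &v);
            let norm = w.iter().map(|x| x * x).sum::<f64>().sqrt();
            if norm < 1e-12 {
                break; // nothing left but numerical noise
            }
            v = w.iter().map(|x| x / norm).collect();
        }
        // Rayleigh quotient gives the eigenvalue estimate.
        let vnorm2: f64 = v.iter().map(|x| x * x).sum();
        let av = mat_vec(&a, &v);
        let lambda = v.iter().zip(&av).map(|(x, y)| x * y).sum::<f64>() / vnorm2;
        eigs.push(lambda);
        // Deflate: remove the component just found.
        for i in 0..n {
            for j in 0..n {
                a[i][j] -= lambda * v[i] * v[j] / vnorm2;
            }
        }
    }
    eigs
}

/// Hypothetical occupancy heuristic: count eigenvalues above `noise_floor`,
/// capped at `max_people`.
fn estimate_occupancy(frames: &[Vec<f64>], noise_floor: f64, max_people: usize) -> usize {
    top_eigenvalues(covariance(frames), max_people, 100)
        .iter()
        .filter(|&&l| l > noise_floor)
        .count()
}
```

In practice `noise_floor` would have to be calibrated against an empty-room recording, and the one-eigenvalue-per-person assumption would need validation against the > 80% accuracy target.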
Separate per-person signal contributions using blind source separation.
Approach: Non-negative Matrix Factorization (NMF) on the CSI spectrogram.
Alternative: Independent Component Analysis (ICA) on complex CSI (amplitude + phase). More powerful but requires phase calibration (see ruvsense/phase_align.rs).
Integration point: New module signal/src/ruvsense/separation.rs.
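
As a rough illustration of the NMF step, the sketch below factors a non-negative spectrogram matrix `V` (for example subcarriers × time frames) into `W` (per-component spectral signatures) and `H` (per-component activations) using the standard Lee-Seung multiplicative updates for the Frobenius objective. The function names, the fixed seed, and the plain `Vec<Vec<f64>>` matrix type are assumptions for illustration; they do not describe the planned separation.rs API.

```rust
type Mat = Vec<Vec<f64>>;

fn matmul(a: &Mat, b: &Mat) -> Mat {
    let (m, k, n) = (a.len(), b.len(), b[0].len());
    let mut c = vec![vec![0.0; n]; m];
    for i in 0..m {
        for p in 0..k {
            for j in 0..n {
                c[i][j] += a[i][p] * b[p][j];
            }
        }
    }
    c
}

fn transpose(a: &Mat) -> Mat {
    let (m, n) = (a.len(), a[0].len());
    (0..n).map(|j| (0..m).map(|i| a[i][j]).collect()).collect()
}

/// Frobenius norm of V - W*H.
fn frobenius_error(v: &Mat, w: &Mat, h: &Mat) -> f64 {
    let wh = matmul(w, h);
    v.iter()
        .zip(&wh)
        .flat_map(|(r, s)| r.iter().zip(s))
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f64>()
        .sqrt()
}

/// Deterministic positive initialisation from a small LCG (no external crates).
fn init(rows: usize, cols: usize, seed: &mut u64) -> Mat {
    let mut m = vec![vec![0.0; cols]; rows];
    for row in m.iter_mut() {
        for x in row.iter_mut() {
            *seed = seed
                .wrapping_mul(6364136223846793005)
                .wrapping_add(1442695040888963407);
            *x = 0.1 + (*seed >> 11) as f64 / (1u64 << 53) as f64;
        }
    }
    m
}

/// NMF with `k` components via multiplicative updates; returns (W, H).
fn nmf(v: &Mat, k: usize, iters: usize) -> (Mat, Mat) {
    let (m, n) = (v.len(), v[0].len());
    let mut seed = 42u64;
    let mut w = init(m, k, &mut seed);
    let mut h = init(k, n, &mut seed);
    let eps = 1e-9; // guards against division by zero
    for _ in 0..iters {
        // H <- H .* (W^T V) ./ (W^T W H)
        let wt = transpose(&w);
        let num = matmul(&wt, v);
        let den = matmul(&matmul(&wt, &w), &h);
        for a in 0..k {
            for j in 0..n {
                h[a][j] *= num[a][j] / (den[a][j] + eps);
            }
        }
        // W <- W .* (V H^T) ./ (W H H^T)
        let ht = transpose(&h);
        let num = matmul(v, &ht);
        let den = matmul(&w, &matmul(&h, &ht));
        for i in 0..m {
            for a in 0..k {
                w[i][a] *= num[i][a] / (den[i][a] + eps);
            }
        }
    }
    (w, h)
}
```

Here `k` would come from the Phase 1 occupancy estimate. The multiplicative updates keep `W` and `H` non-negative by construction, but they only reach a local minimum, so the components are not guaranteed to align one-to-one with individual people.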
Generate distinct pose skeletons per decomposed component.
Approach: Per-component feature extraction → per-person skeleton synthesis.
ruvsense/pose_tracker.rs) with AETHER re-ID embeddings (ADR-024)
Integration point: Modify derive_pose_from_sensing() in sensing-server/src/main.rs to return Vec<Person> with length > 1.
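
To make the multi-person output concrete, here is a minimal sketch of how per-frame detections might be matched to existing tracks by embedding similarity, so that person IDs stay stable across frames. Everything in it is an assumption for illustration: the `Person` and `Detection` field layouts, the `assign_ids` function, and the greedy cosine-similarity matching are hypothetical and do not describe the actual pose_tracker.rs or AETHER interfaces.

```rust
/// Hypothetical per-person output record.
#[derive(Debug, Clone)]
struct Person {
    id: u32,                  // stable track ID
    keypoints: Vec<[f32; 3]>, // assumed layout: x, y, confidence per joint
    embedding: Vec<f32>,      // re-ID embedding for this person
}

/// Hypothetical skeleton synthesised from one decomposed component,
/// before it has been given an ID.
struct Detection {
    keypoints: Vec<[f32; 3]>,
    embedding: Vec<f32>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Greedy association: each detection claims the most similar unclaimed
/// track whose similarity is at least `min_sim`; otherwise it gets a new ID.
fn assign_ids(
    tracks: &[Person],
    detections: Vec<Detection>,
    min_sim: f32,
    next_id: &mut u32,
) -> Vec<Person> {
    let mut claimed = vec![false; tracks.len()];
    let mut out = Vec::with_capacity(detections.len());
    for det in detections {
        let mut best: Option<(usize, f32)> = None;
        for (i, t) in tracks.iter().enumerate() {
            let sim = cosine(&t.embedding, &det.embedding);
            if !claimed[i] && sim >= min_sim && best.map_or(true, |(_, s)| sim > s) {
                best = Some((i, sim));
            }
        }
        let id = match best {
            Some((i, _)) => {
                claimed[i] = true;
                tracks[i].id
            }
            None => {
                let id = *next_id;
                *next_id += 1;
                id
            }
        };
        out.push(Person { id, keypoints: det.keypoints, embedding: det.embedding });
    }
    out
}
```

Greedy matching is order-dependent; with more than two or three people, an optimal assignment (for example the Hungarian algorithm) would likely be a better fit, at some cost against the 3 ms synthesis budget.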
Train a dedicated multi-person model using the RVF pipeline (ADR-036).
Accuracy target: [email protected] > 60% for 2-person scenarios.
persons[] array

| Component | Phase | Change |
|---|---|---|
| signal/src/ruvsense/field_model.rs | 1 | Add estimate_occupancy() |
| signal/src/ruvsense/separation.rs | 2 | New module: NMF decomposition |
| sensing-server/src/main.rs | 3 | derive_pose_from_sensing() multi-person output |
| signal/src/ruvsense/pose_tracker.rs | 3 | Multi-target tracking |
| nn/ | 4 | Multi-person inference head |
| train/ | 4 | Multi-person training pipeline |

| Operation | Budget | Phase |
|---|---|---|
| Person count estimation | < 2ms | 1 |
| NMF decomposition (k=3) | < 10ms | 2 |
| Multi-skeleton synthesis | < 3ms | 3 |
| Neural inference (multi-person) | < 50ms | 4 |
| Total pipeline | < 65ms (15 FPS) | All |
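
The total row follows from summing the per-phase budgets, assuming the four stages run sequentially on every frame (an assumption; the table does not say whether stages overlap or whether the heuristic and neural paths run together):

$$
t_{\text{total}} < 2 + 10 + 3 + 50 = 65\ \text{ms}, \qquad \frac{1000\ \text{ms/s}}{65\ \text{ms/frame}} \approx 15.4\ \text{frames/s}
$$

Under that assumption, neural inference accounts for 50 of the 65 ms, so it is the stage that most constrains the 15 FPS figure.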