docs/research/sota-2026-05-22/R1-toa-crlb.md
Status: closed-form CRLB analysis + numpy demo · 2026-05-22
R6 gave us the spatial sensitivity envelope (Fresnel-zone forward model) but said nothing about how precisely we can place a scatterer in 3-space. The two questions are independent: an antenna pair can be sensitive to motion within a 40 cm ellipsoid (R6) but only able to localise the cause of motion to ±50 cm (R1). For multistatic localisation, target tracking, and any per-occupant geometry, the ranging precision floor is the foundational physics.
WiFi gives us two ways to estimate range:
This thread quantifies both via the Cramér-Rao Lower Bound — the best any unbiased estimator could ever do — and compares them. Pure NumPy demo: examples/research-sota/r1_toa_crlb.py.
For a matched-filter ToA estimator at bandwidth B and SNR ρ:
σ_ToA ≥ 1 / (2π · β_rms · √ρ) (Kay 1993, eq. 3.14)
σ_d = c · σ_ToA
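where β_rms = B / √3 is the RMS bandwidth of a brick-wall (sinc) pulse and ρ is the SNR as a linear power ratio. The matched filter is the optimal receiver for a known signal in white Gaussian noise; the CRLB is the variance floor for any unbiased estimator, which the matched filter approaches at high SNR.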
| Bandwidth | σ_d (m) at SNR 0 dB | 10 dB | 20 dB | 30 dB | 40 dB |
|---|---|---|---|---|---|
| 20 MHz (HT20) | 4.13 | 1.31 | 0.41 | 0.13 | 0.04 |
| 40 MHz (HT40) | 2.07 | 0.65 | 0.21 | 0.07 | 0.02 |
| 80 MHz (VHT80) | 1.03 | 0.33 | 0.10 | 0.03 | 0.01 |
| 160 MHz (VHT160) | 0.52 | 0.16 | 0.05 | 0.02 | 0.01 |
| 320 MHz (EHT320) | 0.26 | 0.08 | 0.03 | 0.01 | <0.01 |
The relevant cell for ESP32-S3 + commodity APs is 20 MHz HT20 @ 20 dB SNR → 41 cm single-shot precision. Averaging 100 independent frames improves this by √100 = 10×, to about 4 cm.
That is the best any unbiased ToA estimator can do at this bandwidth and SNR. Getting below the floor requires more bandwidth, more SNR, or more averaging, not better signal processing.
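The bound above is simple enough to evaluate directly. The following is a minimal sketch, not the contents of `r1_toa_crlb.py`: it uses only the standard library, and the function name `toa_range_crlb` and the `n_avg` parameter are illustrative assumptions. It assumes the β_rms = B / √3 convention used in this note and independent frames for averaging.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def toa_range_crlb(bandwidth_hz: float, snr_db: float, n_avg: int = 1) -> float:
    """Range std-dev floor in metres: c / (2*pi*beta_rms*sqrt(rho)), reduced by sqrt(n_avg)."""
    beta_rms = bandwidth_hz / math.sqrt(3.0)   # RMS bandwidth, convention used in this note
    rho = 10.0 ** (snr_db / 10.0)              # SNR in dB -> linear power ratio
    sigma_toa = 1.0 / (2.0 * math.pi * beta_rms * math.sqrt(rho))
    # Averaging n_avg independent frames scales the std-dev by 1/sqrt(n_avg).
    return C * sigma_toa / math.sqrt(n_avg)


# Print one row per bandwidth, one column per SNR, in metres.
for bw_mhz in (20, 40, 80, 160, 320):
    row = [toa_range_crlb(bw_mhz * 1e6, snr_db) for snr_db in (0, 10, 20, 30, 40)]
    print(f"{bw_mhz:>4} MHz  " + "  ".join(f"{v:6.3f}" for v in row))
```

For the HT20 case, `toa_range_crlb(20e6, 20)` evaluates to roughly 0.413 m and `toa_range_crlb(20e6, 20, n_avg=100)` to roughly 0.041 m.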
The same demo computes single-subcarrier phase-derived ranging. At carrier f_c with phase noise σ_φ (radians):
σ_d_phi = c / (2π · f_c) · σ_φ = λ · σ_φ / (2π)
| Carrier | σ_d_phi (mm) at σ_φ = 0.5° | 1° | 2° | 5° | 10° |
|---|---|---|---|---|---|
| 2.4 GHz | 0.17 | 0.35 | 0.69 | 1.73 | 3.47 |
| 5.0 GHz | 0.08 | 0.17 | 0.33 | 0.83 | 1.67 |
| 6.0 GHz | 0.07 | 0.14 | 0.28 | 0.69 | 1.39 |
The reference 5° phase-noise figure is what ESP32-S3 typically achieves after phase_align.rs's LO-offset correction.
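The phase-ranging formula can be checked in a few lines. This is a minimal sketch under the note's own model (one subcarrier, range error proportional to phase error); the function name `phase_range_sigma` is an illustrative assumption, not something taken from the demo script.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def phase_range_sigma(carrier_hz: float, phase_noise_deg: float) -> float:
    """Range std-dev in metres from phase noise: lambda * sigma_phi / (2*pi)."""
    wavelength = C / carrier_hz
    sigma_phi = math.radians(phase_noise_deg)  # formula expects radians
    return wavelength * sigma_phi / (2.0 * math.pi)


# Print one row per carrier, one column per phase-noise level, in millimetres.
for fc_ghz in (2.4, 5.0, 6.0):
    row = [phase_range_sigma(fc_ghz * 1e9, deg) * 1e3 for deg in (0.5, 1, 2, 5, 10)]
    print(f"{fc_ghz:.1f} GHz  " + "  ".join(f"{v:5.2f}" for v in row))
```

At 2.4 GHz and σ_φ = 5°, `phase_range_sigma(2.4e9, 5)` comes out at roughly 1.73 mm.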
Same scenario for both methods: 20 MHz HT20 at 20 dB SNR for ToA, a 2.4 GHz carrier with σ_φ = 5° for phase, single-shot and with 100 averaged frames.
| Metric | ToA | Phase | Ratio |
|---|---|---|---|
| Single-shot | 0.413 m | 1.73 mm | 238× phase advantage |
| 100× averaged | 0.041 m | 0.17 mm | 238× |
Phase ranging is two orders of magnitude more precise than ToA at WiFi bandwidths. This is the fundamental reason the WiFi-sensing field went to CSI/phase instead of ToA.
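Dividing the two bounds gives a closed form for the advantage: c and 2π cancel, leaving √3 · f_c / (B · √ρ · σ_φ). The sketch below evaluates it; the function name `toa_to_phase_ratio` is an illustrative assumption, and the 2.4 GHz / 5° inputs are the values that reproduce the 1.73 mm phase figure. If both methods average the same number of independent frames, the 1/√N factor cancels as well, so the ratio does not depend on the averaging depth.

```python
import math


def toa_to_phase_ratio(bandwidth_hz: float, snr_db: float,
                       carrier_hz: float, phase_noise_deg: float) -> float:
    """sigma_d(ToA) / sigma_d(phase) = sqrt(3) * f_c / (B * sqrt(rho) * sigma_phi)."""
    rho = 10.0 ** (snr_db / 10.0)              # linear SNR
    sigma_phi = math.radians(phase_noise_deg)  # phase noise in radians
    return math.sqrt(3.0) * carrier_hz / (bandwidth_hz * math.sqrt(rho) * sigma_phi)


ratio = toa_to_phase_ratio(20e6, 20, 2.4e9, 5)
print(f"phase advantage: {ratio:.0f}x")
```

For the HT20, 20 dB, 2.4 GHz, 5° case the ratio is about 238.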
Phase ranging is only relative. The 2.4 GHz wavelength is 12.5 cm, so a measured phase of 30° could mean 1.04 cm, 13.54 cm, 26.04 cm, 38.54 cm, … with no way to disambiguate from one subcarrier alone. This is the integer-ambiguity problem of phase-based ranging, and it is why carrier-phase GPS (RTK) is harder than code-phase GPS.
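The ambiguity is easy to enumerate: every candidate is the fractional-wavelength distance plus a whole number of wavelengths. A small sketch, with the hypothetical helper name `candidate_ranges_cm` and the rounded 12.5 cm wavelength from the text:

```python
def candidate_ranges_cm(wavelength_cm: float, phase_deg: float,
                        max_range_cm: float) -> list[float]:
    """All ranges up to max_range_cm that would produce the same measured phase."""
    base = (phase_deg % 360.0) / 360.0 * wavelength_cm  # distance within one wavelength
    candidates = []
    n = 0
    while base + n * wavelength_cm <= max_range_cm:
        candidates.append(base + n * wavelength_cm)      # add n whole wavelengths
        n += 1
    return candidates


print([round(d, 2) for d in candidate_ranges_cm(12.5, 30.0, 40.0)])
```

With a 40 cm cap this lists the same four candidates quoted above; in a 5 m room the list grows to 40 entries, all equally consistent with the single phase reading.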
Resolution methods:
The right system combines ToA (for absolute disambiguation) and phase (for precision). 802.11mc FTM (Fine Timing Measurement) provides the absolute time-of-flight half of this on standard WiFi hardware, and RTK GPS combines code and carrier phase in the same way at L-band.
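One way to picture the combination is a coarse-then-fine step: the coarse range picks the integer number of wavelengths, and the phase supplies the fraction. The sketch below is an illustration of that idea only, not the algorithm used by FTM, RTK, or this codebase; `resolve_range` and its parameters are hypothetical names.

```python
import math


def resolve_range(coarse_range_m: float, phase_rad: float, wavelength_m: float) -> float:
    """Pick the phase-consistent range closest to a coarse (e.g. ToA) estimate."""
    two_pi = 2.0 * math.pi
    frac = (phase_rad % two_pi) / two_pi * wavelength_m   # sub-wavelength distance
    n = round((coarse_range_m - frac) / wavelength_m)     # nearest integer cycle count
    return frac + n * wavelength_m


wavelength = 0.125                      # rounded 2.4 GHz wavelength, metres
true_range = 3.03
phase = 2.0 * math.pi * ((true_range / wavelength) % 1.0)

print(resolve_range(3.07, phase, wavelength))  # coarse error 4 cm: correct cycle
print(resolve_range(3.10, phase, wavelength))  # coarse error 7 cm: off by one wavelength
```

The integer is chosen correctly only when the coarse error stays below λ/2 (about 6.25 cm at 2.4 GHz). The single-shot 41 cm ToA floor at 20 MHz is well above that, so a scheme like this would need heavy averaging, more bandwidth, or another ambiguity-resolution method.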
A typical "tight" 4-anchor convex-hull installation (anchors at 4 corners of a 5 m × 5 m room) has Geometric Dilution of Precision (GDOP) ≈ 1.5. Position-error CRLB scales as:
σ_pos = σ_range · √(GDOP / N_anchors)
Practical result (20 MHz, 20 dB SNR, single-shot):
| Method | Position precision |
|---|---|
| ToA (4 anchors, GDOP 1.5) | 25.3 cm |
| Phase (4 anchors, GDOP 1.5) | 1.06 mm |
This bounds what's possible for SOTA WiFi multistatic localisation. 25 cm with raw ToA is room-pose-quality; 1 mm with phase is RTK-quality but only after ambiguity resolution.
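The position figures follow from the scaling rule stated above applied to the single-shot range floors. A minimal sketch, assuming that rule as written in this note (σ_range · √(GDOP / N_anchors)); the function name `position_sigma` is illustrative.

```python
import math


def position_sigma(sigma_range: float, gdop: float = 1.5, n_anchors: int = 4) -> float:
    """Position std-dev under this note's scaling: sigma_range * sqrt(GDOP / N_anchors)."""
    return sigma_range * math.sqrt(gdop / n_anchors)


toa_cm = position_sigma(0.413) * 100.0  # ToA range floor given in metres
phase_mm = position_sigma(1.73)         # phase range floor given in millimetres
print(f"ToA:   {toa_cm:.1f} cm")
print(f"Phase: {phase_mm:.2f} mm")
```

The output unit is whatever unit `sigma_range` is passed in, so the same function serves both rows of the table.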
The current multistatic.rs uses learned attention weights over raw CSI. The CRLB analysis suggests an explicit decomposition would do better:
This is closer to the GPS pipeline than to the current learning-based attention. The trade-off: lower flexibility (less ability to learn around hardware imperfections) but higher interpretability and provable optimality.
phase_align.rs the phase advantage shrinks to ~5×.