# ADR-011: Python Proof of Reality (Mock Elimination)

**File:** docs/adr/ADR-011-python-proof-of-reality-mock-elimination.md
**Status:** Proposed (URGENT)
**Date:** 2026-02-28
The WiFi-DensePose Python codebase contains real, mathematically sound signal processing (FFT, phase unwrapping, Doppler extraction, correlation features) alongside mock/placeholder code that fatally undermines credibility. External reviewers who encounter any mock path in the default execution flow conclude the entire system is synthetic. This is not primarily a technical problem; it is a perception problem with technical root causes.

The following code paths produce fake data in the default configuration, or can easily be mistaken for fake functionality:
| File | Line | Issue | Impact |
|---|---|---|---|
| `v1/src/core/csi_processor.py` | 390 | `doppler_shift = np.random.rand(10)  # Placeholder` | Real feature extractor returns random Doppler, killing credibility of the entire feature pipeline |
| `v1/src/hardware/csi_extractor.py` | 83-84 | `amplitude = np.random.rand(...)` in CSI extraction fallback | Random data silently substituted when parsing fails |
| `v1/src/hardware/csi_extractor.py` | 129-135 | `_parse_atheros()` returns `np.random.rand()` with the comment "placeholder implementation" | Named as if it parses real data; actually random |
| `v1/src/hardware/router_interface.py` | 211-212 | `np.random.rand(3, 56)` in fallback path | Silent random fallback |
| `v1/src/services/pose_service.py` | 431 | `mock_csi = np.random.randn(64, 56, 3)  # Mock CSI data` | Mock CSI in a production code path |
| `v1/src/services/pose_service.py` | 293-356 | `_generate_mock_poses()` with `random.randint` throughout | Entire mock pose generator in the service layer |
| `v1/src/services/pose_service.py` | 489-607 | Multiple `random.randint` calls for occupancy and historical data | Fake statistics that look real in API responses |
| `v1/src/api/dependencies.py` | 82, 408 | "return a mock user for development" | Auth bypass in the default path |
| File | Line | Issue |
|---|---|---|
| `v1/src/config/settings.py` | 144-145 | `mock_hardware=False`, `mock_pose_data=False` defaults are correct, but the mock infrastructure still exists |
| `v1/src/core/router_interface.py` | 27-300 | 270+ lines of mock data generation infrastructure in production code |
| `v1/src/services/pose_service.py` | 84-88 | Silent conditional `if not self.settings.mock_pose_data:` with no logging of real mode |
| `v1/src/services/hardware_service.py` | 72-375 | Interleaved mock/real paths throughout |
| File | Line | Issue |
|---|---|---|
| `v1/src/core/router_interface.py` | 198 | "Collect real CSI data from router (placeholder implementation)" |
| `v1/src/api/routers/health.py` | 170-171 | `uptime_seconds = 0.0  # TODO` |
| `v1/src/services/pose_service.py` | 739 | `"uptime_seconds": 0.0  # TODO` |
- **No separation between mock and real.** Mock generators live in the same modules as real processors. A reviewer reading `csi_processor.py` hits `np.random.rand(10)` at line 390 and stops trusting the 400 lines of real signal processing above it.
- **Silent fallbacks.** When real hardware isn't available, the system silently falls back to random data instead of failing loudly. The default `docker compose up` therefore produces plausible-looking but entirely fake results.
- **No proof artifact.** There is no shipped CSI capture file, no expected output hash, and no way for a reviewer to verify that the pipeline produces deterministic results from real input.
- **Build environment fragility.** The Dockerfile references a `requirements.txt` that doesn't exist as a standalone file, `setup.py` hardcodes 87 dependencies, and ONNX Runtime and BLAS are not in the container. A `docker build` may or may not succeed depending on the machine.
- **No CI verification.** No GitHub Actions workflow runs the pipeline on a real or deterministic input and verifies the output.
We will eliminate the credibility gap through five concrete changes:
Every path that currently returns `np.random.rand()` will either be replaced with real computation or raise an explicit error.
```python
# BEFORE (csi_processor.py:390)
doppler_shift = np.random.rand(10)  # Placeholder

# AFTER
def _extract_doppler_features(self, csi_data: CSIData) -> tuple:
    """Extract Doppler and frequency-domain features from CSI temporal history."""
    if len(self.csi_history) < 2:
        # Not enough history for temporal analysis: return zeros, not random
        doppler_shift = np.zeros(self.window_size)
        psd = np.abs(scipy.fft.fft(csi_data.amplitude.flatten(), n=128)) ** 2
        return doppler_shift, psd

    # Real Doppler extraction from temporal CSI differences.
    # Uses the phase field of the CSI history; shape (time, antennas, subcarriers).
    history = self.get_recent_history(self.window_size)
    phase_array = np.array([h.phase for h in history])

    # Unwrapped phase differences over time are proportional to Doppler shift
    temporal_phase_diff = np.diff(np.unwrap(phase_array, axis=0), axis=0)

    # Average across antennas, FFT across time for the Doppler spectrum
    doppler_spectrum = np.abs(scipy.fft.fft(temporal_phase_diff.mean(axis=1), axis=0))
    doppler_shift = doppler_spectrum.mean(axis=1)

    psd = np.abs(scipy.fft.fft(csi_data.amplitude.flatten(), n=128)) ** 2
    return doppler_shift, psd
```
```python
# BEFORE (csi_extractor.py:129-135)
def _parse_atheros(self, raw_data):
    """Parse Atheros CSI format (placeholder implementation)."""
    # For now, return mock data for testing
    return CSIData(amplitude=np.random.rand(3, 56), ...)

# AFTER
def _parse_atheros(self, raw_data: bytes) -> CSIData:
    """Parse Atheros CSI Tool format.

    Format: https://dhalperi.github.io/linux-80211n-csitool/
    """
    if len(raw_data) < 25:  # Minimum Atheros CSI header
        raise CSIExtractionError(
            f"Atheros CSI data too short ({len(raw_data)} bytes). "
            "Expected real CSI capture from Atheros-based NIC. "
            "See docs/hardware-setup.md for capture instructions."
        )
    # Parse actual Atheros binary format
    # ... real parsing implementation ...
```
All mock code moves to a dedicated module. Default execution NEVER touches mock paths.
```text
v1/src/
├── core/
│   ├── csi_processor.py       # Real processing only
│   └── router_interface.py    # Real hardware interface only
├── testing/                   # NEW: isolated mock module
│   ├── __init__.py
│   ├── mock_csi_generator.py  # Mock CSI generation (moved from router_interface)
│   ├── mock_pose_generator.py # Mock poses (moved from pose_service)
│   └── fixtures/              # Test fixtures, not production paths
│       ├── sample_csi_capture.bin  # Real captured CSI data (tiny sample)
│       └── expected_output.json    # Expected pipeline output for sample
```
Runtime enforcement:
```python
import logging
import os
import sys

MOCK_MODE = os.environ.get("WIFI_DENSEPOSE_MOCK", "").lower() == "true"

if MOCK_MODE:
    # Prefix EVERY log line with a mock-mode banner
    _original_log = logging.Logger._log

    def _mock_banner_log(self, level, msg, args, **kwargs):
        _original_log(self, level, f"[MOCK MODE] {msg}", args, **kwargs)

    logging.Logger._log = _mock_banner_log

    print("=" * 72, file=sys.stderr)
    print("  WARNING: RUNNING IN MOCK MODE - ALL DATA IS SYNTHETIC", file=sys.stderr)
    print("  Set WIFI_DENSEPOSE_MOCK=false for real operation", file=sys.stderr)
    print("=" * 72, file=sys.stderr)
```
A small real CSI capture file + one-command verification pipeline:
```text
v1/data/proof/
├── README.md                     # How to verify
├── sample_csi_capture.bin        # Real CSI data (1 second, ~50 KB)
├── sample_csi_capture_meta.json  # Capture metadata (hardware, env)
├── expected_features.json        # Expected feature extraction output
├── expected_features.sha256      # SHA-256 hash of expected output
└── verify.py                     # One-command verification script
```
`verify.py`:

```python
#!/usr/bin/env python3
"""Verify WiFi-DensePose pipeline produces deterministic output from real CSI data.

Usage:
    python v1/data/proof/verify.py

Expected output:
    PASS: Pipeline output matches expected hash
    SHA256: <hash>

If this passes, the signal processing pipeline is producing real,
deterministic results from real captured CSI data.
"""
import hashlib
import json
import os
import sys

# Ensure reproducibility
os.environ["PYTHONHASHSEED"] = "42"
import numpy as np

np.random.seed(42)  # Only affects any remaining random elements

sys.path.insert(0, os.path.join(os.path.dirname(__file__), "../.."))
from src.core.csi_processor import CSIProcessor
from src.hardware.csi_extractor import CSIExtractor


def main() -> int:
    # Load real captured CSI data
    base_dir = os.path.dirname(__file__)
    capture_path = os.path.join(base_dir, "sample_csi_capture.bin")
    meta_path = os.path.join(base_dir, "sample_csi_capture_meta.json")
    expected_hash_path = os.path.join(base_dir, "expected_features.sha256")

    with open(meta_path) as f:
        meta = json.load(f)

    # Extract CSI from the binary capture
    extractor = CSIExtractor(format=meta["format"])
    csi_data = extractor.extract_from_file(capture_path)

    # Process through the feature pipeline
    config = {
        "sampling_rate": meta["sampling_rate"],
        "window_size": meta["window_size"],
        "overlap": meta["overlap"],
        "noise_threshold": meta["noise_threshold"],
    }
    processor = CSIProcessor(config)
    features = processor.extract_features(csi_data)

    # Serialize features deterministically
    output = {
        "amplitude_mean": features.amplitude_mean.tolist(),
        "amplitude_variance": features.amplitude_variance.tolist(),
        "phase_difference": features.phase_difference.tolist(),
        "doppler_shift": features.doppler_shift.tolist(),
        "psd_first_16": features.power_spectral_density[:16].tolist(),
    }
    output_json = json.dumps(output, sort_keys=True, separators=(",", ":"))
    output_hash = hashlib.sha256(output_json.encode()).hexdigest()

    # Verify against the expected hash
    with open(expected_hash_path) as f:
        expected_hash = f.read().strip()

    if output_hash == expected_hash:
        print("PASS: Pipeline output matches expected hash")
        print(f"SHA256: {output_hash}")
        print(f"Features: {len(output['amplitude_mean'])} subcarriers processed")
        return 0
    else:
        print("FAIL: Hash mismatch")
        print(f"Expected: {expected_hash}")
        print(f"Got:      {output_hash}")
        return 1


if __name__ == "__main__":
    sys.exit(main())
```
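The `expected_features.sha256` file has to be generated once from a trusted run. A minimal companion sketch (hypothetical helper, not yet in the repo) shows the key constraint: the serialization must match `verify.py` byte-for-byte, or the hash will never reproduce:

```python
# Sketch: regenerate expected_features.sha256 from a trusted run.
# The JSON serialization (sort_keys, separators) must match verify.py exactly.
import hashlib
import json


def feature_hash(output: dict) -> str:
    """Hash a feature dict the same way verify.py does."""
    output_json = json.dumps(output, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(output_json.encode()).hexdigest()


def write_expected_hash(output: dict, path: str = "expected_features.sha256") -> str:
    """Write the canonical hash file and return the digest."""
    digest = feature_hash(output)
    with open(path, "w") as f:
        f.write(digest + "\n")
    return digest
```

Because `sort_keys=True` fixes key order and the compact separators eliminate whitespace variance, the same feature values always produce the same digest regardless of dict insertion order.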
Option A (recommended): a deterministic Dockerfile that works on a fresh machine.

```dockerfile
FROM python:3.11-slim

# System deps that actually matter
RUN apt-get update && apt-get install -y --no-install-recommends \
    libopenblas-dev \
    libfftw3-dev \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Pinned requirements (not a reference to a missing file)
COPY v1/requirements-lock.txt ./requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

COPY v1/ ./v1/

# Proof of reality: verify the pipeline at build time
RUN cd v1 && python data/proof/verify.py

EXPOSE 8000

# Default: REAL mode (mock requires explicit opt-in)
ENV WIFI_DENSEPOSE_MOCK=false
CMD ["uvicorn", "v1.src.api.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Key change: running `python data/proof/verify.py` during the build means the Docker image cannot be created unless the pipeline produces correct output from real CSI data.
Requirements lockfile (`v1/requirements-lock.txt`):

```text
# Core (required)
fastapi==0.115.6
uvicorn[standard]==0.34.0
pydantic==2.10.4
pydantic-settings==2.7.1
numpy==1.26.4
scipy==1.14.1

# Signal processing (required)
# No ONNX required for basic pipeline verification

# Optional (install separately for full features)
# torch>=2.1.0
# onnxruntime>=1.17.0
```
```yaml
# .github/workflows/verify-pipeline.yml
name: Verify Signal Pipeline
on:
  push:
    paths: ['v1/src/**', 'v1/data/proof/**']
  pull_request:
    paths: ['v1/src/**']
jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - name: Install minimal deps
        run: pip install numpy scipy pydantic pydantic-settings
      - name: Verify pipeline determinism
        run: python v1/data/proof/verify.py
      - name: Verify no random in production paths
        run: |
          # Fail if np.random appears in production code (outside testing/)
          if grep -rn "np\.random\.\(rand\|randn\|randint\)" v1/src/ \
              --include="*.py" --exclude-dir=testing; then
            echo "FAIL: np.random found in production code"
            exit 1
          fi
```
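The same guard can be run locally before pushing. A hypothetical Python equivalent of the CI grep (not an existing script in the repo; the pattern and `testing/` exclusion mirror the workflow step above):

```python
# Sketch of a local pre-push check mirroring the CI grep: flag any
# np.random.rand / randn / randint usage outside the testing/ directory.
import re
from pathlib import Path

RANDOM_PATTERN = re.compile(r"np\.random\.(rand|randn|randint)\b")


def find_random_in_production(root: str) -> list:
    """Return (file, line_number, line) for each banned np.random call."""
    hits = []
    for path in Path(root).rglob("*.py"):
        if "testing" in path.parts:  # mock code is allowed to use np.random
            continue
        for lineno, line in enumerate(path.read_text().splitlines(), start=1):
            if RANDOM_PATTERN.search(line):
                hits.append((str(path), lineno, line.strip()))
    return hits
```

Running `find_random_in_production("v1/src")` and failing on a non-empty result gives the same signal as the workflow, without waiting for CI.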
| File | Action | Description |
|---|---|---|
| `v1/src/core/csi_processor.py:390` | Replace | Real Doppler extraction from temporal CSI history |
| `v1/src/hardware/csi_extractor.py:83-84` | Replace | Hard error with a descriptive message when parsing fails |
| `v1/src/hardware/csi_extractor.py:129-135` | Replace | Real Atheros CSI parser, or hard error with hardware instructions |
| `v1/src/hardware/router_interface.py:198-212` | Replace | Hard error for unimplemented hardware, or real iwconfig + CSI tool integration |
| `v1/src/services/pose_service.py:293-356` | Move | Move `_generate_mock_poses()` to `v1/src/testing/mock_pose_generator.py` |
| `v1/src/services/pose_service.py:430-431` | Remove | Remove mock CSI generation from the production path |
| `v1/src/services/pose_service.py:489-607` | Replace | Real statistics from the database, or an explicit "no data" response |
| `v1/src/core/router_interface.py:60-300` | Move | Move the mock generator to `v1/src/testing/mock_csi_generator.py` |
| `v1/src/api/dependencies.py:82,408` | Replace | Real auth check, or explicit dev-mode bypass with logging |
| `v1/data/proof/` | Create | Proof bundle (sample capture + expected hash + verify script) |
| `v1/requirements-lock.txt` | Create | Pinned minimal dependencies |
| `.github/workflows/verify-pipeline.yml` | Create | CI verification |
`v1/docs/hardware-setup.md` (to be created):

```markdown
# Supported Hardware Matrix

| Chipset | Tool | OS | Capture Command |
|---------|------|----|-----------------|
| Intel 5300 | Linux 802.11n CSI Tool | Ubuntu 18.04 | `sudo ./log_to_file csi.dat` |
| Atheros AR9580 | Atheros CSI Tool | Ubuntu 14.04 | `sudo ./recv_csi csi.dat` |
| Broadcom BCM4339 | Nexmon CSI | Android/Nexus 5 | `nexutil -m1 -k1 ...` |
| ESP32 | ESP32-CSI | ESP-IDF | `csi_recv --format binary` |

# Calibration

1. Place router and receiver 2 m apart, line of sight
2. Capture 10 seconds of empty-room baseline
3. Have one person walk through at normal pace
4. Capture 10 seconds during walk-through
5. Run calibration: `python v1/scripts/calibrate.py --baseline empty.dat --activity walk.dat`
```
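What `calibrate.py` computes is not yet specified; one plausible core, sketched here under the assumption that baseline and activity captures are amplitude arrays of shape `(time, subcarriers)`, is a per-subcarrier variance ratio that identifies motion-sensitive subcarriers (function names and the threshold are hypothetical):

```python
# Hypothetical sketch of a calibration core: compare per-subcarrier amplitude
# variance during activity against the empty-room baseline. A ratio well
# above 1 marks subcarriers that respond to motion.
import numpy as np


def motion_sensitivity(baseline: np.ndarray, activity: np.ndarray,
                       eps: float = 1e-12) -> np.ndarray:
    """Variance ratio per subcarrier; inputs have shape (time, subcarriers)."""
    base_var = baseline.var(axis=0) + eps  # eps avoids division by zero
    act_var = activity.var(axis=0) + eps
    return act_var / base_var


def select_subcarriers(baseline: np.ndarray, activity: np.ndarray,
                       threshold: float = 2.0) -> np.ndarray:
    """Indices of subcarriers whose variance at least doubles during motion."""
    return np.where(motion_sensitivity(baseline, activity) >= threshold)[0]
```

The selected indices could then be persisted and used to weight the real feature pipeline toward informative subcarriers.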
Success criteria:

- `docker build . && docker run --rm wifi-densepose python v1/data/proof/verify.py` produces a deterministic PASS
- CI fails automatically on any `np.random` in production paths
- All mock code lives in `v1/src/testing/` and is never imported by production modules

A stranger can:

1. `git clone` the repository
2. Run one command (`docker build .` or `python v1/data/proof/verify.py`)
3. See `PASS: Pipeline output matches expected hash` with a specific SHA-256
4. Confirm via the CI badge that no `np.random` appears in any non-test file

If this works 100% of the time over 5 runs on a clean machine, the "fake" narrative dies.
Q1: Docker or Nix first? Recommendation: Docker first. The Dockerfile already exists and just needs fixing. Nix is higher quality but has a smaller audience; Docker gives the widest "clone and verify" coverage.
Q2: Are external crates public and versioned?
The Python dependencies are all public PyPI packages. The Rust `ruvector-core` and `ruvector-data-framework` crates are currently commented out in `Cargo.toml` (lines 83-84: `# ruvector-core = "0.1"`) and are not yet published to crates.io; they are internal to ruvnet. This is a blocker for the Rust path but does not affect the Python proof-of-reality work in this ADR.