# ADR-070: Self-Supervised Pretraining
| Field | Value |
|---|---|
| Status | Accepted |
| Date | 2026-04-02 |
| Authors | rUv, claude-flow |
| Drivers | README limitation "No pre-trained model weights provided" |
| Related | ADR-069 (Cognitum Seed pipeline), ADR-027 (MERIDIAN), ADR-024 (AETHER contrastive), ADR-015 (MM-Fi dataset) |
## Context

The README lists "No pre-trained model weights are provided; training from scratch is required" as a known limitation. Users must collect their own CSI dataset and train from scratch, which is a significant barrier to adoption.
We now have the infrastructure to generate pre-trained weights directly from live hardware:

- A recording API (`POST /api/v1/recording/start`) that saves CSI frames to `.csi.jsonl`
- `rapid_adapt.rs` (contrastive TTT + entropy minimization)

No cameras or labels are needed; the system learns from self-supervised signals in the CSI stream itself.
## Decision

Implement a 4-phase pretraining pipeline that collects CSI from 2 ESP32 nodes, stores feature vectors in the Cognitum Seed, and produces distributable pre-trained weights.
### Phase 1: Data Collection

Capture labeled scenarios using the sensing-server recording API and the Cognitum Seed:
| Scenario | Duration | Label | Activity |
|---|---|---|---|
| Empty room | 5 min | empty | No one present, establish baseline |
| 1 person stationary | 5 min | 1p-still | Sit at desk, normal breathing |
| 1 person walking | 5 min | 1p-walk | Walk around room, varied paths |
| 1 person varied | 5 min | 1p-varied | Stand, sit, wave arms, turn |
| 2 people | 5 min | 2p | Both moving in room |
| Transitions | 5 min | transitions | Enter/exit room, appear/disappear |
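The schedule above can be scripted rather than issued by hand. A minimal sketch that builds one request body per scenario for the recording API (`POST /api/v1/recording/start`); the field names mirror the curl examples later in this ADR, and the commented-out `requests.post` call is an illustrative assumption:

```python
# Build one recording-session payload per scenario from the table above.
SCENARIOS = [
    ("empty",       "No one present, establish baseline"),
    ("1p-still",    "Sit at desk, normal breathing"),
    ("1p-walk",     "Walk around room, varied paths"),
    ("1p-varied",   "Stand, sit, wave arms, turn"),
    ("2p",          "Both moving in room"),
    ("transitions", "Enter/exit room, appear/disappear"),
]
DURATION_SECS = 300  # 5 minutes per scenario

def recording_payloads():
    """One JSON body per scenario, in 'pretrain-<label>' session-name style."""
    return [
        {"session_name": f"pretrain-{label}",
         "label": label,
         "duration_secs": DURATION_SECS}
        for label, _activity in SCENARIOS
    ]

payloads = recording_payloads()
# for p in payloads:  # hypothetical driver loop, requires a live server
#     requests.post("http://localhost:3000/api/v1/recording/start", json=p)
print(len(payloads), sum(p["duration_secs"] for p in payloads))  # → 6 1800
```

Six sessions at 5 minutes each gives 30 minutes of capture per collection run.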
Data rate per scenario:
Cognitum Seed role:
### Phase 2: Contrastive Pretraining

Train a contrastive encoder on the collected CSI data:
```
Input: Raw CSI frame (128 subcarriers × 2 I/Q = 256 features)
        ↓
TCN temporal encoder (3 layers, kernel=7)
        ↓
Projection head → 128-dim embedding
        ↓
Contrastive loss (InfoNCE):
  positive:  frames within a 0.5 s window from the same node
  negative:  frames >5 s apart, or from a different scenario
  cross-node positive: same timestamp, different node
```
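The pairing rule and the loss can be sketched numerically. A minimal numpy version, assuming cosine similarity; the temperature value is illustrative (this ADR does not fix one), and the 0.5 s / 5 s windows come from the scheme above:

```python
import numpy as np

TEMPERATURE = 0.1  # illustrative assumption, not specified in this ADR

def is_positive(dt_secs, same_node, same_scenario):
    """Pairing rule from the scheme above: positives are frames within a
    0.5 s window on the same node; cross-node positives share a timestamp;
    anything from a different scenario is never a positive."""
    if not same_scenario:
        return False
    return abs(dt_secs) <= 0.5 if same_node else dt_secs == 0

def info_nce(anchor, positive, negatives, temperature=TEMPERATURE):
    """InfoNCE: negative log-softmax of the positive similarity
    against the negatives' similarities."""
    def cos(a, b):
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([cos(anchor, positive)] + [cos(anchor, n) for n in negatives])
    logits = logits / temperature
    logits -= logits.max()  # numerical stability
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())

rng = np.random.default_rng(0)
z = rng.normal(size=128)
negs = [rng.normal(size=128) for _ in range(8)]
loss_aligned = info_nce(z, z, negs)                       # perfect positive: small loss
loss_random  = info_nce(z, rng.normal(size=128), negs)    # random positive: larger loss
```

A well-matched positive pair should drive the loss toward zero, while a random "positive" leaves it near `log(1 + num_negatives)`; that gap is what the encoder is trained to widen.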
Self-supervised signals:
### Phase 3: Task Heads

Attach lightweight heads for each task:
| Head | Architecture | Output | Supervision |
|---|---|---|---|
| Presence | Linear(128→1) + sigmoid | 0.0-1.0 | PIR sensor (free) |
| Person count | Linear(128→4) + softmax | 0-3 people | Scenario labels |
| Activity | Linear(128→4) + softmax | still/walk/varied/empty | Scenario labels |
| Vital signs | Linear(128→2) | breathing rate, heart rate (BPM) | ESP32 edge vitals |
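A shape-level sketch of the four heads from the table above. The weights here are random placeholders (in the real pipeline they are the trained heads exported to `pretrained-heads.onnx`); dimensions follow the table:

```python
import numpy as np

EMB = 128  # embedding size from the encoder's projection head
rng = np.random.default_rng(42)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Random placeholder weights, one matrix per head.
W_presence = rng.normal(size=(EMB, 1)) * 0.05
W_count    = rng.normal(size=(EMB, 4)) * 0.05
W_activity = rng.normal(size=(EMB, 4)) * 0.05
W_vitals   = rng.normal(size=(EMB, 2)) * 0.05

z = rng.normal(size=EMB)                  # a 128-dim embedding
presence = sigmoid(z @ W_presence)[0]     # scalar in (0, 1)
count    = softmax(z @ W_count)           # distribution over 0-3 people
activity = softmax(z @ W_activity)        # still / walk / varied / empty
vitals   = z @ W_vitals                   # [breathing rate, heart rate] in BPM
```

Because every head is a single linear layer on the shared 128-dim embedding, the heads together stay around the ~100 KB budget listed for `pretrained-heads.onnx`.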
### Phase 4: Distribution

Produce distributable artifacts:
| Artifact | Format | Size | Description |
|---|---|---|---|
| `pretrained-encoder.onnx` | ONNX | ~2 MB | Contrastive encoder (TCN backbone) |
| `pretrained-heads.onnx` | ONNX | ~100 KB | Task-specific heads |
| `pretrained.rvf` | RVF | ~500 KB | RuVector format with metadata |
| `room-profiles.json` | JSON | ~10 KB | Environment calibration profiles |
| `collection-witness.json` | JSON | ~5 KB | Seed witness-chain attestation proving data provenance |
Include these artifacts in the GitHub release alongside the firmware binaries. Users then download and run:
```sh
# Use the pre-trained model (no training needed)
cargo run -p wifi-densepose-sensing-server -- --model pretrained.rvf --http-port 3000
```
## Architecture

```
          192.168.1.20 (Host laptop)
          ┌──────────────────────────┐
          │ sensing-server           │
          │   Recording API          │
          │   Training pipeline      │
          │                          │
          │ seed_csi_bridge.py       │
          │   Feature → Seed ingest  │
          └────┬──────────┬──────────┘
               │          │
     UDP:5006  │          │  HTTPS:8443
   ┌───────────┤          └───────────┐
   │           │                      │
   ▼           ▼                      ▼
┌──────────┐  ┌──────────┐  ┌──────────────┐
│ ESP32 #1 │  │ ESP32 #2 │  │Cognitum Seed │
│ COM9     │  │ COM8     │  │ Pi Zero 2W   │
│ node=1   │  │ node=2   │  │ USB          │
│ .1.105   │  │ .1.104   │  │ .42.1/8443   │
│ v0.5.4   │  │ v0.5.4   │  │ v0.8.1       │
└──────────┘  └──────────┘  │ PIR, BME280  │
                            │ RVF store    │
                            │ Witness chain│
                            └──────────────┘
```
## Collection Procedure

```sh
export SEED_TOKEN="your-token"

# Start the Seed ingest bridge
python scripts/seed_csi_bridge.py \
  --seed-url https://169.254.42.1:8443 --token "$SEED_TOKEN" \
  --udp-port 5006 --batch-size 10 --validate &

# Start the sensing server
cargo run -p wifi-densepose-sensing-server -- \
  --source esp32 --udp-port 5006 --http-port 3000

# Empty room (leave the room for 5 min)
curl -X POST http://localhost:3000/api/v1/recording/start \
  -H 'Content-Type: application/json' \
  -d '{"session_name":"pretrain-empty","label":"empty","duration_secs":300}'

# 1 person stationary (sit at desk for 5 min)
curl -X POST http://localhost:3000/api/v1/recording/start \
  -H 'Content-Type: application/json' \
  -d '{"session_name":"pretrain-1p-still","label":"1p-still","duration_secs":300}'

# ... repeat for each scenario

# Verify ingest
python scripts/seed_csi_bridge.py --token "$SEED_TOKEN" --stats
# Should show 3,600+ vectors from the collection run
```
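The 3,600-vector sanity figure follows directly from the collection schedule. A quick check; the 2 Hz feature rate is inferred from the numbers in this ADR (3,600 vectors over 1,800 s of capture), not stated explicitly:

```python
# Expected Seed vector count after one full collection run.
SCENARIOS = 6
SECS_PER_SCENARIO = 300   # 5 minutes each
FEATURE_RATE_HZ = 2       # assumed: 3,600 vectors / 1,800 s of capture

expected_vectors = SCENARIOS * SECS_PER_SCENARIO * FEATURE_RATE_HZ
print(expected_vectors)  # → 3600
```

If `--stats` reports substantially fewer vectors, a session was likely dropped or the bridge fell behind the UDP stream.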
## Risks and Mitigations

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| 2 nodes insufficient for spatial diversity | Medium | Lower pretraining quality | Place nodes 3-5m apart at different heights |
| PIR sensor has limited range | Low | Weak presence labels | BME280 temp changes + kNN clusters as backup |
| Contrastive pretraining collapses | Low | Useless embeddings | Temperature scheduling, hard negative mining |
| Model too large for ESP32 inference | N/A | N/A | Inference on host/Seed, not on ESP32 |
| Room-specific overfitting | Medium | Poor generalization | MERIDIAN domain randomization (ADR-027), LoRA adaptation |