docs/adr/ADR-023-trained-densepose-model-ruvector-pipeline.md
| Field | Value |
|---|---|
| Status | Proposed |
| Date | 2026-02-28 |
| Deciders | ruv |
| Relates to | ADR-003 (RVF Cognitive Containers), ADR-005 (SONA Self-Learning), ADR-015 (Public Dataset Strategy), ADR-016 (RuVector Integration), ADR-017 (RuVector-Signal-MAT), ADR-020 (Rust AI Migration), ADR-021 (Vital Sign Detection) |
The WiFi-DensePose system currently operates in two distinct modes:
- **WiFi CSI sensing (working):** ESP32 streams CSI frames → Rust aggregator → feature extraction → presence/motion classification. 41 tests passing, verified at ~20 Hz with real hardware.
- **Heuristic pose derivation (working but approximate):** The Rust sensing server generates 17 COCO keypoints from WiFi signal properties using hand-crafted rules (`derive_pose_from_sensing()` in `sensing-server/src/main.rs`). This is not a trained model — keypoint positions are derived from signal amplitude, phase variance, and motion metrics rather than learned from labeled data.
Neither mode produces DensePose-quality body surface estimation. The CMU "DensePose From WiFi" paper (arXiv:2301.00250) demonstrated that a neural network trained on paired WiFi CSI + camera pose data can produce dense body surface UV coordinates from WiFi alone. However, that approach requires:
The Rust workspace already has the complete model architecture ready for training:
| Component | Crate | File | Status |
|---|---|---|---|
| WiFiDensePoseModel | wifi-densepose-train | model.rs | Implemented (random weights) |
| ModalityTranslator | wifi-densepose-train | model.rs | Implemented with RuVector attention |
| KeypointHead | wifi-densepose-train | model.rs | Implemented (17 COCO heatmaps) |
| DensePoseHead | wifi-densepose-nn | densepose.rs | Implemented (25 parts + 48 UV) |
| WiFiDensePoseLoss | wifi-densepose-train | losses.rs | Implemented (keypoint + part + UV + transfer) |
| MmFiDataset loader | wifi-densepose-train | dataset.rs | Planned (ADR-015) |
| WiFiDensePosePipeline | wifi-densepose-nn | inference.rs | Implemented (generic over Backend) |
| Training proof verification | wifi-densepose-train | proof.rs | Implemented (deterministic hash) |
| Subcarrier resampling (114→56) | wifi-densepose-train | subcarrier.rs | Planned (ADR-016) |
The vendor/ruvector/ subtree provides 90+ crates. The following are directly relevant to a trained DensePose pipeline:
Already integrated (5 crates, ADR-016):
| Crate | Algorithm | Current Use |
|---|---|---|
| ruvector-mincut | Subpolynomial dynamic min-cut O(n^{o(1)}) | Multi-person assignment in metrics.rs |
| ruvector-attn-mincut | Attention-gated min-cut | Noise-suppressed spectrogram in model.rs |
| ruvector-attention | Scaled dot-product + geometric attention | Spatial decoder in model.rs |
| ruvector-solver | Sparse Neumann solver O(√n) | Subcarrier resampling in subcarrier.rs |
| ruvector-temporal-tensor | Tiered temporal compression | CSI frame buffering in dataset.rs |
Newly proposed for DensePose pipeline (6 additional crates):
| Crate | Description | Proposed Use |
|---|---|---|
| ruvector-gnn | Graph neural network on HNSW topology | Spatial body-graph reasoning |
| ruvector-graph-transformer | Proof-gated graph transformer (8 modules) | CSI-to-pose cross-attention |
| ruvector-sparse-inference | PowerInfer-style sparse inference engine | Edge deployment with neuron activation sparsity |
| ruvector-sona | Self-Optimizing Neural Architecture (LoRA + EWC++) | Online environment adaptation |
| ruvector-fpga-transformer | FPGA-optimized transformer | Hardware-accelerated inference path |
| ruvector-math | Optimal transport, information geometry | Domain adaptation loss functions |
The RuVector Format (RVF) is a segment-based binary container format designed to package intelligence artifacts — embeddings, HNSW indexes, quantized weights, WASM runtimes, witness proofs, and metadata — into a single self-contained file. Key properties:
- Segment-based layout: every segment carries a header (`SegmentHeader`, magic `0x52564653` "RVFS") with type discriminator, content hash, compression, and timestamp
- Segment types: `Vec` (embeddings), `Index` (HNSW), `Overlay` (min-cut witnesses), `Quant` (codebooks), `Witness` (proof-of-computation), `Wasm` (self-bootstrapping runtime), `Dashboard` (embedded UI), `AggregateWeights` (federated SONA deltas), `Crypto` (Ed25519 signatures), and more
- Tiered quantization (`rvf-quant`): f32 / f16 / u8 / binary per segment, with SIMD-accelerated distance computation
- AGI container pattern (`agi_container.rs`): packages kernel + WASM + world model + orchestrator + evaluation harness + witness chains into a single deployable file

The trained DensePose model will be packaged as an `.rvf` container, making it a single
self-contained artifact that includes model weights, HNSW-indexed embedding tables, min-cut
graph overlays, quantization codebooks, SONA adaptation deltas, and the WASM inference
runtime — deployable to any host without external dependencies.
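To make the segment model concrete, here is a minimal parsing sketch. The field order and sizes are assumptions for illustration; the real `SegmentHeader` in `rvf-types` may lay out fields differently.

```rust
/// Hypothetical fixed-size, little-endian segment header layout:
/// magic (4) + type discriminator (1) + content hash (32) + timestamp (8).
const RVF_MAGIC: u32 = 0x5256_4653; // "RVFS"

#[derive(Debug, PartialEq)]
struct SegmentHeader {
    magic: u32,
    seg_type: u8, // e.g. 0x01 = Vec, 0x02 = Index, 0x0A = Witness
    content_hash: [u8; 32],
    timestamp: u64,
}

fn parse_segment_header(buf: &[u8]) -> Option<SegmentHeader> {
    if buf.len() < 45 {
        return None; // too short for the assumed 45-byte header
    }
    let magic = u32::from_le_bytes(buf[0..4].try_into().ok()?);
    if magic != RVF_MAGIC {
        return None; // not an RVF segment
    }
    let seg_type = buf[4];
    let mut content_hash = [0u8; 32];
    content_hash.copy_from_slice(&buf[5..37]);
    let timestamp = u64::from_le_bytes(buf[37..45].try_into().ok()?);
    Some(SegmentHeader { magic, seg_type, content_hash, timestamp })
}
```

The type discriminator is what lets a loader walk the segment directory and skip segments it does not understand.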
Implement a fully trained DensePose model using RuVector signal intelligence as the backbone signal processing layer, packaged in the RVF container format. The pipeline has three stages: (1) offline training on public datasets, (2) teacher-student distillation for DensePose UV labels, and (3) online SONA adaptation for environment-specific fine-tuning. The trained model, its embeddings, indexes, and adaptation state are serialized into a single .rvf file.
┌─────────────────────────────────────────────────────────────────────────────┐
│ TRAINED DENSEPOSE PIPELINE │
│ │
│ ┌─────────────┐ ┌──────────────────────┐ ┌──────────────────────┐ │
│ │ ESP32 CSI │ │ RuVector Signal │ │ Trained Neural │ │
│ │ Raw I/Q │───▶│ Intelligence Layer │───▶│ Network │ │
│ │ [ant×sub×T] │ │ (preprocessing) │ │ (inference) │ │
│ └─────────────┘ └──────────────────────┘ └──────────────────────┘ │
│ │ │ │
│ ┌─────────┴─────────┐ ┌────────┴────────┐ │
│ │ 5 RuVector crates │ │ 6 RuVector │ │
│ │ (signal processing)│ │ crates (neural) │ │
│ └───────────────────┘ └─────────────────┘ │
│ │ │
│ ┌──────────────────────────┘ │
│ ▼ │
│ ┌──────────────────────────────────────┐ │
│ │ Outputs │ │
│ │ • 17 COCO keypoints [B,17,H,W] │ │
│ │ • 25 body parts [B,25,H,W] │ │
│ │ • 48 UV coords [B,48,H,W] │ │
│ │ • Confidence scores │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘
Raw CSI frames from ESP32 (56–192 subcarriers × N antennas × T time frames) are processed through the RuVector signal intelligence stack before entering the neural network. This replaces hand-crafted feature extraction with learned, graph-aware preprocessing.
Raw CSI [ant, sub, T]
│
▼
┌─────────────────────────────────────────────────────┐
│ 1. ruvector-attn-mincut: gate_spectrogram() │
│ Input: Q=amplitude, K=phase, V=combined │
│ Effect: Suppress multipath noise, keep motion- │
│ relevant subcarrier paths │
│ Output: Gated spectrogram [ant, sub', T] │
├─────────────────────────────────────────────────────┤
│ 2. ruvector-mincut: mincut_subcarrier_partition() │
│ Input: Subcarrier coherence graph │
│ Effect: Partition into sensitive (motion- │
│ responsive) vs insensitive (static) │
│ Output: Partition mask + per-subcarrier weights │
├─────────────────────────────────────────────────────┤
│ 3. ruvector-attention: attention_weighted_bvp() │
│ Input: Gated spectrogram + partition weights │
│ Effect: Compute body velocity profile with │
│ sensitivity-weighted attention │
│ Output: BVP feature vector [D_bvp] │
├─────────────────────────────────────────────────────┤
│ 4. ruvector-solver: solve_fresnel_geometry() │
│ Input: Amplitude + known TX/RX positions │
│ Effect: Estimate TX-body-RX ellipsoid distances │
│ Output: Fresnel geometry features [D_fresnel] │
├─────────────────────────────────────────────────────┤
│ 5. ruvector-temporal-tensor: compress + buffer │
│ Input: Temporal CSI window (100 frames) │
│ Effect: Tiered quantization (hot/warm/cold) │
│ Output: Compressed tensor, 50-75% memory saving │
└─────────────────────────────────────────────────────┘
│
▼
Feature tensor [B, T*tx*rx, sub] (preprocessed, noise-suppressed)
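The gating idea in stage 1 can be illustrated with a variance test: motion-responsive subcarriers fluctuate over the window, static multipath does not. This is only a stand-in for intuition; the real `ruvector-attn-mincut` gate is attention-based, and the threshold here is arbitrary.

```rust
/// Illustrative stand-in for spectrogram gating: keep subcarriers whose
/// amplitude variance over the time window exceeds `thresh`, zero the rest.
/// Returns the number of subcarriers kept.
fn gate_subcarriers(spec: &mut Vec<Vec<f32>>, thresh: f32) -> usize {
    let mut kept = 0;
    for sub in spec.iter_mut() {
        let n = sub.len() as f32;
        let mean = sub.iter().sum::<f32>() / n;
        let var = sub.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
        if var >= thresh {
            kept += 1; // motion-responsive: pass through
        } else {
            sub.iter_mut().for_each(|x| *x = 0.0); // static path: suppress
        }
    }
    kept
}
```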
The neural network follows the CMU teacher-student architecture with RuVector enhancements at three critical points.
CSI features [B, T*tx*rx, sub]
│
├──amplitude──┐
│ ├─► Encoder (Conv1D stack, 64→128→256)
└──phase──────┘ │
▼
┌──────────────────────────────┐
│ ruvector-graph-transformer │
│ │
│ Treat antenna-pair×time as │
│ graph nodes. Edges connect │
│ spatially adjacent antenna │
│ pairs and temporally │
│ adjacent frames. │
│ │
│ Proof-gated attention: │
│ Each layer verifies that │
│ attention weights satisfy │
│ physical constraints │
│ (Fresnel ellipsoid bounds) │
└──────────────────────────────┘
│
▼
Decoder (ConvTranspose2d stack, 256→128→64→3)
│
▼
Visual features [B, 3, 48, 48]
RuVector enhancement: Replace standard multi-head self-attention in the bottleneck with ruvector-graph-transformer. The graph structure encodes the physical antenna topology — nodes that are closer in space (adjacent ESP32 nodes in the mesh) or time (consecutive frames) have stronger edge weights. This injects domain-specific inductive bias that standard attention lacks.
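The "closer in space or time means stronger edges" rule can be sketched as a simple decay function over node coordinates. The exponential form and unit decay constants are illustrative assumptions, not the weighting used by `ruvector-graph-transformer`.

```rust
/// Edge weight between two antenna-pair × time-step graph nodes:
/// decays with spatial distance between antenna pairs and with
/// temporal distance between frames.
fn edge_weight(ant_a: usize, t_a: usize, ant_b: usize, t_b: usize) -> f32 {
    let d_space = (ant_a as f32 - ant_b as f32).abs();
    let d_time = (t_a as f32 - t_b as f32).abs();
    // adjacent antenna pairs and consecutive frames get the strongest edges
    (-(d_space + d_time)).exp()
}
```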
Visual features [B, 3, 48, 48]
│
▼
ResNet18 backbone → feature maps [B, 256, 12, 12]
│
▼
┌─────────────────────────────────────────┐
│ ruvector-gnn: Body Graph Network │
│ │
│ 17 COCO keypoints as graph nodes │
│ Edges: anatomical connections │
│ (shoulder→elbow, hip→knee, etc.) │
│ │
│ GNN message passing (3 rounds): │
│ h_i^{l+1} = σ(W·h_i^l + Σ_j α_ij·h_j)│
│ α_ij = attention(h_i, h_j, edge_ij) │
│ │
│ Enforces anatomical constraints: │
│ - Limb length ratios │
│ - Joint angle limits │
│ - Left-right symmetry priors │
└─────────────────────────────────────────┘
│
├──────────────────┬──────────────────┐
▼ ▼ ▼
KeypointHead DensePoseHead ConfidenceHead
[B,17,H,W] [B,25+48,H,W] [B,1]
heatmaps parts + UV quality score
RuVector enhancement: ruvector-gnn replaces the flat spatial decoder with a graph neural network that operates on the human body graph. WiFi CSI is inherently noisy — GNN message passing between anatomically connected joints enforces that predicted keypoints maintain plausible body structure even when individual joint predictions are uncertain.
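One round of the message passing shown above, reduced to scalar features and uniform attention (α_ij = 1/deg(i)) so the sketch stays minimal; `ruvector-gnn` would use learned weight matrices and learned attention instead.

```rust
/// One message-passing round over an undirected body graph:
/// h_i' = σ(w · h_i + mean_j h_j), with σ = ReLU.
/// `edges` lists anatomical connections, e.g. (shoulder, elbow).
fn message_pass(h: &[f32], edges: &[(usize, usize)], w: f32) -> Vec<f32> {
    let n = h.len();
    let mut agg = vec![0.0f32; n];
    let mut deg = vec![0usize; n];
    for &(a, b) in edges {
        // undirected edge: both endpoints receive a message
        agg[a] += h[b];
        agg[b] += h[a];
        deg[a] += 1;
        deg[b] += 1;
    }
    (0..n)
        .map(|i| {
            let neigh = if deg[i] > 0 { agg[i] / deg[i] as f32 } else { 0.0 };
            (w * h[i] + neigh).max(0.0) // σ = ReLU
        })
        .collect()
}
```

Even in this toy form, each joint's update mixes in its anatomical neighbors, which is what pulls an outlier joint prediction back toward a plausible skeleton.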
Trained model weights (full precision)
│
▼
┌─────────────────────────────────────────────┐
│ ruvector-sparse-inference │
│ │
│ PowerInfer-style activation sparsity: │
│ - Profile neuron activation frequency │
│ - Partition into hot (always active, 20%) │
│ and cold (conditionally active, 80%) │
│ - Hot neurons: GPU/SIMD fast path │
│ - Cold neurons: sparse lookup on demand │
│ │
│ Quantization: │
│ - Backbone: INT8 (4x memory reduction) │
│ - DensePose head: FP16 (2x reduction) │
│ - ModalityTranslator: FP16 │
│ │
│ Target: <50ms inference on ESP32-S3 │
│ <10ms on x86 with AVX2 │
└─────────────────────────────────────────────┘
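The hot/cold split above can be sketched as a ranking over profiled activation frequencies. The 20/80 split is the target from the diagram; the helper below is a hypothetical illustration, not the `ruvector-sparse-inference` API.

```rust
/// Partition neuron indices into (hot, cold) by profiled activation
/// frequency: the top `hot_fraction` most frequently active neurons
/// go on the fast path, the rest are loaded on demand.
fn partition_neurons(freq: &[f32], hot_fraction: f32) -> (Vec<usize>, Vec<usize>) {
    let mut idx: Vec<usize> = (0..freq.len()).collect();
    // most frequently activated first (frequencies assumed non-NaN)
    idx.sort_by(|&a, &b| freq[b].partial_cmp(&freq[a]).unwrap());
    let n_hot = ((freq.len() as f32) * hot_fraction).ceil() as usize;
    let hot = idx[..n_hot].to_vec();
    let cold = idx[n_hot..].to_vec();
    (hot, cold)
}
```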
Primary dataset: MM-Fi (NeurIPS 2023) — 40 subjects, 27 actions, 114 subcarriers, 3 RX antennas, 17 COCO keypoints + DensePose UV annotations.
Secondary dataset: Wi-Pose — 12 subjects, 12 actions, 30 subcarriers, 3×3 antenna array, 18 keypoints.
┌──────────────────────────────────────────────────────────┐
│ Data Loading Pipeline │
│ │
│ MM-Fi .npy ──► Resample 114→56 subcarriers ──┐ │
│ (ruvector-solver NeumannSolver) │ │
│ ├──► Batch│
│ Wi-Pose .mat ──► Zero-pad 30→56 subcarriers ──┘ [B,T*│
│ ant, │
│ Phase sanitize ──► Hampel filter ──► unwrap sub] │
│ (wifi-densepose-signal::phase_sanitizer) │
│ │
│ Temporal buffer ──► ruvector-temporal-tensor │
│ (100 frames/sample, tiered quantization) │
└──────────────────────────────────────────────────────────┘
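The 114→56 resampling step in the box above can be illustrated with plain linear interpolation. The ADR proposes `ruvector-solver`'s Neumann solver for this; the stand-in below only shows the shape of the operation.

```rust
/// Resample a per-frame subcarrier vector (e.g. 114 MM-Fi subcarriers)
/// down to `target` subcarriers (e.g. 56 for ESP32) by linear
/// interpolation along the subcarrier axis.
fn resample_subcarriers(amp: &[f32], target: usize) -> Vec<f32> {
    let n = amp.len();
    assert!(n >= 2 && target >= 2);
    (0..target)
        .map(|i| {
            // map target index i onto the source subcarrier axis
            let pos = i as f32 * (n - 1) as f32 / (target - 1) as f32;
            let lo = pos.floor() as usize;
            let hi = (lo + 1).min(n - 1);
            let frac = pos - lo as f32;
            amp[lo] * (1.0 - frac) + amp[hi] * frac
        })
        .collect()
}
```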
For samples with 3D keypoints but no DensePose UV maps:

- Run the teacher model (Detectron2 DensePose) on the paired camera frames to produce `(part_labels [H,W], u_coords [H,W], v_coords [H,W])` pseudo-labels
- Save the pseudo-labels as `.npy` alongside the original data

The combined training loss:

L_total = λ_kp · L_keypoint // MSE on predicted vs GT heatmaps
+ λ_part · L_part // Cross-entropy on 25-class body part segmentation
+ λ_uv · L_uv // Smooth L1 on UV coordinate regression
+ λ_xfer · L_transfer // MSE between CSI features and teacher visual features
+ λ_ot · L_ot // Optimal transport regularization (ruvector-math)
+ λ_graph · L_graph // GNN edge consistency loss (ruvector-gnn)
RuVector enhancement: ruvector-math provides optimal transport (Wasserstein distance) as a regularization term. This penalizes predicted body part distributions that are far from the ground truth in the Wasserstein metric, which is more geometrically meaningful than pixel-wise cross-entropy for spatial body part segmentation.
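In one dimension the Wasserstein-1 distance reduces to the L1 distance between CDFs, which makes the geometric intuition easy to check. This is a minimal sketch of the idea only; `ruvector-math`'s actual optimal-transport API is not shown here.

```rust
/// Wasserstein-1 distance between two histograms on the same 1-D support
/// with unit bin spacing: the accumulated |CDF_p - CDF_q| is exactly the
/// mass × distance that must be transported to turn p into q.
fn wasserstein_1d(p: &[f32], q: &[f32]) -> f32 {
    assert_eq!(p.len(), q.len());
    let (mut cp, mut cq, mut d) = (0.0f32, 0.0f32, 0.0f32);
    for i in 0..p.len() {
        cp += p[i];
        cq += q[i];
        d += (cp - cq).abs();
    }
    d
}
```

Note how a mass shifted by two bins costs twice as much as a one-bin shift, whereas pixel-wise cross-entropy would penalize both identically.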
| Parameter | Value | Rationale |
|---|---|---|
| Optimizer | AdamW | Weight decay regularization |
| Learning rate | 1e-3, cosine decay to 1e-5 | Standard for modality translation |
| Batch size | 32 | Fits in 24GB GPU VRAM |
| Epochs | 100 | With early stopping (patience=15) |
| Warmup | 5 epochs | Linear LR warmup |
| Train/val split | Subjects 1-32 / 33-40 | Subject-disjoint for generalization |
| Augmentation | Time-shift ±5 frames, amplitude noise ±2dB, antenna dropout 10% | CSI-domain augmentations |
| Hardware | Single RTX 3090 or A100 | ~8 hours on A100 |
| Checkpoint | Every epoch, keep best-by-validation-PCK | Deterministic seed |
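The learning-rate schedule from the table (5-epoch linear warmup, then cosine decay from 1e-3 to 1e-5) can be written as a pure function of the epoch. The exact warmup/decay boundary handling is an assumption.

```rust
use std::f32::consts::PI;

/// LR schedule: linear warmup to 1e-3 over 5 epochs, then cosine decay
/// to a floor of 1e-5 over the remaining epochs.
fn lr_at_epoch(epoch: usize, total: usize) -> f32 {
    let (peak, floor, warmup) = (1e-3f32, 1e-5f32, 5usize);
    if epoch < warmup {
        // warmup starts above zero so the first step is not dead
        peak * (epoch + 1) as f32 / warmup as f32
    } else {
        let t = (epoch - warmup) as f32 / (total - warmup).max(1) as f32;
        floor + 0.5 * (peak - floor) * (1.0 + (PI * t).cos())
    }
}
```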
| Metric | Target | Description |
|---|---|---|
| PCK@0.2 | >70% on MM-Fi val | Percentage of correct keypoints (threshold = 0.2 × torso diameter) |
| OKS mAP | >0.50 on MM-Fi val | Object Keypoint Similarity, COCO-standard |
| DensePose GPS | >0.30 on MM-Fi val | Geodesic Point Similarity for UV accuracy |
| Inference latency | <50ms per frame | On x86 with ONNX Runtime |
| Model size | <25MB (FP16) | Suitable for edge deployment |
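The primary metric, PCK@0.2, is straightforward to compute from predicted and ground-truth keypoints; this hypothetical helper mirrors the table's definition (correct if within 0.2 × torso diameter).

```rust
/// PCK@0.2: fraction of keypoints whose Euclidean distance to ground
/// truth is below 0.2 × torso diameter.
fn pck_at_02(pred: &[(f32, f32)], gt: &[(f32, f32)], torso: f32) -> f32 {
    assert_eq!(pred.len(), gt.len());
    let thresh = 0.2 * torso;
    let correct = pred
        .iter()
        .zip(gt)
        .filter(|(p, g)| {
            let (dx, dy) = (p.0 - g.0, p.1 - g.1);
            (dx * dx + dy * dy).sqrt() < thresh
        })
        .count();
    correct as f32 / pred.len() as f32
}
```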
After offline training produces a base model, SONA enables continuous adaptation to new environments without retraining from scratch.
┌──────────────────────────────────────────────────────────┐
│ SONA Online Adaptation Loop │
│ │
│ Base model (frozen weights W) │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ LoRA Adaptation Matrices │ │
│ │ W_effective = W + α · A·B │ │
│ │ │ │
│ │ Rank r=4 for translator layers │ │
│ │ Rank r=2 for backbone layers │ │
│ │ Rank r=8 for DensePose head │ │
│ │ │ │
│ │ Total trainable params: ~50K │ │
│ │ (vs ~5M frozen base) │ │
│ └──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────┐ │
│ │ EWC++ Regularizer │ │
│ │ L = L_task + λ·Σ F_i(θ-θ*)² │ │
│ │ │ │
│ │ Prevents forgetting base model │ │
│ │ knowledge when adapting to new │ │
│ │ environment │ │
│ └──────────────────────────────────┘ │
│ │ │
│ ▼ │
│ Adaptation triggers: │
│ • First deployment in new room │
│ • PCK drops below threshold (drift detection) │
│ • User manually initiates calibration │
│ • Furniture/layout change detected (CSI baseline shift) │
│ │
│ Adaptation data: │
│ • Self-supervised: temporal consistency loss │
│ (pose at t should be similar to t-1 for slow motion) │
│ • Semi-supervised: user confirmation of presence/count │
│ • Optional: brief camera calibration session (5 min) │
│ │
│ Convergence: 10-50 gradient steps, <5 seconds on CPU │
└──────────────────────────────────────────────────────────┘
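The two formulas in the loop above, W_effective = W + α·A·B and the EWC++ penalty λ·Σ F_i(θ-θ*)², can be sketched directly on small dense matrices. Shapes and storage (row-major `Vec<f32>`) are illustrative; `ruvector-sona` applies this per layer.

```rust
/// LoRA: W_effective = W + α · A·B, with W of shape m×n, A of shape m×r,
/// B of shape r×n, all row-major. Only A and B (m·r + r·n values) are
/// trainable, versus m·n frozen base weights.
fn lora_effective(
    w: &[f32], a: &[f32], b: &[f32],
    m: usize, n: usize, r: usize, alpha: f32,
) -> Vec<f32> {
    let mut out = w.to_vec();
    for i in 0..m {
        for j in 0..n {
            let mut delta = 0.0f32;
            for k in 0..r {
                delta += a[i * r + k] * b[k * n + j];
            }
            out[i * n + j] += alpha * delta;
        }
    }
    out
}

/// EWC++ penalty λ·Σ F_i(θ_i - θ*_i)²: anchors adapted parameters θ to
/// the base-model reference θ*, weighted by Fisher information F.
fn ewc_penalty(theta: &[f32], theta_star: &[f32], fisher: &[f32], lambda: f32) -> f32 {
    theta
        .iter()
        .zip(theta_star)
        .zip(fisher)
        .map(|((t, ts), f)| f * (t - ts) * (t - ts))
        .sum::<f32>()
        * lambda
}
```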
ESP32 CSI (UDP :5005)
│
▼
Rust Axum server (port 8080)
│
├─► RuVector signal preprocessing (Stage 1)
│ 5 crates, ~2ms per frame
│
├─► ONNX Runtime inference (Stage 2)
│ Quantized model, ~10ms per frame
│ OR ruvector-sparse-inference, ~8ms per frame
│
├─► GNN post-processing (ruvector-gnn)
│ Anatomical constraint enforcement, ~1ms
│
├─► SONA adaptation check (Stage 4)
│ <0.05ms per frame (gradient accumulation only)
│
└─► Output: DensePose results
│
├──► /api/v1/stream/pose (WebSocket, 17 keypoints)
├──► /api/v1/pose/current (REST, full DensePose)
└──► /ws/sensing (WebSocket, raw + processed)
Total inference budget: <15ms per frame at 20 Hz on x86, <50ms on ESP32-S3 (with sparse inference).
The trained model is packaged as a single .rvf file that contains everything needed for
inference — no external weight files, no ONNX runtime, no Python dependencies.
wifi-densepose-v1.rvf (single file, ~15-30 MB)
┌───────────────────────────────────────────────────────────────┐
│ SEGMENT 0: Manifest (0x05) │
│ ├── Model ID: "wifi-densepose-v1.0" │
│ ├── Training dataset: "mmfi-v1+wipose-v1" │
│ ├── Training config hash: SHA-256 │
│ ├── Target hardware: x86_64, aarch64, wasm32 │
│ ├── Segment directory (offsets to all segments) │
│ └── Level-1 TLV manifest with metadata tags │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 1: Vec (0x01) — Model Weight Embeddings │
│ ├── ModalityTranslator weights [64→128→256→3, Conv1D+ConvT] │
│ ├── ResNet18 backbone weights [3→64→128→256, residual blocks] │
│ ├── KeypointHead weights [256→17, deconv layers] │
│ ├── DensePoseHead weights [256→25+48, deconv layers] │
│ ├── GNN body graph weights [3 message-passing rounds] │
│ └── Graph transformer attention weights [proof-gated layers] │
│ Format: flat f32 vectors, 768-dim per weight tensor │
│ Total: ~5M parameters → ~20MB f32, ~10MB f16, ~5MB INT8 │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 2: Index (0x02) — HNSW Embedding Index │
│ ├── Layer A: Entry points + coarse routing centroids │
│ │ (loaded first, <5ms, enables approximate search) │
│ ├── Layer B: Hot region adjacency for frequently │
│ │ accessed weight clusters (100ms load) │
│ └── Layer C: Full adjacency graph for exact nearest │
│ neighbor lookup across all weight partitions │
│ Use: Fast weight lookup for sparse inference — │
│ only load hot neurons, skip cold neurons via HNSW routing │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 3: Overlay (0x03) — Dynamic Min-Cut Graph │
│ ├── Subcarrier partition graph (sensitive vs insensitive) │
│ ├── Min-cut witnesses from ruvector-mincut │
│ ├── Antenna topology graph (ESP32 mesh spatial layout) │
│ └── Body skeleton graph (17 COCO joints, 16 edges) │
│ Use: Pre-computed graph structures loaded at init time. │
│ Dynamic updates via ruvector-mincut insert/delete_edge │
│ as environment changes (furniture moves, new obstacles) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 4: Quant (0x06) — Quantization Codebooks │
│ ├── INT8 codebook for backbone (4x memory reduction) │
│ ├── FP16 scale factors for translator + heads │
│ ├── Binary quantization tables for SIMD distance compute │
│ └── Per-layer calibration statistics (min, max, zero-point) │
│ Use: rvf-quant temperature-tiered quantization — │
│ hot layers stay f16, warm layers u8, cold layers binary │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 5: Witness (0x0A) — Training Proof Chain │
│ ├── Deterministic training proof (seed, loss curve, hash) │
│ ├── Dataset provenance (MM-Fi commit hash, download URL) │
│ ├── Validation metrics (PCK@0.2, OKS mAP, GPS scores)         │
│ ├── Ed25519 signature over weight hash │
│ └── Attestation: training hardware, duration, config │
│ Use: Verifiable proof that model weights match a specific │
│ training run. Anyone can re-run training with same seed │
│ and verify the weight hash matches the witness. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 6: Meta (0x07) — Model Metadata │
│ ├── COCO keypoint names and skeleton connectivity │
│ ├── DensePose body part labels (24 parts + background) │
│ ├── UV coordinate range and resolution │
│ ├── Input normalization statistics (mean, std per subcarrier)│
│ ├── RuVector crate versions used during training │
│ └── Environment calibration profiles (named, per-room) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 7: AggregateWeights (0x36) — SONA LoRA Deltas │
│ ├── Per-environment LoRA adaptation matrices (A, B per layer)│
│ ├── EWC++ Fisher information diagonal │
│ ├── Optimal θ* reference parameters │
│ ├── Adaptation round count and convergence metrics │
│ └── Named profiles: "lab-a", "living-room", "office-3f" │
│ Use: Multiple environment adaptations stored in one file. │
│ Server loads the matching profile or creates a new one. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 8: Profile (0x0B) — RVDNA Domain Profile │
│ ├── Domain: "wifi-csi-densepose" │
│ ├── Input spec: [B, T*ant, sub] CSI tensor format │
│ ├── Output spec: keypoints [B,17,H,W], parts [B,25,H,W], │
│ │ UV [B,48,H,W], confidence [B,1] │
│ ├── Hardware requirements: min RAM, recommended GPU │
│ └── Supported data sources: esp32, wifi-rssi, simulation │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 9: Crypto (0x0C) — Signature and Keys │
│ ├── Ed25519 public key for model publisher │
│ ├── Signature over all segment content hashes │
│ └── Certificate chain (optional, for enterprise deployment) │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 10: Wasm (0x10) — Self-Bootstrapping Runtime │
│ ├── Compiled WASM inference engine │
│ │ (ruvector-sparse-inference-wasm) │
│ ├── WASM microkernel for RVF segment parsing │
│ └── Browser-compatible: load .rvf → run inference in-browser │
│ Use: The .rvf file is fully self-contained — a WASM host │
│ can execute inference without any external dependencies. │
├───────────────────────────────────────────────────────────────┤
│ SEGMENT 11: Dashboard (0x11) — Embedded Visualization │
│ ├── Three.js-based pose visualization (HTML/JS/CSS) │
│ ├── Gaussian splat renderer for signal field │
│ └── Served at http://localhost:8080/ when model is loaded │
│ Use: Open the .rvf file → get a working UI with no install │
└───────────────────────────────────────────────────────────────┘
1. Read tail → find_latest_manifest() → SegmentDirectory
2. Load Manifest (seg 0) → validate magic, version, model ID
3. Load Profile (seg 8) → verify input/output spec compatibility
4. Load Crypto (seg 9) → verify Ed25519 signature chain
5. Load Quant (seg 4) → prepare quantization codebooks
6. Load Index Layer A (seg 2) → entry points ready (<5ms)
↓ (inference available at reduced accuracy)
7. Load Vec (seg 1) → hot weight partitions via Layer A routing
8. Load Index Layer B (seg 2) → hot adjacency ready (100ms)
↓ (inference at full accuracy for common poses)
9. Load Overlay (seg 3) → min-cut graphs, body skeleton
10. Load AggregateWeights (seg 7) → apply matching SONA profile
11. Load Index Layer C (seg 2) → complete graph loaded
↓ (full inference with all weight partitions)
12. Load Wasm (seg 10) → WASM runtime available (optional)
13. Load Dashboard (seg 11) → UI served (optional)
Progressive availability: Inference begins after step 6 (~5ms) with approximate results. Full accuracy is reached by step 9 (~500ms). This enables instant startup with gradually improving quality — critical for real-time applications.
After training completes, the model is packaged into an .rvf file:
# Build the RVF container from trained checkpoint
cargo run -p wifi-densepose-train --bin build-rvf -- \
--checkpoint checkpoints/best-pck.pt \
--quantize int8,fp16 \
--hnsw-build \
--sign --key model-signing-key.pem \
--include-wasm \
--include-dashboard ../../ui \
--output wifi-densepose-v1.rvf
# Verify the built container
cargo run -p wifi-densepose-train --bin verify-rvf -- \
--input wifi-densepose-v1.rvf \
--verify-signature \
--verify-witness \
--benchmark-inference
The sensing server loads the .rvf container at startup:
# Load model from RVF container
./target/release/sensing-server \
--model wifi-densepose-v1.rvf \
--source auto \
--ui-from-rvf # serve Dashboard segment instead of --ui-path
// In sensing-server/src/main.rs
use std::sync::Arc;

use rvf_runtime::RvfContainer;
use rvf_index::layers::IndexLayer;
use rvf_quant::QuantizedVec;

// Shared handle: the background loader and the request path both need it.
let container = Arc::new(RvfContainer::open("wifi-densepose-v1.rvf")?);

// Progressive load: Layer A first for instant startup
let index = container.load_index(IndexLayer::A)?;
let weights = container.load_vec_hot(&index)?; // hot partitions only

// Full load in background (clone the Arc so `container` stays usable below)
let bg = Arc::clone(&container);
tokio::spawn(async move {
    bg.load_index(IndexLayer::B).await?;
    bg.load_index(IndexLayer::C).await?;
    bg.load_vec_cold().await?; // remaining partitions
    Ok::<_, rvf_runtime::Error>(()) // error type assumed for illustration
});

// SONA environment adaptation
let sona_deltas = container.load_aggregate_weights("office-3f")?;
model.apply_lora_deltas(&sona_deltas);

// Serve embedded dashboard
let dashboard = container.load_dashboard()?;
// Mount at /ui/* routes in Axum
- Implement `MmFiDataset` in `wifi-densepose-train/src/dataset.rs`
- Parse `.npy` files with antenna correction (1TX/3RX → 3×3 zero-padding)
- Resample subcarriers with `ruvector-solver::NeumannSolver`
- Sanitize phase with `wifi-densepose-signal::phase_sanitizer`
- Implement `WiPoseDataset` for the secondary dataset
- Buffer temporal windows with `ruvector-temporal-tensor`
- `cargo test -p wifi-densepose-train` with dataset loading tests
- Add the `ruvector-graph-transformer` dependency to `wifi-densepose-train`
- Replace the `ModalityTranslator` bottleneck attention with the proof-gated graph transformer
- Add the `ruvector-gnn` dependency for body graph reasoning
- Export pseudo-labels as `.npy` for Rust loader consumption
- Implement `WiFiDensePoseTrainer` with the full loss function (6 terms)
- Add the `ruvector-math` optimal transport loss term
- Generate the training proof (`proof.rs`) with weight hash
- Integrate `ruvector-sona` into the inference pipeline
- Apply `ruvector-sparse-inference` hot/cold neuron partitioning
- Build `ruvector-sparse-inference-wasm` for browser inference
- Add the `build-rvf` binary in `wifi-densepose-train`
- Write weights into the `Vec` segment (`SegmentType::Vec`, 0x01)
- Quantize with `rvf-quant` (`SegmentType::Quant`, 0x06)
- Add the `verify-rvf` binary for container validation
- Produce the `wifi-densepose-v1.rvf` single-file container, verifiable and self-contained
- Load the `.rvf` container in `wifi-densepose-sensing-server` via `rvf-runtime`
- Replace the `derive_pose_from_sensing()` heuristic with trained model inference
- Add a `--model` CLI flag accepting an `.rvf` path (or legacy `.onnx`)
- Select the `AggregateWeights` segment based on the `--env` flag
- Serve the embedded dashboard at `/ui/*` when `--ui-from-rvf` is set
- Run end-to-end pose inference from the packaged `.rvf` file

| File | Purpose |
|---|---|
| rust-port/.../wifi-densepose-train/src/dataset_mmfi.rs | MM-Fi dataset loader with subcarrier resampling |
| rust-port/.../wifi-densepose-train/src/dataset_wipose.rs | Wi-Pose dataset loader |
| rust-port/.../wifi-densepose-train/src/graph_transformer.rs | Graph transformer integration |
| rust-port/.../wifi-densepose-train/src/body_gnn.rs | GNN body graph reasoning |
| rust-port/.../wifi-densepose-train/src/adaptation.rs | SONA LoRA + EWC++ adaptation |
| rust-port/.../wifi-densepose-train/src/trainer.rs | Training loop with multi-term loss |
| scripts/generate_densepose_labels.py | Teacher-student UV label generation |
| scripts/benchmark_inference.py | Inference latency benchmarking |
| rust-port/.../wifi-densepose-train/src/rvf_builder.rs | RVF container build pipeline |
| rust-port/.../wifi-densepose-train/src/bin/build_rvf.rs | CLI binary for building .rvf containers |
| rust-port/.../wifi-densepose-train/src/bin/verify_rvf.rs | CLI binary for verifying .rvf containers |
| File | Change |
|---|---|
| rust-port/.../wifi-densepose-train/Cargo.toml | Add ruvector-gnn, graph-transformer, sona, sparse-inference, math, rvf-types, rvf-wire, rvf-manifest, rvf-index, rvf-quant, rvf-crypto, rvf-runtime deps |
| rust-port/.../wifi-densepose-train/src/model.rs | Integrate graph transformer + GNN layers |
| rust-port/.../wifi-densepose-train/src/losses.rs | Add optimal transport + GNN edge consistency loss terms |
| rust-port/.../wifi-densepose-train/src/config.rs | Add training hyperparameters for new components |
| rust-port/.../sensing-server/Cargo.toml | Add rvf-runtime, rvf-types, rvf-index, rvf-quant deps |
| rust-port/.../sensing-server/src/main.rs | Add --model flag, load .rvf container, progressive startup, serve embedded dashboard |
- The entire trained pipeline ships as one `.rvf` file — deploy by copying a single file
- The heuristic pose path remains available behind a Cargo feature (`--features trained-model`)

| Risk | Mitigation |
|---|---|
| MM-Fi 114→56 interpolation loses accuracy | Train at native 114 as alternative; ESP32 mesh can collect 56-sub data natively |
| GNN overfits to training body types | Augment with diverse body proportions; Wi-Pose adds subject diversity |
| SONA adaptation diverges in adversarial environments | EWC++ regularization caps parameter drift; rollback to base weights on detection |
| Sparse inference degrades accuracy | Benchmark INT8 vs FP16 vs FP32; fall back to full precision if quality drops |
| Training proof hash changes with RuVector version updates | Pin ruvector crate versions in Cargo.toml; regenerate hash on version bumps |
ruQu ("Classical nervous system for quantum machines") provides real-time coherence
assessment via dynamic min-cut. While primarily designed for quantum error correction
(syndrome decoding, surface code arbitration), its core primitive — the CoherenceGate —
is architecturally relevant to WiFi CSI processing:
CoherenceGate uses ruvector-mincut to make real-time gate/pass decisions on
signal streams based on structural coherence thresholds. In quantum computing, this
gates qubit syndrome streams. For WiFi CSI, the same mechanism could gate CSI
subcarrier streams — passing only subcarriers whose coherence (phase stability across
antennas) exceeds a dynamic threshold.
Syndrome filtering (filters.rs) implements Kalman-like adaptive filters that
could be repurposed for CSI noise filtering — treating each subcarrier's amplitude
drift as a "syndrome" stream.
Min-cut gated transformer integration (optional feature) provides coherence-optimized
attention with 50% FLOP reduction — directly applicable to the ModalityTranslator
bottleneck.
Decision: ruQu is not included in the initial pipeline (Phase 1-8) but is marked as a
Phase 9 exploration candidate for coherence-gated CSI filtering. The CoherenceGate
primitive maps naturally to subcarrier quality assessment, and the integration path is
clean since ruQu already depends on ruvector-mincut.
The pipeline supports three data sources for training, used in combination:
| Source | Subcarriers | Pose Labels | Volume | Cost | When |
|---|---|---|---|---|---|
| MM-Fi (public) | 114 → 56 (interpolated) | 17 COCO + DensePose UV | 40 subjects, 320K frames | Free (CC BY-NC) | Phase 1 — bootstrap |
| Wi-Pose (public) | 30 → 56 (zero-padded) | 18 keypoints | 12 subjects, 166K packets | Free (research) | Phase 1 — diversity |
| ESP32 self-collected | 56 (native) | Teacher-student from camera | Unlimited, environment-specific | Hardware only ($54) | Phase 4+ — fine-tuning |
Recommended approach: use both public and ESP32-collected data.

1. **Pre-train on MM-Fi + Wi-Pose (public data, Phase 1-4):** Provides the base model with diverse subjects and actions. The 114→56 subcarrier interpolation is acceptable for learning general CSI-to-pose mappings.
2. **Fine-tune on ESP32 self-collected data (Phase 5+, SONA adaptation):** Collect 5-30 minutes of paired ESP32 CSI + camera data in each target environment. The camera serves as the teacher model (Detectron2 generates pseudo-labels). SONA LoRA adaptation takes <50 gradient steps to converge.
3. **Continuous adaptation (runtime):** SONA's self-supervised temporal consistency loss refines the model without any camera, using the assumption that poses change smoothly over short time windows.
This three-tier strategy combines diverse public pre-training data, low-cost environment-specific fine-tuning, and camera-free continuous adaptation at runtime.
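The self-supervised signal in tier 3 can be sketched as the mean squared keypoint displacement between consecutive frames, which should stay small during slow motion. This is a hypothetical helper illustrating the loss, not the SONA implementation.

```rust
/// Temporal consistency loss: mean squared 2-D displacement of keypoints
/// between consecutive frames. Large values for slow motion indicate
/// jittery predictions and drive camera-free adaptation.
fn temporal_consistency_loss(prev: &[(f32, f32)], curr: &[(f32, f32)]) -> f32 {
    assert_eq!(prev.len(), curr.len());
    let sum: f32 = prev
        .iter()
        .zip(curr)
        .map(|(p, c)| {
            let (dx, dy) = (c.0 - p.0, c.1 - p.1);
            dx * dx + dy * dy
        })
        .sum();
    sum / prev.len() as f32
}
```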