Files
rUv 9b07dff298 feat(beyond-sota): ADR-155 metric unification + ADR-156 RaBitQ Pass-2 (honest negative + latent topk bugfix) (#1053)
* refactor(train): hoist canonical PCK/OKS to un-gated metrics_core; fold test_metrics onto production (ADR-155 M1 §8)

ADR-155 §8 deferred item: test_metrics.rs reference kernels validated
production against their OWN reimplementation — a test that cannot catch a
canonical-impl bug (both could be wrong the same way).

- Extract canonical_torso_size / pck_canonical / oks_canonical / sigmas /
  bounding_box_diagonal into a new NON-tch-gated `metrics_core` module, so
  the single metric definition is reachable under
  `cargo test --no-default-features` (the `metrics` module is tch-gated).
  `metrics` re-exports every item → still exactly ONE implementation.
- Rewrite tests/test_metrics.rs to assert the PRODUCTION pck_canonical /
  oks_canonical equal hand-computed fixtures (not a reimplementation):
  canonical_pck_matches_hand_computed_fixture (corr=3/total=4/pck=0.75),
  hip↔hip normalizer pin, zero-visible⇒0.0, OKS perfect⇒1.0, fake-Gold pin.
- Keep an INDEPENDENT raw-threshold reference kernel only as a differential
  cross-check: test_kernel_agrees_with_canonical asserts it AGREES with
  canonical where torso==1.0 (genuine cross-check, not duplication).

Grade: MEASURED. test_metrics 10→12 tests, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(sensing-server): relabel divergent live PCK/OKS so they're never conflated with canonical (ADR-155 M1 §2.1/§8 Goal C)

Goal C named training_api.rs:804 (torso-HEIGHT PCK). Auditing it surfaced
TWO findings the ADR-155 §1 table missed:

1. training_api.rs is an ORPHAN file — not declared `mod` in lib.rs OR main.rs,
   so it does NOT compile into the crate. It does not drive the live server.
2. The REAL live `best_pck`/`best_oks` (main.rs training path → RVF metadata
   JSON read by model_manager.rs) come from trainer.rs:
   - `pck_at_threshold` = RAW-threshold PCK, NO torso normalization (the most
     divergent kind), printed/serialized as bare "PCK@0.2".
   - `oks_map` calls `oks_single(area=1.0)` = the EXACT fake-Gold pattern
     ADR-155 §2.1 claimed closed elsewhere — still live here, inflating best_oks.

Resolution = RELABEL (torso/raw math is load-bearing on different data; the
pub fns can't be renamed without breaking API; sensing-server has no train/
ndarray dep). Honest unify is a tracked §8 backlog item.

- training_api.rs: `compute_pck` → `compute_pck_torso_height` + divergence doc;
  val_pck/best_pck/val_oks struct fields documented as torso-HEIGHT proxies;
  logs say `pck_torso_h@0.2`. Test torso_pck_is_labelled_distinctly_from_canonical.
- trainer.rs (LIVE): `pck_at_threshold` documented raw-unnormalized; `oks_map`
  area=1.0 flagged fake-Gold; test pck_at_threshold_is_raw_unnormalized_not_canonical.
- main.rs: live print relabelled `pck_raw@0.2` / `oks_map(area=1.0 proxy)`.

No wire-format field renames (back-compat); no pub-API rename (no silent break).
Grade: MEASURED (relabel + divergence pinned). sensing-server 450→451 lib tests, 0 failed.

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr-155): mark §8 metric items RESOLVED + audit map + honest §1 under-count correction (M1b Goals A/D)

- §8.1: full PCK/OKS audit map (every def: file:line, basis, canonical/
  legacy/distinct), the two §8 items marked RESOLVED with resolution+why.
- Honest finding: §1's "seven divergent metrics" was an UNDER-count —
  sensing-server's LIVE trainer.rs has a raw-unnormalized PCK and an
  area=1.0 fake-Gold OKS the table omitted, and the file §8 named
  (training_api.rs) is orphaned dead code. §9 honest-limits updated.
- Goal D: metrics.rs *_v2 variants confirmed caller-less + deprecated;
  noted for future cleanup, NOT deleted (public API, tch-gated).
- CHANGELOG [Unreleased] Fixed entry.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(ruvector): RaBitQ Pass-2 randomized rotation + topk bugfix (ADR-156 §8)

Implements the deferred "Multi-bit / Extended RaBitQ Pass 2" backlog item
from ADR-156 §8: a deterministic randomized orthogonal rotation applied
before sign-quantization, the published RaBitQ construction (Gao & Long,
SIGMOD 2024).

Rotation construction: Fast Hadamard Transform + seeded ±1 sign flips
("HD" / randomized Hadamard), O(d log d) time and O(d) memory — a dense
d×d rotation is O(d²) and infeasible at the 65,535-d the wire format
provisions for. Pads to the next power of two; SplitMix64 seeds the sign
stream so index-time and query-time rotations are bit-identical.

API is additive and backward-compatible: Pass 1 (`from_embedding`) is
untouched; Pass 2 is opt-in via `Sketch::from_embedding_rotated` and
`SketchBank::with_rotation` (+ `insert_embedding` / `topk_embedding` /
`novelty_embedding` helpers that rotate consistently). Default behaviour
is unchanged.

While building the Pass-2 coverage harness, found and fixed a PRE-EXISTING
correctness bug in `SketchBank::topk`: the n>k heap path used
`BinaryHeap<Reverse<(d,id)>>` (a min-heap) but treated its peek as the
max, so it returned the k FARTHEST sketches as "nearest". The shipped unit
tests only exercised the n≤k fast path, so it went unnoticed. Fixed to a
plain max-heap; pinned by `topk_heap_path_returns_nearest` and
`tight_clusters_give_high_coverage_with_overfetch` (the latter measured
0.072 on the old code).

New tests (+17, 100→117 in the crate): rotation determinism/norm-preservation
(`rotation_is_deterministic_for_seed`, `rotation_preserves_norm`), Pass-2
shape-compatibility, `pass2_coverage_not_worse_than_pass1`, and a
deterministic coverage report.

MEASURED top-K coverage (anisotropic planted-cluster fixture, cosine ground
truth; dim=128 N=2048 K=8 64 clusters noise=0.35 128 queries):
  candidate_k=K=8 : Pass1 36.13% -> Pass2 46.39%  (both << 90% bar)
  candidate_k=24  : Pass1 83.89% -> Pass2 91.60%  (Pass2 clears 90%)
  candidate_k=32  : Pass1/Pass2 100%
Honest result: rotation consistently helps (+10pp at strict K), but neither
pass clears the ADR-084 90% bar at candidate_k==K on this distribution.
Pass 2 reaches 90% only with ~3x over-fetch (the ADR-084 "candidate set"
deployment pattern). Multi-bit Pass 3 evaluated separately.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(ruvector): multi-bit Pass-3 experiment + ADR-156/084 measured results

Adds the multi-bit half of the ADR-156 §8 "Multi-bit / Extended RaBitQ"
item as a MEASURED experiment (coverage::measure_multibit): rotate, then
b-bit uniform scalar-quantize each coord, rank by L1 over codes — the
natural multi-bit generalization of hamming. Measures the bit/coverage
tradeoff the backlog item asked for.

MEASURED at the strict bar (candidate_k=K=8, anisotropic planted-cluster
fixture, cosine ground truth):
  Pass1 (1-bit, no rot)  36.13%   16 B/vec
  Pass2 (1-bit, rot)     46.39%   16 B/vec
  Pass3 (rot, 2-bit)     54.39%   32 B/vec
  Pass3 (rot, 3-bit)     66.70%   48 B/vec
  Pass3 (rot, 4-bit)     74.22%   64 B/vec
Honest: multi-bit monotonically helps but even 4-bit (4x memory) reaches
only 74% at the strict bar — neither rotation nor <=4-bit multi-bit clears
the strict-K 90% bar on this distribution. The bar is met via over-fetch
(Pass2 @ candidate_k=24). Tests: multibit_tradeoff_report,
multibit_1bit_matches_pass2_approx (+ sanity that 1-bit ~= Pass-2).

Docs:
- ADR-156 §8 item #2 marked RESOLVED-PARTIAL; §5 #2 grade CLAIMED ->
  MEASURED-on-our-hardware; new §10 with full measured tables, the topk
  bugfix disclosure, and graded deferred sub-items.
- ADR-084: "Pass 2" section answering the rotation open-question with
  measured numbers + the topk bug note.
- CHANGELOG [Unreleased]: Added (Pass-2 milestone) + Fixed (topk heap).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-13 16:02:18 -04:00
..

Architecture Decision Records

This folder contains 45 Architecture Decision Records (ADRs) that document every significant technical choice in the RuView / WiFi-DensePose project.

Why ADRs?

Building a system that turns WiFi signals into human pose estimation involves hundreds of non-obvious decisions: which signal processing algorithms to use, how to bridge ESP32 firmware to a Rust pipeline, whether to run inference on-device or on a server, how to handle multi-person separation with limited subcarriers.

ADRs capture the context, options considered, decision made, and consequences for each of these choices. They serve three purposes:

  1. Institutional memory — Six months from now, anyone (human or AI) can read why we chose IIR bandpass filters over FIR for vital sign extraction, not just see the code.

  2. AI-assisted development — When an AI agent works on this codebase, ADRs give it the constraints and rationale it needs to make changes that align with the existing architecture. Without them, AI-generated code tends to drift — reinventing patterns that already exist, contradicting earlier decisions, or optimizing for the wrong tradeoffs.

  3. Review checkpoints — Each ADR is a reviewable artifact. When a proposed change touches the architecture, the ADR forces the author to articulate tradeoffs before writing code, not after.

ADRs and Domain-Driven Design

The project uses Domain-Driven Design (DDD) to organize code into bounded contexts — each with its own language, types, and responsibilities. ADRs and DDD work together:

  • ADRs define boundaries: ADR-029 (RuvSense) established multistatic sensing as a separate bounded context from single-node CSI. ADR-042 (CHCI) defined a new aggregate root for coherent channel imaging.
  • DDD models define the language: The RuvSense domain model defines terms like "coherence gate", "dwell time", and "TDM slot" that ADRs reference precisely.
  • Together they prevent drift: An AI agent reading ADR-039 knows that edge processing tiers are configured via NVS keys, not compile-time flags — because the ADR says so. The DDD model tells it which aggregate owns that configuration.

How ADRs are structured

Each ADR follows a consistent format:

  • Context — What problem or gap prompted this decision
  • Decision — What we chose to do and how
  • Consequences — What improved, what got harder, and what risks remain
  • References — Related ADRs, papers, and code paths

Statuses: Proposed (under discussion), Accepted (approved and/or implemented), Superseded (replaced by a later ADR).


ADR Index

Hardware and firmware

ADR Title Status
ADR-012 ESP32 CSI Sensor Mesh for Distributed Sensing Accepted (partial)
ADR-018 ESP32 Development Implementation Path Proposed
ADR-028 ESP32 Capability Audit and Witness Record Accepted
ADR-029 RuvSense Multistatic Sensing Mode (TDM, channel hopping) Proposed
ADR-032 Multistatic Mesh Security Hardening Accepted
ADR-039 ESP32-S3 Edge Intelligence Pipeline (on-device vitals) Accepted (hardware-validated)
ADR-040 WASM Programmable Sensing (Tier 3) Accepted
ADR-041 WASM Module Collection (65 edge modules) Accepted (hardware-validated)
ADR-044 Provisioning Tool Enhancements Proposed
ADR-110 ESP32-C6 firmware extension — Wi-Fi 6 / 802.15.4 / TWT / LP-core Accepted, P1-P10 complete, firmware-side substrate closed at v0.7.0-esp32. Companion docs: WITNESS-LOG-110 (13 §A0.x entries · 99.56 % cross-board RX · 104.1 µs smoothed sync stdev · ≤100 µs target met), ADR-110-REVIEW-GUIDE (one-page reviewer tour), ADR-110-BRANCH-STATE (coordination map vs feat/adr-115-ha-mqtt-matter). Host decoders + tests: Python SyncPacketParser (10) + Rust wifi_densepose_hardware::SyncPacket (15), cross-language hex pin gates drift.

Signal processing and sensing

ADR Title Status
ADR-013 Feature-Level Sensing on Commodity Gear Accepted
ADR-014 SOTA Signal Processing Algorithms Accepted
ADR-021 Vital Sign Detection (breathing, heart rate) Partial
ADR-030 Persistent Field Model and Drift Detection Proposed
ADR-033 CRV Signal Line Sensing Integration Proposed
ADR-037 Multi-Person Pose Detection from Single ESP32 Proposed
ADR-042 Coherent Human Channel Imaging (beyond CSI) Proposed
ADR-134 First-Class Channel Impulse Response (CIR) Support Proposed
ADR-135 Empty-Room Baseline Calibration (per-subcarrier Welford statistics) Proposed

Machine learning and training

ADR Title Status
ADR-005 SONA Self-Learning for Pose Estimation Partial
ADR-006 GNN-Enhanced CSI Pattern Recognition Partial
ADR-015 Public Dataset Strategy (MM-Fi, Wi-Pose) Accepted
ADR-016 RuVector Training Pipeline Integration Accepted
ADR-017 RuVector Signal + MAT Integration Proposed
ADR-020 Migrate AI Inference to Rust (ONNX Runtime) Accepted
ADR-023 Trained DensePose Model with RuVector Pipeline Proposed
ADR-024 Project AETHER: Contrastive CSI Embeddings Required
ADR-027 Project MERIDIAN: Cross-Environment Generalization Proposed
ADR-149 AetherArena: public spatial-intelligence benchmark on Hugging Face Proposed
ADR-150 RF Foundation Encoder: pose-preserving, subject/room/device-invariant CSI embedding Proposed
ADR-151 Per-Room Calibration & Specialized Model Training (room-first → bank of small ruVector specialists) Proposed
ADR-152 WiFi-Pose SOTA 2026 Intake: geometry-conditioned calibration, external benchmarks, foundation-encoder recipe Proposed

Platform and UI

ADR Title Status
ADR-019 Sensing-Only UI with Gaussian Splats Accepted
ADR-022 Windows WiFi Enhanced Fidelity (multi-BSSID) Partial
ADR-025 macOS CoreWLAN WiFi Sensing Proposed
ADR-031 RuView Sensing-First RF Mode Proposed
ADR-034 Expo React Native Mobile App Accepted
ADR-035 Live Sensing UI Accuracy and Data Transparency Accepted
ADR-036 Training Pipeline UI Integration Proposed
ADR-043 Sensing Server UI API Completion (14 endpoints) Accepted
ADR-115 Home Assistant integration via MQTT auto-discovery + Matter bridge (HA-DISCO + HA-FABRIC + HA-MIND) Accepted (MQTT track) / Proposed (Matter SDK P8b)
ADR-169 adam-mode — light theme toggle for the three.js realtime demo Proposed
ADR-170 yoga-mode — yoga pose detection, classification, and scoring for the three.js realtime demo Proposed

Architecture and infrastructure

ADR Title Status
ADR-001 WiFi-Mat Disaster Detection Architecture Accepted
ADR-002 RuVector RVF Integration Strategy Superseded
ADR-003 RVF Cognitive Containers for CSI Proposed
ADR-004 HNSW Vector Search for Fingerprinting Partial
ADR-007 Post-Quantum Cryptography for Sensing Proposed
ADR-008 Distributed Consensus for Multi-AP Proposed
ADR-009 RVF WASM Runtime for Edge Deployment Proposed
ADR-010 Witness Chains for Audit Trail Integrity Proposed
ADR-011 Proof-of-Reality and Mock Elimination Proposed
ADR-026 Survivor Track Lifecycle (MAT crate) Accepted
ADR-038 Sublinear GOAP for Roadmap Optimization Proposed
ADR-095 rvCSI — Edge RF Sensing Runtime Platform Proposed
ADR-096 rvCSI — Crate Topology, the napi-c Shim, and the napi-rs Node Surface Proposed
ADR-097 Adopt rvCSI as RuView's primary CSI runtime (phased adoption) Proposed
ADR-098 Evaluate ruvnet/midstream for RuView's CSI / WebSocket / mesh pipeline Rejected
ADR-099 Adopt midstream as RuView's real-time introspection + low-latency tap Proposed

  • DDD Domain Models — Bounded context definitions, aggregate roots, and ubiquitous language
  • User Guide — Setup, API reference, and hardware instructions
  • Build Guide — Building from source