mirror of
https://github.com/ruvnet/RuView
synced 2026-06-14 11:03:18 +00:00
Compare commits
10 Commits
| SHA1 |
|---|
| f756a8af49 |
| 261ce80a72 |
| 0c2b1c16cc |
| 1d12e8831a |
| 8c24b8bdfe |
| 91248536bc |
| 865f9dee77 |
| cf2a85db66 |
| 9b07dff298 |
| 42dcf49f4d |
@@ -16,6 +16,15 @@ firmware/esp32-csi-node/sdkconfig.defaults.bak
# ESP-IDF set-target backup (local only)
firmware/esp32-hello-world/sdkconfig.old

# Host-built firmware test binaries (compiled from test/*.c, not source)
firmware/esp32-csi-node/test/test_adr110
firmware/esp32-csi-node/test/test_vitals
firmware/esp32-csi-node/test/fuzz_serialize
firmware/esp32-csi-node/test/fuzz_edge
firmware/esp32-csi-node/test/fuzz_nvs
firmware/esp32-csi-node/test/*.exe
firmware/esp32-csi-node/test/*.obj

# Claude Flow swarm runtime state
.swarm/
@@ -18,3 +18,6 @@
path = v2/crates/ruv-neural
url = https://github.com/ruvnet/ruv-neural.git
branch = main
[submodule "vendor/rufield"]
path = vendor/rufield
url = https://github.com/ruvnet/rufield
+36
-2
@@ -7,14 +7,48 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Added
- **ADR-261: RuVector graph-ANN index — a real HNSW baseline + a SymphonyQG-style quantized variant, MEASURED (honest negative).** Closes the [ADR-156 §5 #1](docs/adr/ADR-156-ruvector-fusion-beyond-sota.md) gap: the SymphonyQG (SIGMOD 2025) **3.5–17× QPS-over-HNSW** claim was CLAIMED-only because **no HNSW baseline existed to compare against**. This adds one. New pure-Rust, `--no-default-features`-buildable modules in `wifi-densepose-ruvector`: `hnsw.rs` (a correct float HNSW — Malkov & Yashunin: multi-layer NSW graph, `ef_construction`/`ef_search`, Algorithm-4 neighbour selection, **seeded-deterministic** level assignment via SplitMix64, L2 + cosine, full degenerate-case guards), `hnsw_quantized.rs` (the SymphonyQG-style variant — the **same** graph traversed by a cheap **1-bit Hamming** score over the RaBitQ Pass-2 rotated sign code, then **exact-float rerank**), `ann_measure.rs` + `benches/ann_bench.rs` (one shared deterministic planted-cluster fixture; the `ann_bench_report` test is the source of truth). **MEASURED (dim=128, N=10k, K=10, `--release`):** float HNSW = **~25× QPS over linear scan at recall ≥0.99** (the baseline this gap needed; recall@10 correctness gate ≥0.95 holds, L2 + cosine). **Honest negative:** the 1-bit quantized traversal is **too coarse to beat float HNSW at equal recall at this scale** — its best recall is **0.738**, never reaching the ≥0.90 equal-recall point, so there is **no QPS win** over float HNSW; the 3.5–17× is **not reproduced** by our 1-bit construction here. The recall gate also **caught a real index-out-of-bounds bug** in the insert path (disclosed in ADR-261 §4). Caveat: this is **our** HNSW + **our** 1-bit quant, not SymphonyQG's exact system — it tests the *direction* of the claim, with the expected crossover at large N + a multi-bit traversal code. **We did not tune to manufacture a speedup.** +20 tests (ruvector lib 131→151, 0 failed). ADR-156 §5 #1 / §8 backlog: CLAIMED → **MEASURED-direction-tested**. 
Python deterministic proof unchanged (off the signal proof path).
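The quantized scoring idea in the ADR-261 entry (a cheap 1-bit Hamming score first, then an exact-float rerank) can be sketched in isolation. This is a minimal illustration, not the `hnsw_quantized.rs` code: it scans all vectors instead of traversing a graph, it skips the RaBitQ rotation, and every function name below is hypothetical.

```rust
/// Pack the sign of each component into 64-bit words (bit set = non-negative).
/// Assumption: plain sign bits, with no rotation applied beforehand.
pub fn sign_code(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, x) in v.iter().enumerate() {
        if *x >= 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance between two packed sign codes.
pub fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Squared Euclidean distance (exact float score used for the rerank).
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Shortlist by Hamming distance, then rerank the shortlist by exact L2.
/// Returns the indices of the `k` best candidates after the rerank.
pub fn search_with_rerank(
    query: &[f32],
    data: &[Vec<f32>],
    shortlist: usize,
    k: usize,
) -> Vec<usize> {
    let q_code = sign_code(query);
    let mut coarse: Vec<(u32, usize)> = data
        .iter()
        .enumerate()
        .map(|(i, v)| (hamming(&q_code, &sign_code(v)), i))
        .collect();
    coarse.sort_unstable();
    coarse.truncate(shortlist);

    let mut exact: Vec<(f32, usize)> = coarse
        .iter()
        .map(|&(_, i)| (l2_sq(query, &data[i]), i))
        .collect();
    exact.sort_by(|a, b| a.0.total_cmp(&b.0));
    exact.into_iter().take(k).map(|(_, i)| i).collect()
}
```

The shortlist size is what trades recall against speed: if the true neighbour falls outside the Hamming shortlist, the rerank cannot recover it, which is consistent with the recall ceiling the entry reports for a 1-bit code.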
- **ADR-260: RuField MFS — the open specification for camera-free multimodal field sensing.** A common event / tensor / calibration / privacy / provenance model that sits *above* WiFi CSI/CIR/BFLD, UWB, BLE Channel Sounding, mmWave radar, ultrasound, subsonic, infrared, and future quantum sensors (each modality emits a normalized `FieldEvent` → `FieldTensor` → `FusionGraph` → `PrivacyClass` → `ProvenanceReceipt`). Published as a **standalone repo** [`ruvnet/rufield`](https://github.com/ruvnet/rufield) and vendored here as the `vendor/rufield` submodule (the `vendor/rvcsi` pattern — not a `v2/` workspace member). The v0.1 reference stack is a self-contained 6-crate Rust workspace (`rufield-core`, `-provenance` [sha256 + ed25519], `-privacy` [P0–P5 guard], `-adapters` [deterministic `SyntheticSim` across wifi_csi/mmwave_radar/infrared_thermal], `-fusion` [graph + TOML weighted-Bayes rules → 7 room-state inferences], `-bench` [deterministic runner + the §31 acceptance test]). **60 tests / 0 failed, clippy-clean.** §27 acceptance criteria 1–8 and 10 PASS; the live dashboard (9) is deferred. **All benchmark metrics are SYNTHETIC** (scored against the simulator's own ground truth — presence/breathing/bed_exit/room_transition F1 = 1.000, nocturnal_scratch 0.923 reported honestly, p95 latency ~0.01 ms, provenance coverage 100%, 0 privacy violations) — they prove the pipeline recovers known truth, **not** field accuracy; real hardware adapters (ESP32 CSI, mmWave, thermal IR) are a documented roadmap item, none validated in v0.1. The Python deterministic proof is unchanged (rufield is off the signal-processing proof path).
### Security
- **ADR-157 Milestone-1 B4 - constant-time HMAC sync-beacon tag compare (`wifi-densepose-hardware`).** `AuthenticatedBeacon::verify` compared the 8-byte HMAC-SHA256 tag with `self.hmac_tag == expected`, which short-circuits on the first differing byte and leaks, through verification latency, how many leading bytes an attacker's forged tag matched - a byte-by-byte tag-recovery oracle (~256*N trials instead of 256^N). Replaced with a hand-rolled branch-free `constant_time_tag_eq` (XOR-accumulate every byte difference into a single `u8`, no early exit, `#[inline(never)]` + `core::hint::black_box` to stop the optimizer reintroducing a short-circuit or a non-constant-time `memcmp`). **No new dependency** - ADR-157 had deferred this only to avoid adding the `subtle` crate; a fixed 8-byte compare needs none. Grade MEASURED (constant-time *construction*; micro-timing on a noisy host is a smoke check only, gated `#[ignore]`). Pinned by `tag_compare_is_constant_time_shape` (equal/first-differ/last-differ/all-differ/length-mismatch + an end-to-end `verify()` last-byte tamper), proven to fail on a last-byte-skipping constant-time bug. ADR-157 §8 B4 -> RESOLVED.
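For readers unfamiliar with the pattern, the branch-free compare described in the B4 entry looks roughly like the sketch below. The name `constant_time_tag_eq`, `#[inline(never)]` and `black_box` come from the entry; the fixed `[u8; 8]` signature is an assumption (the real function may take slices, since the entry mentions a length-mismatch case).

```rust
use std::hint::black_box;

/// Compare two 8-byte tags without an early exit.
/// Assumption: fixed-size arrays; the shipped signature is not shown in the changelog.
#[inline(never)]
pub fn constant_time_tag_eq(a: &[u8; 8], b: &[u8; 8]) -> bool {
    let mut diff: u8 = 0;
    // XOR-accumulate every byte difference; the loop always runs all 8 iterations.
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    // black_box discourages the optimizer from rewriting this as a short-circuit compare.
    black_box(diff) == 0
}
```

Because the result depends only on the accumulated `diff`, a mismatch in the first byte and a mismatch in the last byte take the same code path, which is the property the entry's shape test pins.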
- **ADR-080 open HIGH findings closed on the Rust `wifi-densepose-sensing-server` boundary (ADR-164 G11).** The QE sweep's three HIGH findings — XFF-spoofing bypass, leaked stack traces, JWT-in-URL (CWE-598) — were logged against the Python v1 API and never re-verified against the shipped Rust sensing-server; the HOMECORE/M7 sweep (ADR-161) covered `homecore-server`, not this crate.
- **#2 leaked internal errors (the one live exposure) — FIXED.** Six handlers in `main.rs` serialized the internal error `Display` straight into the JSON response body: `edge_registry_endpoint` returned a panicked `spawn_blocking` `JoinError` (`"task … panicked"`) in a `500`, plus the raw upstream error in a `503`; `delete_model`/`delete_recording`/`start_recording` returned `std::io::Error` strings (OS detail / path); `calibration_start`/`calibration_stop` returned the `FieldModel` error chain. New `error_response` module logs the full detail **server-side only** (with a correlation id) and returns a generic body (`{"error":"internal_error","correlation_id":…}`) — no `panicked`, no file paths, no Debug chain. 5 module tests (a leak-substring guard proven to fail on the reverted old body) + the existing handler suite.
- **#1 XFF-spoofing bypass — VERIFIED ABSENT, regression-pinned.** The sensing-server has no XFF-trusting control to bypass: there is no IP-based rate-limiter or IP-allowlist, and neither `bearer_auth` (token-only) nor `host_validation` (Host-header only) reads `X-Forwarded-For`/`X-Forwarded-Host` (no `forwarded`/`peer_addr`/`client_ip` anywhere in the crate). Added regression tests proving a spoofed `X-Forwarded-For` never flips an auth decision and a spoofed `X-Forwarded-Host` never bypasses the Host allowlist.
- **#3 JWT-in-URL (CWE-598) — VERIFIED ABSENT, regression-pinned.** `require_bearer` reads the token only from the `Authorization` header; the WebSocket handlers take no token query param and the sole `Query` extractor (`EdgeRegistryParams`) is a non-secret `refresh` flag. Added a regression proving `?token=`/`?access_token=` in the URL never authenticates while the header path still does.
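The "log the detail server-side, return a generic body" approach from finding #2 can be illustrated with a small std-only sketch. It is not the `error_response` module: the function name, the `u64` correlation id and its hex formatting are all hypothetical, and the changelog does not show the real id format.

```rust
/// Log the full internal detail to stderr and return a generic JSON body.
/// Assumptions: the id is a u64 rendered as 16 hex digits; logging goes to stderr.
pub fn internal_error_body(correlation_id: u64, detail: &str) -> String {
    // Full detail stays on the server, keyed by the correlation id.
    eprintln!("internal error [{correlation_id:016x}]: {detail}");
    // The client only sees a fixed error code plus the id it can quote in a report.
    format!("{{\"error\":\"internal_error\",\"correlation_id\":\"{correlation_id:016x}\"}}")
}
```

A leak guard in the spirit of the entry's module tests then only has to assert that substrings of the internal detail (for example `panicked` or a file path) never appear in the returned body.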
### Fixed
- **ESP32 vitals: `n_persons` over-counted (reported 4 for one person) + presence flag flickered at close range (#998, #996).** Two firmware logic bugs in `firmware/esp32-csi-node/main/edge_processing.c`, both robustness/logic fixes — **not** validated-accuracy claims (true count/PCK vs labelled ground truth stays hardware/data-gated on the COM9 ESP32-S3).
- **#998 over-count — root cause + fix.** `update_multi_person_vitals()` split the top-K subcarriers into `top_k_count/2` groups and marked **every** group `active` unconditionally, so one body's multipath always reported the full `EDGE_MAX_PERSONS` (=4). New pure, host-testable `count_distinct_persons()` gates each candidate group: (1) **energy gate** — a group's phase variance must be ≥ `EDGE_PERSON_MIN_ENERGY_RATIO` (0.35) × the strongest group's, so weak multipath echoes don't count; (2) **spatial dedup** — groups whose representative subcarriers sit within `EDGE_PERSON_MIN_SC_SEP` (4) of each other are the same body. A `person_count_debounce()` then requires the gated count to hold `EDGE_PERSON_PERSIST_FRAMES` (3) consecutive frames before it's emitted, so a single noisy frame can't promote a phantom. The strongest group always counts (a present body yields ≥1). All thresholds are named, documented constants in `edge_processing.h`.
- **#996 presence flicker — root cause + fix.** Presence was a bare `score > threshold` compare on a noisy `presence_score` (field-observed 2.6–26.7 frame-to-frame for one stationary person), so the boolean chattered at the boundary while the score clearly indicated a person. New pure `presence_flag_update()` is a Schmitt trigger + clear-debounce: assert above `threshold`, **hold** in the dead band down to `threshold × EDGE_PRESENCE_HYST_RATIO` (0.5), and only clear after the score stays below the low threshold for `EDGE_PRESENCE_CLEAR_FRAMES` (5) consecutive frames. The score itself is unchanged (and still emitted at packet offset 20 for consumer-side thresholding). Constants named/documented in `edge_processing.h`.
- **Tests:** `firmware/esp32-csi-node/test/test_vitals_count_presence.c` (host C99, `make run_vitals`) — 13 cases / 22 assertions, all passing under gcc 13 `-Wall -Wextra`. Pins: single-strong-signature + multipath → count==1; two well-separated → count==2; two strong-but-adjacent → 1 (dedup); transient count spike rejected; sustained change accepted; dithering presence trace → stable flag (no flicker); genuine departure → clears within hold window. The named tuning constants are `#include`d from the real header so the test and firmware can't disagree. **Hardware-gated caveat:** these pin the decision *logic*; the exact energy/separation/hysteresis values that best match a real room vs labelled occupancy remain on-device tuning (COM9 ESP32-S3 + ground truth).
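The two decision rules above (energy gate plus spatial dedup for #998, Schmitt trigger plus clear-debounce for #996) can be sketched as follows. The firmware itself is C; this is a Rust transliteration for illustration only. The constant values come from the entries, but the data layout, the struct, and the reading of "within 4 subcarriers" as an inclusive distance are assumptions.

```rust
pub const PERSON_MIN_ENERGY_RATIO: f32 = 0.35; // mirrors EDGE_PERSON_MIN_ENERGY_RATIO
pub const PERSON_MIN_SC_SEP: usize = 4; // mirrors EDGE_PERSON_MIN_SC_SEP
pub const PRESENCE_HYST_RATIO: f32 = 0.5; // mirrors EDGE_PRESENCE_HYST_RATIO
pub const PRESENCE_CLEAR_FRAMES: u32 = 5; // mirrors EDGE_PRESENCE_CLEAR_FRAMES

/// Count distinct bodies. Each group is (representative subcarrier, phase variance).
/// Assumption: groups at most PERSON_MIN_SC_SEP apart are treated as the same body.
pub fn count_distinct_persons(groups: &[(usize, f32)]) -> usize {
    let mut sorted: Vec<(usize, f32)> = groups.to_vec();
    sorted.sort_by(|a, b| b.1.total_cmp(&a.1)); // strongest first
    let Some(&(_, strongest)) = sorted.first() else {
        return 0;
    };
    let mut kept: Vec<usize> = Vec::new();
    for &(sc, energy) in &sorted {
        let strong_enough = energy >= PERSON_MIN_ENERGY_RATIO * strongest;
        let distinct = kept.iter().all(|&k| k.abs_diff(sc) > PERSON_MIN_SC_SEP);
        if strong_enough && distinct {
            kept.push(sc);
        }
    }
    kept.len()
}

/// Hysteresis state for the presence flag (hypothetical layout).
#[derive(Default)]
pub struct PresenceState {
    pub present: bool,
    below_count: u32,
}

/// Assert above `threshold`, hold inside the dead band, and clear only after
/// PRESENCE_CLEAR_FRAMES consecutive frames below the low threshold.
pub fn presence_flag_update(st: &mut PresenceState, score: f32, threshold: f32) -> bool {
    let low = threshold * PRESENCE_HYST_RATIO;
    if score > threshold {
        st.present = true;
        st.below_count = 0;
    } else if st.present {
        if score < low {
            st.below_count += 1;
            if st.below_count >= PRESENCE_CLEAR_FRAMES {
                st.present = false;
                st.below_count = 0;
            }
        } else {
            st.below_count = 0; // dead band: hold the flag, restart the clear timer
        }
    }
    st.present
}
```

The count debounce (`person_count_debounce`, 3 consecutive frames) is left out of the sketch; it follows the same consecutive-frame counter pattern as the clear-debounce.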
- **Observatory 3D figure never animated — `/ws/sensing` omitted per-person `position`/`motion_score`/`pose` (#1050).** The `sensing_update` frame shipped `nodes`/`features`/`classification`/`signal_field` and a `persons[]` carrying only image-space `keypoints`/`bbox`/`zone`; the Observatory's `FigurePool`/`PoseSystem` (and `demo-data.js`'s own contract) animate each figure from `persons[i].position` (room-world `[x,y,z]`), `persons[i].motion_score` (0..100), and `persons[i].pose`, none of which the live stream emitted — so the figure sat static while signal metrics updated. **Honest scope (Case 2 — no calibrated per-person localizer exists):** a single ESP32 link does not produce calibrated room-coordinate localization or per-person skeletal pose, so the fix emits only what is *truthfully derivable*. New `field_localize` module reads the **strongest peak(s)** out of the frame's real `signal_field` grid (already built from measured subcarrier variances × measured motion-band power) and maps the peak cell to Observatory world coordinates with the **exact** `_buildSignalField` transform (`x=(ix−nx/2)·0.6`, `z=(iz−nz/2)·0.5`, `y=0`), so the figure lands on the field hotspot it stands on. `motion_score` is the measured `motion_band_power` passed through (clamped 0..100); `pose` is set **only** from a real aggregate `posture` estimate when one exists, else `None` (never a fabricated skeleton — per-person pose keypoints in room coordinates stay gated on the pose model + ADR-079 paired data). An empty / below-threshold field yields `persons: []` (no phantom person); a present person on a field with no resolvable peak keeps `position=[0,0,0]` (not invented coords) while `motion_score` stays real. `attach_field_positions` runs after the tracker step at all five broadcast sites. **No UI change required** — the Observatory already reads these fields and defaults `pose`→`'standing'` when absent. 
New `PersonDetection.position`/`motion_score`/`pose` fields added to both the `main.rs`-local and `types.rs` structs. Pinned by 10 tests: `field_localize` peak-extraction/coordinate-mapping/empty-field/separation unit tests + `observatory_persons_field_position_tests` (`sensing_update_emits_persons_with_field_derived_position` feeds a synthetic field with a known peak at cell (15,4) and asserts the emitted `position` = `[3.0, 0, −3.0]` within tolerance; `empty_room_yields_no_phantom_person`; `pose_is_real_when_posture_present_and_absent_otherwise`; `present_but_below_threshold_field_keeps_position_at_origin_not_fabricated`). `wifi-densepose-sensing-server --no-default-features`: bin **441→451**, 0 failed; workspace green; Python proof unchanged (off the deterministic proof path).
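The cell-to-world mapping quoted in the #1050 entry is simple enough to check by hand. The sketch below applies the stated transform and a naive strongest-peak search; the function names, the row-major grid layout and the 20×20 grid size used in the usage note are assumptions (the entry gives the formula and the test cell, not the grid dimensions).

```rust
/// Map a field cell to Observatory world coordinates using the transform
/// from the entry: x = (ix - nx/2) * 0.6, z = (iz - nz/2) * 0.5, y = 0.
pub fn cell_to_world(ix: usize, iz: usize, nx: usize, nz: usize) -> [f32; 3] {
    let x = (ix as f32 - nx as f32 / 2.0) * 0.6;
    let z = (iz as f32 - nz as f32 / 2.0) * 0.5;
    [x, 0.0, z]
}

/// Return the (ix, iz) cell of the strongest value, or None when the grid is
/// malformed or the peak is below `min_value`.
/// Assumption: row-major storage, index = iz * nx + ix.
pub fn strongest_peak(
    field: &[f32],
    nx: usize,
    nz: usize,
    min_value: f32,
) -> Option<(usize, usize)> {
    if nx == 0 || field.len() != nx * nz {
        return None;
    }
    let (idx, &value) = field
        .iter()
        .enumerate()
        .max_by(|a, b| a.1.total_cmp(b.1))?;
    if value < min_value {
        return None; // below threshold: no phantom person
    }
    Some((idx % nx, idx / nx))
}
```

On an assumed 20×20 grid, cell (15,4) maps to x = (15 − 10)·0.6 = 3.0 and z = (4 − 10)·0.5 = −3.0, matching the `[3.0, 0, −3.0]` position the entry's test asserts.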
- **ADR-155 Milestone-1b — metric-definition unification, the §8 backlog subset (Goals A/B/C).** Closed the two §8 metric-integrity items; every change pinned by a test, graded MEASURED. The audit (Goal A) also surfaced findings the §1 table under-counted — recorded honestly in ADR-155 §8.1, not hidden. Workspace stays green; Python proof unchanged (metrics are not on the deterministic proof's signal path).
- **Goal B — `test_metrics.rs` now validates the production metric, not a reimplementation.** The integration test previously asserted properties of its OWN local `compute_pck`/`compute_oks` (a test that can't catch a canonical-impl bug — both could be wrong the same way). Hoisted the canonical core (`pck_canonical`/`oks_canonical`/`canonical_torso_size`/sigmas/`bounding_box_diagonal`) into a new **un-gated** `metrics_core` module so the single definition is reachable under `cargo test --no-default-features` (the `metrics` module is `tch-backend`-gated); `metrics` re-exports it → still exactly ONE implementation. Rewrote the test to assert the production `pck_canonical`/`oks_canonical` equal **hand-computed** fixtures (`canonical_pck_matches_hand_computed_fixture` = 3/4 correct ⇒ 0.75; hip↔hip normalizer pin; zero-visible⇒0.0; OKS perfect⇒1.0; fake-Gold pin) plus a differential cross-check (`test_kernel_agrees_with_canonical`: an independent raw-threshold kernel must AGREE with canonical where torso==1.0). `wifi-densepose-train --no-default-features`: test_metrics **10→12**, 0 failed.
- **Goal C — divergent live-server PCK/OKS relabelled so they're never conflated with canonical.** Goal C named `training_api.rs:804` (torso-HEIGHT PCK); the audit found that file is an **orphan (not `mod`-declared, does not compile)** and the **real** live `best_pck`/`best_oks` come from `trainer.rs` — a **raw, unnormalized** `pck_at_threshold` and an **`area=1.0` fake-Gold** `oks_map` (both MISSED by ADR-155 §1, both on the claim-inflating side, both serialized as bare "PCK@0.2"/"OKS"). Torso-height/raw math is load-bearing (pixel-space, different scale axis, no `ndarray`/train dep), so the honest fix is **relabel, not force-unify**: `training_api.rs` `compute_pck` → `compute_pck_torso_height` + field/log docs; `trainer.rs` kernels documented raw/fake-Gold; `main.rs` prints `pck_raw@0.2` / `oks_map(area=1.0 proxy)`. No wire-format field or `pub`-fn renames (no silent API break). Pinned by `torso_pck_is_labelled_distinctly_from_canonical` + `pck_at_threshold_is_raw_unnormalized_not_canonical`. `wifi-densepose-sensing-server --no-default-features`: lib **450→451**, 0 failed. True unification onto `pck_canonical`/`oks_canonical` remains a tracked ADR-155 §8 item.
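To make the "hand-computed fixture" idea from Goal B concrete, here is a small normalized-PCK sketch with a 3-of-4 fixture. It is not `pck_canonical`: the function name and signature are hypothetical, and the normalizer is passed in directly rather than derived from the keypoints.

```rust
/// Fraction of visible keypoints whose error is within `threshold * torso`.
/// Returns 0.0 when no keypoint is visible (the entry pins the same convention).
/// Assumption: 2D keypoints and a caller-supplied torso size.
pub fn pck(
    pred: &[[f32; 2]],
    gt: &[[f32; 2]],
    visible: &[bool],
    torso: f32,
    threshold: f32,
) -> f32 {
    let mut seen = 0usize;
    let mut correct = 0usize;
    for ((p, g), v) in pred.iter().zip(gt).zip(visible) {
        if !*v {
            continue;
        }
        seen += 1;
        let dist = ((p[0] - g[0]).powi(2) + (p[1] - g[1]).powi(2)).sqrt();
        if dist <= threshold * torso {
            correct += 1;
        }
    }
    if seen == 0 {
        0.0
    } else {
        correct as f32 / seen as f32
    }
}
```

With errors of 0.0, 0.1, 0.15 and 0.5 against a torso of 1.0 and a 0.2 threshold, three of four keypoints pass, so the expected value of 0.75 can be worked out on paper and compared with the function's output. That is the difference between testing the production metric and testing a reimplementation of it.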
- **Pre-existing `SketchBank::topk` heap inversion returned the FARTHEST sketches (found during ADR-156 §8 Pass-2 work).** The `n > k` partial-sort path in `wifi-densepose-ruvector/src/sketch.rs` used `BinaryHeap<Reverse<(dist,id)>>` (a min-heap) but its eviction logic treated the peek as the max, so it kept the k *farthest* sketches and returned them as "nearest." The shipped unit tests only exercised the `n ≤ k` fast path (≤ 3 entries), so the inversion shipped silently in ADR-084. Fixed to a plain max-heap. Pinned by `topk_heap_path_returns_nearest` (farthest-first insertion exposes it) and `tight_clusters_give_high_coverage_with_overfetch` (**measured 0.072 coverage on the old code** — effectively random — vs >0.99 fixed). Every ADR-084 top-K coverage number depends on the fixed path. MEASURED, not a no-op.
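The heap direction is easy to get wrong, so a minimal sketch of the fixed shape may help: a bounded max-heap keeps the k nearest items because the root is always the current worst candidate, which is the one to evict. This is an illustration with a hypothetical signature, not the `sketch.rs` code.

```rust
use std::collections::BinaryHeap;

/// Keep the `k` smallest (distance, id) pairs using a bounded max-heap.
/// Returns them in ascending distance order.
pub fn topk_nearest(items: &[(u32, usize)], k: usize) -> Vec<(u32, usize)> {
    // BinaryHeap is a max-heap: the root is the farthest candidate kept so far.
    let mut heap: BinaryHeap<(u32, usize)> = BinaryHeap::with_capacity(k + 1);
    for &item in items {
        heap.push(item);
        if heap.len() > k {
            heap.pop(); // evict the farthest, so only the k nearest survive
        }
    }
    heap.into_sorted_vec()
}
```

Wrapping the entries in `Reverse` turns the root into the nearest item, so the same evict-the-root loop then keeps the farthest ones, which matches the inversion the entry describes. A farthest-first input, as in the entry's regression test, exposes the difference as soon as there are more than k items.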
- **ADR-154 Milestone-1 — cleared the P1 deferred backlog in `wifi-densepose-signal` (§7.4 #1, #10; partial #9, #13).** Each fix pinned by a regression test that fails on the old behaviour; every claim graded MEASURED / DATA-GATED; no fabricated thresholds. Python proof unchanged (`f8e76f21…46f7a`, bit-exact — the CIR ghost-tap guard is not on the deterministic proof path).
- **#1 (MEASURED metric / DATA-GATED threshold): circular phase variance.** `cir.rs::phase_variance` computed a *linear* sample variance over phase angles that wrap at ±π, so a tightly-clustered set straddling the branch cut reported spuriously HIGH dispersion — false-tripping the `> TAU` ghost-tap **guard** on real, tightly-clustered CIR taps. Replaced with Mardia's **circular variance** V = 1 − R̄, bounded **[0,1]** and invariant to where the cluster sits on the circle. The old TAU-scaled threshold is meaningless on [0,1]; re-derived against a named const `GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99` (fires only when R̄ ≤ 0.01 — essentially uniform phase). The **metric is MEASURED**; the **threshold value is DATA-GATED** (a clean single-path ramp also sweeps the circle, so V alone can't separate clean from unsanitized without labelled frames — the default is deliberately conservative, strictly more permissive at the wrap boundary than the buggy linear guard). Fails-on-old: `phase_variance_circular_not_fooled_by_branch_cut` (old linear variance > TAU on wrap-straddling phases while circular V≈0, guard no longer trips) + `phase_variance_circular_is_bounded_and_extremal` (V∈[0,1], V≈0 identical, V≈1 uniform).
- **#10 (MEASURED): Welford n=0/n=1 finiteness guard pinned.** The shared `WelfordStats` (`field_model.rs`) `count < 2` guards keep `variance`/`sample_variance`/`std_dev`/`z_score` finite at the boundaries, but the n=0 case was untested (same family as the §4 divide-by-(n−1) trio). Added `welford_finite_at_n0_and_n1` — finite + documented-sentinel (0.0) at n=0/n=1. Fails-on-old proof: removing the `sample_variance` guard makes the test panic with "attempt to subtract with overflow" at the `(count − 1)` underflow (guard restored).
- **#9, #13 (DATA-GATED): de-magicked thresholds + boundary tests (values UNCHANGED).** Lifted the bare detection literals in `adversarial.rs` (`check`/`check_consistency`: Gini 0.8, energy ratios 2.0/0.1, consistency 0.1·mean, score weights), `coherence.rs::classify_drift` (0.85, 10) and `coherence_gate.rs` defaults (0.85/0.5/200/3.0) into named, documented consts marked EMPIRICAL DEFAULT pending labelled calibration. Added characterization/boundary tests pinning each decision at/just-below/just-above its threshold (`energy_ratio_high_boundary`, `energy_ratio_low_boundary`, `field_model_gini_boundary`, `consistency_active_fraction_boundary`, `classify_drift_*_boundary`, `*_consts_unchanged_from_literals`) so a future labelled-data retune is a visible, tested change. The operating **values were not changed**; the de-magicking + tests are MEASURED, the values stay DATA-GATED.
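The circular-variance fix in item #1 is the one piece of math in this milestone that benefits from a worked example. The sketch below computes V = 1 − R̄ from the mean resultant length and contrasts it with a linear variance on phases that straddle the ±π branch cut. Function names and the empty-input convention (0.0) are assumptions, not the `cir.rs` code.

```rust
/// Circular variance V = 1 - R_bar, where R_bar is the mean resultant length.
/// Bounded to [0, 1]. Assumption: an empty input returns 0.0.
pub fn circular_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0;
    }
    let n = phases.len() as f64;
    let (sin_sum, cos_sum) = phases
        .iter()
        .fold((0.0, 0.0), |(s, c), p| (s + p.sin(), c + p.cos()));
    let r_bar = (sin_sum * sin_sum + cos_sum * cos_sum).sqrt() / n;
    (1.0 - r_bar).clamp(0.0, 1.0)
}

/// Plain (population) variance of the raw angles, shown only for contrast:
/// it treats -pi and +pi as far apart even though they are the same direction.
pub fn linear_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0;
    }
    let n = phases.len() as f64;
    let mean = phases.iter().sum::<f64>() / n;
    phases.iter().map(|p| (p - mean) * (p - mean)).sum::<f64>() / n
}
```

For two phases at π − 0.01 and −π + 0.01 the linear variance is close to π² (about 9.8), while the circular variance is about 5·10⁻⁵, because the two unit vectors point in almost the same direction. Four phases spread evenly around the circle give V ≈ 1.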
- **Multistatic fusion guard was too tight for real TDM hardware (#1031).** `MultistaticConfig::default().guard_interval_us` was 5,000 µs (5 ms) with a comment claiming "well within the 50 ms TDMA cycle" — but on a real N-slot TDM schedule node `k` transmits in slot `k`, so two nodes are separated by the *slot offset*, not clock jitter. A real 2-node mesh (slots 0/1) measured an **18,194 µs** spread, so every real frame set exceeded the 5 ms guard and `fuse()` silently fell back to per-node sum/dedup — multistatic fusion never actually ran on hardware. Raised the default hard guard to **60 ms** (a full 50 ms TDMA cycle + 20% jitter headroom, derived from the slot model and documented in the field doc) and the soft guard to **20 ms** (just above the observed 18.2 ms 2-slot spread, so a normal cycle fuses cleanly with no privacy demotion). Added `MultistaticConfig::for_tdm_schedule(total_slots, slot_duration_us)` to derive the guard from a deployment's exact schedule, and a `WDP_TDM_SLOTS`+`WDP_TDM_SLOT_US` env seam in sensing-server. The honest per-node fallback remains for genuinely-mismatched frames — now the exception, not the default. Pinned by `fuse_real_tdm_spread_18194us_fuses_with_default_guard` (fails on the old 5 ms default) + `configurable_guard_rejects_too_large_spread` (guard still rejects a spread beyond one cycle).
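The arithmetic behind the new default can be sketched as below. The entry states that the 60 ms hard guard is a full 50 ms cycle plus 20% headroom; the assumption here is that a schedule-derived guard follows the same "cycle + 20%" rule. The function names are hypothetical and the real `for_tdm_schedule` formula is not shown in the changelog.

```rust
/// Guard interval for an N-slot TDM schedule: one full cycle plus 20% headroom.
/// Assumption: this mirrors how the 60 ms default was derived in the entry.
pub fn guard_for_tdm_schedule_us(total_slots: u64, slot_duration_us: u64) -> u64 {
    let cycle_us = total_slots.saturating_mul(slot_duration_us);
    cycle_us.saturating_add(cycle_us / 5)
}

/// True when the spread between the earliest and latest frame timestamps
/// fits inside the guard. Assumption: an empty set trivially fits.
pub fn within_guard(timestamps_us: &[u64], guard_us: u64) -> bool {
    match (timestamps_us.iter().min(), timestamps_us.iter().max()) {
        (Some(lo), Some(hi)) => hi - lo <= guard_us,
        _ => true,
    }
}
```

With an illustrative 2-slot schedule of 25,000 µs per slot, the guard comes out at 60,000 µs, so the measured 18,194 µs spread fuses, while the old 5,000 µs default rejects it.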
- **Published HuggingFace model was unloadable — RVF format mismatch (#894).** The `ProgressiveLoader` rejected the published `ruvnet/wifi-densepose-pretrained` model with the opaque `invalid magic at offset 0: expected 0x52564653 (RVFS), got 0x77455735`, then silently fell back to signal heuristics (the "10 persons for 1" garbage reporters saw). The HF repo ships `model.safetensors`, `model-q{2,4,8}.bin` (magic `0x77455735` = "5WEw"), and `model.rvf.jsonl` — none carry the binary-RVF magic. New `model_format` module **auto-detects** RVFS / safetensors / HF-quant-bin / JSONL by magic+name, returns a **typed actionable** `ModelLoadError` (lists accepted formats + the one-command convert path — never the opaque magic), and **converts** `model.safetensors` / `model.rvf.jsonl` → RVF in-memory so the published full-precision model now loads via `--model`. A `--convert-model <in> --convert-out <out>` CLI subcommand gives a one-command offline path; the silent heuristics fallback is now a loud, actionable error. **Honest scope:** the converter wires the format/load path (safetensors F32 tensors → RVF weight segment, manifest written, Layer A/B/C all succeed, weights round-trip) — it does **not** claim end-to-end pose accuracy, since the HF pose-decoder architecture differs from this crate's inference head (still data-gated in #894). Quantized `.bin` blobs are rejected with a typed error pointing at the safetensors path. Pinned by `safetensors_converts_and_loads` + `hf_quant_classifies_to_actionable_error` (both fail on the old opaque-magic path).
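A format sniffer of the kind the #894 entry describes might look like the sketch below. The two magic values come from the entry; everything else is an assumption: the enum, the function name, reading the magic as a little-endian `u32` (inferred from `0x77455735` being spelled "5WEw", not stated in the source), and falling back to file extensions for safetensors and JSONL.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ModelFormat {
    Rvf,
    HfQuantBin,
    Safetensors,
    Jsonl,
    Unknown,
}

pub const RVF_MAGIC: u32 = 0x5256_4653; // "RVFS" per the error message in the entry
pub const HF_QUANT_MAGIC: u32 = 0x7745_5735; // magic of the published model-q*.bin files

/// Classify a model file from its name and leading bytes.
/// Assumption: the magic is the first 4 bytes read as a little-endian u32.
pub fn detect_format(file_name: &str, head: &[u8]) -> ModelFormat {
    if let Some(b) = head.get(..4) {
        let magic = u32::from_le_bytes([b[0], b[1], b[2], b[3]]);
        if magic == RVF_MAGIC {
            return ModelFormat::Rvf;
        }
        if magic == HF_QUANT_MAGIC {
            return ModelFormat::HfQuantBin;
        }
    }
    // No recognised magic: fall back to the file name (hypothetical rule).
    if file_name.ends_with(".safetensors") {
        return ModelFormat::Safetensors;
    }
    if file_name.ends_with(".jsonl") {
        return ModelFormat::Jsonl;
    }
    ModelFormat::Unknown
}
```

Returning a typed format (or `Unknown`) rather than failing on the first magic mismatch is what lets the caller produce an actionable error that lists the accepted formats.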
### Changed
- **ADR-157 Milestone-1 §5 #4 - native `wlanapi.dll` multi-BSSID throughput MEASURED on real hardware (`wifi-densepose-wifiscan`).** The ADR's prior status ("asserted but NOT implemented; live scanner is the ~2 Hz netsh shim") is now stale: `wlanapi_native.rs` already implements the real `WlanOpenHandle` -> `WlanEnumInterfaces` -> `WlanGetNetworkBssList` -> `WlanFreeMemory`/`WlanCloseHandle` FFI and `WlanApiScanner` already wires it native-first with a netsh fallback. This milestone **measured it on this box** (Intel Wi-Fi 7 BE201 320MHz, 2026-06-13): a new `benchmark_backend(backend, window)` drives each backend over the same fixed 10 s wall-clock window so netsh is timed independently (the prior `benchmark()` picked native-first and never measured netsh on a Windows box where native works). **MEASURED: native 21.42 Hz vs netsh 3.84 Hz = 5.57x** (mean 5.0 BSSIDs/scan, both paths); a separate native-only run measured 18.0 Hz. Native genuinely beats netsh - this is a real positive result, not a fabricated "10x". 50 back-to-back native scans completed 50/50 with no handle leak/degradation. Live-WLAN tests (`measure_native_vs_netsh_throughput`, `native_scans_dont_leak_handles`, `measure_native_scan_rate`) are `#[ignore]` for CI but were RUN here; `native_scan_runs_real_ffi_on_windows` is a non-ignored schema-valid pin. ADR-157 §5 #4 + §8 -> MEASURED (was ACCEPTED-FUTURE / CLAIMED-unmeasured).
- **Mesh partition risk now demotes the privacy class and is witnessed (ADR-032).** The dynamic min-cut guard's `at_risk` signal was advisory-only (it fed the recalibration advisor). It now also contributes to the ADR-141 privacy demotion alongside fusion- and array-level contradictions: a mesh close to partitioning makes the fused belief less trustworthy, so the cycle emits at a more restricted class (monotonic — information only removed). Because `effective_class` feeds the BLAKE3 witness, a fragmenting array now shifts the witness — partition risk is auditable, not just logged. The mesh computation moved ahead of the demotion step in `process_cycle`; new `mesh_guard_mut()` exposes risk-threshold tuning. Test proves a forced-risk 3-node cycle demotes PrivateHome Anonymous→Restricted and shifts the witness vs a clean *same-topology* baseline (the only delta between the two cycles is the forced risk).
### Added
- **ADR-155 Milestone-2 — cleared the host-verifiable subset of the §8 P3 backlog in `wifi-densepose-train` (+ the pure-Rust `rf_encoder.rs`/`densepose.rs` the §3/§4 items named).** Mirrors the ADR-154 M3 cleanup discipline. **Honest enumeration first (grep, not the ADR's "~40" estimate):** the actual non-tch train/nn surface is smaller — **7 de-magicked (const + `*_consts_unchanged_from_literals` pin == prior literal), 9 boundary/characterization tests, 1 added input guard (`rf_encoder::LinearHead::try_new`) + test, 2 doc-only fixes, 1 perf item bench-first → MEASURED-INCONCLUSIVE (not shipped)**. **This is cleanup — no operating value or behaviour changed:** each lifted literal is bit-identical to its prior value, each boundary test pins CURRENT behaviour. De-magicked: `metrics_core.rs` (`VISIBILITY_THRESHOLD`/`MIN_REFERENCE_EXTENT`/`OKS_FALLBACK_SIGMA`), `ruview_metrics.rs` (`NUM_KEYPOINTS`/`VISIBILITY_THRESHOLD`/`PCK_THRESHOLD`/`MIN_BBOX_DIAG`/`MIN_DURATION_MINUTES`), `subcarrier.rs` (6 `SPARSE_*` consts), `eval.rs` (`MIN_POSITIVE_MPJPE`), `domain.rs` (`LAYER_NORM_EPS`), `virtual_aug.rs` (`BOX_MULLER_U1_FLOOR`/`MIN_ROOM_SCALE`), `rf_encoder.rs` (`SOFTPLUS_LINEAR_THRESHOLD`). **§3 `rf_encoder.rs`:** added a pure-Rust fallible `LinearHead::try_new` → typed `RfHeadError` so untrusted/deserialized checkpoint weights can be shape-validated without the `new()` panic (`new` unchanged; additive). **§4 native-conv:** `densepose.rs::apply_conv_layer` (pure-Rust naive loop) was benched (committed `benches/native_conv_bench.rs`); a bit-identical range-clamped rewrite measured ~35% faster on padding-heavy small-channel maps but ~3% *slower* on channel-heavy maps, all inside a ±20% host-noise floor — **MEASURED-INCONCLUSIVE, so NOT shipped** (no fabricated number), characterized by `native_conv_matches_reference` and honestly deferred. 
**Skipped honestly (not-real / already-handled):** `ablation.rs` (NaN-sort + boundaries already fixed/tested in M1), `signal_features.rs` (consts already named, n=0 tested), `mae.rs` (no bare guard literals). `wifi-densepose-train --no-default-features`: **303 passed** (was 288, +15), 0 failed; `wifi-densepose-nn --no-default-features` lib: **38** (was 35, +3). Workspace `--no-default-features`: GREEN (single clean run). Python proof **VERDICT: PASS**, hash **`f8e76f21…46f7a` UNCHANGED, bit-exact** (asserted — the metrics path is off the deterministic signal proof path). **Remaining §8 backlog stays deferred-not-dropped:** GraphPose-Fi / ONNX-INT4 / CSI-JEPA (data/model-gated), ONNX read-lock (upstream `ort`-gated), tch-gated panic sites in `proof.rs`/`trainer.rs`/`model.rs` + `metrics.rs` `*_v2` dead-code (tch-gated — need a libtorch host). **The non-tch-verifiable subset of §8 is now cleared.**
- **ADR-154 Milestone-3 — cleared the §7.4 row #21–45 P3 backlog in `wifi-densepose-signal` (the lumped "remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs`").** Honest enumeration first (grep, not the ADR's estimate): the lumped row was **~25 findings → 22 real, de-magicked across 11 modules; 6 boundary/characterization tests added; ~4 doc-only; the rest were already-handled or not-real and are reported as such** (the "row #21–45" count was an estimate — there were not 25 *distinct* magic constants left after M0–M2). **This is cleanup — no operating value or behaviour changed:** every de-magicked literal becomes a named, documented EMPIRICAL-DEFAULT const that **equals the prior literal exactly** (each module ships a `*_consts_unchanged_from_literals` pin test), and every boundary test pins **current** behaviour so a future retune is a visible, tested change. Modules touched: `motion.rs` (#18, fusion weights/normalization/adaptive-threshold consts + 5 tests), `gesture.rs` (#12, `euclidean_distance` length-mismatch `debug_assert` documenting the silent-truncation contract + DTW n=0/m=0 boundary), `longitudinal.rs` (drift thresholds 7-day/2σ/3-day/7-day/EMA + day-6/7 + zero-vector cosine), `cross_room.rs`/`multiband.rs`/`intention.rs`/`hampel.rs` (division-guard epsilons + zero-norm/zero-variance/zero-MAD boundary + `half_window==0` error path), `rf_slam.rs` (`NS_PER_DAY` + fixed-map defaults + zero-span guard), `attractor_drift.rs` (buffer/recent-window consts + documented the implicit `recent.len()≥1` divide-safety + `min_observations` off-by-one boundary), `coherence.rs` (#9 completion — variance-floor + default-decay), `calibration.rs` (#2 — `DEFAULT_MIN_FRAMES` deduped across 4 tier constructors + motion/subtract thresholds), `fusion_quality.rs` (contradiction penalty/bounds + n=0 identity), `temporal_gesture.rs` (confidence epsilon + quantization scale). 
**A "magic" the agents flagged that was NOT real:** an `attractor_drift.rs:301` "divide-by-zero" is unreachable (the `count < min_observations` guard guarantees `recent.len()≥1`) — documented + boundary-tested rather than guarded, per the no-behaviour-change rule. Signal crate lib `--no-default-features`: **476 passed, 0 failed, 1 ignored**; `--no-default-features --features cir`: **476 passed, 0 failed** (plain `--features cir` is unbuildable on this Windows host — the default `eigenvalue` feature pulls `openblas-src`, the same BLAS gate documented in M2 #8). Workspace `--no-default-features`: **3,275 / 0 failed** (single clean run). Python proof **VERDICT: PASS**, hash **`f8e76f21…46f7a` UNCHANGED, bit-exact** (asserted explicitly — these modules are off the deterministic PSD/Doppler proof path, and the de-magicked consts are bit-identical regardless). **This clears ADR-154's §7.4 deferred backlog to zero across M0–M3.**
- **ADR-154 Milestone-2 — bench-first P2 perf subset + missing boundary tests (`wifi-densepose-signal`, §7.4 #5/#6/#7/#8/#14/#16/#19/#20).** PROOF discipline (ADR-154 §0): every perf item was **benched before being touched** (new committed `benches/dsp_perf_bench.rs`, criterion, this Windows box); only the one item the bench proved hot was optimized, the rest are committed MEASURED-NULLs — a benched null is the proof the micro-opt was unnecessary, the §5.1 "already amortized" pattern. Every behaviour-changing edit is pinned bit-identical (or documented-tolerance). Signal crate lib `--no-default-features`: **447 passed, 0 failed, 1 ignored**; `--features cir`: **447 passed, 0 failed**.
- **#20 MEASURED-HOT, optimized (bit-identical).** `compute_multi_subcarrier_spectrogram` re-planned a fresh `FftPlanner` for *every* subcarrier (via `compute_spectrogram`). Hoisted the plan + window out of the per-subcarrier loop (new `compute_spectrogram_with_plan` core; `compute_spectrogram` delegates, unchanged). **56-subcarrier: 467.88 µs → 254.75 µs = 1.84×** (window 128); **627.27 µs → 448.39 µs = 1.40×** (window 256). Bit-identical via `multi_subcarrier_hoisted_plan_bit_identical` (`f64::to_bits` of every value across all 4 window functions × {power,magnitude}). The §7.4 intro's predicted "most likely real win" — confirmed.
- **#5 / #6 / #7 MEASURED-NULL, left as-is.** `node_attention_weights` 181 ns (2 nodes)…848 ns (8) — sub-µs, no hot-path alloc. `tomography reconstruct` (full 50-iter ISTA, 256 voxels) 47.5 µs (16 links) / 60.4 µs (32) — the 2 voxel buffers are already alloc-once + `.fill`-reused, negligible vs O(iters·links·voxels). `pose_tracker` Kalman cycle 150 ns (17 keypoints) / 2.82 µs (170) — the "gain matrices" are fixed-size **stack** arrays, zero heap to reuse. No rewrite shipped; the committed benches prove each is not hot.
- **#8 MEASUREMENT-ONLY, BLAS-gated (number deferred, not fabricated).** Correction to the finding: `extract_perturbation` does **not** recompute the SVD (it projects against cached `finalize_calibration` modes); the real per-call eigendecomposition is the `eigenvalue`-feature `estimate_occupancy` (`cov.eigh()` on a 56×56 covariance). The `eig` bench is committed but `openblas-src` won't build on this Windows host ("Non-vcpkg builds are not supported on Windows" — the exact reason the project gate runs `--no-default-features`), so its µs cost must come from a Linux/BLAS box. Recorded, not estimated. Incremental SVD stays a sized future item.
- **#14 / #16 / #19 RESOLVED — tests added (no behaviour change).** `fft_operator_within_tolerance_of_dense_canonical56` pins the full `Cir` output of the opt-in FFT path within a documented relative tolerance of the dense path on the production canonical-56 config (τ ∈ {20,50,90} ns) — it changes the witness hash, so it must be provably *close*, not silently divergent. `refinement_terminates_at_iteration_cap_when_not_converging` (+ convergent companion) proves the LO-offset refinement terminates at exactly `max_iterations` on a non-converging input (cap, not convergence, bounds the loop; internal `…_counted` refactor returns the identical offsets). `ratio_finite_at_and_below_1e_12_epsilon` pins that the conjugate-product CSI-ratio (no division → no `1e-12` divide-guard needed) is finite + bit-exact at/below the epsilon boundary and at exact zero (where a naive `H_i/H_j` ratio is ±inf/NaN).
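The #19 point above (a conjugate product needs no divide guard) can be illustrated with a minimal sketch. The tuple-based complex type and the `conj_product` / `naive_ratio` helpers are hypothetical stand-ins, not the crate's actual API.

```rust
/// Minimal complex number as (re, im); a hypothetical stand-in for the crate's complex type.
type C = (f64, f64);

/// Conjugate product H_i * conj(H_j): multiplications and additions only, no division.
fn conj_product(hi: C, hj: C) -> C {
    (hi.0 * hj.0 + hi.1 * hj.1, hi.1 * hj.0 - hi.0 * hj.1)
}

/// Naive ratio H_i / H_j, shown only for contrast: it divides by |H_j|^2.
fn naive_ratio(hi: C, hj: C) -> C {
    let den = hj.0 * hj.0 + hj.1 * hj.1;
    let num = conj_product(hi, hj);
    (num.0 / den, num.1 / den)
}
```

With `H_j` exactly zero the conjugate product stays finite, while the naive ratio evaluates `0/0` and yields NaN, which is the boundary the test named above pins.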
- **ADR-156 §11 Milestone-2: RaBitQ unbiased distance estimator — IMPLEMENTED & MEASURED (RESOLVED-NEGATIVE on the strict-K bar).** Closes the §10.5 / §8 backlog "full RaBitQ residual-distance estimator (not just a uniform scalar code)" item — the **real** Gao & Long (SIGMOD 2024) contribution, not just sign bits. New `wifi-densepose-ruvector/src/estimator.rs`: `EstimatorSketch` carries the Pass-2 sign code (over the padded FHT length `D = next_pow2(dim)`) **plus 8 B/vec side info** (`residual_norm` + `x_dot_o = ⟨x̄, o'⟩`, 2× f32); `DistanceEstimator` computes the **unbiased** estimate `⟨o',q'⟩ ≈ ⟨x̄,q'⟩ / x_dot_o` (the random rotation makes the 1-bit code's quantization error orthogonal-in-expectation to the query, paper `O(1/√D)` bound); `EstimatorBank::topk_estimated_cosine` reranks the candidate set by the estimate instead of raw Hamming. **Zero-centroid simplification (`c = 0`) stated honestly** — the paper-faithful per-cluster centroid path (`from_embedding_centred` / `EstimatorBank::with_centroid`) is also built so the simplification is a measured choice (no centroid coverage number is reported against the cosine ground truth, because cosine-of-residual ≠ cosine-of-raw would be a metric mismatch). **Purely additive + backward-compatible** — new types only; Pass-1 `Sketch` / Pass-2 `SketchBank` / `WireSketch` wire format unchanged; all external callers (`event_log.rs`, `signal/longitudinal.rs`, `sensing-server`) use Pass-1 and are unaffected. **MEASURED strict-K coverage** (same fixture/seeds as §10: dim=128 N=2048 K=8, 64 clusters, noise=0.35, 128 queries, cosine ground truth): the estimator lifts the strict `candidate_k=K` bar **46.39% (Pass-2 sign) → 49.71% (estimator, cosine rerank)** — a real **+3.3 pp** lift, **still ~40 pp short of the ADR-084 ≥90% strict bar.** At over-fetch the estimator beats sign (candidate_k=24: **95.12%** vs 91.60%). 
**Honest verdict — RESOLVED-NEGATIVE: the unbiased estimator does NOT clear the strict-K 90% bar on this distribution** (the binding constraint is the 1-bit code's information ceiling, not estimator variance); the bar is still met only via the over-fetch "candidate set" pattern ADR-084 specifies, though the estimator **reduces the over-fetch factor** needed. A published negative, reported as such — no benchmark tuned to manufacture a pass. Unbiasedness pinned by `estimator_unbiased_on_fixture` (Monte-Carlo mean over 4000 rotation seeds → true inner product within tolerance); not-worse-than-sign pinned by `estimator_rerank_not_worse_than_sign`; determinism by `estimator_is_deterministic`. +12 tests in the crate (119→131). Workspace **3,228 / 0 failed** (`cargo test --workspace --no-default-features`, 162 test binaries, single clean run), Python proof **VERDICT: PASS** (`f8e76f21…46f7a`, unchanged — estimator is not on the proof's signal path). Full numbers + reproduce commands in ADR-156 §11 / ADR-084 "Pass 2b".
- **ADR-156 §8 Milestone-1: RaBitQ Pass-2 randomized rotation + multi-bit experiment — IMPLEMENTED & MEASURED (RESOLVED-PARTIAL).** Closes the §8 "Multi-bit / Extended RaBitQ" backlog item. New `wifi-densepose-ruvector/src/rotation.rs`: a deterministic randomized orthogonal rotation `R = H·D` — **Fast Hadamard Transform** (`O(d log d)`, in-place, `1/√m`-normalized so norm-preserving) + seeded ±1 sign flips (SplitMix64 from a stored `u64` seed; identical at index + query time). Chosen over a dense `d×d` matrix (`O(d²)`, infeasible at the 65,535-d the wire format provisions for); pads to `next_pow2(d)`. Additive, backward-compatible API (`Sketch::from_embedding_rotated`, `SketchBank::with_rotation` + `insert_embedding`/`topk_embedding`/`novelty_embedding`); Pass-1 and the wire format are byte-for-byte unchanged. New `coverage.rs` single-source-of-truth top-K coverage harness (anisotropic planted-cluster fixture, cosine ground truth) backs both a `#[test]` report and the `sketch_bench` coverage table. **MEASURED (dim=128 N=2048 K=8, 64 clusters, noise=0.35, 128 queries, seeded):** at the strict `candidate_k=K` bar, rotation lifts coverage **36.13% → 46.39%**; Pass-2 reaches the **ADR-084 ≥90% bar at candidate_k=24 (~3× over-fetch)**; multi-bit Pass-3 reaches 54%/67%/74% at 2/3/4-bit (strict bar). **Honest verdict: neither rotation nor ≤4-bit multi-bit clears the strict-K 90% bar on this distribution — the bar is met only via the over-fetch "candidate set" pattern ADR-084 specifies.** No benchmark was tuned to manufacture a pass; the strict-bar gap is documented (ADR-156 §10, ADR-084 "Pass 2" section). +19 tests in the crate (100→119), workspace **3,225 / 0 failed**, Python proof VERDICT: PASS (`f8e76f21…`, unchanged — sketch is not on the proof's signal path).
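The `R = H·D` rotation described in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `rotation.rs`: the function names are hypothetical, `f64` is used for simplicity, and drawing each sign from the low bit of a SplitMix64 output is an assumed convention.

```rust
/// SplitMix64 step: advances the state and returns the next 64-bit output.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// In-place fast Walsh-Hadamard transform, scaled by 1/sqrt(m) so it is orthonormal.
/// `v.len()` must be a power of two.
fn fht_normalized(v: &mut [f64]) {
    let m = v.len();
    assert!(m.is_power_of_two());
    let mut h = 1;
    while h < m {
        for block in (0..m).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
    let scale = 1.0 / (m as f64).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

/// Apply R = H·D: zero-pad to the next power of two, flip signs from the seed (D),
/// then run the normalized Hadamard transform (H).
fn rotate(x: &[f64], seed: u64) -> Vec<f64> {
    let m = x.len().next_power_of_two();
    let mut v = vec![0.0; m];
    v[..x.len()].copy_from_slice(x);
    let mut state = seed;
    for value in v.iter_mut() {
        // Assumed convention: low bit of each SplitMix64 output selects the sign.
        if splitmix64(&mut state) & 1 == 1 {
            *value = -*value;
        }
    }
    fht_normalized(&mut v);
    v
}
```

Because both steps are orthonormal, the squared norm of the padded input is preserved, and reusing the same seed at index and query time reproduces the same rotation.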
- **Beyond-SOTA `v2/crates/` sweep (ADR-154–158) + full stub-implementation push — every claim MEASURED or graded.** A 5-milestone review/optimize/secure/benchmark/validate sweep, then a verified-audit-driven push to replace every production stub with real, tested logic (no labels, no placeholders). Each fix is pinned by a test that fails on the old code; every number ships with a reproduce command. Workspace: **3,122 tests / 0 failed** (`cargo test --workspace --no-default-features`), Python proof **VERDICT: PASS** (bit-exact).
- **ADR-154 Signal/DSP** — revived a dead ADR-134 CIR coherence gate (canonical-56 vs ht20 mismatch meant it never ran in production: 8/8 Err → 8/8 Ok); NaN-bypass + window div0 guards; PSD FFT-planner cache (**2.0–3.1×**) + honored DTW band (**2.4–4.1×**).
- **ADR-155 NN/Training** — unified 7 divergent PCK/OKS metric definitions into one canonical torso-normalized source (fixed two claim-inflating bugs: zero-visible PCK 1.0→0.0, OKS fake-Gold); leak-free subject-disjoint MM-Fi split + injected-leak detector; rapid_adapt replaced fake gradients with real finite-difference; proof.rs gained a min-decrease margin + committed-hash requirement; zero-copy ORT input (**1.48×**).
@@ -35,7 +69,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- **Dynamic min-cut mesh partition guard in the streaming engine (`mesh_guard`).** Maintains a `ruvector-mincut` exact min-cut over the live mesh coupling graph (nodes = sensing nodes, coupling = product of fusion attention weights), surfacing per cycle: the global **cut value** (how close the array is to splitting — a structural measure per-node heuristics miss), the **weak side** (which specific nodes would partition: failure/jamming triage feeding ADR-032 posture), and an **at-risk flag** that counts as a structural event for the drift→recalibration advisor. Surfaced as `TrustedOutput::mesh`. **Measured cost policy** (criterion, 12-node mesh): weights are quantized (1/64; a *nonzero* coupling below one quantum saturates to quantum 1 so quantization never erases a live coupling — without the floor, balanced meshes of ≥ 65 nodes had every ~1/n coupling erased and sat permanently "at risk") and updates change-gated, so the steady-state cycle does zero graph work (~7.3 µs, ~23× cheaper than building); on any real change a full exact rebuild (~171 µs) is used because one `DynamicMinCut` delete+insert measured ~240 µs — the incremental machinery's overhead targets much larger graphs, so rebuild-on-change is the measured optimum at mesh scale (one-edge case −28% after the policy switch). Degenerate cases fail toward risk: a node with zero coupling is reported as already partitioned (cut 0). 9 mesh-guard tests + an engine-level wiring test; full `process_cycle` with the guard: ~33 µs for 4 nodes (50 ms budget).
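The quantization floor described in this entry can be sketched as below. The function name is hypothetical, and truncating toward zero before applying the floor is an assumption, chosen because it matches the stated failure mode (a `~1/65` coupling falling below one quantum).

```rust
/// Quanta per unit coupling (the 1/64 resolution stated in the entry above).
const QUANTA_PER_UNIT: f64 = 64.0;

/// Quantize a coupling weight to integer quanta. A nonzero coupling that would
/// truncate to zero is raised to one quantum so a live edge is never erased.
fn quantize_coupling(weight: f64) -> u32 {
    // Zero, negative, or NaN weights mean "no coupling".
    if !(weight > 0.0) {
        return 0;
    }
    let q = (weight * QUANTA_PER_UNIT).floor() as u32;
    q.max(1)
}
```

Without the `max(1)` floor, a balanced 65-node mesh would quantize every `1/65` coupling to zero, which is the permanent "at risk" state the entry describes.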
- **Opt-in FFT operator for the CIR ISTA solver (8–14× measured).** Φ is a sub-DFT, so each ISTA mat-vec can run as one length-G FFT (O(G log G)) instead of a dense O(K·G) product. New `CirConfig::fft_operator` (default **false** — the dense path stays the bit-exact witness default; the FFT evaluates the same sums in a different order, so enabling it shifts float results and requires regenerating any pinned witness). `FftOperator` (rustfft, planned once at construction, scratch reused across the ISTA loop) dispatches inside `ista_solve`; warm-start/Lipschitz stay dense at construction. Measured (criterion, same run): ht20 2.22 ms → 265 µs (**8.4×**), ht40 10.26 ms → 717 µs (**14.3×**); the real HE40 grid (K=484, G=1452) scales further. 3 new tests: FFT↔dense matvec equivalence to float tolerance (ht20 + he40 grids), end-to-end dominant-tap agreement on a single-path frame, and all default configs keep FFT off. New `cir_estimate_fft` bench group.
- **Per-room adapter provenance + drift→recalibration advisor in the streaming engine.** Closes the trust-chain gap where an ~11 KB per-room LoRA adapter (ADR-150 §3.4) could silently change inference without the witness noticing. `StreamingEngine::set_room_adapter(AdapterInfo)` pins the adapter's content-derived id into provenance `model_version` (`rfenc-v1+adapter:<id>`) — and therefore into the BLAKE3 witness — so swapping or clearing adapter weights always shifts the witness (engine test proves base → adapter → other-adapter → cleared all witness differently, and cleared == base). New `RecalibrationAdvisor` recommends re-running the ADR-135 baseline / refitting the adapter on sustained low fusion coherence (streak threshold, default 60 cycles ≈ 3 s at 20 Hz) or an ADR-142 change-point; surfaced as `TrustedOutput::recalibration_recommended` and recorded on the sensing-server's `EngineBridge` alongside the witness. Bridge plumbing: `EngineBridge::{set_room_adapter, clear_room_adapter}` + live-path test that the adapter id flows into the live witness. *Scope note: this is the deployable provenance/trigger half of the "retrained model" roadmap item — fitting the adapter itself runs in the existing external calibration service (`aether-arena/calibration/`), and a trained RF-encoder checkpoint still does not exist in-tree.*
- **RuView beyond-SOTA research series** (`docs/research/ruview-beyond-sota/`, 6 docs) — research-swarm output defining the beyond-SOTA bar and the path to it: system capability audit (role→crate maturity matrix, gap analysis, risk register), web-verified 2026 SOTA landscape per capability axis (incl. ratified IEEE 802.11bf-2025), 8-pillar target architecture on the ADR-136 contract spine (no rewrite), 6-layer benchmark/validation methodology (all 15 criterion bench targets inventoried; ADR-149 statistical protocol), and a determinism-safe optimization roadmap. Includes session validation evidence: 2,797 workspace tests / 0 failed, Python proof PASS (bit-exact), paired pre/post criterion runs.
- **RuView beyond-SOTA research series** (`docs/research/ruview-beyond-sota/`, 6 docs) — research-swarm output defining the beyond-SOTA bar and the path to it: system capability audit (role→crate maturity matrix, gap analysis, risk register), web-verified 2026 SOTA landscape per capability axis (incl. ratified IEEE 802.11bf-2025), 8-pillar target architecture on the ADR-136 contract spine (no rewrite), 6-layer benchmark/validation methodology (all 15 criterion bench targets inventoried; ADR-171 statistical protocol), and a determinism-safe optimization roadmap. Includes session validation evidence: 2,797 workspace tests / 0 failed, Python proof PASS (bit-exact), paired pre/post criterion runs.
### Performance
- **CIR estimator warm-start precompute** — the diagonal Tikhonov preconditioner `diag(Φ^H Φ)+λI` and its CSR matrix were rebuilt every frame although they depend only on Φ and λ (fixed at `CirEstimator::new`); now precomputed at construction (`ruvsense/cir.rs`). Bit-identical floats (summation order unchanged, witness chain unaffected). Measured: `cir_estimate/he40` −3.9% (p<0.01), multiband groups −1.2/−1.4%; smaller configs within container noise.
@@ -79,7 +113,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `ruview-swarm` benchmarks (criterion, release): MARL actor inference 3.3 µs, RRT-APF planning 0.043 ms, multi-view CSI fusion 58.5 ns, 3-view localization 1.732 m (beats Wi2SAR 5 m SOTA baseline), 4-drone SAR coverage 223 s for 400×400 m (under 240 s target).
### Added
- **ADR-147 — OccWorld world model integration** (`wifi-densepose-worldmodel` v0.3.0 published to crates.io). 15-frame trajectory prediction at 209 ms / 3.37 GB VRAM on RTX 5080. Phase 3 domain adapter `scripts/ruview_occ_dataset.py` (`RuViewOccDataset`) converts WorldGraph snapshots to OccWorld tensors with indoor class remapping + zero ego-poses (validated). Phase 5 retraining pipeline `scripts/occworld_retrain.py` — VQVAE + transformer fine-tuning on RuView occupancy snapshots. See [ADR-147](docs/adr/ADR-147-nvidia-cosmos-world-foundation-model-integration.md) · [benchmark proof](docs/adr/ADR-147-benchmark-proof.md).
- **ADR-147 — OccWorld world model integration** (`wifi-densepose-worldmodel` v0.3.0 published to crates.io). 15-frame trajectory prediction at 209 ms / 3.37 GB VRAM on RTX 5080. Phase 3 domain adapter `scripts/ruview_occ_dataset.py` (`RuViewOccDataset`) converts WorldGraph snapshots to OccWorld tensors with indoor class remapping + zero ego-poses (validated). Phase 5 retraining pipeline `scripts/occworld_retrain.py` — VQVAE + transformer fine-tuning on RuView occupancy snapshots. See [ADR-147](docs/adr/ADR-147-nvidia-cosmos-world-foundation-model-integration.md) · [benchmark proof](docs/adr/ADR-168-benchmark-proof.md).
### Added
- **ADR-125 (APPLE-FABRIC) — RuView ↔ Apple Home native HAP bridge proposal + reference impl** (issue #796). New ADR-125 lays out a three-phase plan to expose RuView as a discoverable HomeKit accessory on the LAN so a HomePod (as Home Hub) sees presence / vitals / BFLD-derived events natively — zero Home-Assistant intermediary. Two architectural decisions resolved in the ADR per design review: (1) **one HAP bridge with N child accessories** (single pairing, matches Hue/Eve pattern), and (2) **identity-risk mapping is semantic, not probabilistic** — `identity_risk_score` and Soul-Signature match probability never cross the HAP boundary; instead three thresholded events are exposed (`Unknown Presence`, `Unexpected Occupancy`, `Unrecognized Activity Pattern`) so RuView reads as calm-tech ambient awareness, not surveillance UX. ADR-125 §2.1.a reference impl ships now: `scripts/hap-test-sensor.py` (HAP-1.1 bridge advertised over mDNS, paired with operator's iPhone) + `scripts/c6-presence-watcher.py` (parses ESP32 `RV_FEATURE_STATE_MAGIC = 0xC5110006` UDP packets with IEEE CRC32 validation, hysteresis, and a Python port of `wifi-densepose-bfld::PrivacyClass` that enforces ADR-125 §2.1.d invariant I1 at the HomeKit edge — only `Anonymous` (2) and `Restricted` (3) frames may cross; `Raw`/`Derived` are refused with exit code 2 and the cited ADR clause). Validated end-to-end on real hardware (no mocks): ESP32-C6 on `ruv.net` → UDP/5005 → mac-mini watcher → BFLD gate → HAP bridge → iPhone Home app shows `Unknown Presence` live characteristic flip. **Empirical**: 50-51 valid CRC-passing feature_state packets per 10 s window from the live C6; zero CRC errors. P2 (Rust-native HAP via the `hap` crate, replaces the Python sidecar) and P3 (Matter Controller once `matter-rs` stabilizes) follow.
@@ -22,6 +22,7 @@ Dual codebase: Python v1 (`v1/`) and Rust port (`v2/`).
| `wifi-densepose-vitals` | ESP32 CSI-grade vital sign extraction (ADR-021) |
| `nvsim` | Deterministic NV-diamond magnetometer pipeline simulator (ADR-089) — standalone leaf, WASM-ready |
| `vendor/rvcsi` (submodule) | **rvCSI** — edge RF sensing runtime (ADR-095/096): 9 crates (`rvcsi-core`/`-dsp`/`-events`/`-adapter-file`/`-adapter-nexmon`/`-ruvector`/`-runtime`/`-node`/`-cli`). Lives in its own repo ([github.com/ruvnet/rvcsi](https://github.com/ruvnet/rvcsi)), vendored here under `vendor/rvcsi`, published to crates.io as `rvcsi-* 0.3.x` and to npm as `@ruv/rvcsi`. Not a `v2/` workspace member — depend on the published crates (or the submodule's `crates/rvcsi-*` paths). Normalized `CsiFrame`/`CsiWindow`/`CsiEvent` schema, validate-before-FFI, reusable DSP, typed confidence-scored events, the napi-c Nexmon shim (real nexmon_csi `.pcap` from a Raspberry Pi 5 / 4 / 3B+ — BCM43455c0), the napi-rs SDK, the `rvcsi` CLI, a Claude Code plugin. |
| `vendor/rufield` (submodule) | **RuField MFS** — the open spec for camera-free multimodal field sensing (ADR-260). A common `FieldEvent`/`FieldTensor`/`FusionGraph`/`PrivacyClass`/`ProvenanceReceipt` model *above* WiFi CSI/CIR/BFLD, UWB, BLE Channel Sounding, mmWave radar, ultrasound, subsonic, infrared, and quantum sensors. Lives in its own repo ([github.com/ruvnet/rufield](https://github.com/ruvnet/rufield)), vendored here under `vendor/rufield`. Not a `v2/` workspace member. v0.1 reference stack = 6 crates (`rufield-core`/`-provenance`/`-privacy`/`-adapters`/`-fusion`/`-bench`), 60 tests/0 failed; all benchmark metrics are **SYNTHETIC** (simulator ground truth, no hardware — real adapters are roadmap). |
| `ruview-swarm` | Drone swarm control system (ADR-148) — hierarchical-mesh topology, Raft consensus, MARL, CSI sensing payload, MAVLink/PX4 compat, Ruflo AI-agent integration |
### RuvSense Modules (`signal/src/ruvsense/`)
@@ -194,7 +194,7 @@ The separate **17-keypoint pose-estimation model** is now published at [`ruvnet/
| **Efficiency frontier** | [`docs/benchmarks/wifi-pose-efficiency-frontier.md`](docs/benchmarks/wifi-pose-efficiency-frontier.md) | SOTA-beating WiFi pose in a 20 KB int4 edge model |
| **Pretrained encoder** | [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained) | 82.3% held-out temporal-triplet, 8 KB int4 |
| **Reproducible proof (Trust Kill Switch)** | [`archive/v1/data/proof/verify.py`](archive/v1/data/proof/verify.py) + [`expected_features.sha256`](archive/v1/data/proof/expected_features.sha256) | one-command deterministic pipeline replay (SHA-256 of output vs published hash) |
| **Benchmark-proof ADR** | [ADR-147](docs/adr/ADR-147-benchmark-proof.md) | how the numbers are produced and verified |
| **Benchmark-proof ADR** | [ADR-168](docs/adr/ADR-168-benchmark-proof.md) | how the numbers are produced and verified |
| **Witness attestation** | [`docs/WITNESS-LOG-028.md`](docs/WITNESS-LOG-028.md) | 33-row capability attestation matrix with per-claim evidence |
```bash
@@ -1081,6 +1081,17 @@ The `wifi-densepose-vitals` crate (ESP32 CSI-grade vital signs) has not yet been
- SONA-based environment adaptation
- VitalSignStore with tiered temporal compression
## Implementation Notes
### 2026-06 — ESP32 edge vitals: person-count over-count + presence flicker (#998, #996)
Two robustness bugs were fixed in the on-device edge path (`firmware/esp32-csi-node/main/edge_processing.c`, the ADR-039 packet `0xC5110002`). These touch the *boolean/count emission logic*, not the underlying CSI signal-processing math, and do **not** constitute a validated-accuracy claim — true occupancy-count and presence accuracy vs labelled ground truth remain hardware/data-gated (COM9 ESP32-S3 + labelled capture).
- **#998 `n_persons` over-count (reported 4 for one person).** `update_multi_person_vitals()` divided the top-K subcarriers into `top_k_count/2` groups and marked *every* group `active`, so one body's multipath always read the full `EDGE_MAX_PERSONS`. Added an energy gate (`EDGE_PERSON_MIN_ENERGY_RATIO`), spatial dedup (`EDGE_PERSON_MIN_SC_SEP`), and a persistence debounce (`EDGE_PERSON_PERSIST_FRAMES`) via two pure functions `count_distinct_persons()` / `person_count_debounce()`.
- **#996 presence flag flicker at ~50 cm.** Single-threshold compare on a noisy `presence_score` chattered at the boundary. Replaced with a Schmitt trigger + clear-debounce (`presence_flag_update()`, constants `EDGE_PRESENCE_HYST_RATIO` / `EDGE_PRESENCE_CLEAR_FRAMES`); `presence_score` is unchanged and still emitted for consumer-side thresholding.
Both are pinned by host-buildable C99 tests in `firmware/esp32-csi-node/test/test_vitals_count_presence.c` (`make run_vitals`). The exact thresholds are documented constants pending on-device calibration against ground truth.
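
The #996 Schmitt trigger with clear-debounce can be sketched as below. This is a Rust illustration of the logic only (the firmware itself is C); the struct, its fields, and the numeric values are hypothetical and are not the firmware's `presence_flag_update()` or its `EDGE_PRESENCE_*` constants.

```rust
/// Hysteresis state for a presence flag (illustrative names and values).
struct PresenceGate {
    on_threshold: f32,  // score at or above this sets the flag
    off_threshold: f32, // score must fall below this to count toward clearing
    clear_frames: u32,  // consecutive low frames required to clear the flag
    low_streak: u32,
    present: bool,
}

impl PresenceGate {
    fn new(on_threshold: f32, hyst_ratio: f32, clear_frames: u32) -> Self {
        Self {
            on_threshold,
            off_threshold: on_threshold * hyst_ratio,
            clear_frames,
            low_streak: 0,
            present: false,
        }
    }

    /// Feed one `presence_score` sample and return the debounced flag.
    fn update(&mut self, score: f32) -> bool {
        if score >= self.on_threshold {
            self.present = true;
            self.low_streak = 0;
        } else if self.present && score < self.off_threshold {
            self.low_streak += 1;
            if self.low_streak >= self.clear_frames {
                self.present = false;
                self.low_streak = 0;
            }
        } else {
            // Inside the hysteresis band (or already clear): hold state, reset the streak.
            self.low_streak = 0;
        }
        self.present
    }
}
```

A score hovering between the two thresholds neither sets nor clears the flag, which removes the chatter a single-threshold compare produces at the boundary.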
## References
- Ramsauer et al. (2020). "Hopfield Networks is All You Need." ICLR 2021. (ModernHopfield formulation)
@@ -5,7 +5,7 @@
| Status | Proposed |
| Date | 2026-03-06 |
| Deciders | ruv |
| Depends on | ADR-012 (ESP32 CSI Mesh), ADR-039 (Edge Intelligence), ADR-040 (WASM Programmable Sensing), ADR-044 (Provisioning Enhancements), ADR-050 (Security Hardening), ADR-051 (Server Decomposition) |
| Depends on | ADR-012 (ESP32 CSI Mesh), ADR-039 (Edge Intelligence), ADR-040 (WASM Programmable Sensing), ADR-044 (Provisioning Enhancements), ADR-166 (Security Hardening, renumbered from ADR-050), ADR-051 (Server Decomposition) |
| Issue | [#177](https://github.com/ruvnet/RuView/issues/177) |
## Context
@@ -211,7 +211,7 @@ pub struct FlashProgress {
// commands/ota.rs
/// Push firmware to a node via HTTP OTA (port 8032).
/// Includes PSK authentication per ADR-050.
/// Includes PSK authentication per ADR-166.
#[tauri::command]
async fn ota_update(
node_ip: String,
@@ -801,7 +801,7 @@ Total estimated effort: ~11 weeks for a single developer.
- ADR-039: ESP32 Edge Intelligence
- ADR-040: WASM Programmable Sensing
- ADR-044: Provisioning Tool Enhancements
- ADR-050: Quality Engineering — Security Hardening
- ADR-166: Quality Engineering — Security Hardening (renumbered from ADR-050)
- ADR-051: Sensing Server Decomposition
- `firmware/esp32-csi-node/` — ESP32 firmware source
- `firmware/esp32-csi-node/provision.py` — Current provisioning script
@@ -1,6 +1,6 @@
# ADR-080: QE Analysis Remediation Plan
- **Status:** Proposed
- **Status:** Proposed — P0 security findings #1–#3 **RESOLVED** on the shipped Rust sensing-server boundary (2026-06-13; closes ADR-164 G11)
- **Date:** 2026-04-06
- **Source:** [QE Analysis Gist (2026-04-05)](https://gist.github.com/proffesor-for-testing/a6b84d7a4e26b7bbef0cf12f932925b7)
- **Full Reports:** [proffesor-for-testing/RuView `qe-reports` branch](https://github.com/proffesor-for-testing/RuView/tree/qe-reports/docs/qe-reports)
@@ -13,25 +13,38 @@ An 8-agent QE swarm analyzed ~305K lines across Rust, Python, C firmware, and Ty
Address the 15 prioritized issues from the QE analysis in three waves: P0 (immediate), P1 (this sprint), P2 (this quarter).
## Security P0 closure note (2026-06-13) — Rust sensing-server boundary
The three P0 security findings below were logged against the **Python v1** API
(`archive/v1/src/…`). ADR-164 G11 re-scoped them to the *shipped* boundary:
`wifi-densepose-sensing-server` (Rust). They were verified against the current
Rust crate and closed on branch `fix/adr-080-sensing-server-security`. Each fix
(or already-fixed finding) is pinned by a test that fails on the old behavior.
**The Python v1 paths remain as-is** — v1 is archived and not the shipped
surface; this closure governs the live Rust server only.
## P0 — Fix Immediately
### 1. Rate Limiter Bypass (Security HIGH)
### 1. Rate Limiter Bypass / XFF spoofing (Security HIGH) — **RESOLVED (verified absent on Rust boundary)**
- **Location:** `archive/v1/src/middleware/rate_limit.py:200-206`
- **Original location (v1):** `archive/v1/src/middleware/rate_limit.py:200-206`
- **Problem:** Trusts `X-Forwarded-For` without validation. Any client bypasses rate limits via header spoofing.
- **Fix:** Validate forwarded headers against trusted proxy list, or use connection IP directly.
- **Rust verification (2026-06-13):** The Rust sensing-server has **no XFF-trusting control to bypass** — there is no IP-based rate-limiter and no IP-allowlist, and neither security middleware reads a forwarded header. `bearer_auth.rs` authenticates on the token alone (`require_bearer` inspects only the `AUTHORIZATION` header); `host_validation.rs` decides on the `Host` header only. A repo-wide grep for `x-forwarded-for|forwarded|peer_addr|client_ip|real-ip` over `wifi-densepose-sensing-server` returns nothing. The only "rate limiter" is the MQTT *sample-rate* gate (`mqtt/state.rs`), a per-entity publish throttle with no IP/header input.
- **Resolution:** No code change needed (no vulnerable surface). Regression tests pin the immunity: `bearer_auth::tests::xff_header_never_affects_auth_decision` (spoofed XFF never flips a 401↔200 decision) and `host_validation::tests::forwarded_headers_never_bypass_host_allowlist` (spoofed `X-Forwarded-Host: localhost` never lets a foreign `Host: evil.com` past the allowlist). Residual: if an IP-based control is ever added, it must derive the peer from the socket (`ConnectInfo<SocketAddr>`) and only honor XFF from an explicit `--trusted-proxy` CIDR — captured as guidance in the test docstrings.
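
The property those regression tests pin (the auth decision depends on the `Authorization` header alone) can be modelled in a few lines. This is a simplified, hypothetical model over a plain header list, not the actual `require_bearer` middleware, and a production comparison would normally be constant-time.

```rust
/// Decide authorization from request headers. Only `Authorization` is consulted;
/// forwarded headers such as `X-Forwarded-For` are deliberately never read.
fn is_authorized(headers: &[(&str, &str)], expected_token: &str) -> bool {
    // An empty configured token never authenticates anything.
    if expected_token.is_empty() {
        return false;
    }
    headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("authorization"))
        .and_then(|(_, value)| value.strip_prefix("Bearer "))
        .map_or(false, |token| token == expected_token)
}
```

Since no forwarded header is an input to the function, a spoofed `X-Forwarded-For` cannot change its result in either direction.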
### 2. Exception Details Leaked in Responses (Security HIGH)
### 2. Exception Details Leaked in Responses (Security HIGH, CWE-209) — **RESOLVED**
- **Location:** `archive/v1/src/api/routers/pose.py:140`, `stream.py:297`, +5 endpoints
- **Problem:** Stack traces visible regardless of environment.
- **Fix:** Wrap with generic error responses in production; log details server-side only.
- **Original location (v1):** `archive/v1/src/api/routers/pose.py:140`, `stream.py:297`, +5 endpoints
- **Problem:** Internal error/stack-trace detail serialized into client responses.
- **Rust finding (2026-06-13):** Six handlers in `wifi-densepose-sensing-server/src/main.rs` serialized the internal error `Display` into the JSON body: `edge_registry_endpoint` returned a panicked `spawn_blocking` `JoinError` (`"task … panicked"`) in a `500` and the raw upstream error in a `503`; `delete_model`/`delete_recording`/`start_recording` returned `std::io::Error` strings (OS detail / path); `calibration_start`/`calibration_stop` returned the `FieldModel` error chain.
- **Fix:** New `src/error_response.rs` module — `internal_error` / `internal_error_json` / `upstream_unavailable` log the full detail **server-side only** (tagged with a correlation id) and return a generic body (`{"error":"internal_error","correlation_id":…}`) with no `panicked`, no file paths, no Debug chain. All six call-sites rewired. Pinned by `error_response::tests::internal_error_body_does_not_leak_detail` (leak-substring guard, verified to fail on the reverted old body) + 4 sibling tests.
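
The pattern behind that fix (log the detail server-side, return only a generic body) can be sketched as follows. The function name, the use of stderr as the log sink, and the hex formatting of the correlation id are assumptions for illustration; this is not the actual `error_response.rs` API.

```rust
use std::fmt::Display;

/// Build the client-facing body for an internal failure. The detail is written to
/// stderr (standing in for the server log) and is never copied into the response.
fn internal_error_body(detail: &dyn Display, correlation_id: u64) -> String {
    eprintln!("[correlation_id={correlation_id:016x}] internal error: {detail}");
    format!("{{\"error\":\"internal_error\",\"correlation_id\":\"{correlation_id:016x}\"}}")
}
```

An operator can match the id in a client report against the server log, while the client never sees paths, panic messages, or error chains.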
### 3. WebSocket JWT in URL (Security HIGH, CWE-598)
### 3. WebSocket JWT in URL (Security HIGH, CWE-598) — **RESOLVED (verified absent on Rust boundary)**
- **Location:** `archive/v1/src/api/routers/stream.py:74`, `archive/v1/src/middleware/auth.py:243`
- **Original location (v1):** `archive/v1/src/api/routers/stream.py:74`, `archive/v1/src/middleware/auth.py:243`
- **Problem:** Tokens in query strings visible in logs/proxies/browser history.
- **Fix:** Use WebSocket subprotocol or first-message auth pattern.
- **Rust verification (2026-06-13):** The Rust sensing-server never reads a token from the URL. `require_bearer` (`bearer_auth.rs`) inspects only the `Authorization` header; the WebSocket handlers (`ws_sensing_handler`/`ws_introspection_handler`/`ws_pose_handler`) take a bare `WebSocketUpgrade` with no `Query` extractor; the single `Query` in the crate (`EdgeRegistryParams`) is a non-secret `refresh` flag.
- **Resolution:** No code change needed (no query-token path exists). Regression test `bearer_auth::tests::query_string_token_is_never_accepted` proves `?token=`/`?access_token=` in the URL never authenticates (stays `401`) while the same token in the header succeeds (`200`) — verified to fail if a query-token path is re-introduced.
### 4. Rust Tests Not in CI
@@ -259,14 +259,75 @@ Validation runs against:
- **ADR-083** (Proposed) — Per-cluster Pi compute hop. Defines the
device class that hosts the sketch bank.
## Pass 2 — randomized rotation + multi-bit (ADR-156 §8, landed 2026-06)
The "Open question" below ("does `BinaryQuantized` need a randomized
rotation pre-pass?") is now **answered with measured numbers** via
ADR-156 §10. Summary:
- **Pass 2 (randomized rotation) is implemented** —
`crates/wifi-densepose-ruvector/src/rotation.rs`: a deterministic
`R = H·D` (Fast Hadamard Transform + seeded ±1 sign flips), `O(d log d)`
/ `O(d)`, norm-preserving, reproducible from a stored `u64` seed. Opt-in
|
||||
via `Sketch::from_embedding_rotated` / `SketchBank::with_rotation`;
|
||||
Pass-1 API and wire format unchanged.
|
||||
- **Measured top-K coverage** (anisotropic planted-cluster fixture,
|
||||
cosine ground truth, dim=128 N=2048 K=8): rotation lifts coverage
|
||||
**36.13% → 46.39%** at the strict `candidate_k = K` bar, and Pass-2
|
||||
reaches the **≥90% acceptance bar at candidate_k = 24 (~3× over-fetch)**.
|
||||
Multi-bit (≤4-bit) reaches 74% at the strict bar. **Honest verdict:
|
||||
neither rotation nor ≤4-bit multi-bit clears the strict-K 90% bar on
|
||||
this distribution; the bar is met via the over-fetch "candidate set"
|
||||
pattern this ADR specifies** (Decision §"the canonical pattern" — sketch
|
||||
picks the candidate set, full precision refines). Full numbers and
|
||||
reproduce commands in ADR-156 §10.
|
||||
- **Pre-existing `SketchBank::topk` bug fixed** — the `n > k` heap path
|
||||
returned the k *farthest* sketches (min-heap mistaken for max-heap);
|
||||
only the `n ≤ k` fast path had test coverage. Fixed + regression-pinned
|
||||
(`topk_heap_path_returns_nearest`,
|
||||
`tight_clusters_give_high_coverage_with_overfetch`). This makes every
|
||||
prior top-K acceptance number in this ADR depend on the fixed path; the
|
||||
≥90% coverage criterion is only meaningful post-fix.
|
||||
|
||||
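
As a rough illustration of the `R = H·D` rotation named in the summary above, the following is a minimal sketch and not the contents of `rotation.rs`. It assumes a power-of-two dimension and derives the ±1 signs from a SplitMix64 stream; that choice of generator, and the names `splitmix64`, `fwht`, and `rotate`, are assumptions made for this example only.

```rust
/// One SplitMix64 step. Used here only to derive reproducible ±1 signs
/// from a stored seed (an assumption of this sketch).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// In-place, unnormalized fast Walsh-Hadamard transform, O(d log d).
/// Panics unless `v.len()` is a power of two.
fn fwht(v: &mut [f32]) {
    let n = v.len();
    assert!(n.is_power_of_two(), "dimension must be a power of two");
    let mut h = 1;
    while h < n {
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Apply R = H·D: seeded sign flips (D), then the Hadamard transform (H),
/// scaled by 1/sqrt(d) so that the rotation preserves the L2 norm.
pub fn rotate(x: &[f32], seed: u64) -> Vec<f32> {
    let mut state = seed;
    let mut out: Vec<f32> = x
        .iter()
        .map(|&xi| if (splitmix64(&mut state) & 1) == 1 { -xi } else { xi })
        .collect();
    fwht(&mut out);
    let scale = 1.0 / (out.len() as f32).sqrt();
    for value in &mut out {
        *value *= scale;
    }
    out
}
```

Because both `D` and the scaled `H` are orthogonal, the squared norm of the output matches the input up to floating-point error, and the same seed always reproduces the same rotation, so only the seed needs to be stored.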
## Pass 2b — RaBitQ unbiased distance estimator (ADR-156 §11, landed 2026-06)

The **real** RaBitQ contribution (Gao & Long, SIGMOD 2024), an
**unbiased estimator of the inner product / distance** from the 1-bit
code + per-vector side info rather than sign bits alone, is now
implemented and **MEASURED against this ADR's ≥90% strict-K bar**:

- **Implemented** in `crates/wifi-densepose-ruvector/src/estimator.rs`:
`EstimatorSketch` (Pass-2 sign code + 8 B/vec side info:
`residual_norm` + `x_dot_o = ⟨x̄, o'⟩`), `DistanceEstimator`
(`⟨o',q'⟩ ≈ ⟨x̄,q'⟩ / x_dot_o`, the paper's unbiased rescale), and
`EstimatorBank`, which reranks candidates by the estimate instead of raw
Hamming. **Zero-centroid simplification** (`c = 0`) documented;
paper-faithful centroid path also built (`with_centroid`). Additive:
Pass-1/Pass-2 and the wire format are unchanged.
- **MEASURED strict-K coverage** (same fixture as §"Pass 2", cosine
ground truth): the estimator lifts coverage at the strict
`candidate_k = K` bar from
**46.39% (Pass-2 sign) → 49.71% (estimator, cosine rerank)**, a real
**+3.3 pp** lift, but **still ~40 pp short of the ≥90% strict bar.**
At over-fetch the estimator does better than sign (95.12% vs 91.60% at
candidate_k = 24). **Honest verdict: the unbiased estimator does NOT
clear the strict-K 90% bar on this distribution.** The binding
constraint is the 1-bit code's information ceiling, not estimator
variance. The ≥90% acceptance bar is still met only via the over-fetch
"candidate set" pattern this ADR's Decision specifies; the estimator
**reduces the over-fetch factor** needed but does not remove it. This
is a **published negative**, reported as such. Full numbers + reproduce
commands in ADR-156 §11.

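
The rescale `⟨o',q'⟩ ≈ ⟨x̄,q'⟩ / x_dot_o` can be sketched as follows under the zero-centroid simplification (`c = 0`). This is an illustrative reading of the formula, not the code in `estimator.rs`: the `EstSketch` type, the `encode` and `estimate_inner_product` functions, and the unpacked `Vec<bool>` sign code are all hypothetical, and the inputs are assumed to be already rotated.

```rust
/// Hypothetical sketch: a 1-bit sign code plus two f32 side-info scalars
/// (8 bytes per vector). Signs are left unpacked for readability.
pub struct EstSketch {
    /// true where the rotated coordinate is >= 0
    pub signs: Vec<bool>,
    /// ||x - c||, with c = 0 in this sketch
    pub residual_norm: f32,
    /// <x̄, o'>, where o' = x / ||x|| and x̄ = sign(o') / sqrt(d)
    pub x_dot_o: f32,
}

/// Encode an already-rotated vector.
pub fn encode(rotated: &[f32]) -> EstSketch {
    let d = rotated.len() as f32;
    let norm = rotated.iter().map(|v| v * v).sum::<f32>().sqrt();
    let inv = if norm > 0.0 { 1.0 / norm } else { 0.0 };
    let signs: Vec<bool> = rotated.iter().map(|&v| v >= 0.0).collect();
    // <x̄, o'> = sum_i |o'_i| / sqrt(d)
    let x_dot_o = rotated.iter().map(|v| v.abs() * inv).sum::<f32>() / d.sqrt();
    EstSketch { signs, residual_norm: norm, x_dot_o }
}

/// Estimate <x, q> from the sketch and an already-rotated query:
/// residual_norm * <x̄, q'> / x_dot_o. Returns 0.0 for a degenerate sketch.
pub fn estimate_inner_product(sketch: &EstSketch, rotated_query: &[f32]) -> f32 {
    assert_eq!(sketch.signs.len(), rotated_query.len());
    if sketch.x_dot_o <= 0.0 {
        return 0.0;
    }
    let d = rotated_query.len() as f32;
    let xbar_dot_q = sketch
        .signs
        .iter()
        .zip(rotated_query)
        .map(|(&positive, &q)| if positive { q } else { -q })
        .sum::<f32>()
        / d.sqrt();
    sketch.residual_norm * xbar_dot_q / sketch.x_dot_o
}
```

One sanity check that holds by construction: querying a sketch with its own vector recovers `||x||²`, because the numerator and `x_dot_o` then cancel. For other queries the estimate is only unbiased in expectation over the random rotation, which is consistent with the measured strict-K numbers above being limited by the 1-bit code rather than by the rescale.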
## Open questions
|
||||
|
||||
- **Does `BinaryQuantized` need a randomized rotation pre-pass for
|
||||
RuView's embedding distributions?** Pure sign quantization assumes
|
||||
zero-centered, isotropic embeddings. If AETHER / spectrogram
|
||||
distributions are skewed (likely for spectrogram), add a
|
||||
`randomized_rotation` pre-pass following the original RaBitQ paper
|
||||
(Gao & Long, SIGMOD 2024). Decided after pass-1 benchmark.
|
||||
RuView's embedding distributions?** **ANSWERED (ADR-156 §10):** rotation
|
||||
is built and measured — it helps (+10pp at strict K) but is not
|
||||
sufficient alone for strict-K 90% on the tested anisotropic
|
||||
distribution; the over-fetch candidate-set pattern meets the bar.
|
||||
Pure sign quantization assumes zero-centered, isotropic embeddings; the
|
||||
rotation decorrelates anisotropic coords as the RaBitQ paper
|
||||
(Gao & Long, SIGMOD 2024) prescribes.
|
||||
- **Sketch dimension target.** Default to the embedding's native
|
||||
dimension (128 for AETHER, 256 for spectrogram). Higher-dimensional
|
||||
sketches (Johnson-Lindenstrauss-projected to 512) trade compute for
|
||||
|
||||
@@ -9,8 +9,10 @@
|
||||
| Relates to | ADR-134, ADR-136, ADR-139, ADR-140, ADR-143, ADR-144, ADR-146, ADR-147 |
|
||||
|
||||
> **Scope note:** ADR-147 deferred Cosmos WFM to "ADR-148" as an offline data generator.
|
||||
> That item is promoted to ADR-149. This ADR takes 148 to address the broader drone swarm
|
||||
> control architecture, which is the first consumer of ADR-147's OccWorld occupancy output.
|
||||
> That item is promoted to ADR-171 (the swarm-benchmarking/evaluation companion to this ADR;
|
||||
> renumbered from ADR-149 to resolve the ADR-149 duplicate-number collision). This ADR takes
|
||||
> 148 to address the broader drone swarm control architecture, which is the first consumer of
|
||||
> ADR-147's OccWorld occupancy output.
|
||||
|
||||
---
|
||||
|
||||
@@ -874,9 +876,9 @@ validated; ITAR/EAR classification completed by export counsel.
|
||||
| GPS spoofing of full swarm simultaneously | Medium | Low | UWB mesh cross-check among all nodes; ≥ 3 nodes must agree on position to confirm |
|
||||
| 1000-UAV scale claims (not validated) | Low | High | SWARM+ demonstrated in simulation only; scale claims capped at 50 for production targets |
|
||||
|
||||
### 12.3 Open Issues (Forward to ADR-149)
|
||||
### 12.3 Open Issues (Forward to ADR-171)
|
||||
|
||||
- Cosmos WFM offline training data generation (deferred from ADR-147) — ADR-149
|
||||
- Cosmos WFM offline training data generation (deferred from ADR-147) — ADR-171
|
||||
- Fixed-wing hybrid platform support (endurance missions) — future ADR
|
||||
- Underwater-aerial cross-domain handoff protocol — future ADR
|
||||
- Quantum-enhanced task assignment (E6) — future ADR when hardware matures
|
||||
@@ -998,4 +1000,4 @@ Implementation tracked at: https://github.com/ruvnet/RuView/issues/861
|
||||
|
||||
*ADR authored with research support from `ruflo-goals:deep-researcher` (2026-05-30).
|
||||
Implementation progress tracked by `ruflo-goals:horizon-tracker`.
|
||||
OccWorld integration basis: ADR-147. Next: ADR-149 (Cosmos WFM offline data generation).*
|
||||
OccWorld integration basis: ADR-147. Next: ADR-171 (Cosmos WFM offline data generation; renumbered from ADR-149).*
|
||||
|
||||
@@ -7,7 +7,7 @@
|
||||
| **Deciders** | ruv |
|
||||
| **Codebase target** | `wifi-densepose-signal` (`ruvsense/`, `features.rs`, `csi_processor.rs`, `spectrogram.rs`, `bvp.rs`), benches, docs |
|
||||
| **Relates to** | ADR-134 (CIR sparse recovery), ADR-135 (Empty-Room Baseline), ADR-029/030/032 (Multistatic mesh + security), ADR-152 (WiFi-Pose SOTA 2026 intake), ADR-153 (802.11bf forward-compat) |
|
||||
| **Scope** | Milestone 0 of the beyond-SOTA signal/DSP sweep: high-leverage **correctness/security fixes**, two **measured** perf wins, the per-module SOTA landscape with evidence grades, and a prioritized roadmap. **45 review findings are explicitly deferred** (§7 backlog) — nothing is silently dropped. |
|
||||
| **Scope** | Milestone 0 of the beyond-SOTA signal/DSP sweep: high-leverage **correctness/security fixes**, two **measured** perf wins, the per-module SOTA landscape with evidence grades, and a prioritized roadmap. **45 review findings were explicitly deferred** (§7 backlog) — **now all addressed across Milestones 0–3** (§7.4 backlog cleared 2026-06-13); nothing was silently dropped. |
|
||||
|
||||
---
|
||||
|
||||
@@ -195,40 +195,46 @@ The §2–§5 fixes are **ACCEPTED and committed**: dead CIR gate fixed, NaN byp
|
||||
- Evaluate the **diffusion CIR prior** (public weights, MEASURED) as an offline quality ceiling — *not* an edge target.
|
||||
- Bayesian multi-AP fusion (2512.02462, CLAIMED) — comparison only, pending released code.
|
||||
|
||||
### 7.4 Deferred Milestone-0 review findings (the ~45 not fixed here — explicit backlog)
|
||||
### 7.4 Deferred Milestone-0 review findings (explicit backlog)
|
||||
|
||||
Catalogued so nothing is silently dropped. Priority: **P1** correctness-adjacent, **P2** perf, **P3** clarity/style.
|
||||
|
||||
**Milestone-1 update (2026-06-13):** the **four P1 backlog items** (#1, #9, #10, #13) are now cleared — #1 and #10 **RESOLVED (MEASURED)**, #9 and #13 **RESOLVED-PARTIAL (DATA-GATED:** de-magicked + boundary-tested, operating values unchanged**)**. Each fix is pinned by a regression test that fails on the old behaviour (commits `fd32f094a`, `4a9f2bcf4`, `d672fa602`, `5193f6369`); workspace `--no-default-features` green, Python proof unchanged (bit-exact).
|
||||
|
||||
**Milestone-2 update (2026-06-13):** the **bench-first P2 perf subset** (#5, #6, #7, #8, #20) and the **three missing boundary tests** (#14, #16, #19) are now cleared — ~36 P2/P3 items remained deferred *(now cleared — see the Milestone-3 update)*. PROOF discipline (§0): every perf item was **benched before being touched** — committed in `benches/dsp_perf_bench.rs` (criterion, this Windows box). Only **#20** proved hot and was optimized; **#5/#6/#7** are committed **MEASURED-NULLs** (benched, not hot, left as-is for clarity — exactly the §5.1 "already amortized" pattern); **#8** is **MEASUREMENT-ONLY** but its `eigenvalue`/BLAS backend won't build on this Windows host, so its µs cost must come from a Linux/BLAS box (recorded, not fabricated). Commits `e839fa8f1` (#20 fix), `02e5dd13a` (#14/#16/#19 tests), `aad9464f0` (benches). Workspace `--no-default-features` green; Python proof unchanged (#20 is bit-identical, off the proof path).
|
||||
|
||||
**Milestone-3 update (2026-06-13):** the lumped **row #21–45** P3 backlog — *"remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs`"* — is now **cleared, and with it the residual P3 items #2/#12/#17/#18.** Honest enumeration first (`grep`, not the ADR's "21–45" estimate — that was a count, not 25 distinct findings): after M0–M2 the genuinely-bare in-function literals resolved to **22 de-magicked constants across 11 modules** (each → a named, documented **EMPIRICAL-DEFAULT** const that **equals the prior literal exactly**), **6 added boundary/characterization tests**, **~4 doc-only fixes** (no-behaviour-change), and **a handful of agent-flagged "findings" that were NOT real** and are reported as skipped (below). **No operating value or behaviour changed** — every module carries a `*_consts_unchanged_from_literals` pin test and every boundary test pins *current* behaviour, so a future retune is a visible, tested change. Resolution by module: `motion.rs` (**#18** — fusion weights / Doppler+variance+phase scales / confidence weights / adaptive-threshold clamp; 5 tests), `gesture.rs` (**#12** — `euclidean_distance` length-mismatch `debug_assert` documenting the silent-`zip`-truncation caller contract, behaviour-preserving in release; + confidence epsilon; + DTW n=0/m=0 boundary), `longitudinal.rs` (7-day/2σ/3-day/7-day drift thresholds + EMA-α + cosine epsilon; day-6/7 + zero-vector boundaries; the duplicated `>=7` deduped), `cross_room.rs`/`multiband.rs`/`intention.rs`/`hampel.rs` (**#17** — division-guard epsilons `1e-9`/`1e-12`/`1e-10`/`1e-15` + zero-norm/zero-variance/zero-MAD boundaries + the previously-untested `hampel half_window==0` error path + `# Errors` doc), `rf_slam.rs` (`NS_PER_DAY` + `MIGRATION_MIN_SPAN_DAYS` + fixed-map defaults; single-sighting zero-span guard), `attractor_drift.rs` (`METRIC_BUFFER_CAPACITY`/`STABLE_CENTER_WINDOW`; **documented** the implicit `recent.len()>=1` divide-safety; 
`min_observations` off-by-one boundary), `coherence.rs` (**#9 completion** — the residual bare `1e-6` variance-floor ×4 + default `0.95` decay; floor-effect test), `calibration.rs` (**#2 completion** — `DEFAULT_MIN_FRAMES` deduped across all 4 tier constructors + `AMP_STD_FLOOR`/`MOTION_AMP_Z_THRESHOLD`/`MOTION_PHASE_DRIFT_THRESHOLD`/`SUBTRACT_MIN_NORM`), `fusion_quality.rs` (`CONTRADICTION_PENALTY` 0.8 / bound-halfwidth 0.1; n=0 identity boundary), `temporal_gesture.rs` (confidence epsilon + L2-norm quantization scale). **NOT-REAL / skipped (reported honestly, no churn manufactured):** an agent-flagged `attractor_drift.rs:301` "divide-by-zero" is **unreachable** — the `count < min_observations` guard guarantees `recent.len()>=1` before the `PointAttractor` branch (documented + boundary-tested, **not** guarded, per the no-behaviour-change rule); agent-flagged `gesture.rs` `2.0`/`π·6` motion thresholds **do not exist** in that file (a confusion with `calibration.rs::deviation`); **`features.rs` was deliberately left untouched** (it is on the deterministic Python-proof PSD/Doppler path — its `1e-10` guards already exist and are already correct; doc-only-skipped to protect the bit-exact hash). Commits `c794d1a0c` (motion #18), `adf9ed8e4` (gesture #12), `19f5b6335` (longitudinal), `19e0373c8` (epsilon helpers #17), `c6a09b69a` (rf_slam + attractor_drift), `5a1839f33` (coherence #9 completion), `df25a303e` (calibration #2 completion), `0f931ff2f` (fusion_quality + temporal_gesture). Signal crate lib `--no-default-features` **476 passed / 0 failed / 1 ignored**; `--no-default-features --features cir` **476 / 0**; workspace `--no-default-features` **3,275 / 0 failed** (single clean run); Python proof **VERDICT: PASS**, hash `f8e76f21…46f7a` **UNCHANGED (bit-exact)**. **§7.4 backlog is now fully cleared — ADR-154's deferred findings are addressed across M0–M3 with nothing silently dropped.**
|
||||
|
||||
| # | Module | Finding | Pri | Why deferred |
|
||||
|---|--------|---------|-----|--------------|
|
||||
| 1 | cir.rs ~937 | `phase_variance` uses **linear** variance on **wrapped** angles (doc says "variance of phase angles") — spuriously inflates near ±π | P1 | Used as the `> TAU` ghost-tap *guard*; a correct circular variance is bounded [0,1] and would need the threshold re-derived. Semantic change — defer with a real recalibration, don't risk a silent gate regression in a perf/correctness pass. |
|
||||
| 2 | calibration.rs ~311 | `subtract_in_place` had a vacuous `if active_input {ki} else {ki}` branch implying a full-FFT→bin remap that didn't exist | P3 | **Resolved here** (branch removed, sequential-convention documented to match the sibling `extract_first_stream`). Listed for visibility — behavior unchanged. |
|
||||
| 1 | cir.rs ~937 | `phase_variance` uses **linear** variance on **wrapped** angles (doc says "variance of phase angles") — spuriously inflates near ±π | P1 | **RESOLVED (`fd32f094a`) — metric MEASURED, threshold DATA-GATED.** Replaced with Mardia's circular variance V = 1 − R̄ ∈ **[0,1]**, invariant to the cluster's position on the circle (branch-cut artefact gone). Guard re-derived against the bounded metric via named const `GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99` (fires only when R̄ ≤ 0.01 — essentially uniform phase). The **threshold value is DATA-GATED**: a clean single-path ramp also sweeps the circle, so V alone can't separate clean from unsanitized without labelled frames — the default is deliberately conservative (strictly more permissive at the wrap boundary than the buggy linear guard). Fails-on-old: `phase_variance_circular_not_fooled_by_branch_cut` (old linear variance > TAU on wrap-straddling phases while circular V≈0, guard no longer trips), `phase_variance_circular_is_bounded_and_extremal`. |
|
||||
| 2 | calibration.rs ~311 | `subtract_in_place` had a vacuous `if active_input {ki} else {ki}` branch implying a full-FFT→bin remap that didn't exist | P3 | **Resolved (M0 + M3 `df25a303e`).** Branch removed in M0 (sequential-convention documented). M3 completed the de-magic: `DEFAULT_MIN_FRAMES=600` deduped across all four tier constructors, plus `AMP_STD_FLOOR`/`MOTION_AMP_Z_THRESHOLD`/`MOTION_PHASE_DRIFT_THRESHOLD`/`SUBTRACT_MIN_NORM` named + `calibration_consts_unchanged_from_literals`. Behaviour unchanged. |
|
||||
| 3 | spectrogram.rs / bvp.rs | FFT planner built once-per-call (already amortized across frames) | P2 | Marginal vs the per-frame PSD site; cache if these become hot. |
|
||||
| 4 | features.rs ~347 | Doppler FFT planner planned once per call, reused across subcarriers | P2 | Already amortized within the call. |
|
||||
| 5 | multistatic.rs | `node_attention_weights` recomputes consensus/softmax each call; no SIMD | P2 | Needs a bench before touching; not obviously hot. |
|
||||
| 6 | tomography.rs | ISTA L1 solver re-allocates voxel buffers per solve | P2 | Bench first. |
|
||||
| 7 | pose_tracker.rs | Kalman gain matrices reallocated per update | P2 | Bench first. |
|
||||
| 8 | field_model.rs | SVD recomputed on every perturbation extract | P2 | Incremental SVD is a real project, not a micro-fix. |
|
||||
| 9 | coherence.rs / coherence_gate.rs | Z-score thresholds are magic constants, untested at boundaries | P1 | Needs labelled data to set defensible thresholds. |
|
||||
| 10 | longitudinal.rs | Welford update not numerically guarded for n=0 | P1 | Add `n>=1` guard + test (same family as §4). |
|
||||
| 5 | multistatic.rs | `node_attention_weights` recomputes consensus/softmax each call; no SIMD | P2 | **MEASURED-NULL (`aad9464f0`) — benched, not hot, left as-is.** `multistatic_attention/weights`: **181 ns** (2 nodes) … **848 ns** (8 nodes) @ 56 subcarriers — sub-µs, no hot-path allocation. A precompute/SIMD rewrite buys nothing measurable at the realistic 2–8 node fan-in; the cosine/softmax cost is dwarfed by the surrounding fusion + per-frame FFT. Bench `multistatic_attention` in `dsp_perf_bench.rs`. |
|
||||
| 6 | tomography.rs | ISTA L1 solver re-allocates voxel buffers per solve | P2 | **MEASURED-NULL (`aad9464f0`) — benched, not hot, left as-is.** A full 50-iteration `reconstruct` (256 voxels): **47.5 µs** (16 links) / **60.4 µs** (32 links). The two voxel buffers (`x`, `gradient`; ~4 KB) are already allocated *once* per `reconstruct()` and `.fill`-reused across iterations — the per-solve alloc is a negligible fraction of the O(iters·links·voxels) inner product. Reusing scratch across *calls* would force `reconstruct(&self)`→`&mut self` (API break) for no measurable gain. Bench `tomography_reconstruct`. |
|
||||
| 7 | pose_tracker.rs | Kalman gain matrices reallocated per update | P2 | **MEASURED-NULL (`aad9464f0`) — benched, not hot, left as-is.** A Kalman predict+update cycle: **150 ns** (17 keypoints) / **2.82 µs** (170). The "gain matrices" (`s:[f32;3]`, `k:[[f32;3];6]`) are fixed-size **stack** arrays, *not* heap — there is no per-update allocation to reuse; the compiler keeps them in registers/stack. Bench `pose_kalman_update`. |
|
||||
| 8 | field_model.rs | SVD recomputed on every perturbation extract | P2 | **MEASUREMENT-ONLY (`aad9464f0`) — BLAS-gated, not measurable on this host.** Correction: `extract_perturbation` does **not** recompute the SVD — it projects against the cached `modes` from `finalize_calibration`. The real per-call eigendecomposition is in the `eigenvalue`-feature `estimate_occupancy` (`cov.eigh()` on a 56×56 covariance, an O(n³)≈175k-flop symmetric eigensolve + O(n²·frames) covariance build, run per call). The bench (`dsp_perf_bench`'s `eig` module) is committed, but `openblas-src` **fails to build on this Windows box** ("Non-vcpkg builds are not supported on Windows" — the very reason the project gate runs `--no-default-features`), so a measured µs number must come from a Linux/BLAS host; **not estimated/fabricated here.** Incremental SVD remains a sized future project, not a micro-fix. |
|
||||
| 9 | coherence.rs / coherence_gate.rs | Z-score thresholds are magic constants, untested at boundaries | P1 | **RESOLVED-PARTIAL (`5193f6369`) — DATA-GATED.** De-magicked `classify_drift` (`DRIFT_STABLE_SCORE=0.85`, `DRIFT_STEP_CHANGE_MAX_STALE=10`) and the `coherence_gate.rs` defaults (`DEFAULT_ACCEPT_THRESHOLD`/`…REJECT…`/`…MAX_STALE_FRAMES`/`…PREDICT_ONLY_NOISE`) into named, documented consts marked EMPIRICAL DEFAULT; added at/just-below/just-above boundary tests (`classify_drift_*_boundary`) + `*_consts_unchanged_from_literals`. **Operating values explicitly NOT changed** — defensible values still need labelled stable/drifting traces. The gate already exposed these via `GatePolicyConfig` (config seam). |
|
||||
| 10 | longitudinal.rs | Welford update not numerically guarded for n=0 | P1 | **RESOLVED (`4a9f2bcf4`) — MEASURED.** The shared `WelfordStats` (`field_model.rs`, consumed by longitudinal.rs) `count < 2` guards already prevent the n=0 NaN / n=1 div0 / `(count−1)` underflow, but the boundary was untested. Added `welford_finite_at_n0_and_n1` (finite + documented 0.0 sentinel at n=0/n=1). Fails-on-old proof: removing the `sample_variance` guard makes the test panic with "attempt to subtract with overflow" at the `(count − 1)` underflow. |
|
||||
| 11 | cross_room.rs | Fingerprint hash collisions unhandled | P2 | Low collision prob; needs design. |
|
||||
| 12 | gesture.rs | `euclidean_distance` no length-mismatch guard | P3 | Caller-enforced; add `debug_assert`. |
|
||||
| 13 | adversarial.rs | Gini/consistency thresholds are magic constants | P1 | Same labelled-data dependency as #9. |
|
||||
| 14 | cir.rs | `fft_operator` path changes the witness hash (documented) — no test that it's *numerically close* to dense | P2 | Add a tolerance test. |
|
||||
| 12 | gesture.rs | `euclidean_distance` no length-mismatch guard | P3 | **RESOLVED (M3 `adf9ed8e4`).** Added a `debug_assert_eq!` on the two slice lengths + a doc block stating the same-`feature_dim` caller contract and that `zip()` silently truncates on a mismatch. Behaviour-preserving (no-op in release, the operating path). Also de-magicked the confidence `1e-10` epsilon and pinned the DTW `n=0`/`m=0` boundary (`dtw_empty_sequence_is_infinite`). |
|
||||
| 13 | adversarial.rs | Gini/consistency thresholds are magic constants | P1 | **RESOLVED-PARTIAL (`d672fa602`) — DATA-GATED.** Lifted the bare literals in `check`/`check_consistency` (`FIELD_MODEL_GINI_VIOLATION=0.8`, `ENERGY_RATIO_HIGH_VIOLATION=2.0`, `ENERGY_RATIO_LOW_VIOLATION=0.1`, `CONSISTENCY_ACTIVE_FRACTION_OF_MEAN=0.1`, `SCORE_W_*`) into named, documented consts marked EMPIRICAL DEFAULT; added at/just-below/just-above boundary tests (`energy_ratio_high_boundary`, `energy_ratio_low_boundary`, `field_model_gini_boundary`, `consistency_active_fraction_boundary`) + `tuning_consts_unchanged_from_literals`. **Operating values explicitly NOT changed** — defensible values still need labelled spoofed/clean CSI (Wi-Spoof, §6.2/§7.3). Bumping a const fails a boundary test (verified). |
|
||||
| 14 | cir.rs | `fft_operator` path changes the witness hash (documented) — no test that it's *numerically close* to dense | P2 | **RESOLVED (`02e5dd13a`) — tolerance test added.** `fft_operator_within_tolerance_of_dense_canonical56` pins the **full `Cir` output** of the FFT path within a *documented* relative tolerance of the dense path on the production **canonical-56** config across τ ∈ {20,50,90} ns: every tap within `1e-2·|dominant|`, identical `dominant_tap_idx`, `active_tap_count`, `ranging_valid`, `dominant_tap_ratio` within `1e-2`, `rms_delay_spread` within `1e-2` rel. A regression that lets the FFT path drift (scaling/Φ-column bug) now fails here instead of silently corrupting a downstream witness. Extends the existing HT20/single-τ `fft_estimate_matches_dense_dominant_tap`. |
|
||||
| 15 | multistatic.rs | `cir_gate_coherence` only estimates the **first** node/channel; multi-node CIR consensus unused | P2 | Design item (which node's CIR is authoritative?). |
|
||||
| 16 | phase_align.rs | Iterative LO offset estimation has no convergence cap test | P2 | Add iteration-cap test. |
|
||||
| 17 | hampel.rs | Window edge handling at series boundaries | P3 | Cosmetic. |
|
||||
| 18 | motion.rs | Threshold constants undocumented | P3 | Doc-only. |
|
||||
| 19 | csi_ratio.rs | Division guard relies on `1e-12` epsilon; no test | P2 | Add boundary test. |
|
||||
| 20 | spectrogram.rs | `compute_multi_subcarrier_spectrogram` re-plans per subcarrier via `compute_spectrogram` | P2 | Hoist the planner (relates to #3). |
|
||||
| 21–45 | (assorted) | Remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs` | P3 | Bulk-addressable in a dedicated "test-the-boundaries + de-magic-constant" follow-up; not high-leverage individually. |
|
||||
| 16 | phase_align.rs | Iterative LO offset estimation has no convergence cap test | P2 | **RESOLVED (`02e5dd13a`) — cap test added.** `refinement_terminates_at_iteration_cap_when_not_converging` forces non-convergence (`tolerance = 0.0`, unreachable since `max_update ≥ 0`) and asserts the loop runs **exactly `max_iterations`** then returns — proving the cap (not convergence) bounds the loop, so a non-converging input can never spin forever. Companion `refinement_converges_before_cap_on_easy_input` proves the cap is an upper bound, not the only exit. Internal-only refactor: `estimate_phase_offsets` still returns the identical offset vector; a `…_counted` core surfaces the iteration count for the test. |
|
||||
| 17 | hampel.rs | Window edge handling at series boundaries | P3 | **RESOLVED (M3 `19e0373c8`).** De-magicked the zero-MAD `1e-15` epsilon (`ZERO_MAD_EPSILON`), documented `hampel_filter`'s `# Errors`, and added the previously-untested `half_window == 0` error-path boundary (`test_zero_half_window_error`) + a zero-MAD constant-window characterization (`test_zero_mad_constant_window`). Window-edge handling itself is correct (`saturating_sub`/`.min(n)`); it is now pinned. |
|
||||
| 18 | motion.rs | Threshold constants undocumented | P3 | **RESOLVED (M3 `c794d1a0c`).** Lifted the fusion weights, Doppler/variance/phase full-scale divisors, confidence-indicator weights, and adaptive-threshold clamp into named, documented EMPIRICAL-DEFAULT consts (`motion_tuning_consts_unchanged_from_literals` pins them) + small-`n` boundary tests (correlation `n<2`, temporal-variance `len<2`, adaptive-threshold history 9-vs-10, Doppler full-scale saturation). Doc-only-plus: values unchanged. |
|
||||
| 19 | csi_ratio.rs | Division guard relies on `1e-12` epsilon; no test | P2 | **RESOLVED (`02e5dd13a`) — boundary test added.** Finding clarification: `csi_ratio.rs` implements the CSI *ratio model* as the **conjugate product** `H_i·conj(H_j)` (SpotFi/IndoTrack) — there is **no division**, hence no literal `1e-12` epsilon; the classic `H_i/H_j` ratio (which a `1e-12` guard protects) is deliberately avoided. `ratio_finite_at_and_below_1e_12_epsilon` pins the property the finding cares about: at and below the `1e-12` target magnitude (and at exact zero — where a division ratio is ±inf/NaN) the conjugate-product output is **finite**, exactly the conjugate product (bit-exact), collapses toward zero (the physically correct "no path" answer), and stays finite through `ratio_to_amplitude_phase`. |
|
||||
| 20 | spectrogram.rs | `compute_multi_subcarrier_spectrogram` re-plans per subcarrier via `compute_spectrogram` | P2 | **MEASURED-HOT (`e839fa8f1`) — optimized, bit-identical.** Hoisted the FFT plan + window out of the per-subcarrier loop (new `compute_spectrogram_with_plan` core). **56-subcarrier** multi-spectrogram: **467.88 µs → 254.75 µs = 1.84×** (window 128); **627.27 µs → 448.39 µs = 1.40×** (window 256). The removed cost is the per-subcarrier `FftPlanner` re-plan (~1.86 µs/plan @ w128 × 56). Bit-identical (`multi_subcarrier_hoisted_plan_bit_identical`, `f64::to_bits` across all 4 windows × {power,magnitude}). The most likely real win predicted by the §7.4 intro — confirmed. (Relates to #3, which stays deferred: `spectrogram.rs`/`bvp.rs` single-signal callers already plan once-per-call.) |
|
||||
| 21–45 | (assorted) | Remaining clarity/doc/magic-constant/missing-boundary-test findings across `ruvsense/*`, `features.rs`, `motion.rs` | P3 | **RESOLVED (Milestone-3, 2026-06-13).** Enumerated honestly (the "21–45" was an estimate, not 25 distinct findings): **22 bare in-function literals de-magicked → named EMPIRICAL-DEFAULT consts (each == prior literal, pinned)**, **6 boundary/characterization tests added**, **~4 doc-only fixes**, across 11 modules (`motion`, `gesture`, `longitudinal`, `cross_room`, `multiband`, `intention`, `hampel`, `rf_slam`, `attractor_drift`, `coherence`, `calibration`, `fusion_quality`, `temporal_gesture`). **No operating value changed.** **Skipped-as-not-real (reported, no churn):** `attractor_drift.rs:301` "divide-by-zero" is unreachable (guarded by `count < min_observations`) → documented + boundary-tested, not guarded; agent-flagged `gesture.rs` `2.0`/`π·6` motion thresholds don't exist there (confusion with `calibration::deviation`); **`features.rs` left untouched** (on the deterministic Python-proof path; its `1e-10` guards already exist & are correct — doc-only-skipped to keep the `f8e76f21…` hash bit-exact). See the Milestone-3 update note above and the per-row #2/#12/#17/#18 entries. |
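
Row #1 in the table above turns on the difference between linear variance and Mardia's circular variance `V = 1 − R̄` on wrapped angles. A minimal sketch of that difference, assuming phases in radians; the function names are hypothetical and the empty-input sentinel of `0.0` is an assumption of this example, not behaviour taken from `cir.rs`:

```rust
/// Mardia circular variance V = 1 - R̄, where R̄ is the mean resultant length
/// of the unit vectors at the given angles. Bounded in [0, 1] and invariant
/// to where the cluster sits on the circle.
pub fn circular_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0; // assumed sentinel for this sketch
    }
    let n = phases.len() as f64;
    let (sin_sum, cos_sum) = phases
        .iter()
        .fold((0.0_f64, 0.0_f64), |(s, c), &p| (s + p.sin(), c + p.cos()));
    let r_bar = (sin_sum * sin_sum + cos_sum * cos_sum).sqrt() / n;
    (1.0 - r_bar).clamp(0.0, 1.0)
}

/// Plain population variance of the raw angle values. On wrapped angles this
/// inflates when a tight cluster straddles the ±π branch cut.
pub fn linear_variance(phases: &[f64]) -> f64 {
    if phases.is_empty() {
        return 0.0;
    }
    let n = phases.len() as f64;
    let mean = phases.iter().sum::<f64>() / n;
    phases.iter().map(|p| (p - mean) * (p - mean)).sum::<f64>() / n
}
```

For a tight cluster split across the branch cut, such as angles within 0.02 rad of ±π, the linear variance lands near π² while the circular variance stays close to zero; for four angles spread evenly around the circle the circular variance approaches 1, the regime the `GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99` guard is described as targeting.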
|
||||
|
||||
> **Horizon-ledger one-liner.** Milestone-0 DONE: dead CIR gate (FIXED+proved), NaN/inf adversarial bypass (FIXED+proved), divide-by-(n−1) window trio (FIXED+proved), calibration dead-branch (FIXED), PSD FFT-planner cache (MEASURED), DTW band (MEASURED). DEFERRED to follow-up: the ~45 findings in §7.4 (P1: phase_variance circular bug #1, Welford guard #10, threshold magic-constants #9/#13; P2/P3: the rest) — none silently dropped.
|
||||
> **Horizon-ledger one-liner.** Milestone-0 DONE: dead CIR gate (FIXED+proved), NaN/inf adversarial bypass (FIXED+proved), divide-by-(n−1) window trio (FIXED+proved), calibration dead-branch (FIXED), PSD FFT-planner cache (MEASURED), DTW band (MEASURED). **Milestone-1 DONE (2026-06-13): all four P1 backlog items cleared — circular phase variance #1 (RESOLVED/MEASURED metric, DATA-GATED threshold), Welford n=0 guard #10 (RESOLVED/MEASURED), threshold magic-constants #9 & #13 (RESOLVED-PARTIAL/DATA-GATED — de-magicked + boundary-tested, values unchanged).** **Milestone-2 DONE (2026-06-13): bench-first P2 perf subset + missing boundary tests cleared — spectrogram per-subcarrier FFT re-plan #20 (MEASURED-HOT, 1.40–1.84×, bit-identical); attention/tomography/Kalman #5/#6/#7 (MEASURED-NULL — benched, not hot, left as-is); field_model eigendecompose #8 (MEASUREMENT-ONLY, BLAS un-buildable on this Windows host, number deferred to a BLAS box, NOT fabricated); fft_operator tolerance #14, phase-align convergence-cap #16, csi-ratio epsilon #19 (RESOLVED, tests added).** **Milestone-3 DONE (2026-06-13): the lumped §7.4 row #21–45 P3 backlog cleared, and with it residual P3 items #2/#12/#17/#18 — 22 magic constants de-magicked into named EMPIRICAL-DEFAULT consts (each pinned == prior literal) + 6 boundary/characterization tests across 11 modules; ~4 doc-only; not-real findings (unreachable attractor_drift div0, non-existent gesture thresholds, proof-path features.rs) reported + skipped, no churn; no operating value changed; workspace 3,275/0, Python proof bit-exact `f8e76f21…`.** **§7.4 deferred backlog is now FULLY CLEARED across M0–M3 — nothing silently dropped.**
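
The Welford boundary behaviour summarised for #10 can be sketched with a small accumulator. This is an assumption-laden illustration rather than the project's `WelfordStats`: the `Welford` type and its field names are hypothetical, and only the `count < 2` guard with a `0.0` sentinel follows the description above.

```rust
/// Hypothetical running mean/variance accumulator (Welford's algorithm).
#[derive(Default)]
pub struct Welford {
    count: u64,
    mean: f64,
    m2: f64,
}

impl Welford {
    /// Fold one observation into the running statistics.
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Unbiased sample variance. Returns the 0.0 sentinel for n = 0 and n = 1,
    /// which also keeps the unsigned `count - 1` from underflowing.
    pub fn sample_variance(&self) -> f64 {
        if self.count < 2 {
            return 0.0;
        }
        self.m2 / (self.count - 1) as f64
    }
}
```

Without the guard, `count - 1` on a `u64` of zero would underflow (a panic in debug builds), which matches the failure mode the regression test is said to reproduce when the guard is removed.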
|
||||
|
||||
---
|
||||
|
||||
## 8. Consequences
|
||||
|
||||
- **Positive:** the ADR-134 CIR gate is alive for the first time in production; the adversarial detector can no longer be NaN-bypassed; three latent divide-by-zero NaN sources are gone; the per-frame PSD path and gesture DTW are measurably faster with bit-identical output; the SOTA landscape and a concrete LISTA-for-CIR roadmap are graded and recorded.
|
||||
- **Negative / honest limits:** `canonical56()` models the canonical grid as a contiguous 56-tone band — a reasonable physical interpretation of a *resampled* grid, but not a literal hardware tone map; the CIR gate still uses only the first node's CIR (#15); the `phase_variance` circular bug (#1) remains until it can be re-thresholded with data.
|
||||
- **Negative / honest limits:** `canonical56()` models the canonical grid as a contiguous 56-tone band — a reasonable physical interpretation of a *resampled* grid, but not a literal hardware tone map; the CIR gate still uses only the first node's CIR (#15). The `phase_variance` **metric** is now correct (Mardia circular variance, Milestone-1 #1), so the branch-cut false-trip is gone — but its ghost-tap **threshold** (`GHOST_TAP_CIRCULAR_VARIANCE_MAX = 0.99`) is a conservative DATA-GATED default, not a calibrated operating point, and still awaits labelled sanitized/unsanitized frames to tune. Likewise the de-magicked coherence/adversarial thresholds (#9/#13) keep their pre-existing empirical values pending labelled calibration.
|
||||
- **Neutral:** no public API removed; `with_cir_ht20()` kept (warned); files stay scoped; new bench is additive.
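The circular-variance metric named in the honest-limits bullet can be illustrated with a minimal sketch. It assumes the textbook Mardia definition, `V = 1 − |mean(e^{iθ})|`; the function name and signature are hypothetical and are not the crate's actual `phase_variance` API.

```rust
/// Mardia circular variance of a set of phases (radians).
/// Returns `None` for an empty slice instead of dividing by zero.
/// Hypothetical helper for illustration, not the crate's real API.
pub fn circular_variance(phases: &[f64]) -> Option<f64> {
    if phases.is_empty() {
        return None;
    }
    let n = phases.len() as f64;
    // Sum the unit phasors component-wise.
    let (sin_sum, cos_sum) = phases
        .iter()
        .fold((0.0_f64, 0.0_f64), |(s, c), &p| (s + p.sin(), c + p.cos()));
    // Mean resultant length R lies in [0, 1]; variance is 1 - R.
    let r = (sin_sum / n).hypot(cos_sum / n);
    Some((1.0 - r).clamp(0.0, 1.0))
}
```

Two phases just either side of the ±π branch cut are almost identical on the circle, so this metric reports a variance near 0, whereas a linear variance over the raw angles would report roughly π². That is the false-trip the bullet describes as gone.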
|
||||
|
||||
@@ -187,11 +187,66 @@ The gap review surfaced ~60 findings; this milestone scoped to the provable inte
|
||||
- **GraphPose-Fi graph decoder** — build the §5 top candidate (ACCEPTED-future, not built).
|
||||
- **ONNX INT4** quantization; **CSI-JEPA vs MAE** A/B; the rest of the §5 roadmap.
|
||||
- **ONNX read-lock concurrency win** — blocked on an `ort` release exposing `&self` `Session::run` (§4.2); harness already committed.
|
||||
- **native-conv naive-loop** perf rewrite (§4).
|
||||
- **`rf_encoder.rs` `assert_eq!`-on-checkpoint** and any other **tch-gated** panic-on-input sites — require a libtorch host to compile/verify (`model.rs` `amp_fc1` unbounded alloc is *indirectly* guarded by the new `config.validate()` upper bounds, but a direct guard + test is deferred).
|
||||
- **`sensing-server/training_api.rs` PCK** — unify the live-server torso-height PCK with `pck_canonical` (crosses the service + tch boundary).
|
||||
- **`test_metrics.rs` reference kernels** — the integration test's local `compute_pck`/`compute_oks` are independent reference impls (not production); fold them onto the canonical definition.
|
||||
- The remaining ~40 lower-severity review findings (style, micro-opt, doc) from the NN/training gap review.
|
||||
- ~~**native-conv naive-loop** perf rewrite (§4).~~ — **RESOLVED in Milestone-2 (see §8.2): bench-first → MEASURED-INCONCLUSIVE, no perf change shipped.**
|
||||
- ~~**`rf_encoder.rs` `assert_eq!`-on-checkpoint**~~ — **RESOLVED in Milestone-2 (see §8.2): a pure-Rust fallible `LinearHead::try_new` guard was added.** Any genuine **tch-gated** panic-on-input sites remain deferred — they require a libtorch host to compile/verify (`model.rs` `amp_fc1` unbounded alloc is *indirectly* guarded by the new `config.validate()` upper bounds, but a direct guard + test is deferred).
|
||||
- ~~**`sensing-server/training_api.rs` PCK**~~ — **RESOLVED in Milestone-1b (see §8.1, Goal C).** Relabelled (not unified) — and the audit found the *real* live divergence is in `trainer.rs`, not the orphaned `training_api.rs`.
|
||||
- ~~**`test_metrics.rs` reference kernels**~~ — **RESOLVED in Milestone-1b (see §8.1, Goal B).** Canonical core hoisted to an un-gated module; the integration test now validates the production functions against hand-computed fixtures + a differential cross-check.
|
||||
- **`metrics.rs` `compute_pck_v2`/`compute_oks_v2`/`MetricsAccumulatorV2`/`evaluate_dataset_v2`/`hungarian_assignment_v2`** — confirmed to have **zero external callers** (only `evaluate_dataset_v2`→`MetricsAccumulatorV2` internally). They are already `#[deprecated]` and route through canonical, so they are not a *divergent-definition* risk, only dead weight. Left in place this pass (public API in a tch-gated module; deleting needs a deprecation-cycle + tch host to verify) — flagged here for a future cleanup, NOT deleted silently.
|
||||
- **`sensing-server/trainer.rs` `pck_at_threshold` (raw) + `oks_map(area=1.0)` and the `training_bench.rs` raw kernel** — relabelled in Milestone-1b (§8.1); true unification onto `pck_canonical`/`oks_canonical` (needs a torso scale + the train crate as a sensing-server dep) remains deferred.
|
||||
- ~~The remaining ~40 lower-severity review findings (style, micro-opt, doc).~~ — **RESOLVED in Milestone-2 (§8.2): the host-verifiable subset is cleared.** The "~40" was an estimate; the actual host-verifiable (non-tch) train/nn surface is smaller. Enumerated resolution below.
|
||||
|
||||
### 8.2 Milestone-2 — host-verifiable §8 P3 backlog clearance — RESOLVED
|
||||
|
||||
Mirroring the ADR-154 M3 cleanup discipline, M2 closed the **host-verifiable (non-tch) subset** of the §8 backlog in `wifi-densepose-train` (+ the pure-Rust `rf_encoder.rs`/`densepose.rs` in `wifi-densepose-nn` that the §3/§4 items named). Everything behind `#[cfg(feature = "tch-backend")]` (`metrics.rs`, `model.rs`, `losses.rs`, `proof.rs`, `trainer.rs`, `wiflow_std/{layers,model}.rs`) is **out of host-verifiable scope** — it cannot be compiled/verified without libtorch and stays genuinely deferred (not dropped).
|
||||
|
||||
**PROOF discipline held:** every de-magicked constant is pinned `== prior literal` by a `*_consts_unchanged_from_literals` test; every boundary test characterizes CURRENT behaviour; no operating-value or behaviour change; the Python proof stays bit-exact at `f8e76f21…46f7a` (the metrics path is off the signal proof path — asserted, not assumed). A smaller-but-true count was reported rather than inventing 40 fixes.
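A minimal sketch of the de-magic-plus-pin pattern described above, assuming the two `metrics_core.rs` values from row 1 of the table below (`0.5` and `1e-6`). The `is_visible` function is hypothetical, and its inclusive `>=` comparison is an assumption suggested only by the test name `visibility_threshold_boundary_is_inclusive`.

```rust
/// Named constants replacing bare literals (values taken from the table row).
pub const VISIBILITY_THRESHOLD: f32 = 0.5;
pub const MIN_REFERENCE_EXTENT: f32 = 1e-6;

/// Hypothetical call site: a keypoint counts as visible at or above the threshold.
pub fn is_visible(score: f32) -> bool {
    score >= VISIBILITY_THRESHOLD
}

/// Pin test: fails if anyone changes an operating value while "tidying".
/// Comparing bit patterns keeps the pin exact rather than approximately equal.
#[test]
fn consts_unchanged_from_literals() {
    assert_eq!(VISIBILITY_THRESHOLD.to_bits(), 0.5_f32.to_bits());
    assert_eq!(MIN_REFERENCE_EXTENT.to_bits(), 1e-6_f32.to_bits());
}
```

The pin test deliberately repeats the literal: the duplication is the point, because it turns a silent value change into a failing test.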
|
||||
|
||||
**Enumerated finding → resolution (real counts):**
|
||||
|
||||
| # | Finding (location) | Action | Pin/characterization test |
|
||||
|---|---|---|---|
|
||||
| 1 | `metrics_core.rs` — `0.5` vis / `1e-6` extent / `0.07` OKS-fallback sigma | de-magic → `VISIBILITY_THRESHOLD` / `MIN_REFERENCE_EXTENT` / `OKS_FALLBACK_SIGMA` | `metrics_core_consts_unchanged_from_literals`; `visibility_threshold_boundary_is_inclusive`; `degenerate_extent_below_floor_is_unscoreable` |
|
||||
| 2 | `ruview_metrics.rs` — `17` / `0.5` / `0.2` / `1e-3` / `1e-6` | de-magic → `NUM_KEYPOINTS` / `VISIBILITY_THRESHOLD` / `PCK_THRESHOLD` / `MIN_BBOX_DIAG` / `MIN_DURATION_MINUTES` | `ruview_metrics_consts_unchanged_from_literals`; `tracking_zero_duration_does_not_divide_by_zero`; `oks_short_array_is_bounded_at_keypoint_count` |
|
||||
| 3 | `subcarrier.rs` — sparse-interp `0.15`/`1e-4`/`0.1`/`1e-8`/`1e-5`/`500` | de-magic → 6 `SPARSE_*` consts | `sparse_interp_consts_unchanged_from_literals`; `compute_interp_weights_single_target_is_index_zero`; `sparse_interp_single_target_is_finite` |
|
||||
| 4 | `eval.rs` — `1e-10` division guard (×3) | de-magic → `MIN_POSITIVE_MPJPE` | `eval_min_positive_mpjpe_unchanged_from_literal`; `domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero`; `domain_gap_unity_when_everything_perfect` |
|
||||
| 5 | `domain.rs` — `1e-5` LayerNorm eps | de-magic → `LAYER_NORM_EPS` | `layer_norm_eps_unchanged_from_literal` (n=0/zero-var boundary already covered) |
|
||||
| 6 | `virtual_aug.rs` — `1e-10` Box-Muller / room-scale guards | de-magic → `BOX_MULLER_U1_FLOOR` / `MIN_ROOM_SCALE` | `virtual_aug_guard_consts_unchanged_from_literals`; `augment_frame_zero_room_scale_passes_amplitude_finite` |
|
||||
| 7 | `rf_encoder.rs` — `20.0` softplus overflow threshold | de-magic → `SOFTPLUS_LINEAR_THRESHOLD` | `softplus_threshold_unchanged_from_literal` |
|
||||
| 8 | `rf_encoder.rs` — panic-only `LinearHead::new` for untrusted weights (§3) | add pure-Rust fallible `try_new` → typed `RfHeadError` (additive; `new` unchanged) | `try_new_accepts_valid_and_rejects_each_bad_shape` |
|
||||
| 9 | `densepose.rs::apply_conv_layer` naive-loop (§4) | **bench-first → MEASURED-INCONCLUSIVE**, no perf change shipped; committed bench + characterization anchor | `native_conv_matches_reference` + `benches/native_conv_bench.rs` |
|
||||
| 10 | `rapid_adapt.rs` module-doc "O(ε)" inconsistency | doc-only fix → "O(ε²)" (central differences) | n/a (doc) |
|
||||
| 11 | `geometry.rs` `DeepSets::encode` missing `# Panics` | doc-only fix (documents existing `assert!`) | n/a (doc) |
|
||||
|
||||
**Tally:** **7 de-magicked (const + pin test)**, **9 new boundary/characterization tests**, **1 added input guard (`try_new`) + test**, **2 doc-only fixes**, **1 perf item bench-first MEASURED-INCONCLUSIVE (not shipped, deferred)**. New tests: train `--no-default-features` **303** (was 288, +15); nn `--no-default-features` lib **38** (was 35, +3).
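The additive input guard counted in the tally (a fallible constructor next to an unchanged panicking one) might look like the sketch below. `DemoHead`, `DemoHeadError` and its variants are hypothetical stand-ins; the real `LinearHead::try_new` and `RfHeadError` shapes are not shown in this ADR.

```rust
/// Hypothetical typed error for rejected weight shapes.
#[derive(Debug, PartialEq, Eq)]
pub enum DemoHeadError {
    ZeroDim,
    SizeOverflow,
    WeightLen { expected: usize, got: usize },
    BiasLen { expected: usize, got: usize },
}

/// Hypothetical linear head; weights are row-major, `out_dim x in_dim`.
#[derive(Debug)]
pub struct DemoHead {
    in_dim: usize,
    weights: Vec<f32>,
    bias: Vec<f32>,
}

impl DemoHead {
    /// Fallible constructor for untrusted weights: returns an error, never panics.
    pub fn try_new(
        in_dim: usize,
        out_dim: usize,
        weights: Vec<f32>,
        bias: Vec<f32>,
    ) -> Result<Self, DemoHeadError> {
        if in_dim == 0 || out_dim == 0 {
            return Err(DemoHeadError::ZeroDim);
        }
        let expected = in_dim.checked_mul(out_dim).ok_or(DemoHeadError::SizeOverflow)?;
        if weights.len() != expected {
            return Err(DemoHeadError::WeightLen { expected, got: weights.len() });
        }
        if bias.len() != out_dim {
            return Err(DemoHeadError::BiasLen { expected: out_dim, got: bias.len() });
        }
        Ok(Self { in_dim, weights, bias })
    }

    /// Existing panicking constructor kept for trusted callers (additive change).
    pub fn new(in_dim: usize, out_dim: usize, weights: Vec<f32>, bias: Vec<f32>) -> Self {
        Self::try_new(in_dim, out_dim, weights, bias).expect("invalid head shape")
    }

    /// y = W·x + b; `None` when the input length does not match `in_dim`.
    pub fn forward(&self, x: &[f32]) -> Option<Vec<f32>> {
        if x.len() != self.in_dim {
            return None;
        }
        let out = self
            .weights
            .chunks_exact(self.in_dim)
            .zip(&self.bias)
            .map(|(row, b)| row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>() + b)
            .collect();
        Some(out)
    }
}
```

Keeping `new` as a thin wrapper over `try_new` means there is a single validation path, so the two constructors cannot drift apart.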
|
||||
|
||||
**Skipped honestly (flagged-but-not-real):** `ablation.rs` (NaN sort + boundary already fixed/tested in M1 — clean), `signal_features.rs` (consts already named, n=0 boundary already tested), `mae.rs` (no bare guard literals found), `metrics_core` already had thorough zero-visible/hip-normalizer coverage from M1. No churn was manufactured to hit a count.
|
||||
|
||||
**Genuinely data-gated / tch-gated — remaining backlog (blocked, not dropped):** GraphPose-Fi graph decoder, ONNX INT4, CSI-JEPA vs MAE A/B (all **data/model-gated** — need a training run + datasets); ONNX read-lock concurrency win (**upstream-gated** on `ort`); the tch-gated panic-on-input sites in `proof.rs`/`trainer.rs`/`model.rs` and the `metrics.rs` `*_v2` dead-code deletion (**tch-gated** — need a libtorch host to compile/verify). **The non-tch-verifiable subset of §8 is now cleared.**
|
||||
|
||||
### 8.1 Milestone-1b — metric-definition unification (the §8 metric subset) — RESOLVED
|
||||
|
||||
This milestone closed the two metric-integrity items above. The work is pinned by tests, graded MEASURED, and surfaced findings the §1 table missed.
|
||||
|
||||
**The complete, honest PCK / OKS audit map (every definition in `v2/`):**
|
||||
|
||||
| Definition (file:line) | Normalization basis | Threshold convention | Status |
|
||||
|---|---|---|---|
|
||||
| `metrics_core.rs` `pck_canonical` (was `metrics.rs`) | **hip↔hip torso WIDTH** (bbox-diag fallback), `[0,1]` coords | `k·torso` | **CANONICAL** |
|
||||
| `metrics_core.rs` `oks_canonical` | `s=sqrt(area)` from GT pose extent | COCO kernel | **CANONICAL** |
|
||||
| `metrics.rs` `compute_pck` / `compute_per_joint_pck` / `compute_oks` | — (thin wrappers) | — | route to canonical |
|
||||
| `metrics.rs` `aggregate_metrics` / `MetricsAccumulator` | — | — | route to canonical |
|
||||
| `metrics.rs` `compute_pck_v2` / `compute_oks_v2` / `MetricsAccumulatorV2` | hip↔hip (folded) | — | **legacy-redundant, deprecated, NO callers** — route to canonical |
|
||||
| `tests/test_metrics.rs` local `compute_pck`/`compute_oks` (removed) | raw-threshold reimpl | raw | **was independent reimpl** → now validate canonical + 1 differential kernel |
|
||||
| `benches/training_bench.rs` `compute_pck` | raw-threshold | raw | distinct-by-design (bench-only), annotated DO-NOT-REPORT |
|
||||
| `sensing-server/training_api.rs` `compute_pck` | **torso-HEIGHT** (nose→hip), **pixel-space** | `ratio·torso_h`, 50px floor | **distinct-by-design** — and **ORPHAN file (not `mod`-declared, does not compile)**; relabelled `compute_pck_torso_height` |
|
||||
| `sensing-server/trainer.rs` `pck_at_threshold` | **RAW (no normalization)** | raw `thr` | **distinct, LIVE** (drives `best_pck`); **MISSED by §1 table**; relabelled `pck_raw@0.2` |
|
||||
| `sensing-server/trainer.rs` `oks_map`→`oks_single(area=1.0)` | `area=1.0` | COCO kernel | **fake-Gold, LIVE** (drives `best_oks`); **MISSED by §1 table**; relabelled `oks_map(area=1.0 proxy)` |
|
||||
|
||||
**Findings the §1 seven-definition table under-counted (honest correction):** the live sensing-server claim surface is `trainer.rs` (in `lib.rs`), **not** the named `training_api.rs` — which is an **orphan file, never `mod`-declared, so it does not compile into the crate**. The live `best_pck` is a **raw, unnormalized** PCK and the live `best_oks` still uses the **`area=1.0` fake-Gold** path ADR-155 §2.1 reported as closed elsewhere. So the true metric landscape is **messier than §1 documented**: ≥3 PCK and ≥1 OKS live in `sensing-server`, two of them on the inflating side, and the file the ADR named for the fix was dead code. This is a finding, not a failure — recorded here rather than hidden.
|
||||
|
||||
**Goal B (`test_metrics.rs`) — RESOLVED, MEASURED.** The canonical core (`pck_canonical`/`oks_canonical`/`canonical_torso_size`/sigmas/`bounding_box_diagonal`) was hoisted into a new **un-gated** `metrics_core` module (the full `metrics` module is `tch-backend`-gated, so the canonical definition was previously unreachable from the workspace test gate; `metrics` now re-exports it → still ONE implementation). `tests/test_metrics.rs` now asserts the **production** functions against hand-computed fixtures — `canonical_pck_matches_hand_computed_fixture` (3/4 correct ⇒ 0.75, hand-derived), zero-visible⇒0.0, hip↔hip normalizer pin, OKS perfect⇒1.0, the fake-Gold pin — plus `test_kernel_agrees_with_canonical`, a differential test where an independent raw-threshold reference must AGREE with canonical in the torso=1.0 regime. (10→12 tests.)
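To make the hand-computed fixture concrete, here is a minimal sketch of a torso-normalized PCK. It is not the crate's `pck_canonical`: the signature, the inclusive `<=` comparison, and passing the torso scale in directly (rather than deriving it from the hips) are all assumptions for illustration.

```rust
/// Fraction of visible keypoints whose prediction lies within `k * torso`
/// of ground truth. Returns 0.0 when nothing is visible (mirrors the
/// zero-visible => 0.0 behaviour stated above). Hypothetical helper.
pub fn pck_sketch(
    pred: &[[f32; 2]],
    gt: &[[f32; 2]],
    visible: &[bool],
    torso: f32,
    k: f32,
) -> f32 {
    let mut scored = 0u32;
    let mut correct = 0u32;
    for ((p, g), &vis) in pred.iter().zip(gt).zip(visible) {
        if !vis {
            continue; // invisible keypoints are not scored at all
        }
        scored += 1;
        let dist = (p[0] - g[0]).hypot(p[1] - g[1]);
        if dist <= k * torso {
            correct += 1;
        }
    }
    if scored == 0 {
        0.0
    } else {
        correct as f32 / scored as f32
    }
}
```

With a torso scale of 0.5 and `k = 0.2` the tolerance is 0.1, so predictions off by 0.0, 0.01 and 0.05 count as correct and one off by 0.5 does not: 3 of 4, i.e. 0.75, the same shape as the fixture named above.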
|
||||
|
||||
**Goal C (`training_api.rs` PCK) — RESOLVED by RELABEL, MEASURED.** Torso-height is **load-bearing** (pixel-space, vertical nose→hip scale, `[17×3]` layout, no `ndarray`/train dep), so unifying would silently change the live numbers' meaning — exactly what to avoid. Resolution: relabel everywhere the metric surfaces so it is never read as canonical, in both the named `training_api.rs` (now `compute_pck_torso_height`, struct/JSON-field docs, `pck_torso_h@0.2` logs) **and** — the real fix — the LIVE `trainer.rs` path (`pck_at_threshold` documented raw-unnormalized; `oks_map` `area=1.0` flagged fake-Gold; `main.rs` prints `pck_raw@0.2` / `oks_map(area=1.0 proxy)`). No wire-format field or `pub`-fn renames (no silent API break). Pinned by `torso_pck_is_labelled_distinctly_from_canonical` (training_api) and `pck_at_threshold_is_raw_unnormalized_not_canonical` (the live kernel). True unification (route the live server through `pck_canonical`/`oks_canonical`) remains a deferred §8 item — it needs a torso scale on the live data and the train crate as a dep.
|
||||
|
||||
---
|
||||
|
||||
@@ -200,3 +255,5 @@ The gap review surfaced ~60 findings; this milestone scoped to the provable inte
|
||||
**Positive.** The training/metrics subsystem can now substantiate a clean accuracy claim: one documented metric used everywhere, a leak-free split, an honest TTA path, a proof that fails on noise and refuses to bless an unbaselined run, and two of the most claim-inflating bugs (false-perfect PCK, fake-Gold OKS) closed and pinned by regression tests. The unmeasured/unprovable parts are **disclosed**, not hidden.
|
||||
|
||||
**Negative / honest.** The reportable-metric tch-gated code cannot be compiled on the dev host (libtorch absent), so its validation rests on routing through the workspace-tested canonical functions plus review; the Rust deterministic proof is in SKIP until a baseline is committed on a tch host; the ONNX concurrency win is blocked upstream; and ~45 findings are deferred. None of these is presented as done.
|
||||
|
||||
**Picture changed by Milestone-1b (§8.1) — corrected, not hidden.** The §1 "seven divergent metrics" count was an **under-count**. The metric-unification audit (Goal A) found the live `wifi-densepose-sensing-server` carries additional, divergent definitions the §1 table omitted: a **raw, unnormalized** `pck_at_threshold` and an **`area=1.0` fake-Gold** `oks_map` in `trainer.rs` — and these, not the orphaned `training_api.rs` the backlog named, are what actually drive the live-reported `best_pck`/`best_oks`. Milestone-1b **relabelled** them (load-bearing math on different data; relabel beats false unification) and pinned the divergence with tests; full unification onto the canonical definition stays deferred. So the canonical *train/nn* metric is unified and test-validated end-to-end, but the *sensing-server* still computes (now clearly-labelled, non-canonical) progress proxies — disclosed here as the honest current state.
|
||||
|
||||
@@ -102,8 +102,8 @@ The double-clone elimination is also correctness-neutral: all 100 `viewpoint`/`m
|
||||
|
||||
| # | Candidate | What | Grade | Verdict |
|
||||
|---|-----------|------|-------|---------|
|
||||
| **1** | **SymphonyQG** (SIGMOD 2025, public code) | Unified quantization + graph ANN; source reports **3.5–17× QPS over HNSW at equal recall**, pure-CPU / edge-portable. | **CLAIMED** (author-measured; **not reproduced on our hardware** — reproduction is future work) | **Lead beyond-SOTA candidate for the ruvector ANN path.** Propose as ACCEPTED-future; cite honestly as "claimed by source, reproduction pending." Best fit because the ruvector retrieval path (AETHER re-ID, sketch prefilter) is exactly an ANN problem and SymphonyQG is CPU/edge-portable like our deployment. |
|
||||
| **2** | **Multi-bit / Extended RaBitQ** | Extends our existing **1-bit** `sketch.rs` (ADR-084) to multiple bits per dimension — precisely the "Pass 2" our own `sketch.rs` doc deferred (1-bit sign quantization ships first; rotation/more-bits "later if benchmark-measured top-K coverage drops below the ADR-084 90% threshold"). | **CLAIMED** (RaBitQ family well-characterised; our 1-bit baseline is MEASURED in `sketch_bench`) | **Accepted near-term.** Concrete, in-scope, incremental — extends a MEASURED capability rather than importing a new system. #2 priority. |
|
||||
| **1** | **SymphonyQG** (SIGMOD 2025, public code) | Unified quantization + graph ANN; source reports **3.5–17× QPS over HNSW at equal recall**, pure-CPU / edge-portable. | **MEASURED-direction-tested** (was CLAIMED) — **[ADR-261](ADR-261-ruvector-graph-ann-index.md)** built the missing HNSW baseline + a SymphonyQG-style 1-bit quantized-traversal variant and **measured** the ratio on our hardware. | **DONE — direction REFUTED at our scale (honest negative).** ADR-261 built the real HNSW baseline (**~25× QPS over linear scan at recall ≥0.99**, the substrate this row wanted) and a quantized variant. At N=10k the 1-bit Hamming traversal is **too coarse** — its best recall is 0.738, never reaching the ≥0.90 equal-recall point, so **no QPS win over float HNSW** (the SymphonyQG 3.5–17× is *not* reproduced by our 1-bit construction here). Caveat: **our HNSW + our 1-bit quant, not SymphonyQG's system**; expected crossover at large N + a multi-bit code. We did **not** tune to manufacture a speedup. |
|
||||
| **2** | **Multi-bit / Extended RaBitQ + unbiased estimator** | Extends our existing **1-bit** `sketch.rs` (ADR-084): Pass-2 rotation, multi-bit Pass-3, and the **real RaBitQ unbiased distance estimator** (Gao & Long SIGMOD 2024) reranking the candidate set from the 1-bit code + 8 B/vec side info (§11). | **MEASURED-on-our-hardware** (was CLAIMED) — rotation (§10), multi-bit (§10), and the estimator (§11) all implemented + benchmarked. Rotation lifts strict-K 36%→46%; multi-bit (≤4-bit) reaches 74% strict; **the estimator reaches 49.71% strict (cosine rerank), still short of 90%.** All clear 90% only with over-fetch (estimator improves the factor: 95% at candidate_k=24 vs sign 91.6%). | **DONE — RESOLVED-PARTIAL / NEGATIVE.** Rotation (§10) + estimator (§11) built and MEASURED. The honest negative (no strict-bar 90% from rotation, ≤4-bit, **or the unbiased estimator**) is recorded, not hidden. Over-fetch + Pass-2 is the path that meets the bar (ADR-084's "candidate set" pattern); the estimator lowers the over-fetch factor needed. |
|
||||
| **3** | **GraphPose-Fi-style learned antenna-attention + ChebGConv fusion head** | Would replace the current **untrained identity-projection + mean-pool** "attention" (the `CrossViewpointAttention` default is `ProjectionWeights::identity` — not a *learned* attention) with a learned graph fusion head. | **DATA-GATED** (per ADR-152 measurement (b): architecture is **NOT** the current bottleneck — **data is**) | **ACCEPTED-future, data-gated. Do NOT build now.** ADR-152's measured lesson was that swapping architecture without more/better paired data does not move PCK. Building a learned fusion head before the data exists would repeat the mistake ADR-155 §5 also flagged for GraphPose-Fi. |
|
||||
| — | **Cramér-Rao / sensor-placement** (`geometry.rs` CRB) | Investigated for a 2026 advance beating the textbook Fisher-information CRB already implemented. | **Investigated — NO ACTION** | **Cleared honestly.** No 2026 method beats the closed-form Fisher-information CRB for this 2-D bearing problem; our implementation is already correct SOTA. (Recording a negative result is a deliberate anti-slop signal.) The only CRB change this milestone is the §2.3 *GDOP* honesty fix, which is a labelling/quantity correction, not an algorithmic one. |
|
||||
|
||||
@@ -138,8 +138,8 @@ The double-clone elimination is also correctness-neutral: all 100 `viewpoint`/`m
|
||||
|
||||
The review surfaced more than this milestone scoped. Tracked here for a future ADR-156 milestone:
|
||||
|
||||
- **SymphonyQG reproduction** (§5 #1) — reproduce the 3.5–17× QPS-over-HNSW claim on our hardware before integrating into the ruvector ANN path. Currently CLAIMED-only.
|
||||
- **Multi-bit / Extended RaBitQ** (§5 #2) — implement the `sketch.rs` "Pass 2" (more bits per dimension and/or the randomized rotation) and re-measure top-K coverage against the ADR-084 ≥90% acceptance bar in `sketch_bench`.
|
||||
- **SymphonyQG reproduction** (§5 #1) — **RESOLVED-DIRECTION-TESTED** (see [ADR-261](ADR-261-ruvector-graph-ann-index.md)). The missing HNSW baseline + a SymphonyQG-style 1-bit quantized-traversal variant were built and **MEASURED**: float HNSW is ~25× over linear scan at recall ≥0.99 (the baseline this gap needed), but our 1-bit quantized traversal is **too coarse to beat float HNSW at equal recall at N=10k** (best recall 0.738) — the 3.5–17× is **not reproduced** by our construction. Honest negative recorded; expected crossover is large N + a multi-bit traversal code. (Caveat: our HNSW + our 1-bit quant, not SymphonyQG's exact system.)
|
||||
- **Multi-bit / Extended RaBitQ** (§5 #2) — **RESOLVED-PARTIAL** (see §10). Pass-2 randomized rotation (FHT + seeded ±1 sign flips, `src/rotation.rs`) and a multi-bit Pass-3 experiment landed and were MEASURED against the ADR-084 ≥90% bar. **Honest result: rotation helps (+10pp at the strict bar) and Pass-2 reaches 90% with ~3× over-fetch, but NEITHER rotation nor multi-bit (up to 4-bit) clears the strict candidate_k==K 90% bar on the tested anisotropic distribution.** The original `1-bit sign quantization ships first; rotation/more-bits later if benchmark-measured top-K coverage drops below 90%` deferral is therefore retired: the rotation is built, the bar is characterised, and the residual gap is documented rather than deferred.
|
||||
- **Learned cross-viewpoint fusion head** (§5 #3, GraphPose-Fi-style) — **data-gated**: blocked on the paired multi-room data ADR-152 measurement (b) identified as the real bottleneck; do not build the architecture first.
|
||||
- **`CrossViewpointAttention` learned projections** — the default `ProjectionWeights::identity` + mean-pool is honest but unlearned; wiring real learned Q/K/V projections is part of the data-gated item above (no learned weights ⇒ the "attention" is currently a geometric-bias-weighted average, which the code/docs should keep stating plainly).
|
||||
- **`coherence.rs` / `fusion.rs` micro-opts and the remaining lower-severity review findings** (style, doc, further hot-path tuning) from the fusion gap review.
|
||||
@@ -151,3 +151,115 @@ The review surfaced more than this milestone scoped. Tracked here for a future A
|
||||
**Positive.** The fusion path now: uses one canonical wrapped angular-distance helper; reports a **real** dimensionless GDOP instead of a mislabeled RMSE; cannot be panicked by crafted multistatic indices or a zero-bin spectrogram (DoS closed); and does one embedding clone per viewpoint instead of two (measured). Every fix is pinned by a test that fails on the old code, and the ANN/fusion SOTA landscape is graded so the near-term (multi-bit RaBitQ) and the data-gated (learned fusion) are not confused.
|
||||
|
||||
**Negative / honest.** The headline angular-wrap fix is a **numeric no-op** under the current cos kernel — we land it for contract/maintainability, not because it changes an output, and we say so. The two strongest external candidates (SymphonyQG, learned fusion) are **not built here** — one is CLAIMED-pending-reproduction, the other is data-gated by a prior measurement. The perf win is a **local hot-path** improvement, modest in the end-to-end pipeline (attention dominates). None of these is presented as more than it is.
|
||||
|
||||
---
|
||||
|
||||
## 10. RaBitQ Pass-2 / multi-bit — IMPLEMENTED & MEASURED (§8 backlog item #2)
|
||||
|
||||
Milestone-1 of the §8 backlog. Status: **RESOLVED-PARTIAL** — built, measured, honest negative on the strict bar.
|
||||
|
||||
### 10.1 What landed
|
||||
|
||||
- **`crates/wifi-densepose-ruvector/src/rotation.rs`** (new) — `Rotation`, a deterministic randomized orthogonal rotation `R = H·D`: a **Fast Hadamard Transform** (`O(d log d)`, in-place butterfly, `1/√m` normalized, where `m` is the padded transform length, so it is norm-preserving) composed with a diagonal of **seeded ±1 sign flips** (SplitMix64 from a stored `u64` seed). Chosen over a dense `d×d` matrix because that is `O(d²)` memory/time and infeasible at the 65,535-d the wire format provisions for; FHT is the standard fast-orthogonal (randomized-Hadamard / fast-JL) construction. Non-power-of-two `d` zero-pads to `m = next_pow2(d)` and reads back the first `d` coords.
|
||||
- **`sketch.rs`** — additive Pass-2 API: `Sketch::from_embedding_rotated`, `SketchBank::with_rotation` + `insert_embedding` / `topk_embedding` / `novelty_embedding`. **Pass 1 (`from_embedding`) is byte-for-byte unchanged**; a Pass-2 sketch has identical `embedding_dim` / packed-byte length / wire shape, so `WireSketch` and existing callers (`event_log.rs`, `signal/longitudinal.rs`) are untouched. Default behaviour preserved.
|
||||
- **`coverage.rs`** (new) — single-source-of-truth top-K coverage harness on a deterministic **anisotropic planted-cluster** fixture (cosine ground truth, the metric a sign sketch approximates). Backs both the `pass2_coverage_report` unit test and the `sketch_bench` coverage table.
|
||||
- **Multi-bit Pass-3 experiment** — `coverage::measure_multibit`: rotate, then `b`-bit uniform scalar-quantize each coord, rank by L1 over codes. Measures the bit/coverage tradeoff.
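The `R = H·D` construction in the first bullet can be sketched as follows. This is an illustration under stated assumptions, not the contents of `rotation.rs`: the function names, the way sign bits are drawn from SplitMix64 (low bit of each output), and the handling of padding are all hypothetical.

```rust
/// One SplitMix64 step: advances `state` and returns the next 64-bit output.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// In-place, unnormalized Walsh-Hadamard butterfly, O(m log m).
/// `buf.len()` must be a power of two.
fn fht_in_place(buf: &mut [f32]) {
    let m = buf.len();
    debug_assert!(m.is_power_of_two());
    let mut h = 1;
    while h < m {
        for block in (0..m).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (buf[i], buf[i + h]);
                buf[i] = a + b;
                buf[i + h] = a - b;
            }
        }
        h *= 2;
    }
}

/// Applies R = H·D: seeded sign flips (D) first, then the 1/sqrt(m)-scaled
/// Hadamard transform (H). Non-power-of-two inputs are zero-padded to `m`
/// and only the first `d` output coordinates are returned.
pub fn rotate(x: &[f32], seed: u64) -> Vec<f32> {
    let d = x.len();
    let m = d.max(1).next_power_of_two();
    let mut state = seed;
    let mut buf = vec![0.0_f32; m];
    for (i, slot) in buf.iter_mut().enumerate() {
        // Assumption: the low bit of each SplitMix64 output picks the sign.
        let sign = if splitmix64(&mut state) & 1 == 0 { 1.0 } else { -1.0 };
        if i < d {
            *slot = sign * x[i];
        }
    }
    fht_in_place(&mut buf);
    let scale = 1.0 / (m as f32).sqrt();
    buf.iter_mut().for_each(|v| *v *= scale);
    buf.truncate(d);
    buf
}
```

Only the `u64` seed has to be stored to reproduce the rotation, which is what keeps the memory cost independent of `d`. In this sketch the norm is preserved exactly only when `d` is already a power of two, since truncating a padded output discards some energy.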
|
||||
|
||||
### 10.2 Pre-existing bug found and fixed (disclosed)
|
||||
|
||||
Building the coverage harness surfaced a **pre-existing correctness bug in `SketchBank::topk`** (shipped in ADR-084): the `n > k` heap path used `BinaryHeap<Reverse<(dist,id)>>` (a *min*-heap) but its comment/logic treated the peek as the max, so it evicted the *nearest* and returned the **k farthest** sketches as "nearest." The shipped unit tests only exercised the `n ≤ k` fast path (≤ 3 entries), so it was never caught. Fixed to a plain max-heap. Pinned by **`topk_heap_path_returns_nearest`** (fails on the old heap when entries are inserted farthest-first) and **`tight_clusters_give_high_coverage_with_overfetch`** (measured **0.072** coverage on the old code — random — vs **>0.99** fixed). This is a real, measured behaviour fix, not a no-op.
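The corrected heap discipline can be shown in a few lines. This is a sketch, not the `SketchBank::topk` source: the `(dist, id)` tuple of `u32`s and the function name are assumptions. The key point is that a bounded *max*-heap keeps the current worst candidate at the top, so evicting the top evicts the farthest entry.

```rust
use std::collections::BinaryHeap;

/// Returns the `k` smallest `(dist, id)` pairs, nearest first.
/// Hypothetical stand-in for the fixed heap path.
pub fn topk_nearest(cands: &[(u32, u32)], k: usize) -> Vec<(u32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    // Plain BinaryHeap is a max-heap: `peek()` is the FARTHEST kept candidate.
    let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(k.min(cands.len()));
    for &cand in cands {
        if heap.len() < k {
            heap.push(cand);
        } else if let Some(&worst) = heap.peek() {
            // Replace the farthest kept entry only if the newcomer is nearer.
            if cand < worst {
                heap.pop();
                heap.push(cand);
            }
        }
    }
    heap.into_sorted_vec() // ascending by (dist, id)
}
```

Wrapping the entries in `Reverse` would flip the ordering so that `peek()` returns the nearest entry; popping that on overflow is the failure mode described above. Feeding candidates farthest-first, as the pin test does, exercises the eviction branch on every later insert.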
|
||||
|
||||
### 10.3 MEASURED top-K coverage
|
||||
|
||||
Test machine: Windows 11, `cargo bench --release` / `cargo test`. Fixture: **dim=128, N=2048, K=8, 64 planted clusters, intra-cluster noise=0.35, 128 queries, master_seed=0xAD000084, rotation_seed=0x5EEDC0DE12345678**, ground-truth metric = cosine. Reproduce: `cargo test -p wifi-densepose-ruvector --no-default-features pass2_coverage_report -- --nocapture` or `cargo bench -p wifi-densepose-ruvector --bench sketch_bench -- pass2_coverage`.
|
||||
|
||||
**Coverage vs over-fetch (`coverage = |sketch_topK ∩ float_cosine_topK| / K`):**
|
||||
|
||||
| candidate_k | Pass-1 (1-bit, no rot) | Pass-2 (1-bit, rot) | vs 90% bar |
|
||||
|---|---|---|---|
|
||||
| **8 (= K, strict bar)** | **36.13%** | **46.39%** | both **BELOW** |
|
||||
| 16 | 62.79% | 75.59% | below |
|
||||
| 24 | 83.89% | **91.60%** | **Pass-2 clears** |
|
||||
| 32 | 100.00% | 100.00% | clears |
|
||||
| 64 | 100.00% | 100.00% | clears |
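The coverage figure in the table above is a set-overlap ratio. A minimal sketch, assuming candidate and ground-truth lists of distinct `u32` ids (the function name and id type are hypothetical, not the `coverage.rs` API):

```rust
use std::collections::HashSet;

/// Fraction of the exact top-K ids that appear among the sketch candidates.
/// With over-fetch, `sketch_candidates` holds `candidate_k >= K` ids while
/// `truth_topk` always holds exactly K ids.
pub fn coverage(sketch_candidates: &[u32], truth_topk: &[u32]) -> f64 {
    if truth_topk.is_empty() {
        return 0.0;
    }
    let cand: HashSet<u32> = sketch_candidates.iter().copied().collect();
    let hits = truth_topk.iter().filter(|id| cand.contains(*id)).count();
    hits as f64 / truth_topk.len() as f64
}
```

Because the denominator stays at K while the candidate set grows, coverage can only rise with `candidate_k`, which is why every column in the table is non-decreasing.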
|
||||
|
||||
**Multi-bit Pass-3 at the strict bar (candidate_k = K = 8):**
|
||||
|
||||
| Variant | Coverage | Memory |
|
||||
|---|---|---|
|
||||
| Pass-1 (1-bit, no rot) | 36.13% | 16 B/vec |
|
||||
| Pass-2 (1-bit, rot) | 46.39% | 16 B/vec |
|
||||
| Pass-3 (rot, 2-bit) | 54.39% | 32 B/vec |
|
||||
| Pass-3 (rot, 3-bit) | 66.70% | 48 B/vec |
|
||||
| Pass-3 (rot, 4-bit) | 74.22% | 64 B/vec |
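The Pass-3 code and the memory column can be reproduced with a short sketch. The clamping range `limit`, the one-code-per-`u8` storage, and all function names are assumptions; the harness's actual `measure_multibit` quantizer is not shown in this ADR.

```rust
/// Uniform b-bit scalar quantizer over [-limit, limit]; one code per coordinate.
/// Codes are kept unpacked (one per u8) for clarity.
pub fn quantize(x: &[f32], bits: u32, limit: f32) -> Vec<u8> {
    assert!((1..=8).contains(&bits) && limit > 0.0);
    let levels = (1u32 << bits) as f32;
    let max_code = (1u32 << bits) - 1;
    x.iter()
        .map(|&v| {
            // Map to [0, 1], then to an integer bucket in 0..=max_code.
            let unit = (v.clamp(-limit, limit) + limit) / (2.0 * limit);
            ((unit * levels) as u32).min(max_code) as u8
        })
        .collect()
}

/// L1 distance between two code vectors (the ranking key for Pass-3).
pub fn l1_codes(a: &[u8], b: &[u8]) -> u32 {
    a.iter()
        .zip(b)
        .map(|(&p, &q)| (i32::from(p) - i32::from(q)).unsigned_abs())
        .sum()
}

/// Bytes per vector when codes are bit-packed.
pub fn packed_bytes(dim: usize, bits: u32) -> usize {
    (dim * bits as usize + 7) / 8
}
```

At dim=128 the packed size is `128·b/8 = 16·b` bytes, which gives the 16, 32, 48 and 64 B/vec figures in the table for 1 to 4 bits.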
|
||||
|
||||
### 10.4 Honest verdict
|
||||
|
||||
- **Rotation consistently helps** — +10.3 pp at the strict bar (36.13%→46.39%), +12.8 pp at candidate_k=16 and +7.7 pp at candidate_k=24; both passes saturate at 100% from candidate_k=32. The FHT construction is verified norm-preserving and deterministic.
|
||||
- **Neither rotation nor multi-bit (≤4-bit) clears the strict candidate_k==K 90% bar** on this anisotropic distribution. A 1-bit sign code is too coarse to pick the exact top 8 of 2048 on this fixture; even 4× the memory (4-bit, 64 B/vec) reaches only 74.22%.
|
||||
- **Pass-2 reaches the 90% bar at candidate_k=24 (~3× over-fetch)** — i.e. fetch ≥24 sketch candidates, refine to K with full float. This is exactly the "candidate set, then full refinement" deployment pattern ADR-084 specifies, so the bar is met *in the deployment the sensor is designed for*, just not at the strict candidate_k==K bar.
|
||||
- **This is a measured, partial win, reported as such.** No benchmark was tuned to manufacture a pass. The strict-bar gap (and the multi-bit tradeoff that doesn't close it) is documented rather than spun.
|
||||
|
||||
### 10.5 Deferred sub-items (graded, not dropped)
|
||||
|
||||
- **Strict-bar 90% from a richer code** — neither rotation nor uniform multi-bit closes it here. A learned/asymmetric quantizer or the full RaBitQ residual-distance estimator (not just a uniform scalar code) might. **RESOLVED-NEGATIVE (§11): the estimator is now built and MEASURED — it lifts strict-K 46.39%→49.71% but does NOT clear the 90% strict bar.** The residual strict-bar gap is a published negative, not a deferral.
|
||||
- **Distribution sensitivity** — the result is for one synthetic anisotropic distribution; on real AETHER traces the strict-bar number may differ. Re-measuring on recorded embeddings is deferred to the ADR-084 post-merge soak.
|
||||
- **Promoting a `MultiBitSketch` type** — the multi-bit code lives in the measurement harness, not as a shipped sketch type. Building the production type is gated on a use site actually needing strict-K (vs over-fetch), which the measurement says is not required today.
|
||||
|
||||
---
|
||||
|
||||
## 11. RaBitQ unbiased distance estimator — IMPLEMENTED & MEASURED (Milestone-2, §8 backlog item #2 / §10.5 strict-bar item)
|
||||
|
||||
Milestone-2 of the §8 backlog. Status: **RESOLVED-NEGATIVE** — the estimator is built, measured, and lifts strict-K coverage, but the honest result is that it does **not** clear the ADR-084 ≥90% strict-K bar on this distribution. The negative is reported as such, exactly like the Pass-2 rotation result.
|
||||
|
||||
### 11.1 What landed
|
||||
|
||||
- **`crates/wifi-densepose-ruvector/src/estimator.rs`** (new) — the real Gao & Long (SIGMOD 2024) contribution: an **unbiased estimator of the inner product / squared distance** recovered from the 1-bit code plus per-vector side info, on top of the Pass-2 rotation. Pass-1/Pass-2 ranked candidates by raw Hamming over sign bits — a coarse proxy. This module reranks by the unbiased estimate.
|
||||
- `EstimatorSketch` — Pass-2 sign code (over the **padded** FHT length `D = next_pow2(dim)`, the frame in which `x̄` is a unit vector) **plus** the side info.
|
||||
- `SideInfo` = `{ residual_norm: f32, x_dot_o: f32 }` = **8 bytes/vector** (2× f32).
|
||||
- `EstimatorQuery` — query rotated once, reused across all candidates.
|
||||
- `DistanceEstimator` — `estimate_inner_product`, `estimate_sq_distance`, `ranking_key` (euclidean), `cosine_ranking_key` (the correct key vs a cosine ground truth — needs only the code + `x_dot_o`).
|
||||
- `EstimatorBank` — `topk_estimated` (euclidean) / `topk_estimated_cosine`; optional `with_centroid` (the paper's centroid path).
|
||||
- **`coverage.rs`** — `measure_estimator` (cosine rerank) + `measure_estimator_euclidean`, on the **bit-identical** fixture / cluster centres / query stream / cosine ground truth as `measure_pass1`/`measure_pass2`. Single source of truth for the §11.3 table; backs both `estimator_coverage_report` and the `sketch_bench` coverage table.
|
||||
- **Additive + backward-compatible.** New types only; Pass-1 `Sketch` / Pass-2 `SketchBank` / `WireSketch` wire format are untouched. All external callers (`event_log.rs`, `signal/longitudinal.rs`, `sensing-server`) use Pass-1 `from_embedding` and are unaffected.
|
||||
|
||||
### 11.2 The estimator formula (and the zero-centroid simplification, stated honestly)
|
||||
|
||||
Let `P` be the Pass-2 orthogonal rotation (`R = H·D`), `D = next_pow2(dim)`. For data `o_raw`, query `q_raw`, centroid `c`:
|
||||
|
||||
1. **Centroid — SIMPLIFIED to zero/global `c = 0`.** The paper centres on a per-cluster centroid (`o_r = o_raw − c`); we use `c = 0` (`o_r = o_raw`), because the current sketch path has no IVF/k-means cluster structure. This costs accuracy when the data is far off-origin. **We document it, do not hide it,** and built the paper-faithful centroid path (`from_embedding_centred` / `EstimatorBank::with_centroid`) so the simplification is a measured choice, not an assumption. (We do **not** report a centroid coverage number against the *cosine* ground truth: centroid-subtraction changes the metric — cosine-of-residual ≠ cosine-of-raw — so a centroid number vs raw-cosine truth would be a metric mismatch, itself dishonest. Zero-centroid is the correct match for this raw-cosine harness.)
|
||||
2. **Unit residual + 1-bit code.** `o = o_r/‖o_r‖`, `o' = P·o`, code `x̄_i = sign(o'_i)·(1/√D)` — a unit vector at the nearest hypercube corner.
|
||||
3. **Side info:** `residual_norm = ‖o_r‖` and `x_dot_o = ⟨x̄, o'⟩ ∈ (0,1]` (the paper's `⟨x̄, o⟩`).
|
||||
4. **Unbiased estimator** (paper Eq.): `⟨o', q'⟩ ≈ ⟨x̄, q'⟩ / ⟨x̄, o'⟩ = ⟨x̄, q'⟩ / x_dot_o`. The random rotation makes the code's quantization error orthogonal **in expectation** to `q'`, so the rescale is unbiased (paper's `O(1/√D)` bound). Per candidate: one length-`D` signed sum (`x̄ ∈ {±1/√D}`), as cheap as Hamming + a multiply.
|
||||
5. **Distance / cosine.** `⟨o_r,q_r⟩ = ‖o_r‖·(⟨x̄,q'⟩/x_dot_o)`; `‖q_r−o_r‖² = ‖q_r‖²+‖o_r‖²−2⟨o_r,q_r⟩`. For a **cosine** ground truth (AETHER / this harness), rank by `−⟨o,q_r⟩ = −(⟨x̄,q'⟩/x_dot_o)` (needs only the code + `x_dot_o`).
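
As a minimal sketch of steps 2 to 5 under the zero-centroid simplification, the Rust below shows how the side info and the rescaled estimate fit together. The names (`SketchDemo`, `rotate`, `sketch_demo`, `estimate_inner_product_demo`) are illustrative and are not the API in `estimator.rs`. The rotation is assumed to be a seeded ±1 sign flip followed by an orthonormal fast Hadamard transform over the padded length; the shipped Pass-2 rotation may differ in detail.

```rust
/// SplitMix64 step, used here only to derive a deterministic ±1 diagonal (assumption).
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// In-place orthonormal fast Hadamard transform; the length must be a power of two.
fn fht(v: &mut [f32]) {
    let n = v.len();
    assert!(n.is_power_of_two());
    let mut h = 1;
    while h < n {
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
    let scale = 1.0 / (n as f32).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

/// Zero-pad to the next power of two, apply seeded sign flips, then the Hadamard transform.
/// The result has the same Euclidean norm as `raw`.
fn rotate(raw: &[f32], seed: u64) -> Vec<f32> {
    let mut v = vec![0.0f32; raw.len().next_power_of_two()];
    v[..raw.len()].copy_from_slice(raw);
    let mut state = seed;
    for x in v.iter_mut() {
        if splitmix64(&mut state) & 1 == 1 {
            *x = -*x;
        }
    }
    fht(&mut v);
    v
}

/// Hypothetical stand-in for the 1-bit code plus the two f32 of side info.
struct SketchDemo {
    signs: Vec<bool>,   // sign(o'_i), one bit per padded coordinate
    residual_norm: f32, // ‖o_r‖ (zero centroid, so o_r = o_raw)
    x_dot_o: f32,       // ⟨x̄, o'⟩ = Σ|o'_i| / √D
}

fn sketch_demo(o_raw: &[f32], seed: u64) -> SketchDemo {
    let rotated = rotate(o_raw, seed);
    let norm = rotated.iter().map(|x| x * x).sum::<f32>().sqrt();
    let d = rotated.len() as f32;
    let abs_sum: f32 = rotated.iter().map(|x| x.abs()).sum();
    // Guard the zero vector so the later division is well defined.
    let x_dot_o = if norm > 0.0 { abs_sum / (norm * d.sqrt()) } else { 1.0 };
    SketchDemo {
        signs: rotated.iter().map(|&x| x >= 0.0).collect(),
        residual_norm: norm,
        x_dot_o,
    }
}

/// Estimate ⟨o_r, q_r⟩ from the code, the side info, and the rotated query q' = P·q_r.
fn estimate_inner_product_demo(s: &SketchDemo, q_rot: &[f32]) -> f32 {
    let d = q_rot.len() as f32;
    let signed_sum: f32 = s
        .signs
        .iter()
        .zip(q_rot)
        .map(|(&pos, &q)| if pos { q } else { -q })
        .sum();
    let x_dot_q = signed_sum / d.sqrt(); // ⟨x̄, q'⟩
    s.residual_norm * x_dot_q / s.x_dot_o
}
```

For a self-query, `⟨x̄, q'⟩ = ‖o_r‖·x_dot_o`, so the estimate returns `‖o_r‖²` up to float rounding. For any other query the value is only an estimate whose error shrinks with `D`.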
|
||||
|
||||
**Unbiasedness is pinned** (`estimator_unbiased_on_fixture`): averaging the estimate of `⟨o_r,q_r⟩` over 4000 random rotation seeds converges to the true inner product within ~6% of the `‖o‖‖q‖` envelope — a biased estimator (or sign-only proxy) would be systematically off.
|
||||
|
||||
### 11.3 MEASURED strict-K coverage
|
||||
|
||||
Same fixture/seeds as §10 (dim=128, N=2048, K=8, 64 clusters, noise=0.35, 128 queries, `master_seed=0xAD000084`, `rotation_seed=0x5EEDC0DE12345678`), cosine ground truth. Reproduce: `cargo test -p wifi-densepose-ruvector --no-default-features estimator_coverage_report -- --nocapture` or `cargo bench -p wifi-densepose-ruvector --bench sketch_bench -- pass2_coverage`.
|
||||
|
||||
| candidate_k | Pass-1 (sign) | Pass-2 (sign) | **Pass-2 + estimator (cosine)** | Pass-2 + estimator (euclid) | vs 90% bar |
|---|---|---|---|---|---|
| **8 (= K, strict bar)** | 36.13% | 46.39% | **49.71%** | 49.02% | **all BELOW** |
| 16 | 62.79% | 75.59% | 79.20% | 77.93% | below |
| 24 | 83.89% | 91.60% | **95.12%** | 93.65% | estimator clears |
| 32 | 100.00% | 100.00% | 100.00% | 100.00% | clears |
| 64 | 100.00% | 100.00% | 100.00% | 100.00% | clears |
|
||||
|
||||
Side-info memory overhead: **8 bytes/vector** (2× f32) on top of the 16 B/vec 1-bit sketch.
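
A hedged sketch of how a coverage figure of this kind can be computed: for each query, count how many of the true top-K ids appear among the first `candidate_k` ids of the approximate ranking, then divide total hits by total ground-truth slots. The function name and signature are illustrative and are not the `coverage.rs` API. The reported strict-K percentages are at least consistent with a denominator of 128 queries × K=8 = 1024 (for example 509/1024 ≈ 49.71%), but that denominator is an inference from the numbers, not something the ADR states.

```rust
use std::collections::HashSet;

/// Fraction of ground-truth top-`k` ids recovered within the first
/// `candidate_k` entries of each approximate ranking.
/// `truth[i]` and `ranked[i]` belong to the same query `i`.
fn coverage_demo(truth: &[Vec<usize>], ranked: &[Vec<usize>], k: usize, candidate_k: usize) -> f64 {
    let mut hits = 0usize;
    let mut total = 0usize;
    for (t, r) in truth.iter().zip(ranked) {
        // Candidate pool: the over-fetched prefix of the approximate ranking.
        let pool: HashSet<usize> = r.iter().take(candidate_k).copied().collect();
        hits += t.iter().take(k).filter(|id| pool.contains(*id)).count();
        total += t.len().min(k);
    }
    if total == 0 {
        0.0
    } else {
        hits as f64 / total as f64
    }
}
```

With `candidate_k == k` this is the strict bar; raising `candidate_k` models the over-fetch-then-refine pattern, which is why the columns rise monotonically toward 100%.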
|
||||
|
||||
### 11.4 Honest verdict
|
||||
|
||||
- **The estimator helps, and the cosine key beats the euclidean key** (49.71% vs 49.02% at strict-K; cosine is the apples-to-apples match for the cosine ground truth — both it and sign-Hamming are angular). The unbiased rescale is a real, consistent lift at every over-fetch level (e.g. 24: 91.60%→95.12%).
|
||||
- **It does NOT clear the strict candidate_k==K 90% bar.** Strict-K goes 36.13% (Pass-1) → 46.39% (Pass-2-sign) → **49.71% (Pass-2 + estimator)** — a **+3.3 pp** improvement over sign-only, **still ~40 pp short of 90%**. This is a **published negative**, the same class of honest result as the Pass-2 rotation (§10).
|
||||
- **Why the strict-K gain is modest:** the binding constraint at strict K is the **1-bit code's information ceiling** (resolving 8-of-2048 from a single sign bit per coordinate), not the *estimator's variance* — the estimator sharpens the ranking but cannot add information the 1-bit code never captured. The estimator's larger wins are at over-fetch, where there is room to re-rank a wider candidate pool.
|
||||
- **The bar is still met the way ADR-084 deploys the sensor:** at candidate_k=24 (~3× over-fetch) the estimator reaches **95.12%** (vs Pass-2-sign 91.60%) — the "candidate set, then full refinement" pattern. The estimator **improves the over-fetch factor needed** but does not eliminate it.
|
||||
- **No benchmark was tuned to manufacture a pass.** The strict-bar gap is documented, not spun.
|
||||
|
||||
### 11.5 Pinning tests
|
||||
|
||||
- `estimator::estimator_is_deterministic` — fixed seed ⇒ identical estimate + identical bank top-K.
|
||||
- `estimator::estimator_unbiased_on_fixture` — Monte-Carlo mean over 4000 seeds converges to the true inner product within tolerance (the unbiasedness claim).
|
||||
- `coverage::estimator_rerank_not_worse_than_sign` — estimator-reranked coverage ≥ sign-only Pass-2 on a fixed fixture (must not regress).
|
||||
- Plus: `estimator_self_distance_is_small`, `x_dot_o_in_unit_range`, `zero_input_does_not_panic`, `bank_self_query_ranks_self_first`, `centroid_path_self_query_ranks_self_first`, `centroid_zero_matches_default`, `estimator_coverage_is_deterministic`.
|
||||
|
||||
@@ -85,9 +85,11 @@ A new criterion bench (`harness = false`, registered in `Cargo.toml`) drives eac
|
||||
|
||||
`OpportunisticCsiBridge::ingest` built `CsiReportPayload { n_subcarriers: self.amp_accum.len() as u16, … }`. The `as u16` would silently wrap a count above 65 535. **This is unreachable in practice**: `ingest` gates `frame.subcarrier_count() > MAX_REPORT_SUBCARRIERS` (484) at entry and returns `None`, and `report.validate()` independently rejects oversized counts downstream. We replaced the cast with `u16::try_from(self.amp_accum.len()).ok()?` (drop-instead-of-truncate) so the construction is **correct-by-construction** rather than relying on the upstream gate. We disclose this as **defense-in-depth on an unreachable path, not a live bug** — no behavior change, no new test (the gate already prevents the input that would exercise it).
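
To illustrate the difference between the two conversions, here is a small sketch. The helper names are hypothetical and the gate is simplified to a plain length check; only `MAX_REPORT_SUBCARRIERS = 484` and the `u16::try_from(..).ok()` pattern come from the text above.

```rust
use std::convert::TryFrom;

/// Upper bound quoted in the text above.
const MAX_REPORT_SUBCARRIERS: usize = 484;

/// Old behaviour: `as` silently wraps modulo 65 536.
fn count_with_cast(len: usize) -> u16 {
    len as u16
}

/// New behaviour: an out-of-range length yields `None` instead of a wrapped value.
fn count_checked(len: usize) -> Option<u16> {
    u16::try_from(len).ok()
}

/// Simplified stand-in for the entry gate followed by the checked conversion.
fn report_count_demo(len: usize) -> Option<u16> {
    if len > MAX_REPORT_SUBCARRIERS {
        return None; // the upstream gate already drops oversized frames
    }
    count_checked(len)
}
```

Because the gate rejects anything above 484, the `None` branch of `count_checked` cannot be reached through `report_count_demo`; the checked conversion only removes the reliance on that gate.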
|
||||
|
||||
### 2.6 §B4 — constant-time HMAC tag compare: **DEFERRED, not landed** (disclosed)
|
||||
### 2.6 §B4 — constant-time HMAC tag compare: **RESOLVED — no-dependency hand-rolled constant-time compare (Milestone-1)**
|
||||
|
||||
`secure_tdm.rs:284` compares the 8-byte HMAC tag with `self.hmac_tag == expected` (data-dependent, non-constant-time). The research authorized adding `subtle::ConstantTimeEq` **only if `subtle` were already a direct dependency** — it is not (only transitive, via a crypto crate). Per that guidance, and because this is an **8-byte tag on a LAN multistatic sync beacon** (not a remote attacker-controlled timing-oracle surface), we **do not add a direct dependency** for it. Tracked in §8 as a deferred item, not silently dropped.
|
||||
`secure_tdm.rs` compared the 8-byte HMAC tag with `self.hmac_tag == expected` (data-dependent, non-constant-time: short-circuits on the first differing byte, leaking through verification latency how many leading bytes a forged tag matched — a byte-by-byte tag-recovery oracle). Milestone-3 deferred this **only** to avoid adding the `subtle` crate as a direct dependency. Milestone-1 resolves it **without any dependency**: a hand-rolled `constant_time_tag_eq(a, b)` that XOR-accumulates every byte difference into a single `u8` with **no early exit**, then compares the accumulator to zero exactly once. `#[inline(never)]` + `core::hint::black_box(diff)` stop the optimizer from reintroducing a short-circuit or lowering the loop into a non-constant-time `memcmp`; a length mismatch returns `false` without inspecting contents. The former `==` verify site now calls this helper.
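
A minimal sketch of the construction described above, assuming a slice-based signature; the actual helper in `secure_tdm.rs` may take fixed-size arrays or differ in other details.

```rust
use core::hint::black_box;

/// Compare two tags without an early exit on the first differing byte.
/// A length mismatch returns `false` without inspecting contents.
#[inline(never)]
fn constant_time_tag_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // OR every XOR difference into one accumulator; never branch on it here.
        diff |= x ^ y;
    }
    // `black_box` discourages the optimizer from rewriting the loop
    // into a short-circuiting comparison.
    black_box(diff) == 0
}
```

The single comparison at the end is the only data-dependent decision, so the loop always touches every byte. `black_box` is a best-effort optimization barrier, not a formal constant-time guarantee, which matches the "constant-time construction" grading used here.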
|
||||
|
||||
**Test (fails on old code, the hard gate):** `tag_compare_is_constant_time_shape` — asserts correct accept/reject for equal, first-byte-differ, last-byte-differ, all-byte-differ, and length-mismatch tags, plus an end-to-end `verify()` last-byte-only tamper. Verified to **bite**: introducing a classic constant-time bug (loop `take(LEN-1)`, skipping the last byte) makes it fail on `last-byte-differ must reject`. A coarse timing-invariance smoke check `tag_compare_timing_invariance_smoke` exists but is `#[ignore]`d (noisy host — not a CI gate). **Grade MEASURED** (constant-time *construction*; micro-timing on a noisy host is only a smoke check, disclosed honestly). Tracked RESOLVED in §8.
|
||||
|
||||
---
|
||||
|
||||
@@ -143,7 +145,7 @@ Grades: **MEASURED** (source measured it, ideally public method/code), **CLAIMED
|
||||
| 1 | **CSI vital signs (HR/BR)** | Deep-CSI vital-sign models report **MAE ~2–3 BPM** vs our classical IIR-bandpass + autocorrelation/zero-crossing. | **DATA-GATED + CLAIMED** | **NO ACTION on method.** A deep model needs **paired PPG/ECG ground truth** we do not have, and no public ESP32 artifact reproduces the cited MAE on commodity CSI. Our classical method is the honest commodity baseline; the real wins this milestone are the A1/A3 robustness fixes, not a new model. |
|
||||
| 2 | **802.11bf-2025 conformance** | Adopt a conformance test-vector suite for the `ieee80211bf/` forward-compat model. | **CLAIMED (not public)** | **NO ACTION.** No commodity silicon ships a conformant 802.11bf interface as of 2026, and the conformance suites are **WBA / Wi-Fi Alliance pre-certification** material, **not public**. Our model's "no OTA encoding until silicon exists" posture (ADR-153) is the correct one. Tracked in §8: *add SBP conformance vectors when the WFA publishes a test plan* — we will **not invent vectors**. |
|
||||
| 3 | **Per-room calibration (ADR-151)** | Bank-of-specialists + drift-veto vs a 2026 calibration SOTA. | **CLAIMED on numbers, DATA-GATED on a head-to-head** | **NO ACTION on architecture.** The bank-of-specialists + drift-veto design is SOTA-shaped, but we have **no head-to-head PCK** against a published method (no paired multi-room data). The geometry-conditioned LoRA head is **built-but-unconsumed** and data-gated → **ACCEPTED-FUTURE** (§8), not built now. |
|
||||
| 4 | **Multi-BSSID throughput (wifiscan)** | The module docs assert a native `wlanapi.dll` FFI 10–20 Hz path; the current `WlanApiScanner` wraps `netsh` (~2 Hz). | **CLAIMED-unmeasured** | **NO ACTION + corrected expectation.** The native FFI fast path is **asserted but NOT implemented** — the live scanner is the ~2 Hz netsh shim. The "10×" is unmeasured. → **ACCEPTED-FUTURE** (§8). **We explicitly do NOT claim a speedup that does not exist.** |
|
||||
| 4 | **Multi-BSSID throughput (wifiscan)** | The module docs assert a native `wlanapi.dll` FFI 10–20 Hz path; the current `WlanApiScanner` wraps `netsh` (~2 Hz). | **MEASURED (Milestone-1)** | **IMPLEMENTED + MEASURED — real positive win.** Status corrected: the native FFI is **fully implemented and wired live** (`wlanapi_native::scan_native` calls `WlanOpenHandle`/`WlanEnumInterfaces`/`WlanGetNetworkBssList`/`WlanFreeMemory`/`WlanCloseHandle`; `WlanApiScanner::scan_instrumented` runs it native-first with a netsh fallback). Milestone-1 **measured both paths on this box** (Intel Wi-Fi 7 BE201 320MHz, 2026-06-13) over an identical 10 s wall-clock window via a new `benchmark_backend`: **native 21.42 Hz vs netsh 3.84 Hz = 5.57× MEASURED** (mean 5.0 BSSIDs/scan each; native-only run 18.0 Hz). Native genuinely beats netsh — a real measured multiple, **not** a fabricated 10×; the achieved 21.4 Hz is well above the ~2 Hz netsh regime and slightly above the asserted 10–20 Hz band (the native-only run, 18.0 Hz, falls inside it), while the 5.57× ratio stays below the asserted 10×. 50 back-to-back native scans = 50/50 OK, no handle leak. → §8 MEASURED. |
|
||||
|
||||
---
|
||||
|
||||
@@ -176,10 +178,10 @@ Grades: **MEASURED** (source measured it, ideally public method/code), **CLAIMED
|
||||
|
||||
## 8. Deferred backlog (NOT silently dropped)
|
||||
|
||||
- **§B4 constant-time HMAC compare** — `secure_tdm.rs:284` uses `==` on the 8-byte tag. Add `subtle::ConstantTimeEq` **if** `subtle` becomes a direct dependency for another reason; not worth a new dependency for an 8-byte LAN sync-beacon tag (out of the current threat model). Deferred, not dropped.
|
||||
- **§B4 constant-time HMAC compare** — **RESOLVED (Milestone-1).** Replaced the short-circuiting `==` on the 8-byte tag with a hand-rolled branch-free `constant_time_tag_eq` (XOR-accumulate, no early exit, `#[inline(never)]` + `black_box`). **No new dependency** — the `subtle` crate was the only reason this was deferred, and a fixed 8-byte compare needs none. Pinned by `tag_compare_is_constant_time_shape` (proven to fail on a last-byte-skipping bug). Grade MEASURED (constant-time construction). See §2.6.
|
||||
- **802.11bf SBP conformance vectors** (§5 #2) — add real conformance test vectors to the `ieee80211bf/` model **when the Wi-Fi Alliance / WBA publishes a public test plan**. Do not invent vectors before then.
|
||||
- **Geometry-conditioned LoRA calibration head** (§5 #3) — built-but-unconsumed and **data-gated** on paired multi-room PCK data (ADR-152 measurement (b): data, not architecture, is the bottleneck). ACCEPTED-FUTURE.
|
||||
- **Native `wlanapi.dll` FFI multi-BSSID fast path** (§5 #4) — the asserted 10–20 Hz path is **not implemented**; the live scanner is the ~2 Hz netsh shim. Implement and **measure** the real throughput before claiming any multiple. ACCEPTED-FUTURE, CLAIMED-unmeasured until then.
|
||||
- **Native `wlanapi.dll` FFI multi-BSSID fast path** (§5 #4) — **RESOLVED + MEASURED (Milestone-1).** The native FFI is implemented and wired live (native-first, netsh fallback). Measured on this box (Intel Wi-Fi 7 BE201 320MHz, 2026-06-13): **native 21.42 Hz vs netsh 3.84 Hz = 5.57×**, mean 5.0 BSSIDs/scan, 50/50 native scans with no handle leak. Real positive result — no fabricated 10×. See §5 #4. (Note: a prior sweep recorded 9.74 Hz on a different/older adapter; the per-adapter number varies, the ratio over netsh is the claim.)
|
||||
- **Deep-CSI vital-sign model** (§5 #1) — DATA-GATED on paired PPG/ECG ground truth. No public ESP32 artifact reproduces the cited ~2–3 BPM MAE. Not on the near-term path.
|
||||
|
||||
---
|
||||
|
||||
@@ -7,7 +7,7 @@
|
||||
|
||||
## Context
|
||||
|
||||
The corpus has grown to **162 ADR entries across 156 distinct files** (ADR-001 through ADR-163, plus 6 duplicate-number collisions). It now spans nine subsystems — signal/DSP, NN/training, ESP32 firmware, RuvSense multistatic, RuView desktop, Cognitum cogs, HOMECORE (HA reimplementation), BFLD privacy, and the streaming engine — written over roughly a year by many agent-driven sessions.
|
||||
The corpus has grown to **162 ADR entries across 156 distinct files** (ADR-001 through ADR-171; the 5 duplicate-number collisions / 6 displaced files originally noted here were RESOLVED by renumbering the displaced files to ADR-166…171 — see Gap Register G1). It now spans nine subsystems — signal/DSP, NN/training, ESP32 firmware, RuvSense multistatic, RuView desktop, Cognitum cogs, HOMECORE (HA reimplementation), BFLD privacy, and the streaming engine — written over roughly a year by many agent-driven sessions.
|
||||
|
||||
Two forces motivate a corpus-wide gap analysis *now*:
|
||||
|
||||
@@ -39,7 +39,7 @@ Counts are approximate (`~`) where a status string is non-canonical or dual-valu
|
||||
| Proposed (incl. conditional/research-only) | ~88 | partial | ~50 |
|
||||
| Superseded | 1 (ADR-002) | proposed-only | ~64 |
|
||||
| Rejected | 1 (ADR-098) | stale-or-contradicted | 3 (029/030/031) |
|
||||
| Missing / no Status header | 3 (ADR-147-proof, ADR-052-ddd, ADR-134) | unknown | 5 (034/044/052-ddd/147-proof/…) |
|
||||
| Missing / no Status header | 3 (ADR-168-proof [was 147], ADR-167-ddd [was 052], ADR-134) | unknown | 5 (034/044/167-ddd/168-proof/…) |
|
||||
| Mixed/dual status in one ADR | 3 (115, 149×2, 133) | superseded | 1 (ADR-002) |
|
||||
|
||||
**Headline:** ~114 of 162 ADRs (≈70%) are decisions that never fully landed (proposed-only + partial + stale + unknown). The dominant failure mode is **stale Status headers**, not abandoned work.
|
||||
@@ -50,8 +50,8 @@ Severity: CRITICAL (corpus integrity / tooling-breaking / life-safety / security
|
||||
|
||||
| ID | Gap | Severity | Affected ADRs | Recommended action |
|
||||
|----|-----|----------|---------------|--------------------|
|
||||
| G1 | 6 duplicate ADR numbers (two ADRs answer to one number; breaks index/`/adr` tooling) | CRITICAL | 050×2, 052×2, 147×3, 148×2, 149×2, 134 (identity split) | renumber 2-of-3 at 147, 1 each at 050/148/149; demote 052-ddd to appendix; resolve 134 identity |
|
||||
| G2 | 3 files with no Status header (cannot triage) — **INVESTIGATED in `docs/adr-gap-remediation-1`: only 2 genuinely lack one, both owner-gated** | CRITICAL | 147-benchmark-proof, 052-ddd-appendix, ~~134-CIR~~ | add canonical `## Status`; relocate 147-proof to `benchmarks/`; label 052-ddd as appendix — **NOTE: ADR-134-CIR DOES have a Status (`\| Status \| Proposed \|` in its header table) — mislabeled here. The two real misses (147-benchmark-proof, 052-ddd) are both inside owner-gated duplicate-number collisions (147×3, 052×2), so left untouched pending owner. The early ADRs (048/049/068/070 etc.) use `\| Status \|` not `\| **Status** \|` — different-format-but-present, not missing. Net: 0 headers added.** |
|
||||
| G1 | ~~6 duplicate ADR numbers (two ADRs answer to one number; breaks index/`/adr` tooling)~~ **RESOLVED (duplicate-number item)** | CRITICAL | 050×2, 052×2, 147×3, 148×2, 149×2; 134 (identity split, separate) | ~~renumber 2-of-3 at 147, 1 each at 050/148/149; demote 052-ddd to appendix; resolve 134 identity~~ **DONE: displaced files renumbered to the next free numbers (166–171), keepers = first-committed file per number (date ties broken by inbound-ref count / parent-appendix relationship): 050 keeps provisioning-tool-enhancements → quality-engineering-security-hardening = ADR-166; 052 keeps tauri-desktop-frontend → ddd-bounded-contexts appendix = ADR-167 (still linked to parent 052); 147 keeps nvidia-cosmos/OccWorld → benchmark-proof = ADR-168, adam-mode-light-theme = ADR-169; 148 keeps drone-swarm-control-system → yoga-mode-pose-system = ADR-170; 149 keeps public-community-leaderboard-huggingface → swarm-benchmarking-evaluation-methodology = ADR-171. In-file headers, intra-file self-refs, all inbound cross-references (README index, census, lens-findings, user-guide, CHANGELOG, proof-of-capabilities, research docs), and this register updated. `ls docs/adr/ADR-*.md | … | uniq -d` is now EMPTY. The ADR-134 identity split is NOT a filename collision; resolved separately under G3 (→ ADR-165).** |
|
||||
| G2 | 3 files with no Status header (cannot triage) — **INVESTIGATED in `docs/adr-gap-remediation-1`: only 2 genuinely lack one, both owner-gated** | CRITICAL | ADR-168-benchmark-proof (was 147), ADR-167-ddd-appendix (was 052), ~~134-CIR~~ | add canonical `## Status`; relocate ADR-168-proof to `benchmarks/`; label ADR-167-ddd as appendix — **NOTE: ADR-134-CIR DOES have a Status (`\| Status \| Proposed \|` in its header table) — mislabeled here. The two real misses (ADR-168-benchmark-proof [was 147], ADR-167-ddd [was 052]) were inside the owner-gated duplicate-number collisions (147×3, 052×2); those collisions are now resolved (G1) but the missing Status headers themselves remain owner-gated, so left untouched pending owner. The early ADRs (048/049/068/070 etc.) use `\| Status \|` not `\| **Status** \|` — different-format-but-present, not missing. Net: 0 headers added.** |
|
||||
| G3 | ~~Shipped crates cite a non-existent or wrong-identity governing ADR~~ **RESOLVED in `docs/adr-gap-remediation-1`** | CRITICAL | homecore-recorder→"ADR-132" (no file); homecore-migrate→"ADR-134" (file is CIR) | ~~write-missing-ADR (HOMECORE-RECORDER, HOMECORE-MIGRATE)~~ DONE: wrote ADR-132 (recorder, Accepted) + ADR-165 (migrate, Accepted — P1 scaffold); repointed migrate's ADR-134 refs → ADR-165 |
|
||||
| G4 | Anti-slop retractions: accuracy/security/function provably false until sweep landed | CRITICAL | 155, 154, 079, 161 (see Contradictions) | already fixed in-code by 154/155/161/162; this ledger records the retraction |
|
||||
| G5 | ~~10 streaming-engine ADRs marked `Proposed` while §Impl-Status reports Built + commits + tests~~ **RESOLVED in `docs/adr-gap-remediation-1`** | HIGH | 136–145 | ~~mark-stale → "Accepted — partial (integration glue pending)" (one batch)~~ DONE: all 10 (136–145) flipped to "Accepted — partial"; each retains its commit-pinned Implementation-Status note. NB: notes describe *building blocks built + tested*, **not** live-path integration — "partial" is the honest label, not full "Accepted" |
|
||||
@@ -60,7 +60,7 @@ Severity: CRITICAL (corpus integrity / tooling-breaking / life-safety / security
|
||||
| G8 | ADR-002 supersession not reciprocated by successors; 5 children stranded | HIGH | 002→016/017; children 003/007/008/009/010 | reconcile-docs (add reciprocal language or downgrade); split 002 to "partially superseded" |
|
||||
| G9 | Streaming-engine integrator crate has no governing ADR (composition/back-pressure/live-path seam) | HIGH | wifi-densepose-engine (composes 135–146) | write-missing-ADR |
|
||||
| G10 | CLAUDE.md doc-vs-header drift (doc says one status, header another) | HIGH | 017, 024, 027, 072, 152 | reconcile-docs |
|
||||
| G11 | Open security HIGH findings, gate FAILED, never marked done | HIGH | 080 (XFF bypass, leaked stack traces, JWT-in-URL CWE-598) | implement (sensing-server boundary — NOT covered by HOMECORE sweep 161/162) |
|
||||
| G11 | ~~Open security HIGH findings, gate FAILED, never marked done~~ **RESOLVED (2026-06-13, branch `fix/adr-080-sensing-server-security`)** | HIGH | 080 (XFF bypass, leaked stack traces, JWT-in-URL CWE-598) | ~~implement (sensing-server boundary — NOT covered by HOMECORE sweep 161/162)~~ DONE: verified all three against the *current Rust* `wifi-densepose-sensing-server`. **#2 leaked errors** was the one live exposure — 6 `main.rs` handlers serialized internal `Display`/`JoinError` into response bodies; fixed via a new `error_response` module (generic body + correlation id, detail logged server-side only). **#1 XFF** and **#3 JWT-in-URL** were verified *absent* on the Rust boundary (no IP-rate-limit/allowlist reads XFF; token is header-only, WS handlers take no query token) and pinned with regression tests that fail if either is re-introduced. ADR-080 P0 §1–3 marked RESOLVED. |
|
||||
| G12 | ADR-052→054 edge unacknowledged by successor; likely mis-modeled (impl, not replacement) | MEDIUM | 052-tauri, 054 | reconcile-docs (054 is the impl plan *for* 052, not a replacement) |
|
||||
| G13 | Capability governed only by remediation/deploy ADR, no creation/architecture ADR | MEDIUM | wasm-edge (only 160/163); occworld-candle (147 blessed Python path only); pointcloud (094 = viewer deploy only) | write-missing-ADR (taxonomy/ABI for wasm-edge; Candle backend swap; pointcloud data contract) |
|
||||
| G14 | Conflicting decisions on one topic, none superseding the others | MEDIUM | person-count 037/075/103; PQ-sign 007/109; fed key-exchange 107/108; provisioning 050/060/052; audit 010/028; RVF-WASM 009-vs-shipped | reconcile (pick one, supersede the rest) |
|
||||
@@ -104,7 +104,7 @@ The ADR-154–163 sweep was narrowly scoped. The two largest **capability** gaps
|
||||
|
||||
- **CRITICAL — Camera-teacher training validation (ADR-079 / 072 / 150).** P7–P9 Pending; blocker is a real synchronized camera+ESP32 paired-capture session + GPU training on the fleet (ruvultra RTX 5080). Cross-subject collapse (11.6%) is data-gated on a heterogeneous multi-subject CSI dataset, per ADR-150 §F3 / ADR-152 F3 (the lever is *more data*, not capacity). Accepted-on-paper, not proven.
|
||||
- **HIGH — Federation + BFLD privacy chains (ADR-105–109, 118–125).** All Proposed-only, ACs unchecked. Blockers: KIT BFId dataset (121), Pi5/Nexmon CBFR capture hardware (123 — ESP32 structurally cannot sniff CBFR), Soul-Signature + cog-ha-matter (122/125). The privacy control *plane* (ADR-141) is built; the *capture/scoring* chain it gates is not.
|
||||
- **HIGH — Sensing-server security (ADR-080).** Distinct from the HOMECORE boundary the sweep fixed; XFF bypass / stack-trace leakage / JWT-in-URL remain open.
|
||||
- ~~**HIGH — Sensing-server security (ADR-080).** Distinct from the HOMECORE boundary the sweep fixed; XFF bypass / stack-trace leakage / JWT-in-URL remain open.~~ **RESOLVED (2026-06-13, G11):** verified against the current Rust sensing-server — stack-trace leakage was the one live finding (fixed via `error_response` generic bodies); XFF bypass and JWT-in-URL were verified absent and regression-pinned. See ADR-080 P0 §1–3.
|
||||
- **MEDIUM — gold-standard deferrals (model to follow):** ADR-163 (ESP32 on-hardware latency UNMEASURED), ADR-160 (medical/affect/weapon NOT validated, relabelled), ADR-158 (RF-through-rubble + learned counter DATA-GATED). Code is real, the claim is withheld pending absent hardware/labelled data — labels are honest.
|
||||
- **MEDIUM — purely hardware/data-gated Proposed decisions (no overreach):** ADR-023, 027, 042, 063/064, 065/066, 070, 073/078, 083, 086, 091, 103, 110 (HE-CSI needs ESP-IDF ≥5.5), 113, 114, 134/135, 143-v2, 144. *needs verification* where flags rely on downstream prose rather than direct file inspection.
|
||||
|
||||
|
||||
+1
-1
@@ -1,4 +1,4 @@
|
||||
# ADR-050: Quality Engineering Response — Security Hardening & Code Quality
|
||||
# ADR-166: Quality Engineering Response — Security Hardening & Code Quality
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
@@ -1,4 +1,8 @@
|
||||
# ADR-052 Appendix: DDD Bounded Contexts — Tauri Desktop Frontend
|
||||
# ADR-167 Appendix: DDD Bounded Contexts — Tauri Desktop Frontend
|
||||
|
||||
> Appendix to [ADR-052](ADR-052-tauri-desktop-frontend.md). Renumbered from ADR-052
|
||||
> to ADR-167 to resolve the ADR-052 duplicate-number collision (per ADR-164 Gap Register
|
||||
> G1); the parent decision remains ADR-052.
|
||||
|
||||
This document maps out the domain model for the RuView Tauri desktop application
|
||||
described in ADR-052. It defines bounded contexts, their aggregates, entities,
|
||||
@@ -158,7 +162,7 @@ Represents an over-the-air firmware update to a running node.
|
||||
| `target_node` | `MacAddress` | Target node MAC |
|
||||
| `target_ip` | `IpAddr` | Target node IP |
|
||||
| `firmware` | `FirmwareBinary` | The binary being pushed |
|
||||
| `psk` | `Option<SecureString>` | PSK for authentication (ADR-050) |
|
||||
| `psk` | `Option<SecureString>` | PSK for authentication (ADR-166) |
|
||||
| `phase` | `OtaPhase` | Uploading / Rebooting / Verifying / Done / Failed |
|
||||
| `progress` | `Progress` | Upload progress |
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# ADR-147 Benchmark Proof — OccWorld on RTX 5080
|
||||
# ADR-168 Benchmark Proof — OccWorld on RTX 5080
|
||||
Date: 2026-05-29
|
||||
Hardware: NVIDIA GeForce RTX 5080 (15.47 GB VRAM), CUDA 12.8
|
||||
Model: OccWorld TransVQVAE (random weights — pre-domain-fine-tuning baseline)
|
||||
@@ -0,0 +1,226 @@
|
||||
# ADR-169: adam-mode — light theme toggle for the three.js realtime demo
|
||||
|
||||
| Field | Value |
|-------|-------|
| **Status** | Proposed |
| **Date** | 2026-06-02 |
| **Deciders** | ruv |
| **Codename** | **adam-mode** |
| **Scope** | `examples/three.js/demos/05-skinned-realtime.html` (primary), demos 01–04 (follow-on) |
| **Relates to** | ADR-019 (sensing-only UI), ADR-035 (live sensing UI accuracy) |
| **Tracking issue** | none yet |
|
||||
|
||||
---
|
||||
|
||||
## 1. Context
|
||||
|
||||
`examples/three.js/demos/05-skinned-realtime.html` (build stamp `2026-05-15-fps-tune`) is the live MediaPipe → Mixamo retargeting + ESP32 CSI overlay demo. It currently ships a single, opinionated **dark theme**:
|
||||
|
||||
- Body `--bg: #050507` (near-black), `--text: #d8c69a` (warm beige).
|
||||
- Amber accents (`--amber: #ffb840`, `--amber-hot: #ffe09f`) on panels and controls.
|
||||
- Two full-screen overlays: a radial-vignette `.overlay-frame` and a 50%-opacity CRT-style `.scanlines` layer.
|
||||
- Three.js scene matches: `scene.background = new THREE.Color(0x050507)` and `scene.fog = new THREE.FogExp2(0x050507, 0.06)` (lines 269–270).
|
||||
|
||||
The dark/amber CRT aesthetic is intentional for screen-recording and "command-centre" feel, but it has real failure modes:
|
||||
|
||||
1. **Daylight visibility** — The live capture is unreadable when demoed on a laptop in a sunlit room: glare washes out the dark background and the amber-on-dark contrast disappears.
|
||||
2. **Recording for embedded/print contexts** — When the demo's screen is captured for documentation, blog posts, or HA blueprints, the dark theme bleeds into surrounding white content and looks heavy.
|
||||
3. **Accessibility** — Some light-sensitive users (the inverse of typical photophobia) report that the high-contrast amber-on-near-black combination strains their eyes; a high-contrast light theme is easier for them.
|
||||
4. **Operator pairing with a light-mode IDE** — Many operators run a light-mode browser alongside a dark-mode IDE and want the demo to match the browser, not the IDE.
|
||||
|
||||
A toggle is the right answer because none of these reasons is universal: different sessions and different users want different modes.
|
||||
|
||||
### 1.1 What this ADR is *not*
|
||||
|
||||
- Not a redesign. The amber accent stays; only the surface colours and overlays swap. The information density, panel layout, and three.js scene geometry are unchanged.
|
||||
- Not a multi-theme system. We add exactly two themes: the existing dark (default, unnamed) and **adam-mode** (light). Future themes would need a new ADR.
|
||||
- Not a backend / data-model change. Pure presentation.
|
||||
- Not yet propagated to demos 01–04. Those follow on after adam-mode lands on demo 05 and is validated.
|
||||
|
||||
## 2. Decision
|
||||
|
||||
Add a **client-side theme toggle** to `05-skinned-realtime.html` that switches between the existing dark theme and a new light theme called **adam-mode**. The theme is driven by a `data-theme="adam"` attribute on the root `<html>` element (the element that `:root` matches; the same attribute set on `<body>` would not trigger the selector), plus a sibling `:root[data-theme="adam"]` CSS block that redefines the existing custom properties. A new toggle button in the existing `#helpers` panel switches between modes and persists the choice in `localStorage` under the key `ruview.theme`.
|
||||
|
||||
### 2.1 CSS — the colour swap
|
||||
|
||||
Add immediately after the existing `:root { ... }` block in `<style>`:
|
||||
|
||||
```css
:root[data-theme="adam"] {
  --bg: #f6f2ea;
  --bg-panel: rgba(252, 250, 246, 0.92);
  --amber: #b8741a; /* deeper amber, readable on cream */
  --amber-hot: #8a5612; /* deepest amber for emphasis text */
  --cyan: #1a6f8a; /* slate cyan */
  --magenta: #a8348a; /* slate magenta */
  --text: #2a241c; /* near-black warm */
  --text-mute: #7a6f5d; /* warm grey */
  --green: #1f7a32; /* forest green */
  --red: #b03a1a; /* burnt sienna */
  --border: rgba(184, 116, 26, 0.28);
}
```
|
||||
|
||||
Every existing element already reads from these custom properties, so the swap is automatic for panels, text, borders, and bar fills. No per-element CSS rewrites required.
|
||||
|
||||
### 2.2 Overlay handling
|
||||
|
||||
The vignette and scanlines are dark-theme aesthetics. In adam-mode they would muddy the cream background. Two new rules:
|
||||
|
||||
```css
|
||||
:root[data-theme="adam"] .overlay-frame {
  background:
    radial-gradient(ellipse at center, transparent 70%, rgba(184,116,26,0.10) 100%),
    linear-gradient(180deg, rgba(184,116,26,0.06) 0%, transparent 18%, transparent 82%, rgba(184,116,26,0.08) 100%);
}
:root[data-theme="adam"] .scanlines {
  opacity: 0.15;
  mix-blend-mode: multiply;
}
|
||||
```
|
||||
|
||||
The vignette is preserved but inverted in colour and lightened; scanlines drop to 15 % opacity and switch from `overlay` to `multiply` blend so they read as faint paper texture rather than CRT lines.
|
||||
|
||||
### 2.3 Three.js scene reactivity
|
||||
|
||||
Two scene colours are hard-coded at construction (lines 269–270). Replace them with a function call that reads the current theme:
|
||||
|
||||
```js
|
||||
function themeSceneColors(theme) {
  return theme === 'adam'
    ? { bg: 0xf6f2ea, fogDensity: 0.025 }
    : { bg: 0x050507, fogDensity: 0.06 };
}
function applySceneTheme(theme) {
  const c = themeSceneColors(theme);
  scene.background = new THREE.Color(c.bg);
  scene.fog = new THREE.FogExp2(c.bg, c.fogDensity);
  renderer.setClearColor(c.bg, 1.0);
}
|
||||
```
|
||||
|
||||
Called once after `renderer` is constructed, then again from the toggle handler.
|
||||
|
||||
`scene.fog` density drops in adam-mode because exponential fog on a light background reads as "haze" much more strongly than on dark — 0.06 → 0.025 keeps the falloff visible without losing the figure into the background.
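To make the density change concrete, the sketch below evaluates the exponential-squared falloff that three.js applies for `FogExp2`, `1 - exp(-(density * depth)^2)`, at both densities. The helper name `fogExp2Factor` and the 10-unit sample depth are illustrative assumptions, not values taken from the demo.

```js
// Hypothetical helper: fraction of the fog colour blended in at a given depth,
// assuming the FogExp2 falloff 1 - exp(-(density * depth)^2).
function fogExp2Factor(density, depth) {
  const x = density * depth;
  return 1 - Math.exp(-x * x);
}

// A depth of 10 scene units is an assumed sample point, not a measured camera distance.
const depth = 10;
const darkFactor = fogExp2Factor(0.06, depth);  // dark-theme density
const adamFactor = fogExp2Factor(0.025, depth); // adam-mode density
console.log(darkFactor.toFixed(3), adamFactor.toFixed(3));
```

At that assumed depth the dark theme blends in roughly 30 % fog colour while adam-mode blends in roughly 6 %, which is the kind of gap the tuning is aiming for.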
|
||||
|
||||
### 2.4 UI toggle
|
||||
|
||||
Add to the `#helpers` panel (top of its labels list):
|
||||
|
||||
```html
|
||||
<label class="theme-toggle">
  <input type="checkbox" id="adam-mode-toggle">
  <span>adam-mode (light)</span>
  <span class="swatch" style="background: var(--amber)"></span>
</label>
|
||||
```
|
||||
|
||||
Handler:
|
||||
|
||||
```js
|
||||
const THEME_KEY = 'ruview.theme';
const root = document.documentElement;
const toggle = document.getElementById('adam-mode-toggle');

function applyTheme(theme) {
  if (theme === 'adam') {
    root.setAttribute('data-theme', 'adam');
    toggle.checked = true;
  } else {
    root.removeAttribute('data-theme');
    toggle.checked = false;
  }
  applySceneTheme(theme);
  try { localStorage.setItem(THEME_KEY, theme); } catch (_) {}
}

const initialTheme = (() => {
  try { return localStorage.getItem(THEME_KEY) || 'dark'; }
  catch (_) { return 'dark'; }
})();
applyTheme(initialTheme);

toggle.addEventListener('change', e => {
  applyTheme(e.target.checked ? 'adam' : 'dark');
});
|
||||
```
|
||||
|
||||
### 2.5 Why "adam-mode" as the codename
|
||||
|
||||
The user picked the name. It is a project-specific brand — distinct from the generic "light mode" terminology that other modes (`--theme=high-contrast`, `--theme=print`) may eventually need. Keeping a codename makes the toggle searchable in the codebase, the localStorage key portable across the demo set, and avoids ambiguity if dark itself is later renamed.
|
||||
|
||||
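The string `"adam"` is the only value the `data-theme` attribute ever takes; dark is the implicit default (no attribute). The `localStorage` key holds either `"adam"` or `"dark"`, because the §2.4 handler writes the active theme on every `applyTheme()` call; a missing or unreadable key falls back to dark.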
|
||||
|
||||
### 2.6 Rejected alternatives
|
||||
|
||||
| Alternative | Rejected because |
|
||||
|---|---|
|
||||
| Use `prefers-color-scheme: light` only, no toggle | Operators frequently want the opposite of their OS preference for screen-recording or daylight desk use. Auto-only frustrates the actual use case. |
|
||||
| Ship two separate HTML files (`05-…-dark.html`, `05-…-light.html`) | Doubles maintenance for every future demo edit. No path to per-session toggle. |
|
||||
| Build a full multi-theme system with a runtime registry | Premature. Two themes don't need a registry; the `data-theme="adam"` attribute is the registry. |
|
||||
| Use Tailwind / DaisyUI / a CSS framework | Demos are intentionally stand-alone single-file HTML for portability. No build step exists; adding one for theming is wrong shape. |
|
||||
| Adopt the cognitum-v0 / HOMECORE design tokens (`--hc-*` from `examples/frontend/`) | That design system is dark-only by intent (ADR-131). adam-mode is the light counterpart needed in *demo* contexts, not HA dashboard contexts. |
|
||||
| Make adam-mode the default | Breaks the dark-aesthetic recording context this demo was originally built for. Default stays dark; toggle stays opt-in. |
|
||||
|
||||
## 3. Consequences
|
||||
|
||||
### 3.1 Positive
|
||||
|
||||
- Demo is usable in daylight, in printed documentation, on light-mode browsers, and by users who find the dark-amber combination fatiguing.
|
||||
- Toggle persists across reloads via `localStorage` — set once, sticks.
|
||||
- No structural change to information density, panel layout, or three.js scene geometry. Operators familiar with the dark theme can switch and still find every readout in the same place.
|
||||
- Implementation is contained: a single `<style>` block addition, a single checkbox, a ~25-line JS handler, and a swap of two scene-construction lines.
|
||||
|
||||
### 3.2 Negative
|
||||
|
||||
- Two themes to maintain. Any future colour change requires updating both `:root` blocks. Mitigated by keeping the existing custom-property names — adam-mode's values are the only edits.
|
||||
- The vignette + scanlines lose some of the CRT charm in adam-mode. Tradeoff accepted by design.
|
||||
- One additional `localStorage` slot consumed per origin (`ruview.theme`).
|
||||
- The amber accent in adam-mode (`#b8741a`) is visibly different from the dark-mode amber (`#ffb840`) — they share the same CSS variable name but a screenshot from each mode is not pixel-comparable. This is the correct call for accessibility (the bright amber is unreadable on cream) but does mean side-by-side comparisons need both screenshots labelled.
|
||||
|
||||
### 3.3 Risks
|
||||
|
||||
| Risk | Likelihood | Mitigation |
|
||||
|---|---|---|
|
||||
| Future demo edits update one `:root` block and forget the other | Medium | A lint script in `scripts/` could grep both blocks for matching key sets; documented as P2 follow-up. |
|
||||
| `localStorage` blocked by privacy settings | Low | All accesses are wrapped in try/catch; falls back to dark. |
|
||||
| Three.js fog density of 0.025 still washes out the model on adam-mode | Low | Empirically tuned during implementation; if it does, drop to 0.015 or remove fog entirely in adam-mode. |
|
||||
| User on a high-DPI display sees scanlines as visible paper texture even at 15 % opacity | Low | If reported, drop to 8 % or hide scanlines entirely in adam-mode. |
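The lint mitigation in the first risk row could look roughly like the sketch below. It is a minimal illustration under stated assumptions: `customProps` and `missingInAdam` are hypothetical names, the CSS is passed in as a string, and the regex only handles flat rule blocks without nested braces.

```js
// Hypothetical lint helper: collect the custom-property names declared in the
// first rule whose selector matches exactly.
function customProps(css, selector) {
  // Escape regex metacharacters in the selector (e.g. the [ ] of the attribute selector).
  const escaped = selector.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const rule = new RegExp('(?:^|\\})\\s*' + escaped + '\\s*\\{([^}]*)\\}').exec(css);
  const names = new Set();
  if (!rule) return names;
  for (const m of rule[1].matchAll(/(--[\w-]+)\s*:/g)) names.add(m[1]);
  return names;
}

// Returns the properties the default :root block declares but adam-mode does not.
function missingInAdam(css) {
  const dark = customProps(css, ':root');
  const adam = customProps(css, ':root[data-theme="adam"]');
  return [...dark].filter(name => !adam.has(name));
}
```

An empty array from `missingInAdam()` would mean every default property has an adam-mode override; a real script would also need to read the `<style>` block out of the HTML file first.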
|
||||
|
||||
## 4. Implementation plan
|
||||
|
||||
Tiny scope — single file. No swarm needed.
|
||||
|
||||
1. Add `:root[data-theme="adam"]` CSS block and the two overlay overrides.
|
||||
2. Refactor scene background + fog into the two helper functions `themeSceneColors()` and `applySceneTheme()`.
|
||||
3. Add `<label>` markup and handler script.
|
||||
4. Verify in a browser at http://127.0.0.1:8765/examples/three.js/demos/05-skinned-realtime.html — toggle on, reload, confirm adam-mode persists; toggle off, reload, confirm dark persists.
|
||||
5. Smoke-screenshot both modes; commit.
|
||||
|
||||
Acceptance criteria:
|
||||
|
||||
- Toggle checkbox visible in `#helpers` panel.
|
||||
- Clicking the toggle swaps colours within one frame.
|
||||
- Reload preserves last choice.
|
||||
- Three.js scene background follows the toggle (no dark frame visible behind a light HUD or vice-versa).
|
||||
- Existing dark-theme appearance is pixel-identical when toggle is off.
|
||||
|
||||
## 5. Test plan
|
||||
|
||||
- Manual visual check in two themes (no automated visual regression — demos aren't in the CI test loop today).
|
||||
- `view-source` confirms the new CSS block, the toggle markup, and the handler are present.
|
||||
- DevTools `localStorage` shows `ruview.theme` after a toggle.
|
||||
- Three.js inspector (or a `console.log(scene.background.getHexString())`) confirms scene colour swap.
|
||||
|
||||
## 6. Follow-on work (out of scope for this ADR)
|
||||
|
||||
- Roll adam-mode into demos 01–04. Each demo has its own `<style>` block; the same `data-theme="adam"` selector and the same JS handler can be copied.
|
||||
- Honor `prefers-color-scheme: light` on first load *if* `localStorage` has no stored choice. Trivial three-line addition.
|
||||
- Add a high-contrast theme for accessibility (separate ADR).
|
||||
- Lint script that asserts both `:root` blocks declare the same custom-property names.
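The `prefers-color-scheme` follow-on could be sketched as a small resolver in which a stored choice always wins and the OS preference is consulted only when nothing valid is stored. `resolveInitialTheme` is a hypothetical name; in the browser, `storage` would be `localStorage` and `prefersLight` would come from `window.matchMedia('(prefers-color-scheme: light)').matches`.

```js
const THEME_KEY = 'ruview.theme';

// Hypothetical resolver. `storage` is any object with a localStorage-style
// getItem(); `prefersLight` is a boolean supplied by the caller.
function resolveInitialTheme(storage, prefersLight) {
  let stored = null;
  try {
    stored = storage.getItem(THEME_KEY);
  } catch (_) {
    // Storage blocked by privacy settings: fall through to the OS preference.
  }
  if (stored === 'adam' || stored === 'dark') return stored;
  return prefersLight ? 'adam' : 'dark';
}
```

Keeping the resolver free of DOM access means it can be exercised with a fake storage object, without a browser.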
|
||||
|
||||
## 7. Related ADRs
|
||||
|
||||
- [ADR-019](ADR-019-sensing-only-ui-mode.md) — sensing-only UI mode (Gaussian splats viewer)
|
||||
- [ADR-035](ADR-035-live-sensing-ui-accuracy.md) — live sensing UI accuracy norms (which this demo follows)
|
||||
- [ADR-131](docs/adr/ADR-131-...) — HOMECORE / cognitum-v0 design tokens (dark-only, separate context)
|
||||
@@ -0,0 +1,643 @@
|
||||
# ADR-170: yoga-mode — pose detection, classification, and scoring for the three.js realtime demo
|
||||
|
||||
| Field | Value |
|-------|-------|
| **Status** | Proposed |
| **Date** | 2026-06-02 |
| **Deciders** | ruv |
| **Codename** | **yoga-mode** |
| **Scope** | `examples/three.js/demos/05-skinned-realtime.html` (primary); new `examples/three.js/demos/06-yoga-mode.html` (secondary, slimmed-down) |
| **Relates to** | ADR-169 (adam-mode light theme), ADR-019 (sensing-only UI), ADR-035 (live sensing UI accuracy) |
| **Tracking issue** | none yet |
|
||||
|
||||
---
|
||||
|
||||
## 1. Context
|
||||
|
||||
`examples/three.js/demos/05-skinned-realtime.html` already runs the full MediaPipe Pose Heavy pipeline at ~30 Hz: 33 BlazePose landmarks flow through a one-euro-filter bank into joint-angle extraction and then into a Mixamo X Bot IK retarget. The `#pose-panel` HUD shows landmark count, visibility, and pose FPS. The `#helpers` panel (ADR-097) has adam-mode (ADR-169) and eight visualisation toggles.
|
||||
|
||||
This infrastructure is complete. Every frame, per-joint angles are already computable from the existing `liveKp` world-space landmark array. What does not yet exist is any layer that interprets those angles as a known yoga pose, scores the user's alignment against a target shape, and guides the user through a structured sequence.
|
||||
|
||||
### 1.1 Why yoga-mode in this demo
|
||||
|
||||
Three concrete use-cases drive this:
|
||||
|
||||
1. **Developer self-test for the retargeting pipeline.** Cycling through a Sun Salutation A is a systematic, reproducible way to exercise every major joint (shoulder, elbow, hip, knee, spine). A pose-scoring overlay makes regression immediately visible — if a code change breaks elbow retargeting, the yoga classifier will output a depressed alignment score on Chaturanga even before a visual inspection.
|
||||
|
||||
2. **Public demonstration value.** The demo is served at `http://127.0.0.1:8765/examples/three.js/demos/05-skinned-realtime.html` and shown to evaluators. A guided instructional mode that scores real-time body alignment against Tadasana or Downward Dog is immediately intelligible to a non-technical audience in a way that raw CSI amplitude bars are not.
|
||||
|
||||
3. **Future bridge to the Rust host.** The Rust-side `wifi-densepose-signal/src/ruvsense/pose_tracker.rs` maintains a 17-keypoint Kalman tracker in COCO convention. yoga-mode in the demo operates on the 33-landmark MediaPipe convention. These are not the same: MediaPipe indices 0–32 (BlazePose) map non-trivially to COCO 0–16. Deciding the mapping now — even in a pure-JS context — canonicalises it for the eventual Rust integration.
|
||||
|
||||
### 1.2 What this ADR is *not*
|
||||
|
||||
- Not a backend service. No WebSocket endpoint, no session record, no cloud upload. Pure client-side HTML.
|
||||
- Not a fitness-app competitor. The scope is Sun Salutation A (8 poses). The full 84-asana classical corpus is out of scope.
|
||||
- Not an integration with the Rust `pose_tracker.rs`. That bridge is documented here as a future consequence, not an immediate deliverable.
|
||||
- Not a redesign of demo 05. Panel layout, three.js scene geometry, and the CSI overlay are unchanged.
|
||||
- Not a new design system. yoga-mode inherits every existing CSS custom property.
|
||||
|
||||
### 1.3 COCO-17 ↔ BlazePose-33 mapping note
|
||||
|
||||
The Rust tracker uses COCO 17-keypoint indices (0=nose, 5=left-shoulder, 6=right-shoulder, 7=left-elbow, 8=right-elbow, 9=left-wrist, 10=right-wrist, 11=left-hip, 12=right-hip, 13=left-knee, 14=right-knee, 15=left-ankle, 16=right-ankle). MediaPipe BlazePose-33 uses a different, denser scheme where shoulders are at 11–12, elbows at 13–14, wrists at 15–16, hips at 23–24, knees at 25–26, ankles at 27–28.
|
||||
|
||||
The mapping for the 13 joints used in yoga-mode angle computation is:
|
||||
|
||||
| Joint role | COCO idx | BlazePose idx |
|---|---|---|
| nose | 0 | 0 |
| left shoulder | 5 | 11 |
| right shoulder | 6 | 12 |
| left elbow | 7 | 13 |
| right elbow | 8 | 14 |
| left wrist | 9 | 15 |
| right wrist | 10 | 16 |
| left hip | 11 | 23 |
| right hip | 12 | 24 |
| left knee | 13 | 25 |
| right knee | 14 | 26 |
| left ankle | 15 | 27 |
| right ankle | 16 | 28 |
|
||||
|
||||
When the Rust host integration is implemented, the joint-angle features extracted by yoga-mode in JS and by `pose_tracker.rs` in Rust will be computed from the same physical joints via this table. No translation layer is needed at runtime — yoga-mode always uses BlazePose indices; `pose_tracker.rs` always uses COCO indices.
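For offline comparison between the two conventions (for example, when diffing JS and Rust feature dumps), the table above can be carried as a plain lookup. This is a sketch only: `COCO_TO_BLAZEPOSE` and `blazeposeToCoco` are hypothetical names, and the ADR itself does not require such a helper at runtime.

```js
// COCO-17 index -> BlazePose-33 index for the 13 joints in the mapping table.
const COCO_TO_BLAZEPOSE = {
  0: 0,            // nose
  5: 11, 6: 12,    // shoulders
  7: 13, 8: 14,    // elbows
  9: 15, 10: 16,   // wrists
  11: 23, 12: 24,  // hips
  13: 25, 14: 26,  // knees
  15: 27, 16: 28,  // ankles
};

// Hypothetical helper: project a 33-entry BlazePose array onto 17 COCO slots.
// COCO slots 1-4 (eyes and ears) are not in the table and stay null.
function blazeposeToCoco(landmarks33) {
  const out = new Array(17).fill(null);
  for (const [coco, blaze] of Object.entries(COCO_TO_BLAZEPOSE)) {
    out[Number(coco)] = landmarks33[blaze];
  }
  return out;
}
```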
|
||||
|
||||
### 1.4 Biomechanical basis for joint-angle targets
|
||||
|
||||
The joint-angle targets in this ADR are grounded in peer-reviewed measurements. Perez-Testor et al. (2019, PMC6521759) captured 10 trained practitioners performing Surya Namaskar A on a 12-camera Vicon system at 100 Hz, reporting sagittal-plane joint angles at each pose transition. Key ranges: elbow 22°–116°, hip 15° extension to 134° flexion, knee 3° hyperextension to 140° flexion, spine 44° extension to 58° flexion, shoulder 56°–183°. These empirical ranges set the upper and lower bounds for the tolerance bands in this ADR's pose templates. Where Perez-Testor does not report a joint (e.g. wrist flexion for Chaturanga arm angle), the Iyengar geometry ("elbows at 90° bent close to the body") supplies the target value. A 2023 PMC yoga-pose review (PMC10280249) identifies angle-heuristic approaches as the most reliable real-time classification method, which supports the algorithmic choice.
|
||||
|
||||
---
|
||||
|
||||
## 2. Decision
|
||||
|
||||
### 2.1 Pose taxonomy — Sun Salutation A, 8 poses
|
||||
|
||||
Sun Salutation A is chosen for the first ship. It satisfies three criteria simultaneously: the poses are geometrically distinct from each other (apart from Uttānāsana, which recurs as stage 8, no two share the same joint-angle signature), they form a complete bilateral sequence (both left and right sides are exercised), and they are among the best-documented asanas in the biomechanics literature. The Sanskrit and English names are unambiguous in the Ashtanga tradition.
|
||||
|
||||
The 8 poses in sequence order with their one-line joint-angle signatures:
|
||||
|
||||
| Stage | Sanskrit | English | Joint-angle signature |
|
||||
|---|---|---|---|
|
||||
| 1 | Tāḍāsana | Mountain Pose | All limbs extended: knees 180°, hips 180°, elbows 180°, spine vertical |
|
||||
| 2 | Ūrdhva Hastāsana | Upward Salute | Arms overhead: shoulders ~180° abducted, elbows 180°, torso elongated |
|
||||
| 3 | Uttānāsana | Standing Forward Fold | Hips ~0–30° (full fold), knees 180°, elbows relaxed, spine flexed |
|
||||
| 4 | Ardha Uttānāsana | Half Lift / Flat-Back | Hips ~90° (parallel torso), knees 180°, spine neutral (horizontal) |
|
||||
| 5 | Catvāri (Chaturanga Daṇḍāsana) | Four-Limbed Staff | Hips 180° (plank line), elbows ~90°, shoulders depressed, body horizontal |
|
||||
| 6 | Ūrdhva Mukha Śvānāsana | Upward-Facing Dog | Hips extended ~160°+, shoulders over wrists, spine extended, knees off floor |
|
||||
| 7 | Adho Mukha Śvānāsana | Downward-Facing Dog | Hips ~80–110° (inverted V), knees 180°, shoulders ~180° (arms overhead), spine long |
|
||||
| 8 | Uttānāsana | Standing Forward Fold (return) | Same as stage 3 — mirrors the descent; re-classified as stage 8 for sequence tracking |
|
||||
|
||||
"All 84 classical asanas" is explicitly rejected. Even the 26-pose Bikram set is rejected — the goal is a complete, self-contained instructional sequence for a 2–3 minute demo session, not exhaustive coverage. Eight poses are the minimum for a meaningful sequence narrative and the maximum that fits a single UI strip without horizontal scrolling on a 1080p screen.
|
||||
|
||||
### 2.2 Detection algorithm — joint-angle threshold matching with weighted scoring
|
||||
|
||||
**Chosen: joint-angle threshold matching.** For each frame, compute the angle at 6–10 named joints (one angle per joint, defined as the interior angle at the vertex formed by three landmarks). Compare each computed angle to the per-pose target. Score by weighted absolute deviation. Classify the argmax.
|
||||
|
||||
**Why not the alternatives:**
|
||||
|
||||
| Alternative | Verdict | Reason |
|
||||
|---|---|---|
|
||||
| Skeleton-as-vector cosine similarity | Rejected | Position-sensitive: a person standing 2 m from the camera vs. 1 m produces different vectors. Joint angles are translation- and scale-invariant by construction. |
|
||||
| Small MLP trained on a labelled dataset | Rejected | No labelled dataset exists in this codebase. Training a reliable MLP for 8 poses would require hundreds of labelled examples per class, a train/test split, and a model serialization format — none of which belongs in a single-file demo HTML. Joint-angle matching achieves the same discrimination for 8 geometrically distinct poses with zero training data. |
|
||||
| MediaPipe Tasks PoseClassifier (EfficientNet-based) | Rejected | Requires loading a separate `.task` bundle (~4 MB), adds a network dependency to the demo's offline-capable design, and uses a black-box embedding — undebuggable when a pose is misclassified. Threshold matching is fully inspectable in DevTools. |
|
||||
| DTW template matching on full landmark sequences | Rejected | Appropriate for gesture recognition over time (ADR-014's `gesture.rs`), not static pose classification. Sun Salutation transitions are slow (2–5 seconds per pose); per-frame angle scoring is sufficient. |
|
||||
|
||||
**Joint angle computation.** For three landmark positions A (proximal), B (vertex), C (distal), the interior angle at B is:
|
||||
|
||||
```
|
||||
angle_B = arccos( dot(A-B, C-B) / (|A-B| * |C-B|) ) in degrees
|
||||
```
|
||||
|
||||
This is computed in world-space from the existing `liveKp` THREE.Vector3 array. The computation is purely arithmetic — no matrix inversion, no DFT. At 30 Hz on any modern laptop it is unmeasurably fast relative to the MediaPipe inference cost.
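A minimal sketch of that formula follows. It takes plain `{x, y, z}` objects (which `THREE.Vector3` instances also satisfy), and the function name `jointAngleDeg`, the cosine clamp, and the `NaN` return for coincident landmarks are assumptions added for illustration.

```js
// Interior angle at vertex b, in degrees, for points given as {x, y, z}.
function jointAngleDeg(a, b, c) {
  const u = { x: a.x - b.x, y: a.y - b.y, z: a.z - b.z };
  const v = { x: c.x - b.x, y: c.y - b.y, z: c.z - b.z };
  const dot = u.x * v.x + u.y * v.y + u.z * v.z;
  const len = Math.hypot(u.x, u.y, u.z) * Math.hypot(v.x, v.y, v.z);
  if (len === 0) return NaN; // degenerate: two landmarks coincide
  // Clamp to guard against floating-point drift just outside [-1, 1].
  const cos = Math.min(1, Math.max(-1, dot / len));
  return Math.acos(cos) * 180 / Math.PI;
}
```

For example, `left_elbow` would be `jointAngleDeg(liveKp[11], liveKp[13], liveKp[15])`, following the shoulder, elbow, wrist ordering.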
|
||||
|
||||
**Named joints used in yoga-mode.** Joint names, their three-landmark triplets (proximal-vertex-distal), and the BlazePose indices:
|
||||
|
||||
| Joint name | Triplet (P-V-D) | Indices |
|---|---|---|
| `left_elbow` | shoulder→elbow→wrist | 11→13→15 |
| `right_elbow` | shoulder→elbow→wrist | 12→14→16 |
| `left_knee` | hip→knee→ankle | 23→25→27 |
| `right_knee` | hip→knee→ankle | 24→26→28 |
| `left_hip` | shoulder→hip→knee | 11→23→25 |
| `right_hip` | shoulder→hip→knee | 12→24→26 |
| `left_shoulder` | hip→shoulder→elbow | 23→11→13 |
| `right_shoulder` | hip→shoulder→elbow | 24→12→14 |
| `torso_lean` | hip-midpoint→shoulder-midpoint→vertical | synthetic |
|
||||
|
||||
`torso_lean` is the angle between the hip-to-shoulder axis and the world vertical (Y axis). It separates an upright torso (≈0°) from a horizontal one (≈90°); because a flat-back fold and a plank both read ≈90°, the hip and elbow angles are what tell those two apart. In practice, it is implemented as `acos(dot(hipToShoulder.normalize(), UP_VECTOR))` where `UP_VECTOR = (0,1,0)`, with the result converted from radians to degrees to match the templates' `angle_deg` values.
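A sketch of the synthetic `torso_lean` measurement, assuming +Y is up in the `liveKp` world space and that landmarks expose `x`, `y`, `z`. The function name `torsoLeanDeg` and the `NaN` return for a zero-length axis are illustrative choices.

```js
// Angle in degrees between the hip-midpoint -> shoulder-midpoint axis and +Y.
function torsoLeanDeg(lHip, rHip, lShoulder, rShoulder) {
  const mid = (p, q) => ({ x: (p.x + q.x) / 2, y: (p.y + q.y) / 2, z: (p.z + q.z) / 2 });
  const hip = mid(lHip, rHip);
  const sh = mid(lShoulder, rShoulder);
  const d = { x: sh.x - hip.x, y: sh.y - hip.y, z: sh.z - hip.z };
  const len = Math.hypot(d.x, d.y, d.z);
  if (len === 0) return NaN; // hips and shoulders coincide: no axis to measure
  // dot(normalize(d), (0, 1, 0)) reduces to d.y / len.
  const cos = Math.min(1, Math.max(-1, d.y / len));
  return Math.acos(cos) * 180 / Math.PI; // radians -> degrees
}
```

With BlazePose indices the call would be `torsoLeanDeg(liveKp[23], liveKp[24], liveKp[11], liveKp[12])`.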
|
||||
|
||||
### 2.3 Pose template format — inline JSON, single-file portable
|
||||
|
||||
Templates live as a JS object literal inside the `<script>` block of the demo file. A sibling `poses.json` would break the single-file portability that makes demos easy to share and locally serve. The inline approach imposes no additional HTTP request and no CORS constraint.
|
||||
|
||||
**Schema** (one template per pose):
|
||||
|
||||
```js
|
||||
{
  id: "tadasana", // machine-readable ID, localStorage key fragment
  name_en: "Mountain Pose", // English common name
  name_sa: "Tāḍāsana", // Sanskrit with diacritics
  stage: 1, // position in the Sun Salutation A sequence (1-8)
  joint_targets: {
    left_elbow: { angle_deg: 180, tolerance_deg: 15, weight: 0.5 },
    right_elbow: { angle_deg: 180, tolerance_deg: 15, weight: 0.5 },
    left_knee: { angle_deg: 180, tolerance_deg: 10, weight: 1.0 },
    right_knee: { angle_deg: 180, tolerance_deg: 10, weight: 1.0 },
    left_hip: { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    right_hip: { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    torso_lean: { angle_deg: 0, tolerance_deg: 12, weight: 1.2 },
  },
  instruction: "Stand tall. Feet hip-width, weight even. Arms relaxed at your sides. Lengthen through the crown.",
  min_hold_s: 3, // seconds the pose must be held to count as completed
}
|
||||
```
|
||||
|
||||
**Schema decisions:**
|
||||
|
||||
- `tolerance_deg` is the half-width of the pass band. An angle within `[target - tolerance, target + tolerance]` contributes full score for that joint. Beyond the tolerance band the score degrades linearly to zero at `target ± (tolerance * 3)`, then clamps to zero. This linear-outside-band behaviour prevents cliff edges where being 16° off scores identically to 90° off.
|
||||
|
||||
- `weight` carries the importance signal. High-weight joints (torso_lean 1.2, knees 1.0) dominate the aggregate score. Low-weight joints (elbows 0.5 in Tadasana, where arm position is relaxed) have less influence. A weight of 0 would mask a joint entirely — used when the joint is not visible (see §2.7 graceful degradation).
|
||||
|
||||
- `min_hold_s` is per-template. Tadasana and Uttanasana are grounding poses that benefit from a 3-second hold. Chaturanga is a strength pose where 2 seconds is already challenging. The value lives in the template, not as a global constant, so future operators can tune it per pose without touching logic.
|
||||
|
||||
- There is no `max_hold_s`. Holding a pose longer than `min_hold_s` does not penalise the score.
|
||||
|
||||
**Why `tolerance_deg` over explicit pass/fail thresholds.** A binary pass/fail at a hard threshold creates a jarring UX: the alignment bar slams between 0% and 100% at a single degree of motion. Linear-outside-band degradation provides smooth visual feedback that guides the user toward the target incrementally.
|
||||
|
||||
### 2.4 Scoring formula
|
||||
|
||||
Per-frame alignment score for pose *p*, given measured angle `θ_j` at joint *j*:
|
||||
|
||||
```
|
||||
delta_j = |θ_j − target_j.angle_deg|

band_score_j =
    1.0                                                 if delta_j ≤ tolerance_j
    1.0 − (delta_j − tolerance_j) / (2 * tolerance_j)   if delta_j ≤ 3 * tolerance_j
    0.0                                                 otherwise

raw_score_p = Σ_j ( weight_j * band_score_j ) / Σ_j ( weight_j )

alignment_score_p = clamp(raw_score_p, 0.0, 1.0)
|
||||
```
|
||||
|
||||
`alignment_score_p` is a value in [0, 1]. The `#yoga-panel` displays it as an integer percentage (0–100); the progress ring is driven by the value to one decimal place so that it animates smoothly.
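The formula translates almost line for line into code. The sketch below assumes a template shaped like the §2.3 schema and an `angles` object keyed by joint name; `bandScore` and `alignmentScore` are hypothetical names, and skipping joints with no finite measurement is an added assumption.

```js
// Per-joint score: 1 inside the band, linear falloff to 0 at 3x tolerance.
function bandScore(thetaDeg, target) {
  const delta = Math.abs(thetaDeg - target.angle_deg);
  const tol = target.tolerance_deg;
  if (delta <= tol) return 1;
  if (delta <= 3 * tol) return 1 - (delta - tol) / (2 * tol);
  return 0;
}

// Weighted mean of the per-joint scores, clamped to [0, 1].
function alignmentScore(angles, jointTargets) {
  let num = 0;
  let den = 0;
  for (const [joint, target] of Object.entries(jointTargets)) {
    const theta = angles[joint];
    if (!Number.isFinite(theta)) continue; // assumption: unmeasured joints are skipped
    num += target.weight * bandScore(theta, target);
    den += target.weight;
  }
  if (den === 0) return 0;
  return Math.min(1, Math.max(0, num / den));
}
```

As a worked case: with a 10° tolerance, a joint that is 20° off target scores `1 − (20 − 10) / 20 = 0.5`, and one that is 30° or more off scores 0.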
|
||||
|
||||
**Hold-time component.** The classifier reports a pose as *completed* when two conditions are simultaneously true:
|
||||
1. The pose has been the argmax classifier output for a contiguous streak of `K = 6` frames (see §2.5).
|
||||
2. Within that streak, the alignment score has remained above 0.6 (60%) for at least `min_hold_s` seconds.
|
||||
|
||||
Completion is a one-shot event per pose per sequence pass. It fires once, advances the sequence indicator, and triggers the audible cue. The user must drop out of the pose and re-enter it to re-trigger completion — this prevents accidental re-completion during a rest pause.
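The one-shot behaviour can be sketched as a small tracker that accumulates hold time while the score stays at or above the threshold and fires exactly once. This is a simplification under stated assumptions: `createHoldTracker` is a hypothetical name, the K-frame streak condition is not modelled, and "dropping out of the pose" is approximated by the score falling below the threshold.

```js
function createHoldTracker(minHoldS, threshold = 0.6) {
  let heldS = 0;     // seconds held at or above the threshold
  let fired = false; // true once completion has been reported for this entry
  return {
    // score: per-frame alignment score; dtS: seconds since the previous frame.
    // Returns true on the single frame where the hold completes.
    update(score, dtS) {
      if (score < threshold) {
        heldS = 0;
        fired = false; // leaving the pose re-arms the tracker
        return false;
      }
      heldS += dtS;
      if (!fired && heldS >= minHoldS) {
        fired = true;
        return true;
      }
      return false;
    },
    // Fraction in [0, 1] for the hold-time progress bar.
    progress() {
      return Math.min(1, heldS / minHoldS);
    },
  };
}
```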
|
||||
|
||||
**Why 60% as the hold threshold.** At 60%, the user's joint angles are within the tolerance band on the majority of weighted joints. A strict 80% threshold would frustrate beginners; a lenient 40% threshold would fire on casual near-misses. 60% is consistent with the threshold used in the Google ML Kit PoseClassifier sample and the Perez-Testor study's reported inter-practitioner variance (mean joint-angle SD of ~10° across joints, which maps to roughly a 30% score drop relative to a perfect practitioner on a 15° tolerance band).
|
||||
|
||||
**Why not include a velocity component (punish fast transitions).** Velocity would require a time derivative of the landmark positions, and differentiating amplifies the MediaPipe jitter that remains even after the one-euro filter. Minimum hold time (2–3 s) implicitly penalises rushing through poses without adding noise sensitivity.
|
||||
|
||||
### 2.5 Pose classification flow and debounce
|
||||
|
||||
Every frame, after `ingestPoseLandmarks()` populates `liveKp`:
|
||||
|
||||
```js
|
||||
function classifyPose() {
  if (!yogaMode.enabled || !liveValid) return;
  computeJointAngles(); // fills yogaMode.angles from liveKp
  for (const p of yogaMode.activePoses) {
    p.frameScore = scorePose(p); // per-frame alignment_score_p
  }
  const best = yogaMode.activePoses.reduce((a, b) =>
    b.frameScore > a.frameScore ? b : a
  );
  if (best.frameScore > SCORE_NO_POSE_FLOOR) {
    yogaMode.streak = (yogaMode.candidate === best.id)
      ? yogaMode.streak + 1 : 1;
    yogaMode.candidate = best.id;
  } else {
    yogaMode.streak = 0;
    yogaMode.candidate = null;
  }
  if (yogaMode.streak >= K_FRAMES && yogaMode.candidate !== yogaMode.current) {
    yogaMode.current = yogaMode.candidate;
    onPoseTransition(yogaMode.current);
  }
  updateYogaHUD();
}
|
||||
```
|
||||
|
||||
**K = 6 frames** (debounce depth). At 30 Hz this corresponds to a 200 ms lag from first matching pose to classification announcement. This is long enough to suppress a one-frame flicker from a mediocre landmark result but short enough to feel instantaneous to a human moving at yoga pace (typical transition speed: 1–3 seconds).
|
||||
|
||||
Lowering K to 3 creates flickering when the user is near a pose boundary. Raising K to 12 introduces a 400 ms lag that makes the HUD feel unresponsive on quick transitions (e.g. Uttanasana → Ardha Uttanasana takes ~1 second in a vigorous practice). K = 6 is the correct value given the ~30 Hz landmark update rate.
|
||||
|
||||
**SCORE_NO_POSE_FLOOR = 0.40.** If no pose scores above 40%, yoga-mode reports "no recognised pose" and does not transition. This prevents the classifier from latching onto the closest-matching pose during, say, walking across the room or sitting at a desk. At 40%, at least a plurality of the weighted joints must be within their tolerance band — a constraint that a non-yoga posture reliably fails.
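To exercise the debounce and floor logic without a camera or DOM, the same streak rules can be written as a pure step function. This is a sketch only: `stepDebounce` and the shape of the state object are assumptions; the constants are the values given in this section.

```js
const K_FRAMES = 6;
const SCORE_NO_POSE_FLOOR = 0.40;

// Takes the previous state and this frame's best candidate; returns the next
// state plus a flag saying whether a pose transition fired on this frame.
function stepDebounce(state, bestId, bestScore) {
  let { candidate, streak, current } = state;
  if (bestScore > SCORE_NO_POSE_FLOOR) {
    streak = candidate === bestId ? streak + 1 : 1;
    candidate = bestId;
  } else {
    streak = 0;
    candidate = null;
  }
  const transitioned = streak >= K_FRAMES && candidate !== current;
  if (transitioned) current = candidate;
  return { candidate, streak, current, transitioned };
}
```

Feeding six consecutive frames of the same pose above the floor yields exactly one transition, which at 30 Hz is the 200 ms lag described above.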
|
||||
|
||||
### 2.6 UI surfaces
|
||||
|
||||
**Toggle in `#helpers` panel.** Added below the adam-mode row:
|
||||
|
||||
```html
|
||||
<label class="yoga-toggle">
  <input type="checkbox" id="yoga-mode-toggle">
  <span>yoga-mode (instructional)</span>
  <span class="swatch" style="background: var(--green)"></span>
</label>
|
||||
```
|
||||
|
||||
yoga-mode is orthogonal to adam-mode: both can be active simultaneously. It uses `data-yoga="on"` on the root `<html>` element, not `data-theme`. The attribute is distinct so that CSS selectors like `:root[data-theme="adam"]` and `:root[data-yoga="on"]` compose without conflict.
|
||||
|
||||
**`#yoga-panel` — bottom-centre overlay.** A new `<div id="yoga-panel" class="panel">` appears at the bottom centre of the viewport when yoga-mode is enabled. It is hidden (`display: none`) when yoga-mode is off, so it does not interfere with the existing layout.
|
||||
|
||||
The panel contains:
|
||||
|
||||
1. **Current pose name** — large (18px), Sanskrit name above English name below, amber colour. Falls back to "—" when no pose is recognised.
|
||||
2. **Alignment score ring** — a small SVG `<circle>` progress ring (r=22, stroke-dasharray) updating on every classified frame. Score 0–100 shown as integer inside the ring.
|
||||
3. **Hold-time progress bar** — a `<div class="bar-track">` identical in style to the CSI bars, filling from 0% to 100% as the hold-time accumulates. Resets on pose transition.
|
||||
4. **Instruction text** — one line from the current pose's `instruction` field, `font-size: 10px`, `color: var(--text-mute)`.
|
||||
5. **Visibility warning** — a `<span class="yoga-warn">` shown in `var(--red)` when `torso_not_visible` is true (see §2.7).
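For the alignment ring in item 2, the usual `stroke-dasharray` technique sets the dash length to the circle's circumference and shortens the visible arc through `stroke-dashoffset`. A sketch of that arithmetic for the r = 22 ring follows; `ringDash` is a hypothetical helper, and how its values are written onto the SVG element is left to the implementation.

```js
// Returns the dash values and integer label for a score in [0, 1].
function ringDash(score, radius = 22) {
  const circumference = 2 * Math.PI * radius;
  const clamped = Math.min(1, Math.max(0, score));
  return {
    dasharray: circumference,                   // one dash spanning the full ring
    dashoffset: circumference * (1 - clamped),  // hide the unfilled portion
    label: String(Math.round(clamped * 100)),   // integer shown inside the ring
  };
}
```

With r = 22 the circumference is about 138.2 px, so a score of 0.5 leaves roughly 69.1 px of the stroke hidden.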
|
||||
|
||||
**Sequence strip — top-centre.** A horizontal strip of 8 thumbnail slots (`<div class="yoga-strip">`) spanning the top of the viewport (z-index above the titlecard, below `#info`). Each slot contains the pose's stage number and a 3-letter abbreviation (TAD, URD, UTT, ARD, CAT, UPD, DOG, UT2). Slots are styled:
|
||||
|
||||
- **Dimmed** (opacity 0.3, `var(--text-mute)` text) — not yet reached.
|
||||
- **Active** (opacity 1.0, `var(--amber)` border glow, pulsing) — current pose.
|
||||
- **Completed** (opacity 0.7, `var(--green)` checkmark `✓`, no glow) — held for `min_hold_s` seconds.
|
||||
|
||||
The strip does not scroll. Eight slots at ~90px each fit a 720px-wide viewport. On narrower screens the strip compresses gracefully because the slots use `flex: 1` within a `display: flex` container.
|
||||
|
||||
**Audible cue.** A single `<audio id="yoga-bell" src="data:audio/wav;base64,..." preload="auto">` element. The WAV is a 0.4-second C5 bell tone encoded inline as base64 (~12 KB). This preserves the single-file portability. It fires once on pose completion via `yogaBell.currentTime = 0; yogaBell.play()`. A `muted` toggle in `#helpers` (beneath the yoga-mode checkbox) allows the user to silence it: `<label><input type="checkbox" id="yoga-mute-toggle"> mute bell</label>`. The bell is muted by default (`yogaBell.muted = true`) to avoid startling first-time users.
|
||||
|
||||
**Theme compatibility.** `#yoga-panel` and the sequence strip use only existing custom properties: `var(--bg-panel)`, `var(--border)`, `var(--amber)`, `var(--amber-hot)`, `var(--text)`, `var(--text-mute)`, `var(--green)`, `var(--red)`. No new CSS variables are introduced. The panel therefore inherits both the default dark theme and adam-mode automatically — the same mechanism described in ADR-169 §2.1.
|
||||
|
||||
### 2.7 Camera / MediaPipe assumptions and graceful degradation
|
||||
|
||||
**Expected input:** front-facing camera, full body from head to ankles in frame, neutral indoor lighting. The demo's existing camera pipeline already requests `{ video: { facingMode: 'user', width: 640, height: 480 } }`. No change to the MediaPipe setup.
|
||||
|
||||
**Graceful degradation when body is partially out of frame.** MediaPipe assigns a `visibility` score in [0, 1] to each landmark. When a landmark's visibility drops below 0.35, yoga-mode treats that joint as missing:
|
||||
|
||||
```js
// `target` is one entry of a template's joint_targets: { angle_deg, tolerance_deg, weight }.
function effectiveWeight(jointName, target) {
  const vis = jointVisibility(jointName);                // min visibility of the 3 landmarks
  if (vis < 0.35) return 0.0;                            // joint masked — not counted
  if (vis < 0.65) return target.weight * (vis / 0.65);   // partial weight
  return target.weight;                                  // fully visible: full weight
}
```
|
||||
|
||||
When two or more of the high-weight joints (knees, hips, torso_lean) are masked simultaneously, `Σ_j(weight_j)` falls below a minimum viable total, and `alignment_score_p` is set to 0 regardless of the numerator. This prevents spurious high scores from a partially visible body where only one or two low-weight joints (e.g. elbows) are visible and happen to match a pose.
|
||||
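The minimum-viable-total gate can be sketched as follows. The ADR does not state the threshold, so `MIN_VIABLE_FRACTION` (a fraction of the template's full weight) is a hypothetical value chosen for illustration, and `effectiveWeightFor`, `visibleWeightFraction`, and `gatedScore` are hypothetical names. The visibility breakpoints (0.35 and 0.65) are the ones given above.

```js
// Hypothetical threshold: the ADR does not specify the minimum viable total.
const MIN_VIABLE_FRACTION = 0.75;

// Same weighting rule as effectiveWeight(), but with visibility passed in.
function effectiveWeightFor(target, vis) {
  if (vis < 0.35) return 0.0;
  if (vis < 0.65) return target.weight * (vis / 0.65);
  return target.weight;
}

// Fraction of the template's total weight that is currently observable.
function visibleWeightFraction(targets, visibility) {
  let full = 0;
  let effective = 0;
  for (const [joint, target] of Object.entries(targets)) {
    full += target.weight;
    effective += effectiveWeightFor(target, visibility[joint] ?? 0);
  }
  return full > 0 ? effective / full : 0;
}

// Force the score to 0 when too little of the weighted body is visible.
function gatedScore(rawScore, targets, visibility) {
  return visibleWeightFraction(targets, visibility) < MIN_VIABLE_FRACTION ? 0 : rawScore;
}
```

With the Tāḍāsana weights, masking both knees leaves 3.6 of 5.6 total weight (about 0.64), so under this assumed threshold the score is forced to 0, while masking a single elbow leaves about 0.93 and the raw score passes through.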
|
||||
The `#yoga-panel` surfaces a `torso_not_visible` warning ("Move back — full body not in frame") in `var(--red)` whenever `liveVis[23] < 0.35 || liveVis[24] < 0.35` (left or right hip not visible). The hips are the reference joints for both `torso_lean` and the hip-angle computation, so their absence makes the entire classifier unreliable.
|
||||
|
||||
### 2.8 Cross-demo applicability
|
||||
|
||||
**yoga-mode ships in demo 05 only for the first iteration.** Demos 03 and 04 do not have a MediaPipe pipeline; there are no `liveKp` landmarks to score. Adding yoga-mode to them would require pulling in the entire MediaPipe Pose Heavy CDN script — changing those demos' character and load time.
|
||||
|
||||
**New demo: `06-yoga-mode.html`.** A new file `examples/three.js/demos/06-yoga-mode.html` is introduced as a slimmed-down variant of demo 05 where yoga-mode is the primary focus rather than an optional overlay. Differences from demo 05:
|
||||
|
||||
- The CSI panel (`#csi`) and the tomography sweep are hidden by default (`display: none`).
|
||||
- The `#yoga-panel` is expanded to a larger centre-screen layout with a bigger score ring (r=44) and larger pose name text (24px).
|
||||
- The sequence strip is rendered larger (100px slot width).
|
||||
- The `#helpers` panel shows only the yoga-related toggles (yoga-mode, adam-mode, mute bell).
|
||||
- The titlecard text reads "RuView · Yoga Mode".
|
||||
|
||||
This file starts as a copy of demo 05 with the CSI and tomography sections hidden. The `YogaMode` object and pose templates are carried over verbatim, so the two files contain the same logic and must be kept in sync; demo 06 changes layout only.
|
||||
|
||||
A sixth demo file is introduced, rather than making demo 05's yoga features more prominent, because the two serve different audiences: demo 05 is a complete multi-feature demo (CSI + MediaPipe + IK retarget), while demo 06 is a single-purpose instructional demo. Evaluators who want to show the yoga system without the RF sensing visualisations use demo 06.
|
||||
|
||||
### 2.9 Persistence
|
||||
|
||||
User settings are persisted in `localStorage` under the `ruview.yoga.*` namespace:
|
||||
|
||||
| Key | Type | Value shape | Default |
|
||||
|---|---|---|---|
|
||||
| `ruview.yoga.enabled` | boolean string | `"true"` or `"false"` | `"false"` |
|
||||
| `ruview.yoga.muted` | boolean string | `"true"` or `"false"` | `"true"` |
|
||||
| `ruview.yoga.tolerance_scale` | float string | `"0.5"` to `"2.0"` | `"1.0"` |
|
||||
| `ruview.yoga.sequence` | JSON string | `["tadasana","urdhva_hastasana",…]` | full 8-pose sequence |
|
||||
|
||||
`tolerance_scale` is a global multiplier applied to every `tolerance_deg` value in every template. A scale of 0.5 makes the classifier strict (tight bands); a scale of 2.0 makes it forgiving (wide bands). The HUD exposes this as a simple "Difficulty" slider: Easy (2.0×), Normal (1.0×), Strict (0.5×). The default is Normal.
|
||||
|
||||
`ruview.yoga.sequence` allows an operator to load a custom subset or reordering of the 8 poses, or to load additional poses added via `YogaMode.addPose()`. The array contains pose `id` strings. On load, yoga-mode resolves each ID against the registered template map; unknown IDs are skipped with a console warning.
|
||||
|
||||
All `localStorage` accesses are wrapped in try/catch to handle privacy-restricted origins.
|
||||
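A minimal sketch of the guarded reads described in this section, assuming a `storage` object with the standard `getItem()` method and a `registry` `Map` of pose id to template. The helper names (`readSetting`, `loadToleranceScale`, `loadSequence`) are hypothetical, and clamping out-of-range scales to [0.5, 2.0] is an assumption; the ADR only states the allowed value shape.

```js
// Read one key; any storage failure (e.g. privacy-restricted origin) yields the fallback.
function readSetting(storage, key, fallback) {
  try {
    const raw = storage.getItem(key);
    return raw === null ? fallback : raw;
  } catch (err) {
    return fallback;
  }
}

// Parse the float string and clamp it to the documented range (clamping is assumed).
function loadToleranceScale(storage) {
  const value = Number.parseFloat(readSetting(storage, 'ruview.yoga.tolerance_scale', '1.0'));
  if (!Number.isFinite(value)) return 1.0;
  return Math.min(2.0, Math.max(0.5, value));
}

// Resolve stored ids against the registered templates; unknown ids are skipped with a warning.
function loadSequence(storage, registry, defaultIds) {
  let ids = defaultIds;
  try {
    const parsed = JSON.parse(readSetting(storage, 'ruview.yoga.sequence', 'null'));
    if (Array.isArray(parsed)) ids = parsed;
  } catch (err) {
    // Malformed JSON: keep the default sequence.
  }
  return ids.filter((id) => {
    const known = registry.has(id);
    if (!known) console.warn(`yoga-mode: unknown pose id "${id}" skipped`);
    return known;
  });
}
```

In the browser, `storage` would be `window.localStorage`; passing it in as a parameter keeps the helpers testable without a DOM.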
|
||||
### 2.10 JS API surface
|
||||
|
||||
yoga-mode exposes a clean internal module object. Because the demo is a single-file HTML with no ES module bundler, the pattern is a plain object literal assigned to a local `const`:
|
||||
|
||||
```js
const YogaMode = {
  // ---- Lifecycle ----
  init(opts = {}) {},       // wire up UI, register pose templates, restore localStorage
  enable() {},              // set data-yoga="on", show #yoga-panel, start classifying
  disable() {},             // remove data-yoga="on", hide #yoga-panel, reset state

  // ---- Classification callbacks ----
  onPoseChanged(cb) {},     // cb(poseId: string | null) — fires on confirmed transition
  onPoseScored(cb) {},      // cb(scores: {[poseId]: number}) — fires every frame
  onPoseCompleted(cb) {},   // cb(poseId: string, holdMs: number) — fires on hold completion

  // ---- Template management ----
  addPose(template) {},     // validate and register a custom pose template
  removePose(id) {},        // remove a template by id (built-ins can be removed)
  poses() {},               // returns Array<PoseTemplate> — current registered set

  // ---- State accessors ----
  currentPose() {},         // returns current confirmed pose id or null
  currentScore() {},        // returns alignment score [0,1] of current pose or 0
  angles() {},              // returns the latest computed joint angles object

  // ---- Sequence control ----
  resetSequence() {},       // clears all completion state, restarts from stage 1
  setSequence(ids) {},      // replace active sequence with a custom id array

  // Internal state — not part of the public API (placeholder initial values):
  _state: {
    enabled: false, candidate: null, current: null,
    streak: 0, holdStart: null, completedSet: new Set(),
  },
};
```
|
||||
|
||||
`onPoseChanged`, `onPoseScored`, and `onPoseCompleted` follow the same pattern as the demo's existing event hooks: they register a single callback (last-writer wins, not an array). This is sufficient for a single-file demo where there is at most one consumer per event. A future multi-listener pattern would need a `listeners` array; that is out of scope.
|
||||
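The single-callback ("last writer wins") pattern can be sketched as a tiny hook factory. `makeHook` is a hypothetical helper used here only to illustrate the behaviour; the demo's existing hooks may be written differently.

```js
// Hypothetical helper: holds at most one callback; a later set() replaces the earlier one.
function makeHook() {
  let callback = null;
  return {
    set(cb) { callback = typeof cb === 'function' ? cb : null; },
    fire(...args) { if (callback) callback(...args); },
  };
}

// Example wiring for one of the three events:
const poseChanged = makeHook();
poseChanged.set((poseId) => console.log('first consumer', poseId));
poseChanged.set((poseId) => console.log('second consumer', poseId)); // replaces the first
poseChanged.fire('tadasana'); // only the second consumer runs
```

Moving to multiple listeners later would mean replacing the single `callback` slot with an array, which is the out-of-scope change noted above.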
|
||||
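`addPose(template)` validates the template schema before registering it. A template that is missing `joint_targets`, or whose `id` contains characters other than letters, digits, and underscores (the built-in ids such as `urdhva_hastasana` use underscores), is rejected with a `console.error` and `addPose` returns `false`. A valid template is registered and `addPose` returns `true`.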
|
||||
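A minimal sketch of the validation step, under stated assumptions: the id pattern allows letters, digits, and underscores because the built-in ids (for example `urdhva_hastasana`) contain underscores, and the per-target numeric checks go beyond what the ADR specifies. `validateTemplate` and `POSE_ID_PATTERN` are hypothetical names.

```js
// Assumed id rule: letters, digits, underscores.
const POSE_ID_PATTERN = /^[A-Za-z0-9_]+$/;

// Returns true when the template is safe to register, false otherwise.
function validateTemplate(template) {
  if (!template || typeof template !== 'object') return false;
  if (typeof template.id !== 'string' || !POSE_ID_PATTERN.test(template.id)) return false;

  const targets = template.joint_targets;
  if (!targets || typeof targets !== 'object' || Object.keys(targets).length === 0) return false;

  // Extra checks (an assumption, not stated in the ADR): every target is fully numeric.
  return Object.values(targets).every((t) =>
    t !== null && typeof t === 'object' &&
    Number.isFinite(t.angle_deg) &&
    Number.isFinite(t.tolerance_deg) && t.tolerance_deg > 0 &&
    Number.isFinite(t.weight) && t.weight >= 0);
}
```

`addPose()` would call this first, log with `console.error` and return `false` on failure, and otherwise store the template and return `true`.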
|
||||
### 2.11 Pose templates — Sun Salutation A joint targets
|
||||
|
||||
The full 8-pose template set. Angle targets are derived from Perez-Testor et al. (2019) Vicon measurements and Iyengar alignment geometry. Tolerances are set to twice the reported inter-practitioner SD (~10°) rounded to the nearest 5°, then scaled by the user's `tolerance_scale`.
|
||||
|
||||
**Stage 1 — Tāḍāsana (Mountain Pose)**
|
||||
|
||||
All joints extended. Body in anatomical position. Baseline for comparison.
|
||||
|
||||
```js
{ id: "tadasana", name_en: "Mountain Pose", name_sa: "Tāḍāsana", stage: 1,
  min_hold_s: 3,
  joint_targets: {
    left_knee:   { angle_deg: 180, tolerance_deg: 10, weight: 1.0 },
    right_knee:  { angle_deg: 180, tolerance_deg: 10, weight: 1.0 },
    left_hip:    { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    right_hip:   { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    torso_lean:  { angle_deg: 0,   tolerance_deg: 10, weight: 1.2 },
    left_elbow:  { angle_deg: 180, tolerance_deg: 20, weight: 0.4 },
    right_elbow: { angle_deg: 180, tolerance_deg: 20, weight: 0.4 },
  },
  instruction: "Stand tall. Feet hip-width, weight even. Arms at sides. Lengthen through the crown.",
}
```
|
||||
|
||||
**Stage 2 — Ūrdhva Hastāsana (Upward Salute)**
|
||||
|
||||
Arms sweep overhead. Shoulders maximally abducted. Distinguishing feature: both elbows extended and arms overhead (shoulder angle approaches 180° abduction). Perez-Testor reports shoulder elevation of 183° at peak overhead position.
|
||||
|
||||
```js
{ id: "urdhva_hastasana", name_en: "Upward Salute", name_sa: "Ūrdhva Hastāsana", stage: 2,
  min_hold_s: 2,
  joint_targets: {
    left_shoulder:  { angle_deg: 165, tolerance_deg: 20, weight: 1.2 },
    right_shoulder: { angle_deg: 165, tolerance_deg: 20, weight: 1.2 },
    left_elbow:     { angle_deg: 180, tolerance_deg: 15, weight: 0.8 },
    right_elbow:    { angle_deg: 180, tolerance_deg: 15, weight: 0.8 },
    left_knee:      { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    right_knee:     { angle_deg: 180, tolerance_deg: 12, weight: 0.8 },
    torso_lean:     { angle_deg: 0,   tolerance_deg: 15, weight: 0.7 },
  },
  instruction: "Inhale. Sweep arms overhead. Palms face each other. Gaze forward or slightly up.",
}
```
|
||||
|
||||
**Stage 3 — Uttānāsana (Standing Forward Fold)**
|
||||
|
||||
Deep hip flexion. Torso approaches vertical-inverted. Perez-Testor reports hip flexion of 134°. The angle at the hip joint as computed by our triplet (shoulder→hip→knee) goes to ~30° as the torso folds toward the legs. Knees remain extended.
|
||||
|
||||
```js
{ id: "uttanasana", name_en: "Standing Forward Fold", name_sa: "Uttānāsana", stage: 3,
  min_hold_s: 3,
  joint_targets: {
    left_hip:   { angle_deg: 40,  tolerance_deg: 25, weight: 1.2 },
    right_hip:  { angle_deg: 40,  tolerance_deg: 25, weight: 1.2 },
    left_knee:  { angle_deg: 175, tolerance_deg: 15, weight: 1.0 },
    right_knee: { angle_deg: 175, tolerance_deg: 15, weight: 1.0 },
    torso_lean: { angle_deg: 85,  tolerance_deg: 20, weight: 1.0 },
  },
  instruction: "Exhale. Fold forward from the hips. Let the crown of the head drop toward the floor.",
}
```
|
||||
|
||||
**Stage 4 — Ardha Uttānāsana (Half Lift / Flat-Back)**
|
||||
|
||||
Torso lifts to horizontal. Hip angle opens to ~90°. Spine neutral. Among the standing poses this is the most distinctive for classification: it is the only one where the torso is neither upright nor fully folded, with `torso_lean` at ~90° and the hips also at ~90°. (Downward Dog in stage 7 shares these hip and torso targets; its shoulder targets and its position in the sequence separate the two.) Perez-Testor reports the half-lift as an intermediate transition posture; the distinguishing cue is a ~90° hip angle held together with a neutral (not flexed) spine.
|
||||
|
||||
```js
{ id: "ardha_uttanasana", name_en: "Half Lift", name_sa: "Ardha Uttānāsana", stage: 4,
  min_hold_s: 2,
  joint_targets: {
    left_hip:    { angle_deg: 90,  tolerance_deg: 20, weight: 1.2 },
    right_hip:   { angle_deg: 90,  tolerance_deg: 20, weight: 1.2 },
    left_knee:   { angle_deg: 175, tolerance_deg: 12, weight: 0.8 },
    right_knee:  { angle_deg: 175, tolerance_deg: 12, weight: 0.8 },
    torso_lean:  { angle_deg: 90,  tolerance_deg: 15, weight: 1.2 },
    left_elbow:  { angle_deg: 180, tolerance_deg: 20, weight: 0.5 },
    right_elbow: { angle_deg: 180, tolerance_deg: 20, weight: 0.5 },
  },
  instruction: "Inhale. Lift the chest. Flat back. Fingertips on the shins or floor. Gaze forward.",
}
```
|
||||
|
||||
**Stage 5 — Catvāri / Chaturanga Daṇḍāsana (Four-Limbed Staff)**
|
||||
|
||||
Plank lowered. Elbows at 90°. Body horizontal. This is the hardest pose to classify from a front-facing camera alone: the body is horizontal and the depth axis is ambiguous. The key discriminator is `elbow_angle ≈ 90°` combined with `hip ≈ 180°` (no flexion) and `torso_lean ≈ 90°`. Note: from a front-facing camera, a person in Chaturanga facing the camera appears foreshortened. yoga-mode accepts this limitation and primarily tracks Chaturanga as the transition between Ardha Uttanasana and Upward Dog in the sequence, with lower weight on spatial cues and higher weight on elbow angle. Iyengar geometry specifies elbows at 90° against the body.
|
||||
|
||||
```js
{ id: "chaturanga", name_en: "Four-Limbed Staff", name_sa: "Catvāri / Chaturanga Daṇḍāsana", stage: 5,
  min_hold_s: 2,
  joint_targets: {
    left_elbow:  { angle_deg: 90,  tolerance_deg: 20, weight: 1.5 },
    right_elbow: { angle_deg: 90,  tolerance_deg: 20, weight: 1.5 },
    left_hip:    { angle_deg: 175, tolerance_deg: 15, weight: 0.8 },
    right_hip:   { angle_deg: 175, tolerance_deg: 15, weight: 0.8 },
    left_knee:   { angle_deg: 175, tolerance_deg: 15, weight: 0.6 },
    right_knee:  { angle_deg: 175, tolerance_deg: 15, weight: 0.6 },
    torso_lean:  { angle_deg: 90,  tolerance_deg: 20, weight: 0.7 },
  },
  instruction: "Lower down. Elbows at 90°, hugged to the ribs. Body in one straight line.",
}
```
|
||||
|
||||
**Stage 6 — Ūrdhva Mukha Śvānāsana (Upward-Facing Dog)**
|
||||
|
||||
Hips extend, spine extends (backbend), shoulders over wrists, knees off floor. Distinguishing feature: hips are near 160–180° (extended), which is the opposite of Uttanasana's deep flexion. The `torso_lean` reverses from ~90° horizontal to approaching 0° or slightly past vertical (slight backbend). Perez-Testor's spine extension of 44° is the reference for the backbend component; the hip angle opens to near-full extension.
|
||||
|
||||
```js
{ id: "urdhva_mukha_svanasana", name_en: "Upward-Facing Dog", name_sa: "Ūrdhva Mukha Śvānāsana", stage: 6,
  min_hold_s: 2,
  joint_targets: {
    left_hip:    { angle_deg: 165, tolerance_deg: 20, weight: 1.2 },
    right_hip:   { angle_deg: 165, tolerance_deg: 20, weight: 1.2 },
    left_elbow:  { angle_deg: 170, tolerance_deg: 20, weight: 0.8 },
    right_elbow: { angle_deg: 170, tolerance_deg: 20, weight: 0.8 },
    left_knee:   { angle_deg: 170, tolerance_deg: 20, weight: 0.6 },
    right_knee:  { angle_deg: 170, tolerance_deg: 20, weight: 0.6 },
    torso_lean:  { angle_deg: 15,  tolerance_deg: 20, weight: 0.8 },
  },
  instruction: "Press the tops of the feet down. Lift the chest. Shoulders away from the ears. Gaze forward.",
}
```
|
||||
|
||||
**Stage 7 — Adho Mukha Śvānāsana (Downward-Facing Dog)**
|
||||
|
||||
Hips high. Inverted V. The most geometrically distinct pose in the sequence: high hips, extended knees, arms overhead-ish (shoulder angle ~150° relative to torso), torso_lean ~90° but in the opposite direction to Chaturanga (body weight shifted back over the heels). The hip angle as measured by our shoulder→hip→knee triplet is ~80–110° (the pelvis is high, creating a roughly right-angle fold at the hip). Perez-Testor reports the hip-angle transition from Chaturanga to Downward Dog as the largest single-frame angle change in the sequence (~120° excursion), making it the easiest pose to classify correctly.
|
||||
|
||||
```js
{ id: "adho_mukha_svanasana", name_en: "Downward-Facing Dog", name_sa: "Adho Mukha Śvānāsana", stage: 7,
  min_hold_s: 5,
  joint_targets: {
    left_hip:       { angle_deg: 90,  tolerance_deg: 25, weight: 1.2 },
    right_hip:      { angle_deg: 90,  tolerance_deg: 25, weight: 1.2 },
    left_knee:      { angle_deg: 180, tolerance_deg: 15, weight: 1.0 },
    right_knee:     { angle_deg: 180, tolerance_deg: 15, weight: 1.0 },
    left_shoulder:  { angle_deg: 150, tolerance_deg: 25, weight: 0.8 },
    right_shoulder: { angle_deg: 150, tolerance_deg: 25, weight: 0.8 },
    torso_lean:     { angle_deg: 90,  tolerance_deg: 20, weight: 0.7 },
  },
  instruction: "Hips up and back. Heels reaching toward the floor. Arms and ears in one line. Breathe.",
}
```
|
||||
|
||||
**Stage 8 — Uttānāsana (Standing Forward Fold, return)**
|
||||
|
||||
Identical to stage 3 in geometry. Classified as stage 8 for sequence-tracking purposes only — same template joint targets, different `id` and `stage` value.
|
||||
|
||||
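```js
{ id: "uttanasana_return", name_en: "Standing Forward Fold (return)", name_sa: "Uttānāsana", stage: 8,
  min_hold_s: 2,
  joint_targets: { // same targets as stage 3
    left_hip:   { angle_deg: 40,  tolerance_deg: 25, weight: 1.2 },
    right_hip:  { angle_deg: 40,  tolerance_deg: 25, weight: 1.2 },
    left_knee:  { angle_deg: 175, tolerance_deg: 15, weight: 1.0 },
    right_knee: { angle_deg: 175, tolerance_deg: 15, weight: 1.0 },
    torso_lean: { angle_deg: 85,  tolerance_deg: 20, weight: 1.0 },
  },
  instruction: "Step or jump to the front. Exhale. Release the head. Return to stillness.",
}
```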
|
||||
|
||||
Distinguishing stages 3 and 8 is handled by the sequence-tracking layer, not by the classifier. If yoga-mode is in stage 7 (Downward Dog) and detects a forward-fold shape, it advances to stage 8 rather than regressing to stage 3. If yoga-mode is in stage 2 (Ūrdhva Hastāsana) and detects a forward-fold shape, it advances to stage 3. A forward fold detected at any other point is not credited (see §3.3). The sequence moves forward only; there is no backward regression in the first implementation.
|
||||
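The forward-only rule can be sketched as a small pure function. This follows the stricter reading in §3.3 (a stage is credited only when it is the immediate next one in the sequence). `SEQUENCE`, `shapeOf`, and `creditStage` are hypothetical names; the ids are the ones from the templates above.

```js
// Pose ids in stage order (stage n is SEQUENCE[n - 1]).
const SEQUENCE = [
  'tadasana', 'urdhva_hastasana', 'uttanasana', 'ardha_uttanasana',
  'chaturanga', 'urdhva_mukha_svanasana', 'adho_mukha_svanasana', 'uttanasana_return',
];

// Stages 3 and 8 share one geometric shape, so the classifier cannot tell them apart.
function shapeOf(id) {
  return id === 'uttanasana_return' ? 'uttanasana' : id;
}

// position: number of stages already credited (0..8).
// detectedId: the classifier's confirmed pose id, or null.
// Returns the new position; it never decreases and never skips a stage.
function creditStage(position, detectedId) {
  if (detectedId === null || position >= SEQUENCE.length) return position;
  const next = SEQUENCE[position];
  return shapeOf(next) === shapeOf(detectedId) ? position + 1 : position;
}
```

A forward fold detected at position 2 credits stage 3, the same shape at position 7 credits stage 8, and at position 1 it is ignored until Ūrdhva Hastāsana has been credited.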
|
||||
### 2.12 Test plan
|
||||
|
||||
**Manual — live camera:**
|
||||
Stand in front of the workstation USB camera (ruvzen, confirmed front-facing in CLAUDE.local.md). Enable yoga-mode from `#helpers`. Cycle through all 8 poses in order. For each pose, verify that the HUD shows the correct Sanskrit and English name once the K=6 debounce (§2.5) confirms the pose, about 200 ms at 30 Hz after entering it; that the alignment score exceeds 60%; and that the sequence strip advances. Verify that no pose is classified when standing in a casual at-rest position (the score should stay below the 40% floor for all 8 poses).
|
||||
|
||||
**Synthetic — test mode triggered by `?test=1` URL parameter:**
|
||||
When the page URL carries `?test=1`, yoga-mode enters a headless test mode: instead of reading from `liveKp`, it reads from a pre-recorded `YOGA_TEST_FIXTURES` array, one landmark array per pose, generated at authoring time by capturing the real `liveKp` values during a manual demo session.
|
||||
|
||||
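```js
// Hypothetical constant mirroring the K=6 debounce in §2.5.
const DEBOUNCE_FRAMES = 6;

// Run only for ?test=1, not for any other value of the parameter.
if (new URLSearchParams(location.search).get('test') === '1') {
  for (const fixture of YOGA_TEST_FIXTURES) {
    // Replay the same frame K times so the debounce can confirm the pose.
    for (let i = 0; i < DEBOUNCE_FRAMES; i++) {
      ingestPoseLandmarks(fixture.landmarks);
      classifyPose();
    }
    const result = YogaMode.currentPose();
    console.assert(result === fixture.expected_id,
      `FAIL: expected ${fixture.expected_id}, got ${result}`);
  }
  console.log('YogaMode tests complete');
}
```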
|
||||
|
||||
The positive fixture set has 8 entries (one per pose). Each entry is a hard-coded `landmarks` array of 33 objects with `{x, y, z, visibility}` values. The fixtures are inlined in the `<script>` block and are consumed only inside the `test=1` check, so they are never replayed in normal operation.
|
||||
|
||||
**Negative test:** A ninth fixture entry with the user standing in a neutral at-rest position (arms at sides but knees slightly bent, casual posture — not a yoga pose). Assert `YogaMode.currentPose() === null` (no pose above the 0.40 floor).
|
||||
|
||||
**Regression guard for joint-angle computation:** A tenth fixture that hard-codes known landmark positions forming a right angle at the left knee (three points forming a precise 90° angle). Assert `YogaMode.angles().left_knee` is within ±0.5° of 90.
|
||||
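A minimal sketch of the angle computation that this regression fixture exercises, assuming the angle at a joint is the angle between the two segments meeting at the middle landmark. The function name `jointAngleDeg` is hypothetical, and whether §2.2 uses 2D or 3D coordinates is not shown here, so `z` is treated as optional.

```js
// Angle in degrees at vertex b, formed by the points a -> b -> c.
// Uses atan2(|v1 x v2|, v1 . v2), which stays accurate near 0 and 180 degrees.
function jointAngleDeg(a, b, c) {
  const v1 = { x: a.x - b.x, y: a.y - b.y, z: (a.z ?? 0) - (b.z ?? 0) };
  const v2 = { x: c.x - b.x, y: c.y - b.y, z: (c.z ?? 0) - (b.z ?? 0) };

  const dot = v1.x * v2.x + v1.y * v2.y + v1.z * v2.z;
  const crossMag = Math.hypot(
    v1.y * v2.z - v1.z * v2.y,
    v1.z * v2.x - v1.x * v2.z,
    v1.x * v2.y - v1.y * v2.x,
  );

  // Result is in [0, 180]; coincident points yield 0.
  return Math.atan2(crossMag, dot) * 180 / Math.PI;
}

// Right angle at the knee: hip directly above, ankle directly to the side.
const hip = { x: 0.5, y: 0.4 };
const knee = { x: 0.5, y: 0.6 };
const ankle = { x: 0.7, y: 0.6 };
console.assert(Math.abs(jointAngleDeg(hip, knee, ankle) - 90) < 0.5, 'left_knee should be 90°');
```

The same three-point form covers every triplet in the templates (for example shoulder, hip, knee for the hip angle).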
|
||||
### 2.13 Rejected alternatives
|
||||
|
||||
| Alternative | Rejected because |
|
||||
|---|---|
|
||||
| Train a custom MLP on a labelled yoga dataset | No labelled dataset in this codebase. Training requires hundreds of examples per class, a train/test pipeline, and a serialized model file, all incompatible with a single-file demo. Joint-angle matching is expected to discriminate 8 geometrically distinct poses adequately with zero training data. |
|
||||
| Use a paid SaaS pose-classification API (e.g. a commercial yoga scoring cloud service) | Introduces an external network dependency, a per-request cost, and a privacy concern (camera frames leaving the browser). Pure client-side is a hard requirement. |
|
||||
| Ship audio/video instructional content (video of an instructor demonstrating each pose) | Massively increases the demo's asset footprint. A single instructor video per pose at 15 fps, 10 seconds, compressed, is ~500 KB × 8 = 4 MB minimum. The inline base64 bell (~12 KB) is the correct granularity of embedded media for this demo. |
|
||||
| Ship a backend yoga-tracking session record (store per-session completion data to a server) | No backend endpoint exists or is planned for the demos. Client-only; persistence via `localStorage`. |
|
||||
| Integrate with the Rust `pose_tracker.rs` now | Convention mismatch (BlazePose-33 vs COCO-17) documented in §1.3 but the cost of bridging it outweighs the benefit for a demo. The bridge is deferred: yoga-mode in JS is valuable without it. Rust integration becomes tractable once a WebSocket protocol for streaming joint angles (not raw CSI) from the sensing server is defined — a separate ADR. |
|
||||
| Use MediaPipe Tasks `PoseLandmarker` with a built-in `PoseClassifier` task | The Tasks API requires loading a `.task` bundle (~4 MB) from CDN at runtime. Demo 05 already uses the older `@mediapipe/pose@0.5` CDN script; switching APIs would require rewriting the entire landmark ingest pipeline. The classifier task is a black box undebuggable in DevTools. Threshold matching is fully transparent. |
|
||||
| Put yoga-mode on `data-theme` alongside adam-mode | yoga-mode is not a theme — it is a feature toggle. Mixing it with the theme attribute would prevent simultaneous adam-mode + yoga-mode activation and would conflate presentation with functionality. Separate `data-yoga="on"` attribute is the correct model. |
|
||||
|
||||
---
|
||||
|
||||
## 3. Consequences
|
||||
|
||||
### 3.1 Positive
|
||||
|
||||
- The retargeting pipeline in demo 05 gains a per-pose regression test harness (`?test=1`) at no additional tooling cost.
|
||||
- yoga-mode operates on the existing `liveKp` stream, so the added CPU cost is limited to a few arctangent calls per frame (estimated at ~50 µs per frame at 30 Hz).
|
||||
- The pose-scoring formula is fully deterministic and inspectable: `console.log(YogaMode.angles())` in DevTools shows every joint angle on every frame.
|
||||
- Demo 06 provides a clean instructional-first presentation that separates yoga-mode from the RF sensing visualisations, making the feature accessible to a fitness-context audience.
|
||||
- The `YogaMode.addPose()` API allows operators to extend the template library without touching core logic — enabling future pose sets (Warrior series, Yin postures) as a follow-on.
|
||||
- The `tolerance_scale` persistence allows the same demo codebase to serve both beginners (2× tolerance) and experienced practitioners (0.5× tolerance) without code changes.
|
||||
|
||||
### 3.2 Negative
|
||||
|
||||
- Two HTML files to maintain (`05` and `06`) where previously there was one. Mitigated by the fact that yoga-mode logic is identical between them — demo 06 is a layout variant, not a code fork.
|
||||
- Chaturanga Dandasana classification is inherently degraded from a front-facing camera (the body is horizontal; the depth axis is ambiguous). The classifier can detect the pose if the user stands side-on to the camera (profile view), but the existing camera setup on ruvzen is front-facing. This is a known limitation. The stage 5 instruction text in §2.11 does not yet mention it; the intended hint is "face the camera from the side for best Chaturanga detection".
|
||||
- The inline base64 bell WAV adds ~12 KB to the HTML file size. Negligible at the scale of the demo but noted.
|
||||
- `localStorage` namespace `ruview.yoga.*` adds four keys per origin. No conflict with `ruview.theme` from adam-mode.
|
||||
|
||||
### 3.3 Risks
|
||||
|
||||
| Risk | Likelihood | Mitigation |
|
||||
|---|---|---|
|
||||
| MediaPipe visibility scores are unreliable for floor-level landmarks (ankles, feet) during Dog poses | Medium | `effectiveWeight()` already masks low-visibility joints; Dog-pose templates weight knees (visible) more than ankles (may be occluded). |
|
||||
| The `?test=1` fixture landmarks become stale if the coordinate-space transform in `ingestPoseLandmarks()` changes | Low | Fixtures store raw `liveKp` world-space values, not normalized MediaPipe coords. If `ingestPoseLandmarks()` changes its output schema, the fixtures will produce obviously wrong joint angles in the assertion step — the failure is loud, not silent. |
|
||||
| Sequence-strip animation (CSS pulsing glow on the active stage) triggers repaint on every frame at 30 Hz | Low | The pulse is a CSS `animation` on `opacity` — composited by the GPU, no layout reflow. Negligible cost. |
|
||||
| User's camera position cuts off the hips (e.g. laptop on a desk) — `torso_not_visible` fires immediately | High for laptop use | The warning instructs the user to step back. This is the correct behaviour. Future: add a "camera too close" heuristic based on the ratio of shoulder distance to image width. |
|
||||
| Stage 8 (Uttanasana return) is classified identically to stage 3 by the angle classifier alone — the sequence layer must correctly disambiguate them | Medium | The sequence-tracking layer uses monotonic forward-only progression. Stage 3 can only fire when the current sequence position is 2 (after Urdhva Hastasana); stage 8 can only fire when the current sequence position is 7 (after Downward Dog). The classifier produces the angle score; the sequence layer decides which stage to credit. If the user skips a pose, the sequence layer waits — it does not leap to stage 8 from stage 2 even if a forward-fold shape is detected. |
|
||||
|
||||
---
|
||||
|
||||
## 4. Implementation plan
|
||||
|
||||
Moderate scope — two HTML files, no build step, no new external dependencies.
|
||||
|
||||
1. **Define the `YOGA_POSES` array** — 8 template objects as specified in §2.11, inline in the `<script>` block of demo 05.
|
||||
2. **Implement `computeJointAngles()`** — read from the existing `liveKp` array, fill a `yogaAngles` object using the 9 joint triplets in §2.2.
|
||||
3. **Implement `scorePose(template)`** — the weighted-sum formula from §2.4, respecting `effectiveWeight()` for visibility masking.
|
||||
4. **Implement `classifyPose()`** — argmax with K=6 debounce as in §2.5; call from the existing `requestAnimationFrame` loop after `applyRetargeting()`.
|
||||
5. **Add `#yoga-panel` markup and CSS** — bottom-centre panel, score ring, hold-time bar, instruction text, visibility warning. All styles via existing custom properties.
|
||||
6. **Add the sequence strip** — `.yoga-strip` (the class used in §2.6) top-centre, 8 flex slots, 3-state styling (dimmed/active/completed).
|
||||
7. **Wire the `#helpers` toggle** — `yoga-mode-toggle` checkbox and `yoga-mute-toggle` checkbox; `localStorage` persistence.
|
||||
8. **Add `YogaMode` object** — wrapping steps 1–7 with the API surface from §2.10.
|
||||
9. **Add `YOGA_TEST_FIXTURES` and the `?test=1` harness** — 10 fixture entries (8 positive, 1 negative, 1 angle-computation).
|
||||
10. **Create `06-yoga-mode.html`** — copy of demo 05 with CSI/tomography sections hidden, larger yoga panel layout.
|
||||
11. **Manual validation** — stand in front of ruvzen camera, cycle all 8 poses, verify classification and sequence advancement.
|
||||
|
||||
Acceptance criteria:
|
||||
|
||||
- All 8 poses classified correctly in the `?test=1` synthetic harness (assertions pass with no console errors).
|
||||
- The negative fixture (casual stand) produces `currentPose() === null`.
|
||||
- The angle-computation fixture (`left_knee` at a known 90°) asserts within ±0.5°.
|
||||
- Manual: each of the 8 Sun Salutation A poses is classified as soon as the K=6 debounce (§2.5) confirms it, about 200 ms at 30 Hz, when held correctly.
|
||||
- Alignment score exceeds 60% when the user matches the pose by self-assessment.
|
||||
- Sequence strip advances in order; completed poses show green checkmark.
|
||||
- Bell fires on completion (when unmuted).
|
||||
- adam-mode + yoga-mode simultaneously active: both panels visible, correct theme.
|
||||
- `localStorage` persists enabled-state and tolerance-scale across page reloads.
|
||||
|
||||
---
|
||||
|
||||
## 5. Related ADRs
|
||||
|
||||
| ADR | Relationship |
|
||||
|---|---|
|
||||
| [ADR-169](ADR-169-adam-mode-light-theme.md) | Sibling demo-side feature. yoga-mode toggle lives in the same `#helpers` panel. Both are orthogonal and must compose. |
|
||||
| [ADR-019](ADR-019-sensing-only-ui-mode.md) | Sensing-only UI — yoga-mode is the opposite: camera-first, sensing secondary. |
|
||||
| [ADR-035](ADR-035-live-sensing-ui-accuracy.md) | Live sensing UI accuracy norms. yoga-mode scores the user's body against templates, not CSI accuracy — but the same principle of not misrepresenting measurement quality applies. |
|
||||
| [ADR-014](ADR-014-sota-signal-processing.md) | The Rust-side `gesture.rs` uses DTW for gesture recognition. yoga-mode explicitly rejects DTW for static pose classification (§2.2). The two systems are complementary: DTW for motion gestures, angle-threshold for static poses. |
|
||||
| [ADR-029](ADR-029-ruvsense-multistatic-sensing-mode.md) | The Rust `pose_tracker.rs` (COCO-17) that yoga-mode defers integrating with. The COCO↔BlazePose mapping in §1.3 is the foundation for the future bridge. |
|
||||
|
||||
---
|
||||
|
||||
## 6. References
|
||||
|
||||
### Production code
|
||||
- `examples/three.js/demos/05-skinned-realtime.html` — primary implementation target; `liveKp`, `liveVis`, `ingestPoseLandmarks()`, `#helpers`, `#pose-panel`, `RETARGETS`, `visForRetarget()` are all anchors for yoga-mode integration
|
||||
- `examples/three.js/demos/04-skinned-fbx.html` — sibling demo; lighting reference
|
||||
- `v2/crates/wifi-densepose-signal/src/ruvsense/pose_tracker.rs` — Rust COCO-17 tracker; convention mapping in §1.3 of this ADR targets this module
|
||||
|
||||
### External references
|
||||
|
||||
1. **Perez-Testor, S. et al. (2019).** "Kinematics of Suryanamaskar Using Three-Dimensional Motion Capture." *PMC6521759*. 10 trained practitioners, 12-camera Vicon, 100 Hz, sagittal-plane joint angles for each of the 12 standard Surya Namaskar positions. Primary source for angle targets and tolerance bounds in §2.11.
|
||||
|
||||
2. **Chidamber, S. and Harikumar, K. (2023).** "A novel approach for yoga pose estimation based on in-depth analysis of human body joint detection accuracy." *PMC10280249*. Validates joint-angle threshold matching as the dominant reliable real-time method for small-to-medium yoga pose sets; reports average inter-joint angle error of 10.017° across six common daily poses — the empirical basis for the ±10–25° tolerance bands in the templates.
|
||||
|
||||
3. **Bazarevsky, V. et al. (2020 / MediaPipe team).** "BlazePose: On-device Real-time Body Pose Tracking." arXiv:2006.10204; see also the Google Research Blog post "On-device, Real-time Body Pose Tracking with MediaPipe BlazePose." Defines the 33-landmark BlazePose topology used throughout §1.3 and §2.2. Confirms the landmark visibility score semantics used in §2.7.
|
||||
|
||||
4. **Google ML Kit team.** "Pose classification options." developers.google.com/ml-kit/vision/pose-detection/classifying-poses. Documents the `PoseClassifier` k-NN approach that this ADR rejects in §2.13; the 60% alignment threshold in §2.4 is consistent with the sample thresholds in this guide.
|
||||
|
||||
5. **Iyengar, B.K.S. (2001).** *Light on Yoga* (Schocken Books, revised edition). Chaturanga Dandasana description pp. 102–104: "elbows at right angles along the body" — the 90° elbow target for stage 5. Tadasana pp. 61–63: anatomical position as baseline. The Iyengar descriptions supply angle targets where Perez-Testor's Vicon study does not explicitly report a joint.
|
||||
+1
-1
@@ -1,4 +1,4 @@
|
||||
# ADR-149: Drone Swarm Benchmarking & Evaluation Methodology — Metrics, Leaderboards, and Statistical Rigor
|
||||
# ADR-171: Drone Swarm Benchmarking & Evaluation Methodology — Metrics, Leaderboards, and Statistical Rigor
|
||||
|
||||
| Field | Value |
|
||||
|------------|-----------------------------------------------------------------------------------------|
|
||||
@@ -0,0 +1,391 @@
|
||||
# ADR 260: RuField Multimodal Field Sensing Specification
|
||||
|
||||
Status: Accepted — v0.1 reference stack
|
||||
|
||||
Date: 2026 06 14
|
||||
|
||||
Deciders: rUv
|
||||
|
||||
Tags: sensing, rf, csi, cir, bfld, radar, ultrasonic, infrared, quantum sensing, privacy, provenance, ruvector, ruview
|
||||
|
||||
## 1. Context
|
||||
|
||||
RuView proved that commodity wireless signals can be used as a practical sensing substrate. The next opportunity is larger: define a common specification for multimodal ambient sensing across RF, ultrasonic, subsonic, infrared, radar, and future quantum sensors.
|
||||
|
||||
Existing standards are valuable but fragmented.
|
||||
|
||||
IEEE 802.11bf 2025 standardizes WLAN sensing at the WiFi MAC and PHY layers and was published on September 26, 2025. It is important, but it is WiFi specific.
|
||||
|
||||
Bluetooth Channel Sounding standardizes techniques for obtaining phase and time delay information, but Bluetooth SIG explicitly does not define the distance algorithm. That leaves application level interpretation open.
|
||||
|
||||
IEEE 802.15.4z HRP UWB supports secure ranging using scrambled timestamp sequence waveforms, but UWB remains one modality rather than a universal sensing grammar.
|
||||
|
||||
Matter is a useful smart home interoperability protocol, but it is a device connectivity layer, not a multimodal field sensing specification.
|
||||
|
||||
The gap is clear: there is no open specification that normalizes sensor observations across CSI, CIR, BFLD, radar, ultrasound, subsonic vibration, thermal infrared, and quantum field sensing into one privacy aware, provenance rich, fusion ready event model.
|
||||
|
||||
## 2. Decision
|
||||
|
||||
Create **RuField MFS**, the RuField Multimodal Field Sensing Specification.
|
||||
|
||||
RuField MFS will define a common event, tensor, calibration, confidence, privacy, and provenance model for ambient field sensing.
|
||||
|
||||
It will not replace IEEE 802.11bf, Bluetooth Channel Sounding, UWB, Matter, radar protocols, or device vendor APIs.
|
||||
|
||||
It will sit above them.
|
||||
|
||||
```text
WiFi CSI
WiFi CIR
WiFi BFLD
UWB
Bluetooth Channel Sounding
mmWave radar
Ultrasonic
Subsonic
Infrared
Quantum magnetic sensing
Quantum inertial sensing

all emit

RuField Field Event
RuField Field Tensor
RuField Fusion Graph
RuField Privacy Class
RuField Provenance Receipt
```
|
||||
|
||||
## 3. Name
|
||||
|
||||
Preferred name: `RuField MFS`
|
||||
|
||||
Full name: `RuField Multimodal Field Sensing Specification`
|
||||
|
||||
Public positioning: `The open specification for camera free field intelligence.`
|
||||
|
||||
## 4. Problem Statement
|
||||
|
||||
Modern sensing systems are locked into modality specific silos: CSI systems produce channel matrices; radar produces range Doppler bins; UWB produces range and time of flight; Bluetooth Channel Sounding produces phase and timing primitives; infrared produces thermal arrays; ultrasonic produces acoustic echoes; subsonic produces structural vibration signatures; quantum sensors produce magnetic, inertial, or optical field traces.
|
||||
|
||||
Each has different sampling, calibration, confidence, privacy, and provenance semantics. This prevents reliable fusion and makes governance weak because raw sensing, derived sensing, biometric inference, and anonymous occupancy are often mixed without explicit boundaries.
|
||||
|
||||
## 5. Goals
|
||||
|
||||
1. Define a common multimodal sensing event format.
|
||||
2. Define a field tensor format spanning time, frequency, phase, amplitude, range, velocity, angle, temperature, vibration, and uncertainty.
|
||||
3. Define a modality registry for RF, acoustic, infrared, radar, and quantum sensing.
|
||||
4. Define privacy classes for raw waveforms, derived features, occupancy, anonymized aggregate state, and biometric inference.
|
||||
5. Define calibration receipts and provenance hashes.
|
||||
6. Define fusion rules for multimodal inference.
|
||||
7. Provide a Rust reference implementation.
|
||||
8. Provide benchmark tasks for camera free room intelligence.
|
||||
9. Make RuView one adapter inside a larger open sensing architecture.
|
||||
|
||||
## 6. Non Goals
|
||||
|
||||
1. Do not define a new wireless PHY.
|
||||
2. Do not replace IEEE 802.11bf.
|
||||
3. Do not replace Bluetooth Channel Sounding.
|
||||
4. Do not replace UWB secure ranging.
|
||||
5. Do not define medical diagnosis.
|
||||
6. Do not transmit speech, images, or raw biometric identity by default.
|
||||
7. Do not require cloud inference.
|
||||
8. Do not require expensive hardware.
|
||||
|
||||
## 7. Core Abstraction — the Field Event
|
||||
|
||||
A Field Event is a timestamped observation from any ambient field sensor.
|
||||
|
||||
```json
{
  "spec": "rufield.mfs.v0.1",
  "event_id": "01J00000000000000000000000",
  "timestamp_ns": 1791986400000000000,
  "sensor": {
    "modality": "wifi_csi",
    "vendor": "esp32_c6",
    "device_id": "sensor_room_01",
    "placement": "ceiling_corner",
    "clock_domain": "local_ptp"
  },
  "field": {
    "carrier_hz": 5805000000,
    "bandwidth_hz": 80000000,
    "sample_rate_hz": 100,
    "channels": 234,
    "features": ["amplitude", "phase", "doppler", "range_proxy"]
  },
  "observation": {
    "space_cell": [4, 2, 1],
    "range_m": 3.42,
    "velocity_mps": 0.18,
    "motion_vector": [0.12, -0.03, 0.00],
    "confidence": 0.87,
    "privacy_class": "P2"
  },
  "provenance": {
    "raw_hash": "sha256:raw_measurement_hash",
    "firmware_hash": "sha256:firmware_hash",
    "model_id": "ruvector_field_encoder_v1",
    "calibration_id": "room_cal_2026_06_14"
  }
}
```
|
||||
|
||||
## 8. Modality Registry
|
||||
|
||||
| Code | Modality             | Example source                          |
| ---: | -------------------- | --------------------------------------- |
|    1 | wifi_csi             | ESP32 C6, Intel BE200, AP CSI           |
|    2 | wifi_cir             | channel impulse response                |
|    3 | wifi_bfld            | beamforming feedback                    |
|    4 | uwb_hrp              | IEEE 802.15.4z ranging                  |
|    5 | ble_channel_sounding | phase and timing primitives             |
|    6 | mmwave_radar         | range Doppler radar                     |
|    7 | ultrasonic           | echo and time of flight                 |
|    8 | subsonic             | structural vibration and room resonance |
|    9 | infrared_thermal     | thermal array or passive IR             |
|   10 | active_infrared      | reflected IR                            |
|   11 | lidar_phase          | phase based optical range               |
|   12 | quantum_magnetic     | NV diamond or OPM field trace           |
|   13 | quantum_inertial     | atom interferometer or precision IMU    |
|   14 | event_camera         | optional visual event stream            |
|   15 | synthetic_sim        | simulator or replay source              |
|
||||
|
||||
## 9. Field Tensor
|
||||
|
||||
The normalized numeric container (`Modality`, `FieldAxis`, `FieldTensor`) as specified in the implementation crate `rufield-core`.
|
||||
|
||||
## 10. Privacy Classes
|
||||
|
||||
| Class | Description                      | Example                         |
| ----- | -------------------------------- | ------------------------------- |
| P0    | Raw waveform or raw sensor frame | raw CSI, raw radar cube         |
| P1    | Derived non identity features    | Doppler peak, thermal blob      |
| P2    | Occupancy and motion only        | person present, bed exit        |
| P3    | Anonymous aggregate state        | room count, zone activity       |
| P4    | Biometric or health inference    | breathing, gait, sleep, scratch |
| P5    | Identity linked inference        | named person state              |
|
||||
|
||||
Default system policy: edge storage may retain P0 only temporarily; network transmission defaults to P2 or lower; P4 requires explicit consent; P5 requires explicit identity binding and audit log.
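The default policy above can be read as a small decision function. The following is a minimal sketch under stated assumptions, not the `rufield-privacy` implementation: the names `TransmitContext` and `authorize_transmit` are hypothetical, the treatment of P1 and P3 as transmittable is a guess (the policy text only names P0, P2, P4, and P5), and denying P5 without identity binding and an audit log is likewise an assumption.

```rust
/// Privacy classes from the §10 table, ordered P0 (raw) to P5 (identity linked).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PrivacyClass {
    P0,
    P1,
    P2,
    P3,
    P4,
    P5,
}

/// Outcome of a policy check.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny,
    RequiresConsent,
}

/// Hypothetical request context; the field names are illustrative only.
#[derive(Debug, Clone, Copy, Default)]
pub struct TransmitContext {
    pub consent_granted: bool,
    pub identity_bound: bool,
    pub audit_logged: bool,
}

/// Decide whether an event of `class` may leave the edge device.
pub fn authorize_transmit(class: PrivacyClass, ctx: TransmitContext) -> Decision {
    match class {
        // Raw waveforms stay on the edge by default.
        PrivacyClass::P0 => Decision::Deny,
        // Assumption: derived, occupancy, and aggregate classes may be sent.
        PrivacyClass::P1 | PrivacyClass::P2 | PrivacyClass::P3 => Decision::Allow,
        // Biometric or health inference needs explicit consent.
        PrivacyClass::P4 => {
            if ctx.consent_granted {
                Decision::Allow
            } else {
                Decision::RequiresConsent
            }
        }
        // Identity linked inference needs identity binding and an audit log.
        PrivacyClass::P5 => {
            if ctx.identity_bound && ctx.audit_logged {
                Decision::Allow
            } else {
                Decision::Deny
            }
        }
    }
}
```

Calling `authorize_transmit(PrivacyClass::P4, TransmitContext::default())` yields `Decision::RequiresConsent`, which is the consent gate described in the policy.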
|
||||
|
||||
## 11. Provenance Receipt
|
||||
|
||||
Every event must be auditable (`ProvenanceReceipt`). Acceptance invariant: **No fused inference is valid unless every contributing event has a provenance receipt or is explicitly marked synthetic.**
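The acceptance invariant can be sketched as a predicate over the contributing events. This is an illustrative sketch only: the `Event` struct and the `non_fusable_ids` helper are hypothetical, and the `receipt` field is a string stand-in for a real signed `ProvenanceReceipt`. A real check would verify the signature rather than test for presence.

```rust
/// Simplified stand-in for a field event (hypothetical shape).
#[derive(Debug, Clone)]
pub struct Event {
    pub event_id: String,
    /// Placeholder for a signed provenance receipt.
    pub receipt: Option<String>,
    /// True when the event is explicitly marked synthetic.
    pub synthetic: bool,
}

/// An event may contribute to fusion if it carries a receipt
/// or is explicitly marked synthetic.
pub fn is_fusable(event: &Event) -> bool {
    event.receipt.is_some() || event.synthetic
}

/// Return the ids of events that would invalidate a fused inference.
/// An empty result means the invariant holds for this input set.
pub fn non_fusable_ids(events: &[Event]) -> Vec<&str> {
    events
        .iter()
        .filter(|e| !is_fusable(e))
        .map(|e| e.event_id.as_str())
        .collect()
}
```

Returning the offending ids instead of a bare boolean is a design choice made here so that a rejected fusion can say which event lacked provenance.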
|
||||
|
||||
## 12. Fusion Graph
|
||||
|
||||
Nodes: sensor, event, field_tensor, feature, object, zone, state, inference, receipt.
|
||||
Edges: observed_by, derived_from, calibrated_by, supports, contradicts, fused_into, expires_at, requires_consent.
|
||||
|
||||
## 13. Fusion Rule Format
|
||||
|
||||
Human readable TOML rules (`rule.person_present`, `rule.bed_exit`, `rule.nocturnal_scratch`) with `inputs`, `method`, `threshold`, `privacy_max`, optional `window_ms` and `requires_consent`.
|
||||
|
||||
## 14. Reference Architecture
|
||||
|
||||
Layer 0 physical sensors; Layer 1 native adapters; Layer 2 field tensor normalization; Layer 3 RuVector field embeddings; Layer 4 fusion graph; Layer 5 policy and privacy guard; Layer 6 application event stream; Layer 7 dashboard, API, MCP, Matter bridge.
|
||||
|
||||
## 15. Rust Crate Layout
|
||||
|
||||
`rufield-core`, `rufield-schema`, `rufield-adapters`, `rufield-fusion`, `rufield-privacy`, `rufield-provenance`, `rufield-bench`, `rufield-viewer`.
|
||||
|
||||
## 16. Core Rust Interfaces
|
||||
|
||||
`FieldAdapter`, `FieldEncoder`, `FusionEngine`, `PrivacyGuard` traits as specified in `rufield-core`.
|
||||
|
||||
## 17. MVP Adapters
|
||||
|
||||
v0.1 must support three real modalities: WiFi CSI, mmWave radar, Infrared thermal. Optional: ultrasonic, subsonic, synthetic simulator.
|
||||
|
||||
## 18. Benchmark Suite
|
||||
|
||||
| Task                    |   Metric |       Target |
| ----------------------- | -------: | -----------: |
| Presence detection      |       F1 |         0.90 |
| Room transition         |       F1 |         0.85 |
| Bed exit                |       F1 |         0.90 |
| Breathing detected      |       F1 |         0.80 |
| Nocturnal scratch       |       F1 |         0.75 |
| Fall like event         |   Recall |         0.95 |
| False alarm rate        | per hour |   below 0.10 |
| Event latency           |      p95 | below 100 ms |
| Provenance coverage     |  percent |          100 |
| Privacy violation count |    count |            0 |
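Two of the metrics in the table, F1 and p95 latency, have more than one common definition. The sketch below shows one conventional choice for each: F1 from confusion counts and a nearest-rank percentile. It is an assumption about how such a runner could score these rows, not the `rufield-bench` code, and the function names are hypothetical.

```rust
/// F1 = 2*TP / (2*TP + FP + FN). Returns 0.0 when there is nothing to score.
pub fn f1_score(tp: usize, fp: usize, fn_count: usize) -> f64 {
    let denom = 2 * tp + fp + fn_count;
    if denom == 0 {
        return 0.0;
    }
    (2 * tp) as f64 / denom as f64
}

/// Nearest-rank percentile: the smallest sample such that at least
/// `p` percent of samples are less than or equal to it.
/// Returns None for an empty input or a `p` outside 0..=100.
pub fn percentile_nearest_rank(samples: &[f64], p: f64) -> Option<f64> {
    if samples.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // rank is 1-based; clamp to 1 so that p = 0 returns the minimum.
    let rank = (p * sorted.len() as f64 / 100.0).ceil() as usize;
    let idx = rank.max(1) - 1;
    Some(sorted[idx])
}
```

With this definition the p95 of 20 latency samples is the 19th smallest one, so a single slow outlier does not move the reported figure.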
|
||||
|
||||
## 19. First Viral Demo
|
||||
|
||||
Camera free room intelligence: person enters, sits, breathing detected, sleeps, scratches arm, exits bed, leaves room — no camera, no identity, signed field receipts, live fusion graph, privacy class visible per event.
|
||||
|
||||
## 20. Data Model
|
||||
|
||||
`FieldEvent { spec_version, event_id, timestamp_ns, sensor, tensor, observation, provenance }` and `Observation { zone_id, space_cell, range_m, velocity_mps, motion_vector, confidence, labels, privacy_class }`.
|
||||
|
||||
## 21. Decision Matrix
|
||||
|
||||
| Option                                        | Interop | Novelty | Buildability | Business value | Risk | Score |
| --------------------------------------------- | ------: | ------: | -----------: | -------------: | ---: | ----: |
| Extend RuView only                            |       2 |       2 |            5 |              3 |    2 |    14 |
| Build proprietary fusion engine               |       3 |       3 |            4 |              4 |    3 |    17 |
| Create open RuField spec plus reference stack |       5 |       5 |            4 |              5 |    3 |    22 |
| Attempt new hardware standard                 |       5 |       5 |            1 |              4 |    5 |    20 |
|
||||
|
||||
Decision: **Create open RuField spec plus reference stack.** It maximizes credibility, extensibility, and ecosystem pull while avoiding the impossible burden of defining a new physical layer.
|
||||
|
||||
## 22. Security Model
|
||||
|
||||
| Threat                              | Impact                          | Mitigation                             |
| ----------------------------------- | ------------------------------- | -------------------------------------- |
| Raw waveform leakage                | privacy breach                  | P0 edge only by default                |
| Biometric inference without consent | legal and trust risk            | P4 consent gate                        |
| Sensor spoofing                     | false occupancy or safety event | signed sensor receipts                 |
| Replay attack                       | forged event stream             | nonce plus timestamp plus hash chain   |
| Model drift                         | wrong inference                 | calibration expiry and benchmark gates |
| Overfitting to one room             | weak generalization             | room split benchmark                   |
| Vendor firmware change              | silent degradation              | firmware hash in receipt               |
|
||||
|
||||
## 23. Calibration Model
|
||||
|
||||
`CalibrationReceipt` is first class. Required calibration tasks: empty room baseline, single person walk path, sit and stand, bed or couch transition, breathing reference, no motion stability period.
|
||||
|
||||
## 24. Inference Semantics
|
||||
|
||||
Every inference must include: label, confidence, supporting events, contradicting events, privacy class, calibration id, model id, expiry time.
|
||||
|
||||
## 25. Consequences
|
||||
|
||||
Positive: RuView becomes part of a larger sensing ecosystem; the spec creates a standards-style wedge without waiting for silicon vendors; multimodal fusion becomes portable; privacy and provenance become differentiators; enterprise deployment becomes easier to justify; benchmark receipts reduce skepticism.
|
||||
|
||||
Negative: broad scope can dilute execution; hardware variability will be painful; calibration is the hardest practical problem; some will claim existing standards already solve parts of this; medical and biometric use cases require careful governance.
|
||||
|
||||
Mitigation: keep v0.1 narrow; ship real adapters; publish benchmark receipts; do not claim medical diagnosis; position RuField above existing standards.
|
||||
|
||||
## 26. Implementation Plan
|
||||
|
||||
Phase 1 spec skeleton; Phase 2 Rust core; Phase 3 adapters; Phase 4 fusion graph; Phase 5 dashboard; Phase 6 benchmark.
|
||||
|
||||
## 27. Acceptance Criteria
|
||||
|
||||
v0.1 is accepted when:
|
||||
|
||||
1. Three modalities stream into one event graph.
|
||||
2. Every event has a privacy class.
|
||||
3. Every event has a provenance receipt.
|
||||
4. Fusion produces at least five room state inferences.
|
||||
5. p95 event pipeline latency is below 100 ms.
|
||||
6. Benchmark runner produces deterministic reports.
|
||||
7. Raw waveform storage is disabled by default.
|
||||
8. P4 inference requires consent policy approval.
|
||||
9. Dashboard shows live camera free room intelligence.
|
||||
10. Spec is readable enough for external implementers.
|
||||
|
||||
## 28. Reference Repository Structure
|
||||
|
||||
Crates under `v2/crates/rufield-*` (workspace members), spec under `docs/rufield/`, benches under `rufield-bench`.
|
||||
|
||||
## 29. Open Questions
|
||||
|
||||
1. JSON Schema first, Protobuf first, or both?
|
||||
2. Default transport: MQTT, NATS, WebSocket, or MCP?
|
||||
3. Matter integration: bridge or first class target?
|
||||
4. P4 health inference disabled by default in public demos?
|
||||
5. Benchmark datasets synthetic first, then real world?
|
||||
6. Include quantum modality IDs even if adapters are synthetic only?
|
||||
|
||||
## 30. Recommendation
|
||||
|
||||
Proceed. Publish RuField as an open specification with a working Rust reference stack and a viral camera free room intelligence demo.
|
||||
|
||||
## 31. Benchmark Acceptance Test
|
||||
|
||||
```text
Given a room with WiFi CSI, mmWave radar, and thermal IR sensors
When a person enters, sits, breathes, exits bed, and leaves
Then RuField emits signed events
And classifies room state without a camera
And keeps all default network events at P2 or below
And produces p95 latency below 100 ms
And produces a deterministic benchmark report
```
|
||||
|
||||
---
|
||||
|
||||
## Implementation Status (v0.1 reference stack)
|
||||
|
||||
The v0.1 reference stack is implemented as a **standalone Cargo workspace**
(`rufield/`, published as `github.com/ruvnet/rufield` and vendored into RuView
as a submodule — the `vendor/rvcsi` pattern). It is pure Rust, builds and tests
on Windows with no native deps (`ndarray`/`tch`/`openblas` are not used), and
depends only on `serde`, `serde_json`, `toml`, `sha2`, and `ed25519-dalek`.
|
||||
|
||||
**All metrics below are SYNTHETIC.** They are scored against the simulator's own
ground-truth labels. They demonstrate the pipeline recovers known truth and runs
within latency/privacy/provenance budgets — they are **not** field-validated
accuracy. There is no hardware in v0.1; real adapters (ESP32 CSI, mmWave, thermal
IR) are a documented follow-up (see the repo README "Firmware" section).
|
||||
|
||||
### Crates delivered
|
||||
|
||||
| Crate | Implements |
|-------|-----------|
| `rufield-core` | §7/§9/§16/§20 data model: `Modality` (15), `FieldAxis`, `FieldTensor` (shape↔values validated), `PrivacyClass` (P0–P5), `SensorDescriptor`, `Observation`, `FieldEvent`, `CalibrationReceipt`, `InferenceQuery`, `FieldInference`, `FieldEmbedding`; `FieldAdapter`/`FieldEncoder`/`FusionEngine`/`PrivacyGuard` traits. §7 JSON example round-trips. |
| `rufield-provenance` | Real `sha256` content hashing + deterministic `ed25519` sign/verify; §11 `is_fusable` invariant. Tests: tamper → verify fails; synthetic event fusable without signer. |
| `rufield-privacy` | §10 default policy + `DefaultPrivacyGuard` (`authorize` → Allow/Deny/RequiresConsent). Tests: P0 transmit denied; P4 no-consent → RequiresConsent; P4 consent → Allow; P2 → Allow; P5 needs identity binding. |
| `rufield-adapters` | Deterministic seeded `SyntheticSim` emitting the §19 sequence across 3 modalities (wifi_csi, mmwave_radar, infrared_thermal). Same seed ⇒ identical signed event stream with ground-truth labels. |
| `rufield-fusion` | `FusionGraph` (§12) + `RuFieldFusion` engine; TOML rules (§13, ≥5 inferences: person_present, sitting, sleeping, breathing, nocturnal_scratch, bed_exit, room_transition); weighted-Bayes + temporal-window; rejects non-fusable events; `FieldInference` with §24 fields. |
| `rufield-bench` | Deterministic runner: F1 per task (SYNTHETIC), p95 latency, provenance coverage, privacy violations; JSON + human table; §31 acceptance test as `#[test]`. |
|
||||
|
||||
Total test count across the workspace: **60 tests, 0 failed**.
|
||||
`cargo clippy --workspace` is clean.
|
||||
|
||||
### §27 acceptance-criteria scorecard
|
||||
|
||||
| # | Criterion | Status |
|---|-----------|--------|
| 1 | Three modalities stream into one event graph | **PASS** — wifi_csi, mmwave_radar, infrared_thermal |
| 2 | Every event has a privacy class | **PASS** — `Observation.privacy_class` (non-optional), default ≤ P2 |
| 3 | Every event has a provenance receipt | **PASS** — every event is ed25519-signed and verifies; coverage 100% |
| 4 | Fusion produces ≥ 5 room-state inferences | **PASS** — 7 distinct inferences produced |
| 5 | p95 event pipeline latency < 100 ms | **PASS** — p95 ≈ 0.01 ms (in-process) |
| 6 | Benchmark runner produces deterministic reports | **PASS** — identical report across runs (latency is the only wall-clock field) |
| 7 | Raw waveform storage disabled by default | **PASS** — P0 network transmission denied by default policy |
| 8 | P4 inference requires consent policy approval | **PASS** — P4 without consent → RequiresConsent; breathing/scratch rules carry `requires_consent = true` |
| 9 | Dashboard shows live camera-free room intelligence | **DEFERRED** — no `rufield-viewer` dashboard in v0.1; the benchmark + `room_intelligence` example provide a CLI view. Follow-up. |
| 10 | Spec readable for external implementers | **PASS** — ADR-260 + detailed standalone README with compiling usage examples |
|
||||
|
||||
**Decision:** §27 criteria 1–8 and 10 PASS; criterion 9 (live dashboard) is
**deferred** to a follow-up. Per the acceptance rule (1–8, 10 pass; 9 may be
deferred), Status is set to **Accepted — v0.1 reference stack**.
|
||||
|
||||
### Deterministic benchmark report (SYNTHETIC, seed = 2026)
|
||||
|
||||
```text
TASK (SYNTHETIC)       METRIC   VALUE   TARGET   MEETS
presence               f1       1.000   0.900    yes
breathing              f1       1.000   0.800    yes
nocturnal_scratch      f1       0.923   0.750    yes
bed_exit               f1       1.000   0.900    yes
room_transition        f1       1.000   0.850    yes
-----------------------------------------------------------------------------------
p50 latency:          0.0097 ms
p95 latency:          0.0123 ms   (target < 100 ms: PASS)
provenance coverage:  100.0 %     (target 100%: PASS)
privacy violations:   0           (target 0: PASS)
events=216 modalities=3 distinct_inferences=7
```
|
||||
|
||||
All five scored §18 tasks meet their F1 targets **on synthetic ground truth**.
`nocturnal_scratch` is 0.923 (one borderline noise tick at this seed) — reported
honestly rather than tuned to 1.0. The fall-like / false-alarm-rate §18 rows are
not scored in v0.1 (no fall is in the demo sequence) and are a follow-up. These
numbers prove the fusion pipeline scores correctly against known truth; they say
**nothing** about real-world accuracy, which requires the hardware adapters that
v0.1 deliberately does not ship.
|
||||
|
||||
### Honest statement
|
||||
|
||||
Every metric here is simulator-based. No ESP32 CSI, mmWave, or thermal capture
was used. RuField v0.1 is a working, honestly-measured reference pipeline —
data model, provenance, privacy, fusion, and a deterministic benchmark — pending
real hardware adapters.
|
||||
@@ -0,0 +1,172 @@
|
||||
# ADR-261: RuVector Graph-ANN Index — a real HNSW baseline + a SymphonyQG-style quantized variant, MEASURED
|
||||
|
||||
| Field | Value |
|-------|-------|
| **Status** | Accepted |
| **Date** | 2026-06-14 |
| **Deciders** | ruv |
| **Codebase target** | `wifi-densepose-ruvector` — `hnsw.rs`, `hnsw_quantized.rs`, `ann_measure.rs`, `benches/ann_bench.rs`, docs |
| **Relates to** | ADR-084 (RaBitQ similarity sensor — 1-bit sketch), ADR-156 (RuVector beyond-SOTA sweep — §5 #1 SymphonyQG, §8/§10/§11 RaBitQ Pass-2/multi-bit/estimator), ADR-024 (AETHER re-ID), ADR-016/017 (RuVector integration) |
| **Scope** | Build the **missing HNSW graph-ANN baseline** in the ruvector retrieval path, build a **SymphonyQG-style quantized-traversal variant** on the same graph, and **MEASURE** the real recall/QPS ratio between them — closing the ADR-156 §5 #1 gap honestly. Resolves ADR-156 §8 backlog item **"SymphonyQG reproduction"** from **CLAIMED-only** to **MEASURED-direction-tested**. |
|
||||
|
||||
---
|
||||
|
||||
## 0. PROOF discipline (this ADR's contract)
|
||||
|
||||
This project has been publicly accused of "AI slop." This ADR answers with **evidence, not adjectives** — the same contract as ADR-154/156:
|
||||
|
||||
- The HNSW index ships a **committed recall@10 correctness gate** (≥ 0.95 vs brute force on a planted-cluster fixture). Low recall means a graph bug; the gate is wired to fail in that case. It **did** fail first — and caught a real index-out-of-bounds bug in the insert path (§4) — which is exactly what a real gate is for.
|
||||
- Every QPS/recall number below is **MEASURED** on this box with a committed, deterministic, `--no-default-features`-runnable measurement (`src/ann_measure.rs`, `ann_bench_report`) and a committed criterion bench (`benches/ann_bench.rs`). Both call **one** shared fixture/measurement module, so the bench and the report can never measure different graphs.
|
||||
- The **headline result is an honest negative**: at our test scale the SymphonyQG-style quantized variant **does not beat float HNSW at equal recall** — the 1-bit Hamming traversal is too coarse to keep recall up. We report the real numbers, explain *why*, and state the expected large-N crossover. **We did not tune the quantized path to manufacture the 3.5–17× the source claims.** A measured negative + a scale caveat is a valid, publishable result.
|
||||
- We are explicit that this is **OUR HNSW + OUR 1-bit quantization, not SymphonyQG's exact system**. It tests the **direction** of the claim on our hardware/data, not a 1:1 reproduction.
|
||||
|
||||
Test machine: Windows 11, `cargo test --release`, `std::time::Instant` wall-clock. Numbers are warm medians on this box; the **ratio** is the claim, not the absolute QPS.
|
||||
|
||||
Reproduce:
|
||||
```bash
cd v2 && cargo test -p wifi-densepose-ruvector --no-default-features --release \
  ann_bench_report -- --nocapture
# Larger N: ANN_BENCH_N=50000 cargo test ... --release ann_bench_report -- --nocapture
cargo bench -p wifi-densepose-ruvector --bench ann_bench
```
|
||||
|
||||
---
|
||||
|
||||
## 1. Context
|
||||
|
||||
The ruvector crate's retrieval path (the AETHER re-ID hot-cache of ADR-024, the `sketch.rs` 1-bit prefilter of ADR-084, and room fingerprinting) is, at its core, an **approximate nearest-neighbour (ANN)** problem: dense float embedding in, top-K similar ids out. But **the crate had no graph index**. Every `topk` was either a linear scan (`O(N·d)` per query) or a 1-bit Hamming prefilter over a linear scan. Both visit all `N` stored vectors on every query, so query time grows linearly with the index size and does not scale.
|
||||
|
||||
[ADR-156 §5 #1](ADR-156-ruvector-fusion-beyond-sota.md) graded **SymphonyQG** (SIGMOD 2025) the **lead beyond-SOTA ANN candidate**, citing the source's claim of **3.5–17× QPS over HNSW at equal recall**, but marked it **CLAIMED**:
|
||||
|
||||
> *"author-measured; **not reproduced on our hardware** — reproduction is future work."*
|
||||
|
||||
And ADR-156 §8 was blunt about *why* it could not be reproduced: **there was no HNSW baseline to compare against.** You cannot measure a ratio against a baseline that does not exist. This ADR builds that missing baseline, builds the quantized variant that tests the direction of the SymphonyQG bet, and measures the real ratio.
|
||||
|
||||
---
|
||||
|
||||
## 2. Decision
|
||||
|
||||
1. Add a correct, dependency-free **float HNSW** graph index (`hnsw.rs`): the real Malkov & Yashunin (TPAMI 2018) algorithm — multi-layer navigable small-world graph, `ef_construction` / `ef_search`, the Algorithm-4 neighbour-selection heuristic, seeded-deterministic level assignment, L2 + cosine. This is the **baseline** ADR-156 said was missing.
|
||||
2. Add a **SymphonyQG-style quantized-traversal variant** (`hnsw_quantized.rs`): the *same* graph (same seed, same structure), but the beam search scores candidates with a **cheap 1-bit Hamming distance** over the RaBitQ Pass-2 rotated sign code (reusing `rotation.rs` + the sign-quantization of `sketch.rs`), then **exact-float reranks** the final candidate set. This is the SymphonyQG bet — cheaper per-node scoring, recovered by a final exact rerank.
|
||||
3. **Measure** linear vs float-HNSW vs quantized-HNSW (recall@10, QPS, equal-recall ratios) on one deterministic planted-cluster fixture, and record the honest verdict against the SymphonyQG 3.5–17× claim.
|
||||
|
||||
### Why 1-bit Hamming for the quantized traversal
|
||||
|
||||
The crate already had the exact pieces SymphonyQG fuses: a deterministic orthogonal rotation (`rotation.rs`, RaBitQ Pass-2) and sign-quantization (`sketch.rs`). A 1-bit code compares by POPCNT Hamming — a few machine words, no per-dimension float work — so it is the cheapest possible traversal score and the most direct test of "can a quantized score keep the beam on the right path." The cost (measured below): the 1-bit code is a *coarse* angle proxy (ADR-156 §10 measured ~46% strict-K coverage for sign-only), and that coarseness is what limits recall here.
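To make the "few machine words, no per-dimension float work" point concrete, here is a minimal sketch of a 1-bit sign code and its POPCNT Hamming distance. It is an illustration under assumptions: the function names are hypothetical, the bit convention (a bit is set when the component is non-negative) is a guess, and the orthogonal rotation that the crate applies before sign quantization is deliberately omitted.

```rust
/// Pack the sign of each component into one bit (1 = non-negative).
/// The result has ceil(len / 8) bytes. The rotation step that would
/// precede this in the real pipeline is left out of this sketch.
pub fn pack_signs(v: &[f32]) -> Vec<u8> {
    let mut code = vec![0u8; (v.len() + 7) / 8];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            code[i / 8] |= 1 << (i % 8);
        }
    }
    code
}

/// Hamming distance between two codes: XOR, then count set bits.
/// `count_ones` typically lowers to a POPCNT instruction where available.
/// If the codes differ in length, only the shared prefix is compared.
pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x ^ y).count_ones())
        .sum()
}
```

A 128-dimensional vector becomes a 16-byte code, so one distance evaluation is 16 XOR and popcount operations instead of 128 float subtractions and multiplications. The price is that the code only records which side of each axis the vector lies on.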
|
||||
|
||||
---
|
||||
|
||||
## 3. Design
|
||||
|
||||
### 3.1 `hnsw.rs` — float HNSW (the baseline)
|
||||
|
||||
- **Graph.** `links[id][layer]` adjacency; layer 0 holds every node, higher layers exponentially sparser. `m_max` is `2·M` on layer 0, `M` above (the paper's asymmetric degree cap).
|
||||
- **Insert.** Greedy-descend the upper layers to a good entry point, then for each layer from the node's level down to 0: `search_layer` for `ef_construction` candidates, `select_neighbours` (Algorithm 4 — keep a candidate only if it is closer to the new node than to any already-selected neighbour, giving diverse navigable edges), wire bidirectional edges, re-prune any neighbour that overflows `m_max`. The node is pushed into the arrays **before** wiring so every `links[*]` index is valid mid-insert (§4 — the bug the gate caught).
|
||||
- **Search.** Greedy-descend layers `>0`, then best-first beam search of width `ef` on layer 0; return the closest `k`. Iterative (explicit heaps + visited set) — **no recursion**, bounded by the beam and the visited set.
|
||||
- **Determinism.** Level assignment is the only randomness and is driven by a **seeded SplitMix64** (the exact pattern from `rotation.rs`) — never `Date::now`/OS RNG/unseeded `rand`. Same `(seed, params, insertion order)` ⇒ bit-identical graph and search (pinned by `hnsw_is_deterministic_for_seed`).
|
||||
- **Robustness.** Empty index, `k==0`, `k>n`, single node, zero-dim, ragged query, `ef<k` all return cleanly — pinned by `*_no_panic` tests.
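The determinism bullet above can be illustrated with a seeded SplitMix64 generator driving the usual HNSW level draw, `floor(-ln(u) * mL)` with `mL = 1 / ln(M)`. This is a sketch, not the code in `hnsw.rs`: the constants are the standard published SplitMix64 ones, while the `next_unit` mapping, the `assign_level` signature, and the `max_level` cap are assumptions made for the example.

```rust
/// SplitMix64 pseudo-random generator with the standard constants.
pub struct SplitMix64 {
    state: u64,
}

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self { state: seed }
    }

    pub fn next_u64(&mut self) -> u64 {
        self.state = self.state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in (0, 1], so that ln() is always finite.
    pub fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 11) as f64 + 1.0) / (1u64 << 53) as f64
    }
}

/// Draw a layer for a new node: floor(-ln(u) / ln(M)), capped at `max_level`.
/// With M = 16 roughly 15 out of 16 nodes land on layer 0.
pub fn assign_level(rng: &mut SplitMix64, m: usize, max_level: usize) -> usize {
    let m_l = 1.0 / (m.max(2) as f64).ln();
    let level = (-rng.next_unit().ln() * m_l).floor() as usize;
    level.min(max_level)
}
```

Because the generator state is the only source of randomness, two indices built with the same seed, parameters, and insertion order draw the same level sequence, which is the property the determinism test pins.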
|
||||
|
||||
### 3.2 `hnsw_quantized.rs` — the SymphonyQG-style variant
|
||||
|
||||
Same graph as the float index (identical seed/structure — the **only** variable is the scoring), plus a per-node `ceil(D/8)`-byte 1-bit Pass-2 sign code (`D = next_pow2(dim)`). `search_quantized(query, k, ef, rerank)`:
|
||||
1. Encode the query to its 1-bit code (one rotation + sign pack).
|
||||
2. Greedy-descend + beam-search the graph scoring every visited node by **POPCNT Hamming** (query-code XOR node-code) — no per-dim float work.
|
||||
3. **Exact-float rerank** the top `rerank` Hamming candidates with the true L2/cosine metric, return the best `k`.
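Step 3 is where recall is either recovered or lost, so a small sketch of the rerank stage may help. It assumes the traversal has already produced `(node id, Hamming score)` pairs; the function names and signature are hypothetical and cover only the L2 case, not the actual `search_quantized` internals.

```rust
/// Squared L2 distance over the shared prefix of the two slices.
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Keep the `rerank` best candidates by Hamming score, rescore them with
/// the exact metric, and return the ids of the closest `k`.
pub fn rerank_exact(
    query: &[f32],
    vectors: &[Vec<f32>],
    mut candidates: Vec<(usize, u32)>,
    rerank: usize,
    k: usize,
) -> Vec<usize> {
    // Cheap ordering first; ties broken by id to stay deterministic.
    candidates.sort_by_key(|&(id, h)| (h, id));
    candidates.truncate(rerank);

    // Exact rescoring of the survivors only; unknown ids are skipped.
    let mut scored: Vec<(f32, usize)> = candidates
        .iter()
        .filter(|&&(id, _)| id < vectors.len())
        .map(|&(id, _)| (l2_sq(query, &vectors[id]), id))
        .collect();
    scored.sort_by(|a, b| a.0.total_cmp(&b.0).then(a.1.cmp(&b.1)));
    scored.into_iter().take(k).map(|(_, id)| id).collect()
}
```

The rerank can only reorder what the Hamming stage hands it. If the true nearest neighbour ranks below the `rerank` cutoff by Hamming score, or the beam never visits it, exact rescoring cannot bring it back.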
|
||||
|
||||
### 3.3 Security / robustness
|
||||
|
||||
Both indices: bounded **iterative** traversal (no unbounded recursion), no panic on empty/degenerate/ragged/zero-dim input (the metric compares over the shorter prefix; zero-norm cosine returns max distance, not NaN). The 1-bit encode handles padded dims via the existing `Rotation::apply_padded`.
|
||||
|
||||
---
|
||||
|
||||
## 4. The bug the correctness gate caught (disclosed, not hidden)
|
||||
|
||||
The first run of the recall@10 gate **panicked**: `index out of bounds: the len is 33 but the index is 33` in `search_layer`. Root cause: `insert` wired bidirectional edges (`links[nbr][l].push(id)`) **before** pushing the new node's own `links[id]` row into the array. A later traversal step in the *same* insert could hop to a neighbour that now pointed at `id` and read `links[id]` — which did not exist yet. Fix: push the node (with empty per-layer link lists) into `vectors`/`links`/`levels` **up front**, then wire edges into its existing slot. The new node has no incoming edges and empty outgoing lists until wiring, so it is unreachable by the searches that run first — pushing early is safe and keeps every index valid. This is exactly why the recall gate exists: a silent low-recall graph and an out-of-bounds panic are both "slop" the gate forces into the open.
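The ordering fix can be shown on a toy adjacency structure. The sketch below is not the `hnsw.rs` insert path: `Graph`, its `insert` signature, and the caller-supplied neighbour lists are hypothetical, and the search and pruning steps are omitted. It only demonstrates the invariant that the node's own `links[id]` row exists before any edge mentions `id`.

```rust
/// Toy layered adjacency: links[id][layer] lists the neighbours of `id`.
pub struct Graph {
    pub links: Vec<Vec<Vec<usize>>>,
}

impl Graph {
    pub fn new() -> Self {
        Self { links: Vec::new() }
    }

    /// Insert a node that lives on layers 0..=level. `neighbours[layer]`
    /// holds the already-chosen neighbours for that layer.
    pub fn insert(&mut self, level: usize, neighbours: &[Vec<usize>]) -> usize {
        let id = self.links.len();
        // Push the node's (empty) link rows FIRST, so links[id] is a valid
        // index for anything that runs while edges are being wired.
        self.links.push(vec![Vec::new(); level + 1]);

        for (layer, nbrs) in neighbours.iter().enumerate().take(level + 1) {
            for &nbr in nbrs {
                // Skip self-links, unknown ids, and neighbours that do not
                // exist on this layer.
                if nbr == id || nbr >= self.links.len() || layer >= self.links[nbr].len() {
                    continue;
                }
                self.links[id][layer].push(nbr);
                self.links[nbr][layer].push(id);
            }
        }
        id
    }

    /// True when every stored edge points at an existing node.
    pub fn all_edges_valid(&self) -> bool {
        self.links
            .iter()
            .flatten()
            .flatten()
            .all(|&target| target < self.links.len())
    }
}
```

In the buggy ordering the back-edge `links[nbr][layer].push(id)` would be written while `links.len()` was still `id`, which is the state that produced the reported out-of-bounds read.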
|
||||
|
||||
---
|
||||
|
||||
## 5. The SymphonyQG claim being tested
|
||||
|
||||
| Source | Claim | Grade (before this ADR) |
|--------|-------|-------------------------|
| SymphonyQG, SIGMOD 2025 | **3.5–17× QPS over HNSW at equal recall**, via quantization unified with graph traversal, pure-CPU/edge-portable | **CLAIMED** — author-measured, *not reproduced on our hardware (no HNSW baseline existed)* |
|
||||
|
||||
The bet: a quantized traversal score is cheap enough — and accurate enough to keep the beam on-path — that you pay far less per visited node and recover the small recall loss with a final exact rerank.
|
||||
|
||||
---
|
||||
|
||||
## 6. MEASURED results
|
||||
|
||||
Fixture: planted-cluster synthetic, **dim=128, N=10,000, 64 clusters, 200 queries, K=10, noise=0.35**, L2 metric, `M=16`, `ef_construction=200`. Graph seed `0x6261524741484E53`, rotation seed `0x5EEDC0DE12345678`. `--release`, warm wall-clock on the test machine. (The fixture and both indices are shared by the criterion bench.)
|
||||
|
||||
| Method | recall@10 | QPS | latency (µs) |
|--------|-----------|-----|--------------|
| **linear scan (brute force)** | 1.0000 | 1,022 | 978 |
| **float-HNSW** ef=16 | 0.9945 | **25,744** | 39 |
| float-HNSW ef=32 | 0.9990 | 21,470 | 47 |
| float-HNSW ef=64 | 1.0000 | 18,779 | 53 |
| float-HNSW ef=128 | 1.0000 | 12,722 | 79 |
| float-HNSW ef=256 | 1.0000 | 5,742 | 174 |
| quant-HNSW ef=32 rr=20 | 0.1620 | 30,005 | 33 |
| quant-HNSW ef=32 rr=100 | 0.2615 | 36,388 | 28 |
| quant-HNSW ef=64 rr=100 | 0.4865 | 20,603 | 49 |
| quant-HNSW ef=128 rr=100 | 0.6785 | 13,718 | 73 |
| quant-HNSW ef=256 rr=100 | **0.7380** | 6,578 | 152 |
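
For reading the recall@10 column, a minimal sketch of the metric may help. It assumes the standard definition (per query, the overlap between the returned ids and the brute-force top-K, divided by K, then averaged over queries). `recall_at_k` and `mean_recall` are hypothetical names, not functions taken from `ann_measure.rs`.

```rust
use std::collections::HashSet;

/// recall@K for one query: |found ∩ truth| / |truth|.
/// Duplicate ids in `found` are counted once.
pub fn recall_at_k(found: &[usize], truth: &[usize]) -> f64 {
    if truth.is_empty() {
        // Choice made for this sketch: nothing to find means nothing was missed.
        return 1.0;
    }
    let f: HashSet<usize> = found.iter().copied().collect();
    let t: HashSet<usize> = truth.iter().copied().collect();
    f.intersection(&t).count() as f64 / t.len() as f64
}

/// Mean recall over a batch of queries (one result list per query).
pub fn mean_recall(found: &[Vec<usize>], truth: &[Vec<usize>]) -> f64 {
    assert_eq!(found.len(), truth.len(), "one result list per query");
    if found.is_empty() {
        return 0.0;
    }
    let sum: f64 = found
        .iter()
        .zip(truth)
        .map(|(f, t)| recall_at_k(f, t))
        .sum();
    sum / found.len() as f64
}
```

Under this definition a recall@10 of 0.738 means that, on average, between seven and eight of the ten true neighbours are returned per query.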
|
||||
|
||||
### Equal-recall QPS ratios
|
||||
|
||||
| Target recall | Fastest float-HNSW | Fastest quant-HNSW meeting it | quant/float | float/linear |
|---------------|--------------------|-------------------------------|-------------|--------------|
| ≥ 0.90 | ef=16 → 25,744 QPS | **none** (best quant recall = 0.738) | n/a | **25.19×** |
| ≥ 0.95 | ef=16 → 25,744 QPS | **none** | n/a | **25.19×** |
| ≥ 0.99 | ef=16 → 25,744 QPS | **none** | n/a | **25.19×** |
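
The equal-recall comparison in this table reduces to a small selection rule: for each method take the fastest operating point whose recall meets the target, then divide. A sketch of that rule, using hypothetical names (`OpPoint`, `fastest_at_recall`, `equal_recall_ratio`) that are not taken from `ann_measure.rs`:

```rust
/// One measured operating point (one row of a recall/QPS table).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct OpPoint {
    pub recall: f64,
    pub qps: f64,
}

/// Fastest operating point whose recall meets `target`, if any.
pub fn fastest_at_recall(points: &[OpPoint], target: f64) -> Option<OpPoint> {
    points
        .iter()
        .copied()
        .filter(|p| p.recall >= target)
        .max_by(|a, b| a.qps.total_cmp(&b.qps))
}

/// QPS ratio `a / b` at equal recall.
/// Returns `None` when either method never reaches `target`.
pub fn equal_recall_ratio(a: &[OpPoint], b: &[OpPoint], target: f64) -> Option<f64> {
    let pa = fastest_at_recall(a, target)?;
    let pb = fastest_at_recall(b, target)?;
    Some(pa.qps / pb.qps)
}
```

Fed the §6 rows, the float/linear ratio at a 0.99 target is 25,744 / 1,022 ≈ 25.19, and the quant/float ratio is `None` at every target in the table because no quantized row reaches 0.90.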
|
||||
|
||||
---
|
||||
|
||||
## 7. Honest verdict
|
||||
|
||||
**The HNSW baseline is a decisive win over linear scan: ~25× QPS at recall ≥ 0.99** (ef=16: 0.9945 recall, 25,744 QPS vs linear 1,022 QPS). The correctness gate (recall@10 ≥ 0.95 vs brute force, both L2 and cosine) holds. This is the baseline ADR-156 §5 #1 said did not exist — it now does.
|
||||
|
||||
**The SymphonyQG-style quantized variant does NOT beat float HNSW at our scale — direction REFUTED at N=10k.** The 1-bit Hamming traversal is too coarse: its best achievable recall is **0.738** (ef=256, rr=100), and it never reaches even the 0.90 equal-recall point where a fair QPS comparison could be made. Where the quantized score *is* faster (ef=32: ~30–36k QPS, beating float's 25.7k), its recall collapses to 0.16–0.26 — a meaningless win. There is **no equal-recall operating point** at which quantized is faster, so the SymphonyQG 3.5–17× claim is **not reproduced** by our 1-bit construction here.
|
||||
|
||||
**Why** (so the negative is understood, not just stated):
|
||||
1. The 1-bit sign code is a **coarse angle proxy** — ADR-156 §10 already measured it at ~46% strict-K coverage. Driving graph *traversal* by that coarse score steers the beam onto the wrong nodes, and the exact-float rerank can only recover what the beam actually visited. At N=10k, near-neighbours have nearly-identical sign codes, so Hamming cannot separate them.
|
||||
2. At this scale **float distance is already cheap**: one 128-d L2 costs roughly 0.1 µs (the §6 linear scan evaluates 10,000 of them in 978 µs), so the per-node float compute the quantization saves is small relative to the recall it costs. SymphonyQG's win shows up at **much larger N** (millions), where (a) the float-distance fraction of query time dominates and (b) their *multi-bit RaBitQ-fused* code (not our 1-bit sign code) keeps recall high. **Expected crossover: large N + a higher-bit code.** ADR-156 §10 already measured that a ≤4-bit code reaches ~74% strict coverage vs 1-bit's ~46%, so a multi-bit traversal score is the obvious next lever (deferred, not claimed).
|
||||
|
||||
**Caveat (stated plainly):** this is **our** HNSW + **our** 1-bit quantization, not SymphonyQG's system. We tested the *direction* of the claim ("does quantized traversal + rerank beat float HNSW at equal recall?") on our hardware/data and got a **measured no at N=10k**. That neither confirms nor refutes SymphonyQG's own published numbers on their system/scale — it refutes the direction *for our construction at our scale*, and identifies the two levers (scale, code bit-depth) a real reproduction would need.
|
||||
|
||||
---
|
||||
|
||||
## 8. Validation
|
||||
|
||||
- **`cd v2 && cargo test -p wifi-densepose-ruvector --no-default-features --lib`** — **151 passed / 0 failed** (was 131; +20 new tests: 10 `hnsw`, 7 `hnsw_quantized`, 3 `ann_measure`).
|
||||
- **`cargo test --workspace --no-default-features`** — GREEN (see §10 for the count).
|
||||
- **Correctness gate verified to bite:** the recall@10 gate **panicked** on the first (buggy) insert path (§4); after the fix it passes at 0.99+ recall (L2 and cosine).
|
||||
- **`cargo test -p wifi-densepose-ruvector --no-default-features --release ann_bench_report -- --nocapture`** — prints the §6 table; the numbers above are copied verbatim from that run.
|
||||
- **`cargo bench -p wifi-densepose-ruvector --bench ann_bench`** — compiles and runs the same fixture through criterion.
|
||||
- **`python archive/v1/data/proof/verify.py`** — **VERDICT: PASS** (the Rust ANN work is independent of the Python signal-proof pipeline; hash unchanged).
|
||||
|
||||
---
|
||||
|
||||
## 9. Consequences
|
||||
|
||||
**Positive.** ruvector now has a real, deterministic, pure-Rust HNSW graph index (25× over linear scan at high recall) usable by the AETHER re-ID / sketch-prefilter path — the ANN substrate ADR-156 §5 #1 wanted. The SymphonyQG claim is no longer CLAIMED-only: we built the missing baseline and **measured** the direction, with the bug-caught-by-the-gate disclosed.
|
||||
|
||||
**Negative / honest.** The 1-bit quantized variant is **not** an equal-recall QPS win at our scale; it is shipped as a measured experiment with a clearly-stated ceiling, not as a recommended default. Anyone reaching for it must read §7.
|
||||
|
||||
**Deferred (not silently dropped).**
|
||||
- **Multi-bit / RaBitQ-estimator traversal score.** Replace 1-bit Hamming traversal with a ≤4-bit code or the `estimator.rs` unbiased rescale (ADR-156 §10/§11) — the lever most likely to lift quantized recall to the equal-recall regime.
|
||||
- **Large-N crossover measurement.** Re-run §6 at N=100k–1M (`ANN_BENCH_N`) to find where quantization's per-node saving starts to dominate.
|
||||
- **Wiring HNSW into the live re-ID path** (AETHER hot-cache / sketch prefilter) behind a flag.
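
The large-N re-run listed above relies on the `ANN_BENCH_N` override. How `ann_measure.rs` reads that variable is not shown in this ADR; the sketch below is one plausible shape, assuming a plain decimal count that falls back to the §6 default of 10,000 when unset or malformed. `DEFAULT_N`, `parse_n`, and `bench_n` are hypothetical names.

```rust
use std::env;

/// Fixture size used in §6 (assumed to be the fallback).
pub const DEFAULT_N: usize = 10_000;

/// Parse an optional override; unset, non-numeric, or zero values fall back to `default`.
pub fn parse_n(raw: Option<&str>, default: usize) -> usize {
    raw.and_then(|s| s.trim().parse::<usize>().ok())
        .filter(|&n| n > 0)
        .unwrap_or(default)
}

/// Read `ANN_BENCH_N` from the environment.
pub fn bench_n() -> usize {
    parse_n(env::var("ANN_BENCH_N").ok().as_deref(), DEFAULT_N)
}
```

Keeping the parsing separate from the environment read makes the fallback behaviour testable without mutating process state.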
|
||||
|
||||
---
|
||||
|
||||
## 10. What changed, file by file
|
||||
|
||||
- `hnsw.rs` (new) — float HNSW: graph, seeded-deterministic level assignment, Algorithm-2 beam search, Algorithm-4 neighbour selection, L2/cosine, brute-force ground truth, full degenerate-case guards; 10 tests incl. the recall@10 correctness gate (L2 + cosine) and determinism. The insert-order bug fix (§4).
|
||||
- `hnsw_quantized.rs` (new) — SymphonyQG-style quantized-traversal index over the shared graph: 1-bit Pass-2 code per node, Hamming-scored greedy + beam, exact-float rerank; 7 tests incl. the rerank-recall gate and determinism.
|
||||
- `ann_measure.rs` (new) — shared deterministic fixture + recall/QPS measurement for linear / float-HNSW / quant-HNSW, the `ann_bench_report` test (the §6 source of truth), `ANN_BENCH_N` override.
|
||||
- `benches/ann_bench.rs` (new) + `Cargo.toml` `[[bench]]` — criterion bench over the same fixture/indices.
|
||||
- `lib.rs` — `pub mod hnsw / hnsw_quantized / ann_measure`; re-export `HnswIndex`, `HnswParams`, `Metric`, `QuantizedHnswIndex`.
|
||||
- `ADR-156-ruvector-fusion-beyond-sota.md` §5 #1 + §8 backlog — SymphonyQG regraded **CLAIMED → MEASURED-direction-tested (refuted at N=10k for our 1-bit construction)**, pointing here.
|
||||
- `CHANGELOG.md` — `[Unreleased]` entry.
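
The seeded-deterministic level assignment listed for `hnsw.rs` above can be illustrated with a short sketch. The SplitMix64 step uses the generator's standard published constants. The level formula is the usual HNSW draw, floor(-ln(u) · mL) with mL = 1/ln(M). The type and function names, and the way a uniform value is derived from the 64-bit output, are assumptions and may differ from the actual module.

```rust
/// SplitMix64 pseudo-random generator (standard constants).
pub struct SplitMix64(u64);

impl SplitMix64 {
    pub fn new(seed: u64) -> Self {
        Self(seed)
    }

    pub fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform value in (0, 1]; never 0, so `ln()` stays finite.
    pub fn next_unit(&mut self) -> f64 {
        ((self.next_u64() >> 11) + 1) as f64 / (1u64 << 53) as f64
    }
}

/// Draw a layer for a new node: floor(-ln(u) * mL), with mL = 1 / ln(M).
/// Assumes `m >= 2` (ln(1) = 0 would make mL infinite).
pub fn random_level(rng: &mut SplitMix64, m: usize) -> usize {
    let ml = 1.0 / (m as f64).ln();
    (-rng.next_unit().ln() * ml).floor() as usize
}
```

Because the level sequence depends only on the seed, the same seed and the same insert order reproduce the same graph. With M=16 about 15 in 16 nodes land on layer 0, which is what keeps the upper layers sparse.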
|
||||
+2
-2
@@ -97,8 +97,8 @@ Statuses: **Proposed** (under discussion), **Accepted** (approved and/or impleme
|
||||
| [ADR-036](ADR-036-rvf-training-pipeline-ui.md) | Training Pipeline UI Integration | Proposed |
|
||||
| [ADR-043](ADR-043-sensing-server-ui-api-completion.md) | Sensing Server UI API Completion (14 endpoints) | Accepted |
|
||||
| [ADR-115](ADR-115-home-assistant-integration.md) | Home Assistant integration via MQTT auto-discovery + Matter bridge (HA-DISCO + HA-FABRIC + HA-MIND) | Accepted (MQTT track) / Proposed (Matter SDK P8b) |
|
||||
| [ADR-147](ADR-147-adam-mode-light-theme.md) | adam-mode — light theme toggle for the three.js realtime demo | Proposed |
|
||||
| [ADR-148](ADR-148-yoga-mode-pose-system.md) | yoga-mode — yoga pose detection, classification, and scoring for the three.js realtime demo | Proposed |
|
||||
| [ADR-169](ADR-169-adam-mode-light-theme.md) | adam-mode — light theme toggle for the three.js realtime demo | Proposed |
|
||||
| [ADR-170](ADR-170-yoga-mode-pose-system.md) | yoga-mode — yoga pose detection, classification, and scoring for the three.js realtime demo | Proposed |
|
||||
|
||||
### Architecture and infrastructure
|
||||
|
||||
|
||||
@@ -1,6 +1,6 @@
|
||||
# ADR Corpus Census
|
||||
|
||||
Full per-ADR census underpinning ADR-164. **162 ADR entries across 156 distinct files** (6 duplicate-number collisions). Source of truth for the gap-analysis lenses. Where the census is uncertain it is marked *needs verification*.
|
||||
Full per-ADR census underpinning ADR-164. **162 ADR entries across 156 distinct files** (the 5 duplicate-number collisions / 6 displaced files have been RESOLVED — displaced files renumbered to ADR-166…171 per ADR-164 G1; the ADR-134 identity split is tracked separately under G3). Source of truth for the gap-analysis lenses. Where the census is uncertain it is marked *needs verification*.
|
||||
|
||||
| ADR | Title | Status | impl_state | Flags |
|
||||
|-----|-------|--------|-----------|-------|
|
||||
@@ -53,10 +53,10 @@ Full per-ADR census underpinning ADR-164. **162 ADR entries across 156 distinct
|
||||
| ADR-047 | RuView Observatory — Three.js Visualization | Accepted (Implemented) | implemented | — |
|
||||
| ADR-048 | Adaptive CSI Activity Classifier | Accepted | implemented | depends on Proposed ADR-045 |
|
||||
| ADR-049 | Cross-Platform WiFi Detection & Graceful Degradation | Proposed | proposed-only | targets Python v1 legacy; abandonment risk |
|
||||
| ADR-050 | Provisioning Tool Enhancements | Proposed | partial | DUPLICATE NUMBER; partially fulfilled by ADR-060 |
|
||||
| ADR-050 | Quality Engineering Response — Security Hardening | Accepted | partial | DUPLICATE NUMBER; unverified claims (54K fps); findings #6-8 unconfirmed |
|
||||
| ADR-052 | DDD Bounded Contexts (appendix) | (none — appendix, no Status) | unknown | missing-status; DUPLICATE NUMBER; cross-ref errors (cites 044 for provisioning) |
|
||||
| ADR-052 | Tauri Desktop Frontend — Hardware Mgmt & Viz | Proposed | partial | DUPLICATE NUMBER; superseded_by ADR-054; status drift |
|
||||
| ADR-050 | Provisioning Tool Enhancements | Proposed | partial | keeps 050 (collision resolved); partially fulfilled by ADR-060 |
|
||||
| ADR-166 | Quality Engineering Response — Security Hardening | Accepted | partial | renumbered from ADR-050 (collision resolved); unverified claims (54K fps); findings #6-8 unconfirmed |
|
||||
| ADR-167 | DDD Bounded Contexts (appendix to ADR-052) | (none — appendix, no Status) | unknown | renumbered from ADR-052 (collision resolved); missing-status; cross-ref errors (cites 044 for provisioning) |
|
||||
| ADR-052 | Tauri Desktop Frontend — Hardware Mgmt & Viz | Proposed | partial | keeps 052 (collision resolved); superseded_by ADR-054; status drift |
|
||||
| ADR-053 | UI Design System — Dark Professional | Accepted | implemented | depends on Proposed ADR-052 |
|
||||
| ADR-054 | RuView Desktop Full Implementation | Accepted — in progress | partial | command matrix mostly Stub; espflash version drift vs 052 |
|
||||
| ADR-055 | Integrated Sensing Server in Desktop App | Accepted | implemented | — |
|
||||
@@ -145,13 +145,13 @@ Full per-ADR census underpinning ADR-164. **162 ADR entries across 156 distinct
|
||||
| ADR-144 | UWB Range-Constraint Fusion | Proposed | partial | header stale (commit b10bc2e9a); no UWB radio in fleet |
|
||||
| ADR-145 | Ablation Evaluation Harness | Proposed | partial | referenced as existing by 149/150/151; F4/UWB variant HW-gated |
|
||||
| ADR-146 | RF Encoder Multi-Task Heads + Uncertainty | Proposed | proposed-only | no Impl note (unlike 141-144); depends on tch/libtorch |
|
||||
| ADR-147 | adam-mode — light theme toggle | Proposed | proposed-only | DUPLICATE NUMBER (3 files); referenced as landed by 148-yoga |
|
||||
| ADR-147 | Occupancy World Model (OccWorld/RoboOccWorld) | Accepted | partial | DUPLICATE NUMBER; self-revised from Cosmos; Phase B gated |
|
||||
| ADR-147 | Benchmark Proof — OccWorld on RTX 5080 | (none) | unknown | MISSING STATUS; DUPLICATE NUMBER; baseline-without-fine-tuning (random weights) |
|
||||
| ADR-148 | Drone Swarm Control System | In Progress | partial | DUPLICATE NUMBER; re-routes 147 Cosmos item to 149 |
|
||||
| ADR-148 | yoga-mode — pose detection/scoring demo | Proposed | proposed-only | DUPLICATE NUMBER; no tracking issue |
|
||||
| ADR-149 | AetherArena — Spatial-Intelligence Benchmark (HF) | Accepted | partial | DUPLICATE NUMBER; external repo out-of-tree; Wi-Pose dropped |
|
||||
| ADR-149 | Drone Swarm Benchmarking Methodology | Accepted (peer-reviewed) | partial | DUPLICATE NUMBER; critiques 148's own numbers |
|
||||
| ADR-169 | adam-mode — light theme toggle | Proposed | proposed-only | renumbered from ADR-147 (collision resolved); referenced by ADR-170 yoga |
|
||||
| ADR-147 | Occupancy World Model (OccWorld/RoboOccWorld) | Accepted | partial | keeps 147 (collision resolved); self-revised from Cosmos; Phase B gated |
|
||||
| ADR-168 | Benchmark Proof — OccWorld on RTX 5080 | (none) | unknown | renumbered from ADR-147 (collision resolved); MISSING STATUS; baseline-without-fine-tuning (random weights) |
|
||||
| ADR-148 | Drone Swarm Control System | In Progress | partial | keeps 148 (collision resolved); re-routes 147 Cosmos item to 149 |
|
||||
| ADR-170 | yoga-mode — pose detection/scoring demo | Proposed | proposed-only | renumbered from ADR-148 (collision resolved); no tracking issue |
|
||||
| ADR-149 | AetherArena — Spatial-Intelligence Benchmark (HF) | Accepted | partial | keeps 149 (collision resolved); external repo out-of-tree; Wi-Pose dropped |
|
||||
| ADR-171 | Drone Swarm Benchmarking Methodology | Accepted (peer-reviewed) | partial | renumbered from ADR-149 (collision resolved); critiques 148's own numbers |
|
||||
| ADR-150 | RuView RF Foundation Encoder | Proposed | partial | status Proposed but cites measured 81.63% in-domain vs ~11.6% cross-subject |
|
||||
| ADR-151 | Per-Room Calibration & Specialized Model Training | Accepted — Stages 1-5 impl | partial | HF-backbone distillation pending |
|
||||
| ADR-152 | WiFi-Pose SOTA 2026 Intake | Proposed | partial | header stale; §2.1-2.3/2.6 impl, WiFlow-STD ~96% PCK; 1/25 claim REFUTED |
|
||||
|
||||
@@ -6,7 +6,7 @@ Research notes backing ADR-164. Each lens output is reproduced verbatim. Census:
|
||||
|
||||
## Lens 1: status-distribution
|
||||
|
||||
Confirmed: ADR-147-benchmark-proof.md and ADR-134-csi-to-cir have no `Status` line in their headers (the 052-ddd hits are Rust code in the body, not a header; the ADR-052 appendix lacks a real Status header per its first lines). Findings are evidence-grounded. Final analysis below.
|
||||
Confirmed: ADR-168-benchmark-proof.md (was ADR-147-benchmark-proof.md) and ADR-134-csi-to-cir have no `Status` line in their headers (the 167-ddd hits are Rust code in the body, not a header; the ADR-167 appendix, was ADR-052-ddd, lacks a real Status header per its first lines). Findings are evidence-grounded. Final analysis below.
|
||||
|
||||
### ADR Corpus — Status & Implementation Distribution
|
||||
|
||||
@@ -20,7 +20,7 @@ Census: **162 ADR entries** across **156 distinct files** (6 duplicate-number co
|
||||
| Proposed (incl. "Proposed — conditional/research-only") | ~88 |
|
||||
| Superseded | 1 (ADR-002) |
|
||||
| Rejected | 1 (ADR-098) |
|
||||
| Missing / no Status header | 3 (ADR-147-benchmark-proof, ADR-052-ddd appendix, ADR-134-CIR) |
|
||||
| Missing / no Status header | 3 (ADR-168-benchmark-proof [was 147], ADR-167-ddd appendix [was 052], ADR-134-CIR) |
|
||||
| Mixed/dual status in one ADR | 3 (ADR-115, ADR-149-AetherArena vs swarm, ADR-133) |
|
||||
|
||||
#### impl_state tally
|
||||
@@ -31,29 +31,29 @@ Census: **162 ADR entries** across **156 distinct files** (6 duplicate-number co
|
||||
| partial | ~50 |
|
||||
| proposed-only | ~64 |
|
||||
| stale-or-contradicted | 3 (ADR-029, 030, 031) |
|
||||
| unknown | 5 (ADR-034, 044, 052-ddd, 147-proof, …) |
|
||||
| unknown | 5 (ADR-034, 044, 167-ddd [was 052], 168-proof [was 147], …) |
|
||||
| superseded | 1 (ADR-002) |
|
||||
|
||||
**Headline:** ~114 of 162 ADRs (70%) are decisions that never fully landed (proposed-only + partial + stale + unknown). The dominant failure mode is **stale Status headers** — Accepted/implemented work still labeled "Proposed."
|
||||
|
||||
#### SEVERITY: CRITICAL — Status header missing or structurally absent (cannot triage)
|
||||
|
||||
- **ADR-147-benchmark-proof.md** — *No `Status` header at all* (grep confirmed). Not a true ADR; it's a benchmark artifact (OccWorld @ ~213ms on RTX 5080, random weights) misfiled under the ADR-147 number. **Action: relocate to `docs/proof/` or `benchmarks/`, remove ADR number.**
|
||||
- **ADR-168-benchmark-proof.md** (renumbered from ADR-147 to resolve the 147 collision) — *No `Status` header at all* (grep confirmed). Not a true ADR; it's a benchmark artifact (OccWorld @ ~213ms on RTX 5080, random weights) that was misfiled under the ADR-147 number. **Action: relocate to `docs/proof/` or `benchmarks/`, remove ADR number.**
|
||||
- **ADR-134-csi-to-cir-time-domain-multipath.md** — *No `Status` header* (grep confirmed) in the header region. Body says Proposed but the field is not in canonical position. Compounded by a **number collision**: ADR-126/129 reference "ADR-134" as HOMECORE-MIGRATE, but the on-disk file is CIR. **Action: add canonical `## Status` line; resolve the 134 identity split.**
|
||||
- **ADR-052-ddd-bounded-contexts.md** — Appendix doc with no Status/Date header (grep found only Rust code, no header field). **Action: mark explicitly "Appendix to ADR-052 (no independent status)".**
|
||||
- **ADR-167-ddd-bounded-contexts.md** (renumbered from ADR-052 to resolve the 052 collision; still an appendix to parent ADR-052) — Appendix doc with no Status/Date header (grep found only Rust code, no header field). **Action: mark explicitly "Appendix to ADR-052 (no independent status)".**
|
||||
|
||||
#### SEVERITY: CRITICAL — Duplicate ADR numbers (6 collisions, all verified on disk)
|
||||
|
||||
| Number | Colliding files | Action |
|
||||
|---|---|---|
|
||||
| **147** | adam-mode-light-theme · nvidia-cosmos/OccWorld · benchmark-proof | Renumber 2 of 3 |
|
||||
| **148** | drone-swarm-control-system · yoga-mode-pose-system | Renumber 1 |
|
||||
| **149** | AetherArena-leaderboard · swarm-benchmarking | Renumber 1 |
|
||||
| **050** | provisioning-tool-enhancements · quality-engineering-security-hardening | Renumber 1 |
|
||||
| **052** | tauri-desktop-frontend · ddd-bounded-contexts (appendix) | Demote appendix |
|
||||
| **134** | csi-to-cir (on disk) · HOMECORE-MIGRATE (referenced, no file) | Resolve identity |
|
||||
| Number | Colliding files | Action | Resolution |
|
||||
|---|---|---|---|
|
||||
| **147** | adam-mode-light-theme · nvidia-cosmos/OccWorld · benchmark-proof | Renumber 2 of 3 | **RESOLVED** — 147 keeps nvidia-cosmos/OccWorld; benchmark-proof → **ADR-168**, adam-mode → **ADR-169** |
|
||||
| **148** | drone-swarm-control-system · yoga-mode-pose-system | Renumber 1 | **RESOLVED** — 148 keeps drone-swarm; yoga-mode → **ADR-170** |
|
||||
| **149** | AetherArena-leaderboard · swarm-benchmarking | Renumber 1 | **RESOLVED** — 149 keeps AetherArena; swarm-benchmarking → **ADR-171** |
|
||||
| **050** | provisioning-tool-enhancements · quality-engineering-security-hardening | Renumber 1 | **RESOLVED** — 050 keeps provisioning (5 refs vs 1); quality-engineering → **ADR-166** |
|
||||
| **052** | tauri-desktop-frontend · ddd-bounded-contexts (appendix) | Demote appendix | **RESOLVED** — 052 keeps tauri; ddd appendix renumbered → **ADR-167** (still linked to parent 052) |
|
||||
| **134** | csi-to-cir (on disk) · HOMECORE-MIGRATE (referenced, no file) | Resolve identity | Identity split (not a filename collision); resolved separately via G3 → ADR-165 |
|
||||
|
||||
These break the ADR index and `/adr` tooling — two ADRs answering to one number is a corpus-integrity defect, not cosmetics.
|
||||
These broke the ADR index and `/adr` tooling — two ADRs answering to one number is a corpus-integrity defect, not cosmetics. The five filename collisions are now resolved (six displaced files renumbered 166–171); see ADR-164 Gap Register G1.
|
||||
|
||||
#### SEVERITY: HIGH — Status header stale vs. shipped reality (Proposed header on landed code)
|
||||
|
||||
@@ -91,7 +91,7 @@ Cluster heads where the whole chain is Proposed with zero implementation evidenc
|
||||
|
||||
#### Ranked actionable backlog (do in this order)
|
||||
|
||||
1. **Resolve 6 duplicate ADR numbers + 3 missing-header files** (CRITICAL — breaks the index/tooling). Renumber 147×2, 148, 149, 050; demote 052-ddd appendix; resolve the 134 identity split; add Status headers to 147-proof, 134, 052-ddd.
|
||||
1. **Resolve 6 duplicate ADR numbers + 3 missing-header files** (CRITICAL — breaks the index/tooling). **Number collisions RESOLVED:** renumbered 147×2 (benchmark-proof→168, adam-mode→169), 148 (yoga→170), 149 (swarm-benchmarking→171), 050 (quality-engineering→166), 052 ddd appendix→167. Remaining: resolve the 134 identity split (done via G3→165); add Status headers to 168-proof, 134, 167-ddd (owner-gated).
|
||||
2. **Bulk-flip the 10 streaming-engine headers (ADR-136–145)** from Proposed → "Accepted — partial" — they have commit-pinned, test-backed Implementation Status notes. Highest ROI: one batch fixes the largest stale-status cluster.
|
||||
3. **Fix the status-graph inversions** (032/053/048/077 depend on Proposed parents; promote parents 029/030/031/045/052/075/076 to match their built reality, or downgrade the dependents).
|
||||
4. **Reconcile CLAUDE.md vs ADR headers** for 017, 024, 027, 072, 152 (doc says one thing, header another).
|
||||
@@ -184,7 +184,7 @@ The sweep (ADR-154–163) is itself a structured retraction layer: each "Beyond-
|
||||
|
||||
**[MEDIUM] ADR-098 → ADR-099 partial reversal.** ADR-098 **Rejected** midstream as a system component; ADR-099 (Proposed) **adopts** midstream's temporal-compare (DTW) + temporal-attractor-studio as a parallel tap. Framed as "complementary," but it revives the exact carve-outs ADR-098 declined to integrate — a live decision conflict pending resolution.
|
||||
|
||||
**[MEDIUM] ADR-147 (OccWorld) self-retracts Cosmos.** The accepted ADR-147 title/decision was revised from "NVIDIA Cosmos WFM Integration" to OccWorld after a hardware finding (Cosmos needs 32.5 GB VRAM); Cosmos is retracted as primary. The companion ADR-147-benchmark-proof reports 213 ms/inference on **random weights, no checkpoint** — a baseline-without-fine-tuning number that must not be cited as a quality/target metric.
|
||||
**[MEDIUM] ADR-147 (OccWorld) self-retracts Cosmos.** The accepted ADR-147 title/decision was revised from "NVIDIA Cosmos WFM Integration" to OccWorld after a hardware finding (Cosmos needs 32.5 GB VRAM); Cosmos is retracted as primary. The companion ADR-168-benchmark-proof (renumbered from ADR-147) reports 213 ms/inference on **random weights, no checkpoint** — a baseline-without-fine-tuning number that must not be cited as a quality/target metric.
|
||||
|
||||
#### B. Pairs making CONFLICTING decisions on the same topic
|
||||
|
||||
|
||||
@@ -181,7 +181,7 @@ A facade hides its failures. We document ours in detail:
|
||||
a 20 KB int4 edge model, with the quantization trade-offs shown.
|
||||
- **Retractions** — the "100% presence" figure was withdrawn in-place rather than quietly
|
||||
edited away.
|
||||
- **[ADR-147 benchmark proof](adr/ADR-147-benchmark-proof.md)** and
|
||||
- **[ADR-168 benchmark proof](adr/ADR-168-benchmark-proof.md)** and
|
||||
**[WITNESS-LOG-028](WITNESS-LOG-028.md)** — how the numbers are produced and a 33-row
|
||||
per-claim attestation matrix.
|
||||
|
||||
|
||||
@@ -33,11 +33,11 @@ Role mapping is normative per ADR-136 §2.1; maturity is this review's judgment
|
||||
| **signal** | `wifi-densepose-signal` (incl. `ruvsense/`) | 6-stage pipeline (`ruvsense/mod.rs:9-23`), `cir.rs`, `calibration.rs`, `hampel.rs`, `fresnel.rs`, `phase_sanitizer.rs` | 473 | **Production** (unit level); live multistatic wiring **beta** | §3 below; ADR-014 Accepted, ADR-029 Proposed |
|
||||
| **fusion** | `ruvsense/multistatic.rs`, `ruvsense/fusion_quality.rs`, `wifi-densepose-ruvector/src/viewpoint/` | `MultistaticFuser`, `QualityScore`, `CrossViewpointAttention`, GDI/Cramér-Rao (`viewpoint/geometry.rs`) | 20 (multistatic.rs), 3 (fusion_quality.rs), 136 (ruvector crate) | **Beta** — tested building blocks, composed only in `wifi-densepose-engine` tests | `viewpoint/mod.rs:1-30`; engine `lib.rs:317-319` |
|
||||
| **world** | `homecore`, `wifi-densepose-worldgraph`, `wifi-densepose-geo`, `wifi-densepose-worldmodel` | `StateMachine`, `EventBus`, `WorldGraph` (rooms/sensors/person-tracks/semantic states), ENU geo registration | 9+11, 7, 16+1, 12+1 | **Beta** — homecore is explicit "P1 scaffold"; persistence/service dispatch deferred to P2 | `homecore/src/lib.rs:7, 24-31`; ADR-127 Proposed |
|
||||
| **models** | `cog-pose-estimation`, `cog-person-count`, `wifi-densepose-nn`, `wifi-densepose-train`, `wifi-densepose-occworld-candle` | ONNX/Candle inference, training pipeline, OccWorld bridge | 7, 15, 30+1, 312, 12 | **Experimental** — no trained RF foundation encoder exists; ADR-147 benchmarked OccWorld with **random weights** | `ADR-147-benchmark-proof.md` ("random weights — pre-domain-fine-tuning baseline"); ADR-146/150 Proposed |
|
||||
| **models** | `cog-pose-estimation`, `cog-person-count`, `wifi-densepose-nn`, `wifi-densepose-train`, `wifi-densepose-occworld-candle` | ONNX/Candle inference, training pipeline, OccWorld bridge | 7, 15, 30+1, 312, 12 | **Experimental** — no trained RF foundation encoder exists; ADR-147 benchmarked OccWorld with **random weights** | `ADR-168-benchmark-proof.md` ("random weights — pre-domain-fine-tuning baseline"); ADR-146/150 Proposed |
|
||||
| **privacy** | `wifi-densepose-bfld` | `privacy_gate.rs`, `privacy_mode.rs` (mode registry + hash-chained attestation), `identity_risk.rs`, `signature_hasher.rs`, `embedding_ring.rs` | 369 | **Beta** — strongest-tested layer, but lib header still says "Status: P1 in progress" (`lib.rs:12`, stale vs 20 implemented modules) | ADR-118–123, 141 all Proposed |
|
||||
| **store** | `homecore-recorder` | trajectory/event recording | 8+12 | **Experimental** | ADR-136 §2.1 |
|
||||
| **api** | `homecore-api`, `homecore-server`, `cog-ha-matter`, `homecore-hap` | REST/WS, HA discovery, Matter, HomeKit | 7+11, 0, 63+1, 15+2 | **Experimental→Beta** (`homecore-server` has zero tests) | ADR-130/125/115 Proposed |
|
||||
| **eval** | `wifi-densepose-train/src/ablation.rs`, `ruview-swarm/src/evals/` | ablation harness (ADR-145), swarm eval suite (ADR-149) | included in 312 / 115 | **Experimental** — ADR-145 self-labels "skeleton/scaffolding, mostly not yet on the live 20 Hz path" | `ablation.rs` exists; ADR-149 (swarm benchmarking) Accepted |
|
||||
| **eval** | `wifi-densepose-train/src/ablation.rs`, `ruview-swarm/src/evals/` | ablation harness (ADR-145), swarm eval suite (ADR-171) | included in 312 / 115 | **Experimental** — ADR-145 self-labels "skeleton/scaffolding, mostly not yet on the live 20 Hz path" | `ablation.rs` exists; ADR-171 (swarm benchmarking, renumbered from ADR-149) Accepted |
|
||||
| **observe** | `homecore-automation`, `homecore-assist` | automation engine, assistant/Ruflo bridge | 20+14, 3+20 | **Experimental** | ADR-129/133 Proposed |
|
||||
| **(integration root)** | `wifi-densepose-engine` | `StreamingEngine`, `TrustedOutput`, privacy demotion, witness | 11 | **Beta** — the only crate that proves cross-role composition; not on a live I/O path | `engine/src/lib.rs:1-29, 457-751` |
|
||||
| **(swarm)** | `ruview-swarm` | Raft/gossip topology, RRT-APF planning, Candle PPO MARL, CSI sensing payload, failsafe, Ruflo | 115+19 | **Experimental/simulation** — M3 needs real ESP32-S3 hardware | ADR-148:940-953 ("Overall ~98%", M3 85%) |
|
||||
@@ -148,7 +148,7 @@ This is genuinely strong design. But all inputs are synthetic `MultiBandCsiFrame
|
||||
| R5 | **Float nondeterminism in fusion** across thread counts could silently break the witness/replay contract once wired | Medium | High | ADR-136 §3.3 risk table (project's own assessment) |
|
||||
| R6 | **Privacy bypass via unwired paths**: BFLD invariants are enforced per-module, but until the engine is the *only* route from ingest to API, a sensing-server endpoint can emit ungated state (sensing-server already has 30+ modules incl. pose/vitals APIs predating the control plane) | Medium | Critical | `sensing-server/src/` module list vs engine isolation |
|
||||
| R7 | **Hardware dependence + scale**: multistatic TDMA/channel-hopping timing validated on small ESP32 sets; ADR-148 M3 explicitly blocked on real hardware; clock-quality model in engine uses a hardcoded `ClockQualityScore` (`engine/src/lib.rs:384`) | Medium | High | ADR-148:946; hardcoded 50 µs stdev |
|
||||
| R8 | **ADR/doc/status drift**: 150 ADRs with near-universal "Proposed" status, stale in-source status headers (`bfld/src/lib.rs:12`), CLAUDE.md "16 ruvsense modules" vs 22 on disk, duplicate ADR numbers (two ADR-050s, two ADR-147s, two ADR-149s, ADR-052 ×2) — institutional-memory value degrades | High | Medium | `ls docs/adr/`; this review §3 |
|
||||
| R8 | **ADR/doc/status drift**: 150 ADRs with near-universal "Proposed" status, stale in-source status headers (`bfld/src/lib.rs:12`), CLAUDE.md "16 ruvsense modules" vs 22 on disk, duplicate ADR numbers (two ADR-050s, two ADR-147s, two ADR-149s, ADR-052 ×2 — **now RESOLVED: displaced files renumbered to ADR-166…171 per ADR-164 G1**) — institutional-memory value degrades | High | Medium | `ls docs/adr/`; this review §3 |
|
||||
| R9 | **Workspace breadth vs maintenance capacity**: 38 workspace crates + 4 vendored subtrees + Python archive + firmware; several crates have 0 tests (`homecore-server`, `nvsim-server`, `wifi-densepose-wasm`, `homecore-plugin-example`); bus factor appears to be ~1 | High | Medium | crate test-count table §2 |
|
||||
| R10 | **Eval debt**: no end-to-end accuracy benchmark on real CSI with ground truth exists in-repo (ADR-145 harness is scaffolding; ADR-079 camera ground truth not exercised here) — "beyond SOTA" claims are currently unfalsifiable | High | High | ADR-145 status note; absence of ground-truth datasets in tree |
|
||||
|
||||
|
||||
@@ -18,7 +18,7 @@ published from the layer it lives at.
|
||||
|-------|----------------|---------|-----------|-------------|
|
||||
| **L0** Unit/integration tests | Code correctness | `cargo test --workspace --no-default-features` + pytest | per commit | exact |
|
||||
| **L1** Deterministic proof + witness bundle | Pipeline is real, unchanged, reproducible | `archive/v1/data/proof/verify.py`, `scripts/generate-witness-bundle.sh` | per merge / release | exact (SHA-256) |
|
||||
| **L2** Criterion micro-benchmarks | Compute latency only — never quality (ADR-149 §2) | 15 bench targets across `v2/crates/*/benches/` | nightly / pre-release | statistical |
|
||||
| **L2** Criterion micro-benchmarks | Compute latency only — never quality (ADR-171 §2) | 15 bench targets across `v2/crates/*/benches/` | nightly / pre-release | statistical |
|
||||
| **L3** Dataset-level accuracy eval | Pose/presence/vitals quality vs published SOTA | MM-Fi / Wi-Pose (ADR-015), `ruview_metrics.rs` tiers, ADR-145 ablation harness | per model release | seeded |
|
||||
| **L4** Hardware-in-loop | Real CSI on real ESP32, no mocks | COM9 (S3) / COM12 (C6) protocol, witness firmware hashes | per firmware release | A/B controlled |
|
||||
| **L5** Field trials / live capture | End-to-end behavior in a real room | live-session captures (e.g. `benchmark_baseline.json`) | campaign | statistical |
|
||||
@@ -69,7 +69,7 @@ from the check inventory.
|
||||
|
||||
### 1.3 L2 — Criterion micro-benchmark inventory (all 15 targets)
|
||||
|
||||
-All bench sources read directly. Per ADR-149 §2 these are **latency regression gates
+All bench sources read directly. Per ADR-171 §2 these are **latency regression gates
|
||||
only, never quality evidence**.
|
||||
|
||||
| Bench target | Crate | Benchmark functions / groups | What it measures | Recorded value or in-source target (citation) |
|
||||
@@ -86,7 +86,7 @@ only, never quality evidence**.
|
||||
| `detection_bench.rs` | wifi-densepose-mat | `breathing_detection`, `heartbeat_detection`, `movement_classification`, `detection_pipeline`, localization (triangulation/depth), alert generation | MAT survivor-detection algorithms at varying signal lengths / noise | no recorded baseline |
|
||||
| `transport_bench.rs` | wifi-densepose-hardware | `beacon_serialize_16byte/28byte_auth/quic_framed`, `auth_beacon_verify`, `replay_window`, `framed_message` encode/decode, `secure_tdm_cycle` (manual vs QUIC) | TDM beacon crypto + transport | no recorded baseline |
|
||||
| `mqtt_throughput.rs` | wifi-densepose-sensing-server | `discovery::build_*`, `state::*`, `rate_limiter::allow_*`, `privacy::decide_*`, `semantic::bus_tick_all_10_primitives` | ADR-115 MQTT hot path | Targets (header): discovery **<5 µs**, state encode **<2 µs**, rate limit **<100 ns**, privacy **<50 ns**, bus tick **<10 µs** |
|
||||
-| `swarm_bench.rs` | ruview-swarm | `marl_actor_inference`, `rrt_apf_100iter`, `multiview_fusion_3drones`, `demo_coverage_estimate`, `ppo_update_64transitions` | ADR-148 swarm control-loop compute | Measured: **3.3 µs / 43 µs / 54–58.5 ns / 100 ps / 248 µs** (ADR-149 §4.3; `CHANGELOG.md` Performance section) |
+| `swarm_bench.rs` | ruview-swarm | `marl_actor_inference`, `rrt_apf_100iter`, `multiview_fusion_3drones`, `demo_coverage_estimate`, `ppo_update_64transitions` | ADR-148 swarm control-loop compute | Measured: **3.3 µs / 43 µs / 54–58.5 ns / 100 ps / 248 µs** (ADR-171 §4.3; `CHANGELOG.md` Performance section) |
|
||||
| `pipeline_throughput.rs` | nvsim | `pipeline_run` (sample-count sweep), `witness::run` vs `run_with_witness` | NV-diamond sim throughput + witness overhead | Acceptance: **≥1 kHz** simulated samples/s on Cortex-A53-class CPU — bench header |
|
||||
| `state_machine.rs` | homecore | `set` first/warm/no-op, `get` hit/miss, `all_snapshot`, `all_by_domain_light_20_of_100`, `broadcast_fan_out` | HOMECORE state-machine hot paths | no recorded baseline |
|
||||
|
||||
@@ -109,7 +109,7 @@ file itself); its producer must be identified and committed (§5.3). Summary val
|
||||
| `person_count_changes` | 10 |
|
||||
|
||||
Criterion latencies that *have* been recorded live in ADR documents instead
|
||||
-(ADR-147-benchmark-proof.md, ADR-149 §4.3, CHANGELOG Performance) — §5 below defines
+(ADR-168-benchmark-proof.md, ADR-171 §4.3, CHANGELOG Performance) — §5 below defines
|
||||
how to consolidate them into a real machine-readable criterion baseline.
|
||||
|
||||
### 1.4 L3 — Dataset-level accuracy evaluation
|
||||
@@ -150,7 +150,7 @@ how to consolidate them into a real machine-readable criterion baseline.
|
||||
### 1.6 L5 — Field trials
|
||||
|
||||
Live multi-node sessions captured as JSONL/JSON with summary statistics —
|
||||
-`benchmark_baseline.json` (§1.3) is the existing exemplar. ADR-149 §6 adds the seeded
+`benchmark_baseline.json` (§1.3) is the existing exemplar. ADR-171 §6 adds the seeded
|
||||
`evals/` episode harness (Stage 1 kinematic full-matrix, Stage 2 Gazebo/PX4 SITL on the
|
||||
3 median seeds) for the swarm domain.
|
||||
|
||||
@@ -168,42 +168,42 @@ statistical procedure of §3 followed. Current axes with measured status:
|
||||
| Edge efficiency frontier | torso-PCK@20 at deployed precision + params + batch-1 latency | same | MultiFormer 72.25% at full size | Pareto-dominance: smaller **and** above 72.25% at the deployed precision | int8 73.5 KB **74.70%**; int4-QAT 36.7 KB **74.46%**; shipped int4 verified **74.08%**, 0.135 ms 1-thread x86 (same file) |
|
||||
| Cross-subject generalization | torso-PCK@20, official MM-Fi cross-subject split (256,608 train / 64,152 test) | leakage-free split | own zero-shot baseline 63.99% | ADR-150 §4 gate: **+≥6 pts cross-subject without losing >2 pts random-split** | Best zero-shot **64.92%** (mixup+TTA+3-seed); gate judged unreachable without new capture (ADR-150 §3.2) |
|
||||
| Few-shot calibration (deployment) | PCK@20 after K labeled in-room samples; adapter size | MM-Fi cross-subject & cross-environment splits | zero-shot (64% / 10.6%) | SOTA-level (≳72%) from ≤200 samples with ≤~11 KB per-room adapter | cross-subject ~**72%** @100–200 samples (3 seeds); cross-env **10.6→73.1%** @200, 60.1% @5 (ADR-150 §3.5–3.6) |
|
||||
-| Swarm SAR localization | CEP50/CEP95 (m), GDOP-stratified | seeded episode distribution (ADR-149 §6), not single geometry | Wi2SAR **5 m** (arxiv 2604.09115, paper-to-paper) | CEP50 < 5 m, IQM over ≥10 seeds, 95% CI excluding 5 m | 1.732 m single synthetic geometry — graded **Low–Medium**, not yet claimable (ADR-149 §7) |
-| Swarm coverage | coverage-rate@240 s; time-to-95% | episode rollouts | Wi2SAR 160k m²/13.5 min | rollout (not analytic) mean+CI beating baseline | 223 s is an analytic estimate — graded **Low** (ADR-149 §7) |
-| Control-loop latency | criterion wall-clock | local hardware, named | 10 ms / 100 Hz budget | all stages ≪ budget | 3.3 µs MARL / 43 µs RRT-APF / 54 ns fusion / 248 µs PPO (ADR-149 §4.3) |
-| World-model trajectory | MDE (m) at 5-frame horizon | RuView CSI-derived occupancy | pre-fine-tune random-weight baseline 9.49 m MDE | **≤1.0 m (2.0 vox)** at 5-frame horizon (ADR-147 §5 target, cited in benchmark-proof §4) | 9.49 m / FDE 16.23 m random weights; 208.45 ms median latency on real CSI (ADR-147-benchmark-proof §4, §7) |
+| Swarm SAR localization | CEP50/CEP95 (m), GDOP-stratified | seeded episode distribution (ADR-171 §6), not single geometry | Wi2SAR **5 m** (arxiv 2604.09115, paper-to-paper) | CEP50 < 5 m, IQM over ≥10 seeds, 95% CI excluding 5 m | 1.732 m single synthetic geometry — graded **Low–Medium**, not yet claimable (ADR-171 §7) |
+| Swarm coverage | coverage-rate@240 s; time-to-95% | episode rollouts | Wi2SAR 160k m²/13.5 min | rollout (not analytic) mean+CI beating baseline | 223 s is an analytic estimate — graded **Low** (ADR-171 §7) |
+| Control-loop latency | criterion wall-clock | local hardware, named | 10 ms / 100 Hz budget | all stages ≪ budget | 3.3 µs MARL / 43 µs RRT-APF / 54 ns fusion / 248 µs PPO (ADR-171 §4.3) |
+| World-model trajectory | MDE (m) at 5-frame horizon | RuView CSI-derived occupancy | pre-fine-tune random-weight baseline 9.49 m MDE | **≤1.0 m (2.0 vox)** at 5-frame horizon (ADR-147 §5 target, cited in benchmark-proof §4) | 9.49 m / FDE 16.23 m random weights; 208.45 ms median latency on real CSI (ADR-168-benchmark-proof §4, §7) |
|
||||
| Privacy leakage | MIA `leakage_score = 2·(AUC−0.5)` | fixed replay, fixed-seed shadow classifier | chance (0) | ≤ **0.05** (attacker AUC ≤ 0.525) | gate defined, harness built (ADR-145 §2.3) |
|
||||
| Vitals (hardware) | BPM error vs wearable ground truth | live A/B board protocol | control board behavior | within physiological agreement of ground truth, stable spread | 88–91 BPM vs 87 GT, spread 59→0 (CHANGELOG #987) |
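The privacy-leakage gate in the table can be checked by hand from the formula the row itself states. The worked values below use only numbers already quoted in that row (chance-level AUC of 0.5 and the 0.525 attacker ceiling):

```latex
\[
\text{leakage\_score} = 2\,(\mathrm{AUC} - 0.5), \qquad
\mathrm{AUC} = 0.5 \;\Rightarrow\; 0, \qquad
\mathrm{AUC} = 0.525 \;\Rightarrow\; 2 \times 0.025 = 0.05 .
\]
```

The factor of 2 rescales the AUC range above chance, [0.5, 1], onto [0, 1], so a score of 0.05 means the attacker recovers 5% of the gap between chance and a perfect membership classifier.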
|
||||
|
||||
-### Claim-language discipline (from ADR-149 §7 grading)
+### Claim-language discipline (from ADR-171 §7 grading)
|
||||
|
||||
| Evidence | Permitted language |
|
||||
|---|---|
|
||||
| Single run / single geometry / analytic estimate | "directional", never "beats SOTA" |
|
||||
| Seeded multi-run with CIs vs paper baseline | "exceeds the published X result paper-to-paper" |
|
||||
| Same metric, same split, same protocol, CI excludes baseline | "beyond SOTA on <dataset>/<split>" |
|
||||
-| No public leaderboard exists (swarm CSI-SAR) | never claim "leaderboard standing" (ADR-149 §3) |
+| No public leaderboard exists (swarm CSI-SAR) | never claim "leaderboard standing" (ADR-171 §3) |
|
||||
|
||||
---
|
||||
|
||||
## 3. Statistical Procedure for Honest Claims
|
||||
|
||||
-Adopted from ADR-149 §5 (Agarwal 2021 / Gorsane 2022 standard) and the practices
+Adopted from ADR-171 §5 (Agarwal 2021 / Gorsane 2022 standard) and the practices
|
||||
already used in ADR-150/efficiency-frontier measurements:
|
||||
|
||||
-1. **Seeds.** ≥10 independent seeds for RL/episodic claims (ADR-149 §5); ≥3 seeds
+1. **Seeds.** ≥10 independent seeds for RL/episodic claims (ADR-171 §5); ≥3 seeds
|
||||
minimum for supervised dataset evals (ADR-150 §3.5 used 3 seeds; report all).
|
||||
Training seeds, eval seeds, and split files are versioned and committed.
|
||||
2. **Aggregate.** IQM (not mean/median) for episodic metrics + performance profiles;
|
||||
for dataset accuracy report mean across seeds with each seed's value listed.
|
||||
-3. **Confidence intervals.** 95% stratified bootstrap, 1,000 resamples (ADR-149 §5;
+3. **Confidence intervals.** 95% stratified bootstrap, 1,000 resamples (ADR-171 §5;
|
||||
reference impl: `rliable`).
|
||||
4. **Paired comparisons.** When comparing model A vs B (e.g. `csi_plus_cir` vs
|
||||
`csi_only`, or ours vs a reproduced baseline), evaluate both on the **identical
|
||||
frozen test frames** and use a paired bootstrap over per-sample correctness
|
||||
(PCK hit/miss is per-joint binary — pair at the joint-sample level). For
|
||||
paper-to-paper comparisons where the baseline cannot be re-run, state so
|
||||
-explicitly ("paper-to-paper", ADR-149 §2) and require the CI lower bound to clear
+explicitly ("paper-to-paper", ADR-171 §2) and require the CI lower bound to clear
|
||||
the published point value.
|
||||
5. **Pre-registration.** The threshold lives in an ADR **before** the run
|
||||
(precedent: ADR-150 §4 gate written before §3.2 measurements; the measurements
|
||||
@@ -212,9 +212,9 @@ already used in ADR-150/efficiency-frontier measurements:
|
||||
capacity-hurts, and KD-didn't-help results in the record — required practice.
|
||||
7. **Eval episodes (swarm):** 50 fixed, versioned episodes per policy
|
||||
(10 victim layouts × 5 CSI-noise levels), ≥3 baselines (random walk,
|
||||
-boustrophedon+triangulation, IPPO) (ADR-149 §5).
+boustrophedon+triangulation, IPPO) (ADR-171 §5).
|
||||
8. **GDOP stratification** for any localization claim, so geometry artifacts cannot
|
||||
-produce the headline (ADR-149 §6.3).
+produce the headline (ADR-171 §6.3).
|
||||
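Steps 2 and 3 of the procedure above (IQM aggregation and a bootstrap confidence interval over per-seed scores) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's harness: the document names `rliable` as the reference implementation, while the function names `iqm` and `iqm_bootstrap_ci`, the small LCG, and the index-based percentile pick are all hypothetical. It also resamples a single stratum, whereas the procedure calls for a stratified bootstrap (resampling within each task or stratum separately).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Interquartile mean: sort, drop floor(n/4) values from each end, average
 * the rest (the truncation used by a 25% trimmed mean). Returns 0 for n == 0. */
static double iqm(const double *v, size_t n)
{
    if (n == 0) return 0.0;
    double *s = malloc(n * sizeof *s);
    assert(s != NULL);
    memcpy(s, v, n * sizeof *s);
    qsort(s, n, sizeof *s, cmp_double);
    size_t cut = n / 4;
    double sum = 0.0;
    for (size_t i = cut; i < n - cut; i++) sum += s[i];
    free(s);
    return sum / (double)(n - 2 * cut);
}

/* Tiny deterministic generator so resampling is reproducible from a
 * committed seed (hypothetical choice; any seeded PRNG would do). */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* 95% percentile-bootstrap interval of the IQM: resample the scores with
 * replacement, recompute the IQM each time, read off the 2.5% / 97.5% points. */
static void iqm_bootstrap_ci(const double *v, size_t n, size_t resamples,
                             uint32_t seed, double *lo, double *hi)
{
    assert(n > 0 && resamples > 0);
    double *stats = malloc(resamples * sizeof *stats);
    double *draw = malloc(n * sizeof *draw);
    assert(stats != NULL && draw != NULL);
    uint32_t state = seed;
    for (size_t r = 0; r < resamples; r++) {
        for (size_t i = 0; i < n; i++)
            draw[i] = v[(lcg_next(&state) >> 8) % n];
        stats[r] = iqm(draw, n);
    }
    qsort(stats, resamples, sizeof *stats, cmp_double);
    *lo = stats[(size_t)(0.025 * (double)(resamples - 1))];
    *hi = stats[(size_t)(0.975 * (double)(resamples - 1))];
    free(stats);
    free(draw);
}
```

For a paper-to-paper claim as described in step 4, the check would be that the relevant bound of this interval clears the published point value (for example, the upper bound of a CEP50 interval sitting below 5 m).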
|
||||
---
|
||||
|
||||
@@ -230,7 +230,7 @@ already used in ADR-150/efficiency-frontier measurements:
|
||||
|
||||
### 4.2 Criterion baseline file (replaces the current gap)
|
||||
|
||||
-Today criterion numbers live in prose (ADR-147-benchmark-proof, ADR-149 §4.3,
+Today criterion numbers live in prose (ADR-168-benchmark-proof, ADR-171 §4.3,
|
||||
CHANGELOG). Formalize:
|
||||
|
||||
1. `cargo bench --workspace -- --save-baseline main` on a **named, fixed runner**
|
||||
@@ -293,7 +293,7 @@ Anyone outside the project must be able to re-run every claimed result:
|
||||
(`calibration_proof_runner.rs` pattern, ADR-145 §2.6) for libm portability.
|
||||
3. **Seeds are constants, committed:** `PROOF_SEED=42`, `MODEL_SEED=0`
|
||||
(`proof.rs`, ADR-015 Phase 5); dataset splits committed as `.npy`
|
||||
-(`split_random.npy`); swarm configs as versioned YAML with all seeds (ADR-149 §5).
+(`split_random.npy`); swarm configs as versioned YAML with all seeds (ADR-171 §5).
|
||||
4. **Artifacts carry hashes.** Published model artifacts include SHA-256 (HuggingFace
|
||||
`pose_micro_int4.npz`, sha256 `c03eeb…` — efficiency-frontier doc); witness bundle
|
||||
has a `MANIFEST.sha256` over every file; provenance fields
|
||||
@@ -318,9 +318,9 @@ Anyone outside the project must be able to re-run every claimed result:
|
||||
| 1 | **Subject leakage / split optimism.** In-domain `random_split` has temporal/subject-adjacency effects; the same model family scores 83.6% random-split but ~11.6% torso-PCK on the leakage-free cross-subject split | efficiency-frontier "Controlled claim" footnote; ADR-150 §1, §3.2 | Always report the split name; publish random-split and cross-subject numbers side by side; cross-subject claims only on the official split |
|
||||
| 2 | **Per-environment overfitting.** Zero-shot cross-environment collapses to 10.6%; subject-scaling saturates ~63.7% past 16–20 subjects because the residual is room/device shift | ADR-150 §3.3, §3.6 | Cross-room degradation + 17-joint heatmap in every ablation (ADR-145 §2.5); claim deployment accuracy only with the calibration protocol stated (K samples, adapter size) |
|
||||
| 3 | **Mock-mode contamination.** Mock firmware missed a real Kconfig threshold bug; the nn crate ships a `mock_inference` criterion group that must never be quoted as pipeline performance | `CLAUDE.md` firmware rule 7; `inference_bench.rs` `bench_mock_inference` | L4 mandatory before firmware release ("Always test with real WiFi CSI, not mock mode"); label mock benches in reports; ADR-147 §7 re-ran the benchmark on real CSI explicitly "no mocks" |
|
||||
-| 4 | **Single-run point estimates.** 1.732 m localization from one synthetic geometry; 223 s coverage from an analytic formula | ADR-149 §1, §7 | §3 seed/CI protocol; evidence-grade table before publication |
-| 5 | **Random-weight / untrained baselines read as results.** OccWorld MDE 9.49 m is a pre-fine-tuning random-weight reading | ADR-147-benchmark-proof §4 | Label baseline-vs-target explicitly; never aggregate untrained-model numbers into capability claims |
-| 6 | **Latency conflated with quality.** Criterion µs numbers prove no compute bottleneck, nothing about accuracy | ADR-149 §2, §4.3 | L2 is gate-only; quality claims live in L3+ |
+| 4 | **Single-run point estimates.** 1.732 m localization from one synthetic geometry; 223 s coverage from an analytic formula | ADR-171 §1, §7 | §3 seed/CI protocol; evidence-grade table before publication |
+| 5 | **Random-weight / untrained baselines read as results.** OccWorld MDE 9.49 m is a pre-fine-tuning random-weight reading | ADR-168-benchmark-proof §4 | Label baseline-vs-target explicitly; never aggregate untrained-model numbers into capability claims |
+| 6 | **Latency conflated with quality.** Criterion µs numbers prove no compute bottleneck, nothing about accuracy | ADR-171 §2, §4.3 | L2 is gate-only; quality claims live in L3+ |
|
||||
| 7 | **Floating-point nondeterminism breaking proofs.** SciPy FFT SIMD reordering + multithreaded BLAS produced different hashes across CI microarchitectures | CHANGELOG #560; `calibration_proof_runner.rs` lines 1–13 (cited in ADR-145 §2.3) | Quantize before hashing; pin thread env vars; exclude wall-clock from hashes |
|
||||
| 8 | **Hash churn without procedure.** Three distinct historical values of the proof hash exist (`8c0680d7…` ADR-028, `667eb054…` CHANGELOG #560, `f8e76f21…` current file) | cited files | Every regeneration via `--generate-hash` + re-verify + CHANGELOG entry + witness bundle refresh |
|
||||
| 9 | **Aggregation bugs masking accuracy.** Person count clamped to 1 by EMA mapping; eigenvalue path leaking counts up to 10; both invisible to unit tests for months | CHANGELOG #803, #894 | L5 summary gates on `person_count_changes`/count distributions; convergence tests replaying the live loop |
|
||||
@@ -336,7 +336,7 @@ Anyone outside the project must be able to re-run every claimed result:
|
||||
| Machine-readable criterion baseline (`v2/benchmarks/criterion-baseline.json`) + CI comparison job | L2 | §4.2 (numbers currently only in ADR prose) |
|
||||
| Provenance + producer script for `benchmark_baseline.json`; soft-gate job | L5 | §1.3, §4.3 (zero code references today) |
|
||||
| `ruview-cli --ablation mode=auto` wiring + `expected_ablation_<slug>.sha256` (currently placeholders → exit 2) | L3 | ADR-145 implementation status |
|
||||
-| Seeded swarm `evals/` harness + `evals/RESULTS.md` internal leaderboard | L3/L5 | ADR-149 §6, §8 open issues |
+| Seeded swarm `evals/` harness + `evals/RESULTS.md` internal leaderboard | L3/L5 | ADR-171 §6, §8 open issues |
|
||||
| Fix `VERIFY.sh` hardcoded verdict count; reconcile `CLAUDE.md` "7/7" | L1 | §1.2 |
|
||||
| Curated paired room-A/room-B labeled replay set (frozen, SHA-pinned, never trained on) | L3 | ADR-145 §3.2 |
|
||||
| ARM/edge on-device latency validation for the int4 model (x86-only today) | L4 | efficiency-frontier doc ("Pi fleet pending") |
|
||||
@@ -372,8 +372,8 @@ failing test, not a slogan.
|
||||
---
|
||||
|
||||
*All values cited from: `benchmark_baseline.json`, `v2/crates/*/benches/*.rs` (15
|
||||
-files), `docs/adr/ADR-147-benchmark-proof.md`,
-`docs/adr/ADR-149-swarm-benchmarking-evaluation-methodology.md`,
+files), `docs/adr/ADR-168-benchmark-proof.md`,
+`docs/adr/ADR-171-swarm-benchmarking-evaluation-methodology.md`,
|
||||
`docs/adr/ADR-145-ablation-eval-harness-privacy-leakage.md`,
|
||||
`docs/adr/ADR-028-esp32-capability-audit.md`,
|
||||
`docs/adr/ADR-015-public-dataset-training-strategy.md`,
|
||||
|
||||
@@ -15,7 +15,7 @@ validation pass run against the working tree.
|
||||
| [00-system-review.md](00-system-review.md) | Capability audit of the current engine | Signal layer is the deepest asset (`ruvsense/` ≈14.4k lines, 310 in-module tests); the model tier is the emptiest (no trained checkpoint in-tree); the live 20 Hz path is the main integration gap |
|
||||
| [01-sota-landscape-2026.md](01-sota-landscape-2026.md) | Published SOTA per capability axis (web-verified) | Defines the beyond-SOTA bar: 12-row capability → published SOTA → RuView-today → target table; IEEE 802.11bf-2025 is ratified and moves the moat up-stack |
|
||||
| [02-beyond-sota-architecture.md](02-beyond-sota-architecture.md) | Target architecture | 8 pillars (RF foundation encoder + UQ heads, differentiable RF forward model, RF-SLAM×WorldGraph loop, camera→RF distillation, swarm apertures, continual adaptation, deterministic WASM edge, NV fusion) — all landing inside existing crates, no rewrite (per ADR-136 §2.1) |
|
||||
-| [03-benchmark-validation-methodology.md](03-benchmark-validation-methodology.md) | Test/validation/benchmark methodology | 6-layer validation pyramid; 15 criterion bench targets inventoried; `benchmark_baseline.json` is a live-capture anchor, not a criterion baseline; statistical protocol from ADR-149 (≥10 seeds, IQM, bootstrap CIs) |
+| [03-benchmark-validation-methodology.md](03-benchmark-validation-methodology.md) | Test/validation/benchmark methodology | 6-layer validation pyramid; 15 criterion bench targets inventoried; `benchmark_baseline.json` is a live-capture anchor, not a criterion baseline; statistical protocol from ADR-171 (≥10 seeds, IQM, bootstrap CIs) |
|
||||
| [04-optimization-roadmap.md](04-optimization-roadmap.md) | Performance review + 90-day plan | ISTA CIR solver is the dominant latency hazard (~1.1 GFLOP/frame at HE40); exact zero-risk wins identified; WorldGraph grows unboundedly (no eviction) — a real bug-class |
|
||||
|
||||
## Validation results (this session, 2026-06-09)
|
||||
@@ -83,7 +83,7 @@ Correctness post-optimization: `wifi-densepose-signal` 456 tests green;
|
||||
|
||||
1. **"Beyond SOTA" is currently unfalsifiable** without a real-CSI
|
||||
ground-truth benchmark — standing one up (per doc 03's acceptance table
|
||||
-and ADR-149's statistical protocol) is the highest-leverage next step.
+and ADR-171's statistical protocol) is the highest-leverage next step.
|
||||
2. **The path is evolution, not rewrite**: all eight architecture pillars in
|
||||
doc 02 land inside existing crates on the ADR-136 `Stage<I,O>`/`FrameMeta`
|
||||
contract spine.
|
||||
|
||||
+3
-3
@@ -1113,7 +1113,7 @@ The Observatory is an immersive Three.js visualization that renders WiFi sensing
|
||||
|
||||
A pretrained CSI encoder + presence-detection head is published on Hugging Face at [`ruvnet/wifi-densepose-pretrained`](https://huggingface.co/ruvnet/wifi-densepose-pretrained). It was trained on 60,630 frames / 610,615 contrastive triplets (12.2M steps, final loss 0.065) and reports **82.3% held-out temporal-triplet accuracy** (the older "100% presence" figure was measured on a single-class recording and has been retracted) and ~164k embeddings/sec on an Apple M4 Pro.
|
||||
|
||||
-> **Results & proof.** The SOTA 17-keypoint pose model is published separately at [`ruvnet/wifi-densepose-mmfi-pose`](https://huggingface.co/ruvnet/wifi-densepose-mmfi-pose) — **82.69% torso-PCK@20** on MM-Fi (83.59% ensemble + TTA), beating MultiFormer (72.25%) and CSI2Pose (68.41%). Browse the auditable [AetherArena leaderboard Space](https://huggingface.co/spaces/ruvnet/aether-arena), the full [MM-Fi study](benchmarks/mmfi-wifi-sensing-study.md), and the [efficiency frontier](benchmarks/wifi-pose-efficiency-frontier.md). Reproduce the deterministic pipeline proof with `python archive/v1/data/proof/verify.py` (must print `VERDICT: PASS`; see [ADR-147 benchmark proof](adr/ADR-147-benchmark-proof.md) and [WITNESS-LOG-028](WITNESS-LOG-028.md)).
+> **Results & proof.** The SOTA 17-keypoint pose model is published separately at [`ruvnet/wifi-densepose-mmfi-pose`](https://huggingface.co/ruvnet/wifi-densepose-mmfi-pose) — **82.69% torso-PCK@20** on MM-Fi (83.59% ensemble + TTA), beating MultiFormer (72.25%) and CSI2Pose (68.41%). Browse the auditable [AetherArena leaderboard Space](https://huggingface.co/spaces/ruvnet/aether-arena), the full [MM-Fi study](benchmarks/mmfi-wifi-sensing-study.md), and the [efficiency frontier](benchmarks/wifi-pose-efficiency-frontier.md). Reproduce the deterministic pipeline proof with `python archive/v1/data/proof/verify.py` (must print `VERDICT: PASS`; see [ADR-168 benchmark proof](adr/ADR-168-benchmark-proof.md) and [WITNESS-LOG-028](WITNESS-LOG-028.md)).
|
||||
|
||||
What it ships (and what it does not):
|
||||
|
||||
@@ -1289,7 +1289,7 @@ Once trained, the adaptive model runs automatically:
|
||||
RuView integrates [OccWorld](https://github.com/wzzheng/OccWorld) (ECCV 2024) to predict
|
||||
future 3D occupancy from WiFi CSI — extending the Kalman tracker's 5-frame horizon to
|
||||
15 predicted frames (~7 s). See [ADR-147](adr/ADR-147-nvidia-cosmos-world-foundation-model-integration.md)
|
||||
-and the [benchmark proof](adr/ADR-147-benchmark-proof.md) for full details.
+and the [benchmark proof](adr/ADR-168-benchmark-proof.md) for full details.
|
||||
|
||||
**Hardware requirement:** NVIDIA GPU with ≥4 GB VRAM (validated: RTX 5080 at 209 ms / 3.4 GB).
|
||||
|
||||
@@ -1869,7 +1869,7 @@ Pre-trained models are available on HuggingFace:
|
||||
- **SOTA MM-Fi pose model** (82.69% torso-PCK@20) — https://huggingface.co/ruvnet/wifi-densepose-mmfi-pose
|
||||
- **AetherArena leaderboard Space** — https://huggingface.co/spaces/ruvnet/aether-arena
|
||||
|
||||
-Download and start sensing immediately — no datasets, no GPU, no training needed. Results are reproducible via `python archive/v1/data/proof/verify.py` (deterministic SHA-256 proof) — see [ADR-147](adr/ADR-147-benchmark-proof.md).
+Download and start sensing immediately — no datasets, no GPU, no training needed. Results are reproducible via `python archive/v1/data/proof/verify.py` (deterministic SHA-256 proof) — see [ADR-168](adr/ADR-168-benchmark-proof.md).
|
||||
|
||||
### Quick Start with Pre-Trained Models
|
||||
|
||||
|
||||
@@ -367,6 +367,7 @@ static float s_heartrate_bpm;
|
||||
static float s_motion_energy;
|
||||
static float s_presence_score;
|
||||
static bool s_presence_detected;
|
||||
static uint8_t s_presence_below_count; /**< Consecutive frames below low thresh (issue #996). */
|
||||
static bool s_fall_detected;
|
||||
static int8_t s_latest_rssi;
|
||||
static uint32_t s_frame_count;
|
||||
@@ -398,6 +399,11 @@ static uint16_t s_feature_seq;
|
||||
|
||||
/** Multi-person vitals state. */
|
||||
static edge_person_vitals_t s_persons[EDGE_MAX_PERSONS];
|
||||
|
||||
/** Person-count persistence debounce (issue #998). */
|
||||
static uint8_t s_person_count_candidate; /**< Last raw (gated) candidate count. */
|
||||
static uint8_t s_person_count_streak; /**< Consecutive frames at the candidate. */
|
||||
static uint8_t s_person_count_stable; /**< Emitted (debounced) count. */
|
||||
static edge_biquad_t s_person_bq_br[EDGE_MAX_PERSONS];
|
||||
static edge_biquad_t s_person_bq_hr[EDGE_MAX_PERSONS];
|
||||
static float s_person_br_filt[EDGE_MAX_PERSONS][EDGE_PHASE_HISTORY_LEN];
|
||||
@@ -446,6 +452,61 @@ static void update_top_k(uint16_t n_subcarriers)
|
||||
s_top_k_count = k;
|
||||
}
|
||||
|
||||
/* ======================================================================
 * Presence Flag Hysteresis + Debounce (issue #996)
 * ====================================================================== */

/**
 * Schmitt-trigger presence decision with a clear-debounce.
 *
 * Pure function (no globals) so it is host-testable: feed a presence_score
 * trace and assert the boolean flag is stable. Replaces the old single-
 * threshold `score > threshold` compare that chattered when a noisy score
 * dithered around the boundary (observed 2.6-26.7 for one stationary person).
 *
 *   - score > threshold → assert presence (enter immediately)
 *   - score >= threshold * HYST_RATIO → hold current state (dead band)
 *   - score < threshold * HYST_RATIO → count toward clearing; only clear
 *     after CLEAR_FRAMES consecutive frames
 *
 * @param prev         Current presence flag (in/out via return + below_count).
 * @param score        Latest presence score.
 * @param threshold    High (enter) threshold.
 * @param below_count  In/out: consecutive frames the score has been below the
 *                     low threshold. Reset to 0 whenever the score recovers.
 * @return New presence flag.
 */
static bool presence_flag_update(bool prev, float score, float threshold,
                                 uint8_t *below_count)
{
    float low_thresh = threshold * EDGE_PRESENCE_HYST_RATIO;

    if (score > threshold) {
        /* Clearly present — assert and reset the clear debounce. */
        *below_count = 0;
        return true;
    }

    if (score >= low_thresh) {
        /* Dead band: hold whatever we had, no flicker. Recovery above the low
         * threshold also resets the clear debounce so a brief dip doesn't
         * accumulate toward a false clear. */
        *below_count = 0;
        return prev;
    }

    /* Below the low threshold — candidate for clearing. */
    if (*below_count < 0xFF) (*below_count)++;
    if (!prev) {
        return false; /* Already cleared. */
    }
    if (*below_count >= EDGE_PRESENCE_CLEAR_FRAMES) {
        *below_count = 0;
        return false; /* Sustained absence — clear. */
    }
    return true; /* Still within the hold window — keep asserting. */
}
|
||||
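The doc comment above describes `presence_flag_update` as host-testable by replaying a score trace. A minimal sketch of such a replay follows. The two constant values are assumptions chosen for illustration (the real `EDGE_PRESENCE_HYST_RATIO` and `EDGE_PRESENCE_CLEAR_FRAMES` are defined elsewhere in the firmware and are not shown in this diff), the function body is a condensed copy so the file compiles on its own, and `count_flag_transitions` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the firmware headers. */
#define EDGE_PRESENCE_HYST_RATIO   0.5f
#define EDGE_PRESENCE_CLEAR_FRAMES 3

/* Condensed copy of presence_flag_update() so this sketch is self-contained. */
static bool presence_flag_update(bool prev, float score, float threshold,
                                 uint8_t *below_count)
{
    float low_thresh = threshold * EDGE_PRESENCE_HYST_RATIO;
    if (score > threshold)   { *below_count = 0; return true; }
    if (score >= low_thresh) { *below_count = 0; return prev; }
    if (*below_count < 0xFF) (*below_count)++;
    if (!prev) return false;
    if (*below_count >= EDGE_PRESENCE_CLEAR_FRAMES) {
        *below_count = 0;
        return false;
    }
    return true;
}

/* Hypothetical helper: replay a score trace from a cleared state and count
 * how many times the presence flag changes value. */
static unsigned count_flag_transitions(const float *scores, size_t n,
                                       float threshold)
{
    assert(scores != NULL);
    bool flag = false;
    uint8_t below = 0;
    unsigned transitions = 0;
    for (size_t i = 0; i < n; i++) {
        bool next = presence_flag_update(flag, scores[i], threshold, &below);
        if (next != flag) transitions++;
        flag = next;
    }
    return transitions;
}
```

With a threshold of 10 and the assumed ratio of 0.5, a trace that dithers around the boundary (12, 8, 11, 9, 12, 7, 11) changes the flag once, where a plain `score > threshold` compare would toggle on every frame. A trace that drops and stays low (12, 3, 3, 6, 3, 3, 3) changes it twice: the brief recovery to 6 resets the clear counter, so the flag only clears on the third consecutive low frame after it.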
|
||||
/* ======================================================================
|
||||
* Adaptive Presence Calibration
|
||||
* ====================================================================== */
|
||||
@@ -581,6 +642,112 @@ store_prev:
|
||||
* Multi-Person Vitals
|
||||
* ====================================================================== */
|
||||
|
||||
/**
 * Count distinct persons from per-group energy + representative subcarrier (issue #998).
 *
 * Pure function (no globals) so it is host-testable. Each of the `n_groups`
 * subcarrier groups is a *candidate* person. A candidate is counted only if:
 *   1. Energy gate — its energy >= EDGE_PERSON_MIN_ENERGY_RATIO * max energy.
 *      One body's multipath spreads energy unevenly across the
 *      groups; weak groups are reflections, not extra people.
 *   2. Spatial dedup — its representative subcarrier is at least
 *      EDGE_PERSON_MIN_SC_SEP away from every already-counted
 *      person. Adjacent subcarriers see the same reflection, so
 *      a near-duplicate group is the same body.
 *
 * The strongest group is always counted (so a present body yields >= 1).
 *
 * @param energy    Per-group energy (e.g. phase variance), length n_groups.
 * @param sc_idx    Per-group representative subcarrier index, length n_groups.
 * @param n_groups  Number of candidate groups (<= EDGE_MAX_PERSONS).
 * @return Distinct person count in [0, n_groups].
 */
static uint8_t count_distinct_persons(const float *energy, const uint8_t *sc_idx,
                                      uint8_t n_groups)
{
    if (n_groups == 0) return 0;

    /* Strongest group sets the reference energy. */
    float max_energy = 0.0f;
    for (uint8_t g = 0; g < n_groups; g++) {
        if (energy[g] > max_energy) max_energy = energy[g];
    }
    /* No real signal anywhere → no persons. */
    if (max_energy <= 0.0f) return 0;

    float min_energy = max_energy * EDGE_PERSON_MIN_ENERGY_RATIO;

    uint8_t counted_sc[EDGE_MAX_PERSONS];
    uint8_t count = 0;

    /* Greedy by descending energy: take the strongest unclaimed group that is
     * spatially separated from everything already counted. */
    bool used[EDGE_MAX_PERSONS];
    for (uint8_t g = 0; g < n_groups && g < EDGE_MAX_PERSONS; g++) used[g] = false;

    for (uint8_t iter = 0; iter < n_groups && iter < EDGE_MAX_PERSONS; iter++) {
        /* Find the strongest still-unused group above the energy gate. */
        int best = -1;
        float best_e = min_energy; /* must beat the gate */
        for (uint8_t g = 0; g < n_groups && g < EDGE_MAX_PERSONS; g++) {
            if (used[g]) continue;
            if (energy[g] >= best_e) { best_e = energy[g]; best = g; }
        }
        if (best < 0) break; /* nothing left above the gate */
        used[best] = true;

        /* Spatial dedup against already-counted persons. */
        bool duplicate = false;
        for (uint8_t c = 0; c < count; c++) {
            int sep = (int)sc_idx[best] - (int)counted_sc[c];
            if (sep < 0) sep = -sep;
            if (sep < EDGE_PERSON_MIN_SC_SEP) { duplicate = true; break; }
        }
        if (duplicate) continue;

        counted_sc[count++] = sc_idx[best];
    }

    /* The strongest group always represents at least one body. */
    if (count == 0) count = 1;
    return count;
}
|
||||
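The two gates in `count_distinct_persons` (energy ratio, then subcarrier separation) are easiest to see on small inputs. The sketch below is a condensed copy of the function with assumed constant values; `EDGE_MAX_PERSONS`, `EDGE_PERSON_MIN_ENERGY_RATIO` and `EDGE_PERSON_MIN_SC_SEP` are defined elsewhere in the firmware, and the numbers used here are placeholders, not the shipped configuration. The bounds assertion stands in for the per-loop `EDGE_MAX_PERSONS` guards of the original.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration only; not taken from the firmware headers. */
#define EDGE_MAX_PERSONS             4
#define EDGE_PERSON_MIN_ENERGY_RATIO 0.5f
#define EDGE_PERSON_MIN_SC_SEP       4

/* Condensed copy of count_distinct_persons() so this sketch is self-contained. */
static uint8_t count_distinct_persons(const float *energy, const uint8_t *sc_idx,
                                      uint8_t n_groups)
{
    assert(n_groups <= EDGE_MAX_PERSONS);
    if (n_groups == 0) return 0;

    float max_energy = 0.0f;
    for (uint8_t g = 0; g < n_groups; g++)
        if (energy[g] > max_energy) max_energy = energy[g];
    if (max_energy <= 0.0f) return 0;

    float min_energy = max_energy * EDGE_PERSON_MIN_ENERGY_RATIO;
    uint8_t counted_sc[EDGE_MAX_PERSONS];
    bool used[EDGE_MAX_PERSONS] = {false};
    uint8_t count = 0;

    for (uint8_t iter = 0; iter < n_groups; iter++) {
        /* Strongest unused group that still passes the energy gate. */
        int best = -1;
        float best_e = min_energy;
        for (uint8_t g = 0; g < n_groups; g++) {
            if (!used[g] && energy[g] >= best_e) { best_e = energy[g]; best = g; }
        }
        if (best < 0) break;
        used[best] = true;

        /* Reject it if it sits too close to an already-counted subcarrier. */
        bool duplicate = false;
        for (uint8_t c = 0; c < count; c++) {
            int sep = (int)sc_idx[best] - (int)counted_sc[c];
            if (sep < 0) sep = -sep;
            if (sep < EDGE_PERSON_MIN_SC_SEP) { duplicate = true; break; }
        }
        if (!duplicate) counted_sc[count++] = sc_idx[best];
    }
    return count ? count : 1;
}
```

Under these assumed constants, energies {10, 2, 9} at subcarriers {10, 30, 12} yield one person: the group at subcarrier 12 is within 4 subcarriers of the strongest group, and the group at subcarrier 30 falls below the energy gate of 5. Raising the middle energy to 8 yields two, because that group now passes the gate and is 20 subcarriers away from the first.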
|
||||
/**
 * Debounce a raw person count so a single noisy frame can't change the emitted
 * value (issue #998). A new candidate must hold for EDGE_PERSON_PERSIST_FRAMES
 * consecutive frames before it replaces the stable count.
 *
 * Pure function (state passed by pointer) → host-testable.
 *
 * @param raw        Raw (gated) count this frame.
 * @param candidate  In/out: the candidate being accumulated.
 * @param streak     In/out: consecutive frames the candidate has held.
 * @param stable     In/out: the currently emitted count.
 * @return The (possibly updated) stable count.
 */
static uint8_t person_count_debounce(uint8_t raw, uint8_t *candidate,
                                     uint8_t *streak, uint8_t *stable)
{
    if (raw == *stable) {
        /* Agrees with what we emit — reset any pending change. */
        *candidate = raw;
        *streak = 0;
        return *stable;
    }
    if (raw == *candidate) {
        if (*streak < 0xFF) (*streak)++;
    } else {
        *candidate = raw;
        *streak = 1;
    }
    if (*streak >= EDGE_PERSON_PERSIST_FRAMES) {
        *stable = *candidate;
        *streak = 0;
    }
    return *stable;
}
|
||||
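Because the debounce state is passed by pointer, the function can be replayed on the host against a sequence of raw counts. The sketch below does that. The value of `EDGE_PERSON_PERSIST_FRAMES` is an assumption for illustration (the real constant is defined elsewhere in the firmware), the function body is copied so the file compiles on its own, and `replay_person_counts` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration only; not taken from the firmware headers. */
#define EDGE_PERSON_PERSIST_FRAMES 3

/* Copy of person_count_debounce() so this sketch is self-contained. */
static uint8_t person_count_debounce(uint8_t raw, uint8_t *candidate,
                                     uint8_t *streak, uint8_t *stable)
{
    if (raw == *stable) {
        *candidate = raw;
        *streak = 0;
        return *stable;
    }
    if (raw == *candidate) {
        if (*streak < 0xFF) (*streak)++;
    } else {
        *candidate = raw;
        *streak = 1;
    }
    if (*streak >= EDGE_PERSON_PERSIST_FRAMES) {
        *stable = *candidate;
        *streak = 0;
    }
    return *stable;
}

/* Hypothetical helper: replay raw counts from a cold start (all state zero)
 * and record the count emitted on each frame. */
static void replay_person_counts(const uint8_t *raw, size_t n, uint8_t *emitted)
{
    assert(raw != NULL && emitted != NULL);
    uint8_t candidate = 0, streak = 0, stable = 0;
    for (size_t i = 0; i < n; i++)
        emitted[i] = person_count_debounce(raw[i], &candidate, &streak, &stable);
}
```

With the assumed persistence of 3 frames, the raw sequence 1, 1, 1, 2, 1, 2, 2, 2 emits 0, 0, 1, 1, 1, 1, 1, 2: the isolated 2 at the fourth frame is discarded because the next frame agrees with the stable count and resets the streak, and the count only moves to 2 after three consecutive frames report it.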
|
||||
/**
|
||||
* Update multi-person vitals by assigning top-K subcarriers to person groups.
|
||||
*
|
||||
@@ -600,10 +767,25 @@ static void update_multi_person_vitals(const uint8_t *iq_data, uint16_t n_sc,
|
||||
|
||||
uint8_t subs_per_person = s_top_k_count / n_persons;
|
||||
|
||||
/* Per-group energy + representative subcarrier, for the #998 person gate. */
|
||||
float group_energy[EDGE_MAX_PERSONS] = {0};
|
||||
uint8_t group_sc[EDGE_MAX_PERSONS] = {0};
|
||||
|
||||
for (uint8_t p = 0; p < n_persons; p++) {
|
||||
edge_person_vitals_t *pv = &s_persons[p];
|
||||
pv->active = true;
|
||||
pv->subcarrier_idx = s_top_k[p * subs_per_person];
|
||||
group_sc[p] = s_top_k[p * subs_per_person];
|
||||
|
||||
/* Group energy = max Welford variance over its subcarriers. This is the
|
||||
* same variance used for top-K selection, so a multipath group (weak,
|
||||
* adjacent to the strong one) registers low energy and gets gated out. */
|
||||
float energy = 0.0f;
|
||||
for (uint8_t s = 0; s < subs_per_person; s++) {
|
||||
uint8_t sc = s_top_k[p * subs_per_person + s];
|
||||
float v = (float)welford_variance(&s_subcarrier_var[sc]);
|
||||
if (v > energy) energy = v;
|
||||
}
|
||||
group_energy[p] = energy;
|
||||
|
||||
/* Average phase across this person's subcarrier group. */
|
||||
float avg_phase = 0.0f;
|
||||
@@ -662,10 +844,32 @@ static void update_multi_person_vitals(const uint8_t *iq_data, uint16_t n_sc,
|
||||
}
|
||||
}
|
||||
|
||||
/* Mark remaining persons as inactive. */
|
||||
for (uint8_t p = n_persons; p < EDGE_MAX_PERSONS; p++) {
|
||||
/* --- Issue #998: gate phantom persons by energy + spatial dedup,
|
||||
* then debounce so a single noisy frame can't change the count. --- */
|
||||
uint8_t raw_count = count_distinct_persons(group_energy, group_sc, n_persons);
|
||||
uint8_t stable_count = person_count_debounce(raw_count,
|
||||
&s_person_count_candidate,
|
||||
&s_person_count_streak,
|
||||
&s_person_count_stable);
|
||||
|
||||
/* Mark the strongest `stable_count` groups active (descending energy); the
|
||||
* rest — including phantom multipath groups — are inactive. */
|
||||
bool used[EDGE_MAX_PERSONS];
|
||||
for (uint8_t p = 0; p < EDGE_MAX_PERSONS; p++) {
|
||||
used[p] = false;
|
||||
s_persons[p].active = false;
|
||||
}
|
||||
for (uint8_t n = 0; n < stable_count && n < n_persons; n++) {
|
||||
int best = -1;
|
||||
float best_e = -1.0f;
|
||||
for (uint8_t p = 0; p < n_persons; p++) {
|
||||
if (used[p]) continue;
|
||||
if (group_energy[p] > best_e) { best_e = group_energy[p]; best = p; }
|
||||
}
|
||||
if (best < 0) break;
|
||||
used[best] = true;
|
||||
s_persons[best].active = true;
|
||||
}
|
||||
}
|
||||
|
||||
/* ======================================================================
|
||||
@@ -960,7 +1164,12 @@ static void process_frame(const edge_ring_slot_t *slot)
|
||||
} else if (threshold == 0.0f) {
|
||||
threshold = 0.05f; /* Default until calibrated. */
|
||||
}
|
||||
s_presence_detected = (s_presence_score > threshold);
|
||||
/* Issue #996: hysteresis + clear-debounce instead of a bare threshold
|
||||
* compare, so a noisy score dithering around the boundary doesn't flicker
|
||||
* the boolean flag. */
|
||||
s_presence_detected = presence_flag_update(s_presence_detected,
|
||||
s_presence_score, threshold,
|
||||
&s_presence_below_count);
|
||||
|
||||
/* --- Step 10: Fall detection (phase acceleration + debounce, issue #263) --- */
|
||||
if (s_history_len >= 3) {
|
||||
@@ -1160,6 +1369,7 @@ esp_err_t edge_processing_init(const edge_config_t *cfg)
|
||||
s_motion_energy = 0.0f;
|
||||
s_presence_score = 0.0f;
|
||||
s_presence_detected = false;
|
||||
s_presence_below_count = 0;
|
||||
s_fall_detected = false;
|
||||
s_latest_rssi = 0;
|
||||
s_frame_count = 0;
|
||||
@@ -1183,6 +1393,9 @@ esp_err_t edge_processing_init(const edge_config_t *cfg)
|
||||
for (uint8_t p = 0; p < EDGE_MAX_PERSONS; p++) {
|
||||
s_persons[p].active = false;
|
||||
}
|
||||
s_person_count_candidate = 0;
|
||||
s_person_count_streak = 0;
|
||||
s_person_count_stable = 0;
|
||||
|
||||
/* Design biquad bandpass filters.
|
||||
* Sampling rate ~20 Hz (typical ESP32 CSI callback rate). */
|
||||
|
||||
@@ -38,6 +38,30 @@
|
||||
/* ---- Multi-person ---- */
|
||||
#define EDGE_MAX_PERSONS 4 /**< Max simultaneous persons. */
|
||||
|
||||
/* ---- Multi-person counting gates (issue #998) ----
 *
 * Over-counting root cause: the multi-person path used to split the top-K
 * subcarriers into EDGE_MAX_PERSONS groups and mark EVERY group active,
 * so one body's multipath always reported the full EDGE_MAX_PERSONS. These
 * gates promote a subcarrier group to a real "person" only when it carries
 * genuine, distinct, persistent energy:
 *
 *   1. Energy gate   — a group's phase variance must exceed a fraction of the
 *                      strongest group's variance, else it is multipath/noise.
 *   2. Spatial dedup — two groups whose representative subcarriers sit within
 *                      EDGE_PERSON_MIN_SC_SEP of each other are the same body
 *                      (adjacent subcarriers see correlated reflections), so
 *                      the weaker one is merged away.
 *   3. Persistence   — a candidate count must hold for EDGE_PERSON_PERSIST_FRAMES
 *                      consecutive decisions before it is emitted, so a single
 *                      noisy frame cannot promote a phantom person.
 *
 * These are robustness gates on the existing heuristic, not a calibrated
 * occupancy model — true count accuracy vs ground truth remains data-gated. */
#define EDGE_PERSON_MIN_ENERGY_RATIO 0.35f /**< Group var must be >= this * max group var to count. */
#define EDGE_PERSON_MIN_SC_SEP       4     /**< Min subcarrier separation between distinct persons. */
#define EDGE_PERSON_PERSIST_FRAMES   3     /**< Consecutive decisions a count must hold before emit. */
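
The energy gate and spatial dedup described above reduce to two comparisons. The following is a minimal sketch, assuming the constant values shown in the defines; the `sketch_*` names are hypothetical and not part of the firmware, which performs these checks inline in `count_distinct_persons()`.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values mirror the defines above; redefined so the sketch stands alone. */
#define SKETCH_MIN_ENERGY_RATIO 0.35f
#define SKETCH_MIN_SC_SEP       4

/* Hypothetical helper: does a group's variance pass the energy gate,
 * relative to the strongest group's variance? */
static bool sketch_passes_energy_gate(float group_var, float max_group_var)
{
    return group_var >= max_group_var * SKETCH_MIN_ENERGY_RATIO;
}

/* Hypothetical helper: are two representative subcarriers close enough
 * to be treated as reflections of the same body? */
static bool sketch_same_body(uint8_t sc_a, uint8_t sc_b)
{
    int sep = (int)sc_a - (int)sc_b;
    if (sep < 0) sep = -sep;
    return sep < SKETCH_MIN_SC_SEP;
}
```

With these values, a group at variance 4.0 against a strongest group of 10.0 passes the gate (the cut-off is 3.5), while a group at 3.0 does not. Representative subcarriers 20 and 23 merge into one body; 20 and 24 count as distinct, because the dedup comparison is strict (`sep < 4`).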
|
||||
|
||||
/* ---- Calibration ---- */
|
||||
#define EDGE_CALIB_FRAMES 1200 /**< Frames for adaptive calibration (~60s at 20 Hz). */
|
||||
#define EDGE_CALIB_SIGMA_MULT 3.0f /**< Threshold = mean + 3*sigma of ambient. */
|
||||
@@ -46,6 +70,27 @@
|
||||
#define EDGE_FALL_COOLDOWN_MS 5000 /**< Minimum ms between fall alerts (debounce). */
|
||||
#define EDGE_FALL_CONSEC_MIN 3 /**< Consecutive frames above threshold to trigger. */
|
||||
|
||||
/* ---- Presence flag hysteresis + debounce (issue #996) ----
 *
 * Flicker root cause: the presence flag was a single-threshold compare on a
 * noisy presence_score (observed 2.6-26.7 frame-to-frame for one stationary
 * person), so the boolean chattered at the boundary even while the score
 * clearly indicated a person. Fix: Schmitt-trigger hysteresis plus a clear
 * debounce.
 *
 *   - Assert presence when score > threshold (enter immediately).
 *   - Hold presence while score >= threshold * HYST_RATIO (no flicker in the
 *     gap band).
 *   - Clear presence only after the score stays below the low threshold for
 *     EDGE_PRESENCE_CLEAR_FRAMES consecutive frames (genuine departure).
 *
 * HYST_RATIO < 1.0 sets the low threshold below the high threshold; a wider gap
 * (smaller ratio) is more flicker-immune but slower to clear on real exit. The
 * exact ratio that best matches a given room's score scale remains an on-device
 * tuning parameter — this removes the logic bug (no hysteresis at all). */
#define EDGE_PRESENCE_HYST_RATIO   0.5f /**< Low thresh = HYST_RATIO * high thresh. */
#define EDGE_PRESENCE_CLEAR_FRAMES 5    /**< Frames below low thresh before clearing. */
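
The persistence and clear-debounce windows are expressed in frames, so the latency they add depends on the frame rate. The following is a rough sketch of the conversion, assuming the ~20 Hz CSI rate mentioned in the firmware comments and one decision per frame; `sketch_frames_to_ms` and `SKETCH_FRAME_RATE_HZ` are hypothetical names, not firmware code.

```c
#include <stdint.h>

/* Assumption: nominal CSI frame rate, taken from the "~20 Hz" firmware comments. */
#define SKETCH_FRAME_RATE_HZ 20.0f

/* Hypothetical helper: time spanned by `frames` consecutive frames, in ms. */
static float sketch_frames_to_ms(uint8_t frames, float rate_hz)
{
    return 1000.0f * (float)frames / rate_hz;
}
```

Under those assumptions, EDGE_PRESENCE_CLEAR_FRAMES = 5 corresponds to roughly 250 ms of sustained low score before the flag clears, and EDGE_PERSON_PERSIST_FRAMES = 3 to roughly 150 ms before a new count is emitted. Both figures scale inversely with the actual callback rate.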
|
||||
|
||||
/* ---- DSP task tuning ---- */
|
||||
#define EDGE_BATCH_LIMIT 4 /**< Max frames per batch before longer yield. */
|
||||
|
||||
|
||||
@@ -43,9 +43,10 @@ MAIN_DIR = ../main
|
||||
FUZZ_DURATION ?= 30
|
||||
FUZZ_JOBS ?= 1
|
||||
|
||||
.PHONY: all clean run_serialize run_edge run_nvs run_all test_adr110 run_adr110 host_tests
|
||||
.PHONY: all clean run_serialize run_edge run_nvs run_all test_adr110 run_adr110 \
|
||||
test_vitals run_vitals host_tests
|
||||
|
||||
all: fuzz_serialize fuzz_edge fuzz_nvs test_adr110
|
||||
all: fuzz_serialize fuzz_edge fuzz_nvs test_adr110 test_vitals
|
||||
|
||||
# --- ADR-110 encoding unit tests ---
|
||||
# Host-side, no libFuzzer needed — plain C99 deterministic table tests
|
||||
@@ -57,8 +58,19 @@ test_adr110: test_adr110_encoding.c
|
||||
run_adr110: test_adr110
|
||||
./test_adr110
|
||||
|
||||
host_tests: run_adr110
|
||||
@echo "ADR-110 host tests passed"
|
||||
# --- Vitals count + presence logic unit tests (issue #998 / #996) ---
|
||||
# Host-side, no libFuzzer. Pins the person-count gate (no over-count for one
|
||||
# body) and the presence hysteresis (no flicker on a dithering score). Pulls
|
||||
# the named tuning constants from ../main/edge_processing.h so the test and the
|
||||
# firmware can never disagree on thresholds.
|
||||
test_vitals: test_vitals_count_presence.c $(MAIN_DIR)/edge_processing.h
|
||||
cc -std=c99 -Wall -Wextra -Istubs -I$(MAIN_DIR) -o $@ $< -lm
|
||||
|
||||
run_vitals: test_vitals
|
||||
./test_vitals
|
||||
|
||||
host_tests: run_adr110 run_vitals
|
||||
@echo "Host tests passed (ADR-110 + vitals #998/#996)"
|
||||
|
||||
# --- Serialize fuzzer ---
|
||||
# Tests csi_serialize_frame() with random wifi_csi_info_t inputs.
|
||||
@@ -94,5 +106,5 @@ run_nvs: fuzz_nvs
|
||||
run_all: run_serialize run_edge run_nvs
|
||||
|
||||
clean:
|
||||
rm -f fuzz_serialize fuzz_edge fuzz_nvs test_adr110
|
||||
rm -f fuzz_serialize fuzz_edge fuzz_nvs test_adr110 test_vitals
|
||||
rm -rf corpus_serialize/ corpus_edge/ corpus_nvs/
|
||||
|
||||
@@ -0,0 +1,387 @@
|
||||
/**
|
||||
* @file test_vitals_count_presence.c
|
||||
* @brief Host-side unit tests for the issue #998 / #996 vitals logic fixes.
|
||||
*
|
||||
* Covers three pure decision functions extracted from edge_processing.c:
|
||||
* 1. count_distinct_persons() — issue #998 person over-count gate
|
||||
* (energy gate + spatial dedup).
|
||||
* 2. person_count_debounce() — issue #998 count persistence debounce.
|
||||
* 3. presence_flag_update() — issue #996 presence hysteresis + clear
|
||||
* debounce (Schmitt trigger).
|
||||
*
|
||||
* Build (Linux/macOS/Windows with any C99 compiler):
|
||||
* cc -std=c99 -Wall -Wextra -Istubs -I../main -o test_vitals \
|
||||
* test_vitals_count_presence.c && ./test_vitals
|
||||
*
|
||||
* Exits 0 on all-pass, prints which assertion failed otherwise.
|
||||
*
|
||||
* Why a separate host test file: these are deterministic logic checks for the
|
||||
* exact boundary behaviour the issues describe; libFuzzer adds no signal here.
|
||||
*
|
||||
* IMPORTANT — these three functions are copied VERBATIM from
|
||||
* firmware/esp32-csi-node/main/edge_processing.c. They are pure (no globals,
|
||||
* no ESP-IDF). If the firmware copy changes, update the copy here and re-run
|
||||
* this test before the firmware change merges. The named tuning constants are
|
||||
* pulled from the real header so the test and firmware can never disagree on
|
||||
* thresholds.
|
||||
*
|
||||
* HARDWARE-GATED CAVEAT: these tests pin the *logic* (no flicker / no
|
||||
* over-count for the synthetic traces). True count accuracy and the exact
|
||||
* energy/separation/hysteresis thresholds that best match a real room vs
|
||||
* labelled ground truth remain hardware- and data-gated (COM9 ESP32-S3 +
|
||||
* labelled occupancy). This is a robustness/logic fix, not a validated
|
||||
* accuracy claim.
|
||||
*/
|
||||
|
||||
#include <stdint.h>
|
||||
#include <stdbool.h>
|
||||
#include <stdio.h>
|
||||
|
||||
/* Named tuning constants come from the real firmware header so the test can
|
||||
* never silently diverge from the constants the firmware compiles with. */
|
||||
#include "edge_processing.h"
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* System under test — copied VERBATIM from edge_processing.c.
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
/* count_distinct_persons() — issue #998 energy gate + spatial dedup. */
|
||||
static uint8_t count_distinct_persons(const float *energy, const uint8_t *sc_idx,
|
||||
uint8_t n_groups)
|
||||
{
|
||||
if (n_groups == 0) return 0;
|
||||
|
||||
float max_energy = 0.0f;
|
||||
for (uint8_t g = 0; g < n_groups; g++) {
|
||||
if (energy[g] > max_energy) max_energy = energy[g];
|
||||
}
|
||||
if (max_energy <= 0.0f) return 0;
|
||||
|
||||
float min_energy = max_energy * EDGE_PERSON_MIN_ENERGY_RATIO;
|
||||
|
||||
uint8_t counted_sc[EDGE_MAX_PERSONS];
|
||||
uint8_t count = 0;
|
||||
|
||||
bool used[EDGE_MAX_PERSONS];
|
||||
for (uint8_t g = 0; g < n_groups && g < EDGE_MAX_PERSONS; g++) used[g] = false;
|
||||
|
||||
for (uint8_t iter = 0; iter < n_groups && iter < EDGE_MAX_PERSONS; iter++) {
|
||||
int best = -1;
|
||||
float best_e = min_energy;
|
||||
for (uint8_t g = 0; g < n_groups && g < EDGE_MAX_PERSONS; g++) {
|
||||
if (used[g]) continue;
|
||||
if (energy[g] >= best_e) { best_e = energy[g]; best = g; }
|
||||
}
|
||||
if (best < 0) break;
|
||||
used[best] = true;
|
||||
|
||||
bool duplicate = false;
|
||||
for (uint8_t c = 0; c < count; c++) {
|
||||
int sep = (int)sc_idx[best] - (int)counted_sc[c];
|
||||
if (sep < 0) sep = -sep;
|
||||
if (sep < EDGE_PERSON_MIN_SC_SEP) { duplicate = true; break; }
|
||||
}
|
||||
if (duplicate) continue;
|
||||
|
||||
counted_sc[count++] = sc_idx[best];
|
||||
}
|
||||
|
||||
if (count == 0) count = 1;
|
||||
return count;
|
||||
}
|
||||
|
||||
/* person_count_debounce() — issue #998 count persistence. */
|
||||
static uint8_t person_count_debounce(uint8_t raw, uint8_t *candidate,
|
||||
uint8_t *streak, uint8_t *stable)
|
||||
{
|
||||
if (raw == *stable) {
|
||||
*candidate = raw;
|
||||
*streak = 0;
|
||||
return *stable;
|
||||
}
|
||||
if (raw == *candidate) {
|
||||
if (*streak < 0xFF) (*streak)++;
|
||||
} else {
|
||||
*candidate = raw;
|
||||
*streak = 1;
|
||||
}
|
||||
if (*streak >= EDGE_PERSON_PERSIST_FRAMES) {
|
||||
*stable = *candidate;
|
||||
*streak = 0;
|
||||
}
|
||||
return *stable;
|
||||
}
|
||||
|
||||
/* presence_flag_update() — issue #996 hysteresis + clear debounce. */
|
||||
static bool presence_flag_update(bool prev, float score, float threshold,
|
||||
uint8_t *below_count)
|
||||
{
|
||||
float low_thresh = threshold * EDGE_PRESENCE_HYST_RATIO;
|
||||
|
||||
if (score > threshold) {
|
||||
*below_count = 0;
|
||||
return true;
|
||||
}
|
||||
|
||||
if (score >= low_thresh) {
|
||||
*below_count = 0;
|
||||
return prev;
|
||||
}
|
||||
|
||||
if (*below_count < 0xFF) (*below_count)++;
|
||||
if (!prev) {
|
||||
return false;
|
||||
}
|
||||
if (*below_count >= EDGE_PRESENCE_CLEAR_FRAMES) {
|
||||
*below_count = 0;
|
||||
return false;
|
||||
}
|
||||
return true;
|
||||
}
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* Test harness
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
static int g_failed = 0;
|
||||
static int g_passed = 0;
|
||||
|
||||
#define CHECK_EQ_U8(label, got, expected) do { \
|
||||
if ((uint8_t)(got) == (uint8_t)(expected)) { g_passed++; } \
|
||||
else { \
|
||||
g_failed++; \
|
||||
printf("FAIL: %s — got=%u expected=%u\n", \
|
||||
(label), (unsigned)(uint8_t)(got), \
|
||||
(unsigned)(uint8_t)(expected)); \
|
||||
} \
|
||||
} while (0)
|
||||
|
||||
#define CHECK_TRUE(label, cond) do { \
|
||||
if (cond) { g_passed++; } \
|
||||
else { g_failed++; printf("FAIL: %s — expected true\n", (label)); } \
|
||||
} while (0)
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* #998 — count_distinct_persons: single body must NOT report EDGE_MAX_PERSONS
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
/* One strong signature + weak multipath echoes in adjacent subcarrier groups.
|
||||
* This is exactly the field report: one person ~50 cm → persons=4. The energy
|
||||
* gate + spatial dedup must collapse this to 1. */
|
||||
static void test_count_single_strong_signature(void)
|
||||
{
|
||||
/* 4 groups: one dominant, three weak multipath (below the energy gate),
|
||||
* representative subcarriers clustered (adjacent → one body). */
|
||||
float energy[EDGE_MAX_PERSONS] = {10.0f, 0.6f, 0.4f, 0.3f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {20, 21, 22, 23};
|
||||
CHECK_EQ_U8("single strong signature → 1",
|
||||
count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 1);
|
||||
}
|
||||
|
||||
/* Even if the weak echoes are spatially spread, they're still below the energy
|
||||
* gate, so they don't count. */
|
||||
static void test_count_single_spread_multipath(void)
|
||||
{
|
||||
float energy[EDGE_MAX_PERSONS] = {10.0f, 1.0f, 0.8f, 0.5f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {10, 40, 70, 100};
|
||||
CHECK_EQ_U8("single body spread multipath → 1",
|
||||
count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 1);
|
||||
}
|
||||
|
||||
/* Two genuine, well-separated, comparably-strong signatures → 2. */
|
||||
static void test_count_two_well_separated(void)
|
||||
{
|
||||
float energy[EDGE_MAX_PERSONS] = {10.0f, 9.0f, 0.3f, 0.2f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {10, 90, 11, 12};
|
||||
CHECK_EQ_U8("two well-separated strong → 2",
|
||||
count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 2);
|
||||
}
|
||||
|
||||
/* Two strong but spatially ADJACENT signatures collapse to 1 (same body):
|
||||
* spatial dedup prevents double-counting one person's two strong subcarriers. */
|
||||
static void test_count_two_strong_adjacent_dedup(void)
|
||||
{
|
||||
float energy[EDGE_MAX_PERSONS] = {10.0f, 9.0f, 0.3f, 0.2f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {20, 21, 60, 61}; /* 20 & 21 adjacent */
|
||||
CHECK_EQ_U8("two strong but adjacent → 1 (dedup)",
|
||||
count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 1);
|
||||
}
|
||||
|
||||
/* No signal at all → 0 persons (empty room). */
|
||||
static void test_count_no_signal(void)
|
||||
{
|
||||
float energy[EDGE_MAX_PERSONS] = {0.0f, 0.0f, 0.0f, 0.0f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {10, 30, 50, 70};
|
||||
CHECK_EQ_U8("no signal → 0", count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 0);
|
||||
}
|
||||
|
||||
/* Three genuine well-separated strong signatures → 3 (gate doesn't under-count). */
|
||||
static void test_count_three_well_separated(void)
|
||||
{
|
||||
float energy[EDGE_MAX_PERSONS] = {10.0f, 9.0f, 8.0f, 0.2f};
|
||||
uint8_t sc[EDGE_MAX_PERSONS] = {10, 50, 90, 11};
|
||||
CHECK_EQ_U8("three well-separated strong → 3",
|
||||
count_distinct_persons(energy, sc, EDGE_MAX_PERSONS), 3);
|
||||
}
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* #998 — person_count_debounce: a single noisy frame can't change the count
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
static void test_debounce_rejects_transient_spike(void)
|
||||
{
|
||||
uint8_t candidate = 1, streak = 0, stable = 1; /* settled on 1 person */
|
||||
|
||||
/* One spurious frame reports 4 — must NOT promote. */
|
||||
uint8_t out = person_count_debounce(4, &candidate, &streak, &stable);
|
||||
CHECK_EQ_U8("transient spike held at 1", out, 1);
|
||||
|
||||
/* Back to 1 — resets pending change. */
|
||||
out = person_count_debounce(1, &candidate, &streak, &stable);
|
||||
CHECK_EQ_U8("recovered to 1", out, 1);
|
||||
CHECK_EQ_U8("streak reset", streak, 0);
|
||||
}
|
||||
|
||||
static void test_debounce_accepts_sustained_change(void)
|
||||
{
|
||||
uint8_t candidate = 1, streak = 0, stable = 1;
|
||||
|
||||
uint8_t out = 1;
|
||||
/* A genuine 2-person arrival must hold EDGE_PERSON_PERSIST_FRAMES frames. */
|
||||
for (int i = 0; i < EDGE_PERSON_PERSIST_FRAMES; i++) {
|
||||
out = person_count_debounce(2, &candidate, &streak, &stable);
|
||||
}
|
||||
CHECK_EQ_U8("sustained 2 promoted", out, 2);
|
||||
CHECK_EQ_U8("stable now 2", stable, 2);
|
||||
}
|
||||
|
||||
/* A flapping count (2,1,2,1,...) never accumulates a streak → stays at stable. */
|
||||
static void test_debounce_flapping_stays_stable(void)
|
||||
{
|
||||
uint8_t candidate = 1, streak = 0, stable = 1;
|
||||
uint8_t out = 1;
|
||||
for (int i = 0; i < 10; i++) {
|
||||
out = person_count_debounce((i & 1) ? 1 : 2, &candidate, &streak, &stable);
|
||||
}
|
||||
CHECK_EQ_U8("flapping count stays at 1", out, 1);
|
||||
}
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* #996 — presence_flag_update: dithering score must NOT flicker the flag
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
/* Field trace dithers around the OLD single threshold while the person is
|
||||
* clearly present. With T_high=10, T_low=5, a score sequence that crosses 10
|
||||
* up and down must produce a STABLE flag (no per-frame flicker). */
|
||||
static void test_presence_no_flicker_on_dither(void)
|
||||
{
|
||||
const float threshold = 10.0f; /* high threshold */
|
||||
/* Observed-style trace (issue evidence: 2.6-26.7), but here we model the
|
||||
* realistic "person present" case where the score mostly sits in/above the
|
||||
* dead band and only briefly dips. */
|
||||
float trace[] = {5.6f, 23.0f, 6.8f, 12.0f, 8.0f, 26.7f, 7.0f, 11.0f, 9.0f, 24.0f};
|
||||
int n = (int)(sizeof(trace) / sizeof(trace[0]));
|
||||
|
||||
bool flag = false;
|
||||
uint8_t below = 0;
|
||||
int flips = 0;
|
||||
bool prev = flag;
|
||||
for (int i = 0; i < n; i++) {
|
||||
flag = presence_flag_update(flag, trace[i], threshold, &below);
|
||||
if (i > 0 && flag != prev) flips++;
|
||||
prev = flag;
|
||||
}
|
||||
/* The first sample (5.6) is >= T_low (5), so it sits in the dead band and holds the
|
||||
* initial false until 23.0 asserts. After that, dips to 6.8/8.0/7.0/9.0 are
|
||||
* all >= T_low (5), so they HOLD true. The only transition is the initial
|
||||
* false→true. No flicker. */
|
||||
CHECK_TRUE("presence asserted by end", flag);
|
||||
CHECK_TRUE("at most one transition (no flicker)", flips <= 1);
|
||||
}
|
||||
|
||||
/* Hard dither straddling T_low must still not flicker frame-to-frame because of
|
||||
* the clear debounce: brief sub-T_low dips don't immediately clear. */
|
||||
static void test_presence_clear_debounce_holds(void)
|
||||
{
|
||||
const float threshold = 10.0f; /* T_low = 5.0 */
|
||||
bool flag = false;
|
||||
uint8_t below = 0;
|
||||
|
||||
/* Assert. */
|
||||
flag = presence_flag_update(flag, 20.0f, threshold, &below);
|
||||
CHECK_TRUE("asserted on strong score", flag);
|
||||
|
||||
/* A few brief dips below T_low (< CLEAR_FRAMES) must NOT clear. */
|
||||
for (int i = 0; i < EDGE_PRESENCE_CLEAR_FRAMES - 1; i++) {
|
||||
flag = presence_flag_update(flag, 1.0f, threshold, &below);
|
||||
}
|
||||
CHECK_TRUE("brief dips below T_low still present", flag);
|
||||
|
||||
/* Recovery resets the debounce. */
|
||||
flag = presence_flag_update(flag, 20.0f, threshold, &below);
|
||||
CHECK_TRUE("recovered", flag);
|
||||
CHECK_EQ_U8("below_count reset on recovery", below, 0);
|
||||
}
|
||||
|
||||
/* A genuine departure (score drops and STAYS low) clears once the score has been below T_low for EDGE_PRESENCE_CLEAR_FRAMES consecutive frames. */
|
||||
static void test_presence_genuine_departure_clears(void)
|
||||
{
|
||||
const float threshold = 10.0f;
|
||||
bool flag = false;
|
||||
uint8_t below = 0;
|
||||
|
||||
flag = presence_flag_update(flag, 20.0f, threshold, &below);
|
||||
CHECK_TRUE("asserted", flag);
|
||||
|
||||
/* Person leaves: score stays well below T_low for CLEAR_FRAMES frames. */
|
||||
for (int i = 0; i < EDGE_PRESENCE_CLEAR_FRAMES; i++) {
|
||||
flag = presence_flag_update(flag, 0.5f, threshold, &below);
|
||||
}
|
||||
CHECK_TRUE("cleared after sustained low", !flag);
|
||||
}
|
||||
|
||||
/* Schmitt gap: a score in the dead band (between T_low and T_high) holds state,
|
||||
* it neither asserts from false nor clears from true. */
|
||||
static void test_presence_dead_band_holds_state(void)
|
||||
{
|
||||
const float threshold = 10.0f; /* dead band 5..10 */
|
||||
uint8_t below = 0;
|
||||
|
||||
/* From false, a dead-band score does not assert. */
|
||||
bool flag = presence_flag_update(false, 7.0f, threshold, &below);
|
||||
CHECK_TRUE("dead band does not assert from false", !flag);
|
||||
|
||||
/* From true, a dead-band score does not clear. */
|
||||
below = 0;
|
||||
flag = presence_flag_update(true, 7.0f, threshold, &below);
|
||||
CHECK_TRUE("dead band does not clear from true", flag);
|
||||
}
|
||||
|
||||
/* ──────────────────────────────────────────────────────────────────────
|
||||
* main
|
||||
* ────────────────────────────────────────────────────────────────────── */
|
||||
|
||||
int main(void)
|
||||
{
|
||||
/* #998 person count gate */
|
||||
test_count_single_strong_signature();
|
||||
test_count_single_spread_multipath();
|
||||
test_count_two_well_separated();
|
||||
test_count_two_strong_adjacent_dedup();
|
||||
test_count_no_signal();
|
||||
test_count_three_well_separated();
|
||||
|
||||
/* #998 count debounce */
|
||||
test_debounce_rejects_transient_spike();
|
||||
test_debounce_accepts_sustained_change();
|
||||
test_debounce_flapping_stays_stable();
|
||||
|
||||
/* #996 presence hysteresis */
|
||||
test_presence_no_flicker_on_dither();
|
||||
test_presence_clear_debounce_holds();
|
||||
test_presence_genuine_departure_clears();
|
||||
test_presence_dead_band_holds_state();
|
||||
|
||||
printf("\n%d passed, %d failed\n", g_passed, g_failed);
|
||||
return g_failed == 0 ? 0 : 1;
|
||||
}
|
||||
Generated
+3
-3
@@ -10835,7 +10835,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "wifi-densepose-cli"
|
||||
version = "0.3.0"
|
||||
version = "0.3.1"
|
||||
dependencies = [
|
||||
"anyhow",
|
||||
"assert_cmd",
|
||||
@@ -11067,7 +11067,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "wifi-densepose-sensing-server"
|
||||
version = "0.3.2"
|
||||
version = "0.3.3"
|
||||
dependencies = [
|
||||
"axum",
|
||||
"chrono",
|
||||
@@ -11101,7 +11101,7 @@ dependencies = [
|
||||
|
||||
[[package]]
|
||||
name = "wifi-densepose-signal"
|
||||
version = "0.3.3"
|
||||
version = "0.3.4"
|
||||
dependencies = [
|
||||
"chrono",
|
||||
"criterion",
|
||||
|
||||
@@ -79,6 +79,6 @@ harness = false
|
||||
name = "train_marl"
|
||||
required-features = ["train"]
|
||||
|
||||
# ADR-149 Stage-1 evaluation CLI — pure Rust, no special feature needed.
|
||||
# ADR-171 Stage-1 evaluation CLI — pure Rust, no special feature needed.
|
||||
[[bin]]
|
||||
name = "eval_swarm"
|
||||
|
||||
@@ -1,2 +1,2 @@
|
||||
# ADR-149 evaluation outputs
|
||||
# ADR-171 evaluation outputs
|
||||
RESULTS.md is generated by the `eval_swarm` binary.
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
# ruview-swarm Evaluation Results (ADR-149 Stage 1, kinematic)
|
||||
# ruview-swarm Evaluation Results (ADR-171 Stage 1, kinematic)
|
||||
|
||||
Statistically-rigorous evaluation harness: seeded multi-run rollouts with IQM + 95% stratified-bootstrap confidence intervals (Agarwal et al., NeurIPS 2021).
|
||||
|
||||
@@ -9,7 +9,7 @@ Statistically-rigorous evaluation harness: seeded multi-run rollouts with IQM +
|
||||
- **CI method**: 95% stratified bootstrap of the IQM, stratified by seed
|
||||
- **GDOP**: 2-D geometric dilution of precision at first detection
|
||||
|
||||
> **Stage 2 pending**: high-fidelity Gazebo/PX4 SITL evaluation (false-alarm rate, real collision rate on the median seeds) is a follow-on — see ADR-149 §6.1. The collision figures below are a kinematic min-separation proxy, not SITL physics.
|
||||
> **Stage 2 pending**: high-fidelity Gazebo/PX4 SITL evaluation (false-alarm rate, real collision rate on the median seeds) is a follow-on — see ADR-171 §6.1. The collision figures below are a kinematic min-separation proxy, not SITL physics.
|
||||
|
||||
## Flight-pattern leaderboard
|
||||
|
||||
|
||||
@@ -1,11 +1,11 @@
|
||||
//! ADR-149 Stage-1 evaluation CLI.
|
||||
//! ADR-171 Stage-1 evaluation CLI.
|
||||
//!
|
||||
//! Runs the kinematic eval matrix over every flight pattern (default) and
|
||||
//! writes a ranked `RESULTS.md` leaderboard. Pure Rust — no special feature
|
||||
//! flag required, so it builds and runs in default CI.
|
||||
//!
|
||||
//! Defaults are intentionally small (10 seeds × 10 episodes) so the run is fast.
|
||||
//! The full ADR-149 reporting configuration is 10 seeds × 50 episodes — pass
|
||||
//! The full ADR-171 reporting configuration is 10 seeds × 50 episodes — pass
|
||||
//! `--seeds 10 --episodes 50` for the publication run.
|
||||
//!
|
||||
//! ```text
|
||||
@@ -45,7 +45,7 @@ fn main() {
|
||||
}
|
||||
"--help" | "-h" => {
|
||||
eprintln!(
|
||||
"eval_swarm — ADR-149 Stage-1 kinematic evaluator\n\
|
||||
"eval_swarm — ADR-171 Stage-1 kinematic evaluator\n\
|
||||
Usage: eval_swarm [--seeds N] [--episodes M] [--out PATH]\n\
|
||||
Defaults: --seeds 10 --episodes 10 --out crates/ruview-swarm/evals/RESULTS.md"
|
||||
);
|
||||
@@ -59,7 +59,7 @@ fn main() {
|
||||
}
|
||||
|
||||
eprintln!(
|
||||
"Running ADR-149 Stage-1 eval: {seeds} seeds × {episodes} episodes \
|
||||
"Running ADR-171 Stage-1 eval: {seeds} seeds × {episodes} episodes \
|
||||
over {} flight patterns...",
|
||||
FlightPattern::all().len()
|
||||
);
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! Per-episode and aggregate SAR + MARL metrics (ADR-149 Stage 1).
|
||||
//! Per-episode and aggregate SAR + MARL metrics (ADR-171 Stage 1).
|
||||
|
||||
use crate::evals::stats::{stratified_bootstrap_ci, ConfidenceInterval};
|
||||
|
||||
@@ -38,7 +38,7 @@ pub struct AggregateMetrics {
|
||||
impl AggregateMetrics {
|
||||
/// Aggregate a seed-stratified matrix of episodes. Each inner `Vec` is one
|
||||
/// seed's episodes; bootstrap resampling is stratified by seed so the CI
|
||||
/// reflects between-seed variance (the dominant source per ADR-149).
|
||||
/// reflects between-seed variance (the dominant source per ADR-171).
|
||||
pub fn from_strata(per_seed: &[Vec<EpisodeMetrics>], boot_seed: u64) -> Self {
|
||||
const N_BOOT: usize = 1000;
|
||||
|
||||
|
||||
@@ -1,11 +1,11 @@
|
||||
//! ADR-149 statistically-rigorous evaluation harness (Stage 1, kinematic).
|
||||
//! ADR-171 statistically-rigorous evaluation harness (Stage 1, kinematic).
|
||||
//!
|
||||
//! Produces SAR + MARL metrics over a seeded N-seed × M-episode matrix with
|
||||
//! IQM + 95% stratified-bootstrap CIs, a (sigma, kappa) CSI-noise sweep, and
|
||||
//! GDOP-stratified localization error. Generates evals/RESULTS.md.
|
||||
//!
|
||||
//! Stage 2 (Gazebo/PX4 SITL high-fidelity, false-alarm + collision rate on the
|
||||
//! median seeds) is a follow-on — see ADR-149 §6.1.
|
||||
//! median seeds) is a follow-on — see ADR-171 §6.1.
|
||||
pub mod gdop;
|
||||
pub mod stats;
|
||||
pub mod metrics;
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! RESULTS.md leaderboard generator (ADR-149 Stage 1).
|
||||
//! RESULTS.md leaderboard generator (ADR-171 Stage 1).
|
||||
|
||||
use crate::evals::metrics::AggregateMetrics;
|
||||
use crate::evals::stats::ConfidenceInterval;
|
||||
@@ -19,7 +19,7 @@ fn fmt_ci(ci: &ConfidenceInterval) -> String {
|
||||
/// so callers should pre-sort (e.g. by descending coverage point estimate).
|
||||
pub fn render_results_md(rows: &[(String, AggregateMetrics)]) -> String {
|
||||
let mut s = String::new();
|
||||
s.push_str("# ruview-swarm Evaluation Results (ADR-149 Stage 1, kinematic)\n\n");
|
||||
s.push_str("# ruview-swarm Evaluation Results (ADR-171 Stage 1, kinematic)\n\n");
|
||||
s.push_str(
|
||||
"Statistically-rigorous evaluation harness: seeded multi-run rollouts with \
|
||||
IQM + 95% stratified-bootstrap confidence intervals (Agarwal et al., \
|
||||
@@ -46,7 +46,7 @@ pub fn render_results_md(rows: &[(String, AggregateMetrics)]) -> String {
|
||||
s.push_str(
|
||||
"\n> **Stage 2 pending**: high-fidelity Gazebo/PX4 SITL evaluation \
|
||||
(false-alarm rate, real collision rate on the median seeds) is a \
|
||||
follow-on — see ADR-149 §6.1. The collision figures below are a \
|
||||
follow-on — see ADR-171 §6.1. The collision figures below are a \
|
||||
kinematic min-separation proxy, not SITL physics.\n\n",
|
||||
);
|
||||
|
||||
|
||||
@@ -1,4 +1,4 @@
|
||||
//! Stage-1 kinematic rollout + seed × episode matrix (ADR-149).
|
||||
//! Stage-1 kinematic rollout + seed × episode matrix (ADR-171).
|
||||
//!
|
||||
//! A single `run_episode` deterministically drives `drones` drones across a
|
||||
//! mission area under a chosen [`FlightPattern`], marks coverage on a grid,
|
||||
@@ -28,7 +28,7 @@ pub struct EvalConfig {
|
||||
pub config: SwarmConfig,
|
||||
pub drones: usize,
|
||||
pub steps: usize,
|
||||
pub seeds: usize, // ≥10 per ADR-149
|
||||
pub seeds: usize, // ≥10 per ADR-171
|
||||
pub episodes_per_seed: usize, // e.g. 50
|
||||
pub victims: Vec<Position3D>,
|
||||
pub noise: NoiseLevel,
|
||||
@@ -297,7 +297,7 @@ pub fn run_matrix(cfg: &EvalConfig) -> Vec<Vec<EpisodeMetrics>> {
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Standard ADR-149 noise sweep grid: cartesian product of σ × κ levels.
|
||||
/// Standard ADR-171 noise sweep grid: cartesian product of σ × κ levels.
|
||||
pub fn default_noise_sweep() -> Vec<NoiseLevel> {
|
||||
let sigmas = [0.02, 0.05, 0.10];
|
||||
let kappas = [16.0, 8.0, 4.0];
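The rest of `default_noise_sweep` falls outside this hunk. As a rough sketch of what "cartesian product of σ × κ levels" could look like, assuming a `NoiseLevel` with `sigma` and `kappa` fields (the field names, the `f64` element type, and the σ-major ordering are all assumptions):

```rust
/// Hypothetical stand-in for the crate's `NoiseLevel`; field names are assumed.
#[derive(Debug, Clone, Copy, PartialEq)]
struct NoiseLevel {
    sigma: f64,
    kappa: f64,
}

/// Cartesian product of the sigma and kappa levels, in sigma-major order:
/// every kappa is paired with the first sigma before moving to the next sigma.
fn noise_sweep(sigmas: &[f64], kappas: &[f64]) -> Vec<NoiseLevel> {
    sigmas
        .iter()
        .flat_map(|&sigma| kappas.iter().map(move |&kappa| NoiseLevel { sigma, kappa }))
        .collect()
}
```

With the three σ and three κ levels listed in the hunk, this yields a nine-point grid.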
|
||||
|
||||
@@ -24,7 +24,7 @@ linux-wifi = []
|
||||
[dependencies]
|
||||
# CLI argument parsing (for bin/aggregator)
|
||||
clap = { version = "4.4", features = ["derive"] }
|
||||
# Cryptographic HMAC (ADR-050: replace fake XOR-fold HMAC)
|
||||
# Cryptographic HMAC (ADR-166: replace fake XOR-fold HMAC)
|
||||
hmac = "0.12"
|
||||
sha2 = "0.10"
|
||||
# Byte parsing
|
||||
|
||||
@@ -47,6 +47,42 @@ type HmacSha256 = Hmac<Sha256>;
|
||||
/// Size of the HMAC-SHA256 truncated tag (manual crypto mode).
|
||||
const HMAC_TAG_SIZE: usize = 8;
|
||||
|
||||
/// Constant-time comparison of two fixed-size HMAC/auth tags.
///
/// ADR-157 §B4: the previous `self.hmac_tag == expected` short-circuits on the
/// first differing byte, leaking how many leading bytes matched through its
/// execution time. For an authentication tag that is a timing oracle: an
/// attacker who can submit forged beacons and measure verification latency can
/// recover the correct tag byte-by-byte (~256·N trials instead of 256^N).
///
/// This hand-rolled compare avoids adding the `subtle` crate (ADR-157 deferred
/// B4 only to dodge that dependency; a fixed 8-byte compare needs none). We
/// OR-accumulate the XOR of every byte pair into a single `u8` with **no early
/// exit**, so the work done is identical regardless of where (or whether) the
/// tags differ. The accumulator is non-zero iff any byte differed; we compare
/// it to zero exactly once at the end.
///
/// `#[inline(never)]` plus `black_box` on the accumulator discourage the
/// optimizer from reintroducing a short-circuit or turning the loop into a
/// `memcmp` (which is itself non-constant-time). `black_box` is best-effort
/// only: the standard library documents no guarantee for cryptographic use, so
/// this is a mitigation rather than a proof. The two slices are required to be
/// the same length by construction (both `[u8; HMAC_TAG_SIZE]`); a length
/// mismatch returns `false` without inspecting contents.
#[inline(never)]
fn constant_time_tag_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        // Branch-free: accumulate the bitwise difference of every byte.
        diff |= x ^ y;
    }
    // black_box discourages the compiler from proving `diff == 0` early and
    // short-circuiting the loop above (best-effort, not a guarantee). The
    // single equality check is the only data-dependent branch, and it is on
    // the fully-accumulated value.
    core::hint::black_box(diff) == 0
}
|
||||
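To make the leak described in the doc comment concrete, here is a small illustrative sketch that counts how many bytes each comparison strategy examines. The helper names are hypothetical, and the counters are for demonstration only: this instrumented code is not itself a timing-safe primitive.

```rust
/// Early-exit compare, instrumented: returns (equal, bytes examined).
/// This models what a short-circuiting `==` does on a byte slice.
fn early_exit_eq(a: &[u8], b: &[u8]) -> (bool, usize) {
    let mut examined = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        examined += 1;
        if x != y {
            return (false, examined);
        }
    }
    (a.len() == b.len(), examined)
}

/// Accumulating compare, instrumented the same way: always walks every byte.
fn accumulate_eq(a: &[u8], b: &[u8]) -> (bool, usize) {
    if a.len() != b.len() {
        return (false, 0);
    }
    let mut diff: u8 = 0;
    let mut examined = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        examined += 1;
        diff |= x ^ y;
    }
    (std::hint::black_box(diff) == 0, examined)
}

/// Worst-case forgery trials for an n-byte tag:
/// (with a per-byte timing oracle, by blind guessing).
fn forgery_trials(n: u32) -> (u128, u128) {
    (256u128 * n as u128, 256u128.pow(n))
}
```

For an 8-byte tag the early-exit version examines 1 byte on a first-byte mismatch and 8 on a last-byte mismatch, while the accumulating version examines 8 in both cases. The oracle reduces the attacker's search from 2^64 guesses to about 2048.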
|
||||
/// Size of the nonce field (manual crypto mode).
|
||||
const NONCE_SIZE: usize = 4;
|
||||
|
||||
@@ -265,7 +301,7 @@ impl AuthenticatedBeacon {
|
||||
/// Compute the HMAC-SHA256 tag for this beacon, truncated to 8 bytes.
|
||||
///
|
||||
/// Uses the `hmac` + `sha2` crates for cryptographically secure
|
||||
/// message authentication (ADR-050, Sprint 1).
|
||||
/// message authentication (ADR-166, Sprint 1).
|
||||
pub fn compute_tag(payload_and_nonce: &[u8], key: &[u8; 16]) -> [u8; HMAC_TAG_SIZE] {
|
||||
let mut mac = HmacSha256::new_from_slice(key).expect("HMAC-SHA256 accepts any key length");
|
||||
mac.update(payload_and_nonce);
|
||||
@@ -281,7 +317,10 @@ impl AuthenticatedBeacon {
|
||||
msg[..16].copy_from_slice(&self.beacon.to_bytes());
|
||||
msg[16..20].copy_from_slice(&self.nonce.to_le_bytes());
|
||||
let expected = Self::compute_tag(&msg, key);
|
||||
if self.hmac_tag == expected {
|
||||
// ADR-157 §B4: constant-time compare — `==` on the tag would leak,
|
||||
// via short-circuit timing, how many leading bytes an attacker's
|
||||
// forged tag matched, enabling byte-by-byte tag recovery.
|
||||
if constant_time_tag_eq(&self.hmac_tag, &expected) {
|
||||
Ok(())
|
||||
} else {
|
||||
Err(SecureTdmError::BeaconAuthFailed)
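The context lines of this hunk show the authenticated message being assembled as 16 beacon bytes followed by a 4-byte little-endian nonce. A minimal sketch of that layout, with a hypothetical helper name and the beacon passed as raw bytes (the real `SyncBeacon::to_bytes` encoding is not shown here):

```rust
/// Sizes taken from the hunk above: 16-byte beacon encoding + 4-byte nonce.
const BEACON_BYTES: usize = 16;
const NONCE_BYTES: usize = 4;

/// Assemble the 20-byte message that gets authenticated:
/// beacon bytes first, then the nonce in little-endian order.
fn auth_message(beacon: &[u8; BEACON_BYTES], nonce: u32) -> [u8; BEACON_BYTES + NONCE_BYTES] {
    let mut msg = [0u8; BEACON_BYTES + NONCE_BYTES];
    msg[..BEACON_BYTES].copy_from_slice(beacon);
    msg[BEACON_BYTES..].copy_from_slice(&nonce.to_le_bytes());
    msg
}
```

Fixing the byte order explicitly keeps the tag identical across hosts of different endianness.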
|
||||
@@ -752,6 +791,124 @@ mod tests {
|
||||
));
|
||||
}
|
||||
|
||||
// ---- ADR-157 §B4: constant-time tag compare ----
|
||||
|
||||
/// Functional pin proving the new constant-time helper is wired and correct
/// for the four tag-shape cases. This is the *hard gate* for §B4: it calls the
/// helper directly, so it breaks if the helper is removed or unwired, and it
/// pins accept/reject semantics byte-for-byte. Grade: MEASURED
/// (constant-time *construction*); micro-timing on a noisy host is only a
/// smoke check (see `tag_compare_timing_invariance_smoke`, #[ignore]).
|
||||
#[test]
|
||||
fn tag_compare_is_constant_time_shape() {
|
||||
let base = [0xA5u8; HMAC_TAG_SIZE];
|
||||
|
||||
// Equal tags accept.
|
||||
assert!(constant_time_tag_eq(&base, &base), "equal tags must accept");
|
||||
|
||||
// First byte differs → reject.
|
||||
let mut first = base;
|
||||
first[0] ^= 0xFF;
|
||||
assert!(
|
||||
!constant_time_tag_eq(&base, &first),
|
||||
"first-byte-differ must reject"
|
||||
);
|
||||
|
||||
// Last byte differs → reject.
|
||||
let mut last = base;
|
||||
last[HMAC_TAG_SIZE - 1] ^= 0x01;
|
||||
assert!(
|
||||
!constant_time_tag_eq(&base, &last),
|
||||
"last-byte-differ must reject"
|
||||
);
|
||||
|
||||
// Every byte differs → reject.
|
||||
let all = [0x5Au8; HMAC_TAG_SIZE]; // bitwise-inverse of 0xA5
|
||||
assert!(
|
||||
!constant_time_tag_eq(&base, &all),
|
||||
"all-bytes-differ must reject"
|
||||
);
|
||||
|
||||
// Length mismatch → reject without inspecting contents.
|
||||
assert!(
|
||||
!constant_time_tag_eq(&base, &base[..HMAC_TAG_SIZE - 1]),
|
||||
"length mismatch must reject"
|
||||
);
|
||||
|
||||
// End-to-end through verify(): a tag whose only difference is the
|
||||
// *last* byte must still be rejected exactly like a first-byte diff.
|
||||
let beacon = SyncBeacon {
|
||||
cycle_id: 7,
|
||||
cycle_period: Duration::from_millis(50),
|
||||
drift_correction_us: 0,
|
||||
generated_at: std::time::Instant::now(),
|
||||
};
|
||||
let key = DEFAULT_TEST_KEY;
|
||||
let nonce = 1u32;
|
||||
let mut msg = [0u8; 20];
|
||||
msg[..16].copy_from_slice(&beacon.to_bytes());
|
||||
msg[16..20].copy_from_slice(&nonce.to_le_bytes());
|
||||
let mut tag = AuthenticatedBeacon::compute_tag(&msg, &key);
|
||||
tag[HMAC_TAG_SIZE - 1] ^= 0x01; // tamper the LAST byte only
|
||||
let auth = AuthenticatedBeacon {
|
||||
beacon,
|
||||
nonce,
|
||||
hmac_tag: tag,
|
||||
};
|
||||
assert!(
|
||||
matches!(auth.verify(&key), Err(SecureTdmError::BeaconAuthFailed)),
|
||||
"last-byte tamper must fail verify()"
|
||||
);
|
||||
}
|
||||
|
||||
/// Coarse timing-invariance smoke check. #[ignore]d so it never flakes CI —
|
||||
/// the host is noisy and a hard timing bound is unreliable. Run manually
|
||||
/// with `cargo test -p wifi-densepose-hardware -- --ignored
|
||||
/// tag_compare_timing_invariance_smoke --nocapture`. The assertion is a
|
||||
/// deliberately *generous* ratio bound (4×): a short-circuit `==` would show
|
||||
/// last-byte-differ ≫ first-byte-differ; the constant-time helper should not.
|
||||
#[test]
|
||||
#[ignore = "timing smoke check — noisy host, run manually with --ignored"]
|
||||
fn tag_compare_timing_invariance_smoke() {
|
||||
use std::time::Instant;
|
||||
const ITERS: u32 = 2_000_000;
|
||||
let base = [0xA5u8; HMAC_TAG_SIZE];
|
||||
let mut first = base;
|
||||
first[0] ^= 0xFF;
|
||||
let mut last = base;
|
||||
last[HMAC_TAG_SIZE - 1] ^= 0x01;
|
||||
|
||||
// Warm up.
|
||||
for _ in 0..ITERS / 10 {
|
||||
core::hint::black_box(constant_time_tag_eq(&base, &first));
|
||||
}
|
||||
|
||||
let t0 = Instant::now();
|
||||
let mut acc = false;
|
||||
for _ in 0..ITERS {
|
||||
acc ^= constant_time_tag_eq(&base, &first);
|
||||
}
|
||||
core::hint::black_box(acc);
|
||||
let dt_first = t0.elapsed().as_nanos() as f64;
|
||||
|
||||
let t1 = Instant::now();
|
||||
let mut acc2 = false;
|
||||
for _ in 0..ITERS {
|
||||
acc2 ^= constant_time_tag_eq(&base, &last);
|
||||
}
|
||||
core::hint::black_box(acc2);
|
||||
let dt_last = t1.elapsed().as_nanos() as f64;
|
||||
|
||||
let ratio = dt_last.max(dt_first) / dt_last.min(dt_first).max(1.0);
|
||||
println!(
|
||||
"first-differ {dt_first:.0}ns, last-differ {dt_last:.0}ns, ratio {ratio:.3}"
|
||||
);
|
||||
assert!(
|
||||
ratio < 4.0,
|
||||
"timing ratio {ratio:.3} too large — possible short-circuit leak"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_auth_beacon_too_short() {
|
||||
let result = AuthenticatedBeacon::from_bytes(&[0u8; 10]);
|
||||
@@ -953,7 +1110,7 @@ mod tests {
|
||||
assert_eq!(SecLevel::Enforcing as u8, 2);
|
||||
}
|
||||
|
||||
// ---- Security tests (ADR-050) ----
|
||||
// ---- Security tests (ADR-166) ----
|
||||
|
||||
#[test]
|
||||
fn test_hmac_different_keys_produce_different_tags() {
|
||||
|
||||
@@ -63,3 +63,7 @@ harness = false
|
||||
name = "onnx_bench"
|
||||
harness = false
|
||||
required-features = ["onnx"]
|
||||
|
||||
[[bench]]
|
||||
name = "native_conv_bench"
|
||||
harness = false
|
||||
|
||||
@@ -0,0 +1,79 @@
|
||||
//! ADR-155 M2 §4 — native (pure-Rust) DensePose conv benchmark.
|
||||
//!
|
||||
//! `DensePoseHead::apply_conv_layer` is a pure-Rust naive 6-nested-loop
|
||||
//! convolution (the §8 "native-conv naive-loop" backlog item). This bench
|
||||
//! measures `forward()` (which runs the shared-conv + segmentation + UV conv
|
||||
//! stacks through that naive loop) on a representative single-layer config so a
|
||||
//! perf claim can be made (or refused) with a MEASURED before/after — never a
|
||||
//! fabricated number.
|
||||
//!
|
||||
//! Reproduce:
|
||||
//! cargo bench -p wifi-densepose-nn --no-default-features --bench native_conv_bench
|
||||
//!
|
||||
//! The bench is `--no-default-features` (no `onnx`/`ort` download needed): the
|
||||
//! conv path is pure-Rust and benchable on any host.
|
||||
|
||||
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
|
||||
use ndarray::{Array1, Array4};
|
||||
use std::hint::black_box;
|
||||
use wifi_densepose_nn::densepose::{ConvLayerWeights, DensePoseWeights};
|
||||
use wifi_densepose_nn::{DensePoseConfig, DensePoseHead, Tensor};
|
||||
|
||||
/// Build a single same-padding conv layer `in_ch -> out_ch`, kernel `k`, with a
|
||||
/// bias (no batch-norm) — deterministic, small, representative of one stage.
|
||||
fn conv_layer(in_ch: usize, out_ch: usize, k: usize) -> ConvLayerWeights {
|
||||
let weight = Array4::from_shape_fn((out_ch, in_ch, k, k), |(o, i, kh, kw)| {
|
||||
// Deterministic, bounded weights.
|
||||
((o + i + kh + kw) as f32 * 0.013).sin()
|
||||
});
|
||||
ConvLayerWeights {
|
||||
weight,
|
||||
bias: Some(Array1::from_shape_fn(out_ch, |o| o as f32 * 0.01)),
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// A head whose shared-conv stack is one `ch->ch` conv, with empty seg/uv heads,
|
||||
/// so the bench isolates a single conv-layer cost.
|
||||
fn single_conv_head(ch: usize, k: usize) -> DensePoseHead {
|
||||
let mut config = DensePoseConfig::new(ch, 1, 2);
|
||||
config.kernel_size = k;
|
||||
config.padding = k / 2; // same padding
|
||||
config.hidden_channels = vec![ch];
|
||||
let weights = DensePoseWeights {
|
||||
shared_conv: vec![conv_layer(ch, ch, k)],
|
||||
segmentation_head: vec![],
|
||||
uv_head: vec![],
|
||||
};
|
||||
DensePoseHead::with_weights(config, weights).expect("valid head")
|
||||
}
|
||||
|
||||
fn bench_native_conv(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("native_conv");
|
||||
// (channels, spatial, kernel) — a modest map and a larger one.
|
||||
for &(ch, hw, k) in &[(16usize, 32usize, 3usize), (32, 32, 3)] {
|
||||
let head = single_conv_head(ch, k);
|
||||
let input = Tensor::Float4D(Array4::from_shape_fn((1, ch, hw, hw), |(_, c, y, x)| {
|
||||
((c + y + x) as f32 * 0.001).cos()
|
||||
}));
|
||||
// Throughput in output elements processed.
|
||||
group.throughput(Throughput::Elements((ch * hw * hw) as u64));
|
||||
group.bench_with_input(
|
||||
BenchmarkId::from_parameter(format!("ch{ch}_hw{hw}_k{k}")),
|
||||
&input,
|
||||
|bencher, inp| {
|
||||
bencher.iter(|| {
|
||||
let out = head.forward(black_box(inp)).expect("forward ok");
|
||||
black_box(out);
|
||||
});
|
||||
},
|
||||
);
|
||||
}
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_native_conv);
|
||||
criterion_main!(benches);
|
||||
@@ -338,7 +338,16 @@ impl DensePoseHead {
|
||||
|
||||
let mut output = Array4::zeros((batch, out_channels, out_height, out_width));
|
||||
|
||||
// Simple convolution implementation (not optimized)
|
||||
// Naive direct convolution (one MAC per tap). ADR-155 M2 §4: a
|
||||
// range-clamped variant (hoisting the per-tap in-bounds branch out of the
|
||||
// inner loops) was prototyped and proven bit-identical, but a committed
|
||||
// criterion bench (`benches/native_conv_bench.rs`) showed the perf result
|
||||
// is INCONCLUSIVE on this host: a ~35% win on padding-heavy small-channel
|
||||
// maps but a small (~3%) *regression* on channel-heavy maps, all inside a
|
||||
// ±20% run-to-run noise floor. Per the §0 PROOF discipline we do not ship
|
||||
// a perf change whose benefit isn't robustly positive, nor fabricate a
|
||||
// number — the naive loop is kept and the rewrite is honestly deferred
|
||||
// (see ADR-155 §8). Behaviour pinned by `native_conv_matches_reference`.
|
||||
for b in 0..batch {
|
||||
for oc in 0..out_channels {
|
||||
for oh in 0..out_height {
|
||||
@@ -565,6 +574,61 @@ impl BodyPart {
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
use ndarray::Array4;
|
||||
|
||||
/// ADR-155 M2 §4: characterize the native conv against **hand-computed**
|
||||
/// values so the §8 native-conv perf rewrite (or any future change) has a
|
||||
/// behaviour anchor — a 1×1 conv is just a per-pixel scalar multiply, and a
|
||||
/// same-padded 3×3 corner has a known truncated-window sum. Pins CURRENT
|
||||
/// behaviour (no behaviour change in this milestone — the rewrite was
|
||||
/// reverted as perf-inconclusive; see `benches/native_conv_bench.rs`).
|
||||
#[test]
|
||||
fn native_conv_matches_reference() {
|
||||
// --- Case 1: a 1×1 conv (no padding) is exactly `out = w·in + b`. ---
|
||||
let w11 = ConvLayerWeights {
|
||||
weight: Array4::from_shape_fn((1, 1, 1, 1), |_| 2.0_f32),
|
||||
bias: Some(ndarray::Array1::from_elem(1, 0.5_f32)),
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
};
|
||||
let input = Array4::from_shape_fn((1, 1, 2, 2), |(_, _, y, x)| (y * 2 + x) as f32);
|
||||
let mut cfg = DensePoseConfig::new(1, 1, 2);
|
||||
cfg.kernel_size = 1;
|
||||
cfg.padding = 0;
|
||||
cfg.hidden_channels = vec![1];
|
||||
let head = DensePoseHead::new(cfg).unwrap();
|
||||
let out = head.apply_conv_layer(&input, &w11).unwrap();
|
||||
assert_eq!(out.dim(), (1, 1, 2, 2));
|
||||
// out[y,x] = 2·in[y,x] + 0.5 ⇒ {0.5, 2.5, 4.5, 6.5}.
|
||||
for (got, want) in out.iter().zip([0.5_f32, 2.5, 4.5, 6.5].iter()) {
|
||||
assert!((got - want).abs() < 1e-6, "1x1 conv: got {got}, want {want}");
|
||||
}
|
||||
|
||||
// --- Case 2: a same-padded 3×3 all-ones kernel sums the in-bounds
|
||||
// window. Input is all 1.0 on a 3×3 map ⇒ the centre output = 9 (full
|
||||
// window), each corner = 4 (2×2 truncated window). ---
|
||||
let w33 = ConvLayerWeights {
|
||||
weight: Array4::from_elem((1, 1, 3, 3), 1.0_f32),
|
||||
bias: None,
|
||||
bn_gamma: None,
|
||||
bn_beta: None,
|
||||
bn_mean: None,
|
||||
bn_var: None,
|
||||
};
|
||||
let ones = Array4::from_elem((1, 1, 3, 3), 1.0_f32);
|
||||
let mut cfg2 = DensePoseConfig::new(1, 1, 2);
|
||||
cfg2.kernel_size = 3;
|
||||
cfg2.padding = 1;
|
||||
cfg2.hidden_channels = vec![1];
|
||||
let head2 = DensePoseHead::new(cfg2).unwrap();
|
||||
let out2 = head2.apply_conv_layer(&ones, &w33).unwrap();
|
||||
assert_eq!(out2.dim(), (1, 1, 3, 3));
|
||||
assert!((out2[[0, 0, 1, 1]] - 9.0).abs() < 1e-6, "centre full window = 9");
|
||||
assert!((out2[[0, 0, 0, 0]] - 4.0).abs() < 1e-6, "corner 2x2 window = 4");
|
||||
assert!((out2[[0, 0, 0, 1]] - 6.0).abs() < 1e-6, "edge 2x3 window = 6");
|
||||
}
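The hand-computed values in this test (centre 9, corner 4, edge 6) and the range-clamped variant mentioned in the `apply_conv_layer` comment can both be illustrated with a single-channel sketch. This is not the crate's implementation: it uses flat `Vec<f32>` buffers instead of `ndarray`, stride 1, and hypothetical function names, and it assumes `pad < k`.

```rust
/// Naive direct convolution (single channel, stride 1, zero padding):
/// every tap checks whether its input coordinate is in bounds.
fn conv2d_naive(input: &[f32], h: usize, w: usize, kernel: &[f32], k: usize, pad: usize) -> Vec<f32> {
    assert!(k >= 1 && pad < k && k <= h + 2 * pad && k <= w + 2 * pad);
    let (oh, ow) = (h + 2 * pad + 1 - k, w + 2 * pad + 1 - k);
    let mut out = vec![0.0f32; oh * ow];
    for y in 0..oh {
        for x in 0..ow {
            let mut acc = 0.0f32;
            for ky in 0..k {
                for kx in 0..k {
                    let iy = (y + ky) as isize - pad as isize;
                    let ix = (x + kx) as isize - pad as isize;
                    if iy >= 0 && iy < h as isize && ix >= 0 && ix < w as isize {
                        acc += input[iy as usize * w + ix as usize] * kernel[ky * k + kx];
                    }
                }
            }
            out[y * ow + x] = acc;
        }
    }
    out
}

/// Range-clamped variant: compute the valid tap range once per output row and
/// column, so the inner loops carry no bounds branch. Same taps, same order.
fn conv2d_clamped(input: &[f32], h: usize, w: usize, kernel: &[f32], k: usize, pad: usize) -> Vec<f32> {
    assert!(k >= 1 && pad < k && k <= h + 2 * pad && k <= w + 2 * pad);
    let (oh, ow) = (h + 2 * pad + 1 - k, w + 2 * pad + 1 - k);
    let mut out = vec![0.0f32; oh * ow];
    for y in 0..oh {
        // Valid ky satisfy 0 <= y + ky - pad < h.
        let ky_lo = pad.saturating_sub(y);
        let ky_hi = k.min(h + pad - y);
        for x in 0..ow {
            // Valid kx satisfy 0 <= x + kx - pad < w.
            let kx_lo = pad.saturating_sub(x);
            let kx_hi = k.min(w + pad - x);
            let mut acc = 0.0f32;
            for ky in ky_lo..ky_hi {
                let iy = y + ky - pad;
                for kx in kx_lo..kx_hi {
                    let ix = x + kx - pad;
                    acc += input[iy * w + ix] * kernel[ky * k + kx];
                }
            }
            out[y * ow + x] = acc;
        }
    }
    out
}
```

Because both versions accumulate the same in-bounds taps in the same order, their outputs should agree bit for bit. Whether the clamped form is actually faster is a separate question that only a benchmark on the target host can answer.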
|
||||
|
||||
#[test]
|
||||
fn test_config_validation() {
|
||||
|
||||
@@ -98,8 +98,64 @@ pub struct LinearHead {
|
||||
var_b: f32,
|
||||
}
|
||||
|
||||
/// A shape mismatch when building a [`LinearHead`] from supplied weights.
///
/// Returned by [`LinearHead::try_new`] so a caller loading weights from an
/// **untrusted / deserialized** source can validate the tensor shapes without
/// the panic that [`LinearHead::new`] raises on a programmer-supplied mismatch
/// (ADR-155 M2 §3: a pure-Rust input guard ahead of the construction contract).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum RfHeadError {
    /// `w.len()` was not `out_dim * EMBEDDING_DIM`.
    WeightShape {
        /// Expected length (`out_dim * EMBEDDING_DIM`).
        expected: usize,
        /// Actual `w.len()`.
        got: usize,
    },
    /// `b.len()` was not `out_dim`.
    BiasShape {
        /// Expected length (`out_dim`).
        expected: usize,
        /// Actual `b.len()`.
        got: usize,
    },
    /// `var_w.len()` was not `EMBEDDING_DIM`.
    VarWeightShape {
        /// Expected length (`EMBEDDING_DIM`).
        expected: usize,
        /// Actual `var_w.len()`.
        got: usize,
    },
}

impl std::fmt::Display for RfHeadError {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::WeightShape { expected, got } => {
                write!(f, "weight shape mismatch: expected {expected}, got {got}")
            }
            Self::BiasShape { expected, got } => {
                write!(f, "bias shape mismatch: expected {expected}, got {got}")
            }
            Self::VarWeightShape { expected, got } => {
                write!(f, "var weight shape mismatch: expected {expected}, got {got}")
            }
        }
    }
}

impl std::error::Error for RfHeadError {}
|
||||
|
||||
impl LinearHead {
|
||||
/// Build a head with given weights. `w.len()` must be `out_dim * EMBEDDING_DIM`.
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Panics on a shape mismatch (`w`/`b`/`var_w`). This is a construction-time
|
||||
/// API contract on *programmer-supplied* vectors. For weights from an
|
||||
/// untrusted / deserialized source, prefer [`LinearHead::try_new`], which
|
||||
/// returns a typed [`RfHeadError`] instead of panicking.
|
||||
#[must_use]
|
||||
pub fn new(task: TaskKind, out_dim: usize, w: Vec<f32>, b: Vec<f32>, var_w: Vec<f32>, var_b: f32) -> Self {
|
||||
assert_eq!(w.len(), out_dim * EMBEDDING_DIM, "weight shape mismatch");
|
||||
@@ -108,6 +164,40 @@ impl LinearHead {
|
||||
Self { task, w, b, out_dim, var_w, var_b }
|
||||
}
|
||||
|
||||
    /// Fallible constructor: validate the weight shapes and return a typed
    /// [`RfHeadError`] on mismatch instead of panicking (ADR-155 M2 §3).
    ///
    /// Use this when `w` / `b` / `var_w` originate from a checkpoint or any
    /// untrusted source. On success the produced head is byte-for-byte identical
    /// to [`LinearHead::new`] with the same arguments.
    ///
    /// # Errors
    ///
    /// Returns [`RfHeadError`] when any of:
    /// - `w.len() != out_dim * EMBEDDING_DIM`
    /// - `b.len() != out_dim`
    /// - `var_w.len() != EMBEDDING_DIM`
    pub fn try_new(
        task: TaskKind,
        out_dim: usize,
        w: Vec<f32>,
        b: Vec<f32>,
        var_w: Vec<f32>,
        var_b: f32,
    ) -> Result<Self, RfHeadError> {
        let expected_w = out_dim * EMBEDDING_DIM;
        if w.len() != expected_w {
            return Err(RfHeadError::WeightShape { expected: expected_w, got: w.len() });
        }
        if b.len() != out_dim {
            return Err(RfHeadError::BiasShape { expected: out_dim, got: b.len() });
        }
        if var_w.len() != EMBEDDING_DIM {
            return Err(RfHeadError::VarWeightShape { expected: EMBEDDING_DIM, got: var_w.len() });
        }
        Ok(Self { task, w, b, out_dim, var_w, var_b })
    }
|
||||
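One reason a typed error that implements `std::error::Error` is convenient on the caller side is that it composes with `?` and boxed errors. The sketch below uses cut-down stand-ins (`Head`, `ShapeError`, a made-up `EMBEDDING_DIM` of 4, and a hypothetical `load_head` loader), none of which are the crate's real types:

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical embedding width, for illustration only.
const EMBEDDING_DIM: usize = 4;

/// Cut-down stand-in for `RfHeadError` (one variant is enough for the sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
enum ShapeError {
    WeightShape { expected: usize, got: usize },
}

impl fmt::Display for ShapeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::WeightShape { expected, got } => {
                write!(f, "weight shape mismatch: expected {expected}, got {got}")
            }
        }
    }
}

impl Error for ShapeError {}

/// Cut-down stand-in for `LinearHead`.
#[derive(Debug)]
struct Head {
    out_dim: usize,
    w: Vec<f32>,
}

impl Head {
    fn try_new(out_dim: usize, w: Vec<f32>) -> Result<Self, ShapeError> {
        let expected = out_dim * EMBEDDING_DIM;
        if w.len() != expected {
            return Err(ShapeError::WeightShape { expected, got: w.len() });
        }
        Ok(Self { out_dim, w })
    }
}

/// Hypothetical loader: parses little-endian f32 weights from raw bytes, then
/// validates the shape. Both failure kinds flow through into one boxed error.
fn load_head(out_dim: usize, raw: &[u8]) -> Result<Head, Box<dyn Error>> {
    if raw.len() % 4 != 0 {
        return Err("weight blob length is not a multiple of 4".into());
    }
    let w: Vec<f32> = raw
        .chunks_exact(4)
        .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
        .collect();
    Ok(Head::try_new(out_dim, w)?)
}
```

If `out_dim` itself comes from the untrusted source, a `checked_mul` on the expected length would be a reasonable further guard; the sketch omits it to mirror the shape of the original check.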
|
||||
/// A zero-initialised head (uncertainty = softplus(0) ≈ 0.693).
|
||||
#[must_use]
|
||||
pub fn zeros(task: TaskKind, out_dim: usize) -> Self {
|
||||
@@ -136,9 +226,14 @@ impl LinearHead {
|
||||
}
|
||||
}
|
||||
|
||||
/// Input magnitude above which `softplus(x) ≈ x` to f32 precision, so the
|
||||
/// `exp` is skipped to avoid overflow (ADR-155 M2 §8: de-magicked from a bare
|
||||
/// `20.0`; value unchanged). At x = 20, `ln(1+e^20) − 20 ≈ 2e-9`, below f32 eps.
|
||||
const SOFTPLUS_LINEAR_THRESHOLD: f32 = 20.0;
|
||||
|
||||
fn softplus(x: f32) -> f32 {
|
||||
// Numerically stable softplus.
|
||||
if x > 20.0 {
|
||||
if x > SOFTPLUS_LINEAR_THRESHOLD {
|
||||
x
|
||||
} else {
|
||||
(1.0 + x.exp()).ln()
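The threshold comment claims that at x = 20 the gap between softplus(x) and x is about 2e-9. A short sketch to check that arithmetic, alongside one alternative overflow-free formulation. The alternative is a commonly used identity, not something this diff adopts, and all function names here are hypothetical:

```rust
/// Same shape as the thresholded softplus in the hunk above.
fn softplus_thresholded(x: f32) -> f32 {
    const LINEAR_THRESHOLD: f32 = 20.0;
    if x > LINEAR_THRESHOLD {
        x
    } else {
        (1.0 + x.exp()).ln()
    }
}

/// Alternative stable form (not the crate's): max(x, 0) + ln(1 + e^{-|x|}).
/// The exponent is never positive, so `exp` cannot overflow for any finite x.
fn softplus_stable(x: f32) -> f32 {
    x.max(0.0) + (-x.abs()).exp().ln_1p()
}

/// Gap between softplus(x) and x, computed in f64: ln(1 + e^{-x}).
fn linear_gap(x: f64) -> f64 {
    (-x).exp().ln_1p()
}
```

`linear_gap(20.0)` evaluates to roughly 2.06e-9, which is below `f32::EPSILON` (about 1.19e-7), consistent with the comment's reasoning for skipping the `exp` above the threshold.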
|
||||
@@ -270,6 +365,48 @@ mod tests {
|
||||
RfEmbedding::new(vec![fill; EMBEDDING_DIM])
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked softplus linear-threshold must equal the
|
||||
/// prior inline `20.0` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn softplus_threshold_unchanged_from_literal() {
|
||||
assert_eq!(SOFTPLUS_LINEAR_THRESHOLD, 20.0_f32);
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §3: `try_new` accepts correctly-shaped weights and produces a
|
||||
/// head byte-identical to `new`, but returns a typed error on a mismatched
|
||||
/// (e.g. corrupt-checkpoint) shape instead of panicking.
|
||||
#[test]
|
||||
fn try_new_accepts_valid_and_rejects_each_bad_shape() {
|
||||
let out_dim = 2;
|
||||
let w = vec![0.0; out_dim * EMBEDDING_DIM];
|
||||
let b = vec![0.0; out_dim];
|
||||
let var_w = vec![0.0; EMBEDDING_DIM];
|
||||
|
||||
// Valid: try_new == new (forward identical on a probe embedding).
|
||||
let head = LinearHead::try_new(TaskKind::Presence, out_dim, w.clone(), b.clone(), var_w.clone(), 0.0)
|
||||
.expect("valid shapes must construct");
|
||||
let reference = LinearHead::new(TaskKind::Presence, out_dim, w.clone(), b.clone(), var_w.clone(), 0.0);
|
||||
assert_eq!(head.forward(&emb(0.5)).values, reference.forward(&emb(0.5)).values);
|
||||
|
||||
// Bad weight length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, vec![0.0; 3], b.clone(), var_w.clone(), 0.0)
|
||||
.unwrap_err(),
|
||||
RfHeadError::WeightShape { expected: out_dim * EMBEDDING_DIM, got: 3 }
|
||||
);
|
||||
// Bad bias length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, w.clone(), vec![0.0; 1], var_w.clone(), 0.0)
|
||||
.unwrap_err(),
|
||||
RfHeadError::BiasShape { expected: out_dim, got: 1 }
|
||||
);
|
||||
// Bad var-weight length.
|
||||
assert_eq!(
|
||||
LinearHead::try_new(TaskKind::Presence, out_dim, w, b, vec![0.0; 5], 0.0).unwrap_err(),
|
||||
RfHeadError::VarWeightShape { expected: EMBEDDING_DIM, got: 5 }
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn head_forward_produces_values_and_finite_uncertainty() {
|
||||
let head = LinearHead::zeros(TaskKind::Presence, 2);
|
||||
|
||||
@@ -47,3 +47,7 @@ harness = false
|
||||
[[bench]]
|
||||
name = "fusion_bench"
|
||||
harness = false
|
||||
|
||||
[[bench]]
|
||||
name = "ann_bench"
|
||||
harness = false
|
||||
|
||||
@@ -0,0 +1,74 @@
|
||||
//! Criterion bench for the ADR-261 graph-ANN index: linear scan vs float HNSW
|
||||
//! vs quantized HNSW, on the shared `ann_measure` fixture.
|
||||
//!
|
||||
//! The authoritative recall/QPS numbers in ADR-261 come from the
|
||||
//! `--no-default-features --release` test report
|
||||
//! (`ann_bench_report` in `src/ann_measure.rs`), which is deterministic and
|
||||
//! gate-runnable. This criterion bench times the same operations through the
|
||||
//! criterion harness for stable per-op medians:
|
||||
//!
|
||||
//! ```text
|
||||
//! cargo bench -p wifi-densepose-ruvector --bench ann_bench
|
||||
//! ```
|
||||
//!
|
||||
//! Build is excluded from the timed region (done once in setup); only the query
|
||||
//! path is measured. The fixture and both indices are identical to the report's,
|
||||
//! so the bench and the report can never measure different graphs.
|
||||
|
||||
use criterion::{black_box, criterion_group, criterion_main, Criterion};
|
||||
use wifi_densepose_ruvector::ann_measure::{build_indices, queries, AnnBenchParams};
|
||||
|
||||
fn bench_ann(c: &mut Criterion) {
|
||||
// Modest N so the bench builds quickly; the report covers the larger N.
|
||||
let p = AnnBenchParams::default_fixture(10_000);
|
||||
let (float_idx, quant_idx, _v) = build_indices(p);
|
||||
let qs = queries(p);
|
||||
let k = p.k;
|
||||
|
||||
let mut group = c.benchmark_group("ann_query");
|
||||
group.sample_size(20);
|
||||
|
||||
// Linear scan (brute force) — the no-index baseline.
|
||||
group.bench_function("linear_scan", |b| {
|
||||
b.iter(|| {
|
||||
let mut sink = 0u64;
|
||||
for q in &qs {
|
||||
sink = sink.wrapping_add(float_idx.brute_force(black_box(q), k).len() as u64);
|
||||
}
|
||||
black_box(sink)
|
||||
})
|
||||
});
|
||||
|
||||
// Float HNSW at two mid-range beam widths (ef = 64 and 128).
|
||||
for &ef in &[64usize, 128] {
|
||||
group.bench_function(format!("float_hnsw_ef{ef}"), |b| {
|
||||
b.iter(|| {
|
||||
let mut sink = 0u64;
|
||||
for q in &qs {
|
||||
sink = sink.wrapping_add(float_idx.search(black_box(q), k, ef).len() as u64);
|
||||
}
|
||||
black_box(sink)
|
||||
})
|
||||
});
|
||||
}
|
||||
|
||||
// Quantized HNSW at matched beam widths + rerank.
|
||||
for &ef in &[64usize, 128] {
|
||||
let rr = k * 5;
|
||||
group.bench_function(format!("quant_hnsw_ef{ef}_rr{rr}"), |b| {
|
||||
b.iter(|| {
|
||||
let mut sink = 0u64;
|
||||
for q in &qs {
|
||||
sink = sink
|
||||
.wrapping_add(quant_idx.search_quantized(black_box(q), k, ef, rr).len() as u64);
|
||||
}
|
||||
black_box(sink)
|
||||
})
|
||||
});
|
||||
}
|
||||
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_ann);
|
||||
criterion_main!(benches);
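The quantized arm of this bench takes both a beam width and a rerank count (`rr = k * 5`). The two-stage idea behind it, a cheap 1-bit score to pick candidates followed by an exact float rerank, can be sketched in isolation. This is an illustration under loud assumptions: candidates come from a brute-force scan rather than a graph traversal, no rotation is applied before taking signs, and every name below is hypothetical rather than the crate's API.

```rust
/// Pack the sign bits of a vector into u64 words (bit set when the component is >= 0).
fn sign_code(v: &[f32]) -> Vec<u64> {
    let mut code = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x >= 0.0 {
            code[i / 64] |= 1u64 << (i % 64);
        }
    }
    code
}

/// Hamming distance between two equal-length sign codes.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Squared L2 distance (the exact float score used for the rerank).
fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Two-stage top-k: keep the `rerank` best ids by Hamming score, then order
/// those by exact distance. Candidate generation here is a brute-force scan,
/// standing in for the graph traversal.
fn search_two_stage(data: &[Vec<f32>], query: &[f32], k: usize, rerank: usize) -> Vec<usize> {
    let q_code = sign_code(query);
    let mut coarse: Vec<(u32, usize)> = data
        .iter()
        .enumerate()
        .map(|(id, v)| (hamming(&sign_code(v), &q_code), id))
        .collect();
    coarse.sort(); // by Hamming distance, ties broken by id
    coarse.truncate(rerank.max(k));
    let mut exact: Vec<(f32, usize)> = coarse
        .into_iter()
        .map(|(_, id)| (l2_sq(&data[id], query), id))
        .collect();
    exact.sort_by(|a, b| a.0.total_cmp(&b.0).then(a.1.cmp(&b.1)));
    exact.into_iter().take(k).map(|(_, id)| id).collect()
}
```

The rerank count bounds recall: a true neighbour that the 1-bit score ranks outside the first `rerank` candidates can never be recovered by the exact pass, which is one plausible reading of why a coarse code caps recall.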
|
||||
@@ -174,5 +174,76 @@ fn bench_topk(c: &mut Criterion) {
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_compare_cost, bench_topk);
|
||||
/// ADR-156 §8 RaBitQ Pass-2 coverage measurement.
|
||||
///
|
||||
/// Not a timing bench — it prints the **measured top-K coverage** (Pass-1 vs
|
||||
/// Pass-2 rotation) on the deterministic anisotropic planted-cluster fixture
|
||||
/// from `wifi_densepose_ruvector::coverage`, so `cargo bench` surfaces the
|
||||
/// numbers quoted in ADR-156 §8 / ADR-084. The same harness backs the
|
||||
/// `pass2_coverage_report` unit test (single source of truth). Each criterion
|
||||
/// "benchmark" body computes the coverage once (cached) and the bench loop just
|
||||
/// reads it back, so the criterion timing is meaningless here on purpose — the
|
||||
/// value is the `println!` summary.
|
||||
fn bench_pass2_coverage(c: &mut Criterion) {
|
||||
use wifi_densepose_ruvector::coverage::{
|
||||
measure_estimator, measure_estimator_euclidean, measure_pass1, measure_pass2,
|
||||
CoverageParams,
|
||||
};
|
||||
|
||||
let base = CoverageParams::aether_default(0xAD00_0084);
|
||||
let rot_seed = 0x5EED_C0DE_1234_5678u64;
|
||||
|
||||
println!("\n=== ADR-156 §8/§11 RaBitQ coverage (anisotropic planted clusters) ===");
|
||||
println!(
|
||||
"dim={} N={} K={} clusters={} noise={} queries={} master_seed=0x{:X} rot_seed=0x{:X}",
|
||||
base.dim, base.n, base.k, base.n_clusters, base.noise, base.n_queries, base.seed, rot_seed
|
||||
);
|
||||
println!("(coverage = |sketch_topK ∩ float_cosine_topK| / K, ADR-084 bar = 90%)");
|
||||
println!("estimator side info = 8 B/vec (residual_norm + x_dot_o, 2x f32)");
|
||||
println!(
|
||||
" {:<12} {:>8} {:>8} {:>11} {:>11}",
|
||||
"candidate_k", "P1-sign", "P2-sign", "Est-cosine", "Est-euclid"
|
||||
);
|
||||
for &cand in &[8usize, 16, 24, 32, 64] {
|
||||
let p = CoverageParams {
|
||||
candidate_k: cand,
|
||||
..base
|
||||
};
|
||||
let p1 = measure_pass1(p).coverage;
|
||||
let p2 = measure_pass2(p, rot_seed).coverage;
|
||||
let est_cos = measure_estimator(p, rot_seed).coverage;
|
||||
let est_euc = measure_estimator_euclidean(p, rot_seed).coverage;
|
||||
let flag = if est_cos >= 0.90 { "EST≥90%" } else { "" };
|
||||
let strict = if cand == base.k { " STRICT" } else { "" };
|
||||
println!(
|
||||
" {:<12} {:>7.2}% {:>7.2}% {:>10.2}% {:>10.2}% {flag}{strict}",
|
||||
cand,
|
||||
p1 * 100.0,
|
||||
p2 * 100.0,
|
||||
est_cos * 100.0,
|
||||
est_euc * 100.0
|
||||
);
|
||||
}
|
||||
println!("========================================================================\n");
|
||||
|
||||
// A minimal criterion group so `cargo bench` exercises the path under the
|
||||
// harness (timing is not the point; the printed table above is).
|
||||
let mut group = c.benchmark_group("pass2_coverage");
|
||||
group.sample_size(10);
|
||||
let p = CoverageParams {
|
||||
n: 256,
|
||||
n_queries: 16,
|
||||
n_clusters: 16,
|
||||
..base
|
||||
};
|
||||
group.bench_function("measure_pass2_small", |b| {
|
||||
b.iter(|| {
|
||||
let r = measure_pass2(black_box(p), black_box(rot_seed));
|
||||
hint::black_box(r.coverage)
|
||||
});
|
||||
});
|
||||
group.finish();
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_compare_cost, bench_topk, bench_pass2_coverage);
|
||||
criterion_main!(benches);
|
||||
|
||||
@@ -0,0 +1,400 @@
|
||||
//! Deterministic, `--no-default-features`-runnable **ANN benchmark measurement**
|
||||
//! for ADR-261 — the single source of truth for the QPS/recall numbers the ADR
|
||||
//! quotes for **linear scan**, **float HNSW**, and **quantized HNSW**.
|
||||
//!
|
||||
//! Both the criterion bench (`benches/ann_bench.rs`) and the in-crate report test
|
||||
//! ([`tests::ann_bench_report`]) call into here, so they can never silently
|
||||
//! measure different things. The numbers in ADR-261 §6 come from running:
|
||||
//!
|
||||
//! ```text
|
||||
//! cd v2 && cargo test -p wifi-densepose-ruvector --no-default-features --release \
|
||||
//! ann_bench_report -- --nocapture
|
||||
//! ```
|
||||
//!
|
||||
//! # What is measured, and the honesty contract
|
||||
//!
|
||||
//! On one fixed planted-cluster fixture (documented dim/N/K/seed), for each
|
||||
//! method we measure:
|
||||
//! - **recall@10** vs the brute-force exact top-10 (the ground truth),
|
||||
//! - **QPS** = queries / total wall-clock query time (warm; build excluded),
|
||||
//! at matched recall operating points found by sweeping `ef` (HNSW) and
|
||||
//! `(ef, rerank)` (quantized).
|
||||
//!
|
||||
//! The reported **ratio** is the claim, not the absolute QPS (which is
|
||||
//! machine-specific). We do **not** tune the quantized path to manufacture a
|
||||
//! win: if at our scale quantized does not beat float HNSW, the report says so
|
||||
//! and the ADR records the honest negative + the expected larger-N crossover.
|
||||
|
||||
use std::collections::HashSet;
|
||||
use std::time::Instant;
|
||||
|
||||
use crate::hnsw::{HnswIndex, HnswParams, Metric};
|
||||
use crate::hnsw_quantized::QuantizedHnswIndex;
|
||||
|
||||
/// SplitMix64 — the crate-wide deterministic PRNG (mirrors `coverage.rs`).
|
||||
#[inline]
|
||||
fn split_mix64(state: &mut u64) -> u64 {
|
||||
*state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
|
||||
let mut z = *state;
|
||||
z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
|
||||
z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
|
||||
z ^ (z >> 31)
|
||||
}
|
||||
#[inline]
|
||||
fn unif01(state: &mut u64) -> f32 {
|
||||
((split_mix64(state) >> 40) as f32) / ((1u64 << 24) as f32)
|
||||
}
|
||||
#[inline]
|
||||
fn gauss(state: &mut u64) -> f32 {
|
||||
let u1 = unif01(state).max(1e-7);
|
||||
let u2 = unif01(state);
|
||||
(-2.0 * u1.ln()).sqrt() * (std::f32::consts::TAU * u2).cos()
|
||||
}
|
||||
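The three helpers above make every fixture value a pure function of its seed. A minimal standalone sketch of that property follows; the generator functions are copied from the listing, while `sample` is a hypothetical helper added only for illustration.

```rust
// SplitMix64 step, copied from the listing above.
fn split_mix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

// Uniform f32 in [0, 1): the top 24 bits fit an f32 mantissa exactly.
fn unif01(state: &mut u64) -> f32 {
    ((split_mix64(state) >> 40) as f32) / ((1u64 << 24) as f32)
}

// Box-Muller transform: two uniforms in, one approximately normal value out.
fn gauss(state: &mut u64) -> f32 {
    let u1 = unif01(state).max(1e-7); // avoid ln(0)
    let u2 = unif01(state);
    (-2.0 * u1.ln()).sqrt() * (std::f32::consts::TAU * u2).cos()
}

// Hypothetical helper (not in the crate): draw `n` values from a fresh seed.
fn sample(seed: u64, n: usize) -> Vec<f32> {
    let mut s = seed;
    (0..n).map(|_| gauss(&mut s)).collect()
}

fn main() {
    let a = sample(42, 4);
    let b = sample(42, 4);
    println!("same seed reproduces the stream: {}", a == b);
}
```

Because the state is passed explicitly, two callers that start from the same seed expression get identical vectors, which is what lets `fixture` and `queries` rebuild the same cluster centres independently.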
|
||||
/// ANN benchmark fixture parameters, documented in the ADR-261 report.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct AnnBenchParams {
|
||||
/// Embedding dimension.
|
||||
pub dim: usize,
|
||||
/// Number of indexed vectors (N).
|
||||
pub n: usize,
|
||||
/// Number of planted clusters (near-neighbour structure).
|
||||
pub clusters: usize,
|
||||
/// Number of queries timed.
|
||||
pub n_queries: usize,
|
||||
/// Top-K.
|
||||
pub k: usize,
|
||||
/// Intra-cluster Gaussian jitter.
|
||||
pub noise: f32,
|
||||
/// Master fixture seed.
|
||||
pub seed: u64,
|
||||
/// Graph construction/level seed.
|
||||
pub graph_seed: u64,
|
||||
/// Rotation seed for the quantized 1-bit codes.
|
||||
pub rot_seed: u64,
|
||||
}
|
||||
|
||||
impl AnnBenchParams {
|
||||
/// The default ADR-261 fixture: AETHER-shape 128-d, planted clusters.
|
||||
pub fn default_fixture(n: usize) -> Self {
|
||||
Self {
|
||||
dim: 128,
|
||||
n,
|
||||
clusters: 64,
|
||||
n_queries: 200,
|
||||
k: 10,
|
||||
noise: 0.35,
|
||||
seed: 0xADADADAD_0000_0261,
|
||||
graph_seed: 0x6261_5247_4148_4E53,
|
||||
rot_seed: 0x5EED_C0DE_1234_5678,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// The fixture vectors for `p` (deterministic planted clusters).
|
||||
pub fn fixture(p: AnnBenchParams) -> Vec<Vec<f32>> {
|
||||
let centres: Vec<Vec<f32>> = (0..p.clusters)
|
||||
.map(|c| {
|
||||
let mut s = p.seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
(0..p.dim).map(|_| gauss(&mut s) * 3.0).collect()
|
||||
})
|
||||
.collect();
|
||||
(0..p.n)
|
||||
.map(|i| {
|
||||
let c = i % p.clusters;
|
||||
let mut s = p.seed ^ (i as u64).wrapping_mul(0x9E37);
|
||||
(0..p.dim)
|
||||
.map(|d| centres[c][d] + gauss(&mut s) * p.noise)
|
||||
.collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// The timed query set for `p` (drawn from the same clusters, disjoint seed).
|
||||
pub fn queries(p: AnnBenchParams) -> Vec<Vec<f32>> {
|
||||
let centres: Vec<Vec<f32>> = (0..p.clusters)
|
||||
.map(|c| {
|
||||
let mut s = p.seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
(0..p.dim).map(|_| gauss(&mut s) * 3.0).collect()
|
||||
})
|
||||
.collect();
|
||||
(0..p.n_queries)
|
||||
.map(|q| {
|
||||
let c = q % p.clusters;
|
||||
let mut s = p.seed ^ 0xDEAD_0000_0000 ^ (q as u64).wrapping_mul(0x2545_F491);
|
||||
(0..p.dim)
|
||||
.map(|d| centres[c][d] + gauss(&mut s) * p.noise)
|
||||
.collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Per-method measurement: recall@K and QPS.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct MethodResult {
|
||||
/// Mean recall@K vs brute-force ground truth.
|
||||
pub recall: f64,
|
||||
/// Queries per second (warm wall-clock).
|
||||
pub qps: f64,
|
||||
/// Mean query latency in microseconds.
|
||||
pub latency_us: f64,
|
||||
}
|
||||
|
||||
/// Ground-truth brute-force top-K id sets for every query (computed once).
|
||||
/// Public so the criterion bench and the report test share one definition.
|
||||
pub fn ground_truth(idx: &HnswIndex, queries: &[Vec<f32>], k: usize) -> Vec<HashSet<u32>> {
|
||||
queries
|
||||
.iter()
|
||||
.map(|q| idx.brute_force(q, k).into_iter().map(|(id, _)| id).collect())
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Measure **linear scan** (brute force): recall is 1.0 by definition; QPS is the
|
||||
/// timed exact scan. This is the no-index baseline.
|
||||
pub fn measure_linear(
|
||||
idx: &HnswIndex,
|
||||
queries: &[Vec<f32>],
|
||||
truth: &[HashSet<u32>],
|
||||
k: usize,
|
||||
) -> MethodResult {
|
||||
let mut recall_acc = 0.0f64;
|
||||
let start = Instant::now();
|
||||
let mut sink = 0u64;
|
||||
for (qi, q) in queries.iter().enumerate() {
|
||||
let got = idx.brute_force(q, k);
|
||||
let hit = got.iter().filter(|(id, _)| truth[qi].contains(id)).count();
|
||||
recall_acc += hit as f64 / k as f64;
|
||||
sink = sink.wrapping_add(got.len() as u64);
|
||||
}
|
||||
let elapsed = start.elapsed().as_secs_f64();
|
||||
std::hint::black_box(sink);
|
||||
MethodResult {
|
||||
recall: recall_acc / queries.len() as f64,
|
||||
qps: queries.len() as f64 / elapsed,
|
||||
latency_us: elapsed / queries.len() as f64 * 1e6,
|
||||
}
|
||||
}
|
||||
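The recall accumulation inside the timed loop can be read as a small pure function. The sketch below isolates it; `recall_at_k` is a hypothetical name, and the ids and distances are made up for illustration.

```rust
use std::collections::HashSet;

/// Fraction of the true top-K ids present in `got` (hypothetical helper).
fn recall_at_k(got: &[(u32, f32)], truth: &HashSet<u32>, k: usize) -> f64 {
    let hit = got.iter().filter(|(id, _)| truth.contains(id)).count();
    hit as f64 / k as f64
}

fn main() {
    // Made-up ground truth and search result for one query.
    let truth: HashSet<u32> = [1u32, 2, 3, 4].iter().copied().collect();
    let got: [(u32, f32); 4] = [(1, 0.1), (2, 0.2), (9, 0.3), (4, 0.4)];
    println!("recall@4 = {}", recall_at_k(&got, &truth, 4));
}
```

Dividing by `k` rather than by `got.len()` means a search that returns fewer than `k` results is penalised, which matches the accumulation in the measurement functions.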
|
||||
/// Measure **float HNSW** at a given beam width `ef`.
|
||||
pub fn measure_float_hnsw(
|
||||
idx: &HnswIndex,
|
||||
queries: &[Vec<f32>],
|
||||
truth: &[HashSet<u32>],
|
||||
k: usize,
|
||||
ef: usize,
|
||||
) -> MethodResult {
|
||||
let mut recall_acc = 0.0f64;
|
||||
let start = Instant::now();
|
||||
let mut sink = 0u64;
|
||||
for (qi, q) in queries.iter().enumerate() {
|
||||
let got = idx.search(q, k, ef);
|
||||
let hit = got.iter().filter(|(id, _)| truth[qi].contains(id)).count();
|
||||
recall_acc += hit as f64 / k as f64;
|
||||
sink = sink.wrapping_add(got.len() as u64);
|
||||
}
|
||||
let elapsed = start.elapsed().as_secs_f64();
|
||||
std::hint::black_box(sink);
|
||||
MethodResult {
|
||||
recall: recall_acc / queries.len() as f64,
|
||||
qps: queries.len() as f64 / elapsed,
|
||||
latency_us: elapsed / queries.len() as f64 * 1e6,
|
||||
}
|
||||
}
|
||||
|
||||
/// Measure **quantized HNSW** at a given `(ef, rerank)`.
|
||||
pub fn measure_quantized_hnsw(
|
||||
qidx: &QuantizedHnswIndex,
|
||||
queries: &[Vec<f32>],
|
||||
truth: &[HashSet<u32>],
|
||||
k: usize,
|
||||
ef: usize,
|
||||
rerank: usize,
|
||||
) -> MethodResult {
|
||||
let mut recall_acc = 0.0f64;
|
||||
let start = Instant::now();
|
||||
let mut sink = 0u64;
|
||||
for (qi, q) in queries.iter().enumerate() {
|
||||
let got = qidx.search_quantized(q, k, ef, rerank);
|
||||
let hit = got.iter().filter(|(id, _)| truth[qi].contains(id)).count();
|
||||
recall_acc += hit as f64 / k as f64;
|
||||
sink = sink.wrapping_add(got.len() as u64);
|
||||
}
|
||||
let elapsed = start.elapsed().as_secs_f64();
|
||||
std::hint::black_box(sink);
|
||||
MethodResult {
|
||||
recall: recall_acc / queries.len() as f64,
|
||||
qps: queries.len() as f64 / elapsed,
|
||||
latency_us: elapsed / queries.len() as f64 * 1e6,
|
||||
}
|
||||
}
|
||||
|
||||
/// Build both indices for `p` (shared insertion order + graph seed so the float
|
||||
/// and quantized graphs are identical — the only variable is scoring).
|
||||
pub fn build_indices(p: AnnBenchParams) -> (HnswIndex, QuantizedHnswIndex, Vec<Vec<f32>>) {
|
||||
let vectors = fixture(p);
|
||||
let params = HnswParams {
|
||||
m: 16,
|
||||
ef_construction: 200,
|
||||
ef_search: 64,
|
||||
seed: p.graph_seed,
|
||||
};
|
||||
let mut float_idx = HnswIndex::new(p.dim, Metric::L2, params);
|
||||
for v in &vectors {
|
||||
float_idx.insert(v);
|
||||
}
|
||||
let quant_idx =
|
||||
QuantizedHnswIndex::build(&vectors, p.dim, Metric::L2, params, p.rot_seed, p.k * 4);
|
||||
(float_idx, quant_idx, vectors)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn fixture_and_queries_are_deterministic() {
|
||||
let p = AnnBenchParams::default_fixture(500);
|
||||
assert_eq!(fixture(p), fixture(p));
|
||||
assert_eq!(queries(p), queries(p));
|
||||
let p2 = AnnBenchParams {
|
||||
seed: p.seed ^ 1,
|
||||
..p
|
||||
};
|
||||
assert_ne!(fixture(p)[0], fixture(p2)[0]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn linear_recall_is_one() {
|
||||
// Linear scan IS the ground truth, so recall must be exactly 1.0.
|
||||
let p = AnnBenchParams::default_fixture(800);
|
||||
let (float_idx, _q, _v) = build_indices(p);
|
||||
let qs = queries(p);
|
||||
let truth = ground_truth(&float_idx, &qs, p.k);
|
||||
let r = measure_linear(&float_idx, &qs, &truth, p.k);
|
||||
assert!((r.recall - 1.0).abs() < 1e-9, "linear recall {} != 1.0", r.recall);
|
||||
assert!(r.qps > 0.0);
|
||||
}
|
||||
|
||||
/// The ADR-261 measurement report. Prints the linear / float-HNSW /
|
||||
/// quantized-HNSW recall@10 + QPS table and the QPS ratios at matched recall.
|
||||
/// Run with `--release --nocapture` for the numbers the ADR quotes.
|
||||
#[test]
|
||||
fn ann_bench_report() {
|
||||
// N defaults to 10_000, the fixture size ADR-261 quotes; set `ANN_BENCH_N`
|
||||
// to a smaller value for a faster debug run. The ADR's headline numbers are
|
||||
// taken under --release (documented in the ADR with the exact command). This
|
||||
// test asserts only structural invariants so it is gate-safe at any N.
|
||||
let n: usize = std::env::var("ANN_BENCH_N")
|
||||
.ok()
|
||||
.and_then(|s| s.parse().ok())
|
||||
.unwrap_or(10_000);
|
||||
let p = AnnBenchParams::default_fixture(n);
|
||||
let (float_idx, quant_idx, _v) = build_indices(p);
|
||||
let qs = queries(p);
|
||||
let truth = ground_truth(&float_idx, &qs, p.k);
|
||||
|
||||
println!("\n=== ADR-261 ANN benchmark (planted-cluster synthetic) ===");
|
||||
println!(
|
||||
"dim={} N={} clusters={} queries={} K={} noise={} graph_seed=0x{:X} rot_seed=0x{:X}",
|
||||
p.dim, p.n, p.clusters, p.n_queries, p.k, p.noise, p.graph_seed, p.rot_seed
|
||||
);
|
||||
println!("metric=L2 M=16 ef_construction=200 (debug build unless --release)");
|
||||
println!(
|
||||
"{:<28} {:>9} {:>12} {:>12}",
|
||||
"method", "recall@10", "QPS", "lat(us)"
|
||||
);
|
||||
|
||||
let lin = measure_linear(&float_idx, &qs, &truth, p.k);
|
||||
println!(
|
||||
"{:<28} {:>8.4} {:>12.1} {:>12.1}",
|
||||
"linear scan (brute)", lin.recall, lin.qps, lin.latency_us
|
||||
);
|
||||
|
||||
// Float HNSW across an ef sweep.
|
||||
let mut float_ops: Vec<(usize, MethodResult)> = Vec::new();
|
||||
for &ef in &[16usize, 32, 64, 128, 256] {
|
||||
let r = measure_float_hnsw(&float_idx, &qs, &truth, p.k, ef);
|
||||
println!(
|
||||
"{:<28} {:>8.4} {:>12.1} {:>12.1}",
|
||||
format!("float-HNSW ef={ef}"),
|
||||
r.recall,
|
||||
r.qps,
|
||||
r.latency_us
|
||||
);
|
||||
float_ops.push((ef, r));
|
||||
}
|
||||
|
||||
// Quantized HNSW across (ef, rerank) sweep.
|
||||
let mut quant_ops: Vec<((usize, usize), MethodResult)> = Vec::new();
|
||||
for &ef in &[32usize, 64, 128, 256] {
|
||||
for &rr in &[p.k * 2, p.k * 5, p.k * 10] {
|
||||
let r = measure_quantized_hnsw(&quant_idx, &qs, &truth, p.k, ef, rr);
|
||||
println!(
|
||||
"{:<28} {:>8.4} {:>12.1} {:>12.1}",
|
||||
format!("quant-HNSW ef={ef} rr={rr}"),
|
||||
r.recall,
|
||||
r.qps,
|
||||
r.latency_us
|
||||
);
|
||||
quant_ops.push(((ef, rr), r));
|
||||
}
|
||||
}
|
||||
|
||||
// Equal-recall comparison: pick, for a target recall, the FASTEST op of
|
||||
// each method that meets it, then report the QPS ratios.
|
||||
println!("\n--- equal-recall QPS ratios ---");
|
||||
for &target in &[0.90f64, 0.95, 0.99] {
|
||||
let best_float = float_ops
|
||||
.iter()
|
||||
.filter(|(_, r)| r.recall >= target)
|
||||
.max_by(|a, b| a.1.qps.total_cmp(&b.1.qps)); // total_cmp cannot panic on NaN
|
||||
let best_quant = quant_ops
|
||||
.iter()
|
||||
.filter(|(_, r)| r.recall >= target)
|
||||
.max_by(|a, b| a.1.qps.total_cmp(&b.1.qps)); // total_cmp cannot panic on NaN
|
||||
match (best_float, best_quant) {
|
||||
(Some((fef, fr)), Some(((qef, qrr), qr))) => {
|
||||
let ratio = qr.qps / fr.qps;
|
||||
let hnsw_vs_lin = fr.qps / lin.qps;
|
||||
println!(
|
||||
"recall>={:.2}: float ef={} {:.0} QPS | quant ef={} rr={} {:.0} QPS | quant/float={:.2}x | float/linear={:.2}x",
|
||||
target, fef, fr.qps, qef, qrr, qr.qps, ratio, hnsw_vs_lin
|
||||
);
|
||||
}
|
||||
(Some((fef, fr)), None) => {
|
||||
let hnsw_vs_lin = fr.qps / lin.qps;
|
||||
println!(
|
||||
"recall>={:.2}: float ef={} {:.0} QPS | quant: NO op met this recall | float/linear={:.2}x",
|
||||
target, fef, fr.qps, hnsw_vs_lin
|
||||
);
|
||||
}
|
||||
_ => {
|
||||
println!("recall>={:.2}: neither method met this recall at the swept ops", target);
|
||||
}
|
||||
}
|
||||
}
|
||||
println!("=========================================================\n");
|
||||
|
||||
// Structural assertions (gate-safe, any N):
|
||||
// - linear scan is exact,
|
||||
// - the best float-HNSW op clears the correctness gate,
|
||||
// - quantized's best op is at least useful (recall well above random).
|
||||
assert!((lin.recall - 1.0).abs() < 1e-9);
|
||||
let best_float_recall = float_ops.iter().map(|(_, r)| r.recall).fold(0.0, f64::max);
|
||||
assert!(
|
||||
best_float_recall >= 0.95,
|
||||
"best float-HNSW recall {best_float_recall:.4} below 0.95 gate"
|
||||
);
|
||||
let best_quant_recall = quant_ops.iter().map(|(_, r)| r.recall).fold(0.0, f64::max);
|
||||
// Honest floor: the 1-bit Hamming traversal is a COARSE angle proxy, so
|
||||
// at large N its best recall lands well below the float gate (MEASURED
|
||||
// ~0.74 at N=10k — see ADR-261 §6). We assert only that it is clearly
|
||||
// useful (>> random: random top-10 of N=10k is ~0.001), which catches a
|
||||
// fully-broken traversal/rerank without pretending the quantized variant
|
||||
// matches float HNSW. The honest negative IS the result.
|
||||
assert!(
|
||||
best_quant_recall >= 0.30,
|
||||
"best quant-HNSW recall {best_quant_recall:.4} below the 0.30 not-broken floor"
|
||||
);
|
||||
}
|
||||
}
|
||||
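The equal-recall comparison in `ann_bench_report` reduces to "filter by recall, then take the fastest". A minimal sketch of that selection follows. The `Op` struct and `fastest_at` are hypothetical names, and every number is invented for illustration; none of them is a measured result.

```rust
/// One swept operating point (hypothetical struct for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Op {
    ef: usize,
    recall: f64,
    qps: f64,
}

/// Fastest operating point whose recall meets `target`, if any.
fn fastest_at(ops: &[Op], target: f64) -> Option<Op> {
    ops.iter()
        .copied()
        .filter(|o| o.recall >= target)
        .max_by(|a, b| a.qps.total_cmp(&b.qps))
}

fn main() {
    // Invented values, NOT measurements.
    let float_ops = [
        Op { ef: 16, recall: 0.91, qps: 9000.0 },
        Op { ef: 64, recall: 0.97, qps: 5000.0 },
        Op { ef: 256, recall: 0.995, qps: 1500.0 },
    ];
    let quant_ops = [
        Op { ef: 64, recall: 0.50, qps: 7000.0 },
        Op { ef: 256, recall: 0.92, qps: 2000.0 },
    ];
    for &target in &[0.90f64, 0.95, 0.99] {
        match (fastest_at(&float_ops, target), fastest_at(&quant_ops, target)) {
            (Some(f), Some(q)) => {
                println!("recall>={:.2}: quant/float = {:.2}x", target, q.qps / f.qps)
            }
            (Some(f), None) => {
                println!("recall>={:.2}: float ef={} only; quant never reaches it", target, f.ef)
            }
            _ => println!("recall>={:.2}: no float op reaches it", target),
        }
    }
}
```

Comparing only points that clear the same recall target is what keeps the ratio honest: a faster but less accurate operating point never counts as a win.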
@@ -0,0 +1,602 @@
|
||||
//! Deterministic top-K **coverage** harness for the RaBitQ sketch
|
||||
//! (ADR-084 acceptance bar / ADR-156 §8 Pass-2 measurement).
|
||||
//!
|
||||
//! Single source of truth for the coverage number quoted in ADR-084 and
|
||||
//! ADR-156: both the in-crate regression test (`pass2_coverage_not_worse_…`)
|
||||
//! and the criterion bench (`benches/sketch_bench.rs`) call into here, so they
|
||||
//! can never silently measure different things.
|
||||
//!
|
||||
//! **Coverage** is defined exactly as in ADR-084:
|
||||
//!
|
||||
//! > the Top-K candidate set chosen by the sketch must contain **≥ 90%** of the
|
||||
//! > candidates the full-float pass would have picked.
|
||||
//!
|
||||
//! i.e. `coverage = |sketch_topK ∩ float_topK| / K`, averaged over a set of
|
||||
//! queries. The float top-K (ranked by cosine distance, see `cosine_distance`) is the
|
||||
//! ground truth; the sketch top-K is a *candidate* set, so in practice a system
|
||||
//! over-fetches `C ≥ K` sketch candidates and refines. We measure at
|
||||
//! `candidate_k == K` (the strict bar) by default; the bench also reports an
|
||||
//! over-fetch curve.
|
||||
//!
|
||||
//! # The synthetic distribution — and why it is *anisotropic*
|
||||
//!
|
||||
//! Pure 1-bit sign quantization (Pass 1) is near-optimal on **isotropic,
|
||||
//! zero-centred** embeddings — on such data a rotation barely moves the number,
|
||||
//! so testing rotation there proves nothing. ADR-084's "Open questions" and
|
||||
//! ADR-156 §8 both flag the *anisotropic / correlated* case (skewed CSI
|
||||
//! spectrogram embeddings) as exactly where the rotation is supposed to earn
|
||||
//! its keep. So [`make_anisotropic_embedding`] deliberately builds **correlated,
|
||||
//! axis-aligned, non-isotropic** vectors: a few dominant low-frequency factors
|
||||
//! shared across many coordinates (heavy coordinate correlation) plus a small
|
||||
//! per-dim offset that biases signs — the structure that defeats raw
|
||||
//! sign-quantization and that a randomized rotation is designed to fix. Every
|
||||
//! value derives from a seed via SplitMix64, so the whole harness is
|
||||
//! reproducible bit-for-bit.
|
||||
|
||||
use crate::estimator::EstimatorBank;
|
||||
use crate::{Rotation, SketchBank};
|
||||
|
||||
/// SplitMix64 step — reproducible PRNG for fixture generation (dependency-free).
|
||||
#[inline]
|
||||
fn split_mix64(state: &mut u64) -> u64 {
|
||||
*state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
|
||||
let mut z = *state;
|
||||
z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
|
||||
z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
|
||||
z ^ (z >> 31)
|
||||
}
|
||||
|
||||
/// A uniform `f32` in `[0, 1)` from the PRNG state.
|
||||
#[inline]
|
||||
fn unif01(state: &mut u64) -> f32 {
|
||||
let r = split_mix64(state);
|
||||
// top 24 bits → [0,1)
|
||||
((r >> 40) as f32) / ((1u64 << 24) as f32)
|
||||
}
|
||||
|
||||
/// A standard-normal-ish `f32` via Box–Muller from two uniforms. Deterministic.
|
||||
#[inline]
|
||||
fn gauss(state: &mut u64) -> f32 {
|
||||
let u1 = unif01(state).max(1e-7); // avoid log(0)
|
||||
let u2 = unif01(state);
|
||||
(-2.0 * u1.ln()).sqrt() * (std::f32::consts::TAU * u2).cos()
|
||||
}
|
||||
|
||||
/// Fixed **anisotropic axis scale** for coordinate `i` of `dim`.
|
||||
///
|
||||
/// A learned embedding space is not isotropic: a handful of axes carry most of
|
||||
/// the variance and the rest are near-flat. We model that with a smoothly
|
||||
/// decaying per-axis scale (≈5.4× amplitude spread between the most- and least-energetic
|
||||
/// axes). This axis-aligned imbalance is exactly what a 1-bit sign sketch
|
||||
/// handles poorly (the low-variance axes' sign bits are noise) and exactly what
|
||||
/// a randomized rotation re-balances (it spreads the variance across all axes so
|
||||
/// every sign bit carries comparable information). The scale depends only on the
|
||||
/// coordinate index, so it is the *same fixed geometry* for every vector.
|
||||
#[inline]
|
||||
fn axis_scale(i: usize, dim: usize) -> f32 {
|
||||
let t = i as f32 / dim.max(1) as f32;
|
||||
// The exp term decays from 3.0 to ~0.3 (10×); with the 0.3 floor the scale runs ~3.3 → ~0.6 (≈5.4×).
|
||||
3.0 * (-2.3 * t).exp() + 0.3
|
||||
}
|
||||
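To make the size of the anisotropy concrete, the sketch below evaluates the scale at the first and last coordinate of a 128-d vector. `axis_scale` is copied from the listing; the `main` driver is an illustrative addition.

```rust
// Copied from the listing above: smooth exponential decay plus a constant floor.
fn axis_scale(i: usize, dim: usize) -> f32 {
    let t = i as f32 / dim.max(1) as f32;
    3.0 * (-2.3 * t).exp() + 0.3
}

fn main() {
    let dim = 128;
    let first = axis_scale(0, dim);
    let last = axis_scale(dim - 1, dim);
    println!(
        "first = {:.3}, last = {:.3}, amplitude ratio = {:.2}",
        first,
        last,
        first / last
    );
}
```

The decaying term alone spans about 10×, but the constant 0.3 floor brings the end-to-end amplitude ratio to roughly 5.4× (about 30× in variance, since variance scales with the square).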
|
||||
/// Build the **planted-cluster** fixture: `n_clusters` random centres in the
|
||||
/// anisotropic space. Returned as raw centres (pre-scale); callers add scale +
|
||||
/// intra-cluster noise. Deterministic from `seed`.
|
||||
fn cluster_centres(dim: usize, n_clusters: usize, seed: u64) -> Vec<Vec<f32>> {
|
||||
(0..n_clusters)
|
||||
.map(|c| {
|
||||
let mut s = seed ^ 0xC0FFEE_u64.wrapping_mul(c as u64 + 1);
|
||||
(0..dim).map(|_| gauss(&mut s)).collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// One embedding = its cluster centre + small intra-cluster noise, then the
|
||||
/// fixed anisotropic axis scale, then a small off-centre bias. This makes the
|
||||
/// **cosine top-K meaningful** (same-cluster members are genuine near-neighbours,
|
||||
/// not random-noise ties), while keeping the space anisotropic so the rotation
|
||||
/// has something real to fix.
|
||||
fn realize(centre: &[f32], dim: usize, noise: f32, vec_seed: u64) -> Vec<f32> {
|
||||
let mut s = vec_seed ^ 0x5151_5151_5151_5151;
|
||||
(0..dim)
|
||||
.map(|i| {
|
||||
let jitter = gauss(&mut s) * noise;
|
||||
let bias = ((i % 11) as f32 - 5.0) * 0.05;
|
||||
axis_scale(i, dim) * (centre[i] + jitter) + bias
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Cosine distance `1 - cos(a,b)` — the metric a sign sketch approximates
|
||||
/// (hamming over sign bits is a monotone estimate of the angle between vectors).
|
||||
/// This is the correct full-float ground truth for top-K *coverage*: the sketch
|
||||
/// is an angular sensor, so we grade it against the angular full-float ranking,
|
||||
/// per ADR-084's `float_cosine` baseline.
|
||||
#[inline]
|
||||
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
|
||||
let mut dot = 0.0f32;
|
||||
let mut na = 0.0f32;
|
||||
let mut nb = 0.0f32;
|
||||
for (&x, &y) in a.iter().zip(b.iter()) {
|
||||
dot += x * y;
|
||||
na += x * x;
|
||||
nb += y * y;
|
||||
}
|
||||
let denom = (na * nb).sqrt();
|
||||
if denom < f32::EPSILON {
|
||||
1.0
|
||||
} else {
|
||||
1.0 - dot / denom
|
||||
}
|
||||
}
|
||||
|
||||
/// Full-float cosine top-K ids (ground truth), ascending by cosine distance.
|
||||
fn float_topk(bank: &[Vec<f32>], query: &[f32], k: usize) -> Vec<u32> {
|
||||
let mut scored: Vec<(u32, f32)> = bank
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(i, v)| (i as u32, cosine_distance(query, v)))
|
||||
.collect();
|
||||
scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(std::cmp::Ordering::Equal));
|
||||
scored.truncate(k);
|
||||
scored.into_iter().map(|(id, _)| id).collect()
|
||||
}
|
||||
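The coverage definition `|sketch_topK ∩ float_topK| / K` can be exercised on a toy bank. In the sketch below, `cosine_distance` follows the listing, `topk_ids` mirrors `float_topk`, and `coverage` is a hypothetical helper; the four 2-d vectors and the candidate set are made up.

```rust
use std::collections::HashSet;

// Cosine distance 1 - cos(a, b), as in the listing above.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f32, 0.0f32, 0.0f32);
    for (&x, &y) in a.iter().zip(b.iter()) {
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    let denom = (na * nb).sqrt();
    if denom < f32::EPSILON { 1.0 } else { 1.0 - dot / denom }
}

// Exact top-k ids by ascending cosine distance (mirrors `float_topk`).
fn topk_ids(bank: &[Vec<f32>], query: &[f32], k: usize) -> Vec<u32> {
    let mut scored: Vec<(u32, f32)> = bank
        .iter()
        .enumerate()
        .map(|(i, v)| (i as u32, cosine_distance(query, v)))
        .collect();
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.truncate(k);
    scored.into_iter().map(|(id, _)| id).collect()
}

// Hypothetical helper: fraction of the true top-k found in the candidate set.
fn coverage(truth: &[u32], candidates: &[u32], k: usize) -> f64 {
    let cand: HashSet<u32> = candidates.iter().copied().collect();
    truth.iter().filter(|id| cand.contains(*id)).count() as f64 / k as f64
}

fn main() {
    let bank = vec![
        vec![1.0, 0.0],
        vec![0.9, 0.1],
        vec![0.0, 1.0],
        vec![-1.0, 0.0],
    ];
    let truth = topk_ids(&bank, &[1.0, 0.0], 2);
    // Pretend a sketch returned ids 0 and 2 as its candidates.
    println!("truth = {:?}, coverage = {}", truth, coverage(&truth, &[0, 2], 2));
}
```

Passing a candidate list longer than `k` models the over-fetch case: the denominator stays `k`, so coverage can only rise as more candidates are fetched.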
|
||||
/// Parameters for a coverage measurement, documented in the report.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct CoverageParams {
|
||||
/// Embedding dimension.
|
||||
pub dim: usize,
|
||||
/// Number of stored vectors in the bank (N).
|
||||
pub n: usize,
|
||||
/// Number of distinct query vectors averaged over.
|
||||
pub n_queries: usize,
|
||||
/// True top-K size (the bar's K).
|
||||
pub k: usize,
|
||||
/// Sketch candidate-set size to compare against the float top-K. Equal to
|
||||
/// `k` for the strict ADR-084 bar; `> k` models over-fetch + refine.
|
||||
pub candidate_k: usize,
|
||||
/// Number of planted clusters. Same-cluster vectors are genuine near
|
||||
/// neighbours, so the cosine top-K is *meaningful* (not random-noise ties).
|
||||
pub n_clusters: usize,
|
||||
/// Intra-cluster Gaussian jitter (relative to unit-variance centres). Small
|
||||
/// jitter → tight, easily-recovered clusters; larger → harder top-K.
|
||||
pub noise: f32,
|
||||
/// Master seed (the whole fixture derives from this).
|
||||
pub seed: u64,
|
||||
}
|
||||
|
||||
impl CoverageParams {
|
||||
/// The canonical AETHER-shape fixture used for the ADR-quoted numbers:
|
||||
/// 128-d, planted clusters, modest intra-cluster jitter. Override fields
|
||||
/// with struct-update syntax (`CoverageParams { candidate_k: 32, ..base }`).
|
||||
pub fn aether_default(seed: u64) -> Self {
|
||||
Self {
|
||||
dim: 128,
|
||||
n: 2048,
|
||||
n_queries: 128,
|
||||
k: 8,
|
||||
candidate_k: 8,
|
||||
n_clusters: 64,
|
||||
noise: 0.35,
|
||||
seed,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Result of a coverage measurement.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct CoverageResult {
|
||||
/// Mean coverage in `[0, 1]` (fraction of float top-K found in the sketch
|
||||
/// candidate set), averaged over queries.
|
||||
pub coverage: f64,
|
||||
}
|
||||
|
||||
/// Measure mean top-K coverage of the **Pass-1** (no rotation) sketch against
|
||||
/// the full-float top-K, on the anisotropic synthetic distribution.
|
||||
pub fn measure_pass1(p: CoverageParams) -> CoverageResult {
|
||||
measure_inner(p, None)
|
||||
}
|
||||
|
||||
/// Measure mean top-K coverage of the **Pass-2** (rotated) sketch against the
|
||||
/// full-float top-K, on the anisotropic synthetic distribution. `rotation_seed`
|
||||
/// fixes the rotation (index and query share it — that is the contract).
|
||||
pub fn measure_pass2(p: CoverageParams, rotation_seed: u64) -> CoverageResult {
|
||||
let rot = Rotation::new(rotation_seed, p.dim);
|
||||
measure_inner(p, Some(rot))
|
||||
}
|
||||
|
||||
/// Measure mean top-K coverage of the **RaBitQ unbiased estimator** rerank
|
||||
/// (ADR-156 Milestone-2) against the full-float top-K, on the **same**
|
||||
/// anisotropic synthetic fixture and query stream as [`measure_pass1`] /
|
||||
/// [`measure_pass2`].
|
||||
///
|
||||
/// This is the whole point of Milestone-2: instead of ranking candidates by
|
||||
/// raw Hamming over sign bits ([`measure_pass2`]), rank them by the RaBitQ
|
||||
/// *unbiased distance estimate* recovered from the 1-bit code + per-vector side
|
||||
/// info ([`crate::estimator`]). `rotation_seed` fixes the rotation (index and
|
||||
/// query share it). The fixture, cluster centres, query draws, and ground-truth
|
||||
/// cosine top-K are **bit-identical** to `measure_pass2`, so the only variable
|
||||
/// is sign-Hamming vs estimator-rerank — an honest apples-to-apples coverage
|
||||
/// comparison.
|
||||
pub fn measure_estimator(p: CoverageParams, rotation_seed: u64) -> CoverageResult {
|
||||
// Cosine ground truth ⇒ rerank by the estimated COSINE key (the angular
|
||||
// sensor's natural metric). See `measure_estimator_euclidean` for the
|
||||
// squared-euclidean key, reported alongside for honesty.
|
||||
measure_estimator_inner(p, rotation_seed, EstimatorRank::Cosine)
|
||||
}
|
||||
|
||||
/// Same as [`measure_estimator`] but reranks by the estimated **squared
|
||||
/// euclidean** distance key instead of cosine. Reported alongside the cosine
|
||||
/// rerank so the ADR shows both honestly: against a *cosine* ground truth, the
|
||||
/// cosine key is the apples-to-apples comparison to sign-Hamming (also angular),
|
||||
/// while the euclidean key mixes in residual-norm and generally ranks worse here.
|
||||
pub fn measure_estimator_euclidean(p: CoverageParams, rotation_seed: u64) -> CoverageResult {
|
||||
measure_estimator_inner(p, rotation_seed, EstimatorRank::Euclidean)
|
||||
}
|
||||
|
||||
#[derive(Clone, Copy)]
|
||||
enum EstimatorRank {
|
||||
Cosine,
|
||||
Euclidean,
|
||||
}
|
||||
|
||||
fn measure_estimator_inner(
|
||||
p: CoverageParams,
|
||||
rotation_seed: u64,
|
||||
rank: EstimatorRank,
|
||||
) -> CoverageResult {
|
||||
let rot = Rotation::new(rotation_seed, p.dim);
|
||||
let float_bank = make_fixture(p);
|
||||
let centres = cluster_centres(p.dim, p.n_clusters.max(1), p.seed);
|
||||
|
||||
// Estimator bank over the SAME fixture vectors.
|
||||
let mut bank = EstimatorBank::new(rot);
|
||||
for (i, v) in float_bank.iter().enumerate() {
|
||||
bank.insert_embedding(i as u32, v);
|
||||
}
|
||||
|
||||
let mut total = 0.0f64;
|
||||
for q in 0..p.n_queries {
|
||||
// IDENTICAL query draw to measure_inner (same seed expression).
|
||||
let c = q % p.n_clusters.max(1);
|
||||
let qv = realize(
|
||||
&centres[c],
|
||||
p.dim,
|
||||
p.noise,
|
||||
p.seed ^ 0xDEAD_0000_0000 ^ (q as u64).wrapping_mul(0x2545_F491),
|
||||
);
|
||||
let truth = float_topk(&float_bank, &qv, p.k);
|
||||
let cand = match rank {
|
||||
EstimatorRank::Cosine => bank.topk_estimated_cosine(&qv, p.candidate_k),
|
||||
EstimatorRank::Euclidean => bank.topk_estimated(&qv, p.candidate_k),
|
||||
};
|
||||
let cand_ids: std::collections::HashSet<u32> = cand.into_iter().map(|(id, _)| id).collect();
|
||||
let hit = truth.iter().filter(|id| cand_ids.contains(*id)).count();
|
||||
total += hit as f64 / p.k as f64;
|
||||
}
|
||||
CoverageResult {
|
||||
coverage: total / p.n_queries as f64,
|
||||
}
|
||||
}
|
||||
|
||||
/// Measure mean top-K coverage of a **multi-bit (Pass-3)** rotated sketch:
|
||||
/// `bits` bits per dimension instead of 1, ranked by L1 distance over the
|
||||
/// per-dim codes (the natural multi-bit generalization of hamming). This is the
|
||||
/// "Multi-bit / Extended RaBitQ" half of ADR-156 §8 — measured here as an
|
||||
/// experiment to decide whether a full `MultiBitSketch` type is worth building.
|
||||
///
|
||||
/// Quantization: rotate (Pass-2 frame), then map each rotated coordinate through
|
||||
/// a uniform mid-rise scalar quantizer with `2^bits` levels over a fixed
|
||||
/// symmetric range `[-RANGE, RANGE]` (RANGE chosen from the rotated-coord scale).
|
||||
/// `bits == 1` reduces to sign-quantization (sanity: should match Pass-2 within
|
||||
/// quantizer-boundary noise). Memory cost is `bits×` the 1-bit sketch.
|
||||
///
|
||||
/// Returns the measured coverage; the caller reports the bit/coverage tradeoff.
|
||||
pub fn measure_multibit(p: CoverageParams, rotation_seed: u64, bits: u32) -> CoverageResult {
|
||||
assert!((1..=8).contains(&bits), "bits must be in 1..=8");
|
||||
let rot = Rotation::new(rotation_seed, p.dim);
|
||||
let levels = 1u32 << bits; // 2^bits codes per dim
|
||||
// Rotated AETHER-shape coords after the normalized FHT sit roughly in
|
||||
// [-RANGE, RANGE]; clamp out-of-range to the end codes. RANGE picked to
|
||||
// cover ~99% of the rotated-coord magnitude on this fixture (empirically
|
||||
// ~3.0 after the 1/√m normalization).
|
||||
const RANGE: f32 = 3.0;
|
||||
let quantize = move |v: &[f32]| -> Vec<u16> {
|
||||
rot.apply(v)
|
||||
.iter()
|
||||
.map(|&x| {
|
||||
let t = ((x + RANGE) / (2.0 * RANGE)).clamp(0.0, 1.0); // → [0,1]
|
||||
let code = (t * (levels - 1) as f32).round() as u32;
|
||||
code.min(levels - 1) as u16
|
||||
})
|
||||
.collect()
|
||||
};
|
||||
// L1 distance over per-dim codes.
|
||||
let l1 = |a: &[u16], b: &[u16]| -> u32 {
|
||||
a.iter()
|
||||
.zip(b)
|
||||
.map(|(&x, &y)| (x as i32 - y as i32).unsigned_abs())
|
||||
.sum()
|
||||
};
|
||||
|
||||
let float_bank = make_fixture(p);
|
||||
let centres = cluster_centres(p.dim, p.n_clusters.max(1), p.seed);
|
||||
let coded_bank: Vec<Vec<u16>> = float_bank.iter().map(|v| quantize(v)).collect();
|
||||
|
||||
let mut total = 0.0f64;
|
||||
for q in 0..p.n_queries {
|
||||
let c = q % p.n_clusters.max(1);
|
||||
let qv = realize(
|
||||
&centres[c],
|
||||
p.dim,
|
||||
p.noise,
|
||||
p.seed ^ 0xDEAD_0000_0000 ^ (q as u64).wrapping_mul(0x2545_F491),
|
||||
);
|
||||
let truth = float_topk(&float_bank, &qv, p.k);
|
||||
let qc = quantize(&qv);
|
||||
// top candidate_k by L1 over codes.
|
||||
let mut scored: Vec<(u32, u32)> = coded_bank
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(i, code)| (i as u32, l1(&qc, code)))
|
||||
.collect();
|
||||
scored.sort_by_key(|&(_, d)| d);
|
||||
scored.truncate(p.candidate_k);
|
||||
let cand_ids: std::collections::HashSet<u32> =
|
||||
scored.into_iter().map(|(id, _)| id).collect();
|
||||
let hit = truth.iter().filter(|id| cand_ids.contains(*id)).count();
|
||||
total += hit as f64 / p.k as f64;
|
||||
}
|
||||
CoverageResult {
|
||||
coverage: total / p.n_queries as f64,
|
||||
}
|
||||
}
|
||||
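The per-coordinate quantizer and the L1 ranking used by `measure_multibit` can be tried in isolation. The sketch below pulls the closure bodies out into free functions; the names `quantize_coord` and `l1` and the explicit `range` parameter are assumptions made for this example, and the rotation step is omitted.

```rust
// Map one coordinate to a code in 0..2^bits over the clamped range [-range, range].
fn quantize_coord(x: f32, bits: u32, range: f32) -> u16 {
    let levels = 1u32 << bits;
    let t = ((x + range) / (2.0 * range)).clamp(0.0, 1.0); // normalise to [0, 1]
    let code = (t * (levels - 1) as f32).round() as u32;
    code.min(levels - 1) as u16
}

// L1 distance over per-dimension codes (reduces to Hamming distance at 1 bit).
fn l1(a: &[u16], b: &[u16]) -> u32 {
    a.iter()
        .zip(b)
        .map(|(&x, &y)| (x as i32 - y as i32).unsigned_abs())
        .sum()
}

fn main() {
    let v = [-5.0f32, -0.1, 0.1, 5.0];
    let two_bit: Vec<u16> = v.iter().map(|&x| quantize_coord(x, 2, 3.0)).collect();
    let one_bit: Vec<u16> = v.iter().map(|&x| quantize_coord(x, 1, 3.0)).collect();
    println!("2-bit codes: {:?}", two_bit);
    println!("1-bit codes: {:?}", one_bit);
    println!("L1 between them: {}", l1(&two_bit, &one_bit));
}
```

At `bits == 1` the code is 1 exactly when the coordinate is non-negative, so the quantizer degenerates to a sign bit, which is the sanity check the doc comment describes.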
|
||||
/// Build the deterministic float bank for `p`: `p.n` vectors, each assigned to
|
||||
/// one of `p.n_clusters` planted clusters (round-robin), realized as
|
||||
/// `centre + jitter` under the fixed anisotropic axis scale. Only the vectors
|
||||
/// are returned; vector `i` belongs to cluster `i % p.n_clusters`, so queries can be drawn from the same clusters.
|
||||
pub fn make_fixture(p: CoverageParams) -> Vec<Vec<f32>> {
|
||||
let centres = cluster_centres(p.dim, p.n_clusters.max(1), p.seed);
|
||||
(0..p.n)
|
||||
.map(|i| {
|
||||
let c = i % p.n_clusters.max(1);
|
||||
realize(&centres[c], p.dim, p.noise, p.seed ^ (i as u64).wrapping_mul(0x9E37))
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn measure_inner(p: CoverageParams, rotation: Option<Rotation>) -> CoverageResult {
|
||||
const SV: u16 = 1;
|
||||
// Float bank (ground truth) + sketch bank from the SAME vectors, so the
|
||||
// only variable is float-vs-sketch (and Pass-1-vs-Pass-2).
|
||||
let float_bank = make_fixture(p);
|
||||
let centres = cluster_centres(p.dim, p.n_clusters.max(1), p.seed);
|
||||
|
||||
let mut bank = match &rotation {
|
||||
Some(r) => SketchBank::with_rotation(r.clone()),
|
||||
None => SketchBank::new(),
|
||||
};
|
||||
for (i, v) in float_bank.iter().enumerate() {
|
||||
// Use the bank's rotation policy for both Pass-1 and Pass-2 uniformly.
|
||||
bank.insert_embedding(i as u32, v, SV)
|
||||
.expect("schema-locked insert");
|
||||
}
|
||||
|
||||
let mut total = 0.0f64;
|
||||
for q in 0..p.n_queries {
|
||||
// Each query is a fresh draw from a planted cluster (disjoint seed
|
||||
// range from the bank), so it HAS genuine same-cluster neighbours in
|
||||
// the bank — a meaningful top-K, not random-noise ties.
|
||||
let c = q % p.n_clusters.max(1);
|
||||
let qv = realize(
|
||||
&centres[c],
|
||||
p.dim,
|
||||
p.noise,
|
||||
p.seed ^ 0xDEAD_0000_0000 ^ (q as u64).wrapping_mul(0x2545_F491),
|
||||
);
|
||||
let truth = float_topk(&float_bank, &qv, p.k);
|
||||
let cand = bank
|
||||
.topk_embedding(&qv, SV, p.candidate_k)
|
||||
.expect("schema match");
|
||||
let cand_ids: std::collections::HashSet<u32> = cand.into_iter().map(|(id, _)| id).collect();
|
||||
let hit = truth.iter().filter(|id| cand_ids.contains(id)).count();
|
||||
total += hit as f64 / p.k as f64;
|
||||
}
|
||||
CoverageResult {
|
||||
coverage: total / p.n_queries as f64,
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn tight_clusters_give_high_coverage_with_overfetch() {
|
||||
// Sanity / regression: on tight clusters with enough over-fetch the
|
||||
// sketch MUST recover essentially all of the float cosine top-K — this
|
||||
// both proves the harness is correct (a broken topk gives ~random here)
|
||||
// and pins the cluster structure as meaningful. Catches the heap
|
||||
// inversion bug found during this work (which made this ~6%).
|
||||
let p = CoverageParams {
|
||||
n: 1024,
|
||||
n_queries: 64,
|
||||
n_clusters: 64,
|
||||
noise: 0.1,
|
||||
candidate_k: 64,
|
||||
..CoverageParams::aether_default(0x1111)
|
||||
};
|
||||
let cov = measure_pass1(p).coverage;
|
||||
assert!(
|
||||
cov > 0.95,
|
||||
"tight clusters + 8× over-fetch should recover >95% of top-K, got {:.3}",
|
||||
cov
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn multibit_tradeoff_report() {
|
||||
// ADR-156 §8 "Multi-bit / Extended RaBitQ" measurement: bit/coverage
|
||||
// tradeoff at the STRICT bar (candidate_k == K). Reports b=1..4 bits
|
||||
// per dim alongside Pass-1 / Pass-2 (1-bit) baselines. Run with
|
||||
// --nocapture to see the table.
|
||||
let base = CoverageParams::aether_default(0xAD00_0084);
|
||||
let rot_seed = 0x5EED_C0DE_1234_5678u64;
|
||||
let p1 = measure_pass1(base).coverage;
|
||||
let p2 = measure_pass2(base, rot_seed).coverage;
|
||||
println!("\n=== ADR-156 §8 multi-bit tradeoff (strict candidate_k=K={}) ===", base.k);
|
||||
println!("dim={} N={} clusters={} noise={} bar=90%", base.dim, base.n, base.n_clusters, base.noise);
|
||||
println!(" Pass1 (no rot, 1-bit) : {:6.2}%", p1 * 100.0);
|
||||
println!(" Pass2 (rot, 1-bit) : {:6.2}%", p2 * 100.0);
|
||||
for bits in 1..=4u32 {
|
||||
let cov = measure_multibit(base, rot_seed, bits).coverage;
|
||||
let bytes_per_vec = base.dim * bits as usize / 8;
|
||||
println!(
|
||||
" Pass3 (rot, {bits}-bit, {bytes_per_vec:>3} B/vec): {:6.2}% {}",
|
||||
cov * 100.0,
|
||||
if cov >= 0.90 { "≥90%" } else { "" }
|
||||
);
|
||||
}
|
||||
println!("=================================================================\n");
|
||||
assert!((0.0..=1.0).contains(&p1));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn multibit_1bit_matches_pass2_approx() {
|
||||
// Sanity: 1-bit multi-bit quantization is essentially sign-quantization,
|
||||
// so its coverage should track Pass-2 (rotated 1-bit) closely. (Not
|
||||
// exact: the mid-rise quantizer's 0/1 boundary is at the RANGE midpoint,
|
||||
// which equals the sign boundary, so they should match very closely.)
|
||||
let p = CoverageParams {
|
||||
n: 256,
|
||||
n_queries: 16,
|
||||
n_clusters: 16,
|
||||
..CoverageParams::aether_default(0x55)
|
||||
};
|
||||
let rot_seed = 0xABCDu64;
|
||||
let p2 = measure_pass2(p, rot_seed).coverage;
|
||||
let mb1 = measure_multibit(p, rot_seed, 1).coverage;
|
||||
assert!(
|
||||
(p2 - mb1).abs() < 0.05,
|
||||
"1-bit multibit {mb1:.3} should track Pass-2 {p2:.3}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn estimator_rerank_not_worse_than_sign() {
|
||||
// ADR-156 Milestone-2 core regression: on a fixed anisotropic fixture,
|
||||
// reranking the candidate set by the RaBitQ unbiased ESTIMATE must be
|
||||
// >= ranking by sign-only Hamming (Pass-2). The estimator must never
|
||||
// make coverage WORSE — it strictly refines the same 1-bit codes with
|
||||
// side info. (We assert >= here, not a hard 90% bar — the bar is the
|
||||
// measured number reported in the ADR, not a unit invariant.)
|
||||
let p = CoverageParams {
|
||||
n: 512,
|
||||
n_queries: 64,
|
||||
n_clusters: 32,
|
||||
..CoverageParams::aether_default(0x00C0_FFEE)
|
||||
};
|
||||
let rot_seed = 0x1234_5678_9ABC_DEF0u64;
|
||||
let sign = measure_pass2(p, rot_seed).coverage;
|
||||
let est = measure_estimator(p, rot_seed).coverage;
|
||||
assert!(
|
||||
est + 1e-9 >= sign,
|
||||
"estimator rerank coverage {est:.4} regressed below sign-only Pass-2 {sign:.4}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn estimator_coverage_is_deterministic() {
|
||||
// Same params + rotation seed ⇒ same measured coverage, twice.
|
||||
let p = CoverageParams {
|
||||
n: 256,
|
||||
n_queries: 16,
|
||||
n_clusters: 16,
|
||||
..CoverageParams::aether_default(0xE571_3A7E)
|
||||
};
|
||||
let a = measure_estimator(p, 0xFEED_FACE_0000_0001).coverage;
|
||||
let b = measure_estimator(p, 0xFEED_FACE_0000_0001).coverage;
|
||||
assert_eq!(a, b, "estimator coverage must be deterministic");
|
||||
assert!((0.0..=1.0).contains(&a));
|
||||
}
|
||||
|
||||
/// Deterministic, test-runnable coverage measurement that PRINTS the
|
||||
/// Milestone-2 strict-K table: Pass-1 | Pass-2-sign | Pass-2+estimator, at
|
||||
/// the strict bar (candidate_k == K) plus the over-fetch curve. Run with:
|
||||
/// cargo test -p wifi-densepose-ruvector --no-default-features \
|
||||
/// estimator_coverage_report -- --nocapture
|
||||
#[test]
|
||||
fn estimator_coverage_report() {
|
||||
let base = CoverageParams::aether_default(0xAD00_0084);
|
||||
let rot_seed = 0x5EED_C0DE_1234_5678u64;
|
||||
println!(
|
||||
"\n=== ADR-156 Milestone-2 RaBitQ estimator coverage (anisotropic synthetic) ==="
|
||||
);
|
||||
println!(
|
||||
"dim={} N={} K={} queries={} clusters={} noise={} master_seed=0x{:X} rotation_seed=0x{:X}",
|
||||
base.dim, base.n, base.k, base.n_queries, base.n_clusters, base.noise, base.seed, rot_seed
|
||||
);
|
||||
println!("side info = 8 B/vec (residual_norm + x_dot_o, 2x f32)");
|
||||
println!(
|
||||
"{:<12} {:>9} {:>9} {:>11} {:>11} {:>9}",
|
||||
"candidate_k", "P1-sign", "P2-sign", "Est-cosine", "Est-euclid", "vs 90%"
|
||||
);
|
||||
for &c in &[base.k, 16usize, 24, 32, 64] {
|
||||
let pc = CoverageParams {
|
||||
candidate_k: c,
|
||||
..base
|
||||
};
|
||||
let p1 = measure_pass1(pc).coverage;
|
||||
let p2 = measure_pass2(pc, rot_seed).coverage;
|
||||
let est_cos = measure_estimator(pc, rot_seed).coverage;
|
||||
let est_euc = measure_estimator_euclidean(pc, rot_seed).coverage;
|
||||
let bar = if est_cos >= 0.90 { "EST≥90%" } else { "below" };
|
||||
let strict = if c == base.k { " (STRICT)" } else { "" };
|
||||
println!(
|
||||
"{:<12} {:>8.2}% {:>8.2}% {:>10.2}% {:>10.2}% {:>9}{}",
|
||||
c,
|
||||
p1 * 100.0,
|
||||
p2 * 100.0,
|
||||
est_cos * 100.0,
|
||||
est_euc * 100.0,
|
||||
bar,
|
||||
strict
|
||||
);
|
||||
}
|
||||
println!("============================================================================\n");
|
||||
let strict = measure_estimator(base, rot_seed).coverage;
|
||||
assert!((0.0..=1.0).contains(&strict));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fixture_is_deterministic() {
|
||||
let p = CoverageParams::aether_default(12345);
|
||||
let a = make_fixture(p);
|
||||
let b = make_fixture(p);
|
||||
assert_eq!(a, b);
|
||||
assert_eq!(a.len(), p.n);
|
||||
assert_eq!(a[0].len(), p.dim);
|
||||
let c = make_fixture(CoverageParams::aether_default(12346));
|
||||
assert_ne!(a[0], c[0]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn coverage_harness_runs_and_is_in_range() {
|
||||
// Small fixed fixture — fast, deterministic, in [0,1].
|
||||
let p = CoverageParams {
|
||||
n: 256,
|
||||
n_queries: 16,
|
||||
n_clusters: 16,
|
||||
..CoverageParams::aether_default(0xABCD)
|
||||
};
|
||||
let c1 = measure_pass1(p);
|
||||
let c2 = measure_pass2(p, 0x1234_5678);
|
||||
assert!((0.0..=1.0).contains(&c1.coverage));
|
||||
assert!((0.0..=1.0).contains(&c2.coverage));
|
||||
// Determinism: same params → same number.
|
||||
assert_eq!(measure_pass1(p).coverage, c1.coverage);
|
||||
assert_eq!(measure_pass2(p, 0x1234_5678).coverage, c2.coverage);
|
||||
}
|
||||
}
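The coverage figure that the harness above reports is a top-K overlap fraction averaged over queries. A minimal standalone sketch of that metric follows; the function names `coverage` and `mean_coverage` are hypothetical and are not part of the crate.

```rust
use std::collections::HashSet;

/// Fraction of the true top-`k` ids that also appear in the candidate set.
/// Hypothetical helper, mirroring the `hit as f64 / k as f64` step in the harness.
fn coverage(truth: &[u32], candidates: &[u32], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let cand: HashSet<u32> = candidates.iter().copied().collect();
    // Count how many of the first `k` ground-truth ids were recovered.
    let hit = truth
        .iter()
        .take(k)
        .filter(|&&id| cand.contains(&id))
        .count();
    hit as f64 / k as f64
}

/// Mean of per-query coverage values; an empty slice gives 0.0.
fn mean_coverage(per_query: &[f64]) -> f64 {
    if per_query.is_empty() {
        return 0.0;
    }
    per_query.iter().sum::<f64>() / per_query.len() as f64
}
```

With `candidates` longer than `k` this models over-fetch: the candidate set grows while the denominator stays at `k`.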
|
||||
@@ -0,0 +1,685 @@
|
||||
//! RaBitQ **unbiased distance estimator** — the real Gao & Long (SIGMOD 2024)
|
||||
//! contribution, on top of the Pass-2 rotation ([`crate::rotation`]).
|
||||
//!
|
||||
//! ## Why this exists (ADR-156 Milestone-2)
|
||||
//!
|
||||
//! Pass-1 ([`crate::sketch`]) and Pass-2 ([`crate::rotation`]) use only the
|
||||
//! **sign** of each rotated coordinate and rank candidates by **Hamming /
|
||||
//! bit distance** — a coarse, monotone-but-lossy proxy for the true angle.
|
||||
//! ADR-156 §10 measured that sign-only Pass-2 leaves strict-K
|
||||
//! (`candidate_k == K`) top-K coverage at **~46%**, well below the ADR-084
|
||||
//! **≥90%** bar, and only clears 90% with ~3× over-fetch.
|
||||
//!
|
||||
//! RaBitQ's *actual* algorithmic contribution is not the sign bits — it is an
|
||||
//! **unbiased estimator of the inner product / squared distance** recovered
|
||||
//! from the 1-bit code **plus a few bytes of per-vector side information**.
|
||||
//! That estimate is far sharper than the raw Hamming proxy, so it can
|
||||
//! **rerank** the candidate set and (the question this module measures) close
|
||||
//! the strict-K coverage gap.
|
||||
//!
|
||||
//! ## The estimator (paper formula + our simplification, stated honestly)
|
||||
//!
|
||||
//! Notation follows the paper. Let `P` be the Pass-2 orthogonal rotation
|
||||
//! ([`crate::Rotation`], `R = H·D`). For a data vector `o_raw` and a query
|
||||
//! `q_raw`:
|
||||
//!
|
||||
//! 1. **Centroid.** The paper centres each vector on its (per-cluster)
|
||||
//! centroid `c`: residual `o_r = o_raw − c`. **We use a zero / global
|
||||
//! centroid `c = 0`** (`o_r = o_raw`). This is an explicit simplification
|
||||
//! (no IVF/k-means cluster structure in the current sketch path) — it costs
|
||||
//! accuracy when the data is far off-origin, and we document it rather than
|
||||
//! hide it. With `c = 0`, the residual *is* the raw vector.
|
||||
//!
|
||||
//! 2. **Unit residual + 1-bit code.** `o = o_r / ‖o_r‖`. Rotate:
|
||||
//! `o' = P·o`. The 1-bit code is `x̄_i = sign(o'_i) · (1/√D)`, so `x̄`
|
||||
//! is a **unit vector** in `{±1/√D}^D` (the corner of the hypercube nearest
|
||||
//! `o'`). `D` is the rotation's padded dimension (`next_pow2(dim)`), because
|
||||
//! the FHT operates on the padded length and `x̄` is unit over that length.
|
||||
//!
|
||||
//! 3. **Per-vector side information** (the "few bytes"): we store, per sketch,
|
||||
//! - `residual_norm = ‖o_r‖` (an `f32`), and
|
||||
//! - `x_dot_o = ⟨x̄, o'⟩` (an `f32`), the cosine between the code and the
|
||||
//! rotated unit residual. This is the quantity the paper calls `⟨x̄, o⟩`
|
||||
//! (after rotation); it lies in `(0, 1]` and is `1` only when `o'`
|
||||
//! already sits exactly on a hypercube corner.
|
||||
//!
|
||||
//! That is **8 bytes/vector** of side info (2× `f32`).
|
||||
//!
|
||||
//! 4. **Query-time estimate.** Rotate the query residual: `q' = P·q_r`. The
|
||||
//! **unbiased estimator of `⟨o', q'⟩`** (equivalently `⟨o, q_r⟩`, since `P`
|
||||
//! is orthogonal) is
|
||||
//!
|
||||
//! ```text
|
||||
//! ⟨o', q'⟩ ≈ ⟨x̄, q'⟩ / ⟨x̄, o'⟩ = ⟨x̄, q'⟩ / x_dot_o
|
||||
//! ```
|
||||
//!
|
||||
//! This is the estimator from the RaBitQ paper (there written `<q, o> ≈ <q̄, ...>`):
|
||||
//! the random rotation makes the quantization error of `x̄` (relative to
|
||||
//! `o'`) orthogonal **in expectation** to `q'`, so dividing the measured
|
||||
//! `⟨x̄, q'⟩` by `x_dot_o` is **unbiased** for `⟨o', q'⟩`, with the paper's
|
||||
//! `O(1/√D)` error bound. The only per-candidate cost is one length-`D`
|
||||
//! dot product `⟨x̄, q'⟩` — which, because `x̄ ∈ {±1/√D}`, is just a signed
|
||||
//! sum of the query coordinates (`±` chosen by the stored sign bits),
|
||||
//! i.e. as cheap as the Hamming proxy plus one multiply.
|
||||
//!
|
||||
//! 5. **Inner product and squared distance.** Un-normalize:
|
||||
//! `⟨o_r, q_r⟩ = ‖o_r‖ · ⟨o, q_r⟩`. Then
|
||||
//!
|
||||
//! ```text
|
||||
//! ‖q_r − o_r‖² = ‖q_r‖² + ‖o_r‖² − 2·⟨o_r, q_r⟩
|
||||
//! ```
|
||||
//!
|
||||
//! For **ranking** a candidate set against one fixed query, `‖q_r‖²` is a
|
||||
//! per-query constant and can be dropped; we keep it in
|
||||
//! [`DistanceEstimator::estimate_sq_distance`] so the value is a genuine
|
||||
//! distance estimate (used by the unbiasedness test), and expose the
|
||||
//! cheaper ranking key separately.
|
||||
//!
|
||||
//! ## What is unbiased, and what we measure
|
||||
//!
|
||||
//! The estimator of `⟨o', q'⟩` is unbiased over the random rotation. We pin
|
||||
//! that on a small hand-checkable fixture (`estimator_unbiased_on_fixture`):
|
||||
//! averaging the estimate over many random rotation seeds converges to the true
|
||||
//! inner product within tolerance. We then measure whether **reranking the
|
||||
//! candidate set by this estimate** closes the strict-K coverage gap that the
|
||||
//! sign-only Pass-2 left at ~46%. ADR-156 §10 / §11 report the result
|
||||
//! honestly, whether or not it clears 90%.
|
||||
//!
|
||||
//! ## Backward compatibility
|
||||
//!
|
||||
//! This module is **purely additive**. It introduces an *extended* sketch type
|
||||
//! ([`EstimatorSketch`]) and bank ([`EstimatorBank`]) that carry the side info;
|
||||
//! the Pass-1 [`crate::Sketch`] / Pass-2 [`crate::SketchBank`] paths and the
|
||||
//! [`crate::WireSketch`] wire format are **untouched**. Nothing on the existing
|
||||
//! surface changes.
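The steps in the module docs above can be sketched end to end in a few standalone functions. This is a minimal illustration under stated assumptions, not the crate's implementation: `rotate`, `encode` and `estimate_inner` are hypothetical names, and the rotation here is a SplitMix64-seeded sign flip followed by a normalized Walsh-Hadamard transform, standing in for `crate::Rotation`.

```rust
/// In-place fast Walsh-Hadamard transform, scaled by 1/sqrt(n) so it is
/// orthogonal. `v.len()` must be a power of two.
fn fwht(v: &mut [f32]) {
    let n = v.len();
    let mut h = 1;
    while h < n {
        for block in (0..n).step_by(2 * h) {
            for i in block..block + h {
                let (a, b) = (v[i], v[i + h]);
                v[i] = a + b;
                v[i + h] = a - b;
            }
        }
        h *= 2;
    }
    let scale = 1.0 / (n as f32).sqrt();
    for x in v.iter_mut() {
        *x *= scale;
    }
}

/// Hypothetical stand-in rotation: zero-pad to a power of two, apply seeded
/// random sign flips (D), then the orthogonal Hadamard transform (H).
fn rotate(x: &[f32], seed: u64) -> Vec<f32> {
    let padded = x.len().next_power_of_two();
    let mut out = vec![0.0f32; padded];
    let mut state = seed;
    for (i, slot) in out.iter_mut().enumerate() {
        // One SplitMix64 step per coordinate; the low bit picks the sign.
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^= z >> 31;
        let sign = if z & 1 == 0 { 1.0 } else { -1.0 };
        *slot = sign * x.get(i).copied().unwrap_or(0.0);
    }
    fwht(&mut out);
    out
}

/// Steps 2-3: sign code plus side info `(signs, residual_norm, x_dot_o)`,
/// zero-centroid.
fn encode(o: &[f32], seed: u64) -> (Vec<bool>, f32, f32) {
    let rotated = rotate(o, seed);
    let norm = o.iter().map(|v| v * v).sum::<f32>().sqrt();
    let inv_sqrt_d = 1.0 / (rotated.len() as f32).sqrt();
    let signs: Vec<bool> = rotated.iter().map(|&c| c >= 0.0).collect();
    let sum_abs: f32 = rotated.iter().map(|c| c.abs()).sum();
    let x_dot_o = if norm > 0.0 { inv_sqrt_d * sum_abs / norm } else { 0.0 };
    (signs, norm, x_dot_o)
}

/// Steps 4-5: estimate of the residual inner product,
/// `residual_norm * <x, q'> / x_dot_o`.
fn estimate_inner(code: &(Vec<bool>, f32, f32), q: &[f32], seed: u64) -> f32 {
    let (signs, norm, x_dot_o) = code;
    if *x_dot_o <= 0.0 {
        return 0.0; // degenerate (zero residual)
    }
    let q_rot = rotate(q, seed);
    let inv_sqrt_d = 1.0 / (q_rot.len() as f32).sqrt();
    // Signed sum of the rotated query coordinates, sign taken from the code.
    let code_dot = signs
        .iter()
        .zip(&q_rot)
        .map(|(&s, &qi)| if s { qi } else { -qi })
        .sum::<f32>()
        * inv_sqrt_d;
    norm * code_dot / x_dot_o
}
```

One property is easy to check by hand: when the query equals the encoded vector, the rescale cancels exactly and `estimate_inner(&encode(&o, s), &o, s)` returns `‖o‖²` up to rounding. For other queries the estimate is noisy per seed and only correct on average over rotations.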
|
||||
|
||||
use crate::rotation::{next_pow2, Rotation};
|
||||
|
||||
/// The per-vector side information RaBitQ needs to turn a 1-bit code into an
|
||||
/// **unbiased** distance estimate (§ module docs step 3).
|
||||
///
|
||||
/// Two `f32`s = **8 bytes/vector** on top of the packed sign bits.
|
||||
#[derive(Debug, Clone, Copy, PartialEq)]
|
||||
pub struct SideInfo {
|
||||
/// `‖o_r‖` — L2 norm of the (zero-centroid) residual = the raw vector norm.
|
||||
pub residual_norm: f32,
|
||||
/// `⟨x̄, o'⟩` — dot product of the unit 1-bit code with the rotated unit
|
||||
/// residual. In `(0, 1]`; the paper's `⟨x̄, o⟩`. Drives the unbiased
|
||||
/// rescaling `⟨x̄, q'⟩ / x_dot_o`.
|
||||
pub x_dot_o: f32,
|
||||
}
|
||||
|
||||
/// A Pass-2 sketch **plus** the RaBitQ side information, sufficient to compute
|
||||
/// the unbiased distance estimate at query time.
|
||||
///
|
||||
/// Stores the packed sign bits over the **padded** rotation length `D`
|
||||
/// (`next_pow2(dim)`) — the frame `x̄` actually lives in — together with the
|
||||
/// [`SideInfo`]. Construct via [`EstimatorSketch::from_embedding`]; the index
|
||||
/// and the query **must** use the same [`Rotation`] (same seed + dim), exactly
|
||||
/// as for a Pass-2 sketch.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct EstimatorSketch {
|
||||
/// Sign bits of the rotated *padded* unit residual, MSB-first per byte.
|
||||
/// Length is `ceil(D / 8)` where `D = next_pow2(dim)`. Bit set ⇒ `o'_i ≥ 0`
|
||||
/// ⇒ code coordinate `+1/√D`; clear ⇒ `−1/√D`.
|
||||
bits: Vec<u8>,
|
||||
/// Padded rotation dimension `D = next_pow2(dim)`; the code is unit over `D`.
|
||||
padded_dim: usize,
|
||||
/// Source embedding dimension (for compatibility checks / reporting).
|
||||
embedding_dim: usize,
|
||||
/// The RaBitQ side info for the unbiased estimate.
|
||||
side: SideInfo,
|
||||
}
|
||||
|
||||
impl EstimatorSketch {
|
||||
/// Build an estimator sketch from a dense embedding and a [`Rotation`].
|
||||
///
|
||||
/// Zero-centroid (`c = 0`): the residual is the raw embedding. The vector is
|
||||
/// rotated through `rotation` over its padded length `D = next_pow2(dim)`,
|
||||
/// the sign of each rotated coordinate is packed, and the side info
|
||||
/// (`‖o_r‖`, `⟨x̄, o'⟩`) is computed in the same pass.
|
||||
///
|
||||
/// A zero residual (a zero input, or an input equal to the centroid) yields
|
||||
/// `residual_norm = 0` and `x_dot_o = 0`; its estimate degenerates to `0`
|
||||
/// (handled in [`DistanceEstimator`]) rather than dividing by zero.
|
||||
pub fn from_embedding(embedding: &[f32], rotation: &Rotation) -> Self {
|
||||
Self::from_embedding_centred(embedding, rotation, None)
|
||||
}
|
||||
|
||||
/// Build an estimator sketch with an **explicit centroid** `c` subtracted
|
||||
/// before rotation (the paper's per-cluster centroid; `o_r = o_raw − c`).
|
||||
///
|
||||
/// Pass `None` for the zero-centroid simplification (`c = 0`, identical to
|
||||
/// [`EstimatorSketch::from_embedding`]). Pass `Some(centroid)` (length `dim`)
|
||||
/// to centre on a shared global / cluster centroid — the index and the query
|
||||
/// **must** use the *same* centroid, exactly as they must share the rotation.
|
||||
/// This path exists so ADR-156 can **measure the cost of the zero-centroid
|
||||
/// simplification** honestly rather than assert it.
|
||||
pub fn from_embedding_centred(
|
||||
embedding: &[f32],
|
||||
rotation: &Rotation,
|
||||
centroid: Option<&[f32]>,
|
||||
) -> Self {
|
||||
let dim = rotation.dim();
|
||||
let padded = next_pow2(dim);
|
||||
// Residual o_r = o_raw − c (c = 0 when centroid is None). Build it once.
|
||||
let residual: Vec<f32> = (0..dim)
|
||||
.map(|i| {
|
||||
let v = embedding.get(i).copied().unwrap_or(0.0);
|
||||
let c = centroid.and_then(|c| c.get(i)).copied().unwrap_or(0.0);
|
||||
v - c
|
||||
})
|
||||
.collect();
|
||||
let residual_norm = {
|
||||
let mut acc = 0.0f64;
|
||||
for &v in &residual {
|
||||
acc += (v as f64) * (v as f64);
|
||||
}
|
||||
acc.sqrt() as f32
|
||||
};
|
||||
|
||||
// Rotate the RESIDUAL over the PADDED length so the code frame matches
|
||||
// what `x_dot_o` and the query dot product use.
|
||||
let rotated_padded = rotation.apply_padded(&residual);
|
||||
debug_assert_eq!(rotated_padded.len(), padded);
|
||||
|
||||
// 1-bit code over the padded length: x̄_i = sign(o'_i)/√D on the *unit*
|
||||
// residual. Since o' = P·o = P·(o_r/‖o_r‖) = (P·o_r)/‖o_r‖, and sign is
|
||||
// scale-invariant, sign(o'_i) == sign((P·o_r)_i) == sign(rotated_padded_i).
|
||||
// ⟨x̄, o'⟩ = (1/√D)·Σ sign(o'_i)·o'_i = (1/√D)·Σ |o'_i|
|
||||
// = (1/√D)·(Σ|(P·o_r)_i|) / ‖o_r‖.
|
||||
let inv_sqrt_d = 1.0f32 / (padded as f32).sqrt();
|
||||
let mut bits = vec![0u8; padded.div_ceil(8)];
|
||||
let mut sum_abs = 0.0f64; // Σ |(P·o_r)_i|
|
||||
for (i, &c) in rotated_padded.iter().enumerate() {
|
||||
if c >= 0.0 {
|
||||
bits[i / 8] |= 1 << (7 - (i % 8));
|
||||
}
|
||||
sum_abs += (c as f64).abs();
|
||||
}
|
||||
// ⟨x̄, o'⟩ with o' the rotated *unit* residual.
|
||||
let x_dot_o = if residual_norm > 0.0 {
|
||||
(inv_sqrt_d as f64 * sum_abs / residual_norm as f64) as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
|
||||
Self {
|
||||
bits,
|
||||
padded_dim: padded,
|
||||
embedding_dim: dim,
|
||||
side: SideInfo {
|
||||
residual_norm,
|
||||
x_dot_o,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
/// The padded rotation dimension `D` the code lives in.
|
||||
#[inline]
|
||||
pub fn padded_dim(&self) -> usize {
|
||||
self.padded_dim
|
||||
}
|
||||
|
||||
/// Source embedding dimension.
|
||||
#[inline]
|
||||
pub fn embedding_dim(&self) -> usize {
|
||||
self.embedding_dim
|
||||
}
|
||||
|
||||
/// The RaBitQ side information.
|
||||
#[inline]
|
||||
pub fn side_info(&self) -> SideInfo {
|
||||
self.side
|
||||
}
|
||||
|
||||
/// `‖o_r‖` of the residual (zero-centroid ⇒ raw vector norm).
|
||||
#[inline]
|
||||
pub fn residual_norm(&self) -> f32 {
|
||||
self.side.residual_norm
|
||||
}
|
||||
|
||||
/// Side-information byte cost (excluding the packed sign bits): 8 bytes.
|
||||
pub const SIDE_INFO_BYTES: usize = 2 * std::mem::size_of::<f32>();
|
||||
|
||||
/// `⟨x̄, q'⟩` — the dot product of this sketch's unit 1-bit code with a
|
||||
/// rotated query `q'` (length `padded_dim`). Because `x̄_i = ±1/√D`, this is
|
||||
/// `(1/√D)·Σ ±q'_i` with the sign taken from the stored bit. The single
|
||||
/// per-candidate cost of the estimator.
|
||||
#[inline]
|
||||
fn code_dot(&self, q_rotated_padded: &[f32]) -> f32 {
|
||||
debug_assert_eq!(q_rotated_padded.len(), self.padded_dim);
|
||||
let inv_sqrt_d = 1.0f32 / (self.padded_dim as f32).sqrt();
|
||||
let mut acc = 0.0f32;
|
||||
for (i, &q) in q_rotated_padded.iter().enumerate() {
|
||||
let bit = (self.bits[i / 8] >> (7 - (i % 8))) & 1;
|
||||
if bit == 1 {
|
||||
acc += q;
|
||||
} else {
|
||||
acc -= q;
|
||||
}
|
||||
}
|
||||
acc * inv_sqrt_d
|
||||
}
|
||||
}
|
||||
|
||||
/// A pre-rotated query, computed **once** per query and reused across all
|
||||
/// candidates. Carries `q' = P·q_r` (over the padded length) and `‖q_r‖²`.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct EstimatorQuery {
|
||||
/// `q' = P·q_r` over the padded rotation length.
|
||||
q_rotated_padded: Vec<f32>,
|
||||
/// `‖q_r‖²` — per-query constant in the squared-distance expansion.
|
||||
q_norm_sq: f32,
|
||||
}
|
||||
|
||||
impl EstimatorQuery {
|
||||
/// Pre-rotate a query embedding through `rotation` (zero-centroid).
|
||||
pub fn new(query: &[f32], rotation: &Rotation) -> Self {
|
||||
Self::new_centred(query, rotation, None)
|
||||
}
|
||||
|
||||
/// Pre-rotate a query residual `q_r = q − c` through `rotation`. The
|
||||
/// centroid **must** match the one used to build the bank's sketches.
|
||||
pub fn new_centred(query: &[f32], rotation: &Rotation, centroid: Option<&[f32]>) -> Self {
|
||||
let dim = rotation.dim();
|
||||
let residual: Vec<f32> = (0..dim)
|
||||
.map(|i| {
|
||||
let v = query.get(i).copied().unwrap_or(0.0);
|
||||
let c = centroid.and_then(|c| c.get(i)).copied().unwrap_or(0.0);
|
||||
v - c
|
||||
})
|
||||
.collect();
|
||||
let mut q_norm_sq = 0.0f64;
|
||||
for &v in &residual {
|
||||
q_norm_sq += (v as f64) * (v as f64);
|
||||
}
|
||||
Self {
|
||||
q_rotated_padded: rotation.apply_padded(&residual),
|
||||
q_norm_sq: q_norm_sq as f32,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Computes RaBitQ unbiased estimates from an [`EstimatorSketch`] + a
|
||||
/// pre-rotated [`EstimatorQuery`].
|
||||
///
|
||||
/// Stateless — the methods are associated functions. Kept as a type for
|
||||
/// discoverability and to group the estimator formula in one place.
|
||||
pub struct DistanceEstimator;
|
||||
|
||||
impl DistanceEstimator {
|
||||
/// Unbiased estimate of `⟨o_r, q_r⟩` (the inner product of the residuals).
|
||||
///
|
||||
/// `⟨o_r, q_r⟩ = ‖o_r‖ · (⟨x̄, q'⟩ / ⟨x̄, o'⟩)`. Returns `0.0` when the
|
||||
/// stored `x_dot_o` is non-positive (degenerate / zero residual), which
|
||||
/// cannot happen for a non-zero input but keeps the call total.
|
||||
pub fn estimate_inner_product(sketch: &EstimatorSketch, query: &EstimatorQuery) -> f32 {
|
||||
let x_dot_o = sketch.side.x_dot_o;
|
||||
if x_dot_o <= 0.0 {
|
||||
return 0.0;
|
||||
}
|
||||
let code_dot_q = sketch.code_dot(&query.q_rotated_padded);
|
||||
// ⟨o, q_r⟩ ≈ ⟨x̄, q'⟩ / x_dot_o (unit residual o)
|
||||
let inner_unit = code_dot_q / x_dot_o;
|
||||
sketch.side.residual_norm * inner_unit
|
||||
}
|
||||
|
||||
/// Unbiased estimate of the **squared euclidean distance** `‖q_r − o_r‖²`.
|
||||
///
|
||||
/// `= ‖q_r‖² + ‖o_r‖² − 2·⟨o_r, q_r⟩`, using the estimated inner product.
|
||||
/// This is the value the unbiasedness test checks.
|
||||
pub fn estimate_sq_distance(sketch: &EstimatorSketch, query: &EstimatorQuery) -> f32 {
|
||||
let ip = Self::estimate_inner_product(sketch, query);
|
||||
let o_norm = sketch.side.residual_norm;
|
||||
query.q_norm_sq + o_norm * o_norm - 2.0 * ip
|
||||
}
|
||||
|
||||
/// The cheap **euclidean ranking key** for nearest-neighbour reranking:
|
||||
/// monotone in the estimated squared distance with the per-query constant
|
||||
/// `‖q_r‖²` dropped. Smaller = nearer. Equals `‖o_r‖² − 2·⟨o_r, q_r⟩`.
|
||||
///
|
||||
/// Use this (not [`Self::estimate_sq_distance`]) for top-K reranking under a
|
||||
/// **euclidean** ground truth — it avoids adding the same `q_norm_sq` to
|
||||
/// every candidate. For a **cosine** ground truth (AETHER / the coverage
|
||||
/// harness), use [`Self::cosine_ranking_key`] instead.
|
||||
#[inline]
|
||||
pub fn ranking_key(sketch: &EstimatorSketch, query: &EstimatorQuery) -> f32 {
|
||||
let ip = Self::estimate_inner_product(sketch, query);
|
||||
let o_norm = sketch.side.residual_norm;
|
||||
o_norm * o_norm - 2.0 * ip
|
||||
}
|
||||
|
||||
/// The cheap **cosine ranking key**: smaller = nearer in cosine distance.
|
||||
///
|
||||
/// Cosine distance is `1 − ⟨o_r,q_r⟩ / (‖o_r‖·‖q_r‖)`. `‖q_r‖` is a
|
||||
/// per-query constant, so ranking by cosine distance ascending is ranking by
|
||||
/// `⟨o_r,q_r⟩ / ‖o_r‖` **descending**, i.e. by `−⟨o, q_r⟩` ascending. And
|
||||
/// `⟨o, q_r⟩ = ⟨x̄, q'⟩ / x_dot_o` — the unit-residual inner product, which
|
||||
/// needs **only the code and `x_dot_o`**, not even `residual_norm`. We
|
||||
/// return `−⟨o, q_r⟩` so "smaller = nearer" matches the euclidean key's
|
||||
/// convention.
|
||||
///
|
||||
/// This is the correct key when the sketch is used (as in ADR-084) as an
|
||||
/// **angular** sensor graded against a cosine top-K: the 1-bit code is a
|
||||
/// rotated-angle estimator, and dividing by `x_dot_o` is the RaBitQ unbiased
|
||||
/// rescale of that angle's inner product.
|
||||
#[inline]
|
||||
pub fn cosine_ranking_key(sketch: &EstimatorSketch, query: &EstimatorQuery) -> f32 {
|
||||
let x_dot_o = sketch.side.x_dot_o;
|
||||
if x_dot_o <= 0.0 {
|
||||
return 0.0;
|
||||
}
|
||||
// ⟨o, q_r⟩ = ⟨x̄, q'⟩ / x_dot_o ; nearer in cosine ⇒ larger ⇒ negate.
|
||||
-(sketch.code_dot(&query.q_rotated_padded) / x_dot_o)
|
||||
}
|
||||
}
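The claim that the per-query constant `‖q_r‖²` can be dropped for ranking is simple to check numerically. The sketch below uses hypothetical free functions (`sq_distance`, `ranking_key`, `argsort`) over plain `(o_norm, inner_product)` pairs rather than the crate's sketch types.

```rust
/// Full squared-distance expansion: ‖q‖² + ‖o‖² − 2·⟨o, q⟩.
fn sq_distance(q_norm_sq: f32, o_norm: f32, ip: f32) -> f32 {
    q_norm_sq + o_norm * o_norm - 2.0 * ip
}

/// Same expression with the per-query constant ‖q‖² dropped.
fn ranking_key(o_norm: f32, ip: f32) -> f32 {
    o_norm * o_norm - 2.0 * ip
}

/// Indices that sort `keys` ascending (smaller = nearer).
fn argsort(keys: &[f32]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..keys.len()).collect();
    idx.sort_by(|&a, &b| keys[a].total_cmp(&keys[b]));
    idx
}

/// Rank candidates given as `(o_norm, inner_product)` pairs both ways and
/// return the two orderings: (by full distance, by ranking key).
fn rank_both(q_norm_sq: f32, cands: &[(f32, f32)]) -> (Vec<usize>, Vec<usize>) {
    let dists: Vec<f32> = cands
        .iter()
        .map(|&(n, ip)| sq_distance(q_norm_sq, n, ip))
        .collect();
    let keys: Vec<f32> = cands.iter().map(|&(n, ip)| ranking_key(n, ip)).collect();
    (argsort(&dists), argsort(&keys))
}
```

Both orderings agree because adding the same constant to every key does not change which key is smaller, which is why the cheaper key is enough for top-K reranking.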
|
||||
|
||||
/// A bank of [`EstimatorSketch`]es with stable IDs, reranked by the RaBitQ
|
||||
/// **unbiased distance estimate** instead of raw Hamming.
|
||||
///
|
||||
/// All sketches share one [`Rotation`] (the index/query frame). The bank rotates
|
||||
/// every inserted embedding and every query through it, so the estimator is
|
||||
/// always computed in a consistent frame.
|
||||
///
|
||||
/// # Invariants
|
||||
/// - All sketches share the bank's `embedding_dim` and `Rotation`.
|
||||
/// - IDs are caller-assigned and stable.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct EstimatorBank {
|
||||
rotation: Rotation,
|
||||
entries: Vec<(u32, EstimatorSketch)>,
|
||||
embedding_dim: usize,
|
||||
/// Optional shared centroid subtracted from every embedding/query before
|
||||
/// rotation. `None` = zero-centroid (the default simplification).
|
||||
centroid: Option<Vec<f32>>,
|
||||
}
|
||||
|
||||
impl EstimatorBank {
|
||||
/// Create an empty bank over `rotation`'s dimension and frame (zero-centroid).
|
||||
pub fn new(rotation: Rotation) -> Self {
|
||||
let embedding_dim = rotation.dim();
|
||||
Self {
|
||||
rotation,
|
||||
entries: Vec::new(),
|
||||
embedding_dim,
|
||||
centroid: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create an empty bank that subtracts `centroid` from every embedding and
|
||||
/// query before rotation (the paper's centroid path). Used by ADR-156 to
|
||||
/// measure the cost of the zero-centroid simplification.
|
||||
pub fn with_centroid(rotation: Rotation, centroid: Vec<f32>) -> Self {
|
||||
let embedding_dim = rotation.dim();
|
||||
Self {
|
||||
rotation,
|
||||
entries: Vec::new(),
|
||||
embedding_dim,
|
||||
centroid: Some(centroid),
|
||||
}
|
||||
}
|
||||
|
||||
/// The rotation (index/query frame) this bank uses.
|
||||
#[inline]
|
||||
pub fn rotation(&self) -> &Rotation {
|
||||
&self.rotation
|
||||
}
|
||||
|
||||
/// Number of stored sketches.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
self.entries.len()
|
||||
}
|
||||
|
||||
/// True iff empty.
|
||||
#[inline]
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.entries.is_empty()
|
||||
}
|
||||
|
||||
/// Source embedding dimension.
|
||||
#[inline]
|
||||
pub fn embedding_dim(&self) -> usize {
|
||||
self.embedding_dim
|
||||
}
|
||||
|
||||
/// Insert a raw embedding, sketching it (with side info) through the bank's
|
||||
/// rotation. The stored code and the queries share one rotated frame.
|
||||
pub fn insert_embedding(&mut self, id: u32, embedding: &[f32]) {
|
||||
let sketch = EstimatorSketch::from_embedding_centred(
|
||||
embedding,
|
||||
&self.rotation,
|
||||
self.centroid.as_deref(),
|
||||
);
|
||||
self.entries.push((id, sketch));
|
||||
}
|
||||
|
||||
/// Insert a pre-built [`EstimatorSketch`] (must have been built with this
|
||||
/// bank's rotation; the caller is responsible for that).
|
||||
pub fn insert(&mut self, id: u32, sketch: EstimatorSketch) {
|
||||
self.entries.push((id, sketch));
|
||||
}
|
||||
|
||||
/// Top-K nearest neighbours by the **RaBitQ unbiased estimate**, ascending
|
||||
/// by [`DistanceEstimator::ranking_key`]. Returns up to `k` `(id, key)`
|
||||
/// pairs. If `k == 0` or the bank is empty, returns empty. If the bank has
|
||||
/// fewer than `k`, returns all of them.
|
||||
///
|
||||
/// The query is rotated **once**; every candidate then costs one
|
||||
/// length-`D` signed-sum dot product — the estimator is as cheap per
|
||||
/// candidate as Hamming plus a multiply.
|
||||
pub fn topk_estimated(&self, query: &[f32], k: usize) -> Vec<(u32, f32)> {
|
||||
self.topk_by(query, k, DistanceEstimator::ranking_key)
|
||||
}
|
||||
|
||||
/// Top-K by the estimated **cosine** distance
|
||||
/// ([`DistanceEstimator::cosine_ranking_key`]) — the correct rerank when the
|
||||
/// sketch is graded against a cosine top-K (AETHER / the coverage harness).
|
||||
pub fn topk_estimated_cosine(&self, query: &[f32], k: usize) -> Vec<(u32, f32)> {
|
||||
self.topk_by(query, k, DistanceEstimator::cosine_ranking_key)
|
||||
}
|
||||
|
||||
/// Shared top-K driver parameterised on the ranking-key function. Rotates
|
||||
/// the query once, scores every candidate with `key`, returns the `k`
|
||||
/// smallest keys ascending.
|
||||
fn topk_by(
|
||||
&self,
|
||||
query: &[f32],
|
||||
k: usize,
|
||||
key: fn(&EstimatorSketch, &EstimatorQuery) -> f32,
|
||||
) -> Vec<(u32, f32)> {
|
||||
if k == 0 || self.entries.is_empty() {
|
||||
return Vec::new();
|
||||
}
|
||||
let q = EstimatorQuery::new_centred(query, &self.rotation, self.centroid.as_deref());
|
||||
let mut scored: Vec<(u32, f32)> = self
|
||||
.entries
|
||||
.iter()
|
||||
.map(|(id, sk)| (*id, key(sk, &q)))
|
||||
.collect();
|
||||
// Ascending by ranking key. `f32::total_cmp` gives a true total order
|
||||
// even if a NaN slips in (estimates are finite for finite input).
|
||||
scored.sort_by(|a, b| a.1.total_cmp(&b.1));
|
||||
scored.truncate(k);
|
||||
scored
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn l2(v: &[f32]) -> f32 {
|
||||
v.iter().map(|&x| x * x).sum::<f32>().sqrt()
|
||||
}
|
||||
|
||||
/// Brute-force true inner product of two residuals (zero-centroid).
|
||||
fn true_inner(a: &[f32], b: &[f32]) -> f32 {
|
||||
a.iter().zip(b).map(|(&x, &y)| x * y).sum()
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn estimator_is_deterministic() {
|
||||
// Same (seed, dim) rotation + same vectors ⇒ identical estimate, twice.
|
||||
let dim = 64;
|
||||
let rot = Rotation::new(0xC0DE_1234_5678_9ABC, dim);
|
||||
let o: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.21).sin() + 0.3).collect();
|
||||
let qv: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.11).cos() - 0.2).collect();
|
||||
|
||||
let s1 = EstimatorSketch::from_embedding(&o, &rot);
|
||||
let s2 = EstimatorSketch::from_embedding(&o, &rot);
|
||||
let q1 = EstimatorQuery::new(&qv, &rot);
|
||||
let q2 = EstimatorQuery::new(&qv, &Rotation::new(0xC0DE_1234_5678_9ABC, dim));
|
||||
|
||||
let e1 = DistanceEstimator::estimate_inner_product(&s1, &q1);
|
||||
let e2 = DistanceEstimator::estimate_inner_product(&s2, &q2);
|
||||
assert_eq!(e1, e2, "estimator must be deterministic for a fixed seed");
|
||||
|
||||
// Bank topk is deterministic too.
|
||||
let mut bank = EstimatorBank::new(Rotation::new(7, dim));
|
||||
for id in 0..16u32 {
|
||||
let v: Vec<f32> = (0..dim).map(|i| ((i + id as usize) as f32 * 0.07).sin()).collect();
|
||||
bank.insert_embedding(id, &v);
|
||||
}
|
||||
let a = bank.topk_estimated(&qv, 5);
|
||||
let b = bank.topk_estimated(&qv, 5);
|
||||
assert_eq!(a, b, "topk_estimated must be deterministic");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn estimator_unbiased_on_fixture() {
|
||||
// The core unbiasedness claim: averaging the estimate of ⟨o_r, q_r⟩ over
|
||||
// MANY random rotation seeds converges to the true inner product.
|
||||
//
|
||||
// Hand-checkable small case: two fixed vectors, known true inner
|
||||
// product, average the estimator over many seeds and assert it lands
|
||||
// within a tolerance that a BIASED estimator would miss.
|
||||
let dim = 32;
|
||||
let o: Vec<f32> = (0..dim).map(|i| ((i % 7) as f32 - 3.0) * 0.4 + 0.5).collect();
|
||||
let qv: Vec<f32> = (0..dim).map(|i| ((i % 5) as f32 - 2.0) * 0.3 - 0.1).collect();
|
||||
let truth = true_inner(&o, &qv);
|
||||
|
||||
let n_seeds = 4000u64;
|
||||
let mut acc = 0.0f64;
|
||||
for seed in 0..n_seeds {
|
||||
let rot = Rotation::new(seed.wrapping_mul(0x9E37_79B9_7F4A_7C15) ^ 0xABCD, dim);
|
||||
let sk = EstimatorSketch::from_embedding(&o, &rot);
|
||||
let q = EstimatorQuery::new(&qv, &rot);
|
||||
acc += DistanceEstimator::estimate_inner_product(&sk, &q) as f64;
|
||||
}
|
||||
let mean = (acc / n_seeds as f64) as f32;
|
||||
|
||||
// Tolerance scaled to the magnitudes involved. The estimator is
|
||||
// unbiased, so the Monte-Carlo mean must be CLOSE to truth; a sign-only
|
||||
// Hamming proxy (or a biased rescale) would be systematically off.
|
||||
let scale = l2(&o) * l2(&qv);
|
||||
let tol = 0.06 * scale; // ~6% of the ‖o‖‖q‖ envelope over 4000 seeds
|
||||
assert!(
|
||||
(mean - truth).abs() < tol,
|
||||
"estimator biased: mean={mean:.4} truth={truth:.4} tol={tol:.4} (scale={scale:.4})"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn estimator_self_distance_is_small() {
|
||||
// Estimating the distance of a vector to itself should be ~0 (the
|
||||
// estimate of ⟨o,o⟩ ≈ ‖o‖², so ‖q-o‖² ≈ 0). Not exactly 0 (1-bit code),
|
||||
// but small relative to ‖o‖².
|
||||
let dim = 128;
|
||||
let rot = Rotation::new(0xBEEF_CAFE, dim);
|
||||
let o: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.37).cos() + 0.2).collect();
|
||||
let sk = EstimatorSketch::from_embedding(&o, &rot);
|
||||
let q = EstimatorQuery::new(&o, &rot);
|
||||
let sq = DistanceEstimator::estimate_sq_distance(&sk, &q);
|
||||
let o_norm_sq = l2(&o) * l2(&o);
|
||||
assert!(
|
||||
sq.abs() < 0.25 * o_norm_sq,
|
||||
"self sq-distance estimate {sq:.3} too large vs ‖o‖²={o_norm_sq:.3}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn side_info_is_eight_bytes() {
|
||||
assert_eq!(EstimatorSketch::SIDE_INFO_BYTES, 8);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn x_dot_o_in_unit_range() {
|
||||
// ⟨x̄, o'⟩ ∈ (0, 1] for any non-zero input (it's the cosine between the
|
||||
// rotated residual and its nearest hypercube corner).
|
||||
let dim = 96;
|
||||
let rot = Rotation::new(0x1357_9BDF, dim);
|
||||
for s in 0..20u32 {
|
||||
let v: Vec<f32> = (0..dim).map(|i| (((i + s as usize) * 13 % 23) as f32 - 11.0) * 0.2).collect();
|
||||
let sk = EstimatorSketch::from_embedding(&v, &rot);
|
||||
let x = sk.side_info().x_dot_o;
|
||||
assert!(x > 0.0 && x <= 1.0 + 1e-5, "x_dot_o out of (0,1]: {x}");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn zero_input_does_not_panic() {
|
||||
let dim = 64;
|
||||
let rot = Rotation::new(1, dim);
|
||||
let sk = EstimatorSketch::from_embedding(&vec![0.0f32; dim], &rot);
|
||||
assert_eq!(sk.residual_norm(), 0.0);
|
||||
let q = EstimatorQuery::new(&vec![1.0f32; dim], &rot);
|
||||
// No divide-by-zero; degenerate estimate is 0 inner product.
|
||||
assert_eq!(DistanceEstimator::estimate_inner_product(&sk, &q), 0.0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn centroid_path_self_query_ranks_self_first() {
|
||||
// The paper-faithful centroid path (o_r = o − c) must still rank a
|
||||
// stored vector first when queried with itself, with a shared centroid.
|
||||
let dim = 64;
|
||||
let rot = Rotation::new(0x9999, dim);
|
||||
let centroid: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.05).sin()).collect();
|
||||
let mut bank = EstimatorBank::with_centroid(rot, centroid.clone());
|
||||
let target: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.23).cos() + 1.5).collect();
|
||||
bank.insert_embedding(7, &target);
|
||||
for id in 0..24u32 {
|
||||
let v: Vec<f32> = (0..dim)
|
||||
.map(|i| ((i as f32 + id as f32) * 0.09).sin() + 1.4)
|
||||
.collect();
|
||||
bank.insert_embedding(id, &v);
|
||||
}
|
||||
let top = bank.topk_estimated_cosine(&target, 1);
|
||||
assert_eq!(top.len(), 1);
|
||||
assert_eq!(top[0].0, 7, "centroid-path self-query should rank self first");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn centroid_zero_matches_default() {
|
||||
// from_embedding_centred(None) must be byte-identical to from_embedding.
|
||||
let dim = 48;
|
||||
let rot = Rotation::new(0x4242, dim);
|
||||
let v: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.3).sin() - 0.1).collect();
|
||||
let a = EstimatorSketch::from_embedding(&v, &rot);
|
||||
let b = EstimatorSketch::from_embedding_centred(&v, &rot, None);
|
||||
assert_eq!(a.residual_norm(), b.residual_norm());
|
||||
assert_eq!(a.side_info(), b.side_info());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_self_query_ranks_self_first() {
|
||||
// A bank queried with one of its own stored vectors should rank that id
|
||||
// first under the estimator (its estimated distance to itself is the
|
||||
// smallest).
|
||||
let dim = 128;
|
||||
let rot = Rotation::new(0xABCD_1234, dim);
|
||||
let mut bank = EstimatorBank::new(rot);
|
||||
let target: Vec<f32> = (0..dim).map(|i| (i as f32 * 0.19).sin() * 2.0).collect();
|
||||
bank.insert_embedding(99, &target);
|
||||
for id in 0..32u32 {
|
||||
let v: Vec<f32> = (0..dim)
|
||||
.map(|i| ((i as f32 + id as f32 * 3.0) * 0.05).cos())
|
||||
.collect();
|
||||
bank.insert_embedding(id, &v);
|
||||
}
|
||||
let top = bank.topk_estimated(&target, 1);
|
||||
assert_eq!(top.len(), 1);
|
||||
assert_eq!(top[0].0, 99, "self-query should rank the stored self first");
|
||||
}
|
||||
}
|
||||
@@ -0,0 +1,826 @@
|
||||
//! A correct, dependency-free **float HNSW** graph-ANN index — ADR-261.
|
||||
//!
|
||||
//! # Why this exists
|
||||
//!
|
||||
//! The ruvector crate's retrieval path (AETHER re-ID hot-cache, the `sketch.rs`
|
||||
//! 1-bit prefilter, room fingerprinting) is, at its core, an **approximate
|
||||
//! nearest-neighbour** problem: dense float embedding in, top-K similar ids out.
|
||||
//! Until now the crate had **no graph index** — every `topk` was a linear scan
|
||||
//! (`O(N·d)` per query) or a 1-bit Hamming prefilter over a linear scan. That is
|
||||
//! fine at the small N the unit fixtures use, but it is `O(N)` per query and does
|
||||
//! not scale.
|
||||
//!
|
||||
//! [ADR-156 §5 #1](../../../../../docs/adr/ADR-156-ruvector-fusion-beyond-sota.md)
|
||||
//! lists **SymphonyQG** (SIGMOD 2025) as the lead beyond-SOTA ANN candidate,
|
||||
//! claiming **3.5–17× QPS over HNSW at equal recall** — but graded that claim
|
||||
//! **CLAIMED**, *"not reproduced on our hardware (no HNSW baseline exists to
|
||||
//! compare against)."* You cannot measure a ratio against a baseline you do not
|
||||
//! have. This module **builds that missing HNSW baseline**; [`crate::hnsw_quantized`]
|
||||
//! builds the quantized-rerank variant that tests the *direction* of the
|
||||
//! SymphonyQG bet. ADR-261 reports the **measured** ratio.
|
||||
//!
|
||||
//! # The algorithm (Malkov & Yashunin, TPAMI 2018)
|
||||
//!
|
||||
//! HNSW = a multi-layer navigable small-world graph. Each inserted point gets a
|
||||
//! random **level** `ℓ` (geometrically distributed, scale `m_L = 1/ln(M)`); it appears
|
||||
//! in all layers `0..=ℓ`. Layer 0 holds every point; higher layers are
|
||||
//! exponentially sparser "express lanes". A search:
|
||||
//!
|
||||
//! 1. Enters at the top layer's single entry point.
|
||||
//! 2. **Greedy-descends** each layer above 0: repeatedly hop to the neighbour
|
||||
//! closest to the query until no neighbour is closer, then drop a layer.
|
||||
//! 3. At layer 0, runs a **best-first beam search** with beam width `ef`,
|
||||
//! keeping the `ef` closest candidates seen, and returns the closest `k`.
|
||||
//!
|
||||
//! Construction inserts each point by searching for its `ef_construction`
|
||||
//! nearest existing neighbours at each of its layers, then connecting it to a
|
||||
//! pruned subset chosen by the **neighbour-selection heuristic** (Algorithm 4 in
|
||||
//! the paper): prefer neighbours that are closer to the new point than to any
|
||||
//! already-selected neighbour, which keeps the graph navigable (diverse edges)
|
||||
//! instead of clumping all edges toward one cluster.
|
||||
//!
|
||||
//! # Determinism (the proof contract)
|
||||
//!
|
||||
//! Level assignment is the only randomness, and it is driven by a **seeded
|
||||
//! SplitMix64** PRNG (the exact pattern from [`crate::rotation`]) — never
|
||||
//! the wall clock, an OS RNG, or `rand` without a seed. Two indices built from the
|
||||
//! same `(seed, params, insertion order)` are bit-identical, pinned by
|
||||
//! [`tests::hnsw_is_deterministic_for_seed`]. This matters for reproducible
|
||||
//! benchmarks: the recall/QPS numbers in ADR-261 must be regenerable.
|
||||
//!
|
||||
//! # Robustness (no panic on degenerate input)
|
||||
//!
|
||||
//! Empty index, `k > n`, `k == 0`, a single node, zero-dimension vectors,
|
||||
//! ragged-length queries, and `ef < k` are all handled without panicking —
|
||||
//! pinned by the `*_no_panic` / degenerate tests. Graph traversal is bounded by
|
||||
//! the visited-set and the candidate beam, so there is no unbounded recursion
|
||||
//! (the search is iterative, using explicit heaps).
|
||||
|
||||
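The level rule described in the module docs, `floor(-ln(U) * m_L)` with `m_L = 1/ln(M)`, implies `P(level >= l) = M^(-l)`, so roughly `1 - 1/M` of all nodes stay on layer 0. A minimal sketch that makes this checkable, assuming a standalone SplitMix64 step and two hypothetical helpers (`draw_level`, `level_histogram`) that are not part of the crate:

```rust
/// SplitMix64 step (public domain, Sebastiano Vigna).
fn split_mix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical helper: draw one HNSW level for fan-out `m`.
fn draw_level(state: &mut u64, m: usize) -> usize {
    let m_l = 1.0 / (m.max(2) as f64).ln();
    // Top 53 bits mapped into (0, 1], so ln(u) is finite and never positive.
    let u = ((split_mix64(state) >> 11) as f64 + 1.0) / ((1u64 << 53) as f64 + 1.0);
    (-u.ln() * m_l).floor() as usize
}

/// Hypothetical helper: histogram of levels over `n` draws from one seed.
/// `hist[l]` is the number of draws that landed on level `l`.
fn level_histogram(seed: u64, m: usize, n: usize) -> Vec<usize> {
    let mut state = seed;
    let mut hist: Vec<usize> = Vec::new();
    for _ in 0..n {
        let level = draw_level(&mut state, m);
        if hist.len() <= level {
            hist.resize(level + 1, 0);
        }
        hist[level] += 1;
    }
    hist
}
```

With `M = 16`, about 15 in 16 draws should land on level 0, and the same seed always reproduces the same histogram.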
use std::cmp::Ordering;
|
||||
use std::collections::{BinaryHeap, HashSet};
|
||||
|
||||
/// Distance metric for the index. Both are computed over `&[f32]` slices with an
/// `f64` accumulator for numerical stability on long vectors.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
pub enum Metric {
|
||||
/// Squared euclidean distance `Σ (a_i − b_i)²`. Monotone in euclidean
|
||||
/// distance, so top-K ranking is identical; we skip the sqrt.
|
||||
L2,
|
||||
/// Cosine **distance** `1 − cos(a, b)`. Smaller = more similar. This is
|
||||
/// AETHER's actual angular metric and what the `sketch.rs` sign code
|
||||
/// approximates, so it is the default for ruvector re-ID.
|
||||
Cosine,
|
||||
}
|
||||
|
||||
impl Metric {
|
||||
/// Distance between two equal-length slices under this metric.
|
||||
///
|
||||
/// Ragged lengths are handled charitably (compared over the shorter prefix);
|
||||
/// a degenerate (zero-norm) cosine input yields the orthogonal-case distance
/// `1.0` (cosine distance ranges over `[0, 2]`) rather than a NaN. Never panics.
|
||||
#[inline]
|
||||
pub fn distance(self, a: &[f32], b: &[f32]) -> f32 {
|
||||
let n = a.len().min(b.len());
|
||||
match self {
|
||||
Metric::L2 => {
|
||||
let mut acc = 0.0f64;
|
||||
for i in 0..n {
|
||||
let d = a[i] as f64 - b[i] as f64;
|
||||
acc += d * d;
|
||||
}
|
||||
acc as f32
|
||||
}
|
||||
Metric::Cosine => {
|
||||
let mut dot = 0.0f64;
|
||||
let mut na = 0.0f64;
|
||||
let mut nb = 0.0f64;
|
||||
for i in 0..n {
|
||||
let (x, y) = (a[i] as f64, b[i] as f64);
|
||||
dot += x * y;
|
||||
na += x * x;
|
||||
nb += y * y;
|
||||
}
|
||||
let denom = (na * nb).sqrt();
|
||||
if denom < 1e-12 {
|
||||
1.0
|
||||
} else {
|
||||
(1.0 - dot / denom) as f32
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
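Cosine distance `1 - cos(a, b)` ranges over `[0, 2]`: `0` for parallel vectors, `1` for orthogonal ones, `2` for opposite ones. A minimal standalone sketch follows; the free function `cosine_distance` is hypothetical and only mirrors the zero-norm guard and prefix handling shown above:

```rust
/// Cosine distance over the shared prefix of `a` and `b`, accumulated in f64.
/// Hypothetical free function for illustration only.
fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let (mut dot, mut na, mut nb) = (0.0f64, 0.0f64, 0.0f64);
    // `zip` stops at the shorter slice, so ragged inputs cannot index out of bounds.
    for (&x, &y) in a.iter().zip(b) {
        let (x, y) = (x as f64, y as f64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    let denom = (na * nb).sqrt();
    if denom < 1e-12 {
        1.0 // degenerate input: treat as orthogonal instead of dividing by zero
    } else {
        (1.0 - dot / denom) as f32
    }
}
```

Scaling either vector by a positive constant leaves the result unchanged, which is why this metric suits sign-code sketches that discard magnitude.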
|
||||
/// Construction / search hyper-parameters for an [`HnswIndex`].
|
||||
///
|
||||
/// Defaults follow the paper's recommended starting points (`M = 16`,
|
||||
/// `ef_construction = 200`). `ef_search` is the query-time beam width; larger
|
||||
/// `ef_search` trades QPS for recall — the knob the ADR-261 benchmark sweeps to
|
||||
/// find the equal-recall operating point.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
pub struct HnswParams {
|
||||
/// Max neighbours per node on layers ≥ 1. Layer 0 uses `2·M` (`m_max0`),
|
||||
/// the paper's standard asymmetry (the base layer needs higher degree).
|
||||
pub m: usize,
|
||||
/// Candidate list size during construction (`efConstruction`). Larger =
|
||||
/// better-connected graph, slower build.
|
||||
pub ef_construction: usize,
|
||||
/// Default beam width at query time (`ef`). Overridable per-query in
|
||||
/// [`HnswIndex::search`].
|
||||
pub ef_search: usize,
|
||||
/// Seed for the level-assignment PRNG. Fixed ⇒ reproducible graph.
|
||||
pub seed: u64,
|
||||
}
|
||||
|
||||
impl Default for HnswParams {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
m: 16,
|
||||
ef_construction: 200,
|
||||
ef_search: 64,
|
||||
seed: 0x1157_0000_0000_0001u64,
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// A `(distance, id)` pair with a total, deterministic ordering by distance.
/// `BinaryHeap<Scored>` is a **max-heap**, so `peek()` is the *farthest* entry;
/// wrapping it in [`MinScored`] reverses the comparison so `peek()` is the
/// *closest*. Two explicit newtypes keep the intent unmistakable at each call site.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
struct Scored {
|
||||
dist: f32,
|
||||
id: u32,
|
||||
}
|
||||
|
||||
impl PartialEq for Scored {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.dist == other.dist && self.id == other.id
|
||||
}
|
||||
}
|
||||
impl Eq for Scored {}
|
||||
|
||||
/// Max-heap ordering: larger `dist` is "greater" ⇒ at the top. Ties broken by
|
||||
/// id so the order is total and deterministic.
|
||||
impl Ord for Scored {
|
||||
fn cmp(&self, other: &Self) -> Ordering {
|
||||
self.dist
|
||||
.partial_cmp(&other.dist)
|
||||
.unwrap_or(Ordering::Equal)
|
||||
.then(self.id.cmp(&other.id))
|
||||
}
|
||||
}
|
||||
impl PartialOrd for Scored {
|
||||
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
|
||||
Some(self.cmp(other))
|
||||
}
|
||||
}
|
||||
|
||||
/// `Reverse`-equivalent for a min-heap (closest at top) without pulling in
|
||||
/// `std::cmp::Reverse` boilerplate at every site.
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
struct MinScored(Scored);
|
||||
impl PartialEq for MinScored {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.0 == other.0
|
||||
}
|
||||
}
|
||||
impl Eq for MinScored {}
|
||||
impl Ord for MinScored {
|
||||
fn cmp(&self, other: &Self) -> Ordering {
|
||||
other.0.cmp(&self.0) // reversed
|
||||
}
|
||||
}
|
||||
impl PartialOrd for MinScored {
|
||||
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
|
||||
Some(self.cmp(other))
|
||||
}
|
||||
}
|
||||
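The two newtypes above give a max-heap and a min-heap over the same `(dist, id)` pair. As an alternative for comparison (not what this module does), the same pairing can be sketched with the standard library's `std::cmp::Reverse` and `f32::total_cmp`; `Item` and `closest_and_farthest` are hypothetical names:

```rust
use std::cmp::{Ordering, Reverse};
use std::collections::BinaryHeap;

#[derive(Debug, Clone, Copy)]
struct Item {
    dist: f32,
    id: u32,
}

impl Ord for Item {
    fn cmp(&self, other: &Self) -> Ordering {
        // Total order on the distance, ties broken by id for determinism.
        self.dist.total_cmp(&other.dist).then(self.id.cmp(&other.id))
    }
}
impl PartialOrd for Item {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        Some(self.cmp(other))
    }
}
impl PartialEq for Item {
    fn eq(&self, other: &Self) -> bool {
        self.cmp(other) == Ordering::Equal
    }
}
impl Eq for Item {}

/// Returns `(closest_id, farthest_id)`, or `None` for an empty input.
fn closest_and_farthest(items: &[Item]) -> Option<(u32, u32)> {
    // Plain BinaryHeap: the largest distance sits on top.
    let max_heap: BinaryHeap<Item> = items.iter().copied().collect();
    // Reverse flips the order: the smallest distance sits on top.
    let min_heap: BinaryHeap<Reverse<Item>> = items.iter().copied().map(Reverse).collect();
    let farthest = max_heap.peek()?.id;
    let closest = min_heap.peek()?.0.id;
    Some((closest, farthest))
}
```

The hand-written `MinScored` keeps call sites free of the `.0` unwrapping that `Reverse` requires, which is the trade-off the module's comment points at.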
|
||||
/// A multi-layer HNSW graph index over dense `Vec<f32>` embeddings.
|
||||
///
|
||||
/// IDs are the **insertion index** (`0..len`), returned by [`HnswIndex::search`]
|
||||
/// alongside the distance. The original vectors are retained (the graph needs
|
||||
/// them for distance computation at query time), so memory is
|
||||
/// `O(N·d) + O(N·M)` — the float vectors plus the adjacency lists.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct HnswIndex {
|
||||
metric: Metric,
|
||||
params: HnswParams,
|
||||
dim: usize,
|
||||
/// Stored vectors, indexed by id.
|
||||
vectors: Vec<Vec<f32>>,
|
||||
/// `links[id][layer]` = neighbour ids of `id` on `layer`. A node of level
|
||||
/// `ℓ` has `ℓ+1` layers (`0..=ℓ`).
|
||||
links: Vec<Vec<Vec<u32>>>,
|
||||
/// Per-node top level.
|
||||
levels: Vec<usize>,
|
||||
/// Current entry point id (the highest-level node), or `None` if empty.
|
||||
entry: Option<u32>,
|
||||
/// Highest level currently present in the graph.
|
||||
top_level: usize,
|
||||
/// PRNG state for level assignment (advances per insert).
|
||||
rng_state: u64,
|
||||
}
|
||||
|
||||
impl HnswIndex {
|
||||
/// Create an empty index with the given metric and parameters.
|
||||
///
|
||||
/// `dim` is the expected embedding dimension. Inserts of a different length
|
||||
/// are accepted charitably (the metric compares over the shorter prefix), so
|
||||
/// a wrong-length vector degrades recall rather than panicking — but callers
|
||||
/// should keep dimension uniform.
|
||||
pub fn new(dim: usize, metric: Metric, params: HnswParams) -> Self {
|
||||
Self {
|
||||
metric,
|
||||
params,
|
||||
dim,
|
||||
vectors: Vec::new(),
|
||||
links: Vec::new(),
|
||||
levels: Vec::new(),
|
||||
entry: None,
|
||||
top_level: 0,
|
||||
rng_state: params.seed.wrapping_add(0x9E37_79B9_7F4A_7C15),
|
||||
}
|
||||
}
|
||||
|
||||
/// Number of indexed points.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
self.vectors.len()
|
||||
}
|
||||
|
||||
/// True iff the index holds no points.
|
||||
#[inline]
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.vectors.is_empty()
|
||||
}
|
||||
|
||||
/// The metric this index ranks by.
|
||||
#[inline]
|
||||
pub fn metric(&self) -> Metric {
|
||||
self.metric
|
||||
}
|
||||
|
||||
/// The expected embedding dimension.
|
||||
#[inline]
|
||||
pub fn dim(&self) -> usize {
|
||||
self.dim
|
||||
}
|
||||
|
||||
/// The current entry-point id (highest-level node), or `None` if empty.
|
||||
/// Exposed so the quantized variant ([`crate::hnsw_quantized`]) can traverse
|
||||
/// the **same** graph with a different (quantized) score.
|
||||
#[inline]
|
||||
pub fn entry_point(&self) -> Option<u32> {
|
||||
self.entry
|
||||
}
|
||||
|
||||
/// The highest level currently present in the graph.
|
||||
#[inline]
|
||||
pub fn top_level(&self) -> usize {
|
||||
self.top_level
|
||||
}
|
||||
|
||||
/// The default query-time beam width (`ef_search`) from this index's params.
|
||||
#[inline]
|
||||
pub fn params_ef_search(&self) -> usize {
|
||||
self.params.ef_search
|
||||
}
|
||||
|
||||
/// Borrow the neighbour ids of `id` on `layer`. Returns an empty slice if the
|
||||
/// id is unknown or the node does not reach that layer — never panics. Used
|
||||
/// by the quantized variant to walk the shared graph.
|
||||
#[inline]
|
||||
pub fn neighbours(&self, id: u32, layer: usize) -> &[u32] {
|
||||
match self.links.get(id as usize).and_then(|l| l.get(layer)) {
|
||||
Some(v) => v.as_slice(),
|
||||
None => &[],
|
||||
}
|
||||
}
|
||||
|
||||
/// `m_max` for a layer: `2·M` on layer 0, `M` above. The base layer carries
|
||||
/// every node and needs higher degree to stay connected (the paper's
|
||||
/// asymmetric degree cap).
|
||||
#[inline]
|
||||
fn m_max(&self, layer: usize) -> usize {
|
||||
if layer == 0 {
|
||||
self.params.m * 2
|
||||
} else {
|
||||
self.params.m
|
||||
}
|
||||
}
|
||||
|
||||
/// Draw the next node's level from a geometric distribution with parameter
|
||||
/// `m_l = 1/ln(M)` — the paper's level generator — using the **seeded**
|
||||
/// SplitMix64 stream. `floor(−ln(U) · m_l)` with `U ∈ (0, 1]`.
|
||||
fn assign_level(&mut self) -> usize {
|
||||
let m = self.params.m.max(2) as f64;
|
||||
let m_l = 1.0 / m.ln();
|
||||
// Uniform in (0, 1] from the top 53 bits of a SplitMix64 word.
|
||||
let r = split_mix64(&mut self.rng_state);
|
||||
let u = (((r >> 11) as f64) + 1.0) / ((1u64 << 53) as f64 + 1.0);
|
||||
let level = (-(u.ln()) * m_l).floor();
|
||||
if level.is_finite() && level >= 0.0 {
|
||||
level as usize
|
||||
} else {
|
||||
0
|
||||
}
|
||||
}
|
||||
|
||||
/// Insert `embedding` with the next sequential id. Returns the assigned id.
|
||||
///
|
||||
/// Builds the node's adjacency by searching the existing graph for its
|
||||
/// nearest neighbours at each of its layers and connecting via the
|
||||
/// neighbour-selection heuristic. The first insert becomes the entry point.
|
||||
pub fn insert(&mut self, embedding: &[f32]) -> u32 {
|
||||
let id = self.vectors.len() as u32;
|
||||
let vec = embedding.to_vec();
|
||||
let node_level = self.assign_level();
|
||||
|
||||
// Push the node into the arrays UP FRONT with empty per-layer link lists.
|
||||
// This is load-bearing: the bidirectional wiring below does
|
||||
// `self.links[nbr][l].push(id)`, after which a neighbour points at `id`;
|
||||
// a subsequent traversal step in the SAME insert can hop to that
|
||||
// neighbour and read `self.links[id]`. If `id`'s links did not exist yet
|
||||
// that read panics (the bug the recall gate caught). The new node has no
|
||||
// *incoming* edges until we add them, and empty outgoing lists, so it is
|
||||
// unreachable by the searches that run before its edges are wired —
|
||||
// pushing it early is safe and keeps every `self.links[*]` index valid.
|
||||
self.vectors.push(vec.clone());
|
||||
self.links.push(vec![Vec::new(); node_level + 1]);
|
||||
self.levels.push(node_level);
|
||||
|
||||
// First node: it is the entry point, no neighbours to connect.
|
||||
if self.entry.is_none() {
|
||||
self.entry = Some(id);
|
||||
self.top_level = node_level;
|
||||
return id;
|
||||
}
|
||||
|
||||
let entry = self.entry.unwrap();
|
||||
let mut ep = entry;
|
||||
|
||||
// Phase 1: greedy-descend from the top of the graph down to the layer
|
||||
// just above the node's own top level, refining the single entry point.
|
||||
let mut layer = self.top_level;
|
||||
// `layer > node_level >= 0` holds inside the loop, so the decrement cannot underflow.
while layer > node_level {
ep = self.greedy_closest(&vec, ep, layer);
layer -= 1;
}
|
||||
|
||||
// Phase 2: from min(node_level, top_level) down to 0, search for
|
||||
// ef_construction candidates, select neighbours, and wire bidirectional
|
||||
// edges (pruning the neighbour's list if it overflows m_max).
|
||||
let start = node_level.min(self.top_level);
|
||||
let mut layer = start as isize;
|
||||
while layer >= 0 {
|
||||
let l = layer as usize;
|
||||
let candidates =
|
||||
self.search_layer(&vec, &[ep], self.params.ef_construction.max(1), l);
|
||||
let selected = self.select_neighbours(&vec, &candidates, self.m_max(l));
|
||||
|
||||
// Connect node -> selected (write straight into the node's slot).
|
||||
self.links[id as usize][l] = selected.iter().map(|s| s.id).collect();
|
||||
|
||||
// Connect selected -> node (bidirectional), pruning if needed.
|
||||
for s in &selected {
|
||||
let nbr = s.id as usize;
|
||||
self.links[nbr][l].push(id);
|
||||
if self.links[nbr][l].len() > self.m_max(l) {
|
||||
self.prune_neighbours(nbr as u32, l);
|
||||
}
|
||||
}
|
||||
|
||||
// Move the entry for the next-lower layer to the closest candidate.
|
||||
if let Some(best) = candidates
|
||||
.iter()
|
||||
.min_by(|a, b| a.dist.partial_cmp(&b.dist).unwrap_or(Ordering::Equal))
|
||||
{
|
||||
ep = best.id;
|
||||
}
|
||||
layer -= 1;
|
||||
}
|
||||
|
||||
if node_level > self.top_level {
|
||||
self.top_level = node_level;
|
||||
self.entry = Some(id);
|
||||
}
|
||||
id
|
||||
}
|
||||
|
||||
/// Greedy single-best descent on one layer: hop to the neighbour closest to
|
||||
/// `query` until no neighbour improves. Iterative (bounded by the graph) —
|
||||
/// no recursion.
|
||||
fn greedy_closest(&self, query: &[f32], start: u32, layer: usize) -> u32 {
|
||||
let mut best = start;
|
||||
let mut best_d = self.metric.distance(query, &self.vectors[best as usize]);
|
||||
loop {
|
||||
let mut improved = false;
|
||||
for &nbr in &self.links[best as usize][layer] {
|
||||
let d = self.metric.distance(query, &self.vectors[nbr as usize]);
|
||||
if d < best_d {
|
||||
best_d = d;
|
||||
best = nbr;
|
||||
improved = true;
|
||||
}
|
||||
}
|
||||
if !improved {
|
||||
return best;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Beam search on one layer (paper Algorithm 2): best-first expansion from
|
||||
/// `entry_points`, keeping the `ef` closest results. Returns the result set
|
||||
/// (unsorted; callers sort/truncate). Bounded by a visited set + the `ef`
|
||||
/// result heap — no recursion, no unbounded growth.
|
||||
fn search_layer(
|
||||
&self,
|
||||
query: &[f32],
|
||||
entry_points: &[u32],
|
||||
ef: usize,
|
||||
layer: usize,
|
||||
) -> Vec<Scored> {
|
||||
let mut visited: HashSet<u32> = HashSet::new();
|
||||
// `candidates`: min-heap (closest first) of nodes to expand.
|
||||
let mut candidates: BinaryHeap<MinScored> = BinaryHeap::new();
|
||||
// `results`: max-heap (farthest first) of the best-ef found so far, so
|
||||
// the top is the current worst and is cheap to evict.
|
||||
let mut results: BinaryHeap<Scored> = BinaryHeap::new();
|
||||
|
||||
for &ep in entry_points {
|
||||
if ep as usize >= self.vectors.len() {
|
||||
continue;
|
||||
}
|
||||
let d = self.metric.distance(query, &self.vectors[ep as usize]);
|
||||
let s = Scored { dist: d, id: ep };
|
||||
visited.insert(ep);
|
||||
candidates.push(MinScored(s));
|
||||
results.push(s);
|
||||
}
|
||||
// Cap results at ef from the start.
|
||||
while results.len() > ef {
|
||||
results.pop();
|
||||
}
|
||||
|
||||
while let Some(MinScored(cur)) = candidates.pop() {
|
||||
// Stop when the closest unexpanded candidate is farther than the
|
||||
// current worst result and the result set is already full.
|
||||
let worst = results.peek().map(|s| s.dist).unwrap_or(f32::INFINITY);
|
||||
if cur.dist > worst && results.len() >= ef {
|
||||
break;
|
||||
}
|
||||
for &nbr in &self.links[cur.id as usize][layer] {
|
||||
if !visited.insert(nbr) {
|
||||
continue;
|
||||
}
|
||||
let d = self.metric.distance(query, &self.vectors[nbr as usize]);
|
||||
let worst = results.peek().map(|s| s.dist).unwrap_or(f32::INFINITY);
|
||||
if results.len() < ef || d < worst {
|
||||
let s = Scored { dist: d, id: nbr };
|
||||
candidates.push(MinScored(s));
|
||||
results.push(s);
|
||||
while results.len() > ef {
|
||||
results.pop();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
results.into_vec()
|
||||
}
|
||||
|
||||
/// Neighbour-selection heuristic (paper Algorithm 4): from `candidates`,
|
||||
/// greedily pick up to `m` that are **closer to the new point than to any
|
||||
/// already-picked neighbour**, giving diverse, navigable edges instead of a
|
||||
/// clump. Candidates are considered nearest-first.
|
||||
fn select_neighbours(&self, _base: &[f32], candidates: &[Scored], m: usize) -> Vec<Scored> {
|
||||
let mut sorted = candidates.to_vec();
|
||||
sorted.sort_by(|a, b| a.dist.partial_cmp(&b.dist).unwrap_or(Ordering::Equal));
|
||||
let mut selected: Vec<Scored> = Vec::with_capacity(m);
|
||||
for cand in sorted {
|
||||
if selected.len() >= m {
|
||||
break;
|
||||
}
|
||||
// Keep `cand` only if it is closer to `base` than to every already
|
||||
// selected neighbour — the diversity condition.
|
||||
let cand_vec = &self.vectors[cand.id as usize];
|
||||
let mut keep = true;
|
||||
for sel in &selected {
|
||||
let d_cand_sel = self.metric.distance(cand_vec, &self.vectors[sel.id as usize]);
|
||||
if d_cand_sel < cand.dist {
|
||||
keep = false;
|
||||
break;
|
||||
}
|
||||
}
|
||||
if keep {
|
||||
selected.push(cand);
|
||||
}
|
||||
}
|
||||
// If the diversity filter left us short (sparse graph), backfill with the
|
||||
// remaining nearest candidates so the node is not under-connected.
|
||||
if selected.len() < m {
|
||||
let chosen: HashSet<u32> = selected.iter().map(|s| s.id).collect();
|
||||
let mut rest: Vec<Scored> = candidates
|
||||
.iter()
|
||||
.filter(|c| !chosen.contains(&c.id))
|
||||
.copied()
|
||||
.collect();
|
||||
rest.sort_by(|a, b| a.dist.partial_cmp(&b.dist).unwrap_or(Ordering::Equal));
|
||||
for c in rest {
|
||||
if selected.len() >= m {
|
||||
break;
|
||||
}
|
||||
selected.push(c);
|
||||
}
|
||||
}
|
||||
selected
|
||||
}
|
||||
|
||||
/// Re-prune a node's neighbour list on `layer` back down to `m_max` using
|
||||
/// the selection heuristic, after a bidirectional edge pushed it over cap.
|
||||
fn prune_neighbours(&mut self, id: u32, layer: usize) {
|
||||
let base = self.vectors[id as usize].clone();
|
||||
let current: Vec<Scored> = self.links[id as usize][layer]
|
||||
.iter()
|
||||
.map(|&nbr| Scored {
|
||||
dist: self.metric.distance(&base, &self.vectors[nbr as usize]),
|
||||
id: nbr,
|
||||
})
|
||||
.collect();
|
||||
let kept = self.select_neighbours(&base, &current, self.m_max(layer));
|
||||
self.links[id as usize][layer] = kept.iter().map(|s| s.id).collect();
|
||||
}
|
||||
|
||||
/// Search for the `k` nearest neighbours of `query`, using beam width `ef`
|
||||
/// (clamped to at least `k`). Returns up to `k` `(id, distance)` pairs sorted
|
||||
/// ascending by distance.
|
||||
///
|
||||
/// Degenerate cases return cleanly: empty index ⇒ empty vec; `k == 0` ⇒ empty
|
||||
/// vec; `k > len` ⇒ all points; a single node ⇒ that node. Never panics.
|
||||
pub fn search(&self, query: &[f32], k: usize, ef: usize) -> Vec<(u32, f32)> {
|
||||
if k == 0 || self.is_empty() {
|
||||
return Vec::new();
|
||||
}
|
||||
let entry = match self.entry {
|
||||
Some(e) => e,
|
||||
None => return Vec::new(),
|
||||
};
|
||||
let ef = ef.max(k).max(1);
|
||||
|
||||
// Greedy-descend the upper layers to a good layer-0 entry point.
|
||||
let mut ep = entry;
|
||||
let mut layer = self.top_level;
|
||||
while layer > 0 {
|
||||
ep = self.greedy_closest(query, ep, layer);
|
||||
layer -= 1;
|
||||
}
|
||||
// Beam search on layer 0.
|
||||
let mut results = self.search_layer(query, &[ep], ef, 0);
|
||||
results.sort_by(|a, b| a.dist.partial_cmp(&b.dist).unwrap_or(Ordering::Equal));
|
||||
results.truncate(k);
|
||||
results.into_iter().map(|s| (s.id, s.dist)).collect()
|
||||
}
|
||||
|
||||
/// Search using the index's configured default `ef_search`.
|
||||
#[inline]
|
||||
pub fn search_default(&self, query: &[f32], k: usize) -> Vec<(u32, f32)> {
|
||||
self.search(query, k, self.params.ef_search)
|
||||
}
|
||||
|
||||
/// Borrow a stored vector by id (for the quantized variant / reranking).
|
||||
#[inline]
|
||||
pub fn vector(&self, id: u32) -> Option<&[f32]> {
|
||||
self.vectors.get(id as usize).map(|v| v.as_slice())
|
||||
}
|
||||
|
||||
/// Brute-force exact top-K linear scan over the stored vectors — the ANN
|
||||
/// **ground truth** and the linear-scan baseline the benchmark measures
|
||||
/// against. `O(N·d)` per query. Returns up to `k` `(id, distance)` ascending.
|
||||
pub fn brute_force(&self, query: &[f32], k: usize) -> Vec<(u32, f32)> {
|
||||
if k == 0 || self.is_empty() {
|
||||
return Vec::new();
|
||||
}
|
||||
let mut scored: Vec<(u32, f32)> = self
|
||||
.vectors
|
||||
.iter()
|
||||
.enumerate()
|
||||
.map(|(i, v)| (i as u32, self.metric.distance(query, v)))
|
||||
.collect();
|
||||
scored.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
|
||||
scored.truncate(k);
|
||||
scored
|
||||
}
|
||||
}
|
||||
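The layer-0 beam search is the core of the query path. A minimal single-layer sketch follows, simplified to 1-D points and a plain adjacency list; both simplifications are assumptions made here, and `key` and `beam_search` are hypothetical names, not the crate's API:

```rust
use std::cmp::Reverse;
use std::collections::{BinaryHeap, HashSet};

/// Orderable key for a non-negative, finite f32 distance: for such values the
/// IEEE-754 bit pattern sorts in the same order as the number itself.
fn key(d: f32) -> u32 {
    d.to_bits()
}

/// Best-first beam search on one layer. `adj[i]` lists the neighbours of
/// node `i`. Returns up to `ef` node ids, nearest first.
fn beam_search(points: &[f32], adj: &[Vec<u32>], query: f32, entry: u32, ef: usize) -> Vec<u32> {
    if ef == 0 || entry as usize >= points.len() {
        return Vec::new();
    }
    let dist = |id: u32| key((points[id as usize] - query).abs());
    let mut visited: HashSet<u32> = HashSet::new();
    // Min-heap of nodes still to expand (closest on top).
    let mut candidates: BinaryHeap<Reverse<(u32, u32)>> = BinaryHeap::new();
    // Max-heap of the best `ef` results so far (current worst on top).
    let mut results: BinaryHeap<(u32, u32)> = BinaryHeap::new();
    visited.insert(entry);
    candidates.push(Reverse((dist(entry), entry)));
    results.push((dist(entry), entry));

    while let Some(Reverse((d, cur))) = candidates.pop() {
        let worst = results.peek().map(|r| r.0).unwrap_or(u32::MAX);
        if results.len() >= ef && d > worst {
            break; // the closest unexpanded node cannot improve a full result set
        }
        for &nbr in adj.get(cur as usize).map(|v| v.as_slice()).unwrap_or(&[]) {
            if nbr as usize >= points.len() || !visited.insert(nbr) {
                continue; // unknown id or already seen
            }
            let dn = dist(nbr);
            let worst = results.peek().map(|r| r.0).unwrap_or(u32::MAX);
            if results.len() < ef || dn < worst {
                candidates.push(Reverse((dn, nbr)));
                results.push((dn, nbr));
                if results.len() > ef {
                    results.pop(); // evict the current worst
                }
            }
        }
    }
    results.into_sorted_vec().into_iter().map(|(_, id)| id).collect()
}
```

On a chain graph where node `i` sits at position `i` and links to `i - 1` and `i + 1`, a search from node 0 walks toward the query and stops once the beam is full and no unexpanded node is closer than the current worst result.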
|
||||
/// SplitMix64 step — the same deterministic PRNG used by [`crate::rotation`].
|
||||
/// Public-domain (Sebastiano Vigna). Dependency-free and reproducible.
|
||||
#[inline]
|
||||
pub(crate) fn split_mix64(state: &mut u64) -> u64 {
|
||||
*state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
|
||||
let mut z = *state;
|
||||
z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
|
||||
z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
|
||||
z ^ (z >> 31)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// SplitMix64-driven uniform in [0,1) for building fixtures (mirrors
|
||||
/// `coverage.rs`'s style so the planted-cluster geometry matches).
|
||||
fn unif01(state: &mut u64) -> f32 {
|
||||
let r = split_mix64(state);
|
||||
((r >> 40) as f32) / ((1u64 << 24) as f32)
|
||||
}
|
||||
fn gauss(state: &mut u64) -> f32 {
|
||||
let u1 = unif01(state).max(1e-7);
|
||||
let u2 = unif01(state);
|
||||
(-2.0 * u1.ln()).sqrt() * (std::f32::consts::TAU * u2).cos()
|
||||
}
|
||||
|
||||
/// Build a planted-cluster fixture: `n` vectors of `dim`, in `clusters`
|
||||
/// Gaussian clusters. Returns the vectors. Deterministic from `seed`.
|
||||
fn planted(dim: usize, n: usize, clusters: usize, seed: u64) -> Vec<Vec<f32>> {
|
||||
let centres: Vec<Vec<f32>> = (0..clusters)
|
||||
.map(|c| {
|
||||
let mut s = seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
(0..dim).map(|_| gauss(&mut s) * 3.0).collect()
|
||||
})
|
||||
.collect();
|
||||
(0..n)
|
||||
.map(|i| {
|
||||
let c = i % clusters;
|
||||
let mut s = seed ^ (i as u64).wrapping_mul(0x9E37);
|
||||
(0..dim).map(|d| centres[c][d] + gauss(&mut s) * 0.35).collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn build(vectors: &[Vec<f32>], metric: Metric, seed: u64) -> HnswIndex {
|
||||
let params = HnswParams {
|
||||
m: 16,
|
||||
ef_construction: 200,
|
||||
ef_search: 64,
|
||||
seed,
|
||||
};
|
||||
let mut idx = HnswIndex::new(vectors[0].len(), metric, params);
|
||||
for v in vectors {
|
||||
idx.insert(v);
|
||||
}
|
||||
idx
|
||||
}
|
||||
|
||||
/// Recall@k of HNSW search vs brute-force ground truth, averaged over queries
|
||||
/// drawn from the same planted clusters.
|
||||
fn recall_at_k(
|
||||
idx: &HnswIndex,
|
||||
vectors: &[Vec<f32>],
|
||||
dim: usize,
|
||||
clusters: usize,
|
||||
k: usize,
|
||||
ef: usize,
|
||||
n_queries: usize,
|
||||
seed: u64,
|
||||
) -> f64 {
|
||||
let centres_seed = seed; // reuse fixture seed for matching cluster geometry
|
||||
let mut total = 0.0f64;
|
||||
for q in 0..n_queries {
|
||||
let c = q % clusters;
|
||||
let mut s = centres_seed ^ 0xDEAD_0000 ^ (q as u64).wrapping_mul(0x2545_F491);
|
||||
// A query near cluster centre c: regenerate the centre then jitter.
|
||||
let mut cs = centres_seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
let centre: Vec<f32> = (0..dim).map(|_| gauss(&mut cs) * 3.0).collect();
|
||||
let qv: Vec<f32> = (0..dim).map(|d| centre[d] + gauss(&mut s) * 0.35).collect();
|
||||
|
||||
let truth: HashSet<u32> = idx.brute_force(&qv, k).into_iter().map(|(id, _)| id).collect();
|
||||
let got = idx.search(&qv, k, ef);
|
||||
let hit = got.iter().filter(|(id, _)| truth.contains(id)).count();
|
||||
total += hit as f64 / k as f64;
|
||||
let _ = vectors; // unused here: ground truth comes from `idx.brute_force`
|
||||
}
|
||||
total / n_queries as f64
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn empty_index_search_is_empty_no_panic() {
|
||||
let idx = HnswIndex::new(8, Metric::L2, HnswParams::default());
|
||||
assert!(idx.is_empty());
|
||||
assert!(idx.search(&[0.0; 8], 5, 16).is_empty());
|
||||
assert!(idx.brute_force(&[0.0; 8], 5).is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn single_node_returns_itself() {
|
||||
let mut idx = HnswIndex::new(4, Metric::L2, HnswParams::default());
|
||||
let id = idx.insert(&[1.0, 2.0, 3.0, 4.0]);
|
||||
assert_eq!(id, 0);
|
||||
let r = idx.search(&[1.0, 2.0, 3.0, 4.0], 5, 16);
|
||||
assert_eq!(r.len(), 1);
|
||||
assert_eq!(r[0].0, 0);
|
||||
assert!(r[0].1 < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn k_zero_and_k_gt_n_no_panic() {
|
||||
let vectors = planted(16, 40, 4, 0xABCD);
|
||||
let idx = build(&vectors, Metric::L2, 0x1234);
|
||||
assert!(idx.search(&vectors[0], 0, 16).is_empty());
|
||||
// k > n returns all n.
|
||||
let r = idx.search(&vectors[0], 1000, 64);
|
||||
assert_eq!(r.len(), 40);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn ragged_query_no_panic() {
|
||||
let vectors = planted(16, 30, 3, 0x55);
|
||||
let idx = build(&vectors, Metric::Cosine, 0x66);
|
||||
// Short and long queries must not panic.
|
||||
assert!(!idx.search(&[1.0, 2.0, 3.0], 3, 16).is_empty());
|
||||
let long: Vec<f32> = (0..100).map(|i| i as f32).collect();
|
||||
assert!(!idx.search(&long, 3, 16).is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn self_query_ranks_self_first() {
|
||||
let vectors = planted(32, 200, 8, 0x77);
|
||||
let idx = build(&vectors, Metric::L2, 0x88);
|
||||
for &probe in &[0usize, 50, 137, 199] {
|
||||
let r = idx.search(&vectors[probe], 1, 64);
|
||||
assert_eq!(r.len(), 1);
|
||||
assert_eq!(r[0].0, probe as u32, "self-query should return the stored self");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn hnsw_is_deterministic_for_seed() {
|
||||
// Same (seed, params, insertion order) ⇒ identical level assignment and
|
||||
// identical search output.
|
||||
let vectors = planted(24, 150, 6, 0x2222);
|
||||
let a = build(&vectors, Metric::Cosine, 0xFEED);
|
||||
let b = build(&vectors, Metric::Cosine, 0xFEED);
|
||||
assert_eq!(a.levels, b.levels, "level assignment must be deterministic");
|
||||
let q = &vectors[42];
|
||||
assert_eq!(a.search(q, 10, 64), b.search(q, 10, 64));
|
||||
// A different seed (almost surely) changes the level structure.
|
||||
let c = build(&vectors, Metric::Cosine, 0x1357);
|
||||
assert_ne!(a.levels, c.levels, "different seed should change levels");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn recall_at_10_meets_correctness_gate_l2() {
|
||||
// THE CORRECTNESS GATE (ADR-261): HNSW recall@10 vs brute-force must be
|
||||
// >= 0.95 at a reasonable ef. Low recall ⇒ a bug in the graph.
|
||||
let dim = 64;
|
||||
let n = 2000;
|
||||
let clusters = 32;
|
||||
let seed = 0x9999;
|
||||
let vectors = planted(dim, n, clusters, seed);
|
||||
let idx = build(&vectors, Metric::L2, 0xAAAA);
|
||||
let recall = recall_at_k(&idx, &vectors, dim, clusters, 10, 128, 64, seed);
|
||||
assert!(
|
||||
recall >= 0.95,
|
||||
"HNSW recall@10 (L2) = {recall:.4} below the 0.95 correctness gate — graph bug"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn recall_at_10_meets_correctness_gate_cosine() {
|
||||
let dim = 64;
|
||||
let n = 2000;
|
||||
let clusters = 32;
|
||||
let seed = 0xBBBB;
|
||||
let vectors = planted(dim, n, clusters, seed);
|
||||
let idx = build(&vectors, Metric::Cosine, 0xCCCC);
|
||||
let recall = recall_at_k(&idx, &vectors, dim, clusters, 10, 128, 64, seed);
|
||||
assert!(
|
||||
recall >= 0.95,
|
||||
"HNSW recall@10 (cosine) = {recall:.4} below the 0.95 correctness gate — graph bug"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn higher_ef_does_not_reduce_recall() {
|
||||
// Monotonicity sanity: more beam width should not hurt recall.
|
||||
let dim = 48;
|
||||
let vectors = planted(dim, 1000, 16, 0xD00D);
|
||||
let idx = build(&vectors, Metric::L2, 0xE00E);
|
||||
let lo = recall_at_k(&idx, &vectors, dim, 16, 10, 16, 48, 0xD00D);
|
||||
let hi = recall_at_k(&idx, &vectors, dim, 16, 10, 128, 48, 0xD00D);
|
||||
assert!(hi + 1e-9 >= lo, "recall dropped with larger ef: {lo:.3} -> {hi:.3}");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn zero_dim_no_panic() {
|
||||
// Degenerate zero-dimension index: inserts and searches must not panic.
|
||||
let mut idx = HnswIndex::new(0, Metric::Cosine, HnswParams::default());
|
||||
idx.insert(&[]);
|
||||
idx.insert(&[]);
|
||||
let r = idx.search(&[], 2, 16);
|
||||
assert_eq!(r.len(), 2);
|
||||
}
|
||||
}
|
||||
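The recall gate exercised by the tests above (deterministic fixture, brute-force ground truth, recall@k) can be condensed into a standalone sketch. This is an illustration under stated assumptions, not the crate's code: `fixture`, `l2_sq`, and the free-function forms of `brute_force` and `recall_at_k` are hypothetical, the data is uniform rather than planted-cluster, and squared L2 stands in for `Metric::L2`.

```rust
use std::collections::HashSet;

/// One SplitMix64 step: advance the state, return a mixed 64-bit word.
pub fn split_mix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Uniform `f32` in [0, 1) built from the top 24 bits of the mixed word.
pub fn unif01(state: &mut u64) -> f32 {
    ((split_mix64(state) >> 40) as f32) / ((1u64 << 24) as f32)
}

/// Hypothetical fixture: `n` vectors of `dim` uniform coordinates from `seed`.
pub fn fixture(dim: usize, n: usize, seed: u64) -> Vec<Vec<f32>> {
    let mut s = seed;
    (0..n)
        .map(|_| (0..dim).map(|_| unif01(&mut s)).collect())
        .collect()
}

/// Squared L2 distance over the shared prefix (ragged inputs do not panic).
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Exact top-k by linear scan: the ground truth an ANN index is scored against.
pub fn brute_force(vectors: &[Vec<f32>], query: &[f32], k: usize) -> Vec<(u32, f32)> {
    let mut scored: Vec<(u32, f32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (i as u32, l2_sq(query, v)))
        .collect();
    // Sort by distance, breaking ties by id so the order is fully deterministic.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1).then(a.0.cmp(&b.0)));
    scored.truncate(k);
    scored
}

/// Fraction of the true top-k ids that appear in the first `k` results of `got`.
pub fn recall_at_k(truth: &[(u32, f32)], got: &[(u32, f32)], k: usize) -> f64 {
    if k == 0 {
        return 0.0;
    }
    let truth_ids: HashSet<u32> = truth.iter().map(|&(id, _)| id).collect();
    let hits = got
        .iter()
        .take(k)
        .filter(|(id, _)| truth_ids.contains(id))
        .count();
    hits as f64 / k as f64
}
```

To gate an approximate index, compute `brute_force` once per query and pass the index's result list as `got`; averaging `recall_at_k` over the queries gives the number the 0.95 gate compares against.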
@@ -0,0 +1,466 @@
|
||||
//! A **SymphonyQG-style quantized-traversal HNSW** (ADR-261).
//!
//! # The SymphonyQG bet (what we are testing)
//!
//! [SymphonyQG (SIGMOD 2025)](../../../../../docs/adr/ADR-261-ruvector-graph-ann-index.md)
//! unifies **quantization with graph traversal**: instead of computing the full
//! float distance at every node the beam search visits (the cost that dominates
//! float HNSW: one `O(d)` float dot/diff per visited node), it scores traversal
//! candidates with a **cheap quantized distance** and only computes the exact
//! float distance for the *final* candidate set, which it **reranks**. The bet:
//! the quantized score is cheap enough, and accurate enough to keep the beam on
//! the right path, that you visit roughly as many nodes but pay far less per
//! node, and recover the small recall loss with a final exact rerank. The source
//! reports **3.5–17× QPS over HNSW at equal recall**.
//!
//! # Our implementation (honest scope)
//!
//! We are **not** reproducing SymphonyQG's exact system (their RaBitQ-fused codes,
//! their SIMD layout, their refined graph). We build the **direction** of the
//! claim from the pieces this crate already has, so the comparison is
//! apples-to-apples on *our* hardware:
//!
//! - **Same graph** as the float [`crate::HnswIndex`]: identical structure,
//! identical seed, identical level assignment. The *only* variable between the
//! float and quantized search is **how a candidate is scored during traversal**,
//! so any QPS/recall difference is attributable to the quantization, not to a
//! different graph.
//! - **Quantized score = 1-bit Hamming over the RaBitQ Pass-2 rotated sign code**
//! ([`crate::rotation`] + the sign-quantization in [`crate::sketch`]). Each
//! node stores its `ceil(D/8)`-byte sign code (`D = next_pow2(dim)`). During
//! traversal we compare query-code vs node-code by **POPCNT Hamming**: a few
//! machine words, no per-dimension float work.
//! - **Exact float rerank** of the final beam: the top `rerank` candidates by
//! Hamming are re-scored with the true float metric and the best `k` returned.
//!
//! This trades some recall (the 1-bit code is a coarse angle proxy, the
//! same ~46%-strict limitation ADR-156 §10 measured) for far cheaper per-node
//! scoring, with the float rerank recovering part of the loss. **Whether that
//! nets a QPS win at our test scale is the measured question ADR-261 answers**;
//! at small N the float distance is cheap enough that the Hamming saving may not
//! pay off. We report the real number, win or lose, and do not tune to
//! manufacture a speedup.
//!
//! # Determinism & robustness
//!
//! The graph seed fixes the level assignment and the rotation seed fixes the
//! 1-bit code frame, so for a given pair of seeds the quantized index is as
//! reproducible as the float one. Empty/degenerate inputs are guarded
//! exactly as in [`crate::hnsw`]: no panic on empty index, `k > n`, `k == 0`,
//! single node, ragged query, or zero dim.
|
||||
|
||||
use std::cmp::Ordering;
|
||||
use std::collections::{BinaryHeap, HashSet};
|
||||
|
||||
use crate::hnsw::{HnswIndex, HnswParams, Metric};
|
||||
use crate::rotation::Rotation;
|
||||
|
||||
/// A 1-bit Pass-2 sign code for one vector, over the padded rotation length `D`.
|
||||
/// Stored as packed bytes; compared by POPCNT Hamming.
|
||||
#[derive(Debug, Clone)]
|
||||
struct Code {
|
||||
bits: Vec<u8>,
|
||||
}
|
||||
|
||||
impl Code {
|
||||
/// Hamming distance to another code of the same length (popcount of XOR).
|
||||
#[inline]
|
||||
fn hamming(&self, other: &Code) -> u32 {
|
||||
let n = self.bits.len().min(other.bits.len());
|
||||
let mut acc = 0u32;
|
||||
for i in 0..n {
|
||||
acc += (self.bits[i] ^ other.bits[i]).count_ones();
|
||||
}
|
||||
acc
|
||||
}
|
||||
}
|
||||
|
||||
/// Build the packed 1-bit sign code of a rotated embedding over the padded
|
||||
/// length `D = rotation.padded_dim()`. Bit set ⇒ rotated coord ≥ 0.
|
||||
fn encode(embedding: &[f32], rotation: &Rotation) -> Code {
|
||||
let rotated = rotation.apply_padded(embedding);
|
||||
let d = rotated.len();
|
||||
let mut bits = vec![0u8; d.div_ceil(8)];
|
||||
for (i, &c) in rotated.iter().enumerate() {
|
||||
if c >= 0.0 {
|
||||
bits[i / 8] |= 1 << (7 - (i % 8));
|
||||
}
|
||||
}
|
||||
Code { bits }
|
||||
}
|
||||
|
||||
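/// Scored node for the quantized beam, ordered by Hamming distance then id.
/// Used directly in a `BinaryHeap` it forms a max-heap (worst Hamming at the
/// top); wrap it in [`MinH`] for the closest-first candidate queue.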
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
struct HScored {
|
||||
/// Hamming distance (quantized score) — the traversal key.
|
||||
ham: u32,
|
||||
id: u32,
|
||||
}
|
||||
impl PartialEq for HScored {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.ham == other.ham && self.id == other.id
|
||||
}
|
||||
}
|
||||
impl Eq for HScored {}
|
||||
impl Ord for HScored {
|
||||
fn cmp(&self, other: &Self) -> Ordering {
|
||||
self.ham.cmp(&other.ham).then(self.id.cmp(&other.id))
|
||||
}
|
||||
}
|
||||
impl PartialOrd for HScored {
|
||||
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
|
||||
Some(self.cmp(other))
|
||||
}
|
||||
}
|
||||
/// Reversed wrapper for a min-heap (smallest Hamming at the top).
|
||||
#[derive(Debug, Clone, Copy)]
|
||||
struct MinH(HScored);
|
||||
impl PartialEq for MinH {
|
||||
fn eq(&self, other: &Self) -> bool {
|
||||
self.0 == other.0
|
||||
}
|
||||
}
|
||||
impl Eq for MinH {}
|
||||
impl Ord for MinH {
|
||||
fn cmp(&self, other: &Self) -> Ordering {
|
||||
other.0.cmp(&self.0)
|
||||
}
|
||||
}
|
||||
impl PartialOrd for MinH {
|
||||
fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
|
||||
Some(self.cmp(other))
|
||||
}
|
||||
}
|
||||
|
||||
/// A SymphonyQG-style HNSW: the same graph as [`HnswIndex`], traversed by a
|
||||
/// **cheap 1-bit Hamming score**, with a final **exact-float rerank**.
|
||||
///
|
||||
/// Built by inserting the same vectors in the same order with the same seed as
|
||||
/// a float [`HnswIndex`], so the two indices share identical graph structure and
|
||||
/// only differ in how the beam is scored. The shared [`Rotation`] (seed + dim)
|
||||
/// is the index/query frame for the 1-bit codes.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct QuantizedHnswIndex {
|
||||
/// The underlying graph (built with the float metric for exact rerank).
|
||||
graph: HnswIndex,
|
||||
/// Per-node 1-bit Pass-2 codes, indexed by id (parallel to graph vectors).
|
||||
codes: Vec<Code>,
|
||||
/// The rotation frame shared by index and query codes.
|
||||
rotation: Rotation,
|
||||
/// Number of final candidates to exact-float rerank (≥ k at query time).
|
||||
default_rerank: usize,
|
||||
}
|
||||
|
||||
impl QuantizedHnswIndex {
|
||||
/// Build a quantized index over `vectors`, mirroring a float [`HnswIndex`]
|
||||
/// built with the same `(dim, metric, params)` and insertion order. The
|
||||
/// `rotation_seed` fixes the 1-bit code frame (index and query share it).
|
||||
///
|
||||
/// `default_rerank` is how many top-Hamming candidates get an exact float
|
||||
/// re-score before returning the best `k`; it is clamped to `≥ k` at query
|
||||
/// time. A larger rerank recovers more recall at more float cost — the knob
|
||||
/// that, alongside `ef`, sets the equal-recall operating point.
|
||||
pub fn build(
|
||||
vectors: &[Vec<f32>],
|
||||
dim: usize,
|
||||
metric: Metric,
|
||||
params: HnswParams,
|
||||
rotation_seed: u64,
|
||||
default_rerank: usize,
|
||||
) -> Self {
|
||||
let rotation = Rotation::new(rotation_seed, dim);
|
||||
let mut graph = HnswIndex::new(dim, metric, params);
|
||||
let mut codes = Vec::with_capacity(vectors.len());
|
||||
for v in vectors {
|
||||
graph.insert(v);
|
||||
codes.push(encode(v, &rotation));
|
||||
}
|
||||
Self {
|
||||
graph,
|
||||
codes,
|
||||
rotation,
|
||||
default_rerank: default_rerank.max(1),
|
||||
}
|
||||
}
|
||||
|
||||
/// Number of indexed points.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
self.graph.len()
|
||||
}
|
||||
|
||||
/// True iff empty.
|
||||
#[inline]
|
||||
pub fn is_empty(&self) -> bool {
|
||||
self.graph.is_empty()
|
||||
}
|
||||
|
||||
/// Borrow the underlying float graph (for shared-graph benchmark parity:
|
||||
/// the float-HNSW baseline runs on *this* graph so the only variable is
|
||||
/// scoring).
|
||||
#[inline]
|
||||
pub fn graph(&self) -> &HnswIndex {
|
||||
&self.graph
|
||||
}
|
||||
|
||||
/// The rerank width this index defaults to.
|
||||
#[inline]
|
||||
pub fn default_rerank(&self) -> usize {
|
||||
self.default_rerank
|
||||
}
|
||||
|
||||
/// SymphonyQG-style search: traverse the graph scoring candidates by **1-bit
|
||||
/// Hamming**, collect a beam of `ef`, then **exact-float rerank** the top
|
||||
/// `rerank` (clamped ≥ k) and return the best `k` as `(id, float_dist)`.
|
||||
///
|
||||
/// Degenerate cases mirror [`HnswIndex::search`]: empty ⇒ empty; `k == 0` ⇒
|
||||
/// empty; `k > n` ⇒ all; never panics.
|
||||
pub fn search_quantized(
|
||||
&self,
|
||||
query: &[f32],
|
||||
k: usize,
|
||||
ef: usize,
|
||||
rerank: usize,
|
||||
) -> Vec<(u32, f32)> {
|
||||
if k == 0 || self.is_empty() {
|
||||
return Vec::new();
|
||||
}
|
||||
let ef = ef.max(k).max(1);
|
||||
let rerank = rerank.max(k);
|
||||
let q_code = encode(query, &self.rotation);
|
||||
|
||||
// Entry point: the graph's entry (highest-level node).
|
||||
let entry = match self.graph.entry_point() {
|
||||
Some(e) => e,
|
||||
None => return Vec::new(),
|
||||
};
|
||||
|
||||
// Greedy-descend upper layers by Hamming, then beam-search layer 0.
|
||||
let mut ep = entry;
|
||||
let mut layer = self.graph.top_level();
|
||||
while layer > 0 {
|
||||
ep = self.greedy_hamming(&q_code, ep, layer);
|
||||
layer -= 1;
|
||||
}
|
||||
let beam = self.beam_hamming(&q_code, ep, ef);
|
||||
|
||||
// Exact-float rerank of the top `rerank` Hamming candidates.
|
||||
let mut cand: Vec<HScored> = beam;
|
||||
cand.sort_by_key(|c| c.ham);
|
||||
cand.truncate(rerank);
|
||||
let mut reranked: Vec<(u32, f32)> = cand
|
||||
.iter()
|
||||
.filter_map(|c| {
|
||||
self.graph
|
||||
.vector(c.id)
|
||||
.map(|v| (c.id, self.graph.metric().distance(query, v)))
|
||||
})
|
||||
.collect();
|
||||
reranked.sort_by(|a, b| a.1.partial_cmp(&b.1).unwrap_or(Ordering::Equal));
|
||||
reranked.truncate(k);
|
||||
reranked
|
||||
}
|
||||
|
||||
/// Search using the graph's configured `ef_search` and this index's `default_rerank`.
|
||||
#[inline]
|
||||
pub fn search_default(&self, query: &[f32], k: usize) -> Vec<(u32, f32)> {
|
||||
self.search_quantized(query, k, self.graph.params_ef_search(), self.default_rerank)
|
||||
}
|
||||
|
||||
/// Greedy single-best descent on a layer scored by Hamming.
|
||||
fn greedy_hamming(&self, q_code: &Code, start: u32, layer: usize) -> u32 {
|
||||
let mut best = start;
|
||||
let mut best_h = self.codes[best as usize].hamming(q_code);
|
||||
loop {
|
||||
let mut improved = false;
|
||||
for &nbr in self.graph.neighbours(best, layer) {
|
||||
let h = self.codes[nbr as usize].hamming(q_code);
|
||||
if h < best_h {
|
||||
best_h = h;
|
||||
best = nbr;
|
||||
improved = true;
|
||||
}
|
||||
}
|
||||
if !improved {
|
||||
return best;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Beam search on layer 0 scored by Hamming. Returns up to `ef` best-Hamming
/// nodes in heap order (not sorted by distance). Iterative; bounded by the
/// visited set and the `ef` beam.
|
||||
fn beam_hamming(&self, q_code: &Code, ep: u32, ef: usize) -> Vec<HScored> {
|
||||
let mut visited: HashSet<u32> = HashSet::new();
|
||||
let mut candidates: BinaryHeap<MinH> = BinaryHeap::new();
|
||||
let mut results: BinaryHeap<HScored> = BinaryHeap::new(); // max-heap: worst at top
|
||||
|
||||
let h0 = self.codes[ep as usize].hamming(q_code);
|
||||
let s0 = HScored { ham: h0, id: ep };
|
||||
visited.insert(ep);
|
||||
candidates.push(MinH(s0));
|
||||
results.push(s0);
|
||||
|
||||
while let Some(MinH(cur)) = candidates.pop() {
|
||||
let worst = results.peek().map(|s| s.ham).unwrap_or(u32::MAX);
|
||||
if cur.ham > worst && results.len() >= ef {
|
||||
break;
|
||||
}
|
||||
for &nbr in self.graph.neighbours(cur.id, 0) {
|
||||
if !visited.insert(nbr) {
|
||||
continue;
|
||||
}
|
||||
let h = self.codes[nbr as usize].hamming(q_code);
|
||||
let worst = results.peek().map(|s| s.ham).unwrap_or(u32::MAX);
|
||||
if results.len() < ef || h < worst {
|
||||
let s = HScored { ham: h, id: nbr };
|
||||
candidates.push(MinH(s));
|
||||
results.push(s);
|
||||
while results.len() > ef {
|
||||
results.pop();
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
results.into_vec()
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn split_mix64(state: &mut u64) -> u64 {
|
||||
*state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
|
||||
let mut z = *state;
|
||||
z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
|
||||
z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
|
||||
z ^ (z >> 31)
|
||||
}
|
||||
fn unif01(state: &mut u64) -> f32 {
|
||||
((split_mix64(state) >> 40) as f32) / ((1u64 << 24) as f32)
|
||||
}
|
||||
fn gauss(state: &mut u64) -> f32 {
|
||||
let u1 = unif01(state).max(1e-7);
|
||||
let u2 = unif01(state);
|
||||
(-2.0 * u1.ln()).sqrt() * (std::f32::consts::TAU * u2).cos()
|
||||
}
|
||||
fn planted(dim: usize, n: usize, clusters: usize, seed: u64) -> Vec<Vec<f32>> {
|
||||
let centres: Vec<Vec<f32>> = (0..clusters)
|
||||
.map(|c| {
|
||||
let mut s = seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
(0..dim).map(|_| gauss(&mut s) * 3.0).collect()
|
||||
})
|
||||
.collect();
|
||||
(0..n)
|
||||
.map(|i| {
|
||||
let c = i % clusters;
|
||||
let mut s = seed ^ (i as u64).wrapping_mul(0x9E37);
|
||||
(0..dim).map(|d| centres[c][d] + gauss(&mut s) * 0.35).collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
fn params(seed: u64) -> HnswParams {
|
||||
HnswParams {
|
||||
m: 16,
|
||||
ef_construction: 200,
|
||||
ef_search: 64,
|
||||
seed,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn empty_quantized_search_is_empty_no_panic() {
|
||||
let idx = QuantizedHnswIndex::build(&[], 8, Metric::Cosine, params(1), 0x42, 16);
|
||||
assert!(idx.is_empty());
|
||||
assert!(idx.search_quantized(&[0.0; 8], 5, 16, 16).is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn single_node_quantized_returns_itself() {
|
||||
let v = vec![vec![1.0, 2.0, 3.0, 4.0]];
|
||||
let idx = QuantizedHnswIndex::build(&v, 4, Metric::L2, params(2), 0x7, 8);
|
||||
let r = idx.search_quantized(&v[0], 3, 16, 8);
|
||||
assert_eq!(r.len(), 1);
|
||||
assert_eq!(r[0].0, 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn k_zero_and_k_gt_n_no_panic() {
|
||||
let vectors = planted(16, 40, 4, 0xABCD);
|
||||
let idx = QuantizedHnswIndex::build(&vectors, 16, Metric::L2, params(3), 0x9, 32);
|
||||
assert!(idx.search_quantized(&vectors[0], 0, 16, 16).is_empty());
|
||||
let r = idx.search_quantized(&vectors[0], 1000, 64, 64);
|
||||
assert_eq!(r.len(), 40);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn ragged_query_no_panic() {
|
||||
let vectors = planted(16, 30, 3, 0x55);
|
||||
let idx = QuantizedHnswIndex::build(&vectors, 16, Metric::Cosine, params(4), 0xB, 16);
|
||||
assert!(!idx.search_quantized(&[1.0, 2.0, 3.0], 3, 16, 16).is_empty());
|
||||
let long: Vec<f32> = (0..100).map(|i| i as f32).collect();
|
||||
assert!(!idx.search_quantized(&long, 3, 16, 16).is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn quantized_is_deterministic() {
|
||||
let vectors = planted(32, 300, 8, 0x2468);
|
||||
let a = QuantizedHnswIndex::build(&vectors, 32, Metric::Cosine, params(0xFEED), 0xC0DE, 32);
|
||||
let b = QuantizedHnswIndex::build(&vectors, 32, Metric::Cosine, params(0xFEED), 0xC0DE, 32);
|
||||
let q = &vectors[100];
|
||||
assert_eq!(
|
||||
a.search_quantized(q, 10, 64, 32),
|
||||
b.search_quantized(q, 10, 64, 32),
|
||||
"quantized search must be deterministic"
|
||||
);
|
||||
}
|
||||
|
||||
/// Recall@10 of quantized-HNSW vs brute-force ground truth, averaged over
|
||||
/// queries. With an exact-float rerank, recall should be high (the rerank
|
||||
/// repairs most of the 1-bit traversal's coarseness). This is the quantized
|
||||
/// variant's correctness gate.
|
||||
#[test]
|
||||
fn quantized_recall_at_10_is_high_with_rerank() {
|
||||
let dim = 64;
|
||||
let n = 2000;
|
||||
let clusters = 32;
|
||||
let seed = 0x9999;
|
||||
let vectors = planted(dim, n, clusters, seed);
|
||||
// Generous rerank so the exact float repairs the coarse Hamming beam.
|
||||
let idx = QuantizedHnswIndex::build(&vectors, dim, Metric::L2, params(0xAAAA), 0x5EED, 64);
|
||||
|
||||
let mut total = 0.0f64;
|
||||
let n_queries = 64;
|
||||
for q in 0..n_queries {
|
||||
let c = q % clusters;
|
||||
let mut cs = seed ^ (0xC0FFEE_u64.wrapping_mul(c as u64 + 1));
|
||||
let centre: Vec<f32> = (0..dim).map(|_| gauss(&mut cs) * 3.0).collect();
|
||||
let mut s = seed ^ 0xDEAD_0000 ^ (q as u64).wrapping_mul(0x2545_F491);
|
||||
let qv: Vec<f32> = (0..dim).map(|d| centre[d] + gauss(&mut s) * 0.35).collect();
|
||||
let truth: HashSet<u32> = idx
|
||||
.graph()
|
||||
.brute_force(&qv, 10)
|
||||
.into_iter()
|
||||
.map(|(id, _)| id)
|
||||
.collect();
|
||||
let got = idx.search_quantized(&qv, 10, 128, 64);
|
||||
let hit = got.iter().filter(|(id, _)| truth.contains(id)).count();
|
||||
total += hit as f64 / 10.0;
|
||||
}
|
||||
let recall = total / n_queries as f64;
|
||||
// The 1-bit code is coarse, so we do not demand the float 0.95 gate here;
|
||||
// but with a 64-wide rerank over an ef=128 beam it must be clearly useful
|
||||
// (well above random). ADR-261 reports the exact number; this gate just
|
||||
// catches a broken traversal/rerank.
|
||||
assert!(
|
||||
recall >= 0.80,
|
||||
"quantized recall@10 = {recall:.4} too low — traversal or rerank bug"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn zero_dim_no_panic() {
|
||||
let vectors = vec![vec![], vec![]];
|
||||
let idx = QuantizedHnswIndex::build(&vectors, 0, Metric::Cosine, params(5), 0x1, 4);
|
||||
let r = idx.search_quantized(&[], 2, 16, 4);
|
||||
assert_eq!(r.len(), 2);
|
||||
}
|
||||
}
|
||||
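The scoring pipeline this module describes (rotate with `H·D`, keep one sign bit per coordinate, compare codes by popcount, then rerank with the exact metric) can be sketched end to end without the graph. This is a hedged illustration, not the crate's implementation: every function name below is hypothetical, a linear Hamming scan stands in for the HNSW beam, the sign stream is plain SplitMix64 output and is not claimed to match `Rotation::new` bit for bit, and squared L2 is the only metric.

```rust
/// One SplitMix64 step: advance the state, return a mixed 64-bit word.
pub fn split_mix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Diagonal `D`: one seeded ±1 flip per padded coordinate.
pub fn sign_flips(seed: u64, padded: usize) -> Vec<f32> {
    let mut state = seed;
    (0..padded)
        .map(|_| if split_mix64(&mut state) >> 63 == 1 { 1.0 } else { -1.0 })
        .collect()
}

/// In-place normalized fast Walsh-Hadamard transform (`H`); length must be a power of two.
pub fn fwht(buf: &mut [f32]) {
    let n = buf.len();
    if n < 2 {
        return;
    }
    assert!(n.is_power_of_two(), "FWHT length must be a power of two");
    let mut h = 1;
    while h < n {
        for start in (0..n).step_by(2 * h) {
            for i in start..start + h {
                let (a, b) = (buf[i], buf[i + h]);
                buf[i] = a + b;
                buf[i + h] = a - b;
            }
        }
        h *= 2;
    }
    let scale = 1.0 / (n as f32).sqrt();
    for x in buf.iter_mut() {
        *x *= scale;
    }
}

/// `R·x = H·D·x` over the padded length `signs.len()`; short inputs are zero-extended.
pub fn rotate_padded(x: &[f32], signs: &[f32]) -> Vec<f32> {
    let mut buf: Vec<f32> = signs
        .iter()
        .enumerate()
        .map(|(i, s)| s * x.get(i).copied().unwrap_or(0.0))
        .collect();
    fwht(&mut buf);
    buf
}

/// Pack one sign bit per coordinate, MSB first; bit set means coordinate >= 0.
pub fn sign_code(rotated: &[f32]) -> Vec<u8> {
    let mut bits = vec![0u8; (rotated.len() + 7) / 8];
    for (i, &c) in rotated.iter().enumerate() {
        if c >= 0.0 {
            bits[i / 8] |= 1 << (7 - (i % 8));
        }
    }
    bits
}

/// Hamming distance between two packed codes (popcount of XOR).
pub fn hamming(a: &[u8], b: &[u8]) -> u32 {
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

/// Squared L2 distance over the shared prefix.
pub fn l2_sq(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// Rank every vector by Hamming, keep the best `width` (at least `k`), then
/// re-score those with the exact float distance and return the best `k`.
pub fn hamming_then_rerank(
    vectors: &[Vec<f32>],
    signs: &[f32],
    query: &[f32],
    k: usize,
    width: usize,
) -> Vec<(u32, f32)> {
    let q_code = sign_code(&rotate_padded(query, signs));
    let mut by_ham: Vec<(u32, u32)> = vectors
        .iter()
        .enumerate()
        .map(|(i, v)| (hamming(&sign_code(&rotate_padded(v, signs)), &q_code), i as u32))
        .collect();
    by_ham.sort();
    by_ham.truncate(width.max(k));
    let mut out: Vec<(u32, f32)> = by_ham
        .iter()
        .map(|&(_, id)| (id, l2_sq(query, &vectors[id as usize])))
        .collect();
    out.sort_by(|a, b| a.1.total_cmp(&b.1).then(a.0.cmp(&b.0)));
    out.truncate(k);
    out
}
```

A real index would precompute the node codes at build time, as `QuantizedHnswIndex::build` does, instead of re-encoding every vector per query. The `width` parameter plays the role of `rerank`: when it covers every candidate the result is exact, and shrinking it trades recall for fewer float distance computations.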
@@ -28,13 +28,25 @@
|
||||
|
||||
#[cfg(feature = "crv")]
|
||||
pub mod crv;
|
||||
pub mod ann_measure;
|
||||
pub mod coverage;
|
||||
pub mod estimator;
|
||||
pub mod event_log;
|
||||
pub mod hnsw;
|
||||
pub mod hnsw_quantized;
|
||||
pub mod mat;
|
||||
pub mod rotation;
|
||||
pub mod signal;
|
||||
pub mod sketch;
|
||||
pub mod viewpoint;
|
||||
|
||||
pub use estimator::{
|
||||
DistanceEstimator, EstimatorBank, EstimatorQuery, EstimatorSketch, SideInfo,
|
||||
};
|
||||
pub use event_log::{NoveltyEvent, PrivacyEventLog};
|
||||
pub use hnsw::{HnswIndex, HnswParams, Metric};
|
||||
pub use hnsw_quantized::QuantizedHnswIndex;
|
||||
pub use rotation::Rotation;
|
||||
pub use sketch::{
|
||||
Sketch, SketchBank, SketchError, WireSketch, WireSketchError, WIRE_SKETCH_FORMAT_VERSION,
|
||||
WIRE_SKETCH_MAGIC, WIRE_SKETCH_MAX_BYTES,
|
||||
|
||||
@@ -0,0 +1,373 @@
|
||||
//! RaBitQ **Pass 2**: deterministic randomized orthogonal rotation.
//!
//! Implements the "Pass 2" deferred in [`crate::sketch`]'s Pass-1 doc and in
//! [ADR-156 §8](../../../../../docs/adr/ADR-156-ruvector-fusion-beyond-sota.md)
//! (Multi-bit / Extended RaBitQ). The published *RaBitQ* algorithm
//! (Gao & Long, SIGMOD 2024) wraps the 1-bit sign-quantization of Pass 1 with
//! a **randomized orthogonal rotation** `R` applied to every embedding *before*
//! sign-quantization. The rotation decorrelates coordinates so the per-bit sign
//! carries more independent information, which gives both the paper's
//! theoretical error bound and better top-K recall on anisotropic / correlated
//! embedding distributions (exactly the case ADR-084's "Open questions" flagged
//! for skewed spectrogram embeddings).
//!
//! # Why a Fast Hadamard Transform, not a dense d×d matrix
//!
//! A full dense orthogonal matrix `R ∈ ℝ^{d×d}` is **O(d²) memory and O(d²)
//! time per vector**. ADR-084's wire format already provisions for embeddings
//! up to `u16::MAX = 65,535` dimensions; a dense rotation there is ~4.3 G
//! `f32` values (about 17 GB, or 16 GiB), which is infeasible on the
//! cluster-Pi / edge targets this sketch is built for.
//!
//! Instead we use the **randomized Hadamard transform** (the "HD" construction,
//! a.k.a. a structured Johnson–Lindenstrauss / fast-JL rotation):
//!
//! ```text
//! R · x = H · D · x
//! ```
//!
//! where `D` is a diagonal matrix of random ±1 sign flips and `H` is the
//! (normalized) Walsh–Hadamard matrix applied via the **Fast Hadamard
//! Transform (FHT)**. The FHT is `O(d log d)` time and `O(1)` extra memory
//! (in-place butterfly); `D` is `O(d)` memory (one sign per dimension, stored
//! as `f32` in [`Rotation`]). `H` and `D` are each orthogonal, so `R = H·D` is
//! orthogonal and therefore **norm-preserving**, a hard requirement for a
//! rotation that must not distort relative distances. This is the same
//! fast-orthogonal trick used by Fast-JL, Structured Orthogonal Random
//! Features, and the RaBitQ reference rotation.
//!
//! # Determinism (index-time == query-time)
//!
//! The rotation **must** be identical when the bank is built and when it is
//! queried, or the two sign-quantizations live in different rotated frames and
//! Hamming distance becomes meaningless. We therefore derive the ±1 sign flips
//! deterministically from a stored `u64` seed via a SplitMix64 PRNG, **never**
//! an unseeded / OS RNG. Two [`Rotation`]s built from the same `(seed, dim)`
//! produce bit-identical output for the same input (pinned by
//! `rotation_is_deterministic_for_seed`).
//!
//! # Power-of-two padding
//!
//! The FHT is defined on lengths that are powers of two. For a `d` that is not
//! a power of two we pad the (sign-flipped) input with zeros up to the next
//! power of two `m = next_pow2(d)`, run the length-`m` FHT, and then **read back
//! the first `d` coordinates**. Zero-padding + orthogonal `H` keeps the
//! transform norm-preserving on the padded vector; we sign-quantize the first
//! `d` rotated coordinates so the sketch dimension is unchanged from Pass 1
//! (API-compatible: same `embedding_dim`, same packed-byte length, same
//! `SketchBank` schema).
|
||||
|
||||
/// A deterministic randomized orthogonal rotation (FHT-based) applied to an
|
||||
/// embedding before sign-quantization — RaBitQ Pass 2.
|
||||
///
|
||||
/// Construct once per `(seed, dim)` and reuse for **every** embedding that goes
|
||||
/// into the same [`crate::SketchBank`] (and for every query against it). The
|
||||
/// seed is stored so the rotation is reproducible across processes and runs.
|
||||
///
|
||||
/// # Invariants
|
||||
///
|
||||
/// - `dim` is the source-embedding dimension (the sketch keeps this dimension).
|
||||
/// - `padded` is `next_pow2(dim)` — the FHT working length.
|
||||
/// - `signs` has exactly `padded` entries (`+1.0` / `-1.0`), derived from
|
||||
/// `seed` via SplitMix64. Padding positions get signs too; they only ever
|
||||
/// multiply zeros, so their value is irrelevant to the result but they keep
|
||||
/// the construction uniform.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct Rotation {
|
||||
/// Source-embedding dimension; the rotated sketch keeps this dimension.
|
||||
dim: usize,
|
||||
/// FHT working length = `next_pow2(dim)`.
|
||||
padded: usize,
|
||||
/// Random ±1 sign flips (the diagonal `D`), length `padded`.
|
||||
signs: Vec<f32>,
|
||||
/// The seed the sign flips were derived from (stored for reproducibility).
|
||||
seed: u64,
|
||||
}
|
||||
|
||||
impl Rotation {
|
||||
/// Build a rotation for `dim`-dimensional embeddings from a fixed `seed`.
|
||||
///
|
||||
/// The same `(seed, dim)` always yields a bit-identical rotation, so an
|
||||
/// index built with `Rotation::new(seed, d)` and a query rotated with a
|
||||
/// freshly-constructed `Rotation::new(seed, d)` agree exactly.
|
||||
///
|
||||
/// `dim == 0` yields an identity (empty) rotation — `apply` returns an
|
||||
/// empty vector — which keeps the constructor total (no panic on a
|
||||
/// degenerate dimension).
|
||||
pub fn new(seed: u64, dim: usize) -> Self {
|
||||
let padded = next_pow2(dim);
|
||||
let mut signs = Vec::with_capacity(padded);
|
||||
// SplitMix64: a tiny, well-distributed, fully deterministic PRNG. We
|
||||
// only need a reproducible stream of bits to pick ±1 per dimension;
|
||||
// SplitMix64 is the standard seeding generator and is more than
|
||||
// adequate (and far better-mixed than the LCG used for bench fixtures).
|
||||
let mut state = seed;
|
||||
for _ in 0..padded {
|
||||
state = split_mix64(&mut state);
|
||||
// Use the top bit of the mixed word to choose the sign.
|
||||
signs.push(if state >> 63 == 1 { 1.0 } else { -1.0 });
|
||||
}
|
||||
Self {
|
||||
dim,
|
||||
padded,
|
||||
signs,
|
||||
seed,
|
||||
}
|
||||
}
|
||||
|
||||
/// The seed this rotation was derived from (for serialization / audit).
|
||||
#[inline]
|
||||
pub fn seed(&self) -> u64 {
|
||||
self.seed
|
||||
}
|
||||
|
||||
/// Source-embedding dimension this rotation expects.
|
||||
#[inline]
|
||||
pub fn dim(&self) -> usize {
|
||||
self.dim
|
||||
}
|
||||
|
||||
/// FHT working length (`next_pow2(dim)`).
|
||||
#[inline]
|
||||
pub fn padded_dim(&self) -> usize {
|
||||
self.padded
|
||||
}
|
||||
|
||||
/// Apply the rotation `R = H·D` to `embedding`, returning the first `dim`
|
||||
/// rotated coordinates.
|
||||
///
|
||||
/// If `embedding.len() != dim` the input is treated charitably: it is
|
||||
/// truncated or zero-extended to `dim` before rotation. This mirrors
|
||||
/// Pass 1's saturating tolerance and keeps the call total.
|
||||
///
|
||||
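/// The returned vector has length `self.dim`. For a power-of-two `dim` its
/// L2 norm equals the L2 norm of the (dim-truncated / zero-extended) input
/// up to floating-point rounding (see the `rotation_preserves_norm` test).
/// For a non-power-of-two `dim` the dropped tail coordinates carry some of
/// the energy, so the norm can only shrink; use [`Rotation::apply_padded`]
/// when the full norm is needed.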
|
||||
pub fn apply(&self, embedding: &[f32]) -> Vec<f32> {
|
||||
if self.dim == 0 {
|
||||
return Vec::new();
|
||||
}
|
||||
let mut buf = self.apply_padded(embedding);
|
||||
// Read back the first `dim` rotated coordinates as the sketch input.
|
||||
buf.truncate(self.dim);
|
||||
buf
|
||||
}
|
||||
|
||||
/// Apply the rotation `R = H·D` and return **all `padded_dim` rotated
|
||||
/// coordinates** (not truncated to `dim`).
|
||||
///
|
||||
/// This is the frame the RaBitQ estimator ([`crate::estimator`]) works in:
|
||||
/// the 1-bit code `x̄ ∈ {±1/√D}^D` is unit over the **padded** length `D`,
|
||||
/// and the query dot product `⟨x̄, q'⟩` must be taken over that same `D`. For
|
||||
/// a power-of-two `dim`, `padded_dim == dim` and this equals
|
||||
/// [`Rotation::apply`]; for a non-power-of-two `dim` the tail coordinates
|
||||
/// (the zero-padded energy redistributed by the FHT) are retained here but
|
||||
/// dropped by `apply`.
|
||||
///
|
||||
/// `dim == 0` yields an empty vector. Ragged input is handled charitably
|
||||
/// (truncate / zero-extend to `dim`), as in [`Rotation::apply`].
|
||||
pub fn apply_padded(&self, embedding: &[f32]) -> Vec<f32> {
|
||||
if self.dim == 0 {
|
||||
return Vec::new();
|
||||
}
|
||||
// Build the padded, sign-flipped working buffer: buf = D · x, then 0-pad.
|
||||
let mut buf = vec![0.0f32; self.padded];
|
||||
let n = embedding.len().min(self.dim);
|
||||
for i in 0..n {
|
||||
buf[i] = embedding[i] * self.signs[i];
|
||||
}
|
||||
// (positions n..dim and dim..padded stay zero — zero-extend + pad)
|
||||
|
||||
// In-place normalized Fast Hadamard Transform.
|
||||
fht_normalized(&mut buf);
|
||||
buf
|
||||
}
|
||||
}
|
||||
|
||||
/// Smallest power of two `>= n` (with `next_pow2(0) == 1`, `next_pow2(1) == 1`).
|
||||
///
|
||||
/// Pulled out (and `pub(crate)`) so the sketch layer and tests can reason about
|
||||
/// the FHT working length without duplicating the rule.
|
||||
#[inline]
|
||||
pub(crate) fn next_pow2(n: usize) -> usize {
|
||||
if n <= 1 {
|
||||
return 1;
|
||||
}
|
||||
// `n` here is small relative to usize::MAX in every realistic embedding
|
||||
// (<= 65_535), so `next_power_of_two` cannot overflow.
|
||||
n.next_power_of_two()
|
||||
}
|
||||
|
||||
/// SplitMix64 step: advance `state` and return a well-mixed 64-bit word.
|
||||
///
|
||||
/// Reference algorithm (public domain, by Sebastiano Vigna). Deterministic and
|
||||
/// dependency-free — exactly what we need for a reproducible sign stream.
|
||||
#[inline]
|
||||
fn split_mix64(state: &mut u64) -> u64 {
|
||||
*state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
|
||||
let mut z = *state;
|
||||
z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
|
||||
z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
|
||||
z ^ (z >> 31)
|
||||
}
|
||||
|
||||
/// In-place **normalized** Fast Hadamard Transform on a power-of-two slice.
|
||||
///
|
||||
/// Computes `y = (1/√m) · H_m · x` in place, where `H_m` is the `m × m`
|
||||
/// Walsh–Hadamard matrix and `m = buf.len()` is a power of two. The `1/√m`
|
||||
/// normalization makes `H` orthogonal (`HᵀH = I`), so the transform preserves
|
||||
/// the L2 norm. Runs in `O(m log m)` with `O(1)` extra memory (the standard
|
||||
/// iterative butterfly).
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Debug-asserts that `buf.len()` is a power of two. Callers in this module
|
||||
/// always pass `next_pow2(dim)`, so this never fires in practice; it documents
|
||||
/// the precondition.
|
||||
fn fht_normalized(buf: &mut [f32]) {
|
||||
let m = buf.len();
|
||||
debug_assert!(m.is_power_of_two(), "FHT length must be a power of two");
|
||||
if m <= 1 {
|
||||
return;
|
||||
}
|
||||
// Unnormalized in-place Walsh–Hadamard butterfly.
|
||||
let mut h = 1usize;
|
||||
while h < m {
|
||||
let mut i = 0usize;
|
||||
while i < m {
|
||||
for j in i..i + h {
|
||||
let x = buf[j];
|
||||
let y = buf[j + h];
|
||||
buf[j] = x + y;
|
||||
buf[j + h] = x - y;
|
||||
}
|
||||
i += h * 2;
|
||||
}
|
||||
h *= 2;
|
||||
}
|
||||
// Normalize by 1/√m so H is orthogonal (norm-preserving).
|
||||
let inv_sqrt_m = 1.0f32 / (m as f32).sqrt();
|
||||
for v in buf.iter_mut() {
|
||||
*v *= inv_sqrt_m;
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
fn l2(v: &[f32]) -> f32 {
|
||||
v.iter().map(|&x| x * x).sum::<f32>().sqrt()
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn next_pow2_rounds_up() {
|
||||
assert_eq!(next_pow2(0), 1);
|
||||
assert_eq!(next_pow2(1), 1);
|
||||
assert_eq!(next_pow2(2), 2);
|
||||
assert_eq!(next_pow2(3), 4);
|
||||
assert_eq!(next_pow2(128), 128);
|
||||
assert_eq!(next_pow2(129), 256);
|
||||
assert_eq!(next_pow2(200), 256);
|
||||
assert_eq!(next_pow2(65_535), 65_536);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fht_is_norm_preserving_on_power_of_two() {
|
||||
// Pure FHT (no sign flips) must preserve L2 norm to fp tolerance.
|
||||
let mut v: Vec<f32> = (0..8).map(|i| (i as f32 - 3.5) * 0.7).collect();
|
||||
let before = l2(&v);
|
||||
fht_normalized(&mut v);
|
||||
let after = l2(&v);
|
||||
assert!(
|
||||
(before - after).abs() < 1e-5,
|
||||
"FHT changed norm: {before} -> {after}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fht_self_inverse_normalized() {
|
||||
// Normalized H is symmetric and orthogonal, so H·H·x == x.
|
||||
let original: Vec<f32> = vec![1.0, -2.0, 3.0, 0.5];
|
||||
let mut v = original.clone();
|
||||
fht_normalized(&mut v);
|
||||
fht_normalized(&mut v);
|
||||
for (a, b) in original.iter().zip(v.iter()) {
|
||||
assert!((a - b).abs() < 1e-5, "H·H·x != x: {a} vs {b}");
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotation_is_deterministic_for_seed() {
|
||||
// Two rotations from the same (seed, dim) must produce identical
|
||||
// output for the same input — the index-time == query-time contract.
|
||||
let r1 = Rotation::new(0xDEAD_BEEF_CAFE_1234, 130);
|
||||
let r2 = Rotation::new(0xDEAD_BEEF_CAFE_1234, 130);
|
||||
let x: Vec<f32> = (0..130).map(|i| (i as f32 * 0.31).sin()).collect();
|
||||
let a = r1.apply(&x);
|
||||
let b = r2.apply(&x);
|
||||
assert_eq!(a.len(), 130);
|
||||
assert_eq!(a, b, "same seed must give identical rotation");
|
||||
|
||||
// A different seed must (almost surely) differ.
|
||||
let r3 = Rotation::new(0x0000_0000_0000_0001, 130);
|
||||
let c = r3.apply(&x);
|
||||
assert_ne!(a, c, "different seed must give different rotation");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotation_preserves_norm() {
|
||||
// R = H·D is orthogonal; on a power-of-two dim the first `dim`
|
||||
// coordinates ARE the whole transform, so norm is preserved exactly
|
||||
// (to fp tolerance). We test a power-of-two dim for the exact claim.
|
||||
let r = Rotation::new(42, 128);
|
||||
let x: Vec<f32> = (0..128).map(|i| ((i * 7 % 13) as f32 - 6.0) * 0.5).collect();
|
||||
let y = r.apply(&x);
|
||||
let before = l2(&x);
|
||||
let after = l2(&y);
|
||||
assert!(
|
||||
(before - after).abs() < 1e-3 * before.max(1.0),
|
||||
"rotation changed norm: {before} -> {after}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotation_non_power_of_two_preserves_norm_via_padding() {
|
||||
// For a non-power-of-two dim, reading back the first `dim` coords of a
// padded FHT only preserves norm if the padded tail carries ~no energy,
// which is not guaranteed. This is therefore not an exact-norm claim: we
// only assert that the rotated norm does not EXCEED the input norm (the
// truncated read-back of an orthogonal transform is non-expansive),
// which is enough to confirm the padding is sane.
|
||||
let r = Rotation::new(7, 130); // pads 130 -> 256
|
||||
assert_eq!(r.padded_dim(), 256);
|
||||
let x: Vec<f32> = (0..130).map(|i| (i as f32 * 0.13).cos()).collect();
|
||||
let y = r.apply(&x);
|
||||
assert_eq!(y.len(), 130);
|
||||
let before = l2(&x);
|
||||
let after = l2(&y);
|
||||
// Truncated read-back is non-expansive: ||y|| <= ||Hx|| == ||x||.
|
||||
assert!(
|
||||
after <= before + 1e-4,
|
||||
"truncated rotation expanded norm: {before} -> {after}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotation_dim_zero_is_empty() {
|
||||
let r = Rotation::new(1, 0);
|
||||
assert!(r.apply(&[]).is_empty());
|
||||
assert!(r.apply(&[1.0, 2.0]).is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotation_handles_ragged_input() {
|
||||
// Charitable length handling: short input zero-extends, long truncates.
|
||||
let r = Rotation::new(99, 64);
|
||||
let short = r.apply(&[1.0, 2.0, 3.0]); // zero-extended to 64
|
||||
assert_eq!(short.len(), 64);
|
||||
let long: Vec<f32> = (0..200).map(|i| i as f32).collect();
|
||||
let truncated = r.apply(&long); // truncated to 64
|
||||
assert_eq!(truncated.len(), 64);
|
||||
}
|
||||
}
|
||||
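The rotation described in this file can be sketched in isolation. The block below is a minimal illustration, not the crate's code: `rotate_padded` and the hand-picked `signs` array are hypothetical stand-ins for `Rotation::apply_padded` and the seeded sign stream. It shows that the padded transform keeps the L2 norm while reading back only the first `dim` coordinates of a non-power-of-two input can only shrink it.

```rust
/// Normalized in-place Walsh-Hadamard transform.
/// `buf.len()` must be a power of two.
fn fht_normalized(buf: &mut [f32]) {
    let m = buf.len();
    assert!(m.is_power_of_two(), "FHT length must be a power of two");
    let mut h = 1;
    while h < m {
        // Butterfly over blocks of width 2h.
        for i in (0..m).step_by(h * 2) {
            for j in i..i + h {
                let (x, y) = (buf[j], buf[j + h]);
                buf[j] = x + y;
                buf[j + h] = x - y;
            }
        }
        h *= 2;
    }
    // Scale by 1/sqrt(m) so the transform is orthogonal.
    let scale = 1.0 / (m as f32).sqrt();
    for v in buf.iter_mut() {
        *v *= scale;
    }
}

/// Hypothetical helper: apply R = H * D to `x`, zero-padding to the next
/// power of two and returning ALL padded coordinates.
fn rotate_padded(x: &[f32], signs: &[f32]) -> Vec<f32> {
    assert!(signs.len() >= x.len(), "need one sign per input coordinate");
    let padded = x.len().max(1).next_power_of_two();
    let mut buf = vec![0.0f32; padded];
    for (i, &v) in x.iter().enumerate() {
        buf[i] = v * signs[i]; // D: per-coordinate sign flip
    }
    fht_normalized(&mut buf); // H: normalized Hadamard
    buf
}

fn l2(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum::<f32>().sqrt()
}

fn main() {
    // Assumed fixed signs for illustration; the crate derives them from a seed.
    let signs = [1.0f32, -1.0, -1.0, 1.0];
    let x = [3.0f32, 4.0, 12.0]; // dim = 3, padded to 4
    let full = rotate_padded(&x, &signs);
    println!("input norm     = {}", l2(&x));
    println!("padded norm    = {}", l2(&full));
    println!("truncated norm = {}", l2(&full[..x.len()]));
}
```

The design point is that any distance estimate which relies on norm preservation has to work on the padded coordinates; the truncated read-back is only safe where a non-expansive approximation is acceptable.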
@@ -40,8 +40,8 @@
|
||||
//! All sites take a `&Sketch` instead of an `&[f32]`; the bridge to dense
|
||||
//! embeddings is `Sketch::from_embedding`.
|
||||
|
||||
use crate::rotation::Rotation;
|
||||
use ruvector_core::quantization::{BinaryQuantized, QuantizedVector};
|
||||
use std::cmp::Reverse;
|
||||
use std::collections::BinaryHeap;
|
||||
|
||||
/// Errors raised by the sketch API.
|
||||
@@ -151,6 +151,42 @@ impl Sketch {
|
||||
Ok(Self::from_embedding(embedding, sketch_version))
|
||||
}
|
||||
|
||||
/// Construct a sketch from a dense f32 embedding **with RaBitQ Pass 2
|
||||
/// rotation** ([ADR-156 §8](../../../../../docs/adr/ADR-156-ruvector-fusion-beyond-sota.md)).
|
||||
///
|
||||
/// Applies the deterministic randomized orthogonal rotation `R = H·D`
|
||||
/// (Fast Hadamard Transform + seeded ±1 sign flips, see [`Rotation`]) to
|
||||
/// the embedding *before* sign-quantization. The rotation decorrelates
|
||||
/// coordinates so each sign bit carries more independent information,
|
||||
/// improving top-K recall on anisotropic / correlated embedding
|
||||
/// distributions — the published RaBitQ construction.
|
||||
///
|
||||
/// The resulting sketch has the **same `embedding_dim`, packed-byte
|
||||
/// length, and `sketch_version`** as a Pass-1 sketch of the same input, so
|
||||
/// it is fully interchangeable in [`SketchBank`] and [`WireSketch`]. The
|
||||
/// *only* requirement is that the index and the query use the **same
|
||||
/// [`Rotation`]** (same seed + dim) — otherwise their sign bits live in
|
||||
/// different rotated frames and the hamming distance is meaningless.
|
||||
///
|
||||
/// Pass-1 (`from_embedding`) and Pass-2 sketches must **not** be mixed in
|
||||
/// one bank. Use [`SketchBank::with_rotation`] to make a bank that rotates
|
||||
/// every insert and query consistently.
|
||||
pub fn from_embedding_rotated(
|
||||
embedding: &[f32],
|
||||
sketch_version: u16,
|
||||
rotation: &Rotation,
|
||||
) -> Self {
|
||||
let rotated = rotation.apply(embedding);
|
||||
// Preserve the *source* embedding_dim semantics of Pass 1 (saturating
|
||||
// to u16::MAX) so banks/wire framing are byte-identical to Pass 1.
|
||||
let embedding_dim = embedding.len().min(u16::MAX as usize) as u16;
|
||||
Self {
|
||||
inner: BinaryQuantized::quantize(&rotated),
|
||||
embedding_dim,
|
||||
sketch_version,
|
||||
}
|
||||
}
|
||||
|
||||
/// Hamming distance to another sketch in `[0, embedding_dim]`.
|
||||
///
|
||||
/// Returns `None` if the two sketches have different `embedding_dim` or
|
||||
@@ -417,29 +453,113 @@ pub struct SketchBank {
|
||||
embedding_dim: Option<u16>,
|
||||
/// Locked at first insertion; all subsequent inserts must match.
|
||||
sketch_version: Option<u16>,
|
||||
/// Optional RaBitQ Pass-2 rotation ([ADR-156 §8]). When `Some`, the
|
||||
/// embedding-taking helpers ([`SketchBank::insert_embedding`],
|
||||
/// [`SketchBank::topk_embedding`], [`SketchBank::novelty_embedding`])
|
||||
/// rotate every embedding through this exact rotation before sketching, so
|
||||
/// index-time and query-time sketches always share one rotated frame. The
|
||||
/// raw [`SketchBank::insert`] / [`SketchBank::topk`] paths are unchanged —
|
||||
/// callers using pre-built sketches are responsible for having rotated them
|
||||
/// with the same `Rotation`.
|
||||
rotation: Option<Rotation>,
|
||||
}
|
||||
|
||||
impl SketchBank {
|
||||
/// Create an empty bank. Dimension and version are locked at the first
|
||||
/// `insert` call.
|
||||
/// `insert` call. No Pass-2 rotation (pure Pass-1, default behaviour).
|
||||
pub fn new() -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
embedding_dim: None,
|
||||
sketch_version: None,
|
||||
rotation: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create a bank with a pre-locked `embedding_dim` and `sketch_version`.
|
||||
/// Use when the bank's expected schema is known at construction.
|
||||
/// No Pass-2 rotation (pure Pass-1).
|
||||
pub fn with_schema(embedding_dim: u16, sketch_version: u16) -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
embedding_dim: Some(embedding_dim),
|
||||
sketch_version: Some(sketch_version),
|
||||
rotation: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Create a **RaBitQ Pass-2** bank that rotates every embedding through
|
||||
/// `rotation` before sketching ([ADR-156 §8]).
|
||||
///
|
||||
/// Use the embedding-taking helpers ([`SketchBank::insert_embedding`],
|
||||
/// [`SketchBank::topk_embedding`], [`SketchBank::novelty_embedding`]) with
|
||||
/// this bank so the index and queries share the same rotated frame. The
|
||||
/// `embedding_dim` / `sketch_version` schema is still locked at first
|
||||
/// insert exactly as for a Pass-1 bank — a Pass-2 sketch is byte-identical
|
||||
/// in shape to a Pass-1 sketch, only its bits differ.
|
||||
pub fn with_rotation(rotation: Rotation) -> Self {
|
||||
Self {
|
||||
entries: Vec::new(),
|
||||
embedding_dim: None,
|
||||
sketch_version: None,
|
||||
rotation: Some(rotation),
|
||||
}
|
||||
}
|
||||
|
||||
/// The Pass-2 rotation this bank applies to embeddings, if any.
|
||||
#[inline]
|
||||
pub fn rotation(&self) -> Option<&Rotation> {
|
||||
self.rotation.as_ref()
|
||||
}
|
||||
|
||||
/// Sketch a raw embedding using this bank's rotation policy: Pass-2
|
||||
/// (`from_embedding_rotated`) if the bank has a rotation, else Pass-1
|
||||
/// (`from_embedding`). The single place index-time and query-time sketching
|
||||
/// agree on the rotated frame.
|
||||
fn sketch_embedding(&self, embedding: &[f32], sketch_version: u16) -> Sketch {
|
||||
match &self.rotation {
|
||||
Some(r) => Sketch::from_embedding_rotated(embedding, sketch_version, r),
|
||||
None => Sketch::from_embedding(embedding, sketch_version),
|
||||
}
|
||||
}
|
||||
|
||||
/// Insert a raw embedding, sketching it through the bank's rotation policy.
|
||||
/// Convenience wrapper over [`SketchBank::insert`] that guarantees the
|
||||
/// stored sketch used the same (Pass-1 or Pass-2) frame the queries will.
|
||||
pub fn insert_embedding(
|
||||
&mut self,
|
||||
id: u32,
|
||||
embedding: &[f32],
|
||||
sketch_version: u16,
|
||||
) -> Result<(), SketchError> {
|
||||
let sketch = self.sketch_embedding(embedding, sketch_version);
|
||||
self.insert(id, sketch)
|
||||
}
|
||||
|
||||
/// Top-K over a raw query embedding, sketched through the bank's rotation
/// policy. Equivalent to
/// `bank.topk(&bank.sketch_embedding(query, sketch_version), k)` but cannot
/// get the rotation frame wrong.
|
||||
pub fn topk_embedding(
|
||||
&self,
|
||||
query: &[f32],
|
||||
sketch_version: u16,
|
||||
k: usize,
|
||||
) -> Result<Vec<(u32, u32)>, SketchError> {
|
||||
let q = self.sketch_embedding(query, sketch_version);
|
||||
self.topk(&q, k)
|
||||
}
|
||||
|
||||
/// Novelty of a raw query embedding, sketched through the bank's rotation
|
||||
/// policy. See [`SketchBank::novelty`].
|
||||
pub fn novelty_embedding(
|
||||
&self,
|
||||
query: &[f32],
|
||||
sketch_version: u16,
|
||||
) -> Result<f32, SketchError> {
|
||||
let q = self.sketch_embedding(query, sketch_version);
|
||||
self.novelty(&q)
|
||||
}
|
||||
|
||||
/// Number of sketches in the bank.
|
||||
#[inline]
|
||||
pub fn len(&self) -> usize {
|
||||
@@ -523,12 +643,22 @@ impl SketchBank {
|
||||
});
|
||||
}
|
||||
}
|
||||
// Pass-1.5 optimisation: O(n log k) partial sort via a fixed-size
|
||||
// max-heap of `Reverse((distance, id))`. The heap's `peek()`
|
||||
// returns the *largest* of the current best-k. Each candidate is
|
||||
// compared against the heap top in O(1); only better candidates
|
||||
// trigger an O(log k) push/pop. Avoids touching the long tail of
|
||||
// large-distance entries that the truncate would have discarded.
|
||||
// Partial top-K via a fixed-size **max-heap** of `(distance, id)`.
|
||||
// `BinaryHeap` is a max-heap, so `peek()` is the *largest* distance
|
||||
// currently held — the worst of the running best-k. Each candidate is
|
||||
// O(1)-compared against that worst; only a *smaller* distance triggers
|
||||
// an O(log k) pop+push, evicting the current worst. The heap therefore
|
||||
// retains the k *smallest* distances. Total O(n log k), touching the
|
||||
// long tail only with a single comparison each.
|
||||
//
|
||||
// BUG FIX (ADR-156 §8 Pass-2 work): this loop previously used
// `BinaryHeap<Reverse<(d, id)>>` and called the peek "the largest".
// `Reverse` turns the max-heap into a **min-heap**, so `peek()` was the
// *smallest* distance; evicting on `d < worst` then popped the *best*
// held candidate rather than the worst, so far entries that filled the
// heap early survived and were returned as "nearest". The pre-existing
// unit tests only exercised the `n <= k` fast path (≤ 3 entries), so the
// inversion went unnoticed until the Pass-2 coverage harness measured
// near-random top-K on n > k. Pinned by `topk_heap_path_returns_nearest`.
|
||||
//
|
||||
// Fast path: when n ≤ k there is nothing to discard, so a plain
|
||||
// collect + sort is faster than building a heap.
|
||||
@@ -543,28 +673,25 @@ impl SketchBank {
|
||||
return Ok(scored);
|
||||
}
|
||||
|
||||
let mut heap: BinaryHeap<Reverse<(u32, u32)>> = BinaryHeap::with_capacity(k + 1);
|
||||
let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(k + 1);
|
||||
for (id, sk) in &self.entries {
|
||||
let d = sk.distance_unchecked(query);
|
||||
if heap.len() < k {
|
||||
heap.push(Reverse((d, *id)));
|
||||
} else if let Some(&Reverse((worst, _))) = heap.peek() {
|
||||
// L1 hardening (PR #435 review): structural `if let` rather
|
||||
// than `.expect("heap len == k > 0")`. The branch is
|
||||
// mathematically unreachable when `heap.len() >= k > 0`,
|
||||
// but a defensive pattern makes the impossibility a type
|
||||
// property rather than a runtime invariant. Same hot-path
|
||||
// cost (one bounds check); zero panic risk.
|
||||
heap.push((d, *id));
|
||||
} else if let Some(&(worst, _)) = heap.peek() {
|
||||
// `peek()` is the largest distance in the best-k (max-heap).
|
||||
// The `if let` is defensive: when `heap.len() == k > 0` the
|
||||
// heap is non-empty, so this never takes the `else`. Same
|
||||
// hot-path cost (one bounds check), zero panic risk.
|
||||
if d < worst {
|
||||
heap.pop();
|
||||
heap.push(Reverse((d, *id)));
|
||||
heap.push((d, *id));
|
||||
}
|
||||
}
|
||||
}
|
||||
// Drain heap into a Vec — already in (Reverse) descending order;
|
||||
// sort to expose ascending-by-distance per the public contract.
|
||||
let mut scored: Vec<(u32, u32)> =
|
||||
heap.into_iter().map(|Reverse((d, id))| (id, d)).collect();
|
||||
// Drain the max-heap and sort ascending-by-distance per the public
|
||||
// contract (heap drain order is unspecified beyond the root).
|
||||
let mut scored: Vec<(u32, u32)> = heap.into_iter().map(|(d, id)| (id, d)).collect();
|
||||
scored.sort_by_key(|&(_, d)| d);
|
||||
Ok(scored)
|
||||
}
|
||||
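The heap fix in this hunk can be reproduced outside the crate. The sketch below is illustrative only: `topk_nearest` and `topk_inverted` are hypothetical free functions over pre-computed `(id, distance)` pairs, not the `SketchBank` API. It contrasts the corrected max-heap selection with the old `Reverse` variant on a farthest-first input.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Corrected selection: a max-heap keyed on (distance, id), so `peek()` is
/// the worst candidate currently kept and is the one evicted.
fn topk_nearest(scored: &[(u32, u32)], k: usize) -> Vec<(u32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    let mut heap: BinaryHeap<(u32, u32)> = BinaryHeap::with_capacity(k + 1);
    for &(id, d) in scored {
        if heap.len() < k {
            heap.push((d, id));
        } else if let Some(&(worst, _)) = heap.peek() {
            if d < worst {
                heap.pop();
                heap.push((d, id));
            }
        }
    }
    let mut out: Vec<(u32, u32)> = heap.into_iter().map(|(d, id)| (id, d)).collect();
    out.sort_by_key(|&(_, d)| d);
    out
}

/// Inverted selection (the old behaviour): `Reverse` makes this a min-heap,
/// so `peek()`/`pop()` act on the BEST kept candidate instead of the worst.
fn topk_inverted(scored: &[(u32, u32)], k: usize) -> Vec<(u32, u32)> {
    if k == 0 {
        return Vec::new();
    }
    let mut heap: BinaryHeap<Reverse<(u32, u32)>> = BinaryHeap::with_capacity(k + 1);
    for &(id, d) in scored {
        if heap.len() < k {
            heap.push(Reverse((d, id)));
        } else if let Some(&Reverse((smallest, _))) = heap.peek() {
            if d < smallest {
                heap.pop();
                heap.push(Reverse((d, id)));
            }
        }
    }
    let mut out: Vec<(u32, u32)> =
        heap.into_iter().map(|Reverse((d, id))| (id, d)).collect();
    out.sort_by_key(|&(_, d)| d);
    out
}

fn main() {
    // Farthest-first arrival order: id j sits at distance j.
    let scored: Vec<(u32, u32)> = (0..=8u32).rev().map(|j| (j, j)).collect();
    println!("max-heap : {:?}", topk_nearest(&scored, 4));
    println!("inverted : {:?}", topk_inverted(&scored, 4));
}
```

On this input the corrected version keeps ids 0 to 3, while the inverted one keeps three of the far entries that filled the heap first plus the single global minimum.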
@@ -653,6 +780,45 @@ mod tests {
|
||||
assert!(topk[1].1 <= topk[2].1);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn topk_heap_path_returns_nearest() {
|
||||
// Regression for the heap-inversion bug found during ADR-156 §8 Pass-2
// work: with n > k the topk used a min-heap (`Reverse`) but treated its
// peek as the max, so it evicted the best held sketch instead of the
// worst and returned far sketches as "nearest". Build a bank where the
// answer is unambiguous and assert the genuine nearest come back. The
// OLD code returns ids {0, 6, 7, 8} here and fails.
|
||||
let dim = 64;
|
||||
let k = 4;
|
||||
// Query is all-positive (every bit 1).
|
||||
let query = Sketch::from_embedding(&vec![1.0f32; dim], 1);
|
||||
let mut bank = SketchBank::new();
|
||||
// id j has its first `j` dims flipped negative → hamming j to the
|
||||
// all-positive query. So nearest-4 are ids 0,1,2,3 (hamming 0,1,2,3);
|
||||
// farthest are 5..8. n = 9 > k = 4 → exercises the heap path.
|
||||
//
|
||||
// CRITICAL ordering: insert FARTHEST-FIRST (id 8 down to 0). This fills
// the heap's first k slots with far entries, so the nearest entries
// arrive only after the heap is full and MUST trigger eviction of the
// current worst. The old `Reverse` (min-heap-as-max) bug peeked the
// smallest held distance and evicted that entry instead, so each nearer
// arrival displaced the previous best; the old code ends up with ids
// {0, 6, 7, 8} and this assertion fails. Inserting nearest-first would
// mask the bug (the heap fills with the right answer and nothing is
// ever evicted), so the order here is load-bearing.
|
||||
for j in (0..=8u32).rev() {
|
||||
let mut v = vec![1.0f32; dim];
|
||||
for d in v.iter_mut().take(j as usize) {
|
||||
*d = -1.0;
|
||||
}
|
||||
bank.insert(j, Sketch::from_embedding(&v, 1)).unwrap();
|
||||
}
|
||||
let top = bank.topk(&query, k).unwrap();
|
||||
assert_eq!(top.len(), k);
|
||||
let ids: Vec<u32> = top.iter().map(|&(id, _)| id).collect();
|
||||
let dists: Vec<u32> = top.iter().map(|&(_, d)| d).collect();
|
||||
assert_eq!(ids, vec![0, 1, 2, 3], "topk must return the NEAREST k, got {ids:?}");
|
||||
assert_eq!(dists, vec![0, 1, 2, 3], "distances must be the smallest k");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn bank_topk_zero_returns_empty() {
|
||||
let mut bank = SketchBank::new();
|
||||
@@ -852,4 +1018,122 @@ mod tests {
|
||||
SketchError::SketchVersionMismatch { .. }
|
||||
));
|
||||
}
|
||||
|
||||
// ─── ADR-156 §8 / ADR-084 Pass 2 — randomized rotation ───────────────────
|
||||
|
||||
#[test]
|
||||
fn rotated_sketch_has_same_shape_as_pass1() {
|
||||
// A Pass-2 sketch must be byte-shape-identical to a Pass-1 sketch of
|
||||
// the same input: same embedding_dim, same packed-byte length, same
|
||||
// sketch_version. Only the bits differ. This is what lets Pass-2
|
||||
// sketches travel through the unchanged WireSketch / SketchBank schema.
|
||||
let v: Vec<f32> = (0..128).map(|i| (i as f32 * 0.21).sin()).collect();
|
||||
let rot = Rotation::new(0xA5A5_A5A5, 128);
|
||||
let p1 = Sketch::from_embedding(&v, 3);
|
||||
let p2 = Sketch::from_embedding_rotated(&v, 3, &rot);
|
||||
assert_eq!(p1.embedding_dim(), p2.embedding_dim());
|
||||
assert_eq!(p1.sketch_version(), p2.sketch_version());
|
||||
assert_eq!(p1.packed_bytes().len(), p2.packed_bytes().len());
|
||||
// The rotation actually changed the bits (else it would be a no-op on
|
||||
// this correlated input).
|
||||
assert_ne!(
|
||||
p1.packed_bytes(),
|
||||
p2.packed_bytes(),
|
||||
"rotation should change the sign bits on correlated input"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotated_sketch_is_deterministic_for_seed() {
|
||||
// Same (seed, dim) rotation → identical sketch bits across constructions
|
||||
// (the index-time == query-time contract, at the sketch layer).
|
||||
let v: Vec<f32> = (0..96).map(|i| ((i * 5 % 11) as f32 - 5.0) * 0.3).collect();
|
||||
let s1 = Sketch::from_embedding_rotated(&v, 1, &Rotation::new(7, 96));
|
||||
let s2 = Sketch::from_embedding_rotated(&v, 1, &Rotation::new(7, 96));
|
||||
assert_eq!(s1.distance_unchecked(&s2), 0, "same seed must agree exactly");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn rotated_bank_self_match_is_zero_distance() {
|
||||
// A rotated bank queried with the same embedding it stored must return
|
||||
// that id at distance 0 — proves the bank rotates index and query in
|
||||
// the same frame.
|
||||
let rot = Rotation::new(0xBEEF, 64);
|
||||
let mut bank = SketchBank::with_rotation(rot);
|
||||
let v: Vec<f32> = (0..64).map(|i| (i as f32 * 0.37).cos()).collect();
|
||||
bank.insert_embedding(42, &v, 1).unwrap();
|
||||
let top = bank.topk_embedding(&v, 1, 1).unwrap();
|
||||
assert_eq!(top.len(), 1);
|
||||
assert_eq!(top[0].0, 42);
|
||||
assert_eq!(top[0].1, 0, "self-query in a rotated bank must be distance 0");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn pass2_coverage_not_worse_than_pass1() {
|
||||
// The core regression: on a small fixed anisotropic fixture, Pass-2
|
||||
// (rotation) coverage must be >= Pass-1 coverage. Rotation must not
|
||||
// *hurt* recall. (We do not assert a hard >= 90% here — that is the
|
||||
// measurement reported in the ADR, not a unit-test invariant — but we
|
||||
// do pin that rotation is not a regression.)
|
||||
use crate::coverage::{measure_pass1, measure_pass2, CoverageParams};
|
||||
let p = CoverageParams {
|
||||
n: 512,
|
||||
n_queries: 32,
|
||||
n_clusters: 32,
|
||||
..CoverageParams::aether_default(0x00C0_FFEE)
|
||||
};
|
||||
let c1 = measure_pass1(p).coverage;
|
||||
let c2 = measure_pass2(p, 0x1234_5678_9ABC_DEF0).coverage;
|
||||
assert!(
|
||||
c2 + 1e-9 >= c1,
|
||||
"Pass-2 coverage {c2:.4} regressed below Pass-1 {c1:.4}"
|
||||
);
|
||||
}
|
||||
|
||||
/// Deterministic, test-runnable coverage measurement that PRINTS the
|
||||
/// numbers quoted in ADR-084 / ADR-156 §8. Run with `--nocapture` to see:
|
||||
/// cargo test -p wifi-densepose-ruvector --no-default-features \
|
||||
/// pass2_coverage_report -- --nocapture
|
||||
#[test]
|
||||
fn pass2_coverage_report() {
|
||||
use crate::coverage::{measure_pass1, measure_pass2, CoverageParams};
|
||||
let base = CoverageParams::aether_default(0xAD00_0084);
|
||||
let rot_seed = 0x5EED_C0DE_1234_5678u64;
|
||||
println!(
|
||||
"\n=== ADR-156 §8 RaBitQ Pass-2 coverage report (anisotropic synthetic) ==="
|
||||
);
|
||||
println!(
|
||||
"dim={} N={} K={} queries={} master_seed=0x{:X} rotation_seed=0x{:X}",
|
||||
base.dim, base.n, base.k, base.n_queries, base.seed, rot_seed
|
||||
);
|
||||
// Strict bar: candidate_k == K.
|
||||
let p1 = measure_pass1(base).coverage;
|
||||
let p2 = measure_pass2(base, rot_seed).coverage;
|
||||
println!(
|
||||
"candidate_k=K={:<2} Pass1={:6.2}% Pass2={:6.2}% bar=90% {}",
|
||||
base.k,
|
||||
p1 * 100.0,
|
||||
p2 * 100.0,
|
||||
if p2 >= 0.90 { "PASS" } else { "BELOW-BAR" }
|
||||
);
|
||||
// Over-fetch curve (models fetch C >= K candidates, refine to K).
|
||||
for &c in &[16usize, 24, 32, 64] {
|
||||
let pc = CoverageParams {
|
||||
candidate_k: c,
|
||||
..base
|
||||
};
|
||||
let cp1 = measure_pass1(pc).coverage;
|
||||
let cp2 = measure_pass2(pc, rot_seed).coverage;
|
||||
println!(
|
||||
"candidate_k={:<3} Pass1={:6.2}% Pass2={:6.2}%",
|
||||
c,
|
||||
cp1 * 100.0,
|
||||
cp2 * 100.0
|
||||
);
|
||||
}
|
||||
println!("========================================================================\n");
|
||||
// Always-true sanity so the test asserts something.
|
||||
assert!((0.0..=1.0).contains(&p1));
|
||||
assert!((0.0..=1.0).contains(&p2));
|
||||
}
|
||||
}
|
||||
|
||||
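The sketch tests above all reduce to comparing sign bits by Hamming distance. The following is a minimal stand-alone sketch of that idea under stated assumptions: `sign_bits` and `hamming` are hypothetical helpers, and the "strictly positive maps to 1" threshold is an assumption that may differ from what `BinaryQuantized::quantize` actually does.

```rust
/// Pack one sign bit per coordinate into 64-bit words.
/// Assumption: a coordinate maps to bit 1 when it is strictly positive.
fn sign_bits(v: &[f32]) -> Vec<u64> {
    let mut words = vec![0u64; (v.len() + 63) / 64];
    for (i, &x) in v.iter().enumerate() {
        if x > 0.0 {
            words[i / 64] |= 1u64 << (i % 64);
        }
    }
    words
}

/// Hamming distance between two equally sized packed codes:
/// XOR the words and count the differing bits.
fn hamming(a: &[u64], b: &[u64]) -> u32 {
    assert_eq!(a.len(), b.len(), "codes must have the same packed length");
    a.iter().zip(b).map(|(x, y)| (x ^ y).count_ones()).sum()
}

fn main() {
    let dim = 64;
    let query = sign_bits(&vec![1.0f32; dim]);
    for j in [0usize, 3, 8] {
        // Flip the first `j` coordinates negative, as the regression test does.
        let mut v = vec![1.0f32; dim];
        for d in v.iter_mut().take(j) {
            *d = -1.0;
        }
        println!("flipped {j} dims -> hamming {}", hamming(&query, &sign_bits(&v)));
    }
}
```

Because the distance only counts disagreeing bits, two codes are comparable only when both were quantized in the same frame, which is why a rotated bank must rotate inserts and queries with the same `Rotation`.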
@@ -254,6 +254,98 @@ mod tests {
|
||||
);
|
||||
}
|
||||
|
||||
/// REGRESSION (ADR-080 #3, CWE-598 — token in URL query string).
|
||||
///
|
||||
/// ADR-080 flagged "JWT in URL" as a HIGH finding (tokens in query strings
|
||||
/// leak into logs, proxies, browser history, `Referer`). The current
|
||||
/// sensing-server only ever reads the token from the `Authorization: Bearer`
|
||||
/// header — there is no `?token=` / `?access_token=` query path in
|
||||
/// `require_bearer` (see [`require_bearer`] above, which only inspects the
|
||||
/// `AUTHORIZATION` header). This test pins that: a request carrying the
|
||||
/// correct token *only* in the query string is still `401`, while the same
|
||||
/// token in the header is `200`. If anyone ever re-introduces a query-string
|
||||
/// token path, this fails.
|
||||
#[tokio::test]
|
||||
async fn query_string_token_is_never_accepted() {
|
||||
let r = wrap(AuthState::from_token("s3cr3t"));
|
||||
// Correct token, but supplied only in the URL — must NOT authenticate.
|
||||
assert_eq!(
|
||||
status(r.clone(), "GET", "/api/v1/info?token=s3cr3t", None).await,
|
||||
StatusCode::UNAUTHORIZED,
|
||||
"?token= in the query string must not authenticate (CWE-598)"
|
||||
);
|
||||
assert_eq!(
|
||||
status(
|
||||
r.clone(),
|
||||
"GET",
|
||||
"/api/v1/info?access_token=s3cr3t",
|
||||
None
|
||||
)
|
||||
.await,
|
||||
StatusCode::UNAUTHORIZED,
|
||||
"?access_token= in the query string must not authenticate (CWE-598)"
|
||||
);
|
||||
// A query token must not "help" a request that also lacks the header,
|
||||
// even combined with an unrelated param.
|
||||
assert_eq!(
|
||||
status(
|
||||
r.clone(),
|
||||
"GET",
|
||||
"/api/v1/info?foo=bar&token=s3cr3t",
|
||||
None
|
||||
)
|
||||
.await,
|
||||
StatusCode::UNAUTHORIZED
|
||||
);
|
||||
// The header path is the only accepted channel — same token, header,
|
||||
// succeeds. (Proves we didn't just break auth entirely.)
|
||||
assert_eq!(
|
||||
status(r, "GET", "/api/v1/info?token=s3cr3t", Some("s3cr3t")).await,
|
||||
StatusCode::OK,
|
||||
"the Authorization: Bearer header is the supported channel"
|
||||
);
|
||||
}
|
||||
|
||||
/// REGRESSION (ADR-080 #1 — X-Forwarded-For spoofing).
|
||||
///
|
||||
/// The bearer middleware authenticates on the token alone and must be
|
||||
/// completely insensitive to a client-supplied `X-Forwarded-For` header:
|
||||
/// an attacker cannot flip an auth decision by spoofing XFF. A wrong token
|
||||
/// stays `401` and a right token stays `200` regardless of XFF. (The
|
||||
/// sensing-server has no IP-based rate-limit / allowlist that XFF could
|
||||
/// bypass; this locks in that auth itself never consults XFF.)
|
||||
#[tokio::test]
|
||||
async fn xff_header_never_affects_auth_decision() {
|
||||
let r = wrap(AuthState::from_token("s3cr3t"));
|
||||
async fn with_xff(router: Router, token: Option<&str>, xff: &str) -> StatusCode {
|
||||
let mut req = Request::builder()
|
||||
.method("GET")
|
||||
.uri("/api/v1/info")
|
||||
.header("X-Forwarded-For", xff)
|
||||
.body(Body::empty())
|
||||
.unwrap();
|
||||
if let Some(t) = token {
|
||||
req.headers_mut()
|
||||
.insert(AUTHORIZATION, format!("Bearer {t}").parse().unwrap());
|
||||
}
|
||||
router.oneshot(req).await.unwrap().status()
|
||||
}
|
||||
// Spoofed XFF + missing or wrong token ⇒ still rejected.
|
||||
assert_eq!(
|
||||
with_xff(r.clone(), None, "127.0.0.1").await,
|
||||
StatusCode::UNAUTHORIZED
|
||||
);
|
||||
assert_eq!(
|
||||
with_xff(r.clone(), Some("nope"), "10.0.0.1, 127.0.0.1").await,
|
||||
StatusCode::UNAUTHORIZED
|
||||
);
|
||||
// Spoofed XFF + correct token ⇒ still accepted (XFF is irrelevant).
|
||||
assert_eq!(
|
||||
with_xff(r, Some("s3cr3t"), "evil-proxy").await,
|
||||
StatusCode::OK
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn enabled_never_gates_paths_outside_api_v1() {
|
||||
let r = wrap(AuthState::from_token("s3cr3t"));
|
||||
|
||||
@@ -0,0 +1,251 @@
|
||||
//! Generic, leak-free error responses for the sensing-server HTTP API.
|
||||
//!
|
||||
//! ## ADR-080 finding #2 — leaked internal errors in responses
|
||||
//!
|
||||
//! Several handlers historically serialized the *internal* error `Display`
|
||||
//! (`format!("{e}")`, `err.to_string()`, a panicked `JoinError`) straight into
|
||||
//! the JSON response body. That leaks server internals to any client: OS error
|
||||
//! strings can carry filesystem paths, a `JoinError` carries the panic message
|
||||
//! (`task … panicked`), and an upstream-fetch error can carry an internal URL.
|
||||
//! ADR-080 flagged this HIGH (CWE-209: Generation of Error Message Containing
|
||||
//! Sensitive Information). The HOMECORE/M7 sweep (ADR-161) covered
|
||||
//! `homecore-server`, **not** this crate, so the finding stayed open.
|
||||
//!
|
||||
//! ## Contract
|
||||
//!
|
||||
//! [`internal_error`] logs the full detail **server-side only** (at `error`
|
||||
//! level, tagged with a correlation id) and returns a *generic* body to the
|
||||
//! client:
|
||||
//!
|
||||
//! ```json
|
||||
//! { "error": "internal_error", "correlation_id": "a1b2c3d4e5f60718", "success": false }
|
||||
//! ```
|
||||
//!
|
||||
//! The correlation id lets an operator grep the server log for the matching
|
||||
//! detail line without ever shipping that detail to the client. The body
|
||||
//! deliberately contains no `Display`/`Debug` of the underlying error, no file
|
||||
//! paths, and never the word `panicked`.
|
||||
//!
|
||||
//! Handlers that previously returned `Json<serde_json::Value>` keep doing so via
|
||||
//! [`internal_error_json`]; handlers that return `(StatusCode, Json<…>)` use
|
||||
//! [`internal_error`]. A "service unavailable" flavor ([`upstream_unavailable`])
|
||||
//! exists for the 503 upstream-fetch path so it, too, stops leaking the raw
|
||||
//! upstream error.
|
||||
|
||||
use std::fmt::Display;
|
||||
use std::sync::atomic::{AtomicU64, Ordering};
|
||||
|
||||
use axum::{http::StatusCode, response::Json};
|
||||
use serde_json::json;
|
||||
|
||||
/// Monotonic component of the correlation id, so two errors in the same
|
||||
/// nanosecond still get distinct ids. Wraps harmlessly.
|
||||
static CORRELATION_COUNTER: AtomicU64 = AtomicU64::new(0);
|
||||
|
||||
/// Generate a short, opaque correlation id (16 lowercase hex chars). Built from
|
||||
/// a nanosecond timestamp XORed with a monotonic counter — unique enough to tie
|
||||
/// a client-visible id back to a single server-side log line without pulling in
|
||||
/// a UUID dependency. It is **not** a security token; it is only an opaque
|
||||
/// log-join key, so a non-cryptographic source is fine.
|
||||
pub fn correlation_id() -> String {
|
||||
let nanos = std::time::SystemTime::now()
|
||||
.duration_since(std::time::UNIX_EPOCH)
|
||||
.map(|d| d.as_nanos() as u64)
|
||||
.unwrap_or(0);
|
||||
let seq = CORRELATION_COUNTER.fetch_add(1, Ordering::Relaxed);
|
||||
// Mix the counter into the high bits so concurrent calls in the same
|
||||
// nanosecond don't collide.
|
||||
let mixed = nanos ^ seq.rotate_left(40);
|
||||
format!("{mixed:016x}")
|
||||
}
|
||||
|
||||
/// Build a generic internal-error response **and log the real detail
|
||||
/// server-side**. The client sees only `{"error":"internal_error",
|
||||
/// "correlation_id":…,"success":false}` with a `500` status; the detail is
|
||||
/// written to the `error`-level log tagged with the same correlation id.
|
||||
///
|
||||
/// `context` is a short, *static* description of where the error happened
|
||||
/// (e.g. `"model delete"`); it is safe to log but is **not** sent to the
|
||||
/// client.
|
||||
pub fn internal_error(context: &str, detail: impl Display) -> (StatusCode, Json<serde_json::Value>) {
|
||||
let cid = correlation_id();
|
||||
// Server-side only — this is where the real detail lives.
|
||||
tracing::error!(
|
||||
correlation_id = %cid,
|
||||
context = context,
|
||||
detail = %detail,
|
||||
"internal error (detail logged server-side only; client received a generic body)"
|
||||
);
|
||||
(
|
||||
StatusCode::INTERNAL_SERVER_ERROR,
|
||||
Json(json!({
|
||||
"error": "internal_error",
|
||||
"correlation_id": cid,
|
||||
"success": false,
|
||||
})),
|
||||
)
|
||||
}
|
||||
|
||||
/// Same as [`internal_error`] but returns a bare `Json` body (HTTP `200` at the
|
||||
/// transport layer) for the legacy handlers that are typed
|
||||
/// `-> Json<serde_json::Value>` and signal failure via `"success": false`
|
||||
/// rather than an HTTP status code. The detail is still logged server-side and
|
||||
/// never reaches the client.
|
||||
pub fn internal_error_json(context: &str, detail: impl Display) -> Json<serde_json::Value> {
|
||||
let cid = correlation_id();
|
||||
tracing::error!(
|
||||
correlation_id = %cid,
|
||||
context = context,
|
||||
detail = %detail,
|
||||
"internal error (detail logged server-side only; client received a generic body)"
|
||||
);
|
||||
Json(json!({
|
||||
"error": "internal_error",
|
||||
"correlation_id": cid,
|
||||
"success": false,
|
||||
}))
|
||||
}
|
||||
|
||||
/// Generic `503 Service Unavailable` for an upstream dependency that failed,
|
||||
/// without leaking the raw upstream error (which can carry an internal URL or
|
||||
/// connection detail). Detail is logged server-side with a correlation id.
|
||||
pub fn upstream_unavailable(
|
||||
context: &str,
|
||||
detail: impl Display,
|
||||
) -> (StatusCode, Json<serde_json::Value>) {
|
||||
let cid = correlation_id();
|
||||
tracing::warn!(
|
||||
correlation_id = %cid,
|
||||
context = context,
|
||||
detail = %detail,
|
||||
"upstream unavailable (detail logged server-side only; client received a generic body)"
|
||||
);
|
||||
(
|
||||
StatusCode::SERVICE_UNAVAILABLE,
|
||||
Json(json!({
|
||||
"error": "upstream_unavailable",
|
||||
"correlation_id": cid,
|
||||
})),
|
||||
)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// A "detail" string carrying the kind of internal information the old
|
||||
/// `format!("{e}")` path would have leaked: a filesystem path, an OS error,
|
||||
/// and the word `panicked`.
|
||||
const LEAKY_DETAIL: &str =
|
||||
"task 42 panicked at 'C:\\Users\\ruv\\secret\\models\\foo.rvf': No such file or directory (os error 2)";
|
||||
|
||||
/// Recursively collect every string value in a JSON document, so a test can
|
||||
/// assert no leaky substring appears *anywhere* in the body (not just in a
|
||||
/// single known field).
|
||||
fn all_strings(v: &serde_json::Value, out: &mut Vec<String>) {
|
||||
match v {
|
||||
serde_json::Value::String(s) => out.push(s.clone()),
|
||||
serde_json::Value::Array(a) => a.iter().for_each(|x| all_strings(x, out)),
|
||||
serde_json::Value::Object(o) => o.values().for_each(|x| all_strings(x, out)),
|
||||
_ => {}
|
||||
}
|
||||
}
|
||||
|
||||
fn body_strings(body: &Json<serde_json::Value>) -> Vec<String> {
|
||||
let mut out = Vec::new();
|
||||
all_strings(&body.0, &mut out);
|
||||
out
|
||||
}
|
||||
|
||||
/// REGRESSION (ADR-080 #2): the response body must NOT contain the panic
|
||||
/// message, the filesystem path, or the OS error string. The pre-fix code
|
||||
/// returned `format!("{e}")` / `join_err.to_string()` directly, so the body
|
||||
/// *did* contain `panicked`, the path, and `os error 2` — this test fails
|
||||
/// on that old behavior.
|
||||
#[test]
|
||||
fn internal_error_body_does_not_leak_detail() {
|
||||
let (status, body) = internal_error("unit-test", LEAKY_DETAIL);
|
||||
assert_eq!(status, StatusCode::INTERNAL_SERVER_ERROR);
|
||||
for s in body_strings(&body) {
|
||||
assert!(
|
||||
!s.contains("panicked"),
|
||||
"response body leaked the panic message: {s:?}"
|
||||
);
|
||||
assert!(
|
||||
!s.contains("secret"),
|
||||
"response body leaked a filesystem path: {s:?}"
|
||||
);
|
||||
assert!(
|
||||
!s.contains("os error"),
|
||||
"response body leaked an OS error string: {s:?}"
|
||||
);
|
||||
assert!(
|
||||
!s.contains(".rvf"),
|
||||
"response body leaked a file name/path: {s:?}"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/// The generic body still carries a correlation id so an operator can join
|
||||
/// the client report to the server log line that *does* hold the detail.
|
||||
#[test]
|
||||
fn internal_error_body_is_generic_with_correlation_id() {
|
||||
let (_status, body) = internal_error("unit-test", LEAKY_DETAIL);
|
||||
assert_eq!(body.0["error"], "internal_error");
|
||||
assert_eq!(body.0["success"], false);
|
||||
let cid = body.0["correlation_id"]
|
||||
.as_str()
|
||||
.expect("correlation_id must be a string");
|
||||
assert_eq!(cid.len(), 16, "correlation id should be 16 hex chars");
|
||||
assert!(
|
||||
cid.chars().all(|c| c.is_ascii_hexdigit()),
|
||||
"correlation id should be hex: {cid:?}"
|
||||
);
|
||||
}
|
||||
|
||||
/// Same leak guarantee for the bare-`Json` (legacy "success: false")
|
||||
/// variant used by handlers that don't return an HTTP status.
|
||||
#[test]
|
||||
fn internal_error_json_does_not_leak_detail() {
|
||||
let body = internal_error_json("unit-test", LEAKY_DETAIL);
|
||||
assert_eq!(body.0["error"], "internal_error");
|
||||
assert_eq!(body.0["success"], false);
|
||||
for s in body_strings(&body) {
|
||||
assert!(!s.contains("panicked"), "leaked panic message: {s:?}");
|
||||
assert!(!s.contains("secret"), "leaked filesystem path: {s:?}");
|
||||
assert!(!s.contains("os error"), "leaked OS error: {s:?}");
|
||||
}
|
||||
}
|
||||
|
||||
/// The 503 upstream flavor must likewise not echo the raw upstream error
|
||||
/// (which can carry an internal URL / connection string).
|
||||
#[test]
|
||||
fn upstream_unavailable_does_not_leak_detail() {
|
||||
let (status, body) = upstream_unavailable(
|
||||
"edge-registry",
|
||||
"https://internal-host.local:9000/app-registry.json: connection refused",
|
||||
);
|
||||
assert_eq!(status, StatusCode::SERVICE_UNAVAILABLE);
|
||||
for s in body_strings(&body) {
|
||||
assert!(
|
||||
!s.contains("internal-host"),
|
||||
"leaked internal upstream host: {s:?}"
|
||||
);
|
||||
assert!(
|
||||
!s.contains("connection refused"),
|
||||
"leaked upstream connection detail: {s:?}"
|
||||
);
|
||||
}
|
||||
assert_eq!(body.0["error"], "upstream_unavailable");
|
||||
assert!(body.0["correlation_id"].is_string());
|
||||
}
|
||||
|
||||
/// Correlation ids are unique across rapid successive calls (so two errors
|
||||
/// can be told apart in the log even under load).
|
||||
#[test]
|
||||
fn correlation_ids_are_unique() {
|
||||
let a = correlation_id();
|
||||
let b = correlation_id();
|
||||
assert_ne!(a, b, "successive correlation ids must differ: {a} == {b}");
|
||||
}
|
||||
}
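The mixing step inside `correlation_id` can be looked at in isolation. The sketch below is a minimal illustration, assuming the timestamp and counter are passed in rather than read from the clock and the atomic; `mix_id` is a hypothetical helper introduced only for this example and is not part of the crate.

```rust
/// Hypothetical helper (not in the crate): the same XOR-and-rotate mixing
/// that `correlation_id` applies, with the inputs made explicit so the
/// effect of the counter can be inspected.
pub fn mix_id(nanos: u64, seq: u64) -> String {
    // Rotating left by 40 moves the counter's low bits to bit 40 and above,
    // so two calls that see the same `nanos` but consecutive counters
    // produce different values.
    let mixed = nanos ^ seq.rotate_left(40);
    // Always 16 lowercase hex characters, zero-padded.
    format!("{mixed:016x}")
}
```

With a fixed timestamp, `mix_id(123, 7)` and `mix_id(123, 8)` differ only in the bits contributed by the counter. The id is a log-join key, as the module documentation states, so nothing here is meant to be unpredictable.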
|
||||
@@ -0,0 +1,241 @@
|
||||
//! Field-peak localization for the Observatory 3D view (issue #1050).
|
||||
//!
|
||||
//! ## What this is (and is not)
|
||||
//!
|
||||
//! The `/ws/sensing` `sensing_update` frame already carries a real `signal_field`
|
||||
//! — a 20×20 grid built by `generate_signal_field()` from **measured subcarrier
|
||||
//! variances** weighted by the **measured motion-band power**. The grid's hot
|
||||
//! cells are the strongest scatterers in that field representation; as the CSI
|
||||
//! changes (a person moving through the link), the peak cell moves with it.
|
||||
//!
|
||||
//! This module reads the **strongest peak(s)** out of that real field and maps
|
||||
//! the peak cell to the Observatory room's world coordinates. That gives the
|
||||
//! 3D figure a position + motion magnitude that are **derived from real signal
|
||||
//! data**, so the figure now tracks where the field energy concentrates.
|
||||
//!
|
||||
//! ### Honesty caveat (do not over-claim)
|
||||
//!
|
||||
//! The field's subcarrier→angle mapping in `generate_signal_field()` is a
|
||||
//! *representation*, not calibrated multistatic triangulation in metric room
|
||||
//! coordinates. A single ESP32 link cannot resolve a true (x, z) room position.
|
||||
//! So the emitted `position` is **"strongest field peak in the room model"**,
|
||||
//! not survey-grade localization. It is real (a function of live CSI), it moves
|
||||
//! with real motion, and it is honest about its source — but it is NOT a
|
||||
//! calibrated person fix. Per-person skeletal `pose` keypoints in room
|
||||
//! coordinates remain gated on the pose model + paired ground-truth data
|
||||
//! (ADR-079), so `pose` here is only ever set from a real aggregate posture
|
||||
//! estimate when one exists, and is `None` otherwise (never fabricated).
|
||||
//!
|
||||
//! ## Coordinate mapping
|
||||
//!
|
||||
//! The Observatory builds its field point cloud (see `ui/observatory/js/main.js`
|
||||
//! `_buildSignalField`) as, for grid cell `(ix, iz)` of a `20×20` grid:
|
||||
//!
|
||||
//! ```text
|
||||
//! world_x = (ix - gridSize/2) * 0.6
|
||||
//! world_z = (iz - gridSize/2) * 0.5
|
||||
//! world_y = 0 (floor)
|
||||
//! ```
|
||||
//!
|
||||
//! and indexes the field as `idx = iz * gridSize + ix` — identical to the
|
||||
//! server's `generate_signal_field()` layout (`values[z * grid + x]`). We map
|
||||
//! the peak cell with the **same** transform so the figure lands exactly on the
|
||||
//! field hotspot it is standing on.
|
||||
|
||||
/// World-space scale factor for the X (width) axis, matching the Observatory's
|
||||
/// `_buildSignalField`: `world_x = (ix - nx/2) * X_SCALE`.
|
||||
pub const X_SCALE: f64 = 0.6;
|
||||
/// World-space scale factor for the Z (depth) axis, matching the Observatory's
|
||||
/// `_buildSignalField`: `world_z = (iz - nz/2) * Z_SCALE`.
|
||||
pub const Z_SCALE: f64 = 0.5;
|
||||
|
||||
/// Minimum normalized field value (`signal_field.values` are normalized to
|
||||
/// `[0, 1]`) for a cell to be considered a real peak rather than background
|
||||
/// attenuation. Below this we treat the field as having no localizable hotspot.
|
||||
pub const PEAK_THRESHOLD: f64 = 0.35;
|
||||
|
||||
/// A localized field peak in Observatory world coordinates.
|
||||
#[derive(Debug, Clone, Copy, PartialEq)]
|
||||
pub struct FieldPeak {
|
||||
/// World position `[x, y, z]` in Observatory scene units (meters). `y` is
|
||||
/// always `0.0` — the field is a floor-plane grid with no height info.
|
||||
pub position: [f64; 3],
|
||||
/// Normalized field intensity at the peak cell, in `[0, 1]`.
|
||||
pub intensity: f64,
|
||||
/// Source grid cell `(ix, iz)` the peak was read from (for tests/debug).
|
||||
pub cell: (usize, usize),
|
||||
}
|
||||
|
||||
/// Map a grid cell `(ix, iz)` of an `nx × nz` field to Observatory world
|
||||
/// coordinates, matching `ui/observatory/js/main.js::_buildSignalField`.
|
||||
#[must_use]
|
||||
pub fn cell_to_world(ix: usize, iz: usize, nx: usize, nz: usize) -> [f64; 3] {
|
||||
let wx = (ix as f64 - nx as f64 / 2.0) * X_SCALE;
|
||||
let wz = (iz as f64 - nz as f64 / 2.0) * Z_SCALE;
|
||||
[wx, 0.0, wz]
|
||||
}
|
||||
|
||||
/// Extract up to `max_peaks` strongest, spatially-separated peaks from a
|
||||
/// `signal_field` grid.
|
||||
///
|
||||
/// * `values` — row-major field grid, `values[iz * nx + ix]`, normalized to
|
||||
/// `[0, 1]` (as produced by `generate_signal_field`).
|
||||
/// * `nx`, `nz` — grid dimensions (the field's `grid_size` is `[nx, 1, nz]`).
|
||||
/// * `max_peaks` — how many person positions to extract (≥ 1).
|
||||
///
|
||||
/// Returns peaks sorted strongest-first. Each successive peak is forced to be
|
||||
/// at least `min_separation_cells` away from all previously selected peaks so
|
||||
/// two persons don't collapse onto the same hotspot. Returns an **empty**
|
||||
/// vector when no cell reaches [`PEAK_THRESHOLD`] (compared with `>=`), so an empty / no-presence
|
||||
/// field yields no phantom person.
|
||||
#[must_use]
|
||||
pub fn extract_peaks(
|
||||
values: &[f64],
|
||||
nx: usize,
|
||||
nz: usize,
|
||||
max_peaks: usize,
|
||||
min_separation_cells: f64,
|
||||
) -> Vec<FieldPeak> {
|
||||
if nx == 0 || nz == 0 || values.len() < nx * nz || max_peaks == 0 {
|
||||
return Vec::new();
|
||||
}
|
||||
|
||||
// Collect all cells above threshold, strongest first.
|
||||
let mut candidates: Vec<(usize, usize, f64)> = Vec::new();
|
||||
for iz in 0..nz {
|
||||
for ix in 0..nx {
|
||||
let v = values[iz * nx + ix];
|
||||
if v >= PEAK_THRESHOLD {
|
||||
candidates.push((ix, iz, v));
|
||||
}
|
||||
}
|
||||
}
|
||||
candidates.sort_by(|a, b| b.2.total_cmp(&a.2));
|
||||
|
||||
let mut peaks: Vec<FieldPeak> = Vec::new();
|
||||
for (ix, iz, v) in candidates {
|
||||
if peaks.len() >= max_peaks {
|
||||
break;
|
||||
}
|
||||
// Enforce spatial separation from already-chosen peaks (in cell units).
|
||||
let too_close = peaks.iter().any(|p| {
|
||||
let dx = p.cell.0 as f64 - ix as f64;
|
||||
let dz = p.cell.1 as f64 - iz as f64;
|
||||
(dx * dx + dz * dz).sqrt() < min_separation_cells
|
||||
});
|
||||
if too_close {
|
||||
continue;
|
||||
}
|
||||
peaks.push(FieldPeak {
|
||||
position: cell_to_world(ix, iz, nx, nz),
|
||||
intensity: v,
|
||||
cell: (ix, iz),
|
||||
});
|
||||
}
|
||||
peaks
|
||||
}
|
||||
|
||||
/// Convert measured `motion_band_power` to the `motion_score` scale the
|
||||
/// Observatory UI expects.
|
||||
///
|
||||
/// The UI compares `motion_score > 50` to switch between calm and energetic
|
||||
/// emission (see `_updateDotMatrixMist` / `_updateParticleTrail`). The raw
|
||||
/// `motion_band_power` is already in roughly that band for live ESP32 data
|
||||
/// (the issue reports `motion_band_power: 63.3` while moving), so we pass it
|
||||
/// through directly, clamped to a sane `[0, 100]` display range. This keeps the
|
||||
/// emitted value a **direct, real** function of measured motion energy rather
|
||||
/// than a re-scaled invention.
|
||||
#[must_use]
|
||||
pub fn motion_score_from_power(motion_band_power: f64) -> f64 {
|
||||
motion_band_power.clamp(0.0, 100.0)
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
#[test]
|
||||
fn cell_to_world_matches_observatory_layout() {
|
||||
// Cell (10, 10) of a 20×20 grid maps to the origin.
|
||||
let c = cell_to_world(10, 10, 20, 20);
|
||||
assert!((c[0] - 0.0).abs() < 1e-9);
|
||||
assert_eq!(c[1], 0.0);
|
||||
assert!((c[2] - 0.0).abs() < 1e-9);
|
||||
|
||||
// Corner cell (0,0) maps to the room's near-left corner.
|
||||
let corner = cell_to_world(0, 0, 20, 20);
|
||||
assert!((corner[0] - (-6.0)).abs() < 1e-9); // (0-10)*0.6
|
||||
assert!((corner[2] - (-5.0)).abs() < 1e-9); // (0-10)*0.5
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn extract_peaks_finds_known_hotspot() {
|
||||
// 20×20 field, all background, single strong peak at cell (15, 4).
|
||||
let nx = 20;
|
||||
let nz = 20;
|
||||
let mut values = vec![0.05; nx * nz];
|
||||
let peak_ix = 15;
|
||||
let peak_iz = 4;
|
||||
values[peak_iz * nx + peak_ix] = 1.0;
|
||||
|
||||
let peaks = extract_peaks(&values, nx, nz, 1, 3.0);
|
||||
assert_eq!(peaks.len(), 1);
|
||||
assert_eq!(peaks[0].cell, (peak_ix, peak_iz));
|
||||
|
||||
// Position must match the Observatory cell→world transform within tol.
|
||||
let expected = cell_to_world(peak_ix, peak_iz, nx, nz);
|
||||
assert!((peaks[0].position[0] - expected[0]).abs() < 1e-9);
|
||||
assert!((peaks[0].position[2] - expected[2]).abs() < 1e-9);
|
||||
// Sanity: (15-10)*0.6 = 3.0, (4-10)*0.5 = -3.0
|
||||
assert!((peaks[0].position[0] - 3.0).abs() < 1e-9);
|
||||
assert!((peaks[0].position[2] - (-3.0)).abs() < 1e-9);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn empty_field_yields_no_peaks() {
|
||||
let nx = 20;
|
||||
let nz = 20;
|
||||
// All cells below PEAK_THRESHOLD — no presence.
|
||||
let values = vec![0.10; nx * nz];
|
||||
let peaks = extract_peaks(&values, nx, nz, 3, 3.0);
|
||||
assert!(
|
||||
peaks.is_empty(),
|
||||
"below-threshold field must not produce a phantom peak"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn two_separated_peaks_do_not_collapse() {
|
||||
let nx = 20;
|
||||
let nz = 20;
|
||||
let mut values = vec![0.05; nx * nz];
|
||||
values[2 * nx + 3] = 0.95; // peak A at (3, 2)
|
||||
values[15 * nx + 17] = 0.90; // peak B at (17, 15)
|
||||
|
||||
let peaks = extract_peaks(&values, nx, nz, 2, 3.0);
|
||||
assert_eq!(peaks.len(), 2);
|
||||
// Strongest first.
|
||||
assert_eq!(peaks[0].cell, (3, 2));
|
||||
assert_eq!(peaks[1].cell, (17, 15));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn nearby_secondary_peak_is_suppressed() {
|
||||
let nx = 20;
|
||||
let nz = 20;
|
||||
let mut values = vec![0.05; nx * nz];
|
||||
values[10 * nx + 10] = 1.00; // primary
|
||||
values[10 * nx + 11] = 0.99; // adjacent — should be suppressed (sep 3.0)
|
||||
|
||||
let peaks = extract_peaks(&values, nx, nz, 2, 3.0);
|
||||
assert_eq!(peaks.len(), 1, "adjacent cell must not become a 2nd person");
|
||||
assert_eq!(peaks[0].cell, (10, 10));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn motion_score_passthrough_and_clamp() {
|
||||
assert!((motion_score_from_power(63.3) - 63.3).abs() < 1e-9);
|
||||
assert_eq!(motion_score_from_power(-5.0), 0.0);
|
||||
assert_eq!(motion_score_from_power(250.0), 100.0);
|
||||
}
|
||||
}
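The row-major layout (`values[iz * nx + ix]`) and the cell→world transform described in the module documentation can be combined into one step. The following is a sketch under stated assumptions: `index_to_world` is a hypothetical helper, not a function in `field_localize`, and the two scale constants are copied from the module so the example compiles on its own.

```rust
// Copied from the module constants so this sketch is self-contained.
const X_SCALE: f64 = 0.6;
const Z_SCALE: f64 = 0.5;

/// Hypothetical helper (not in the crate): map a flat row-major index
/// `idx = iz * nx + ix` straight to Observatory world coordinates.
/// Returns `None` for an empty grid or an out-of-range index.
pub fn index_to_world(idx: usize, nx: usize, nz: usize) -> Option<[f64; 3]> {
    if nx == 0 || nz == 0 || idx >= nx * nz {
        return None;
    }
    // Invert the row-major layout.
    let ix = idx % nx;
    let iz = idx / nx;
    // Same transform as `cell_to_world`: centre the grid, then scale.
    let wx = (ix as f64 - nx as f64 / 2.0) * X_SCALE;
    let wz = (iz as f64 - nz as f64 / 2.0) * Z_SCALE;
    Some([wx, 0.0, wz])
}
```

For the hotspot used in `extract_peaks_finds_known_hotspot`, cell `(15, 4)` has flat index `4 * 20 + 15 = 95`, which maps to roughly `x = 3.0`, `z = -3.0`, matching the sanity check in that test.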
|
||||
@@ -362,6 +362,49 @@ mod tests {
|
||||
);
|
||||
}
|
||||
|
||||
/// REGRESSION (ADR-080 #1 — X-Forwarded-For / X-Forwarded-Host spoofing).
|
||||
///
|
||||
/// The DNS-rebinding allowlist must decide purely on the real `Host` header
|
||||
/// and ignore any client-supplied forwarding headers. Otherwise an attacker
|
||||
/// could spoof `X-Forwarded-Host: localhost` (or `X-Forwarded-For`) to slip a
|
||||
/// foreign `Host` past the allowlist. This test sends a rejected `Host:
|
||||
/// evil.com` *with* allowlisted forwarding headers and asserts the request is
|
||||
/// still `421` — the forwarded headers must not bypass the control. It also
|
||||
/// confirms an allowed `Host` stays `200` regardless of a hostile XFF.
|
||||
#[tokio::test]
|
||||
async fn forwarded_headers_never_bypass_host_allowlist() {
|
||||
let r = router(HostAllowlist::loopback_only());
|
||||
async fn with_forwarded(
|
||||
router: Router,
|
||||
host: &str,
|
||||
xff: &str,
|
||||
xfh: &str,
|
||||
) -> StatusCode {
|
||||
let req = Request::builder()
|
||||
.method("GET")
|
||||
.uri("/api/v1/pose/current")
|
||||
.header(HOST, host)
|
||||
.header("X-Forwarded-For", xff)
|
||||
.header("X-Forwarded-Host", xfh)
|
||||
.body(Body::empty())
|
||||
.unwrap();
|
||||
router.oneshot(req).await.unwrap().status()
|
||||
}
|
||||
// Foreign Host + spoofed allowlisted forwarding headers ⇒ still rejected.
|
||||
assert_eq!(
|
||||
with_forwarded(r.clone(), "evil.com", "127.0.0.1", "localhost").await,
|
||||
StatusCode::MISDIRECTED_REQUEST,
|
||||
"X-Forwarded-* must not let a foreign Host bypass the allowlist"
|
||||
);
|
||||
// Allowed Host + hostile forwarding headers ⇒ still allowed (forwarded
|
||||
// headers are simply not consulted).
|
||||
assert_eq!(
|
||||
with_forwarded(r, "127.0.0.1:8080", "evil.com", "evil.com").await,
|
||||
StatusCode::OK,
|
||||
"the real Host header is the only signal; XFF/XFH are ignored"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn disabled_allowlist_is_no_op() {
|
||||
let r = router(HostAllowlist::disabled());
|
||||
|
||||
@@ -5,6 +5,7 @@
|
||||
//! - RVF (RuVector Format) binary container for model weights
|
||||
//! - Opt-in bearer-token auth for the `/api/v1/*` HTTP surface (`bearer_auth`)
|
||||
//! - Host-header allowlist / DNS-rebinding defense (`host_validation`)
|
||||
//! - Generic, leak-free internal-error responses (`error_response`, ADR-080 #2)
|
||||
//! - Real-time CSI introspection / low-latency tap (`introspection`, ADR-099)
|
||||
|
||||
pub mod bearer_auth;
|
||||
@@ -13,6 +14,7 @@ pub mod dataset;
|
||||
pub mod edge_registry;
|
||||
#[allow(dead_code)]
|
||||
pub mod embedding;
|
||||
pub mod error_response;
|
||||
pub mod graph_transformer;
|
||||
pub mod host_validation;
|
||||
pub mod introspection;
|
||||
|
||||
@@ -14,6 +14,7 @@ pub mod cli;
|
||||
pub mod csi;
|
||||
mod engine_bridge;
|
||||
mod field_bridge;
|
||||
mod field_localize;
|
||||
mod model_format;
|
||||
mod multistatic_bridge;
|
||||
pub mod pose;
|
||||
@@ -24,7 +25,9 @@ pub mod types;
|
||||
mod vital_signs;
|
||||
|
||||
// Training pipeline modules (exposed via lib.rs)
|
||||
use wifi_densepose_sensing_server::{dataset, embedding, graph_transformer, trainer};
|
||||
use wifi_densepose_sensing_server::{
|
||||
dataset, embedding, error_response, graph_transformer, trainer,
|
||||
};
|
||||
|
||||
use ruvector_mincut::{DynamicMinCut, MinCutBuilder};
|
||||
use std::collections::{HashMap, VecDeque};
|
||||
@@ -404,6 +407,24 @@ struct PersonDetection {
|
||||
keypoints: Vec<PoseKeypoint>,
|
||||
bbox: BoundingBox,
|
||||
zone: String,
|
||||
/// Room-world position `[x, y, z]` (Observatory scene units / meters),
|
||||
/// derived from the strongest `signal_field` peak this person sits on
|
||||
/// (issue #1050). `y` is `0.0` — the field is a floor-plane grid. This is
|
||||
/// a real field-peak readout, not calibrated triangulation; see
|
||||
/// `field_localize` for the honesty caveat. Defaults to `[0,0,0]` until
|
||||
/// field positions are attached by `attach_field_positions`.
|
||||
#[serde(default)]
|
||||
position: [f64; 3],
|
||||
/// Motion magnitude on the Observatory's `0..100` scale, passed through
|
||||
/// from the measured `motion_band_power` (issue #1050).
|
||||
#[serde(default)]
|
||||
motion_score: f64,
|
||||
/// Coarse posture label (`"standing"`/`"lying"`/…) when a **real** aggregate
|
||||
/// posture estimate exists, else `None`. Never fabricated — per-person
|
||||
/// skeletal pose in room coordinates remains gated on the pose model
|
||||
/// (ADR-079). The Observatory defaults to `'standing'` when this is absent.
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pose: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
@@ -2570,6 +2591,8 @@ async fn windows_wifi_task(state: SharedState, tick_ms: u64) {
|
||||
if !tracked.is_empty() {
|
||||
update.persons = Some(tracked);
|
||||
}
|
||||
// #1050: attach real signal_field-peak positions to each person.
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
if let Ok(json) = serde_json::to_string(&update) {
|
||||
let _ = s.tx.send(json);
|
||||
@@ -2723,6 +2746,8 @@ async fn windows_wifi_fallback_tick(state: &SharedState, seq: u32) {
|
||||
if !tracked.is_empty() {
|
||||
update.persons = Some(tracked);
|
||||
}
|
||||
// #1050: attach real signal_field-peak positions to each person.
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
if let Ok(json) = serde_json::to_string(&update) {
|
||||
let _ = s.tx.send(json);
|
||||
@@ -3161,12 +3186,21 @@ async fn handle_ws_pose_client(mut socket: WebSocket, state: SharedState) {
|
||||
x: kp[0], y: kp[1], z: kp[2], confidence: kp[3],
|
||||
})
|
||||
.collect();
|
||||
let [nx, _ny, nz] = sensing.signal_field.grid_size;
|
||||
let peak = field_localize::extract_peaks(
|
||||
&sensing.signal_field.values, nx, nz, 1, 3.0,
|
||||
).into_iter().next();
|
||||
vec![PersonDetection {
|
||||
id: 1,
|
||||
confidence: sensing.classification.confidence,
|
||||
bbox: BoundingBox { x: 260.0, y: 150.0, width: 120.0, height: 220.0 },
|
||||
keypoints,
|
||||
zone: "zone_1".into(),
|
||||
position: peak.map_or([0.0, 0.0, 0.0], |p| p.position),
|
||||
motion_score: field_localize::motion_score_from_power(
|
||||
sensing.features.motion_band_power,
|
||||
),
|
||||
pose: sensing.posture.clone(),
|
||||
}]
|
||||
}).unwrap_or_else(|| {
|
||||
// Prefer tracked persons from broadcast if available
|
||||
@@ -3945,6 +3979,53 @@ fn derive_single_person_pose(
|
||||
height: (max_y - min_y).max(160.0),
|
||||
},
|
||||
zone: format!("zone_{}", person_idx + 1),
|
||||
// Position/motion_score/pose are attached from the real signal_field
|
||||
// peaks by `attach_field_positions` after the tracker step (#1050);
|
||||
// default here so the synthetic-skeleton geometry stays unchanged.
|
||||
position: [0.0, 0.0, 0.0],
|
||||
motion_score: 0.0,
|
||||
pose: None,
|
||||
}
|
||||
}
|
||||
|
||||
/// Attach real, field-derived per-person world positions to a `SensingUpdate`'s
|
||||
/// `persons` (issue #1050).
|
||||
///
|
||||
/// For each detected person we read a strongest-peak position out of the frame's
|
||||
/// real `signal_field` (the same grid the Observatory already renders) and map
|
||||
/// it to room-world coordinates via `field_localize::cell_to_world`. `motion_score`
|
||||
/// is passed through from the measured `motion_band_power`; `pose` is taken from
|
||||
/// the real aggregate `posture` estimate when present, else left `None` (never
|
||||
/// fabricated). Persons beyond the number of resolvable field peaks fall back to
|
||||
/// the strongest peak so they remain co-located with real energy rather than at
|
||||
/// a fake origin; if the field has no peak above threshold the position stays at
|
||||
/// `[0,0,0]` and `motion_score` still reflects real motion power.
|
||||
fn attach_field_positions(update: &mut SensingUpdate) {
|
||||
let Some(persons) = update.persons.as_mut() else {
|
||||
return;
|
||||
};
|
||||
if persons.is_empty() {
|
||||
return;
|
||||
}
|
||||
|
||||
let [nx, _ny, nz] = update.signal_field.grid_size;
|
||||
let peaks = field_localize::extract_peaks(
|
||||
&update.signal_field.values,
|
||||
nx,
|
||||
nz,
|
||||
persons.len().max(1),
|
||||
3.0,
|
||||
);
|
||||
|
||||
let motion_score = field_localize::motion_score_from_power(update.features.motion_band_power);
|
||||
let pose_label = update.posture.clone();
|
||||
|
||||
for (i, person) in persons.iter_mut().enumerate() {
|
||||
if let Some(peak) = peaks.get(i).or_else(|| peaks.first()) {
|
||||
person.position = peak.position;
|
||||
}
|
||||
person.motion_score = motion_score;
|
||||
person.pose = pose_label.clone();
|
||||
}
|
||||
}
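The fallback rule in `attach_field_positions` (person `i` takes peak `i` if it exists, otherwise the strongest peak) can be sketched without the surrounding types. This is an illustration only: `assign_positions` is a hypothetical helper, peaks are reduced to bare `[f64; 3]` positions, and it assumes every person starts at the `[0, 0, 0]` default.

```rust
/// Hypothetical helper (not in the crate): choose a position for each of
/// `n_persons` from `peaks`, which is assumed to be sorted strongest-first.
pub fn assign_positions(n_persons: usize, peaks: &[[f64; 3]]) -> Vec<[f64; 3]> {
    (0..n_persons)
        .map(|i| {
            peaks
                .get(i)                     // peak i, when one was resolved
                .or_else(|| peaks.first())  // otherwise the strongest peak
                .copied()
                .unwrap_or([0.0, 0.0, 0.0]) // no peak at all: stay at the default
        })
        .collect()
}
```

With three persons and two resolved peaks, the third person shares the strongest peak's position; with no peaks, every position stays at the origin, which mirrors the behaviour the doc comment describes.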
|
||||
|
||||
@@ -4280,7 +4361,7 @@ async fn delete_model(
|
||||
State(state): State<SharedState>,
|
||||
Path(id): Path<String>,
|
||||
) -> Json<serde_json::Value> {
|
||||
// ADR-050: Sanitize path to prevent directory traversal
|
||||
// ADR-166: Sanitize path to prevent directory traversal
|
||||
let safe_id = std::path::Path::new(&id)
|
||||
.file_name()
|
||||
.and_then(|f| f.to_str())
|
||||
@@ -4291,10 +4372,9 @@ async fn delete_model(
     let path = effective_models_dir().join(format!("{}.rvf", safe_id));
     if path.exists() {
         if let Err(e) = std::fs::remove_file(&path) {
-            warn!("Failed to delete model file {:?}: {}", path, e);
-            return Json(
-                serde_json::json!({ "error": format!("delete failed: {e}"), "success": false }),
-            );
+            // ADR-080 #2: log the OS error (incl. path) server-side only; the
+            // client gets a generic body + correlation id, no leaked path.
+            return error_response::internal_error_json("model delete", e);
         }
         // If this was the active model, unload it
         let mut s = state.write().await;
|
||||
@@ -4434,11 +4514,9 @@ async fn start_recording(
     let file = match std::fs::File::create(&rec_path) {
         Ok(f) => f,
         Err(e) => {
-            warn!("Failed to create recording file {:?}: {}", rec_path, e);
-            return Json(serde_json::json!({
-                "error": format!("cannot create file: {e}"),
-                "success": false,
-            }));
+            // ADR-080 #2: the OS error can carry the recordings path; log it
+            // server-side only and return a generic body + correlation id.
+            return error_response::internal_error_json("recording create", e);
         }
     };
|
||||
|
||||
@@ -4550,7 +4628,7 @@ async fn delete_recording(
     State(state): State<SharedState>,
     Path(id): Path<String>,
 ) -> Json<serde_json::Value> {
-    // ADR-050: Sanitize path to prevent directory traversal
+    // ADR-166: Sanitize path to prevent directory traversal
     let safe_id = std::path::Path::new(&id)
         .file_name()
         .and_then(|f| f.to_str())
|
||||
@@ -4561,10 +4639,8 @@ async fn delete_recording(
     let path = PathBuf::from("data/recordings").join(format!("{}.jsonl", safe_id));
     if path.exists() {
         if let Err(e) = std::fs::remove_file(&path) {
-            warn!("Failed to delete recording {:?}: {}", path, e);
-            return Json(
-                serde_json::json!({ "error": format!("delete failed: {e}"), "success": false }),
-            );
+            // ADR-080 #2: log the OS error (incl. path) server-side only.
+            return error_response::internal_error_json("recording delete", e);
         }
         let mut s = state.write().await;
         s.recordings
|
||||
@@ -4773,10 +4849,8 @@ async fn calibration_start(State(state): State<SharedState>) -> Json<serde_json:
                 "message": "Calibration started — keep room empty while frames accumulate.",
             }))
         }
-        Err(e) => Json(serde_json::json!({
-            "success": false,
-            "error": format!("{e}"),
-        })),
+        // ADR-080 #2: FieldModel init error chain stays server-side only.
+        Err(e) => error_response::internal_error_json("calibration start", e),
     }
 }
|
||||
|
||||
@@ -4796,10 +4870,8 @@ async fn calibration_stop(State(state): State<SharedState>) -> Json<serde_json::
                     "frame_count": fm.calibration_frame_count(),
                 }))
             }
-            Err(e) => Json(serde_json::json!({
-                "success": false,
-                "error": format!("{e}"),
-            })),
+            // ADR-080 #2: finalize error chain stays server-side only.
+            Err(e) => error_response::internal_error_json("calibration stop", e),
         }
     } else {
         Json(serde_json::json!({
|
||||
@@ -4895,26 +4967,13 @@ async fn edge_registry_endpoint(
         Ok(Ok(resp)) => Ok(Json(
             serde_json::to_value(resp).unwrap_or(serde_json::json!({})),
         )),
-        Ok(Err(err)) => {
-            tracing::warn!(error = %err, "edge_registry upstream fetch failed and no cache");
-            Err((
-                StatusCode::SERVICE_UNAVAILABLE,
-                Json(serde_json::json!({
-                    "error": "edge_registry_upstream_unavailable",
-                    "detail": err.to_string()
-                })),
-            ))
-        }
-        Err(join_err) => {
-            tracing::error!(error = %join_err, "edge_registry spawn_blocking task panicked");
-            Err((
-                StatusCode::INTERNAL_SERVER_ERROR,
-                Json(serde_json::json!({
-                    "error": "edge_registry_internal_error",
-                    "detail": join_err.to_string()
-                })),
-            ))
-        }
+        // ADR-080 #2: the upstream error can carry an internal URL/connection
+        // detail — log it server-side only and return a generic 503.
+        Ok(Err(err)) => Err(error_response::upstream_unavailable("edge_registry", err)),
+        // ADR-080 #2: a panicked spawn_blocking surfaces "task … panicked" via
+        // JoinError::Display — never ship that to the client. Generic 500 +
+        // correlation id; the panic detail is logged server-side.
+        Err(join_err) => Err(error_response::internal_error("edge_registry", join_err)),
     }
 }
|
||||
|
||||
@@ -5493,6 +5552,8 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
|
||||
if !tracked.is_empty() {
|
||||
update.persons = Some(tracked);
|
||||
}
|
||||
// #1050: attach real signal_field-peak positions to each person.
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
if let Ok(json) = serde_json::to_string(&update) {
|
||||
let _ = s.tx.send(json);
|
||||
@@ -5923,6 +5984,8 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
|
||||
if !tracked.is_empty() {
|
||||
update.persons = Some(tracked);
|
||||
}
|
||||
// #1050: attach real signal_field-peak positions to each person.
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
if let Ok(json) = serde_json::to_string(&update) {
|
||||
let _ = s.tx.send(json);
|
||||
@@ -6096,6 +6159,8 @@ async fn simulated_data_task(state: SharedState, tick_ms: u64) {
|
||||
if !tracked.is_empty() {
|
||||
update.persons = Some(tracked);
|
||||
}
|
||||
// #1050: attach real signal_field-peak positions to each person.
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
if update.classification.presence {
|
||||
s.total_detections += 1;
|
||||
@@ -6964,8 +7029,12 @@ async fn main() {
         eprintln!("Starting training for {} epochs...", args.epochs);
         let result = t.run_training(train_data, val_data);
         eprintln!("Training complete in {:.1}s", result.total_time_secs);
+        // ADR-155 §2.1: `best_pck` is RAW-threshold PCK (no torso norm) and
+        // `best_oks` uses the fake-Gold area=1.0 proxy — NOT the canonical
+        // hip↔hip `pck_canonical` / COCO OKS. Label them distinctly so the
+        // printed numbers are never read as claim-grade canonical metrics.
         eprintln!(
-            " Best epoch: {}, PCK@0.2: {:.4}, OKS mAP: {:.4}",
+            " Best epoch: {}, pck_raw@0.2: {:.4}, oks_map(area=1.0 proxy): {:.4}",
             result.best_epoch, result.best_pck, result.best_oks
         );
|
||||
|
||||
@@ -7375,7 +7444,7 @@ async fn main() {
         tokio::spawn(simulated_data_task(state.clone(), args.tick_ms));
     }

-    // ADR-050: Parse bind address once, use for all listeners
+    // ADR-166: Parse bind address once, use for all listeners
     let bind_ip: std::net::IpAddr = args
         .bind_addr
         .parse()
|
||||
@@ -8236,3 +8305,171 @@ mod export_rvf_mode_tests {
|
||||
assert!(!export_emits_placeholder_demo(false, true, false));
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod observatory_persons_field_position_tests {
|
||||
//! Issue #1050 — the Observatory 3D figure animates from per-person
|
||||
//! `position` / `motion_score` / `pose` carried on `sensing_update.persons`.
|
||||
//!
|
||||
//! These tests pin the public WS contract: a frame that detects a person on
|
||||
//! a known signal_field peak must emit a `persons` array whose first entry
|
||||
//! carries a `position` derived from that peak (matching the Observatory's
|
||||
//! cell→world transform), a real `motion_score`, and a serialized frame
|
||||
//! that round-trips. An empty / no-presence field must emit `persons: []`
|
||||
//! (or no person), never a phantom person at a fabricated origin.
|
||||
|
||||
use super::*;
|
||||
|
||||
/// Build a 20×20 signal_field that is background everywhere except a single
|
||||
/// strong normalized peak at grid cell `(ix, iz)`.
|
||||
fn field_with_peak(ix: usize, iz: usize) -> SignalField {
|
||||
let nx = 20usize;
|
||||
let nz = 20usize;
|
||||
let mut values = vec![0.05f64; nx * nz];
|
||||
values[iz * nx + ix] = 1.0;
|
||||
SignalField {
|
||||
grid_size: [nx, 1, nz],
|
||||
values,
|
||||
}
|
||||
}
|
||||
|
||||
/// Build an all-background (below-threshold) 20×20 field — no localizable
|
||||
/// hotspot, modelling an empty / no-presence room.
|
||||
fn empty_field() -> SignalField {
|
||||
SignalField {
|
||||
grid_size: [20, 1, 20],
|
||||
values: vec![0.05f64; 20 * 20],
|
||||
}
|
||||
}
|
||||
|
||||
fn base_update(signal_field: SignalField, presence: bool, motion_band_power: f64) -> SensingUpdate {
|
||||
SensingUpdate {
|
||||
msg_type: "sensing_update".to_string(),
|
||||
timestamp: 1.0,
|
||||
source: "test".to_string(),
|
||||
tick: 1,
|
||||
nodes: vec![],
|
||||
features: FeatureInfo {
|
||||
mean_rssi: -60.0,
|
||||
variance: 48.6,
|
||||
motion_band_power,
|
||||
breathing_band_power: 0.0,
|
||||
dominant_freq_hz: 1.0,
|
||||
change_points: 0,
|
||||
spectral_power: 0.0,
|
||||
},
|
||||
classification: ClassificationInfo {
|
||||
motion_level: if presence { "present_moving".to_string() } else { "absent".to_string() },
|
||||
presence,
|
||||
confidence: 0.8,
|
||||
},
|
||||
signal_field,
|
||||
vital_signs: None,
|
||||
enhanced_motion: None,
|
||||
enhanced_breathing: None,
|
||||
posture: None,
|
||||
signal_quality_score: None,
|
||||
quality_verdict: None,
|
||||
bssid_count: None,
|
||||
pose_keypoints: None,
|
||||
model_status: None,
|
||||
persons: None,
|
||||
estimated_persons: Some(1),
|
||||
node_features: None,
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn sensing_update_emits_persons_with_field_derived_position() {
|
||||
// Person present, motion energy 63.3, a hotspot at cell (15, 4).
|
||||
let peak_ix = 15;
|
||||
let peak_iz = 4;
|
||||
let mut update = base_update(field_with_peak(peak_ix, peak_iz), true, 63.3);
|
||||
|
||||
// Pipeline order: derive raw skeleton, then attach real field positions.
|
||||
update.persons = Some(derive_pose_from_sensing(&update));
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
let persons = update.persons.as_ref().expect("persons should be Some");
|
||||
assert!(!persons.is_empty(), "a present person must be emitted");
|
||||
|
||||
// Position must match the Observatory cell→world transform for (15, 4):
|
||||
// x = (15-10)*0.6 = 3.0 ; z = (4-10)*0.5 = -3.0 ; y = 0.
|
||||
let p0 = &persons[0];
|
||||
assert!((p0.position[0] - 3.0).abs() < 1e-6, "x={}", p0.position[0]);
|
||||
assert!((p0.position[1] - 0.0).abs() < 1e-9);
|
||||
assert!((p0.position[2] - (-3.0)).abs() < 1e-6, "z={}", p0.position[2]);
|
||||
|
||||
// motion_score is the measured motion_band_power passed through (≤100).
|
||||
assert!((p0.motion_score - 63.3).abs() < 1e-6, "motion_score={}", p0.motion_score);
|
||||
|
||||
// The serialized WS frame must carry the new fields by their exact
|
||||
// contract names the Observatory UI reads.
|
||||
let v = serde_json::to_value(&update).unwrap();
|
||||
let arr = v["persons"].as_array().expect("persons must be a JSON array");
|
||||
assert_eq!(arr.len(), persons.len());
|
||||
let pj = &arr[0];
|
||||
assert!(pj.get("position").is_some(), "person.position missing from WS frame");
|
||||
assert!(pj.get("motion_score").is_some(), "person.motion_score missing from WS frame");
|
||||
assert!((pj["position"][0].as_f64().unwrap() - 3.0).abs() < 1e-6);
|
||||
assert!((pj["position"][2].as_f64().unwrap() - (-3.0)).abs() < 1e-6);
|
||||
assert!((pj["motion_score"].as_f64().unwrap() - 63.3).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn pose_is_real_when_posture_present_and_absent_otherwise() {
|
||||
// No aggregate posture estimate → pose is None (never fabricated).
|
||||
let mut no_posture = base_update(field_with_peak(10, 10), true, 40.0);
|
||||
no_posture.persons = Some(derive_pose_from_sensing(&no_posture));
|
||||
attach_field_positions(&mut no_posture);
|
||||
let p = &no_posture.persons.as_ref().unwrap()[0];
|
||||
assert!(p.pose.is_none(), "pose must stay None when no real posture exists");
|
||||
// skip_serializing_if drops the key entirely (UI defaults to 'standing').
|
||||
let v = serde_json::to_value(&no_posture).unwrap();
|
||||
assert!(v["persons"][0].get("pose").is_none());
|
||||
|
||||
// Real aggregate posture present → pose is carried through verbatim.
|
||||
let mut with_posture = base_update(field_with_peak(10, 10), true, 40.0);
|
||||
with_posture.posture = Some("lying".to_string());
|
||||
with_posture.persons = Some(derive_pose_from_sensing(&with_posture));
|
||||
attach_field_positions(&mut with_posture);
|
||||
let p2 = &with_posture.persons.as_ref().unwrap()[0];
|
||||
assert_eq!(p2.pose.as_deref(), Some("lying"));
|
||||
let v2 = serde_json::to_value(&with_posture).unwrap();
|
||||
assert_eq!(v2["persons"][0]["pose"], "lying");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn empty_room_yields_no_phantom_person() {
|
||||
// No presence → derive_pose_from_sensing returns no persons at all.
|
||||
let mut update = base_update(empty_field(), false, 2.0);
|
||||
update.persons = Some(derive_pose_from_sensing(&update));
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
let persons = update.persons.as_ref().unwrap();
|
||||
assert!(
|
||||
persons.is_empty(),
|
||||
"no-presence frame must not emit a phantom person, got {} persons",
|
||||
persons.len()
|
||||
);
|
||||
|
||||
// And in the serialized frame the array is empty (no fake origin person).
|
||||
let v = serde_json::to_value(&update).unwrap();
|
||||
assert_eq!(v["persons"].as_array().unwrap().len(), 0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn present_but_below_threshold_field_keeps_position_at_origin_not_fabricated() {
|
||||
// Presence is true but the field has no peak above PEAK_THRESHOLD — we
|
||||
// must NOT invent a position; it stays at the [0,0,0] default while
|
||||
// motion_score still reflects the real measured motion power. This is
|
||||
// the honest degenerate case (no localizable hotspot to report).
|
||||
let mut update = base_update(empty_field(), true, 55.0);
|
||||
update.persons = Some(derive_pose_from_sensing(&update));
|
||||
attach_field_positions(&mut update);
|
||||
|
||||
let p = &update.persons.as_ref().unwrap()[0];
|
||||
assert_eq!(p.position, [0.0, 0.0, 0.0], "no peak → default origin, not fabricated coords");
|
||||
assert!((p.motion_score - 55.0).abs() < 1e-6, "motion_score stays real");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -192,6 +192,11 @@ pub fn derive_single_person_pose(
|
||||
height: (max_y - min_y).max(160.0),
|
||||
},
|
||||
zone: format!("zone_{}", person_idx + 1),
|
||||
// Field-derived fields (#1050) — defaulted here; the live `/ws/sensing`
|
||||
// path attaches real positions via `attach_field_positions`.
|
||||
position: [0.0, 0.0, 0.0],
|
||||
motion_score: 0.0,
|
||||
pose: None,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -176,6 +176,13 @@ pub fn tracker_to_person_detections(tracker: &PoseTracker) -> Vec<PersonDetectio
|
||||
keypoints,
|
||||
bbox,
|
||||
zone: "tracked".to_string(),
|
||||
// Field-derived position/motion_score/pose are (re)attached from
|
||||
// the live signal_field by `attach_field_positions` after this
|
||||
// tracker step (#1050); the Kalman tracker smooths keypoints only,
|
||||
// so we default here and let the field readout fill them in.
|
||||
position: [0.0, 0.0, 0.0],
|
||||
motion_score: 0.0,
|
||||
pose: None,
|
||||
}
|
||||
})
|
||||
.collect()
|
||||
@@ -329,6 +336,9 @@ mod tests {
|
||||
height: 1.0,
|
||||
},
|
||||
zone: "test".to_string(),
|
||||
position: [0.0, 0.0, 0.0],
|
||||
motion_score: 0.0,
|
||||
pose: None,
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -285,7 +285,24 @@ impl WarmupCosineScheduler {

 // ── Validation metrics ─────────────────────────────────────────────────────

-/// Percentage of Correct Keypoints at a distance threshold.
+/// **RAW-threshold** Percentage of Correct Keypoints — a keypoint is correct
+/// iff its raw L2 distance to the target is `≤ thr`, with **NO torso/bbox
+/// normalization**.
+///
+/// # ADR-155 §2.1 / §8 — DIVERGENT from canonical (relabel, do NOT conflate)
+///
+/// This is **not** the canonical hip↔hip torso-normalized
+/// `wifi_densepose_train::pck_canonical`. It is the most divergent PCK in the
+/// workspace: an unnormalized raw-distance count (the ADR-155 §1 "PCK-4
+/// raw-threshold" class). It drives the live sensing-server CLI's reported
+/// `best_pck` (see `Trainer::compute_validation_metrics`, `main.rs` training
+/// path), which prints/serializes as `PCK@0.2` — that label is **raw-threshold
+/// PCK**, NOT canonical PCK@0.2. ADR-155 Milestone-1 resolves the collision by
+/// relabelling the *reported* number (`pck_raw@0.2` in logs/JSON) rather than
+/// silently changing this `pub` API's math; unifying onto `pck_canonical`
+/// (requires a torso scale + the train crate as a dep) is a tracked §8 backlog
+/// item. The ADR-155 §1 table did not enumerate this live `trainer.rs` kernel —
+/// flagged here as a missed divergence.
 pub fn pck_at_threshold(pred: &[(f32, f32, f32)], target: &[(f32, f32, f32)], thr: f32) -> f32 {
     let n = pred.len().min(target.len());
     if n == 0 {
|
||||
@@ -340,6 +357,20 @@ pub fn oks_single(
|
||||
}
|
||||
|
||||
/// Mean OKS over multiple predictions (simplified mAP).
|
||||
///
|
||||
/// # ADR-155 §2.1 / §8 — FAKE-GOLD `area = 1.0` (flagged finding, not yet fixed)
|
||||
///
|
||||
/// This passes `area = 1.0` to [`oks_single`] — the **exact "fake Gold tier"
|
||||
/// pattern** ADR-155 §2.1 said it had closed in `ruview_metrics` / the train
|
||||
/// crate's `compute_oks`. With keypoints in a small coordinate range and
|
||||
/// `area = 1.0`, every squared distance is tiny relative to `2 σ² area`, so the
|
||||
/// exponential kernel returns ≈1.0 and the reported OKS is inflated regardless
|
||||
/// of pose quality. This live sensing-server kernel was **not** in the ADR-155
|
||||
/// §1 table and is still on the inflating `area = 1.0` path; it drives the live
|
||||
/// `best_oks` (`main.rs`). Until it is unified onto the canonical
|
||||
/// pose-extent-derived scale (tracked as an ADR-155 §8 backlog item), the value
|
||||
/// is relabelled `oks_map(area=1.0 proxy)` everywhere it surfaces and must NOT
|
||||
/// be read as a claim-grade COCO OKS.
|
||||
pub fn oks_map(preds: &[Vec<(f32, f32, f32)>], targets: &[Vec<(f32, f32, f32)>]) -> f32 {
|
||||
let n = preds.len().min(targets.len());
|
||||
if n == 0 {
|
||||
@@ -349,6 +380,7 @@ pub fn oks_map(preds: &[Vec<(f32, f32, f32)>], targets: &[Vec<(f32, f32, f32)>])
|
||||
.iter()
|
||||
.zip(targets.iter())
|
||||
.take(n)
|
||||
// area = 1.0 is the fake-Gold proxy (see fn doc / ADR-155 §8).
|
||||
.map(|(p, t)| oks_single(p, t, &COCO_KEYPOINT_SIGMAS, 1.0))
|
||||
.sum();
|
||||
s / n as f32
|
||||
@@ -1271,6 +1303,34 @@ mod tests {
|
||||
fn pck_all_wrong_is_0() {
|
||||
assert!(pck_at_threshold(&mkp(0.0), &mkp(100.0), 0.2) < 1e-6);
|
||||
}
|
||||
|
||||
/// ADR-155 §2.1 / §8: pin that the live `pck_at_threshold` is **raw-threshold**
|
||||
/// (no torso normalization) and is therefore a genuinely different metric
|
||||
/// from the canonical hip↔hip PCK — justifying RELABEL, not silent unify.
|
||||
///
|
||||
/// Two scenes with the **same absolute keypoint error** but **different torso
|
||||
/// sizes** must get the **same** raw PCK (because raw PCK ignores scale),
|
||||
/// whereas a torso-normalized PCK would score them differently. We assert the
|
||||
/// raw verdict is scale-invariant: a 0.15-unit error is "correct" at thr=0.2
|
||||
/// regardless of how far apart the hips are.
|
||||
#[test]
|
||||
fn pck_at_threshold_is_raw_unnormalized_not_canonical() {
|
||||
// Target: one keypoint at origin, vis=1. (Single-joint scene.)
|
||||
let target = vec![(0.0f32, 0.0f32, 1.0f32)];
|
||||
// Prediction off by exactly 0.15 in x.
|
||||
let pred = vec![(0.15f32, 0.0f32, 1.0f32)];
|
||||
|
||||
// Raw threshold 0.2: 0.15 ≤ 0.2 ⇒ correct ⇒ PCK 1.0, independent of any
|
||||
// torso scale (there is none in this kernel).
|
||||
let raw = pck_at_threshold(&pred, &target, 0.2);
|
||||
assert!((raw - 1.0).abs() < 1e-6, "raw PCK ignores scale; expected 1.0, got {raw}");
|
||||
|
||||
// Same absolute error, tighter raw threshold 0.1: 0.15 > 0.1 ⇒ wrong ⇒ 0.0.
|
||||
// The verdict is set purely by the absolute distance vs thr — the
|
||||
// signature of a raw (un-normalized) PCK, NOT canonical torso-relative PCK.
|
||||
let raw_tight = pck_at_threshold(&pred, &target, 0.1);
|
||||
assert!(raw_tight < 1e-6, "raw PCK is absolute-distance only; expected 0.0, got {raw_tight}");
|
||||
}
|
||||
#[test]
|
||||
fn oks_perfect_is_1() {
|
||||
assert!((oks_single(&mkp(0.0), &mkp(0.0), &COCO_KEYPOINT_SIGMAS, 1.0) - 1.0).abs() < 1e-6);
|
||||
|
||||
@@ -163,15 +163,26 @@ fn default_lora_epochs() -> u32 {
|
||||
}
|
||||
|
||||
/// Current training status (returned by `GET /api/v1/train/status`).
|
||||
///
|
||||
/// NOTE (ADR-155 §2.1): `val_pck` / `best_pck` carry the **torso-HEIGHT** PCK
|
||||
/// proxy from [`compute_pck_torso_height`] (pixel-space, nose→hip-midpoint),
|
||||
/// which is **deliberately distinct** from the canonical hip↔hip
|
||||
/// `wifi_densepose_train::pck_canonical`. The wire field names are kept for
|
||||
/// API/UI back-compat, but these are torso-height progress proxies, NOT the
|
||||
/// canonical reported-accuracy PCK@0.2 and must not be conflated with it.
|
||||
/// `val_oks` is a rough `0.88 × pck` proxy, not a COCO OKS.
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
pub struct TrainingStatus {
|
||||
pub active: bool,
|
||||
pub epoch: u32,
|
||||
pub total_epochs: u32,
|
||||
pub train_loss: f64,
|
||||
/// Torso-HEIGHT PCK@0.2 proxy (NOT canonical hip↔hip PCK — see struct doc).
|
||||
pub val_pck: f64,
|
||||
/// Rough OKS proxy (`0.88 × val_pck`), NOT a COCO OKS.
|
||||
pub val_oks: f64,
|
||||
pub lr: f64,
|
||||
/// Best torso-HEIGHT PCK@0.2 proxy seen so far (NOT canonical PCK).
|
||||
pub best_pck: f64,
|
||||
pub best_epoch: u32,
|
||||
pub patience_remaining: u32,
|
||||
@@ -199,13 +210,19 @@ impl Default for TrainingStatus {
|
||||
}
|
||||
|
||||
/// Progress update sent over WebSocket.
|
||||
///
|
||||
/// NOTE (ADR-155 §2.1): `val_pck`/`val_oks` are the torso-HEIGHT PCK proxy and
|
||||
/// its `0.88×` OKS proxy — NOT the canonical hip↔hip `pck_canonical`/COCO OKS.
|
||||
/// See [`TrainingStatus`] and [`compute_pck_torso_height`].
|
||||
#[derive(Debug, Clone, Serialize)]
|
||||
pub struct TrainingProgress {
|
||||
pub epoch: u32,
|
||||
pub batch: u32,
|
||||
pub total_batches: u32,
|
||||
pub train_loss: f64,
|
||||
/// Torso-HEIGHT PCK@0.2 proxy (NOT canonical hip↔hip PCK).
|
||||
pub val_pck: f64,
|
||||
/// Rough OKS proxy (`0.88 × val_pck`), NOT a COCO OKS.
|
||||
pub val_oks: f64,
|
||||
pub lr: f64,
|
||||
pub phase: String,
|
||||
@@ -789,19 +806,39 @@ fn compute_mse(predictions: &[Vec<f64>], targets: &[Vec<f64>]) -> f64 {
     total / (n * predictions[0].len().max(1) as f64)
 }

-/// Compute PCK@0.2 (Percentage of Correct Keypoints at threshold 0.2 of torso height).
+/// Compute **PCK_torso-height@`threshold`** — a metric DELIBERATELY DISTINCT
+/// from the canonical hip↔hip PCK (`wifi_densepose_train::pck_canonical`).
 ///
-/// Torso height is estimated as the distance between nose (kp 0) and the midpoint
-/// of the two hips (kps 11, 12).
+/// # Why this is `_torso_height`, not the canonical PCK (ADR-155 §2.1 / §8 — RESOLVED)
 ///
-/// NOTE (ADR-155 §Tier-1.1, DEFERRED backlog item): this is a *separate*,
-/// torso-HEIGHT-normalized implementation distinct from the canonical hip↔hip
-/// `wifi_densepose_train::metrics::pck_canonical`. It drives the live server's
-/// in-loop progress display and is NOT the reported-accuracy metric. Unifying
-/// it with the canonical definition is tracked as a deferred ADR-155 backlog
-/// item — left unchanged here to avoid destabilising the running training
-/// service and to keep this milestone scoped to the train/nn subsystem.
-fn compute_pck(predictions: &[Vec<f64>], targets: &[Vec<f64>], threshold_ratio: f64) -> f64 {
+/// ADR-155 unified the workspace's reported-accuracy PCK to ONE definition:
+/// **hip↔hip torso WIDTH**, on `[0,1]`-normalized `[17,2]` keypoints. This
+/// live-server function is **not** that metric and must never be conflated
+/// with it. It is genuinely different on three load-bearing axes:
+///
+/// 1. **Coordinate space.** It operates on **pixel-space** teacher targets on a
+///    640×480 canvas (`compute_teacher_targets`), not `[0,1]` MM-Fi coords —
+///    hence the `.max(50.0)` *pixel* torso floor below.
+/// 2. **Normalization axis.** It normalizes by torso **HEIGHT** (vertical
+///    nose→hip-midpoint distance), not canonical torso **WIDTH** (hip↔hip).
+///    Routing through `pck_canonical` would silently change which body axis
+///    sets the scale, altering every live number this drives.
+/// 3. **Layout.** It consumes `[17×3]`-flattened `Vec<Vec<f64>>` (x,y,z), not
+///    `ndarray::Array2<f32>`; `wifi-densepose-sensing-server` does not depend on
+///    `wifi-densepose-train` or `ndarray`.
+///
+/// Because the math is load-bearing (a running training service's progress
+/// display), ADR-155 Milestone-1 resolves the label collision by **relabelling**
+/// rather than forcing a false identity: the function and the metric it produces
+/// are named `_torso_height` everywhere they surface (this fn, the log line),
+/// and the `val_pck`/`best_pck` API fields document the divergence. The reported
+/// in-loop value is a torso-HEIGHT PCK proxy on heuristic teacher targets — it is
+/// NOT a claim-grade accuracy number and is NOT the canonical hip↔hip PCK@0.2.
+fn compute_pck_torso_height(
+    predictions: &[Vec<f64>],
+    targets: &[Vec<f64>],
+    threshold_ratio: f64,
+) -> f64 {
     if predictions.is_empty() {
         return 0.0;
     }
|
||||
@@ -1166,8 +1203,11 @@ async fn real_training_loop(

         let val_preds = forward(val_x, &weights, &bias, n_feat, N_TARGETS);
         let val_mse = compute_mse(&val_preds, val_y);
-        let val_pck = compute_pck(&val_preds, val_y, 0.2);
-        let val_oks = val_pck * 0.88; // approximate OKS from PCK
+        // torso-HEIGHT PCK proxy (NOT canonical hip↔hip PCK@0.2 — see
+        // compute_pck_torso_height / ADR-155 §2.1). Surfaced as `val_pck` for
+        // wire-format back-compat but is a torso-height proxy, not a claim.
+        let val_pck = compute_pck_torso_height(&val_preds, val_y, 0.2);
+        let val_oks = val_pck * 0.88; // rough OKS proxy from torso-height PCK (NOT canonical OKS)

         let val_progress = TrainingProgress {
             epoch,
|
||||
@@ -1224,14 +1264,17 @@ async fn real_training_loop(
             };
         }

+        // Logs label this `pck_torso_h@0.2` so it is never read as the canonical
+        // hip↔hip PCK@0.2 (ADR-155 §2.1). It is a torso-HEIGHT proxy on heuristic
+        // teacher targets, not a claim-grade accuracy number.
         info!(
-            "Epoch {epoch}/{total_epochs}: loss={train_loss:.6}, val_pck={val_pck:.4}, \
-             val_mse={val_mse:.4}, best_pck={best_pck:.4}@{best_epoch}, patience={patience_remaining}"
+            "Epoch {epoch}/{total_epochs}: loss={train_loss:.6}, pck_torso_h@0.2={val_pck:.4}, \
+             val_mse={val_mse:.4}, best_pck_torso_h={best_pck:.4}@{best_epoch}, patience={patience_remaining}"
         );

         // Early stopping.
         if patience_remaining == 0 {
-            info!("Early stopping at epoch {epoch} (best={best_epoch}, PCK={best_pck:.4})");
+            info!("Early stopping at epoch {epoch} (best={best_epoch}, pck_torso_h@0.2={best_pck:.4})");
             let stop_progress = TrainingProgress {
                 epoch,
                 batch: total_batches,
|
||||
@@ -1368,7 +1411,7 @@ async fn real_training_loop(
             error!("Failed to write trained model RVF: {e}");
         } else {
             info!(
-                "Trained model saved: {} ({} params, PCK={:.4})",
+                "Trained model saved: {} ({} params, pck_torso_h@0.2={:.4})",
                 rvf_path.display(),
                 total_params,
                 best_pck
|
||||
@@ -1969,13 +2012,69 @@ mod tests {
|
||||
tgt[37] = 100.0; // right hip y
|
||||
let preds = vec![tgt.clone()];
|
||||
let targets = vec![tgt];
|
||||
let pck = compute_pck(&preds, &targets, 0.2);
|
||||
let pck = compute_pck_torso_height(&preds, &targets, 0.2);
|
||||
assert!(
|
||||
(pck - 1.0).abs() < 1e-9,
|
||||
"Perfect prediction should give PCK=1.0"
|
||||
);
|
||||
}
|
||||
|
||||
/// ADR-155 §2.1 / §8 (RESOLVED): the live-server PCK is torso-HEIGHT
|
||||
/// normalized and is **labelled distinctly** from the canonical hip↔hip
|
||||
/// PCK. This test pins the *divergence*: the same prediction error gives a
|
||||
/// different verdict under torso-HEIGHT (nose→hip, vertical) than under an
|
||||
/// independent hip↔hip-WIDTH (horizontal) computation — proving the two are
|
||||
/// genuinely different metrics, so relabelling (not unifying) is correct.
|
||||
///
|
||||
/// Construction (pixel-space, one keypoint of interest = left_shoulder kp5):
|
||||
/// * nose(0).y = 0, hips(11,12).y = 100 ⇒ torso HEIGHT = 100.
|
||||
/// ⇒ torso-height threshold @0.2 = 20 px.
|
||||
/// * hips x: left(11).x = 0, right(12).x = 10 ⇒ torso WIDTH = 10.
|
||||
/// ⇒ a hip↔hip-WIDTH threshold @0.2 = 2 px.
|
||||
/// * Predicted kp5 is 5 px off in x from its target.
|
||||
/// - torso-HEIGHT verdict: 5 ≤ 20 ⇒ CORRECT.
|
||||
/// - hip↔hip-WIDTH verdict: 5 > 2 ⇒ WRONG.
|
||||
/// The two normalizers must disagree on this exact sample.
|
||||
#[test]
|
||||
fn torso_pck_is_labelled_distinctly_from_canonical() {
|
||||
// Targets: hips define both axes; kp5 is the joint under test.
|
||||
let mut tgt = vec![0.0; N_TARGETS];
|
||||
tgt[0 * 3] = 0.0; // nose x
|
||||
tgt[0 * 3 + 1] = 0.0; // nose y
|
||||
tgt[5 * 3] = 0.0; // l_shoulder x (target)
|
||||
tgt[5 * 3 + 1] = 50.0; // l_shoulder y
|
||||
tgt[11 * 3] = 0.0; // l_hip x
|
||||
tgt[11 * 3 + 1] = 100.0; // l_hip y
|
||||
tgt[12 * 3] = 10.0; // r_hip x ⇒ hip↔hip WIDTH = 10
|
||||
tgt[12 * 3 + 1] = 100.0; // r_hip y ⇒ torso HEIGHT (nose→hip) = 100
|
||||
|
||||
// Prediction: identical except kp5 x is +5 px off.
|
||||
let mut pred = tgt.clone();
|
||||
pred[5 * 3] = 5.0; // 5 px error in x on kp5
|
||||
|
||||
// Live-server torso-HEIGHT PCK: error 5 ≤ 0.2×100 = 20 ⇒ kp5 counts
|
||||
// correct, so ALL 17 joints correct ⇒ PCK = 1.0.
|
||||
let pck_height = compute_pck_torso_height(&[pred.clone()], &[tgt.clone()], 0.2);
|
||||
assert!(
|
||||
(pck_height - 1.0).abs() < 1e-9,
|
||||
"torso-HEIGHT PCK should pass kp5 (5px ≤ 20px), got {pck_height}"
|
||||
);
|
||||
|
||||
// Independent hip↔hip-WIDTH verdict on kp5: error 5 > 0.2×10 = 2 ⇒ kp5
|
||||
// is WRONG. This is the canonical normalization axis (width, not height).
|
||||
let hip_width = (tgt[12 * 3] - tgt[11 * 3]).abs(); // = 10
|
||||
let kp5_err = (pred[5 * 3] - tgt[5 * 3]).abs(); // = 5
|
||||
let width_threshold = 0.2 * hip_width; // = 2
|
||||
assert!(
|
||||
kp5_err > width_threshold,
|
||||
"hip↔hip-WIDTH should REJECT kp5 (5px > 2px) — the two metrics must disagree"
|
||||
);
|
||||
|
||||
// Therefore torso-HEIGHT PCK (1.0) ≠ the hip↔hip-WIDTH verdict on this
|
||||
// sample: the live `val_pck` is genuinely a different metric and is
|
||||
// correctly labelled `pck_torso_h`, never conflated with canonical PCK.
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn infer_pose_returns_17_keypoints() {
|
||||
let n_sub = 56;
|
||||
|
||||
@@ -203,6 +203,21 @@ pub struct PersonDetection {
|
||||
pub keypoints: Vec<PoseKeypoint>,
|
||||
pub bbox: BoundingBox,
|
||||
pub zone: String,
|
||||
/// Room-world position `[x, y, z]` (Observatory scene units / meters),
|
||||
/// derived from the strongest `signal_field` peak (issue #1050). `y` is
|
||||
/// `0.0` — the field is a floor-plane grid. Real field-peak readout, not
|
||||
/// calibrated triangulation. Defaults to `[0,0,0]`.
|
||||
#[serde(default)]
|
||||
pub position: [f64; 3],
|
||||
/// Motion magnitude on the Observatory's `0..100` scale, passed through
|
||||
/// from the measured `motion_band_power` (issue #1050).
|
||||
#[serde(default)]
|
||||
pub motion_score: f64,
|
||||
/// Coarse posture label when a real aggregate posture estimate exists,
|
||||
/// else `None`. Never fabricated; per-person skeletal pose remains gated
|
||||
/// on the pose model (ADR-079).
|
||||
#[serde(skip_serializing_if = "Option::is_none")]
|
||||
pub pose: Option<String>,
|
||||
}
|
||||
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
|
||||
@@ -71,6 +71,12 @@ harness = false
|
||||
name = "features_bench"
|
||||
harness = false
|
||||
|
||||
## ADR-154 Milestone-2: P2 "bench-first" perf items (§7.4 #5/#6/#7/#8/#20).
|
||||
## #8 (field_model eigendecompose) is measured only under the eigenvalue feature.
|
||||
[[bench]]
|
||||
name = "dsp_perf_bench"
|
||||
harness = false
|
||||
|
||||
## ADR-134: CIR estimator throughput benchmarks
|
||||
[[bench]]
|
||||
name = "cir_bench"
|
||||
|
||||
@@ -0,0 +1,353 @@
|
||||
//! ADR-154 Milestone-2 perf benchmarks (§7.4 P2 "bench-first" items).
|
||||
//!
|
||||
//! PROOF discipline (ADR-154 §0): every P2 item is **benched before touched**.
|
||||
//! A micro-opt is landed only if the bench proves the path hot; otherwise the
|
||||
//! committed bench *is* the result — a MEASURED-NULL that proves the rewrite was
|
||||
//! unnecessary (exactly the §5.x "already amortized" pattern). No speedup is
|
||||
//! claimed without a before/after number from here.
|
||||
//!
|
||||
//! Reproduce (compile-only):
|
||||
//! cargo bench -p wifi-densepose-signal --no-default-features \
|
||||
//! --bench dsp_perf_bench --no-run
|
||||
//!
|
||||
//! Reproduce (full run, writes target/criterion/ HTML):
|
||||
//! cargo bench -p wifi-densepose-signal --no-default-features --bench dsp_perf_bench
|
||||
//!
|
||||
//! Groups:
|
||||
//! * `multistatic_attention` (#5) — `node_attention_weights` at 2..8 nodes ×
|
||||
//! 56 subcarriers. Re-derives consensus/softmax each call; no scratch to
|
||||
//! reuse → expected MEASURED-NULL.
|
||||
//! * `tomography_reconstruct` (#6) — full ISTA solve. The two voxel buffers are
|
||||
//! allocated once per `reconstruct()` (then `.fill`-reused across
|
||||
//! iterations), so the per-solve alloc is 2×n_voxels vs an
|
||||
//! O(iters·links·voxels) compute → expected MEASURED-NULL.
|
||||
//! * `pose_kalman_update` (#7) — Kalman predict+update loop. The "gain
|
||||
//! matrices" are fixed-size **stack** arrays (`[[f32;3];6]`), not heap —
|
||||
//! nothing to reuse → expected MEASURED-NULL.
|
||||
//! * `spectrogram_multi_subcarrier` (#20) — `compute_multi_subcarrier_spectrogram`:
|
||||
//! fresh-planner-per-subcarrier (BEFORE) vs hoisted-plan (AFTER, shipped).
|
||||
//! The per-subcarrier FFT re-plan is the likely real win.
|
||||
//! * `field_model_occupancy` (#8, `eigenvalue` only) — per-call n×n
|
||||
//! eigendecomposition in `estimate_occupancy`. MEASUREMENT-ONLY: quantifies
|
||||
//! the recompute cost; incremental SVD is a sized future project, not a
|
||||
//! micro-fix.
|
||||
|
||||
use criterion::{black_box, criterion_group, criterion_main, BenchmarkId, Criterion, Throughput};
|
||||
use ndarray::Array2;
|
||||
use rustfft::FftPlanner;
|
||||
use std::f64::consts::PI;
|
||||
use std::time::Duration;
|
||||
|
||||
use wifi_densepose_signal::ruvsense::multistatic::node_attention_weights;
|
||||
use wifi_densepose_signal::ruvsense::pose_tracker::KeypointState;
|
||||
use wifi_densepose_signal::ruvsense::tomography::{
|
||||
LinkGeometry, Position3D, RfTomographer, TomographyConfig,
|
||||
};
|
||||
use wifi_densepose_signal::spectrogram::{
|
||||
compute_multi_subcarrier_spectrogram, compute_spectrogram, Spectrogram, SpectrogramConfig,
|
||||
WindowFunction,
|
||||
};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// #5 multistatic node_attention_weights
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
fn make_node_amplitudes(n_nodes: usize, n_sub: usize) -> Vec<Vec<f32>> {
|
||||
(0..n_nodes)
|
||||
.map(|n| {
|
||||
(0..n_sub)
|
||||
.map(|s| {
|
||||
let phase = (n as f32 * 0.31 + s as f32 * 0.07) % std::f32::consts::TAU;
|
||||
0.5 + 0.4 * phase.sin()
|
||||
})
|
||||
.collect()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn bench_multistatic_attention(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("multistatic_attention");
|
||||
group.measurement_time(Duration::from_secs(3));
|
||||
let n_sub = 56; // canonical-56 grid
|
||||
|
||||
for &n_nodes in &[2usize, 4, 8] {
|
||||
let owned = make_node_amplitudes(n_nodes, n_sub);
|
||||
let refs: Vec<&[f32]> = owned.iter().map(|v| v.as_slice()).collect();
|
||||
group.throughput(Throughput::Elements(1));
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("weights", n_nodes),
|
||||
&refs,
|
||||
|b, amplitudes| {
|
||||
b.iter(|| black_box(node_attention_weights(black_box(amplitudes), 1.0)));
|
||||
},
|
||||
);
|
||||
}
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// #6 tomography reconstruct (ISTA L1)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
fn make_tomographer(n_links: usize) -> (RfTomographer, Vec<f64>) {
|
||||
// A modest 8x8x4 grid (256 voxels), n_links TX/RX pairs around the box.
|
||||
let config = TomographyConfig {
|
||||
nx: 8,
|
||||
ny: 8,
|
||||
nz: 4,
|
||||
bounds: [0.0, 0.0, 0.0, 4.0, 4.0, 2.0],
|
||||
lambda: 0.01,
|
||||
max_iterations: 50,
|
||||
tolerance: 1e-6,
|
||||
min_links: 8,
|
||||
};
|
||||
let mut links = Vec::with_capacity(n_links);
|
||||
for i in 0..n_links {
|
||||
let t = i as f64 / n_links as f64;
|
||||
links.push(LinkGeometry {
|
||||
tx: Position3D {
|
||||
x: 4.0 * (t * PI).cos().abs(),
|
||||
y: 0.0,
|
||||
z: 1.0,
|
||||
},
|
||||
rx: Position3D {
|
||||
x: 4.0 * (t * PI).sin().abs(),
|
||||
y: 4.0,
|
||||
z: 1.0,
|
||||
},
|
||||
link_id: i,
|
||||
});
|
||||
}
|
||||
let tomo = RfTomographer::new(config, &links).unwrap();
|
||||
// Deterministic attenuations (a smooth sinusoid over the link index).
|
||||
let attenuations: Vec<f64> = (0..n_links)
|
||||
.map(|i| 0.1 + 0.05 * ((i as f64 * 0.3).sin()))
|
||||
.collect();
|
||||
(tomo, attenuations)
|
||||
}
|
||||
|
||||
fn bench_tomography_reconstruct(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("tomography_reconstruct");
|
||||
group.measurement_time(Duration::from_secs(4));
|
||||
|
||||
for &n_links in &[16usize, 32] {
|
||||
let (tomo, atten) = make_tomographer(n_links);
|
||||
group.throughput(Throughput::Elements(1));
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("solve", n_links),
|
||||
&(tomo, atten),
|
||||
|b, (tomo, atten)| {
|
||||
b.iter(|| black_box(tomo.reconstruct(black_box(atten)).unwrap().occupied_count));
|
||||
},
|
||||
);
|
||||
}
|
||||
group.finish();
|
||||
}
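For readers unfamiliar with the solver being benchmarked, the following is a minimal ISTA sketch for an L1-regularized least-squares problem. It is not the `RfTomographer` implementation: the dense row-major matrix, the fixed step size, and the absence of any non-negativity constraint are assumptions, and `ista` and `soft_threshold` are hypothetical names. It does mirror the allocation pattern described in the module docs, with the working buffers allocated once per solve and reused across iterations.

```rust
/// Soft-thresholding operator, the proximal map of the L1 norm.
fn soft_threshold(v: f64, t: f64) -> f64 {
    if v > t {
        v - t
    } else if v < -t {
        v + t
    } else {
        0.0
    }
}

/// Minimise 0.5 * ||A x - b||^2 + lambda * ||x||_1 with a fixed step.
/// `a` is a dense row-major matrix (one row per link); `step` should be
/// at most 1 / L, where L is the largest eigenvalue of A^T A.
fn ista(a: &[Vec<f64>], b: &[f64], lambda: f64, step: f64, iters: usize) -> Vec<f64> {
    let n = a.first().map_or(0, |row| row.len());
    let mut x = vec![0.0; n];
    // Buffers allocated once per solve and reused across iterations.
    let mut residual = vec![0.0; b.len()];
    let mut grad = vec![0.0; n];
    for _ in 0..iters {
        // residual = A x - b
        for (r, (row, &bi)) in residual.iter_mut().zip(a.iter().zip(b)) {
            *r = row.iter().zip(&x).map(|(aij, xj)| aij * xj).sum::<f64>() - bi;
        }
        // grad = A^T residual
        grad.fill(0.0);
        for (row, &r) in a.iter().zip(&residual) {
            for (g, aij) in grad.iter_mut().zip(row) {
                *g += aij * r;
            }
        }
        // Gradient step followed by the L1 proximal step.
        for (xj, g) in x.iter_mut().zip(&grad) {
            *xj = soft_threshold(*xj - step * g, step * lambda);
        }
    }
    x
}
```

With an identity matrix and `step = 1.0`, the solution is just the soft-thresholded measurement vector, which makes a convenient sanity check.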
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// #7 pose tracker Kalman update loop
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
fn bench_pose_kalman_update(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("pose_kalman_update");
|
||||
group.measurement_time(Duration::from_secs(3));
|
||||
|
||||
// 17 keypoints (COCO-17), N predict+update cycles — a realistic frame batch.
|
||||
for &n_updates in &[17usize, 170] {
|
||||
group.throughput(Throughput::Elements(n_updates as u64));
|
||||
group.bench_with_input(BenchmarkId::new("cycles", n_updates), &n_updates, |b, &n| {
|
||||
b.iter(|| {
|
||||
let mut acc = 0.0_f32;
|
||||
for k in 0..n {
|
||||
let mut state = KeypointState::new(
|
||||
(k as f32 * 0.1).sin(),
|
||||
(k as f32 * 0.2).cos(),
|
||||
1.0 + (k as f32 * 0.05),
|
||||
);
|
||||
state.predict(0.05, 0.5);
|
||||
let meas = [
|
||||
(k as f32 * 0.1).sin() + 0.01,
|
||||
(k as f32 * 0.2).cos() - 0.01,
|
||||
1.0 + (k as f32 * 0.05),
|
||||
];
|
||||
state.update(&meas, 0.1, 1.0);
|
||||
acc += state.state[0];
|
||||
}
|
||||
black_box(acc)
|
||||
});
|
||||
});
|
||||
}
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// #20 multi-subcarrier spectrogram: fresh-planner vs hoisted plan
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
fn make_csi_temporal(n_samples: usize, n_sc: usize) -> Array2<f64> {
|
||||
Array2::from_shape_fn((n_samples, n_sc), |(t, sc)| {
|
||||
let freq = 0.7 + sc as f64 * 0.13;
|
||||
(2.0 * PI * freq * t as f64 / 100.0).sin()
|
||||
+ 0.3 * (2.0 * PI * (freq * 2.1) * t as f64 / 100.0).cos()
|
||||
})
|
||||
}
|
||||
|
||||
/// BEFORE: re-plan the FFT inside `compute_spectrogram` for every subcarrier.
|
||||
/// Faithful transcription of the pre-ADR-154-M2 `compute_multi_subcarrier_spectrogram`.
|
||||
fn multi_fresh_planner(
|
||||
csi: &Array2<f64>,
|
||||
sample_rate: f64,
|
||||
config: &SpectrogramConfig,
|
||||
) -> Vec<Spectrogram> {
|
||||
let (_, n_sc) = csi.dim();
|
||||
(0..n_sc)
|
||||
.map(|sc| {
|
||||
let col: Vec<f64> = csi.column(sc).to_vec();
|
||||
// compute_spectrogram builds a fresh FftPlanner on every call.
|
||||
compute_spectrogram(&col, sample_rate, config).unwrap()
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
fn bench_spectrogram_multi_subcarrier(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("spectrogram_multi_subcarrier");
|
||||
group.measurement_time(Duration::from_secs(5));
|
||||
let sample_rate = 100.0;
|
||||
|
||||
// Realistic: 600 temporal samples (~6 s @ 100 Hz) across 56 subcarriers,
|
||||
// window 128 or 256. The hoist removes the per-subcarrier re-plans (n_sc of them).
|
||||
for &(n_samples, n_sc, window) in &[(600usize, 56usize, 128usize), (600, 56, 256)] {
|
||||
let csi = make_csi_temporal(n_samples, n_sc);
|
||||
let config = SpectrogramConfig {
|
||||
window_size: window,
|
||||
hop_size: 64,
|
||||
window_fn: WindowFunction::Hann,
|
||||
power: true,
|
||||
};
|
||||
group.throughput(Throughput::Elements(n_sc as u64));
|
||||
|
||||
// BEFORE: fresh planner per subcarrier.
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("fresh_planner", format!("sc{n_sc}_w{window}")),
|
||||
&config,
|
||||
|b, cfg| {
|
||||
b.iter(|| black_box(multi_fresh_planner(black_box(&csi), sample_rate, cfg).len()));
|
||||
},
|
||||
);
|
||||
|
||||
// AFTER: hoisted plan (the shipped `compute_multi_subcarrier_spectrogram`).
|
||||
group.bench_with_input(
|
||||
BenchmarkId::new("hoisted_plan", format!("sc{n_sc}_w{window}")),
|
||||
&config,
|
||||
|b, cfg| {
|
||||
b.iter(|| {
|
||||
black_box(
|
||||
compute_multi_subcarrier_spectrogram(black_box(&csi), sample_rate, cfg)
|
||||
.unwrap()
|
||||
.len(),
|
||||
)
|
||||
});
|
||||
},
|
||||
);
|
||||
}
|
||||
group.finish();
|
||||
}
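The BEFORE/AFTER pair above compares planning per subcarrier against planning once. The idea can be sketched without `rustfft` by letting a precomputed twiddle table stand in for the FFT plan. Everything here is a simplification under stated assumptions: `DftPlan` is a hypothetical type, the transform is a naive O(n²) DFT, and all columns are assumed to share one length.

```rust
use std::f64::consts::PI;

/// Stand-in for an FFT plan: the twiddle table for a length-`n` transform.
struct DftPlan {
    twiddles: Vec<(f64, f64)>, // (cos, sin) of -2*pi*k/n
}

impl DftPlan {
    fn new(n: usize) -> Self {
        let twiddles = (0..n)
            .map(|k| {
                let angle = -2.0 * PI * k as f64 / n as f64;
                (angle.cos(), angle.sin())
            })
            .collect();
        Self { twiddles }
    }

    /// Power spectrum |X[k]|^2 via a naive O(n^2) DFT.
    fn power(&self, x: &[f64]) -> Vec<f64> {
        let n = self.twiddles.len();
        assert_eq!(x.len(), n, "input length must match the plan");
        (0..n)
            .map(|k| {
                let (mut re, mut im) = (0.0, 0.0);
                for (t, &v) in x.iter().enumerate() {
                    let (c, s) = self.twiddles[(k * t) % n];
                    re += v * c;
                    im += v * s;
                }
                re * re + im * im
            })
            .collect()
    }
}

/// BEFORE: build a plan for every column.
fn spectra_fresh_plan(columns: &[Vec<f64>]) -> Vec<Vec<f64>> {
    columns
        .iter()
        .map(|col| DftPlan::new(col.len()).power(col))
        .collect()
}

/// AFTER: build the plan once and share it across columns.
fn spectra_hoisted_plan(columns: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let Some(first) = columns.first() else {
        return Vec::new();
    };
    let plan = DftPlan::new(first.len());
    columns.iter().map(|col| plan.power(col)).collect()
}
```

Both functions return identical spectra; only the number of plan constructions differs, which is the cost the `fresh_planner` and `hoisted_plan` benchmark IDs are meant to separate.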
|
||||
|
||||
// A standalone FftPlanner sanity micro-bench documenting the cost the hoist
|
||||
// removes: building+planning a length-N forward FFT once.
|
||||
fn bench_fft_plan_cost(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("fft_plan_cost");
|
||||
group.measurement_time(Duration::from_secs(2));
|
||||
for &n in &[128usize, 256] {
|
||||
group.bench_with_input(BenchmarkId::new("plan_forward", n), &n, |b, &n| {
|
||||
b.iter(|| {
|
||||
let mut planner = FftPlanner::<f64>::new();
|
||||
black_box(planner.plan_fft_forward(black_box(n)))
|
||||
});
|
||||
});
|
||||
}
|
||||
group.finish();
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// #8 field_model SVD/eigendecomposition recompute (MEASUREMENT-ONLY)
|
||||
// ---------------------------------------------------------------------------
|
||||
// `estimate_occupancy` builds an n×n covariance and eigendecomposes it on every
|
||||
// call (BLAS, `eigenvalue` feature). This bench quantifies that per-call cost so
|
||||
// ADR-154 §7.4 #8 can record a number; incremental SVD is a sized future item,
|
||||
// NOT attempted here.
|
||||
#[cfg(feature = "eigenvalue")]
|
||||
mod eig {
|
||||
use super::*;
|
||||
use wifi_densepose_signal::ruvsense::field_model::{FieldModel, FieldModelConfig};
|
||||
|
||||
fn calibrated_model(n_sub: usize, n_links: usize) -> FieldModel {
|
||||
let config = FieldModelConfig {
|
||||
n_subcarriers: n_sub,
|
||||
n_links,
|
||||
n_modes: 3,
|
||||
min_calibration_frames: 20,
|
||||
baseline_expiry_s: 86_400.0,
|
||||
};
|
||||
let mut model = FieldModel::new(config).unwrap();
|
||||
// Feed deterministic calibration frames: [n_links][n_sub] per observation.
|
||||
for f in 0..30 {
|
||||
let obs: Vec<Vec<f64>> = (0..n_links)
|
||||
.map(|l| {
|
||||
(0..n_sub)
|
||||
.map(|s| {
|
||||
0.5 + 0.3
|
||||
* ((f as f64 * 0.1 + l as f64 * 0.2 + s as f64 * 0.05).sin())
|
||||
})
|
||||
.collect()
|
||||
})
|
||||
.collect();
|
||||
model.feed_calibration(&obs).unwrap();
|
||||
}
|
||||
model.finalize_calibration(0, 0).unwrap();
|
||||
model
|
||||
}
|
||||
|
||||
pub fn bench_field_model_occupancy(c: &mut Criterion) {
|
||||
let mut group = c.benchmark_group("field_model_occupancy");
|
||||
group.measurement_time(Duration::from_secs(4));
|
||||
let n_sub = 56;
|
||||
let model = calibrated_model(n_sub, 4);
|
||||
// Sliding window of recent frames (50 ~ 2.5 s @ 20 Hz).
|
||||
let frames: Vec<Vec<f64>> = (0..50)
|
||||
.map(|t| {
|
||||
(0..n_sub)
|
||||
.map(|s| 0.5 + 0.3 * ((t as f64 * 0.15 + s as f64 * 0.07).sin()))
|
||||
.collect()
|
||||
})
|
||||
.collect();
|
||||
group.throughput(Throughput::Elements(1));
|
||||
group.bench_function(BenchmarkId::new("eigh", n_sub), |b| {
|
||||
b.iter(|| black_box(model.estimate_occupancy(black_box(&frames))));
|
||||
});
|
||||
group.finish();
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(feature = "eigenvalue")]
|
||||
criterion_group!(
|
||||
benches,
|
||||
bench_multistatic_attention,
|
||||
bench_tomography_reconstruct,
|
||||
bench_pose_kalman_update,
|
||||
bench_spectrogram_multi_subcarrier,
|
||||
bench_fft_plan_cost,
|
||||
eig::bench_field_model_occupancy,
|
||||
);
|
||||
|
||||
#[cfg(not(feature = "eigenvalue"))]
|
||||
criterion_group!(
|
||||
benches,
|
||||
bench_multistatic_attention,
|
||||
bench_tomography_reconstruct,
|
||||
bench_pose_kalman_update,
|
||||
bench_spectrogram_multi_subcarrier,
|
||||
bench_fft_plan_cost,
|
||||
);
|
||||
|
||||
criterion_main!(benches);
|
||||
@@ -197,4 +197,61 @@ mod tests {
|
||||
Err(CsiRatioError::LengthMismatch { .. })
|
||||
));
|
||||
}
|
||||
|
||||
// ADR-154 §7.4 #19: the CSI *ratio model*. The classic ratio is
|
||||
// `H_i[k] / H_j[k]`, which blows up (±inf / NaN) when `H_j[k]` approaches
|
||||
// zero, the case a `1e-12` division-guard epsilon is meant to guard against. This
|
||||
// module deliberately implements the ratio as the **conjugate product**
|
||||
// `H_i * conj(H_j)` (SpotFi/IndoTrack), which has *no division* and is
|
||||
// therefore finite even at and below the `1e-12` magnitude boundary. This
|
||||
// test pins that property: at the epsilon boundary the output is finite and
|
||||
// exactly the conjugate product (no silent NaN/inf from a hidden divide).
|
||||
#[test]
|
||||
fn ratio_finite_at_and_below_1e_12_epsilon() {
|
||||
let eps = 1e-12_f64;
|
||||
// Reference at unit magnitude; target swept across / under the epsilon
|
||||
// boundary a naive H_i/H_j division would need to guard.
|
||||
let h_ref = vec![
|
||||
Complex64::from_polar(1.0, 0.3),
|
||||
Complex64::from_polar(1.0, 0.3),
|
||||
Complex64::from_polar(1.0, 0.3),
|
||||
Complex64::from_polar(1.0, 0.3),
|
||||
];
|
||||
let h_target = vec![
|
||||
Complex64::new(eps, 0.0), // exactly at the epsilon
|
||||
Complex64::new(eps * 0.5, 0.0), // below the epsilon
|
||||
Complex64::new(0.0, eps), // imaginary axis, at epsilon
|
||||
Complex64::new(0.0, 0.0), // exact zero — div would be inf/NaN
|
||||
];
|
||||
|
||||
let ratio = conjugate_multiply(&h_ref, &h_target).unwrap();
|
||||
assert_eq!(ratio.len(), 4);
|
||||
for (k, r) in ratio.iter().enumerate() {
|
||||
assert!(
|
||||
r.re.is_finite() && r.im.is_finite(),
|
||||
"conjugate-multiply ratio must be finite at boundary k={k}: {r:?}"
|
||||
);
|
||||
}
|
||||
|
||||
// The near-zero / zero target collapses the product toward zero (the
|
||||
// physically correct "no measurable path" answer), never to inf/NaN.
|
||||
assert!(
|
||||
ratio[3].norm() == 0.0,
|
||||
"exact-zero target → zero product, got {}",
|
||||
ratio[3].norm()
|
||||
);
|
||||
// The at-epsilon entries equal the exact conjugate product (bit-exact).
|
||||
let expected0 = h_ref[0] * h_target[0].conj();
|
||||
assert_eq!(ratio[0].re.to_bits(), expected0.re.to_bits());
|
||||
assert_eq!(ratio[0].im.to_bits(), expected0.im.to_bits());
|
||||
|
||||
// The full pipeline (amplitude/phase extraction) is also finite here.
|
||||
let mut m = Array2::<Complex64>::zeros((1, 4));
|
||||
for (k, &v) in ratio.iter().enumerate() {
|
||||
m[[0, k]] = v;
|
||||
}
|
||||
let (amp, phase) = ratio_to_amplitude_phase(&m);
|
||||
assert!(amp.iter().all(|a| a.is_finite()));
|
||||
assert!(phase.iter().all(|p| p.is_finite()));
|
||||
}
|
||||
}
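The property tested above, that the conjugate product stays finite where a true division does not, can be seen in a few lines. This sketch uses a plain `(re, im)` tuple instead of `Complex64` so it needs no external crate; `naive_ratio` and `conjugate_product` are hypothetical names, not the module's API.

```rust
/// Minimal complex number as (re, im); a stand-in for `Complex64`.
type C = (f64, f64);

/// Classic ratio H_i / H_j. It divides by |H_j|^2, so the result is not
/// finite when H_j is exactly zero.
fn naive_ratio(hi: C, hj: C) -> C {
    let denom = hj.0 * hj.0 + hj.1 * hj.1;
    (
        (hi.0 * hj.0 + hi.1 * hj.1) / denom,
        (hi.1 * hj.0 - hi.0 * hj.1) / denom,
    )
}

/// Conjugate product H_i * conj(H_j). Same phase as the ratio wherever the
/// ratio is defined, but with no division at all.
fn conjugate_product(hi: C, hj: C) -> C {
    (hi.0 * hj.0 + hi.1 * hj.1, hi.1 * hj.0 - hi.0 * hj.1)
}
```

The two differ only by the positive real factor `1 / |H_j|^2`, so the phase information used downstream is preserved while the zero-denominator case disappears.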
|
||||
|
||||
@@ -43,11 +43,22 @@ pub struct HampelResult {
|
||||
/// MAD = 0.6745 * σ → σ = MAD / 0.6745 = 1.4826 * MAD
|
||||
const MAD_SCALE: f64 = 1.4826;
|
||||
|
||||
/// Zero-MAD epsilon (ADR-154 §7.4 — de-magicked). When the estimated σ falls
|
||||
/// at/below this, the window is treated as constant (degenerate MAD): any
|
||||
/// deviation larger than this same epsilon flags the sample as an outlier.
|
||||
/// Empirical guard against an all-equal window, not a tuned operating point.
|
||||
const ZERO_MAD_EPSILON: f64 = 1e-15;
|
||||
|
||||
/// Apply Hampel filter to a 1D signal.
|
||||
///
|
||||
/// For each sample, computes the median and MAD of the surrounding window.
|
||||
/// If the sample deviates from the median by more than `threshold * σ_est`,
|
||||
/// it is replaced with the median.
|
||||
///
|
||||
/// # Errors
|
||||
/// - [`HampelError::EmptySignal`] if `signal` is empty.
|
||||
/// - [`HampelError::InvalidWindow`] if `config.half_window == 0` (a window of
|
||||
/// one sample has zero MAD and cannot estimate σ).
|
||||
pub fn hampel_filter(signal: &[f64], config: &HampelConfig) -> Result<HampelResult, HampelError> {
|
||||
if signal.is_empty() {
|
||||
return Err(HampelError::EmptySignal);
|
||||
@@ -75,13 +86,13 @@ pub fn hampel_filter(signal: &[f64], config: &HampelConfig) -> Result<HampelResu
|
||||
sigma_estimates.push(sigma);
|
||||
|
||||
let deviation = (signal[i] - med).abs();
|
||||
let is_outlier = if sigma > 1e-15 {
|
||||
let is_outlier = if sigma > ZERO_MAD_EPSILON {
|
||||
// Normal case: compare deviation to threshold * sigma
|
||||
deviation > config.threshold * sigma
|
||||
} else {
|
||||
// Zero-MAD case: at least half the window equals the median (e.g. a constant window with one spike).
|
||||
// Any non-zero deviation from the median is an outlier.
|
||||
deviation > 1e-15
|
||||
deviation > ZERO_MAD_EPSILON
|
||||
};
|
||||
|
||||
if is_outlier {
|
||||
@@ -233,4 +244,48 @@ mod tests {
|
||||
Err(HampelError::EmptySignal)
|
||||
));
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked zero-MAD epsilon must equal the prior literal.
|
||||
#[test]
|
||||
fn zero_mad_epsilon_unchanged_from_literal() {
|
||||
assert_eq!(ZERO_MAD_EPSILON, 1e-15);
|
||||
assert_eq!(MAD_SCALE, 1.4826);
|
||||
}
|
||||
|
||||
/// `half_window == 0` is the documented invalid-window boundary; pins the
|
||||
/// previously-untested error path.
|
||||
#[test]
|
||||
fn test_zero_half_window_error() {
|
||||
let config = HampelConfig {
|
||||
half_window: 0,
|
||||
threshold: 3.0,
|
||||
};
|
||||
assert!(matches!(
|
||||
hampel_filter(&[1.0, 2.0, 3.0], &config),
|
||||
Err(HampelError::InvalidWindow)
|
||||
));
|
||||
// half_window = 1 is the smallest valid window.
|
||||
let ok = HampelConfig {
|
||||
half_window: 1,
|
||||
threshold: 3.0,
|
||||
};
|
||||
assert!(hampel_filter(&[1.0, 2.0, 3.0], &ok).is_ok());
|
||||
}
|
||||
|
||||
/// Zero-MAD (constant) window: a single deviating sample is flagged via the
|
||||
/// degenerate-MAD branch; a fully constant signal flags nothing.
|
||||
#[test]
|
||||
fn test_zero_mad_constant_window() {
|
||||
// Fully constant -> no outliers (deviation is 0, not > epsilon).
|
||||
let constant = vec![5.0; 20];
|
||||
let r = hampel_filter(&constant, &HampelConfig::default()).unwrap();
|
||||
assert!(r.outlier_indices.is_empty());
|
||||
// A single spike in an otherwise-constant signal -> flagged.
|
||||
let mut spiked = vec![5.0; 20];
|
||||
spiked[10] = 5.5;
|
||||
let r = hampel_filter(&spiked, &HampelConfig::default()).unwrap();
|
||||
assert!(r.outlier_indices.contains(&10));
|
||||
}
|
||||
}
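The two branches characterized by these tests can be sketched as a standalone function. This is a simplified illustration, not the crate's `hampel_filter`: it only returns outlier indices, truncates the window at the signal edges (an assumption), and uses hypothetical parameters `half_window = 3` and `threshold = 3.0` in the usage below, since the real `HampelConfig::default()` values are not shown in this diff.

```rust
const MAD_SCALE: f64 = 1.4826;
const ZERO_MAD_EPSILON: f64 = 1e-15;

/// Median of a non-empty slice; sorts the slice in place.
fn median(values: &mut [f64]) -> f64 {
    values.sort_by(|a, b| a.total_cmp(b));
    let n = values.len();
    if n % 2 == 1 {
        values[n / 2]
    } else {
        0.5 * (values[n / 2 - 1] + values[n / 2])
    }
}

/// Indices flagged by a Hampel test with a window truncated at the edges.
fn hampel_outliers(signal: &[f64], half_window: usize, threshold: f64) -> Vec<usize> {
    let mut outliers = Vec::new();
    for i in 0..signal.len() {
        let lo = i.saturating_sub(half_window);
        let hi = (i + half_window + 1).min(signal.len());
        let mut window = signal[lo..hi].to_vec();
        let med = median(&mut window);
        let mut devs: Vec<f64> = signal[lo..hi].iter().map(|v| (v - med).abs()).collect();
        let sigma = MAD_SCALE * median(&mut devs);
        let deviation = (signal[i] - med).abs();
        let is_outlier = if sigma > ZERO_MAD_EPSILON {
            // Normal case: compare against threshold * sigma.
            deviation > threshold * sigma
        } else {
            // Degenerate MAD: any deviation above the epsilon is an outlier.
            deviation > ZERO_MAD_EPSILON
        };
        if is_outlier {
            outliers.push(i);
        }
    }
    outliers
}
```

A constant signal flags nothing because every deviation is exactly zero, while a single spike in a constant signal takes the degenerate branch and is flagged on its own.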
|
||||
|
||||
@@ -8,6 +8,66 @@ use chrono::{DateTime, Utc};
|
||||
use serde::{Deserialize, Serialize};
|
||||
use std::collections::VecDeque;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Tuning constants (ADR-154 §7.4 #18 — de-magicked; EMPIRICAL DEFAULTS).
|
||||
//
|
||||
// These were previously bare literals inside the scoring functions. They are
|
||||
// lifted to named, documented consts so the implicit weighting becomes
|
||||
// explicit and a future retune is a visible, tested change. The values are
|
||||
// **unchanged** from the original literals — boundary/characterization tests
|
||||
// pin the current behaviour. None of these is calibrated against labelled
|
||||
// occupancy data; they are heuristic fusion weights.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Motion-score fusion weights when a Doppler component is present.
|
||||
/// `(variance, correlation, phase, doppler)` — sums to 1.0.
|
||||
const MOTION_WEIGHTS_WITH_DOPPLER: (f64, f64, f64, f64) = (0.3, 0.2, 0.2, 0.3);
|
||||
|
||||
/// Motion-score fusion weights with no Doppler component.
|
||||
/// `(variance, correlation, phase)` — sums to 1.0.
|
||||
const MOTION_WEIGHTS_NO_DOPPLER: (f64, f64, f64) = (0.4, 0.3, 0.3);
|
||||
|
||||
/// Doppler magnitude (Hz-ish, arbitrary units) that maps to a full-scale
|
||||
/// (1.0) Doppler motion component. Larger magnitudes saturate at 1.0.
|
||||
const DOPPLER_FULL_SCALE_MAGNITUDE: f64 = 100.0;
|
||||
|
||||
/// Reference variance that maps to a full-scale (1.0) heuristic motion score
|
||||
/// when no calibrated baseline is available. Empirical default.
|
||||
const VARIANCE_HEURISTIC_FULL_SCALE: f64 = 0.5;
|
||||
|
||||
/// Reference phase variance that maps to a full-scale (1.0) phase motion
|
||||
/// component. Empirical default.
|
||||
const PHASE_VARIANCE_FULL_SCALE: f64 = 0.5;
|
||||
|
||||
/// Blend weight between phase-variance and phase-coherence in the phase score.
|
||||
const PHASE_SCORE_VARIANCE_WEIGHT: f64 = 0.5;
|
||||
|
||||
/// Reference dynamic range that maps to a full-scale (1.0) amplitude-quality
|
||||
/// confidence indicator. Empirical default.
|
||||
const AMP_QUALITY_FULL_SCALE_RANGE: f64 = 2.0;
|
||||
|
||||
/// Confidence-indicator blend weights (`amplitude`, `phase`, `correlation`,
|
||||
/// `doppler`) — each is the fraction of total confidence that indicator
|
||||
/// contributes when present.
|
||||
const CONF_WEIGHT_AMPLITUDE: f64 = 0.3;
|
||||
const CONF_WEIGHT_PHASE: f64 = 0.3;
|
||||
const CONF_WEIGHT_CORRELATION: f64 = 0.2;
|
||||
const CONF_WEIGHT_DOPPLER: f64 = 0.2;
|
||||
|
||||
/// Minimum baseline floor added before dividing by the calibration baseline
|
||||
/// variance, preventing a divide-by-zero on an all-constant calibration.
|
||||
const BASELINE_VARIANCE_FLOOR: f64 = 1e-10;
|
||||
|
||||
/// Lower / upper clamp for the adaptive human-detection threshold
|
||||
/// (`mean + 1σ` of recent motion scores). Keeps the adaptive threshold inside
|
||||
/// a sane operating band. Empirical default.
|
||||
const ADAPTIVE_THRESHOLD_MIN: f64 = 0.3;
|
||||
const ADAPTIVE_THRESHOLD_MAX: f64 = 0.95;
|
||||
|
||||
/// Minimum history length before the adaptive threshold engages; below this
|
||||
/// the configured fixed threshold is used.
|
||||
const ADAPTIVE_THRESHOLD_MIN_HISTORY: usize = 10;
|
||||
|
||||
/// Motion score with component breakdown
|
||||
#[derive(Debug, Clone, Serialize, Deserialize)]
|
||||
pub struct MotionScore {
|
||||
@@ -37,12 +97,11 @@ impl MotionScore {
|
||||
) -> Self {
|
||||
// Calculate weighted total
|
||||
let total = if let Some(doppler) = doppler_component {
|
||||
0.3 * variance_component
|
||||
+ 0.2 * correlation_component
|
||||
+ 0.2 * phase_component
|
||||
+ 0.3 * doppler
|
||||
let (wv, wc, wp, wd) = MOTION_WEIGHTS_WITH_DOPPLER;
|
||||
wv * variance_component + wc * correlation_component + wp * phase_component + wd * doppler
|
||||
} else {
|
||||
0.4 * variance_component + 0.3 * correlation_component + 0.3 * phase_component
|
||||
let (wv, wc, wp) = MOTION_WEIGHTS_NO_DOPPLER;
|
||||
wv * variance_component + wc * correlation_component + wp * phase_component
|
||||
};
|
||||
|
||||
Self {
|
||||
@@ -304,7 +363,7 @@ impl MotionDetector {
|
||||
// Calculate Doppler-based score if available
|
||||
let doppler_score = features.doppler.as_ref().map(|d| {
|
||||
// Normalize Doppler magnitude to 0-1 range
|
||||
(d.mean_magnitude / 100.0).clamp(0.0, 1.0)
|
||||
(d.mean_magnitude / DOPPLER_FULL_SCALE_MAGNITUDE).clamp(0.0, 1.0)
|
||||
});
|
||||
|
||||
let motion_score = MotionScore::new(
|
||||
@@ -355,11 +414,11 @@ impl MotionDetector {
|
||||
|
||||
// Normalize using baseline if available
|
||||
if let Some(baseline) = self.baseline_variance {
|
||||
let ratio = mean_variance / (baseline + 1e-10);
|
||||
let ratio = mean_variance / (baseline + BASELINE_VARIANCE_FLOOR);
|
||||
(ratio - 1.0).max(0.0).tanh()
|
||||
} else {
|
||||
// Use heuristic normalization
|
||||
(mean_variance / 0.5).clamp(0.0, 1.0)
|
||||
(mean_variance / VARIANCE_HEURISTIC_FULL_SCALE).clamp(0.0, 1.0)
|
||||
}
|
||||
}
|
||||
|
||||
@@ -393,7 +452,9 @@ impl MotionDetector {
|
||||
let coherence_factor = 1.0 - phase.coherence.abs();
|
||||
|
||||
// Combine factors
|
||||
let score = 0.5 * (mean_variance / 0.5).clamp(0.0, 1.0) + 0.5 * coherence_factor;
|
||||
let w = PHASE_SCORE_VARIANCE_WEIGHT;
|
||||
let score = w * (mean_variance / PHASE_VARIANCE_FULL_SCALE).clamp(0.0, 1.0)
|
||||
+ (1.0 - w) * coherence_factor;
|
||||
score.clamp(0.0, 1.0)
|
||||
}
|
||||
|
||||
@@ -416,26 +477,27 @@ impl MotionDetector {
|
||||
let mut weight_sum = 0.0;
|
||||
|
||||
// Amplitude quality indicator
|
||||
let amp_quality = (features.amplitude.dynamic_range / 2.0).clamp(0.0, 1.0);
|
||||
confidence += amp_quality * 0.3;
|
||||
weight_sum += 0.3;
|
||||
let amp_quality =
|
||||
(features.amplitude.dynamic_range / AMP_QUALITY_FULL_SCALE_RANGE).clamp(0.0, 1.0);
|
||||
confidence += amp_quality * CONF_WEIGHT_AMPLITUDE;
|
||||
weight_sum += CONF_WEIGHT_AMPLITUDE;
|
||||
|
||||
// Phase coherence indicator
|
||||
let phase_quality = features.phase.coherence.abs();
|
||||
confidence += phase_quality * 0.3;
|
||||
weight_sum += 0.3;
|
||||
confidence += phase_quality * CONF_WEIGHT_PHASE;
|
||||
weight_sum += CONF_WEIGHT_PHASE;
|
||||
|
||||
// Correlation consistency indicator
|
||||
let corr_quality = (1.0 - features.correlation.correlation_spread).clamp(0.0, 1.0);
|
||||
confidence += corr_quality * 0.2;
|
||||
weight_sum += 0.2;
|
||||
confidence += corr_quality * CONF_WEIGHT_CORRELATION;
|
||||
weight_sum += CONF_WEIGHT_CORRELATION;
|
||||
|
||||
// Doppler quality if available
|
||||
if let Some(ref doppler) = features.doppler {
|
||||
let doppler_quality =
|
||||
(doppler.spread / doppler.mean_magnitude.max(1.0)).clamp(0.0, 1.0);
|
||||
confidence += (1.0 - doppler_quality) * 0.2;
|
||||
weight_sum += 0.2;
|
||||
confidence += (1.0 - doppler_quality) * CONF_WEIGHT_DOPPLER;
|
||||
weight_sum += CONF_WEIGHT_DOPPLER;
|
||||
}
|
||||
|
||||
if weight_sum > 0.0 {
|
||||
@@ -542,7 +604,7 @@ impl MotionDetector {
|
||||
|
||||
/// Calculate adaptive threshold based on recent history
|
||||
fn calculate_adaptive_threshold(&self) -> f64 {
|
||||
if self.motion_history.len() < 10 {
|
||||
if self.motion_history.len() < ADAPTIVE_THRESHOLD_MIN_HISTORY {
|
||||
return self.config.human_detection_threshold;
|
||||
}
|
||||
|
||||
@@ -555,7 +617,7 @@ impl MotionDetector {
|
||||
};
|
||||
|
||||
// Threshold is mean + 1 std deviation, clamped to reasonable range
|
||||
(mean + std).clamp(0.3, 0.95)
|
||||
(mean + std).clamp(ADAPTIVE_THRESHOLD_MIN, ADAPTIVE_THRESHOLD_MAX)
|
||||
}
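As a rough illustration of the rule this function implements, here is a standalone sketch over a plain slice of motion totals. The mean and deviation code is elided by the hunk, so the use of a population standard deviation below is an assumption, and `adaptive_threshold` is a hypothetical free function rather than the detector method.

```rust
const ADAPTIVE_THRESHOLD_MIN: f64 = 0.3;
const ADAPTIVE_THRESHOLD_MAX: f64 = 0.95;
const ADAPTIVE_THRESHOLD_MIN_HISTORY: usize = 10;

/// `mean + 1 sigma` of recent scores, clamped to the operating band.
/// Falls back to `fixed_threshold` until enough history has accumulated.
fn adaptive_threshold(history: &[f64], fixed_threshold: f64) -> f64 {
    if history.len() < ADAPTIVE_THRESHOLD_MIN_HISTORY {
        return fixed_threshold;
    }
    let n = history.len() as f64;
    let mean = history.iter().sum::<f64>() / n;
    // Population standard deviation; an assumption, the original may differ.
    let variance = history.iter().map(|s| (s - mean).powi(2)).sum::<f64>() / n;
    (mean + variance.sqrt()).clamp(ADAPTIVE_THRESHOLD_MIN, ADAPTIVE_THRESHOLD_MAX)
}
```

With nine entries the configured threshold is returned unchanged; from the tenth entry on, quiet histories are lifted to the 0.3 floor and saturated histories are capped at 0.95.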
|
||||
|
||||
/// Update baseline variance (for calibration)
|
||||
@@ -838,4 +900,127 @@ mod tests {
|
||||
let stats = detector.get_statistics();
|
||||
assert_eq!(stats.history_size, 10); // Should not exceed max
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4 #18: de-magic-constant + boundary characterization tests.
|
||||
// These pin CURRENT behaviour so a future retune is a visible, tested change.
|
||||
|
||||
/// The de-magicked tuning consts MUST equal the prior bare literals exactly
|
||||
/// (this milestone is cleanup — operating values are unchanged).
|
||||
#[test]
|
||||
fn motion_tuning_consts_unchanged_from_literals() {
|
||||
assert_eq!(MOTION_WEIGHTS_WITH_DOPPLER, (0.3, 0.2, 0.2, 0.3));
|
||||
assert_eq!(MOTION_WEIGHTS_NO_DOPPLER, (0.4, 0.3, 0.3));
|
||||
assert_eq!(DOPPLER_FULL_SCALE_MAGNITUDE, 100.0);
|
||||
assert_eq!(VARIANCE_HEURISTIC_FULL_SCALE, 0.5);
|
||||
assert_eq!(PHASE_VARIANCE_FULL_SCALE, 0.5);
|
||||
assert_eq!(PHASE_SCORE_VARIANCE_WEIGHT, 0.5);
|
||||
assert_eq!(AMP_QUALITY_FULL_SCALE_RANGE, 2.0);
|
||||
assert_eq!(CONF_WEIGHT_AMPLITUDE, 0.3);
|
||||
assert_eq!(CONF_WEIGHT_PHASE, 0.3);
|
||||
assert_eq!(CONF_WEIGHT_CORRELATION, 0.2);
|
||||
assert_eq!(CONF_WEIGHT_DOPPLER, 0.2);
|
||||
assert_eq!(BASELINE_VARIANCE_FLOOR, 1e-10);
|
||||
assert_eq!(ADAPTIVE_THRESHOLD_MIN, 0.3);
|
||||
assert_eq!(ADAPTIVE_THRESHOLD_MAX, 0.95);
|
||||
assert_eq!(ADAPTIVE_THRESHOLD_MIN_HISTORY, 10);
|
||||
// Fusion weights are a convex combination (sum to 1.0).
|
||||
let (wv, wc, wp, wd) = MOTION_WEIGHTS_WITH_DOPPLER;
|
||||
assert!((wv + wc + wp + wd - 1.0).abs() < 1e-12);
|
||||
let (wv, wc, wp) = MOTION_WEIGHTS_NO_DOPPLER;
|
||||
assert!((wv + wc + wp - 1.0).abs() < 1e-12);
|
||||
}
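A short worked example of why the convexity assertions matter: because each weight set sums to 1.0, equal component scores fuse to that same score whether or not Doppler is present. The sketch below repeats the two weight tuples and uses a hypothetical free function `fused_total`; whether the real `MotionScore::new` clamps or post-processes the total is not shown in this diff and is not modelled here.

```rust
const MOTION_WEIGHTS_WITH_DOPPLER: (f64, f64, f64, f64) = (0.3, 0.2, 0.2, 0.3);
const MOTION_WEIGHTS_NO_DOPPLER: (f64, f64, f64) = (0.4, 0.3, 0.3);

/// Weighted fusion of the component scores, with or without Doppler.
fn fused_total(variance: f64, correlation: f64, phase: f64, doppler: Option<f64>) -> f64 {
    match doppler {
        Some(d) => {
            let (wv, wc, wp, wd) = MOTION_WEIGHTS_WITH_DOPPLER;
            wv * variance + wc * correlation + wp * phase + wd * d
        }
        None => {
            let (wv, wc, wp) = MOTION_WEIGHTS_NO_DOPPLER;
            wv * variance + wc * correlation + wp * phase
        }
    }
}
```

For instance, `fused_total(0.5, 0.5, 0.5, None)` and `fused_total(0.5, 0.5, 0.5, Some(0.5))` both come out at 0.5 up to rounding, while a lone full-scale Doppler component contributes exactly its 0.3 weight.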
|
||||
|
||||
/// Doppler component saturates at full scale (`/100.0` then clamp(0,1)).
|
||||
/// Pins behaviour at/just-below/just-above the full-scale magnitude.
|
||||
#[test]
|
||||
fn doppler_component_saturates_at_full_scale() {
|
||||
use crate::features::DopplerFeatures;
|
||||
use ndarray::Array1;
|
||||
let make = |mag: f64| DopplerFeatures {
|
||||
shifts: Array1::zeros(1),
|
||||
peak_frequency: 0.0,
|
||||
mean_magnitude: mag,
|
||||
spread: 0.0,
|
||||
};
|
||||
let detector = MotionDetector::default_config();
|
||||
// just below full scale -> < 1.0
|
||||
let mut features = create_test_features(0.5);
|
||||
features.doppler = Some(make(DOPPLER_FULL_SCALE_MAGNITUDE - 1.0));
|
||||
let below = detector.analyze_motion(&features).score.doppler_component.unwrap();
|
||||
assert!(below < 1.0 && below > 0.98);
|
||||
// exactly full scale -> 1.0
|
||||
features.doppler = Some(make(DOPPLER_FULL_SCALE_MAGNITUDE));
|
||||
let at = detector.analyze_motion(&features).score.doppler_component.unwrap();
|
||||
assert_eq!(at, 1.0);
|
||||
// above full scale -> clamped to 1.0
|
||||
features.doppler = Some(make(DOPPLER_FULL_SCALE_MAGNITUDE * 10.0));
|
||||
let above = detector.analyze_motion(&features).score.doppler_component.unwrap();
|
||||
assert_eq!(above, 1.0);
|
||||
}
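The saturation behaviour pinned above reduces to a divide followed by a clamp. The sketch below isolates it in a hypothetical helper, `doppler_component`, and also notes one edge the test does not cover: `f64::clamp` returns NaN for a NaN input, so a NaN magnitude is not saturated.

```rust
const DOPPLER_FULL_SCALE_MAGNITUDE: f64 = 100.0;

/// Map a mean Doppler magnitude onto 0..=1, saturating at full scale.
/// A NaN magnitude stays NaN, because `f64::clamp` passes NaN through.
fn doppler_component(mean_magnitude: f64) -> f64 {
    (mean_magnitude / DOPPLER_FULL_SCALE_MAGNITUDE).clamp(0.0, 1.0)
}
```

If NaN magnitudes can reach this point, a caller may want to check `is_nan()` before fusing the component; whether upstream feature extraction already guarantees finite values is not visible in this diff.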
|
||||
|
||||
/// `calculate_correlation_score` returns 0.0 for n<2 (the small-matrix
|
||||
/// guard) and a finite, clamped value for n>=2. Pins the n=1 boundary.
|
||||
#[test]
|
||||
fn correlation_score_zero_below_n2_boundary() {
|
||||
use crate::features::CorrelationFeatures;
|
||||
use ndarray::Array2;
|
||||
let detector = MotionDetector::default_config();
|
||||
let one = CorrelationFeatures {
|
||||
matrix: Array2::from_elem((1, 1), 1.0),
|
||||
mean_correlation: 0.0,
|
||||
max_correlation: 0.0,
|
||||
correlation_spread: 0.0,
|
||||
};
|
||||
assert_eq!(detector.calculate_correlation_score(&one), 0.0);
|
||||
let two = CorrelationFeatures {
|
||||
matrix: Array2::from_shape_fn((2, 2), |(i, j)| if i == j { 1.0 } else { 0.0 }),
|
||||
mean_correlation: 0.0,
|
||||
max_correlation: 0.0,
|
||||
correlation_spread: 0.0,
|
||||
};
|
||||
let s = detector.calculate_correlation_score(&two);
|
||||
assert!(s.is_finite() && (0.0..=1.0).contains(&s));
|
||||
}
|
||||
|
||||
/// `calculate_temporal_variance` returns 0.0 with fewer than 2 history
|
||||
/// entries, finite otherwise. Pins the len<2 boundary.
|
||||
#[test]
|
||||
fn temporal_variance_zero_below_two_history() {
|
||||
let mut detector = MotionDetector::default_config();
|
||||
assert_eq!(detector.calculate_temporal_variance(), 0.0); // 0 entries
|
||||
detector
|
||||
.motion_history
|
||||
.push_back(MotionScore::new(0.5, 0.5, 0.5, None));
|
||||
assert_eq!(detector.calculate_temporal_variance(), 0.0); // 1 entry
|
||||
detector
|
||||
.motion_history
|
||||
.push_back(MotionScore::new(0.1, 0.1, 0.1, None));
|
||||
assert!(detector.calculate_temporal_variance() > 0.0); // 2 entries
|
||||
}
|
||||
|
||||
/// The adaptive threshold engages only at/after `ADAPTIVE_THRESHOLD_MIN_HISTORY`
|
||||
/// history entries; below it falls back to the configured fixed threshold.
|
||||
/// Pins the history=9 (fixed) vs history=10 (adaptive) boundary.
|
||||
#[test]
|
||||
fn adaptive_threshold_engages_at_history_boundary() {
|
||||
let config = MotionDetectorConfig::builder()
|
||||
.adaptive_threshold(true)
|
||||
.human_detection_threshold(0.8)
|
||||
.history_size(50)
|
||||
.build();
|
||||
let mut detector = MotionDetector::new(config);
|
||||
// Push exactly 9 entries: still uses the fixed configured threshold.
|
||||
for _ in 0..(ADAPTIVE_THRESHOLD_MIN_HISTORY - 1) {
|
||||
detector
|
||||
.motion_history
|
||||
.push_back(MotionScore::new(0.5, 0.5, 0.5, None));
|
||||
}
|
||||
assert_eq!(detector.calculate_adaptive_threshold(), 0.8);
|
||||
// 10th entry: adaptive band kicks in, clamped to [MIN, MAX].
|
||||
detector
|
||||
.motion_history
|
||||
.push_back(MotionScore::new(0.5, 0.5, 0.5, None));
|
||||
let t = detector.calculate_adaptive_threshold();
|
||||
assert!((ADAPTIVE_THRESHOLD_MIN..=ADAPTIVE_THRESHOLD_MAX).contains(&t));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -72,6 +72,44 @@ impl Default for AdversarialConfig {
|
||||
}
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Detection tuning constants (ADR-154 §7.4 #13 — DATA-GATED)
|
||||
// ---------------------------------------------------------------------------
|
||||
//
|
||||
// These were bare numeric literals buried in `check`/`check_consistency`. They
|
||||
// are EMPIRICAL DEFAULTS, not calibrated operating points — setting defensible
|
||||
// values needs labelled spoofed/clean CSI (the Wi-Spoof benchmark, §6.2/§7.3).
|
||||
// De-magicking + the boundary tests below make any future data-driven retune a
|
||||
// visible, tested change. The VALUES here are unchanged from the pre-ADR-154
|
||||
// behaviour; only their names and the pinning tests are new.
|
||||
|
||||
/// Gini coefficient above which the energy distribution is flagged as a
|
||||
/// `FieldModelViolation` (one link hogging the energy → likely injection).
|
||||
/// EMPIRICAL DEFAULT pending labelled calibration.
|
||||
const FIELD_MODEL_GINI_VIOLATION: f64 = 0.8;
|
||||
|
||||
/// Energy-conservation ratio (total / expected-for-body-count) above which the
|
||||
/// frame is flagged as an `EnergyViolation` (too much energy for the occupancy).
|
||||
/// EMPIRICAL DEFAULT pending labelled calibration.
|
||||
const ENERGY_RATIO_HIGH_VIOLATION: f64 = 2.0;
|
||||
|
||||
/// Energy-conservation ratio below which an *occupied* frame is flagged as an
|
||||
/// `EnergyViolation` (too little energy for a claimed body — possible dropout
|
||||
/// or masking). Only applied when `n_bodies > 0`. EMPIRICAL DEFAULT.
|
||||
const ENERGY_RATIO_LOW_VIOLATION: f64 = 0.1;
|
||||
|
||||
/// Fraction of the mean per-link energy a link must exceed to count as
|
||||
/// "active" in the multi-link consistency check. EMPIRICAL DEFAULT.
|
||||
const CONSISTENCY_ACTIVE_FRACTION_OF_MEAN: f64 = 0.1;
|
||||
|
||||
/// Weights of the four checks in the aggregate anomaly score (sum to 1.0).
|
||||
/// EMPIRICAL DEFAULTS: the other three checks get 0.2 each; consistency gets 0.4
|
||||
/// because single-link injection is the primary threat model (ADR-030 Tier 7).
|
||||
const SCORE_W_CONSISTENCY: f64 = 0.4;
|
||||
const SCORE_W_FIELD_MODEL: f64 = 0.2;
|
||||
const SCORE_W_TEMPORAL: f64 = 0.2;
|
||||
const SCORE_W_ENERGY: f64 = 0.2;
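As a rough illustration of how these four weights combine, the sketch below lifts the weighted sum used in `check` into a free function. The function name `anomaly_score` and its flat parameter list are hypothetical; the arithmetic follows the expression in this diff.

```rust
const SCORE_W_CONSISTENCY: f64 = 0.4;
const SCORE_W_FIELD_MODEL: f64 = 0.2;
const SCORE_W_TEMPORAL: f64 = 0.2;
const SCORE_W_ENERGY: f64 = 0.2;

/// Hypothetical free-function form of the aggregate score.
/// Each term is normalised to [0, 1] before weighting, so the convex
/// weights keep the sum in [0, 1]; the final clamp guards rounding.
fn anomaly_score(
    consistency: f64,
    field_residual: f64,
    temporal: f64,
    max_temporal_discontinuity: f64,
    energy_ratio: f64,
) -> f64 {
    ((1.0 - consistency) * SCORE_W_CONSISTENCY
        + field_residual * SCORE_W_FIELD_MODEL
        + (temporal / max_temporal_discontinuity).min(1.0) * SCORE_W_TEMPORAL
        + ((energy_ratio - 1.0).abs() / 2.0).min(1.0) * SCORE_W_ENERGY)
        .clamp(0.0, 1.0)
}
```

A fully consistent frame with no residual, no temporal jump, and an energy ratio of exactly 1.0 scores 0.0, while a frame that fails only the consistency check scores 0.4.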
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Detection results
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -250,13 +288,15 @@ impl AdversarialDetector {
|
||||
if consistency < self.config.consistency_threshold {
|
||||
violations.push(AnomalyType::SingleLinkInjection);
|
||||
}
|
||||
if field_residual > 0.8 {
|
||||
if field_residual > FIELD_MODEL_GINI_VIOLATION {
|
||||
violations.push(AnomalyType::FieldModelViolation);
|
||||
}
|
||||
if temporal > self.config.max_temporal_discontinuity {
|
||||
violations.push(AnomalyType::TemporalDiscontinuity);
|
||||
}
|
||||
if energy_ratio > 2.0 || (n_bodies > 0 && energy_ratio < 0.1) {
|
||||
if energy_ratio > ENERGY_RATIO_HIGH_VIOLATION
|
||||
|| (n_bodies > 0 && energy_ratio < ENERGY_RATIO_LOW_VIOLATION)
|
||||
{
|
||||
violations.push(AnomalyType::EnergyViolation);
|
||||
}
|
||||
|
||||
@@ -268,10 +308,10 @@ impl AdversarialDetector {
|
||||
};
|
||||
|
||||
// Score: weighted combination
|
||||
let anomaly_score = ((1.0 - consistency) * 0.4
|
||||
+ field_residual * 0.2
|
||||
+ (temporal / self.config.max_temporal_discontinuity).min(1.0) * 0.2
|
||||
+ ((energy_ratio - 1.0).abs() / 2.0).min(1.0) * 0.2)
|
||||
let anomaly_score = ((1.0 - consistency) * SCORE_W_CONSISTENCY
|
||||
+ field_residual * SCORE_W_FIELD_MODEL
|
||||
+ (temporal / self.config.max_temporal_discontinuity).min(1.0) * SCORE_W_TEMPORAL
|
||||
+ ((energy_ratio - 1.0).abs() / 2.0).min(1.0) * SCORE_W_ENERGY)
|
||||
.clamp(0.0, 1.0);
|
||||
|
||||
// Find affected links (highest single-link energy ratio)
|
||||
@@ -304,7 +344,8 @@ impl AdversarialDetector {
|
||||
}
|
||||
|
||||
let mean = total / energies.len() as f64;
|
||||
let threshold = mean * 0.1; // link must have at least 10% of mean energy
|
||||
// a link counts as active only if it exceeds CONSISTENCY_ACTIVE_FRACTION_OF_MEAN × mean energy
|
||||
let threshold = mean * CONSISTENCY_ACTIVE_FRACTION_OF_MEAN;
|
||||
|
||||
let active_count = energies.iter().filter(|&&e| e > threshold).count();
|
||||
active_count as f64 / energies.len() as f64
|
||||
@@ -641,4 +682,118 @@ mod tests {
|
||||
gini
|
||||
);
|
||||
}
|
||||
|
||||
// ── ADR-154 §7.4 #13: threshold characterization (DATA-GATED) ───────────
|
||||
// These pin the CURRENT empirical threshold values so a future labelled-data
|
||||
// retune is a visible, tested change. They do NOT assert the values are
|
||||
// "correct" — only that the named consts equal the de-magicked literals and
|
||||
// that the decision boundaries sit exactly where the old bare literals put
|
||||
// them.
|
||||
|
||||
/// The named consts must equal the original bare literals (no value drift).
|
||||
#[test]
|
||||
fn tuning_consts_unchanged_from_literals() {
|
||||
assert_eq!(FIELD_MODEL_GINI_VIOLATION, 0.8);
|
||||
assert_eq!(ENERGY_RATIO_HIGH_VIOLATION, 2.0);
|
||||
assert_eq!(ENERGY_RATIO_LOW_VIOLATION, 0.1);
|
||||
assert_eq!(CONSISTENCY_ACTIVE_FRACTION_OF_MEAN, 0.1);
|
||||
assert!(
|
||||
(SCORE_W_CONSISTENCY + SCORE_W_FIELD_MODEL + SCORE_W_TEMPORAL + SCORE_W_ENERGY - 1.0)
|
||||
.abs()
|
||||
< 1e-12,
|
||||
"score weights must sum to 1.0"
|
||||
);
|
||||
}
|
||||
|
||||
/// Energy-ratio HIGH boundary: the `> ENERGY_RATIO_HIGH_VIOLATION` decision
|
||||
/// flips just above 2.0. With max_energy_per_body=10 and n_bodies=1, total
|
||||
/// energy E gives ratio E/10, so E=20 is the boundary. Use a clean uniform
|
||||
/// distribution so ONLY the energy check can fire.
|
||||
#[test]
|
||||
fn energy_ratio_high_boundary() {
|
||||
let mk = |per_link: f64| {
|
||||
// 6 links, uniform → consistency=1, gini≈0, temporal=0 (first frame).
|
||||
vec![per_link; 6]
|
||||
};
|
||||
// ratio just BELOW 2.0 (total=19.2 → ratio 1.92): no energy violation.
|
||||
let mut det = AdversarialDetector::new(default_config()).unwrap();
|
||||
let below = det.check(&mk(3.2), 1, 0).unwrap(); // 6*3.2=19.2
|
||||
assert!(
|
||||
!below.anomaly_detected,
|
||||
"ratio 1.92 (<2.0) must not flag energy violation: {:?}",
|
||||
below.anomaly_type
|
||||
);
|
||||
// ratio just ABOVE 2.0 (total=21.0 → ratio 2.1): energy violation fires.
|
||||
let mut det2 = AdversarialDetector::new(default_config()).unwrap();
|
||||
let above = det2.check(&mk(3.5), 1, 0).unwrap(); // 6*3.5=21.0
|
||||
assert!(
|
||||
above.anomaly_detected,
|
||||
"ratio 2.1 (>2.0) must flag an anomaly"
|
||||
);
|
||||
}
|
||||
|
||||
/// Energy-ratio LOW boundary: an occupied frame with ratio < 0.1 flags an
|
||||
/// `EnergyViolation`. With n_bodies=1, max_energy_per_body=10, boundary
|
||||
/// total = 1.0 (ratio 0.1). Below it (e.g. total 0.6 → ratio 0.06) must flag.
|
||||
#[test]
|
||||
fn energy_ratio_low_boundary() {
|
||||
// just ABOVE 0.1 (total 1.2 → ratio 0.12): no energy violation.
|
||||
let mut det = AdversarialDetector::new(default_config()).unwrap();
|
||||
let above = det.check(&vec![0.2; 6], 1, 0).unwrap(); // 6*0.2=1.2
|
||||
assert!(
|
||||
!above.anomaly_detected,
|
||||
"ratio 0.12 (>0.1) must not flag: {:?}",
|
||||
above.anomaly_type
|
||||
);
|
||||
// just BELOW 0.1 (total 0.6 → ratio 0.06): energy violation fires.
|
||||
let mut det2 = AdversarialDetector::new(default_config()).unwrap();
|
||||
let below = det2.check(&vec![0.1; 6], 1, 0).unwrap(); // 6*0.1=0.6
|
||||
assert!(
|
||||
below.anomaly_detected,
|
||||
"ratio 0.06 (<0.1) must flag an energy anomaly"
|
||||
);
|
||||
}
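The two energy boundaries can be restated as a single predicate. This is a minimal sketch under the assumption stated in the test comments, namely that the ratio is total energy divided by the expected energy for the body count (10.0 for one body here). The helper `energy_violation` is hypothetical and takes the expected energy directly instead of deriving it from a config.

```rust
const ENERGY_RATIO_HIGH_VIOLATION: f64 = 2.0;
const ENERGY_RATIO_LOW_VIOLATION: f64 = 0.1;

/// Hypothetical predicate mirroring the condition in `check`:
/// too much energy always flags; too little flags only when occupied.
fn energy_violation(total_energy: f64, expected_energy: f64, n_bodies: usize) -> bool {
    let energy_ratio = total_energy / expected_energy;
    energy_ratio > ENERGY_RATIO_HIGH_VIOLATION
        || (n_bodies > 0 && energy_ratio < ENERGY_RATIO_LOW_VIOLATION)
}
```

Both comparisons are strict, so a ratio of exactly 2.0 or exactly 0.1 does not flag; the tests probe either side of each boundary rather than the boundary value itself.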
|
||||
|
||||
/// Field-model Gini boundary: `check_field_model` > 0.8 → FieldModelViolation.
|
||||
/// We bracket the 0.8 const with the two extremes (one-hot above it, uniform
/// below it) rather than locating the exact crossing point.
|
||||
#[test]
|
||||
fn field_model_gini_boundary() {
|
||||
let det = AdversarialDetector::new(default_config()).unwrap();
|
||||
// Fully concentrated (one-hot) over 6 links → Gini = (n-1)/n = 0.833 > 0.8.
|
||||
let concentrated = det.check_field_model(&[6.0, 0.0, 0.0, 0.0, 0.0, 0.0], 6.0);
|
||||
assert!(
|
||||
concentrated > FIELD_MODEL_GINI_VIOLATION,
|
||||
"one-hot Gini {concentrated} must exceed the 0.8 violation threshold"
|
||||
);
|
||||
// Uniform → Gini ≈ 0 < 0.8.
|
||||
let uniform = det.check_field_model(&[1.0; 6], 6.0);
|
||||
assert!(
|
||||
uniform < FIELD_MODEL_GINI_VIOLATION,
|
||||
"uniform Gini {uniform} must be below the 0.8 threshold"
|
||||
);
|
||||
}
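The `(n-1)/n` figure quoted for the one-hot case follows from the standard Gini coefficient. The sketch below uses the textbook sorted-rank formula; it is an assumption that `check_field_model` computes the same quantity, and the function name `gini` is illustrative only.

```rust
/// Textbook Gini coefficient via the sorted-rank formula:
///   G = 2 * sum_i(i * x_(i)) / (n * sum(x)) - (n + 1) / n
/// with x_(i) sorted ascending and i starting at 1.
/// Returns 0.0 for empty or all-zero input (an assumed convention).
fn gini(values: &[f64]) -> f64 {
    let n = values.len();
    let total: f64 = values.iter().sum();
    if n == 0 || total <= 0.0 {
        return 0.0;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let weighted: f64 = sorted
        .iter()
        .enumerate()
        .map(|(i, x)| (i as f64 + 1.0) * x)
        .sum();
    let nf = n as f64;
    2.0 * weighted / (nf * total) - (nf + 1.0) / nf
}
```

For six links with all energy on one link this gives 5/6 ≈ 0.833, just above the 0.8 threshold. With five or fewer links the one-hot maximum is at most 0.8, so under this formula the strict `>` comparison could not fire.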
|
||||
|
||||
/// Consistency active-fraction boundary: a link counts as "active" iff its
|
||||
/// energy > 0.1·mean. Pin that exactly one sub-threshold link is excluded.
|
||||
#[test]
|
||||
fn consistency_active_fraction_boundary() {
|
||||
let det = AdversarialDetector::new(default_config()).unwrap();
|
||||
// 5 links at 1.0, one link at just BELOW 0.1·mean.
|
||||
// mean over 6 = (5.0 + x)/6; for x small, threshold ≈ 0.1*5/6 ≈ 0.083.
|
||||
let mut e = vec![1.0; 6];
|
||||
e[5] = 0.05; // below ~0.083 threshold → excluded
|
||||
let c_excluded = det.check_consistency(&e, e.iter().sum());
|
||||
assert!(
|
||||
(c_excluded - 5.0 / 6.0).abs() < 1e-9,
|
||||
"sub-threshold link must be excluded: got {c_excluded}"
|
||||
);
|
||||
// Bump it well above threshold → counts as active (all 6).
|
||||
e[5] = 1.0;
|
||||
let c_included = det.check_consistency(&e, e.iter().sum());
|
||||
assert!(
|
||||
(c_included - 1.0).abs() < 1e-9,
|
||||
"above-threshold link must count: got {c_included}"
|
||||
);
|
||||
}
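The active-fraction computation exercised here is small enough to restate on its own. The sketch follows the lines shown in the `check_consistency` hunk; the function name `active_link_fraction` and the empty-input guard are assumptions, since the real guard sits outside the diff context.

```rust
const CONSISTENCY_ACTIVE_FRACTION_OF_MEAN: f64 = 0.1;

/// Fraction of links whose energy strictly exceeds 10% of the mean.
/// The empty-slice return value is an assumed convention.
fn active_link_fraction(energies: &[f64]) -> f64 {
    if energies.is_empty() {
        return 0.0;
    }
    let total: f64 = energies.iter().sum();
    let mean = total / energies.len() as f64;
    let threshold = mean * CONSISTENCY_ACTIVE_FRACTION_OF_MEAN;
    let active_count = energies.iter().filter(|&&e| e > threshold).count();
    active_count as f64 / energies.len() as f64
}
```

With five links at 1.0 and one at 0.05, the mean is 5.05/6 ≈ 0.842 and the threshold is about 0.084, so the weak link is excluded and the fraction is 5/6.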
|
||||
}
|
||||
|
||||
@@ -24,6 +24,18 @@ use midstreamer_attractor::{AttractorAnalyzer, AttractorType, PhasePoint};
|
||||
|
||||
use super::longitudinal::DriftMetric;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Internal constants (ADR-154 §7.4 — de-magicked; values unchanged)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Per-metric ring-buffer capacity: one year of daily observations.
|
||||
const METRIC_BUFFER_CAPACITY: usize = 365;
|
||||
|
||||
/// Number of most-recent values averaged to estimate a point-attractor's
|
||||
/// stable centre. Empirical default — a short tail that tracks the latest
|
||||
/// converged level without over-smoothing.
|
||||
const STABLE_CENTER_WINDOW: usize = 10;
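A minimal sketch of the tail-mean this window controls is shown below. The helper `stable_center` is hypothetical; unlike the analyzer, which relies on an earlier `min_observations` guard, it returns `None` on empty input so the division is visibly safe.

```rust
const STABLE_CENTER_WINDOW: usize = 10;

/// Mean of the last `STABLE_CENTER_WINDOW` values (or of all values
/// when fewer are available). `saturating_sub` keeps the slice start
/// at 0 instead of underflowing for short histories.
fn stable_center(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let recent = &values[values.len().saturating_sub(STABLE_CENTER_WINDOW)..];
    Some(recent.iter().sum::<f64>() / recent.len() as f64)
}
```

For the sequence 0, 1, ..., 19 only the last ten values (10 through 19) contribute, so the centre is 14.5 and not the full-history mean of 9.5.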
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Configuration
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -232,7 +244,7 @@ impl AttractorDriftAnalyzer {
|
||||
|
||||
let buffers = DriftMetric::all()
|
||||
.iter()
|
||||
.map(|&m| MetricBuffer::new(m, 365)) // 1 year of daily observations
|
||||
.map(|&m| MetricBuffer::new(m, METRIC_BUFFER_CAPACITY))
|
||||
.collect();
|
||||
|
||||
Ok(Self {
|
||||
@@ -296,8 +308,12 @@ impl AttractorDriftAnalyzer {
|
||||
|
||||
match info.attractor_type {
|
||||
AttractorType::PointAttractor => {
|
||||
// Compute center as mean of last few values
|
||||
let recent = &values[values.len().saturating_sub(10)..];
|
||||
// Compute center as the mean of the last STABLE_CENTER_WINDOW
|
||||
// values. `recent` is non-empty here: the `count < min_needed`
|
||||
// guard above guarantees `values.len() >= min_observations >= 1`
|
||||
// before this branch, so `recent.len() >= 1` and the division
|
||||
// below cannot be a divide-by-zero.
|
||||
let recent = &values[values.len().saturating_sub(STABLE_CENTER_WINDOW)..];
|
||||
let center = recent.iter().sum::<f64>() / recent.len() as f64;
|
||||
BiophysicalAttractor::Stable { center }
|
||||
}
|
||||
@@ -563,4 +579,38 @@ mod tests {
|
||||
let dbg = format!("{:?}", a);
|
||||
assert!(dbg.contains("AttractorDriftAnalyzer"));
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked internal constants must equal the prior inline literals.
|
||||
#[test]
|
||||
fn attractor_consts_unchanged_from_literals() {
|
||||
assert_eq!(METRIC_BUFFER_CAPACITY, 365);
|
||||
assert_eq!(STABLE_CENTER_WINDOW, 10);
|
||||
}
|
||||
|
||||
/// `analyze` returns InsufficientData strictly below `min_observations` and
|
||||
/// succeeds at exactly `min_observations`. Pins the off-by-one boundary
|
||||
/// (previously only the well-below case was tested) and, with it, the
|
||||
/// implicit `recent.len() >= 1` divide-safety in the PointAttractor branch.
|
||||
#[test]
|
||||
fn analyze_min_observations_boundary() {
|
||||
let cfg = AttractorDriftConfig {
|
||||
min_observations: 12,
|
||||
..Default::default()
|
||||
};
|
||||
let mut a = AttractorDriftAnalyzer::new(7, cfg.clone()).unwrap();
|
||||
// One below the boundary -> InsufficientData.
|
||||
for i in 0..(cfg.min_observations - 1) {
|
||||
a.add_observation(DriftMetric::GaitSymmetry, 0.1 + i as f64 * 0.001);
|
||||
}
|
||||
assert!(matches!(
|
||||
a.analyze(DriftMetric::GaitSymmetry, 0),
|
||||
Err(AttractorDriftError::InsufficientData { needed: 12, have: 11 })
|
||||
));
|
||||
// Exactly at the boundary -> Ok (no panic, finite center if Stable).
|
||||
a.add_observation(DriftMetric::GaitSymmetry, 0.111);
|
||||
let report = a.analyze(DriftMetric::GaitSymmetry, 0).unwrap();
|
||||
assert_eq!(report.observation_count, 12);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -40,6 +40,30 @@ const VERSION: u8 = 1;
|
||||
const HEADER_LEN: usize = 16; // magic(4) + version(1) + tier(1) + reserved(2) + unix_s(8)
|
||||
const SUBCARRIER_RECORD_LEN: usize = 16; // 4 × f32
|
||||
|
||||
// ADR-154 §7.4 — de-magicked (values unchanged). The tuning thresholds below
|
||||
// are EMPIRICAL DEFAULTS pending labelled empty-vs-occupied calibration traces.
|
||||
|
||||
/// Default minimum frames for a baseline finalization (30 s @ 20 Hz). Shared by
|
||||
/// every tier constructor (`ht20`/`ht40`/`he20`/`he40`).
|
||||
const DEFAULT_MIN_FRAMES: u32 = 600;
|
||||
|
||||
/// Amplitude standard-deviation floor used as the z-score divisor in
|
||||
/// `deviation()`, guarding against a zero-variance baseline subcarrier.
|
||||
const AMP_STD_FLOOR: f32 = 1e-12;
|
||||
|
||||
/// `deviation()` flags motion when the median amplitude z-score exceeds this
|
||||
/// many σ. EMPIRICAL DEFAULT.
|
||||
const MOTION_AMP_Z_THRESHOLD: f32 = 2.0;
|
||||
|
||||
/// `deviation()` flags motion when the median phase drift exceeds this many
|
||||
/// radians (π/6 = 30°). EMPIRICAL DEFAULT.
|
||||
const MOTION_PHASE_DRIFT_THRESHOLD: f32 = std::f32::consts::PI / 6.0;
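The way these three constants interact in `deviation()` can be sketched as two small helpers. Both function names are hypothetical, and the medians are passed in directly because the repository's `median_abs` / `median_slice` helpers are not shown in this diff.

```rust
const AMP_STD_FLOOR: f32 = 1e-12;
const MOTION_AMP_Z_THRESHOLD: f32 = 2.0;
const MOTION_PHASE_DRIFT_THRESHOLD: f32 = std::f32::consts::PI / 6.0;

/// Per-subcarrier amplitude z-score. The floor keeps the divisor
/// positive when the baseline variance is zero.
fn amplitude_z(amp: f32, amp_mean: f32, amp_variance: f32) -> f32 {
    let std = amp_variance.sqrt().max(AMP_STD_FLOOR);
    (amp - amp_mean) / std
}

/// Motion is flagged when either median strictly exceeds its threshold.
fn motion_flagged(amplitude_z_median: f32, phase_drift_median: f32) -> bool {
    amplitude_z_median > MOTION_AMP_Z_THRESHOLD
        || phase_drift_median > MOTION_PHASE_DRIFT_THRESHOLD
}
```

Because the comparisons are strict, a median z-score of exactly 2.0 does not flag motion, and a phase drift must exceed π/6 ≈ 0.524 rad to do so.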
|
||||
|
||||
/// Minimum complex magnitude in `subtract_in_place` below which a bin is left
|
||||
/// untouched (a near-zero bin has no meaningful baseline to subtract and the
|
||||
/// `(norm - baseline)/norm` scaling would be ill-conditioned).
|
||||
const SUBTRACT_MIN_NORM: f64 = 1e-30;
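The scaling that this floor protects is a magnitude-only subtraction: the bin keeps its phase while its magnitude is reduced by the baseline amplitude, never below zero. The sketch below uses a plain `(re, im)` pair in place of `num_complex::Complex64`, and the function name `subtract_baseline` is hypothetical.

```rust
const SUBTRACT_MIN_NORM: f64 = 1e-30;

/// Shrink the magnitude of (re, im) by `baseline_amp`, preserving phase.
/// Bins at or below the floor are returned unchanged, which avoids the
/// ill-conditioned division by a near-zero norm.
fn subtract_baseline(re: f64, im: f64, baseline_amp: f64) -> (f64, f64) {
    let norm = re.hypot(im);
    if norm > SUBTRACT_MIN_NORM {
        let scale = (norm - baseline_amp).max(0.0) / norm;
        (re * scale, im * scale)
    } else {
        (re, im)
    }
}
```

For the bin (3, 4) with baseline amplitude 2, the norm is 5 and the scale is 0.6, giving roughly (1.8, 2.4). A baseline larger than the norm zeroes the bin without flipping its sign.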
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// PHY tier
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -103,11 +127,11 @@ pub struct CalibrationConfig {
|
||||
impl CalibrationConfig {
|
||||
/// HT20 defaults: 64 FFT, 52 active, 600 frame minimum (30 s @ 20 Hz).
|
||||
pub fn ht20() -> Self {
|
||||
Self { tier: PhyTier::Ht20, num_subcarriers: 64, num_active: 52, min_frames: 600, max_phase_variance: 0.3 }
|
||||
Self { tier: PhyTier::Ht20, num_subcarriers: 64, num_active: 52, min_frames: DEFAULT_MIN_FRAMES, max_phase_variance: 0.3 }
|
||||
}
|
||||
/// HT40 defaults: 128 FFT, 114 active.
|
||||
pub fn ht40() -> Self {
|
||||
Self { tier: PhyTier::Ht40, num_subcarriers: 128, num_active: 114, min_frames: 600, max_phase_variance: 0.3 }
|
||||
Self { tier: PhyTier::Ht40, num_subcarriers: 128, num_active: 114, min_frames: DEFAULT_MIN_FRAMES, max_phase_variance: 0.3 }
|
||||
}
|
||||
/// HE20 defaults: 256 FFT, **256 active** (record all delivered bins).
|
||||
///
|
||||
@@ -128,11 +152,11 @@ impl CalibrationConfig {
|
||||
/// `cir.rs` (`HE20_ACTIVE`), where the Φ sensing matrix genuinely needs it;
|
||||
/// the baseline recorder does not.
|
||||
pub fn he20() -> Self {
|
||||
Self { tier: PhyTier::He20, num_subcarriers: 256, num_active: 256, min_frames: 600, max_phase_variance: 0.3 }
|
||||
Self { tier: PhyTier::He20, num_subcarriers: 256, num_active: 256, min_frames: DEFAULT_MIN_FRAMES, max_phase_variance: 0.3 }
|
||||
}
|
||||
/// HE40 defaults: 512 FFT, 484 active.
|
||||
pub fn he40() -> Self {
|
||||
Self { tier: PhyTier::He40, num_subcarriers: 512, num_active: 484, min_frames: 600, max_phase_variance: 0.3 }
|
||||
Self { tier: PhyTier::He40, num_subcarriers: 512, num_active: 484, min_frames: DEFAULT_MIN_FRAMES, max_phase_variance: 0.3 }
|
||||
}
|
||||
}
|
||||
|
||||
@@ -264,7 +288,7 @@ impl BaselineCalibration {
|
||||
for (ki, (c, baseline)) in y.iter().zip(self.subcarriers.iter()).enumerate() {
|
||||
let _ = ki;
|
||||
let amp = c.norm();
|
||||
let std = baseline.amp_variance.sqrt().max(1e-12_f32);
|
||||
let std = baseline.amp_variance.sqrt().max(AMP_STD_FLOOR);
|
||||
z_amp.push((amp - baseline.amp_mean) / std);
|
||||
let theta = c.arg();
|
||||
let drift = circular_distance(theta, baseline.phase_mean);
|
||||
@@ -273,7 +297,8 @@ impl BaselineCalibration {
|
||||
let amplitude_z_median = median_abs(&z_amp);
|
||||
let amplitude_z_max = z_amp.iter().map(|v| v.abs()).fold(0.0_f32, f32::max);
|
||||
let phase_drift_median = median_slice(&phase_drift);
|
||||
let motion_flagged = amplitude_z_median > 2.0 || phase_drift_median > std::f32::consts::PI / 6.0;
|
||||
let motion_flagged =
|
||||
amplitude_z_median > MOTION_AMP_Z_THRESHOLD || phase_drift_median > MOTION_PHASE_DRIFT_THRESHOLD;
|
||||
Ok(CalibrationDeviationScore { amplitude_z_median, amplitude_z_max, phase_drift_median, motion_flagged })
|
||||
}
|
||||
|
||||
@@ -338,7 +363,7 @@ impl BaselineCalibration {
|
||||
for s in 0..n_streams {
|
||||
let c = frame.data[[s, ki]];
|
||||
let norm = c.norm();
|
||||
if norm > 1e-30 {
|
||||
if norm > SUBTRACT_MIN_NORM {
|
||||
let scale = ((norm - baseline_amp).max(0.0)) / norm;
|
||||
frame.data[[s, ki]] = num_complex::Complex64::new(c.re * scale, c.im * scale);
|
||||
}
|
||||
@@ -491,7 +516,8 @@ impl CalibrationRecorder {
|
||||
let amplitude_z_median = median_slice(&z_amp_abs);
|
||||
let amplitude_z_max = z_amp_abs.iter().copied().fold(0.0_f32, f32::max);
|
||||
let phase_drift_median = median_slice(&phase_drift);
|
||||
let motion_flagged = amplitude_z_median > 2.0 || phase_drift_median > std::f32::consts::PI / 6.0;
|
||||
let motion_flagged =
|
||||
amplitude_z_median > MOTION_AMP_Z_THRESHOLD || phase_drift_median > MOTION_PHASE_DRIFT_THRESHOLD;
|
||||
Ok(CalibrationDeviationScore { amplitude_z_median, amplitude_z_max, phase_drift_median, motion_flagged })
|
||||
}
|
||||
|
||||
@@ -736,6 +762,27 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant pin test.
|
||||
|
||||
/// The de-magicked calibration constants MUST equal the prior literals, and
|
||||
/// every tier constructor MUST share the one DEFAULT_MIN_FRAMES default.
|
||||
#[test]
|
||||
fn calibration_consts_unchanged_from_literals() {
|
||||
assert_eq!(DEFAULT_MIN_FRAMES, 600);
|
||||
assert_eq!(AMP_STD_FLOOR, 1e-12_f32);
|
||||
assert_eq!(MOTION_AMP_Z_THRESHOLD, 2.0_f32);
|
||||
assert_eq!(MOTION_PHASE_DRIFT_THRESHOLD, std::f32::consts::PI / 6.0);
|
||||
assert_eq!(SUBTRACT_MIN_NORM, 1e-30_f64);
|
||||
for cfg in [
|
||||
CalibrationConfig::ht20(),
|
||||
CalibrationConfig::ht40(),
|
||||
CalibrationConfig::he20(),
|
||||
CalibrationConfig::he40(),
|
||||
] {
|
||||
assert_eq!(cfg.min_frames, DEFAULT_MIN_FRAMES);
|
||||
}
|
||||
}
|
||||
|
||||
// Binary magic / version check.
|
||||
#[test]
|
||||
fn binary_magic_and_version() {
|
||||
|
||||
@@ -145,8 +145,10 @@ pub enum CirError {
|
||||
#[error("subcarrier count mismatch: expected {expected}, got {got}")]
|
||||
SubcarrierMismatch { expected: usize, got: usize },
|
||||
|
||||
/// Phase variance exceeds 2π — frame appears unsanitized (ghost-tap risk).
|
||||
#[error("CSI phase variance {variance:.3} suggests unsanitized input (ghost-tap risk)")]
|
||||
/// Circular phase variance (V = 1 − R̄ ∈ [0,1]) is too high — the CSI phase
|
||||
/// is near-uniformly spread across subcarriers, the signature of unsanitized
|
||||
/// SFO/CFO (ghost-tap risk). See `GHOST_TAP_CIRCULAR_VARIANCE_MAX`.
|
||||
#[error("CSI circular phase variance {variance:.3} suggests unsanitized input (ghost-tap risk)")]
|
||||
UnsanitizedPhase { variance: f32 },
|
||||
|
||||
/// ISTA did not converge within the iteration budget.
|
||||
@@ -567,9 +569,14 @@ impl CirEstimator {
|
||||
|
||||
let y = self.extract_csi_vector(csi);
|
||||
|
||||
// Ghost-tap guard: phase variance > 2π signals unsanitized SFO/CFO.
|
||||
// Ghost-tap guard: a near-uniform spread of CSI phase across subcarriers
|
||||
// signals unsanitized SFO/CFO (raw hardware phase ramps that were never
|
||||
// de-rotated). `phase_variance` is now Mardia's *circular* variance
|
||||
// V = 1 − R̄ ∈ [0,1] (ADR-154 §7.4 #1), so the old `> TAU` (≈6.28)
|
||||
// threshold — meaningful only for the unbounded linear variance — no
|
||||
// longer applies. We compare against the bounded const below.
|
||||
let phase_var = phase_variance(&y);
|
||||
if phase_var > std::f32::consts::TAU {
|
||||
if phase_var > GHOST_TAP_CIRCULAR_VARIANCE_MAX {
|
||||
return Err(CirError::UnsanitizedPhase {
|
||||
variance: phase_var,
|
||||
});
|
||||
@@ -988,17 +995,64 @@ fn normalize_complex(v: &mut [Complex32]) {
|
||||
}
|
||||
}
|
||||
|
||||
/// Variance of the instantaneous phase angles (rad) across a complex vector.
|
||||
/// Ghost-tap guard threshold on the **circular** phase variance (ADR-154 §7.4 #1).
|
||||
///
|
||||
/// `phase_variance` returns Mardia's circular variance V = 1 − R̄ ∈ [0,1].
|
||||
/// The guard rejects a frame as unsanitized when V exceeds this cutoff, i.e.
|
||||
/// when the mean resultant length R̄ falls below `1 − MAX`. With MAX = 0.99 the
/// guard fires only when R̄ < 0.01: essentially uniform phase, the signature
|
||||
/// of raw SFO/CFO ramps the gate is meant to reject — while a sanitized,
|
||||
/// concentrated phase set (R̄ near 1, V near 0) passes comfortably.
|
||||
///
|
||||
/// **DATA-GATED (ADR-154 §7.4 #1):** this is a deliberately *conservative*
|
||||
/// default, not a calibrated operating point. A clean single-path channel with
|
||||
/// appreciable delay also sweeps the circle (high V), so V alone cannot cleanly
|
||||
/// separate "clean ramp" from "unsanitized noise" without labelled
|
||||
/// sanitized/unsanitized frames. The *metric* (circular variance) is MEASURED;
|
||||
/// this *value* awaits per-deployment calibration. Until then we err toward
/// never false-rejecting a real frame: the guard is strictly more permissive at
/// the wrap boundary than the old linear-variance guard, whose false trips there
/// are the bug being fixed.
|
||||
const GHOST_TAP_CIRCULAR_VARIANCE_MAX: f32 = 0.99;
|
||||
|
||||
/// Circular variance of the instantaneous phase angles across a complex vector.
|
||||
///
|
||||
/// Phase angles live on the circle and wrap at ±π, so a *linear* sample variance
|
||||
/// (the previous implementation, ADR-154 §7.4 #1) reports spuriously HIGH
|
||||
/// dispersion for a tightly-clustered set straddling the ±π branch cut — e.g.
|
||||
/// `{+3.13, −3.13}` are 0.02 rad apart on the circle but ≈2π apart on the line.
|
||||
/// That made the `phase_variance > TAU` ghost-tap guard FALSE-TRIP on real,
|
||||
/// tightly-clustered CIR taps.
|
||||
///
|
||||
/// The correct metric is Mardia's circular variance:
|
||||
///
|
||||
/// R̄ = | (1/n) · Σ_k e^{iθ_k} | (mean resultant length, ∈ [0,1])
|
||||
/// V = 1 − R̄ (circular variance, ∈ [0,1])
|
||||
///
|
||||
/// V = 0 ⇔ all angles identical (maximally concentrated); V = 1 ⇔ the unit
|
||||
/// phasors cancel (e.g. uniformly-spread angles → R̄ = 0). It is invariant to
|
||||
/// where the cluster sits on the circle, so the branch-cut artefact is gone.
|
||||
///
|
||||
/// Reference: Mardia & Jupp, *Directional Statistics* (2000), §1.3.
|
||||
#[inline]
|
||||
fn phase_variance(y: &[Complex32]) -> f32 {
|
||||
let n = y.len();
|
||||
if n < 2 {
|
||||
return 0.0;
|
||||
}
|
||||
// Mean resultant vector of the *unit* phasors e^{iθ_k}. Normalising each
|
||||
// term to unit magnitude makes this a pure phase statistic (amplitude does
|
||||
// not bias the dispersion), matching the linear version which used only
|
||||
// `arg()`.
|
||||
let mut sx = 0.0f32;
|
||||
let mut sy = 0.0f32;
|
||||
for c in y {
|
||||
let theta = c.arg();
|
||||
sx += theta.cos();
|
||||
sy += theta.sin();
|
||||
}
|
||||
let nf = n as f32;
|
||||
let phases: Vec<f32> = y.iter().map(|c| c.arg()).collect();
|
||||
let mean = phases.iter().sum::<f32>() / nf;
|
||||
phases.iter().map(|p| (p - mean) * (p - mean)).sum::<f32>() / nf
|
||||
let r_bar = ((sx * sx + sy * sy).sqrt() / nf).clamp(0.0, 1.0);
|
||||
1.0 - r_bar
|
||||
}
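To see the branch-cut effect without the `num_complex` dependency, the sketch below computes both statistics directly on angles in radians. The function names `circular_variance`, `linear_variance`, and `wrap_straddling_cluster` are illustrative only; the circular version mirrors the V = 1 − R̄ definition in the doc comment.

```rust
use std::f32::consts::PI;

/// Mardia's circular variance V = 1 - R_bar over angles in radians.
fn circular_variance(angles: &[f32]) -> f32 {
    let n = angles.len();
    if n < 2 {
        return 0.0;
    }
    // Sum the unit phasors (cos t, sin t) component-wise.
    let (sx, sy) = angles
        .iter()
        .fold((0.0f32, 0.0f32), |(sx, sy), t| (sx + t.cos(), sy + t.sin()));
    let r_bar = ((sx * sx + sy * sy).sqrt() / n as f32).clamp(0.0, 1.0);
    1.0 - r_bar
}

/// Population variance treating angles as points on the real line.
fn linear_variance(angles: &[f32]) -> f32 {
    let n = angles.len();
    if n < 2 {
        return 0.0;
    }
    let nf = n as f32;
    let mean = angles.iter().sum::<f32>() / nf;
    angles.iter().map(|p| (p - mean) * (p - mean)).sum::<f32>() / nf
}

/// 40 angles alternating between +PI - eps and -PI + eps.
fn wrap_straddling_cluster(eps: f32) -> Vec<f32> {
    (0..40)
        .map(|i| if i % 2 == 0 { PI - eps } else { -PI + eps })
        .collect()
}
```

For `eps = 0.02` the sine components cancel and every cosine equals −cos(0.02), so R̄ = cos(0.02) and V = 1 − cos(0.02) ≈ 2·10⁻⁴. The linear variance of the same angles is about (π − 0.02)² ≈ 9.74, well above 2π.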
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -1205,6 +1259,108 @@ mod tests {
|
||||
assert!(phase_variance(&y) < 1e-6);
|
||||
}
|
||||
|
||||
// ── ADR-154 §7.4 #1: circular vs linear phase variance ──────────────────
|
||||
|
||||
/// Inline replica of the OLD linear sample variance over `arg()` — kept in
|
||||
/// the test only, so we can show the exact contrast the fix removes.
|
||||
fn old_linear_phase_variance(y: &[Complex32]) -> f32 {
|
||||
let n = y.len();
|
||||
if n < 2 {
|
||||
return 0.0;
|
||||
}
|
||||
let nf = n as f32;
|
||||
let phases: Vec<f32> = y.iter().map(|c| c.arg()).collect();
|
||||
let mean = phases.iter().sum::<f32>() / nf;
|
||||
phases.iter().map(|p| (p - mean) * (p - mean)).sum::<f32>() / nf
|
||||
}
|
||||
|
||||
/// FAILS-ON-OLD: phases tightly clustered across the ±π branch cut. The old
|
||||
/// LINEAR variance reports a huge value (≈π²) and would trip the `> TAU`
|
||||
/// guard; the new CIRCULAR variance reports ≈0 (the cluster is 0.04 rad wide
|
||||
/// on the circle) and the guard does NOT false-trip.
|
||||
#[test]
|
||||
fn phase_variance_circular_not_fooled_by_branch_cut() {
|
||||
// 40 unit phasors split between +π−ε and −π+ε: true angular spread ≈0.04
|
||||
// rad, but they straddle the wrap point.
|
||||
let eps = 0.02_f32;
|
||||
let y: Vec<Complex32> = (0..40)
|
||||
.map(|i| {
|
||||
let theta = if i % 2 == 0 {
|
||||
std::f32::consts::PI - eps
|
||||
} else {
|
||||
-std::f32::consts::PI + eps
|
||||
};
|
||||
Complex32::new(theta.cos(), theta.sin())
|
||||
})
|
||||
.collect();
|
||||
|
||||
let old = old_linear_phase_variance(&y);
|
||||
let new = phase_variance(&y);
|
||||
|
||||
// The OLD metric is spuriously huge (well past the old TAU≈6.28 guard).
|
||||
assert!(
|
||||
old > std::f32::consts::TAU,
|
||||
"old linear variance should be large (>TAU) on wrap-straddling phases, was {old}"
|
||||
);
|
||||
// The NEW circular variance is ≈0 — the cluster is genuinely tight.
|
||||
assert!(
|
||||
new < 0.01,
|
||||
"circular variance must be ~0 for a tight cluster across ±π, was {new}"
|
||||
);
|
||||
// And the guard must NOT false-trip on this (a real tight CIR tap).
|
||||
assert!(
|
||||
new <= GHOST_TAP_CIRCULAR_VARIANCE_MAX,
|
||||
"ghost-tap guard must not false-trip on a tight wrap-straddling cluster"
|
||||
);
|
||||
}
|
||||
|
||||
/// Circular variance is bounded [0,1] for arbitrary (deterministic-random)
|
||||
/// inputs, and hits its documented extremes: ≈0 for identical angles, ≈1
|
||||
/// for uniformly-spread angles.
|
||||
#[test]
|
||||
fn phase_variance_circular_is_bounded_and_extremal() {
|
||||
// Deterministic pseudo-random phases via an LCG — bounded check.
|
||||
let mut s: u32 = 0x1234_5678;
|
||||
let y: Vec<Complex32> = (0..200)
|
||||
.map(|_| {
|
||||
s = s.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
|
||||
let u = (s >> 8) as f32 / (1u32 << 24) as f32; // [0,1)
|
||||
let theta = u * std::f32::consts::TAU - std::f32::consts::PI;
|
||||
Complex32::new(theta.cos(), theta.sin())
|
||||
})
|
||||
.collect();
|
||||
let v = phase_variance(&y);
|
||||
assert!((0.0..=1.0).contains(&v), "V must be in [0,1], was {v}");
|
||||
|
||||
// Identical angles → V ≈ 0.
|
||||
let same: Vec<Complex32> = (0..64)
|
||||
.map(|_| {
|
||||
let t = 0.7_f32;
|
||||
Complex32::new(t.cos(), t.sin())
|
||||
})
|
||||
.collect();
|
||||
assert!(
|
||||
phase_variance(&same) < 1e-5,
|
||||
"identical angles must give V≈0, got {}",
|
||||
phase_variance(&same)
|
||||
);
|
||||
|
||||
// Angles spread uniformly around the full circle → resultant cancels,
|
||||
// V ≈ 1.
|
||||
let n = 360usize;
|
||||
let uniform: Vec<Complex32> = (0..n)
|
||||
.map(|k| {
|
||||
let t = std::f32::consts::TAU * (k as f32) / (n as f32);
|
||||
Complex32::new(t.cos(), t.sin())
|
||||
})
|
||||
.collect();
|
||||
assert!(
|
||||
phase_variance(&uniform) > 0.99,
|
||||
"uniformly-spread angles must give V≈1, got {}",
|
||||
phase_variance(&uniform)
|
||||
);
|
||||
}
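The deterministic phase generator used in this test can be pulled out as a small helper, which makes the seeding and the bit-to-angle mapping easier to inspect. This is a sketch only; the function name `lcg_phases` is hypothetical, and the multiplier and increment are the ones that appear in the test.

```rust
/// Generate `count` pseudo-random phases in [-PI, PI) from a 32-bit
/// linear congruential generator. The top 24 bits of the state are
/// used so the fraction is exactly representable in an f32 mantissa.
fn lcg_phases(seed: u32, count: usize) -> Vec<f32> {
    let mut s = seed;
    (0..count)
        .map(|_| {
            s = s.wrapping_mul(1_664_525).wrapping_add(1_013_904_223);
            let u = (s >> 8) as f32 / (1u32 << 24) as f32; // in [0, 1)
            u * std::f32::consts::TAU - std::f32::consts::PI
        })
        .collect()
}
```

Using a fixed seed keeps the bounded-range assertion reproducible across runs and platforms, which a thread-local RNG would not guarantee.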
|
||||
|
||||
/// Build a CsiFrame with a deterministic single-tap channel at `tau_sec`.
|
||||
fn make_single_tap_frame(
|
||||
num_subcarriers: usize,
|
||||
@@ -1302,6 +1458,79 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
/// ADR-154 §7.4 #14: the `fft_operator` path *changes the witness hash*
|
||||
/// (documented in `CirConfig::fft_operator`), so it must be pinned as
|
||||
/// numerically **close** to the dense path — not silently divergent. The
|
||||
/// existing `fft_estimate_matches_dense_dominant_tap` covers HT20 / one tau;
|
||||
/// this test asserts the **full `Cir` output** (every tap + every scalar
|
||||
/// field) stays within a documented relative tolerance on the production
|
||||
/// **canonical-56** config across several realistic delays. A regression
|
||||
/// that lets the FFT path drift (wrong scaling, off-by-one Φ column, etc.)
|
||||
/// fails here instead of corrupting a downstream witness unnoticed.
|
||||
#[test]
|
||||
fn fft_operator_within_tolerance_of_dense_canonical56() {
|
||||
// Relative tolerances — documented, not silent. The FFT operator sums the
|
||||
// same Φ entries in a different order, so taps agree to ~float epsilon
|
||||
// scaled by the dominant-tap magnitude; ISTA can differ by a few last
|
||||
// bits over its trajectory, hence 1e-2 (same order as the existing test).
|
||||
const TAP_REL_TOL: f32 = 1e-2;
|
||||
const RATIO_ABS_TOL: f32 = 1e-2;
|
||||
const SPREAD_REL_TOL: f64 = 1e-2;
|
||||
|
||||
for &tau in &[20e-9_f64, 50e-9, 90e-9] {
|
||||
let dense_cfg = CirConfig::canonical56();
|
||||
let mut fft_cfg = CirConfig::canonical56();
|
||||
fft_cfg.fft_operator = true;
|
||||
|
||||
let frame = make_single_tap_frame(dense_cfg.num_subcarriers, tau);
|
||||
let dense = CirEstimator::new(dense_cfg).estimate(&frame).unwrap();
|
||||
let fast = CirEstimator::new(fft_cfg).estimate(&frame).unwrap();
|
||||
|
||||
assert_eq!(dense.taps.len(), fast.taps.len());
|
||||
|
||||
// Full tap vector close (relative to the dominant tap magnitude).
|
||||
let dom = dense.taps[dense.dominant_tap_idx].norm().max(1e-6);
|
||||
let mut max_tap_err = 0.0_f32;
|
||||
for (a, b) in dense.taps.iter().zip(&fast.taps) {
|
||||
max_tap_err = max_tap_err.max((a - b).norm());
|
||||
}
|
||||
assert!(
|
||||
max_tap_err <= TAP_REL_TOL * dom,
|
||||
"tau={tau:e}: FFT taps diverged from dense — max err {max_tap_err} > {TAP_REL_TOL} * {dom} (NOT numerically close)"
|
||||
);
|
||||
|
||||
// The dominant tap and the scalar summary fields must agree too —
|
||||
// these feed the witness, so a silent divergence here is the bug #14
|
||||
// guards against.
|
||||
assert_eq!(
|
||||
dense.dominant_tap_idx, fast.dominant_tap_idx,
|
||||
"tau={tau:e}: dominant tap index moved"
|
||||
);
|
||||
assert!(
|
||||
(dense.dominant_tap_ratio - fast.dominant_tap_ratio).abs() <= RATIO_ABS_TOL,
|
||||
"tau={tau:e}: dominant_tap_ratio drift {} vs {}",
|
||||
dense.dominant_tap_ratio,
|
||||
fast.dominant_tap_ratio
|
||||
);
|
||||
assert_eq!(
|
||||
dense.active_tap_count, fast.active_tap_count,
|
||||
"tau={tau:e}: active_tap_count changed"
|
||||
);
|
||||
assert_eq!(
|
||||
dense.ranging_valid, fast.ranging_valid,
|
||||
"tau={tau:e}: ranging_valid flipped"
|
||||
);
|
||||
let spread_ref = dense.rms_delay_spread_s.abs().max(1e-12);
|
||||
assert!(
|
||||
(dense.rms_delay_spread_s - fast.rms_delay_spread_s).abs()
|
||||
<= SPREAD_REL_TOL * spread_ref,
|
||||
"tau={tau:e}: rms_delay_spread drift {} vs {}",
|
||||
dense.rms_delay_spread_s,
|
||||
fast.rms_delay_spread_s
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/// The default configs keep the FFT operator off — the dense, bit-exact
|
||||
/// witness path is the default (enabling FFT shifts float results).
|
||||
#[test]
|
||||
|
||||
@@ -79,7 +79,7 @@ impl CoherenceState {
|
||||
Self {
|
||||
reference: vec![0.0; n_subcarriers],
|
||||
variance: vec![1.0; n_subcarriers],
|
||||
decay: 0.95,
|
||||
decay: DEFAULT_EMA_DECAY,
|
||||
current_score: 1.0,
|
||||
stale_count: 0,
|
||||
drift_profile: DriftProfile::Stable,
|
||||
@@ -200,8 +200,8 @@ impl CoherenceState {
|
||||
let diff = obs - old_ref;
|
||||
*v = self.decay * *v + alpha * diff * diff;
|
||||
// Ensure variance does not collapse to zero
|
||||
if *v < 1e-6 {
|
||||
*v = 1e-6;
|
||||
if *v < VARIANCE_FLOOR {
|
||||
*v = VARIANCE_FLOOR;
|
||||
}
|
||||
}
|
||||
}
|
||||
@@ -229,7 +229,7 @@ pub fn coherence_score(current: &[f32], reference: &[f32], variance: &[f32]) ->
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
let epsilon = 1e-6_f32;
|
||||
let epsilon = VARIANCE_FLOOR;
|
||||
let mut weighted_sum = 0.0_f32;
|
||||
let mut weight_sum = 0.0_f32;
|
||||
|
||||
@@ -249,11 +249,34 @@ pub fn coherence_score(current: &[f32], reference: &[f32], variance: &[f32]) ->
|
||||
(weighted_sum / weight_sum).clamp(0.0, 1.0)
|
||||
}
|
||||
|
||||
/// Coherence score at/above which the environment is classified `Stable`
|
||||
/// (ADR-154 §7.4 #9 — DATA-GATED). EMPIRICAL DEFAULT, not a calibrated cutoff:
|
||||
/// a defensible value needs labelled stable/drifting environment traces. Pinned
|
||||
/// by `classify_drift_*_boundary` so a future retune is a visible, tested change.
|
||||
const DRIFT_STABLE_SCORE: f32 = 0.85;
|
||||
|
||||
/// Stale-frame count below which a coherence loss is treated as a transient
|
||||
/// `StepChange` rather than a sustained `Linear` drift (ADR-154 §7.4 #9 —
|
||||
/// DATA-GATED). EMPIRICAL DEFAULT pending labelled calibration.
|
||||
const DRIFT_STEP_CHANGE_MAX_STALE: u64 = 10;
|
||||
|
||||
/// Variance floor (ADR-154 §7.4 — de-magicked): the online variance estimate
|
||||
/// is never allowed to collapse below this, which keeps the inverse-variance
|
||||
/// weight and the z-score divisor finite. Used as both the floor in
|
||||
/// `update_reference` and the epsilon in `coherence_score` /
|
||||
/// `per_subcarrier_zscores`. Value unchanged from the prior `1e-6` literals.
|
||||
const VARIANCE_FLOOR: f32 = 1e-6;
|
||||
|
||||
/// Default EMA decay rate for the reference/variance update (ADR-154 §7.4 —
|
||||
/// de-magicked from the inline `0.95` in `CoherenceState::new`). EMPIRICAL
|
||||
/// DEFAULT; override via [`CoherenceState::with_decay`].
|
||||
const DEFAULT_EMA_DECAY: f32 = 0.95;
|
||||
|
||||
/// Classify drift profile based on coherence history.
|
||||
fn classify_drift(score: f32, stale_count: u64) -> DriftProfile {
|
||||
if score >= 0.85 {
|
||||
if score >= DRIFT_STABLE_SCORE {
|
||||
DriftProfile::Stable
|
||||
} else if stale_count < 10 {
|
||||
} else if stale_count < DRIFT_STEP_CHANGE_MAX_STALE {
|
||||
// Brief coherence loss -> likely step change
|
||||
DriftProfile::StepChange
|
||||
} else {
|
||||
@@ -269,7 +292,7 @@ pub fn per_subcarrier_zscores(current: &[f32], reference: &[f32], variance: &[f3
|
||||
let n = current.len().min(reference.len()).min(variance.len());
|
||||
(0..n)
|
||||
.map(|i| {
|
||||
let var = variance[i].max(1e-6);
|
||||
let var = variance[i].max(VARIANCE_FLOOR);
|
||||
(current[i] - reference[i]).abs() / var.sqrt()
|
||||
})
|
||||
.collect()
|
||||
@@ -418,6 +441,55 @@ mod tests {
|
||||
assert_eq!(classify_drift(0.3, 20), DriftProfile::Linear);
|
||||
}
|
||||
|
||||
// ── ADR-154 §7.4 #9: drift-threshold characterization (DATA-GATED) ──────
|
||||
// Pin the CURRENT empirical thresholds so a future labelled-data retune is a
|
||||
// visible, tested change. These assert the decision boundaries, not that the
|
||||
// values are "correct".
|
||||
|
||||
/// The named consts must equal the original bare literals (no value drift).
|
||||
#[test]
|
||||
fn drift_consts_unchanged_from_literals() {
|
||||
assert_eq!(DRIFT_STABLE_SCORE, 0.85);
|
||||
assert_eq!(DRIFT_STEP_CHANGE_MAX_STALE, 10);
|
||||
// ADR-154 §7.4 M3: variance-floor + default-decay de-magic.
|
||||
assert_eq!(VARIANCE_FLOOR, 1e-6_f32);
|
||||
assert_eq!(DEFAULT_EMA_DECAY, 0.95_f32);
|
||||
}
|
||||
|
||||
/// `coherence_score` stays finite and in [0,1] when a subcarrier reports
|
||||
/// zero variance — the [`VARIANCE_FLOOR`] keeps the z-score divisor and the
|
||||
/// inverse-variance weight finite. Pins the floor's effect.
|
||||
#[test]
|
||||
fn coherence_score_finite_with_zero_variance() {
|
||||
let current = [1.0_f32, 2.0, 3.0];
|
||||
let reference = [1.0_f32, 2.0, 3.0];
|
||||
let zero_var = [0.0_f32, 0.0, 0.0];
|
||||
let s = coherence_score(&current, &reference, &zero_var);
|
||||
assert!(s.is_finite() && (0.0..=1.0).contains(&s));
|
||||
// Perfect agreement with floored variance -> ~1.0.
|
||||
assert!((s - 1.0).abs() < 1e-3);
|
||||
}
|
||||
|
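Review note: the effect of the variance floor on the z-score divisor can be sketched in isolation. The block below mirrors the `per_subcarrier_zscores` body shown in this diff; the function name `zscores` is a stand-in chosen for this sketch, and the constant is redeclared locally so the example is self-contained.

```rust
/// Same value as the `VARIANCE_FLOOR` const introduced by this change.
const VARIANCE_FLOOR: f32 = 1e-6;

/// Per-subcarrier |current - reference| / sqrt(max(variance, floor)).
/// Inputs of unequal length are truncated to the shortest slice.
pub fn zscores(current: &[f32], reference: &[f32], variance: &[f32]) -> Vec<f32> {
    let n = current.len().min(reference.len()).min(variance.len());
    (0..n)
        .map(|i| {
            // Flooring the variance keeps the divisor strictly positive.
            let var = variance[i].max(VARIANCE_FLOOR);
            (current[i] - reference[i]).abs() / var.sqrt()
        })
        .collect()
}
```

With a reported variance of exactly zero, a deviation of 1.0 maps to roughly 1000 (that is, 1 / sqrt(1e-6)) instead of infinity, and a zero deviation maps to 0.0 instead of NaN.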
||||
/// Stable score boundary: `>= 0.85` is Stable; just below flips to a
|
||||
/// non-stable profile.
|
||||
#[test]
|
||||
fn classify_drift_stable_score_boundary() {
|
||||
// exactly at threshold → Stable
|
||||
assert_eq!(classify_drift(0.85, 0), DriftProfile::Stable);
|
||||
// just below → not Stable (StepChange, since stale_count < 10)
|
||||
assert_eq!(classify_drift(0.849, 0), DriftProfile::StepChange);
|
||||
}
|
||||
|
||||
/// Stale-count boundary: `< 10` is StepChange, `>= 10` is Linear (when the
|
||||
/// score is below the Stable cutoff).
|
||||
#[test]
|
||||
fn classify_drift_stale_count_boundary() {
|
||||
// just below 10 → StepChange
|
||||
assert_eq!(classify_drift(0.3, 9), DriftProfile::StepChange);
|
||||
// exactly 10 → Linear
|
||||
assert_eq!(classify_drift(0.3, 10), DriftProfile::Linear);
|
||||
}
|
||||
|
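Review note: the two boundary tests above pin a small decision table. A minimal sketch of that table follows, assuming a three-variant `DriftProfile`; the real enum and the elided `else` branch of `classify_drift` may carry more cases than shown here, so treat the final arm as an assumption inferred from the tests.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DriftProfile {
    Stable,
    StepChange,
    Linear,
}

// Values copied from the consts introduced by this change.
const DRIFT_STABLE_SCORE: f32 = 0.85;
const DRIFT_STEP_CHANGE_MAX_STALE: u64 = 10;

/// Score at or above the cutoff is Stable; otherwise a short run of stale
/// frames is a StepChange and a long run is Linear drift (assumed fallback).
pub fn classify_drift(score: f32, stale_count: u64) -> DriftProfile {
    if score >= DRIFT_STABLE_SCORE {
        DriftProfile::Stable
    } else if stale_count < DRIFT_STEP_CHANGE_MAX_STALE {
        DriftProfile::StepChange
    } else {
        DriftProfile::Linear
    }
}
```

Both comparisons are closed on the named constant's side (`>=` for the score, `<` for the stale count), so a retune only has to touch the constants and the boundary tests.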
||||
#[test]
|
||||
fn per_subcarrier_zscores_correct() {
|
||||
let current = vec![2.0, 4.0];
|
||||
|
||||
@@ -77,13 +77,27 @@ pub struct GatePolicyConfig {
|
||||
pub adaptive: bool,
|
||||
}
|
||||
|
||||
// Gate-policy DEFAULTS (ADR-154 §7.4 #9 — DATA-GATED). These were bare literals
|
||||
// in the `Default` impl. They are already tunable per-instance via
|
||||
// `GatePolicyConfig`/`GatePolicy::new` (the config seam exists), so de-magicking
|
||||
// here is about naming + pinning the DEFAULTS. EMPIRICAL — defensible values
|
||||
// need labelled coherence traces; the VALUES are unchanged.
|
||||
/// Default coherence accept cutoff (full Kalman update above this).
|
||||
const DEFAULT_ACCEPT_THRESHOLD: f32 = 0.85;
|
||||
/// Default coherence reject cutoff (discard measurement below this).
|
||||
const DEFAULT_REJECT_THRESHOLD: f32 = 0.5;
|
||||
/// Default stale-frame budget before forcing recalibration (≈10 s at 20 Hz).
|
||||
const DEFAULT_MAX_STALE_FRAMES: u64 = 200;
|
||||
/// Default PredictOnly-zone measurement-noise inflation factor.
|
||||
const DEFAULT_PREDICT_ONLY_NOISE: f32 = 3.0;
|
||||
|
||||
impl Default for GatePolicyConfig {
|
||||
fn default() -> Self {
|
||||
Self {
|
||||
accept_threshold: 0.85,
|
||||
reject_threshold: 0.5,
|
||||
max_stale_frames: 200, // 10s at 20Hz
|
||||
predict_only_noise: 3.0,
|
||||
accept_threshold: DEFAULT_ACCEPT_THRESHOLD,
|
||||
reject_threshold: DEFAULT_REJECT_THRESHOLD,
|
||||
max_stale_frames: DEFAULT_MAX_STALE_FRAMES,
|
||||
predict_only_noise: DEFAULT_PREDICT_ONLY_NOISE,
|
||||
adaptive: false,
|
||||
}
|
||||
}
|
||||
@@ -114,7 +128,7 @@ impl GatePolicy {
|
||||
accept_threshold: accept,
|
||||
reject_threshold: reject,
|
||||
max_stale_frames: max_stale,
|
||||
predict_only_noise: 3.0,
|
||||
predict_only_noise: DEFAULT_PREDICT_ONLY_NOISE,
|
||||
consecutive_low: 0,
|
||||
last_decision: None,
|
||||
}
|
||||
@@ -343,6 +357,17 @@ mod tests {
|
||||
assert!(!cfg.adaptive);
|
||||
}
|
||||
|
||||
/// ADR-154 §7.4 #9 (DATA-GATED): the named DEFAULT_* consts must equal the
|
||||
/// original bare literals — pins the de-magicked defaults so a future
|
||||
/// labelled-data retune is a visible, tested change. Values UNCHANGED.
|
||||
#[test]
|
||||
fn gate_default_consts_unchanged_from_literals() {
|
||||
assert_eq!(DEFAULT_ACCEPT_THRESHOLD, 0.85);
|
||||
assert_eq!(DEFAULT_REJECT_THRESHOLD, 0.5);
|
||||
assert_eq!(DEFAULT_MAX_STALE_FRAMES, 200);
|
||||
assert_eq!(DEFAULT_PREDICT_ONLY_NOISE, 3.0);
|
||||
}
|
||||
|
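Review note: the doc comments on the new defaults describe three coherence zones (accept, predict-only, reject). The sketch below shows one way those zones could fit together. It is not the `GatePolicy` implementation: `GateDecision`, `gate`, and the choice of `>=` versus `<` at the boundaries are assumptions made for illustration, and the stale-frame budget is left out entirely.

```rust
// Values copied from the DEFAULT_* consts introduced by this change.
const DEFAULT_ACCEPT_THRESHOLD: f32 = 0.85;
const DEFAULT_REJECT_THRESHOLD: f32 = 0.5;
const DEFAULT_PREDICT_ONLY_NOISE: f32 = 3.0;

/// Hypothetical decision type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GateDecision {
    /// Full Kalman update.
    Accept,
    /// Keep predicting, with measurement noise inflated by this factor.
    PredictOnly { noise_multiplier: f32 },
    /// Discard the measurement.
    Reject,
}

/// Map a coherence score onto the three zones (boundary handling assumed).
pub fn gate(coherence: f32) -> GateDecision {
    if coherence >= DEFAULT_ACCEPT_THRESHOLD {
        GateDecision::Accept
    } else if coherence < DEFAULT_REJECT_THRESHOLD {
        GateDecision::Reject
    } else {
        GateDecision::PredictOnly {
            noise_multiplier: DEFAULT_PREDICT_ONLY_NOISE,
        }
    }
}
```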
||||
#[test]
|
||||
fn from_config_construction() {
|
||||
let cfg = GatePolicyConfig {
|
||||
|
||||
@@ -23,6 +23,10 @@
|
||||
//! # References
|
||||
//! - ADR-030 Tier 5: Cross-Room Identity Continuity
|
||||
|
||||
/// Denominator guard for cosine similarity (ADR-154 §7.4 — de-magicked):
|
||||
/// a product of norms below this is treated as a zero-norm vector ⇒ 0.0.
|
||||
const COSINE_SIMILARITY_EPSILON: f32 = 1e-9;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Error types
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -359,12 +363,15 @@ impl CrossRoomTracker {
|
||||
}
|
||||
|
||||
/// Cosine similarity between two f32 vectors.
|
||||
///
|
||||
/// Returns `0.0` when either vector has (near-)zero norm — the product of
|
||||
/// norms falls below [`COSINE_SIMILARITY_EPSILON`] and the division is skipped.
|
||||
fn cosine_similarity_f32(a: &[f32], b: &[f32]) -> f32 {
|
||||
let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
|
||||
let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
|
||||
let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
|
||||
let denom = norm_a * norm_b;
|
||||
if denom < 1e-9 {
|
||||
if denom < COSINE_SIMILARITY_EPSILON {
|
||||
0.0
|
||||
} else {
|
||||
dot / denom
|
||||
@@ -623,4 +630,23 @@ mod tests {
|
||||
let sim = cosine_similarity_f32(&a, &b);
|
||||
assert!(sim.abs() < 1e-5);
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked epsilon must equal the prior literal.
|
||||
#[test]
|
||||
fn cosine_epsilon_unchanged_from_literal() {
|
||||
assert_eq!(COSINE_SIMILARITY_EPSILON, 1e-9_f32);
|
||||
}
|
||||
|
||||
/// A zero-norm vector falls below the denominator epsilon ⇒ similarity 0.0.
|
||||
/// Previously untested (both existing tests use unit-norm vectors).
|
||||
#[test]
|
||||
fn test_cosine_similarity_zero_vector() {
|
||||
let zero = vec![0.0_f32; 4];
|
||||
let v = vec![1.0_f32, 2.0, 3.0, 4.0];
|
||||
assert_eq!(cosine_similarity_f32(&zero, &v), 0.0);
|
||||
assert_eq!(cosine_similarity_f32(&v, &zero), 0.0);
|
||||
assert_eq!(cosine_similarity_f32(&zero, &zero), 0.0);
|
||||
}
|
||||
}
|
||||
|
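Review note: the guarded cosine similarity from this file can be exercised standalone. The sketch below copies the body shown in the diff; only the shorter function name is a local choice.

```rust
/// Same value as the const introduced by this change.
const COSINE_SIMILARITY_EPSILON: f32 = 1e-9;

/// Cosine similarity that returns 0.0 when the product of norms is tiny.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm_a * norm_b;
    if denom < COSINE_SIMILARITY_EPSILON {
        // Zero-norm (or very small) input: skip the division.
        0.0
    } else {
        dot / denom
    }
}
```

One consequence worth keeping in mind: the guard compares the product of norms, so two identical but very small vectors (each with norm around 1e-5) also return 0.0 instead of 1.0. That is harmless for normalized embeddings but would matter for unscaled inputs.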
||||
@@ -105,6 +105,10 @@ impl WelfordStats {
|
||||
}
|
||||
|
||||
/// Population variance (biased). Returns 0.0 if count < 2.
|
||||
///
|
||||
/// The `count < 2` guard is the n=0 NaN guard (ADR-154 §7.4 #10): at n=0,
|
||||
/// `m2 = 0` and `count = 0` would yield `0.0/0.0 = NaN`. Pinned by
|
||||
/// `welford_finite_at_n0_and_n1`.
|
||||
pub fn variance(&self) -> f64 {
|
||||
if self.count < 2 {
|
||||
0.0
|
||||
@@ -119,6 +123,10 @@ impl WelfordStats {
|
||||
}
|
||||
|
||||
/// Sample variance (unbiased). Returns 0.0 if count < 2.
|
||||
///
|
||||
/// The `count < 2` guard is load-bearing (ADR-154 §7.4 #10): at n=0 the
|
||||
/// `(self.count - 1)` term would underflow `0usize − 1` and at n=1 it would
|
||||
/// divide by zero. Pinned by `welford_finite_at_n0_and_n1`.
|
||||
pub fn sample_variance(&self) -> f64 {
|
||||
if self.count < 2 {
|
||||
0.0
|
||||
@@ -958,6 +966,52 @@ mod tests {
|
||||
assert!((w.variance() - 0.0).abs() < 1e-10);
|
||||
}
|
||||
|
||||
/// ADR-154 §7.4 #10: every statistic must stay FINITE at the n=0 and n=1
|
||||
/// boundaries. This pins the load-bearing `count < 2` guards: without them
|
||||
/// `sample_variance` at n=0 underflows `(0usize − 1)` and divides by a huge
|
||||
/// bogus divisor, and `variance`/`z_score` produce `0.0/0.0 = NaN`. Same
|
||||
/// family as the §4 divide-by-(n−1) window trio.
|
||||
#[test]
|
||||
fn welford_finite_at_n0_and_n1() {
|
||||
// n = 0: fresh accumulator, nothing observed.
|
||||
let w0 = WelfordStats::new();
|
||||
assert_eq!(w0.count, 0);
|
||||
for v in [
|
||||
w0.mean,
|
||||
w0.variance(),
|
||||
w0.sample_variance(),
|
||||
w0.std_dev(),
|
||||
w0.z_score(123.0),
|
||||
] {
|
||||
assert!(v.is_finite(), "n=0 statistic must be finite, got {v}");
|
||||
}
|
||||
// Documented sentinels at n=0.
|
||||
assert_eq!(w0.variance(), 0.0);
|
||||
assert_eq!(w0.sample_variance(), 0.0);
|
||||
assert_eq!(w0.std_dev(), 0.0);
|
||||
assert_eq!(w0.z_score(123.0), 0.0);
|
||||
|
||||
// n = 1: a single observation has no spread.
|
||||
let mut w1 = WelfordStats::new();
|
||||
w1.update(7.5);
|
||||
assert_eq!(w1.count, 1);
|
||||
for v in [
|
||||
w1.mean,
|
||||
w1.variance(),
|
||||
w1.sample_variance(),
|
||||
w1.std_dev(),
|
||||
w1.z_score(7.5),
|
||||
w1.z_score(999.0),
|
||||
] {
|
||||
assert!(v.is_finite(), "n=1 statistic must be finite, got {v}");
|
||||
}
|
||||
assert_eq!(w1.variance(), 0.0);
|
||||
assert_eq!(w1.sample_variance(), 0.0);
|
||||
assert_eq!(w1.std_dev(), 0.0);
|
||||
// z_score guards on near-zero sd → 0.0 even for an off-mean query.
|
||||
assert_eq!(w1.z_score(999.0), 0.0);
|
||||
}
|
||||
|
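Review note: for readers unfamiliar with why the `count < 2` guards are load-bearing, here is a minimal Welford accumulator showing where each division happens. It is a sketch, not the crate's `WelfordStats`: the struct name, the `SD_EPSILON` threshold in `z_score`, and the exact field layout are assumptions.

```rust
/// Hypothetical threshold below which the standard deviation counts as zero.
const SD_EPSILON: f64 = 1e-15;

#[derive(Debug, Default, Clone)]
pub struct Welford {
    pub count: usize,
    pub mean: f64,
    m2: f64, // running sum of squared deviations from the mean
}

impl Welford {
    /// Standard single-pass Welford update.
    pub fn update(&mut self, x: f64) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f64;
        self.m2 += delta * (x - self.mean);
    }

    /// Population variance; the guard avoids 0.0 / 0.0 at n = 0.
    pub fn variance(&self) -> f64 {
        if self.count < 2 { 0.0 } else { self.m2 / self.count as f64 }
    }

    /// Sample variance; the guard avoids `0usize - 1` at n = 0 and a
    /// division by zero at n = 1.
    pub fn sample_variance(&self) -> f64 {
        if self.count < 2 { 0.0 } else { self.m2 / (self.count - 1) as f64 }
    }

    /// z-score that falls back to 0.0 when there is no measurable spread.
    pub fn z_score(&self, x: f64) -> f64 {
        let sd = self.variance().sqrt();
        if sd < SD_EPSILON { 0.0 } else { (x - self.mean) / sd }
    }
}
```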
||||
#[test]
|
||||
fn test_link_baseline_stats() {
|
||||
let mut stats = LinkBaselineStats::new(4);
|
||||
|
||||
@@ -14,6 +14,15 @@
|
||||
|
||||
use super::QualityScored;
|
||||
|
||||
/// Multiplicative coherence penalty applied per recorded contradiction
|
||||
/// (ADR-154 §7.4 — de-magicked; EMPIRICAL DEFAULT). `n` contradictions scale
|
||||
/// coherence by `CONTRADICTION_PENALTY.powi(n)`.
|
||||
const CONTRADICTION_PENALTY: f32 = 0.8;
|
||||
|
||||
/// Confidence-bound half-width added per recorded contradiction (clamped so the
|
||||
/// interval stays within `[0, 1]`). EMPIRICAL DEFAULT.
|
||||
const CONTRADICTION_BOUND_HALFWIDTH: f32 = 0.1;
|
||||
|
||||
/// Identifies which sensing family produced a fused frame, so one
|
||||
/// [`QualityScore`] can be correlated across the signal-domain fuser
|
||||
/// (`multistatic.rs`) and the embedding-domain fuser (`viewpoint/fusion.rs`).
|
||||
@@ -113,7 +122,7 @@ impl QualityScore {
|
||||
/// streaming engine routes/gates on.
|
||||
#[must_use]
|
||||
pub fn penalized_coherence(&self) -> f32 {
|
||||
let penalty = 0.8_f32.powi(self.contradiction_flags.len() as i32);
|
||||
let penalty = CONTRADICTION_PENALTY.powi(self.contradiction_flags.len() as i32);
|
||||
(self.base_coherence * penalty).clamp(0.0, 1.0)
|
||||
}
|
||||
}
|
||||
@@ -127,7 +136,8 @@ impl QualityScored for QualityScore {
|
||||
// Width grows with the number of tolerated contradictions: each adds
|
||||
// ±0.1 of uncertainty around the penalized coherence, clamped to [0,1].
|
||||
let c = self.penalized_coherence();
|
||||
let half = (0.1 * self.contradiction_flags.len() as f32).min(c.min(1.0 - c));
|
||||
let half =
|
||||
(CONTRADICTION_BOUND_HALFWIDTH * self.contradiction_flags.len() as f32).min(c.min(1.0 - c));
|
||||
((c - half).max(0.0), (c + half).min(1.0))
|
||||
}
|
||||
}
|
||||
@@ -185,4 +195,24 @@ mod tests {
|
||||
assert!((0.0..=1.0).contains(&s));
|
||||
assert!(0.0 <= lo && lo <= hi && hi <= 1.0);
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked penalty/bound consts must equal the prior literals.
|
||||
#[test]
|
||||
fn fusion_quality_consts_unchanged_from_literals() {
|
||||
assert_eq!(CONTRADICTION_PENALTY, 0.8_f32);
|
||||
assert_eq!(CONTRADICTION_BOUND_HALFWIDTH, 0.1_f32);
|
||||
}
|
||||
|
||||
/// Zero contradictions: penalty is `0.8^0 = 1.0` (coherence unchanged) and
|
||||
/// the confidence bounds collapse to a point. Pins the n=0 boundary.
|
||||
#[test]
|
||||
fn no_contradiction_is_identity() {
|
||||
let q = base();
|
||||
assert!(q.contradiction_flags.is_empty());
|
||||
assert!((q.penalized_coherence() - q.base_coherence).abs() < 1e-6);
|
||||
let (lo, hi) = q.confidence_bounds();
|
||||
assert!((hi - lo).abs() < 1e-6); // half-width is 0 with no contradictions
|
||||
}
|
||||
}
|
||||
|
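Review note: the penalty and the bound half-width interact through the clamp, which is easy to miss when reading the diff. The sketch below restates the two formulas from this file as free functions; taking `base` and the contradiction count `n` as plain arguments is a simplification for illustration, not the `QualityScore` API.

```rust
// Values copied from the consts introduced by this change.
const CONTRADICTION_PENALTY: f32 = 0.8;
const CONTRADICTION_BOUND_HALFWIDTH: f32 = 0.1;

/// Base coherence scaled by 0.8 per contradiction, clamped to [0, 1].
pub fn penalized_coherence(base: f32, n: usize) -> f32 {
    (base * CONTRADICTION_PENALTY.powi(n as i32)).clamp(0.0, 1.0)
}

/// Interval around the penalized coherence; the half-width grows by 0.1 per
/// contradiction but never pushes either end outside [0, 1].
pub fn confidence_bounds(base: f32, n: usize) -> (f32, f32) {
    let c = penalized_coherence(base, n);
    let half = (CONTRADICTION_BOUND_HALFWIDTH * n as f32).min(c.min(1.0 - c));
    ((c - half).max(0.0), (c + half).min(1.0))
}
```

For example, a base coherence of 0.9 with two contradictions gives 0.9 × 0.64 = 0.576 and a half-width of min(0.2, 0.424) = 0.2, so the bounds are approximately (0.376, 0.776).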
||||
@@ -19,6 +19,16 @@
|
||||
//! - Sakoe & Chiba (1978), "Dynamic programming algorithm optimization
|
||||
//! for spoken word recognition" IEEE TASSP
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Tuning constants (ADR-154 §7.4 — de-magicked; value unchanged)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Minimum second-best DTW distance below which the relative-margin
|
||||
/// confidence formula `1 - best/second_best` would divide by a near-zero
|
||||
/// denominator. Below this we fall back to the `max_distance`-relative
|
||||
/// confidence. Empirical guard, not a tuned operating point.
|
||||
const CONFIDENCE_SECOND_BEST_EPSILON: f64 = 1e-10;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Error types
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -236,7 +246,10 @@ impl GestureClassifier {
|
||||
let recognized = best_dist <= self.config.max_distance;
|
||||
|
||||
// Confidence: how much better is the best match vs second best
|
||||
let confidence = if recognized && second_best_dist.is_finite() && second_best_dist > 1e-10 {
|
||||
let confidence = if recognized
|
||||
&& second_best_dist.is_finite()
|
||||
&& second_best_dist > CONFIDENCE_SECOND_BEST_EPSILON
|
||||
{
|
||||
(1.0 - best_dist / second_best_dist).clamp(0.0, 1.0)
|
||||
} else if recognized {
|
||||
(1.0 - best_dist / self.config.max_distance).clamp(0.0, 1.0)
|
||||
@@ -364,7 +377,24 @@ fn dtw_distance(seq_a: &[Vec<f64>], seq_b: &[Vec<f64>], band_width: usize) -> f6
|
||||
}
|
||||
|
||||
/// Euclidean distance between two feature vectors.
|
||||
///
|
||||
/// # Caller contract (ADR-154 §7.4 #12)
|
||||
/// `a` and `b` are expected to have the **same** dimension (`feature_dim`).
|
||||
/// The implementation `zip`s the two slices, so on a length mismatch it
|
||||
/// **silently truncates to the shorter vector** rather than erroring. Every
|
||||
/// in-tree caller (`dtw_distance` over a single classifier's templates)
|
||||
/// already enforces equal `feature_dim`, so a mismatch indicates a
|
||||
/// construction bug; a `debug_assert!` makes that loud in debug builds while
|
||||
/// keeping the release operating path (and its output) unchanged.
|
||||
fn euclidean_distance(a: &[f64], b: &[f64]) -> f64 {
|
||||
debug_assert_eq!(
|
||||
a.len(),
|
||||
b.len(),
|
||||
"euclidean_distance: feature-vector length mismatch ({} vs {}) — \
|
||||
zip() would silently truncate; callers must use a uniform feature_dim",
|
||||
a.len(),
|
||||
b.len()
|
||||
);
|
||||
a.iter()
|
||||
.zip(b.iter())
|
||||
.map(|(x, y)| (x - y) * (x - y))
|
||||
@@ -688,4 +718,34 @@ mod tests {
|
||||
assert_eq!(GestureType::Circle.name(), "circle");
|
||||
assert_eq!(GestureType::Custom.name(), "custom");
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4 #12 + de-magic: boundary / characterization tests.
|
||||
|
||||
/// De-magicked confidence epsilon must equal the prior literal.
|
||||
#[test]
|
||||
fn confidence_epsilon_unchanged_from_literal() {
|
||||
assert_eq!(CONFIDENCE_SECOND_BEST_EPSILON, 1e-10);
|
||||
}
|
||||
|
||||
/// `dtw_distance` returns +inf when EITHER sequence is empty. Pins the
|
||||
/// n=0 / m=0 boundary (previously exercised only with n,m >= 3).
|
||||
#[test]
|
||||
fn dtw_empty_sequence_is_infinite() {
|
||||
let nonempty: Vec<Vec<f64>> = vec![vec![1.0], vec![2.0]];
|
||||
let empty: Vec<Vec<f64>> = vec![];
|
||||
assert!(dtw_distance(&empty, &nonempty, 3).is_infinite());
|
||||
assert!(dtw_distance(&nonempty, &empty, 3).is_infinite());
|
||||
assert!(dtw_distance(&empty, &empty, 3).is_infinite());
|
||||
}
|
||||
|
||||
/// `euclidean_distance` over equal-length vectors is the L2 norm of the
|
||||
/// difference. Pins the documented same-dimension caller contract (#12);
|
||||
/// the mismatch case is guarded by a debug_assert in debug builds and
|
||||
/// truncates in release — not exercised here to keep the test
|
||||
/// release/debug-agnostic.
|
||||
#[test]
|
||||
fn euclidean_distance_equal_length_is_l2() {
|
||||
assert!((euclidean_distance(&[1.0, 2.0, 2.0], &[0.0, 0.0, 0.0]) - 3.0).abs() < 1e-12);
|
||||
assert_eq!(euclidean_distance(&[], &[]), 0.0);
|
||||
}
|
||||
}
|
||||
|
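Review note: the confidence fallback guarded by `CONFIDENCE_SECOND_BEST_EPSILON` can be sketched as a pure function. The first two branches follow the diff; the final branch (confidence 0.0 when the gesture is not recognized) is not visible in this hunk and is an assumption, as is the function name `match_confidence`.

```rust
/// Same value as the const introduced by this change.
const CONFIDENCE_SECOND_BEST_EPSILON: f64 = 1e-10;

/// Relative-margin confidence with a max-distance fallback.
pub fn match_confidence(best: f64, second_best: f64, max_distance: f64) -> f64 {
    let recognized = best <= max_distance;
    if recognized && second_best.is_finite() && second_best > CONFIDENCE_SECOND_BEST_EPSILON {
        // Margin over the runner-up template.
        (1.0 - best / second_best).clamp(0.0, 1.0)
    } else if recognized {
        // Only one template, or a degenerate runner-up: scale by the cutoff.
        (1.0 - best / max_distance).clamp(0.0, 1.0)
    } else {
        // Assumed: an unrecognized gesture carries no confidence.
        0.0
    }
}
```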
||||
@@ -21,6 +21,11 @@
|
||||
|
||||
use std::collections::VecDeque;
|
||||
|
||||
/// Minimum acceleration magnitude (ADR-154 §7.4 — de-magicked) below which the
|
||||
/// lead-time estimate `t = (v_thresh - v) / a` would divide by a near-zero
|
||||
/// acceleration; below this the lead time is reported as 0.0.
|
||||
const LEAD_TIME_MIN_ACCEL: f64 = 1e-10;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Error types
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -233,7 +238,7 @@ impl IntentionDetector {
|
||||
let detected = self.sustained_count >= self.config.min_sustained_frames;
|
||||
|
||||
// Estimate lead time based on current acceleration and velocity
|
||||
let estimated_lead = if detected && accel_mag > 1e-10 {
|
||||
let estimated_lead = if detected && accel_mag > LEAD_TIME_MIN_ACCEL {
|
||||
// Time until velocity reaches threshold: t = (v_thresh - v) / a
|
||||
let remaining = (self.config.max_pre_movement_velocity - velocity_mag) / accel_mag;
|
||||
remaining.clamp(0.0, self.config.max_lead_time_s)
|
||||
@@ -508,4 +513,29 @@ mod tests {
|
||||
let sd = embedding_second_diff(&a, &b, &c, 1.0);
|
||||
assert!((sd[0] - 2.0).abs() < 1e-10);
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked lead-time accel guard must equal the prior literal.
|
||||
#[test]
|
||||
fn lead_time_accel_const_unchanged_from_literal() {
|
||||
assert_eq!(LEAD_TIME_MIN_ACCEL, 1e-10);
|
||||
}
|
||||
|
||||
/// A static (zero-motion) embedding stream produces ~zero acceleration, so
|
||||
/// the lead-time estimate stays at the 0.0 sentinel rather than dividing by
|
||||
/// a near-zero acceleration. Pins the `accel_mag <= LEAD_TIME_MIN_ACCEL`
|
||||
/// branch behaviour.
|
||||
#[test]
|
||||
fn lead_time_zero_for_static_stream() {
|
||||
let config = make_config();
|
||||
let mut detector = IntentionDetector::new(config).unwrap();
|
||||
let mut last = None;
|
||||
for frame in 0..6_u64 {
|
||||
last = Some(detector.update(&static_embedding(), frame * 50_000).unwrap());
|
||||
}
|
||||
let signal = last.unwrap();
|
||||
assert!(signal.acceleration_magnitude < LEAD_TIME_MIN_ACCEL.max(1e-9));
|
||||
assert_eq!(signal.estimated_lead_time_s, 0.0);
|
||||
}
|
||||
}
|
||||
|
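Review note: the lead-time estimate t = (v_thresh - v) / a and its guard can be checked with a small standalone function. The parameter names below are stand-ins for the `IntentionDetector` config fields, and the sketch assumes `max_lead_time_s` is non-negative (otherwise `clamp` would panic).

```rust
/// Same value as the const introduced by this change.
const LEAD_TIME_MIN_ACCEL: f64 = 1e-10;

/// Time until the velocity reaches the threshold at the current acceleration,
/// clamped to [0, max_lead_time_s]; 0.0 when undetected or not accelerating.
pub fn estimated_lead_time(
    detected: bool,
    velocity_mag: f64,
    accel_mag: f64,
    v_thresh: f64,
    max_lead_time_s: f64,
) -> f64 {
    if detected && accel_mag > LEAD_TIME_MIN_ACCEL {
        ((v_thresh - velocity_mag) / accel_mag).clamp(0.0, max_lead_time_s)
    } else {
        0.0
    }
}
```

The clamp also covers the case where the velocity already exceeds the threshold: the raw estimate is negative and is reported as 0.0.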
||||
@@ -18,6 +18,38 @@
|
||||
|
||||
use crate::ruvsense::field_model::WelfordStats;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Drift-detection thresholds (ADR-154 §7.4 — de-magicked; EMPIRICAL DEFAULTS).
|
||||
//
|
||||
// These encode the "Key Invariants" documented in the module header. They were
|
||||
// previously bare literals scattered through `update_daily`/`is_ready`. Lifting
|
||||
// them to named consts makes the policy explicit and a future retune a visible,
|
||||
// tested change. Values are unchanged.
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Minimum observation days before drift detection activates.
|
||||
const BASELINE_MIN_OBSERVATION_DAYS: u32 = 7;
|
||||
|
||||
/// EMA update weight applied to the embedding centroid each day (the new
|
||||
/// sample's weight; the centroid retains `1 - EMBEDDING_EMA_ALPHA` of its old
|
||||
/// value, i.e. a decay of 0.95). Kept as the literal `0.05` rather than
|
||||
/// `1.0 - 0.95_f32` to stay bit-identical (the f32 subtraction is not exactly
|
||||
/// 0.05).
|
||||
const EMBEDDING_EMA_ALPHA: f32 = 0.05;
|
||||
|
||||
/// Per-metric absolute z-score above which a day counts toward sustained drift.
|
||||
const DRIFT_ZSCORE_SIGMA: f64 = 2.0;
|
||||
|
||||
/// Consecutive drift days required before a drift report is emitted.
|
||||
const DRIFT_SUSTAINED_DAYS: u32 = 3;
|
||||
|
||||
/// Consecutive drift days at/above which monitoring escalates from `Drift`
|
||||
/// to `RiskCorrelation`.
|
||||
const DRIFT_ESCALATION_DAYS: u32 = 7;
|
||||
|
||||
/// Denominator guard for cosine similarity (zero-norm vectors ⇒ 0.0).
|
||||
const COSINE_SIMILARITY_EPSILON: f32 = 1e-9;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Error types
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -226,7 +258,7 @@ impl PersonalBaseline {
|
||||
|
||||
/// Whether baseline has enough data for drift detection.
|
||||
pub fn is_ready(&self) -> bool {
|
||||
self.observation_days >= 7
|
||||
self.observation_days >= BASELINE_MIN_OBSERVATION_DAYS
|
||||
}
|
||||
|
||||
/// Update baseline with a daily summary.
|
||||
@@ -240,10 +272,10 @@ impl PersonalBaseline {
|
||||
self.observation_days += 1;
|
||||
self.updated_at_us = timestamp_us;
|
||||
|
||||
// Update embedding centroid with EMA (decay = 0.95)
|
||||
// Update embedding centroid with EMA (decay 0.95, alpha = 1 - 0.95)
|
||||
if let Some(ref emb) = summary.embedding_centroid {
|
||||
if emb.len() == self.embedding_centroid.len() {
|
||||
let alpha = 0.05_f32; // 1 - 0.95
|
||||
let alpha = EMBEDDING_EMA_ALPHA;
|
||||
for (c, e) in self.embedding_centroid.iter_mut().zip(emb.iter()) {
|
||||
*c = (1.0 - alpha) * *c + alpha * *e;
|
||||
}
|
||||
@@ -271,20 +303,20 @@ impl PersonalBaseline {
|
||||
|
||||
let idx = Self::metric_index(metric);
|
||||
|
||||
if z.abs() > 2.0 {
|
||||
if z.abs() > DRIFT_ZSCORE_SIGMA {
|
||||
self.drift_counters[idx] += 1;
|
||||
} else {
|
||||
self.drift_counters[idx] = 0;
|
||||
}
|
||||
|
||||
if self.drift_counters[idx] >= 3 {
|
||||
if self.drift_counters[idx] >= DRIFT_SUSTAINED_DAYS {
|
||||
let direction = if z > 0.0 {
|
||||
DriftDirection::Increasing
|
||||
} else {
|
||||
DriftDirection::Decreasing
|
||||
};
|
||||
|
||||
let level = if self.drift_counters[idx] >= 7 {
|
||||
let level = if self.drift_counters[idx] >= DRIFT_ESCALATION_DAYS {
|
||||
MonitoringLevel::RiskCorrelation
|
||||
} else {
|
||||
MonitoringLevel::Drift
|
||||
@@ -310,7 +342,7 @@ impl PersonalBaseline {
|
||||
|
||||
/// Check readiness at a specific observation day count (internal helper).
|
||||
fn is_ready_at(&self, days: u32) -> bool {
|
||||
days >= 7
|
||||
days >= BASELINE_MIN_OBSERVATION_DAYS
|
||||
}
|
||||
|
||||
/// Get current drift counter for a metric.
|
||||
@@ -545,12 +577,15 @@ impl EmbeddingHistory {
|
||||
}
|
||||
|
||||
/// Cosine similarity between two f32 vectors.
|
||||
///
|
||||
/// Returns `0.0` if either vector has (near-)zero norm — the product of norms
|
||||
/// falls below [`COSINE_SIMILARITY_EPSILON`], so the division is skipped.
|
||||
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
|
||||
let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
|
||||
let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
|
||||
let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
|
||||
let denom = norm_a * norm_b;
|
||||
if denom < 1e-9 {
|
||||
if denom < COSINE_SIMILARITY_EPSILON {
|
||||
0.0
|
||||
} else {
|
||||
dot / denom
|
||||
@@ -1017,4 +1052,40 @@ mod tests {
|
||||
assert!(*i < h.len());
|
||||
}
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// The de-magicked drift thresholds MUST equal the prior bare literals.
|
||||
#[test]
|
||||
fn drift_consts_unchanged_from_literals() {
|
||||
assert_eq!(BASELINE_MIN_OBSERVATION_DAYS, 7);
|
||||
assert_eq!(EMBEDDING_EMA_ALPHA, 0.05_f32);
|
||||
assert_eq!(DRIFT_ZSCORE_SIGMA, 2.0);
|
||||
assert_eq!(DRIFT_SUSTAINED_DAYS, 3);
|
||||
assert_eq!(DRIFT_ESCALATION_DAYS, 7);
|
||||
assert_eq!(COSINE_SIMILARITY_EPSILON, 1e-9_f32);
|
||||
}
|
||||
|
||||
/// `is_ready_at` pins the exact day-6 (not ready) / day-7 (ready) boundary
|
||||
/// independent of Welford state.
|
||||
#[test]
|
||||
fn is_ready_at_day_boundary() {
|
||||
let baseline = PersonalBaseline::new(1, 8);
|
||||
assert!(!baseline.is_ready_at(BASELINE_MIN_OBSERVATION_DAYS - 1)); // day 6
|
||||
assert!(baseline.is_ready_at(BASELINE_MIN_OBSERVATION_DAYS)); // day 7
|
||||
assert!(baseline.is_ready_at(BASELINE_MIN_OBSERVATION_DAYS + 1)); // day 8
|
||||
}
|
||||
|
||||
/// Cosine similarity returns 0.0 for a zero-norm vector (denominator below
|
||||
/// `COSINE_SIMILARITY_EPSILON`) and a finite value otherwise.
|
||||
#[test]
|
||||
fn cosine_similarity_zero_vector_is_zero() {
|
||||
let zero = [0.0_f32; 4];
|
||||
let v = [1.0_f32, 2.0, 3.0, 4.0];
|
||||
assert_eq!(cosine_similarity(&zero, &v), 0.0);
|
||||
assert_eq!(cosine_similarity(&v, &zero), 0.0);
|
||||
assert_eq!(cosine_similarity(&zero, &zero), 0.0);
|
||||
// identical non-zero vectors -> ~1.0
|
||||
assert!((cosine_similarity(&v, &v) - 1.0).abs() < 1e-5);
|
||||
}
|
||||
}
|
||||
|
||||
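The zero-norm guard documented in the hunk above can be tried in isolation. The sketch below mirrors the function body shown in the diff; the constant is redeclared locally as a stand-in for the crate's `COSINE_SIMILARITY_EPSILON`, and nothing else from the crate is assumed.

```rust
/// Local stand-in for the crate's `COSINE_SIMILARITY_EPSILON` (same value).
const COSINE_SIMILARITY_EPSILON: f32 = 1e-9;

/// Cosine similarity with a guard on the product of norms.
/// Returns 0.0 instead of dividing when either vector is (near-)zero.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    let denom = norm_a * norm_b;
    if denom < COSINE_SIMILARITY_EPSILON {
        0.0 // zero-norm input: skip the division entirely
    } else {
        dot / denom
    }
}
```

Without the guard, a zero vector would produce `0.0 / 0.0`, which is `NaN` in IEEE 754 arithmetic; the sentinel keeps downstream comparisons well defined.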
@@ -198,7 +198,15 @@ fn compute_cross_channel_coherence(frames: &[CanonicalCsiFrame]) -> f32 {
|
||||
((mean_corr + 1.0) / 2.0).clamp(0.0, 1.0) as f32
|
||||
}
|
||||
|
||||
/// Denominator guard for the Pearson correlation (ADR-154 §7.4 — de-magicked):
|
||||
/// a product of standard deviations below this is treated as a zero-variance
|
||||
/// (constant) input ⇒ correlation 0.0.
|
||||
const PEARSON_DENOMINATOR_EPSILON: f32 = 1e-12;
|
||||
|
||||
/// Pearson correlation coefficient between two f32 slices.
|
||||
///
|
||||
/// Returns `0.0` for empty inputs or when either slice has (near-)zero
|
||||
/// variance (the denominator falls below [`PEARSON_DENOMINATOR_EPSILON`]).
|
||||
fn pearson_correlation_f32(a: &[f32], b: &[f32]) -> f32 {
|
||||
let n = a.len().min(b.len());
|
||||
if n == 0 {
|
||||
@@ -222,7 +230,7 @@ fn pearson_correlation_f32(a: &[f32], b: &[f32]) -> f32 {
|
||||
}
|
||||
|
||||
let denom = (var_a * var_b).sqrt();
|
||||
if denom < 1e-12 {
|
||||
if denom < PEARSON_DENOMINATOR_EPSILON {
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
@@ -439,4 +447,24 @@ mod tests {
|
||||
assert_eq!(cfg.window_us, 200_000);
|
||||
assert!((cfg.min_coherence - 0.3).abs() < f32::EPSILON);
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked denominator epsilon must equal the prior literal.
|
||||
#[test]
|
||||
fn pearson_epsilon_unchanged_from_literal() {
|
||||
assert_eq!(PEARSON_DENOMINATOR_EPSILON, 1e-12_f32);
|
||||
}
|
||||
|
||||
/// A constant (zero-variance) input makes the denominator fall below the
|
||||
/// epsilon ⇒ correlation 0.0. Previously untested (existing tests use
|
||||
/// non-constant inputs).
|
||||
#[test]
|
||||
fn pearson_correlation_zero_variance() {
|
||||
let constant = vec![3.0_f32; 5];
|
||||
let varying = vec![1.0_f32, 2.0, 3.0, 4.0, 5.0];
|
||||
assert_eq!(pearson_correlation_f32(&constant, &varying), 0.0);
|
||||
assert_eq!(pearson_correlation_f32(&varying, &constant), 0.0);
|
||||
assert_eq!(pearson_correlation_f32(&constant, &constant), 0.0);
|
||||
}
|
||||
}
|
||||
|
||||
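The diff shows only the head and the denominator guard of `pearson_correlation_f32`, so the following is a minimal sketch of how such a function might look, not the crate's actual body. The mean/variance accumulation in the middle is an assumption; only the empty-input check, the `sqrt(var_a * var_b)` denominator, and the epsilon guard come from the hunks above.

```rust
/// Local stand-in for the crate's `PEARSON_DENOMINATOR_EPSILON` (same value).
const PEARSON_DENOMINATOR_EPSILON: f32 = 1e-12;

/// Hypothetical Pearson correlation with a zero-variance guard.
fn pearson_sketch(a: &[f32], b: &[f32]) -> f32 {
    let n = a.len().min(b.len());
    if n == 0 {
        return 0.0;
    }
    let nf = n as f32;
    let mean_a = a[..n].iter().sum::<f32>() / nf;
    let mean_b = b[..n].iter().sum::<f32>() / nf;

    // Assumed accumulation: unnormalized covariance and variances.
    let (mut cov, mut var_a, mut var_b) = (0.0_f32, 0.0_f32, 0.0_f32);
    for i in 0..n {
        let da = a[i] - mean_a;
        let db = b[i] - mean_b;
        cov += da * db;
        var_a += da * da;
        var_b += db * db;
    }

    let denom = (var_a * var_b).sqrt();
    if denom < PEARSON_DENOMINATOR_EPSILON {
        return 0.0; // constant input: correlation is undefined, report 0.0
    }
    cov / denom
}
```

A constant slice has zero variance, so `denom` is exactly `0.0` and the guard returns the sentinel, which is the case the new `pearson_correlation_zero_variance` test pins.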
@@ -201,12 +201,29 @@ fn find_static_subcarriers(
|
||||
|
||||
/// Estimate per-channel phase offsets using iterative Neumann-style refinement.
|
||||
///
|
||||
/// Channel 0 is the reference (offset = 0).
|
||||
/// Channel 0 is the reference (offset = 0). Thin wrapper that drops the
|
||||
/// iteration count; `estimate_phase_offsets_counted` is the instrumented core.
|
||||
fn estimate_phase_offsets(
|
||||
frames: &[CanonicalCsiFrame],
|
||||
static_indices: &[usize],
|
||||
config: &PhaseAlignConfig,
|
||||
) -> std::result::Result<Vec<f32>, PhaseAlignError> {
|
||||
estimate_phase_offsets_counted(frames, static_indices, config).map(|(offsets, _iters)| offsets)
|
||||
}
|
||||
|
||||
/// Core of [`estimate_phase_offsets`], also returning the number of refinement
|
||||
/// iterations actually executed.
|
||||
///
|
||||
/// The returned count is bounded by `config.max_iterations` — that bound is the
|
||||
/// convergence cap that guarantees termination on inputs the damped Neumann
|
||||
/// update never drives below `config.tolerance` (ADR-154 §7.4 #16). The offset
|
||||
/// vector is identical to the public `estimate_phase_offsets` path; only the
|
||||
/// iteration count is surfaced (for the cap test).
|
||||
fn estimate_phase_offsets_counted(
|
||||
frames: &[CanonicalCsiFrame],
|
||||
static_indices: &[usize],
|
||||
config: &PhaseAlignConfig,
|
||||
) -> std::result::Result<(Vec<f32>, usize), PhaseAlignError> {
|
||||
let n_ch = frames.len();
|
||||
let mut offsets = vec![0.0_f32; n_ch];
|
||||
|
||||
@@ -220,7 +237,7 @@ fn estimate_phase_offsets(
|
||||
}
|
||||
|
||||
// Iterative refinement (Neumann-style)
|
||||
for _iter in 0..config.max_iterations {
|
||||
for iter in 0..config.max_iterations {
|
||||
let mut max_update = 0.0_f32;
|
||||
|
||||
for c in 1..n_ch {
|
||||
@@ -241,12 +258,13 @@ fn estimate_phase_offsets(
|
||||
}
|
||||
|
||||
if max_update < config.tolerance {
|
||||
return Ok(offsets);
|
||||
return Ok((offsets, iter + 1));
|
||||
}
|
||||
}
|
||||
|
||||
// Even if we do not converge tightly, return best estimate
|
||||
Ok(offsets)
|
||||
// Even if we do not converge tightly, return best estimate. The loop ran the
|
||||
// full cap — termination is guaranteed by `config.max_iterations`.
|
||||
Ok((offsets, config.max_iterations))
|
||||
}
|
||||
|
||||
/// Apply phase correction: subtract offset from each subcarrier phase.
|
||||
@@ -446,6 +464,73 @@ mod tests {
|
||||
assert_eq!(cfg.min_static_subcarriers, 5);
|
||||
}
|
||||
|
||||
// ADR-154 §7.4 #16: the iterative LO-offset refinement must TERMINATE at the
|
||||
// `max_iterations` cap on a non-converging input — no unbounded loop.
|
||||
//
|
||||
// We force non-convergence by setting `tolerance` to an unreachable value
|
||||
// (the damped Neumann update on bounded phase residuals can never drive
|
||||
// `max_update` below 0.0), so the `max_update < tolerance` early-exit is
|
||||
// never taken. The instrumented core must then run *exactly*
|
||||
// `max_iterations` and return — proving the cap, not convergence, is what
|
||||
// bounds the loop.
|
||||
#[test]
|
||||
fn refinement_terminates_at_iteration_cap_when_not_converging() {
|
||||
let n_sub = 56;
|
||||
let max_iterations = 7;
|
||||
let config = PhaseAlignConfig {
|
||||
max_iterations,
|
||||
// Unreachable tolerance: `max_update` is always ≥ 0, never < 0.0,
|
||||
// so the convergence branch can never fire.
|
||||
tolerance: 0.0,
|
||||
static_fraction: 0.3,
|
||||
min_static_subcarriers: 5,
|
||||
};
|
||||
// Two channels with a real, persistent offset so each iteration keeps
|
||||
// producing a non-zero update.
|
||||
let f0 = make_frame_with_phase(n_sub, 0.0, 0.0);
|
||||
let f1 = make_frame_with_phase(n_sub, 0.0, 1.3);
|
||||
let frames = vec![f0, f1];
|
||||
let static_indices = find_static_subcarriers(&frames, &config).unwrap();
|
||||
|
||||
let (offsets, iters) =
|
||||
estimate_phase_offsets_counted(&frames, &static_indices, &config).unwrap();
|
||||
|
||||
// The cap, not convergence, terminated the loop.
|
||||
assert_eq!(
|
||||
iters, max_iterations,
|
||||
"expected the loop to run the full cap ({max_iterations}), got {iters}"
|
||||
);
|
||||
// It still returns a finite best-estimate offset vector.
|
||||
assert_eq!(offsets.len(), 2);
|
||||
assert!(offsets.iter().all(|o| o.is_finite()));
|
||||
// Reference channel offset stays 0.
|
||||
assert_eq!(offsets[0], 0.0);
|
||||
}
|
||||
|
||||
// Convergent companion: a near-identical input converges *before* the cap,
|
||||
// so the cap is an upper bound, not the only exit.
|
||||
#[test]
|
||||
fn refinement_converges_before_cap_on_easy_input() {
|
||||
let n_sub = 56;
|
||||
let config = PhaseAlignConfig {
|
||||
max_iterations: 50,
|
||||
tolerance: 1e-2, // loose: a tiny offset converges in a few iters
|
||||
static_fraction: 0.3,
|
||||
min_static_subcarriers: 5,
|
||||
};
|
||||
let f0 = make_frame_with_phase(n_sub, 0.0, 0.0);
|
||||
let f1 = make_frame_with_phase(n_sub, 0.0, 0.02);
|
||||
let frames = vec![f0, f1];
|
||||
let static_indices = find_static_subcarriers(&frames, &config).unwrap();
|
||||
let (_offsets, iters) =
|
||||
estimate_phase_offsets_counted(&frames, &static_indices, &config).unwrap();
|
||||
assert!(
|
||||
iters < config.max_iterations,
|
||||
"easy input should converge before the cap, ran {iters}/{}",
|
||||
config.max_iterations
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn phase_correction_preserves_amplitude() {
|
||||
let mut aligner = PhaseAligner::new(2);
|
||||
|
||||
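The cap-versus-convergence behaviour that the two new tests pin can be illustrated with a much smaller loop. The sketch below is a hypothetical scalar analogue (`refine_with_cap`, `target`, and `damping` are illustrative names, not part of the crate): it uses the same `update < tolerance` early exit and returns the iteration count the same way `estimate_phase_offsets_counted` does.

```rust
/// Damped fixed-point refinement toward `target`, capped at `max_iterations`.
/// Returns the estimate and the number of iterations actually executed.
fn refine_with_cap(
    target: f32,
    damping: f32,
    tolerance: f32,
    max_iterations: usize,
) -> (f32, usize) {
    let mut estimate = 0.0_f32;
    for iter in 0..max_iterations {
        let update = damping * (target - estimate);
        estimate += update;
        // Strict `<`: with `tolerance == 0.0` this can never be true,
        // because `update.abs()` is never negative.
        if update.abs() < tolerance {
            return (estimate, iter + 1);
        }
    }
    // Cap reached: return the best estimate so far.
    (estimate, max_iterations)
}
```

Calling `refine_with_cap(1.3, 0.5, 0.0, 7)` runs all seven iterations, while a loose tolerance such as `1e-2` on a small target exits early. The strict comparison is what makes `tolerance: 0.0` a reliable way to force the cap in a test.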
@@ -13,6 +13,27 @@
|
||||
|
||||
use crate::ruvsense::field_model::WelfordStats;
|
||||
|
||||
/// Nanoseconds per day, for migration-rate (m/day) conversion (ADR-154 §7.4 —
|
||||
/// de-magicked from the inline `86_400_000_000_000.0` literal). 24·60·60·1e9.
|
||||
const NS_PER_DAY: f64 = 86_400_000_000_000.0;
|
||||
|
||||
/// Minimum observed span (in days) below which migration rate is reported as
|
||||
/// 0.0 — guards `cumulative_drift_m / span_days` against a near-zero span.
|
||||
const MIGRATION_MIN_SPAN_DAYS: f64 = 1e-9;
|
||||
|
||||
// ADR-154 §7.4: the v1 fixed-map defaults below were bare literals in
|
||||
// `fixed_map()`. They are EMPIRICAL DEFAULTS (ADR-143), unchanged.
|
||||
|
||||
/// Default association radius (m): a sighting within this of a reflector's
|
||||
/// running mean is folded into it; otherwise it seeds a new reflector.
|
||||
const FIXED_MAP_ASSOC_RADIUS_M: f64 = 0.5;
|
||||
|
||||
/// Default minimum sightings before a reflector counts as "persistent".
|
||||
const FIXED_MAP_MIN_SIGHTINGS: u64 = 20;
|
||||
|
||||
/// Default minimum tap coherence for a sighting to be admitted.
|
||||
const FIXED_MAP_MIN_COHERENCE: f32 = 0.6;
|
||||
|
||||
/// Classification of a discovered persistent reflector (mirrors ADR-139
|
||||
/// `AnchorKind`; kept local to avoid a crate dependency on the WorldGraph).
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
@@ -102,8 +123,8 @@ impl PersistentReflector {
|
||||
if span_ns == 0 {
|
||||
return 0.0;
|
||||
}
|
||||
let span_days = span_ns as f64 / 86_400_000_000_000.0; // ns → days
|
||||
if span_days < 1e-9 {
|
||||
let span_days = span_ns as f64 / NS_PER_DAY; // ns → days
|
||||
if span_days < MIGRATION_MIN_SPAN_DAYS {
|
||||
return 0.0;
|
||||
}
|
||||
self.cumulative_drift_m / span_days
|
||||
@@ -145,9 +166,9 @@ impl RfSlam {
|
||||
pub fn fixed_map() -> Self {
|
||||
Self {
|
||||
reflectors: Vec::new(),
|
||||
assoc_radius_m: 0.5,
|
||||
min_sightings: 20,
|
||||
min_coherence: 0.6,
|
||||
assoc_radius_m: FIXED_MAP_ASSOC_RADIUS_M,
|
||||
min_sightings: FIXED_MAP_MIN_SIGHTINGS,
|
||||
min_coherence: FIXED_MAP_MIN_COHERENCE,
|
||||
discovery_enabled: false,
|
||||
}
|
||||
}
|
||||
@@ -298,4 +319,29 @@ mod tests {
|
||||
assert_eq!(anchors.len(), 1);
|
||||
assert_eq!(anchors[0].1, ReflectorClass::Wall);
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant + boundary characterization tests.
|
||||
|
||||
/// De-magicked constants must equal the prior inline literals.
|
||||
#[test]
|
||||
fn migration_consts_unchanged_from_literals() {
|
||||
assert_eq!(NS_PER_DAY, 86_400_000_000_000.0);
|
||||
assert_eq!(NS_PER_DAY, 24.0 * 60.0 * 60.0 * 1e9);
|
||||
assert_eq!(MIGRATION_MIN_SPAN_DAYS, 1e-9);
|
||||
assert_eq!(FIXED_MAP_ASSOC_RADIUS_M, 0.5);
|
||||
assert_eq!(FIXED_MAP_MIN_SIGHTINGS, 20);
|
||||
assert_eq!(FIXED_MAP_MIN_COHERENCE, 0.6_f32);
|
||||
}
|
||||
|
||||
/// A single sighting has first_ns == last_ns ⇒ zero span ⇒ migration rate
|
||||
/// 0.0 (pins the `span_ns == 0` / `span_days < MIGRATION_MIN_SPAN_DAYS`
|
||||
/// guard, and that such a reflector classifies as a Wall).
|
||||
#[test]
|
||||
fn migration_zero_span_is_zero_rate() {
|
||||
let mut slam = RfSlam::with_discovery(0.5, 1, 0.6);
|
||||
slam.observe(&obs([1.0, 2.0, 0.0], 12_345));
|
||||
let r = slam.persistent()[0];
|
||||
assert_eq!(r.migration_m_per_day(), 0.0);
|
||||
assert_eq!(r.classify(0.05, 1.0), ReflectorClass::Wall);
|
||||
}
|
||||
}
|
||||
|
||||
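As a worked example of the two migration-rate guards above, the sketch below reproduces the nanosecond-to-day conversion as a free function. The function signature and the use of `saturating_sub` to derive the span are assumptions for illustration; the constants and the two early returns follow the hunk.

```rust
/// Local stand-ins for the crate constants (same values).
const NS_PER_DAY: f64 = 86_400_000_000_000.0; // 24 * 60 * 60 * 1e9
const MIGRATION_MIN_SPAN_DAYS: f64 = 1e-9;

/// Hypothetical free-function form of the migration rate in metres per day.
fn migration_m_per_day(first_ns: u64, last_ns: u64, cumulative_drift_m: f64) -> f64 {
    // Assumption: the span is last - first, clamped at zero.
    let span_ns = last_ns.saturating_sub(first_ns);
    if span_ns == 0 {
        return 0.0; // single sighting: no span, no rate
    }
    let span_days = span_ns as f64 / NS_PER_DAY; // ns -> days
    if span_days < MIGRATION_MIN_SPAN_DAYS {
        return 0.0; // span too short to give a meaningful rate
    }
    cumulative_drift_m / span_days
}
```

Since `1e-9` days is 86,400 ns (86.4 µs), the second guard suppresses the rate for any two sightings closer together than that; a reflector that drifted 1 m over two full days reports 0.5 m/day.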
@@ -18,6 +18,16 @@ use midstreamer_temporal_compare::{ComparisonAlgorithm, Sequence, TemporalCompar
|
||||
|
||||
use super::gesture::{GestureConfig, GestureError, GestureResult, GestureTemplate};
|
||||
|
||||
/// Minimum second-best distance (ADR-154 §7.4 — de-magicked) below which the
|
||||
/// relative-margin confidence `1 - best/second_best` would divide by a
|
||||
/// near-zero denominator; below this we fall back to the `max_distance`-relative
|
||||
/// confidence. Mirrors the same guard in `gesture.rs`.
|
||||
const CONFIDENCE_SECOND_BEST_EPSILON: f64 = 1e-10;
|
||||
|
||||
/// Fixed-point scale used to quantize a frame's L2 norm to an i64 for the
|
||||
/// integer temporal comparator (`norm * SCALE`, truncated toward zero by the `as i64` cast). The scale is an empirical resolution choice.
|
||||
const NORM_QUANTIZATION_SCALE: f64 = 1000.0;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Configuration
|
||||
// ---------------------------------------------------------------------------
|
||||
@@ -192,7 +202,10 @@ impl TemporalGestureClassifier {
|
||||
let recognized = best_distance <= self.config.max_distance;
|
||||
|
||||
// Confidence based on margin between best and second-best
|
||||
let confidence = if recognized && second_best.is_finite() && second_best > 1e-10 {
|
||||
let confidence = if recognized
|
||||
&& second_best.is_finite()
|
||||
&& second_best > CONFIDENCE_SECOND_BEST_EPSILON
|
||||
{
|
||||
(1.0 - best_distance / second_best).clamp(0.0, 1.0)
|
||||
} else if recognized {
|
||||
(1.0 - best_distance / self.config.max_distance).clamp(0.0, 1.0)
|
||||
@@ -244,13 +257,13 @@ impl TemporalGestureClassifier {
|
||||
|
||||
/// Convert a feature sequence to a midstreamer `Sequence<i64>`.
|
||||
///
|
||||
/// Each frame's L2 norm is quantized to an i64 (multiplied by 1000)
|
||||
/// for use with the generic comparator.
|
||||
/// Each frame's L2 norm is quantized to an i64 (multiplied by
|
||||
/// [`NORM_QUANTIZATION_SCALE`]) for use with the generic comparator.
|
||||
fn to_sequence(frames: &[Vec<f64>]) -> Sequence<i64> {
|
||||
let mut seq = Sequence::new();
|
||||
for (i, frame) in frames.iter().enumerate() {
|
||||
let norm = frame.iter().map(|x| x * x).sum::<f64>().sqrt();
|
||||
let quantized = (norm * 1000.0) as i64;
|
||||
let quantized = (norm * NORM_QUANTIZATION_SCALE) as i64;
|
||||
seq.push(quantized, i as u64);
|
||||
}
|
||||
seq
|
||||
@@ -537,4 +550,14 @@ mod tests {
|
||||
let dbg = format!("{:?}", classifier);
|
||||
assert!(dbg.contains("TemporalGestureClassifier"));
|
||||
}
|
||||
|
||||
// -- ADR-154 §7.4: de-magic-constant pin test.
|
||||
|
||||
/// De-magicked confidence epsilon + quantization scale must equal the
|
||||
/// prior inline literals.
|
||||
#[test]
|
||||
fn temporal_gesture_consts_unchanged_from_literals() {
|
||||
assert_eq!(CONFIDENCE_SECOND_BEST_EPSILON, 1e-10);
|
||||
assert_eq!(NORM_QUANTIZATION_SCALE, 1000.0);
|
||||
}
|
||||
}
|
||||
|
||||
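The confidence fallback and the norm quantization described above can be sketched as two small functions. The function names and the `0.0` returned for an unrecognized gesture are assumptions (the diff cuts off before the final branch); the two recognized branches and the quantization expression follow the hunks.

```rust
/// Local stand-ins for the crate constants (same values).
const CONFIDENCE_SECOND_BEST_EPSILON: f64 = 1e-10;
const NORM_QUANTIZATION_SCALE: f64 = 1000.0;

/// Hypothetical margin-based confidence in [0, 1].
fn margin_confidence(best: f64, second_best: f64, max_distance: f64) -> f64 {
    let recognized = best <= max_distance;
    if recognized && second_best.is_finite() && second_best > CONFIDENCE_SECOND_BEST_EPSILON {
        // Relative margin between the best and second-best template.
        (1.0 - best / second_best).clamp(0.0, 1.0)
    } else if recognized {
        // No usable runner-up: fall back to distance relative to the limit.
        (1.0 - best / max_distance).clamp(0.0, 1.0)
    } else {
        0.0 // assumption: unrecognized gestures get zero confidence
    }
}

/// Quantize a frame's L2 norm to an integer, truncating toward zero.
fn quantize_norm(frame: &[f64]) -> i64 {
    let norm = frame.iter().map(|x| x * x).sum::<f64>().sqrt();
    (norm * NORM_QUANTIZATION_SCALE) as i64
}
```

With a scale of 1000, two frames whose norms differ by less than 0.001 can quantize to the same integer, which is the resolution trade-off the constant's doc comment refers to.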
@@ -9,9 +9,10 @@
|
||||
|
||||
use ndarray::Array2;
|
||||
use num_complex::Complex64;
|
||||
use rustfft::FftPlanner;
|
||||
use rustfft::{Fft, FftPlanner};
|
||||
use ruvector_attn_mincut::attn_mincut;
|
||||
use std::f64::consts::PI;
|
||||
use std::sync::Arc;
|
||||
|
||||
/// Configuration for spectrogram generation.
|
||||
#[derive(Debug, Clone)]
|
||||
@@ -87,12 +88,40 @@ pub fn compute_spectrogram(
|
||||
return Err(SpectrogramError::InvalidWindowSize);
|
||||
}
|
||||
|
||||
let n_frames = (signal.len() - config.window_size) / config.hop_size + 1;
|
||||
let n_freq = config.window_size / 2 + 1;
|
||||
let window = make_window(config.window_fn, config.window_size);
|
||||
|
||||
let mut planner = FftPlanner::new();
|
||||
let fft = planner.plan_fft_forward(config.window_size);
|
||||
let window = make_window(config.window_fn, config.window_size);
|
||||
Ok(compute_spectrogram_with_plan(
|
||||
signal,
|
||||
sample_rate,
|
||||
config,
|
||||
&fft,
|
||||
&window,
|
||||
))
|
||||
}
|
||||
|
||||
/// STFT core that runs against a **pre-planned** FFT and pre-built window.
|
||||
///
|
||||
/// ADR-154 §7.4 #20: `compute_spectrogram` re-plans the FFT on every call, so
|
||||
/// `compute_multi_subcarrier_spectrogram` (which previously called it once per subcarrier)
|
||||
/// re-planned the same length-`window_size` FFT for *every* subcarrier. This
|
||||
/// helper hoists the plan + window out of the per-subcarrier loop. The numeric
|
||||
/// body is byte-for-byte the old loop — only the plan/window construction is
|
||||
/// lifted — so the output is **bit-identical** to the per-call path (asserted by
|
||||
/// `multi_subcarrier_hoisted_plan_bit_identical`). Callers must pass a plan
|
||||
/// built for exactly `config.window_size` and a window of that length.
|
||||
fn compute_spectrogram_with_plan(
|
||||
signal: &[f64],
|
||||
sample_rate: f64,
|
||||
config: &SpectrogramConfig,
|
||||
fft: &Arc<dyn Fft<f64>>,
|
||||
window: &[f64],
|
||||
) -> Spectrogram {
|
||||
debug_assert_eq!(window.len(), config.window_size, "window/plan size mismatch");
|
||||
debug_assert_eq!(fft.len(), config.window_size, "FFT/window size mismatch");
|
||||
|
||||
let n_frames = (signal.len() - config.window_size) / config.hop_size + 1;
|
||||
let n_freq = config.window_size / 2 + 1;
|
||||
|
||||
let mut data = Array2::zeros((n_freq, n_frames));
|
||||
|
||||
@@ -116,13 +145,13 @@ pub fn compute_spectrogram(
|
||||
}
|
||||
}
|
||||
|
||||
Ok(Spectrogram {
|
||||
Spectrogram {
|
||||
data,
|
||||
n_freq,
|
||||
n_time: n_frames,
|
||||
freq_resolution: sample_rate / config.window_size as f64,
|
||||
time_resolution: config.hop_size as f64 / sample_rate,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
/// Compute spectrogram for each subcarrier from a temporal CSI matrix.
|
||||
@@ -134,12 +163,40 @@ pub fn compute_multi_subcarrier_spectrogram(
|
||||
sample_rate: f64,
|
||||
config: &SpectrogramConfig,
|
||||
) -> Result<Vec<Spectrogram>, SpectrogramError> {
|
||||
let (_, n_sc) = csi_temporal.dim();
|
||||
let mut spectrograms = Vec::with_capacity(n_sc);
|
||||
let (n_samples, n_sc) = csi_temporal.dim();
|
||||
|
||||
// ADR-154 §7.4 #20: validate *once* (same checks `compute_spectrogram`
|
||||
// makes), then plan the FFT + build the window *once* and reuse them across
|
||||
// every subcarrier instead of re-planning per column. The window length is
|
||||
// identical for all subcarriers, so this is pure hoisting — output stays
|
||||
// bit-identical to the per-call path.
|
||||
if n_samples < config.window_size {
|
||||
return Err(SpectrogramError::SignalTooShort {
|
||||
signal_len: n_samples,
|
||||
window_size: config.window_size,
|
||||
});
|
||||
}
|
||||
if config.hop_size == 0 {
|
||||
return Err(SpectrogramError::InvalidHopSize);
|
||||
}
|
||||
if config.window_size == 0 {
|
||||
return Err(SpectrogramError::InvalidWindowSize);
|
||||
}
|
||||
|
||||
let mut planner = FftPlanner::new();
|
||||
let fft = planner.plan_fft_forward(config.window_size);
|
||||
let window = make_window(config.window_fn, config.window_size);
|
||||
|
||||
let mut spectrograms = Vec::with_capacity(n_sc);
|
||||
for sc in 0..n_sc {
|
||||
let col: Vec<f64> = csi_temporal.column(sc).to_vec();
|
||||
spectrograms.push(compute_spectrogram(&col, sample_rate, config)?);
|
||||
spectrograms.push(compute_spectrogram_with_plan(
|
||||
&col,
|
||||
sample_rate,
|
||||
config,
|
||||
&fft,
|
||||
&window,
|
||||
));
|
||||
}
|
||||
|
||||
Ok(spectrograms)
|
||||
@@ -372,6 +429,67 @@ mod tests {
|
||||
assert_eq!(spec.n_freq, 65);
|
||||
}
|
||||
}
|
||||
|
||||
// ADR-154 §7.4 #20: the FFT-planner hoist in
|
||||
// `compute_multi_subcarrier_spectrogram` must produce **bit-identical**
|
||||
// output to calling `compute_spectrogram` (fresh planner) per subcarrier.
|
||||
// We compare `f64::to_bits` of every spectrogram value across several
|
||||
// window functions and a realistic 56-subcarrier CSI matrix — the planner
|
||||
// change only reorders *when* the (identical) plan is built, never the math.
|
||||
#[test]
|
||||
fn multi_subcarrier_hoisted_plan_bit_identical() {
|
||||
let n_samples = 600;
|
||||
let n_sc = 56; // canonical-56 grid — the production subcarrier count
|
||||
let sample_rate = 100.0;
|
||||
let csi = Array2::from_shape_fn((n_samples, n_sc), |(t, sc)| {
|
||||
// Deterministic, non-trivial per-subcarrier content.
|
||||
let freq = 0.7 + sc as f64 * 0.13;
|
||||
(2.0 * PI * freq * t as f64 / sample_rate).sin()
|
||||
+ 0.3 * (2.0 * PI * (freq * 2.1) * t as f64 / sample_rate).cos()
|
||||
});
|
||||
|
||||
for window_fn in [
|
||||
WindowFunction::Hann,
|
||||
WindowFunction::Hamming,
|
||||
WindowFunction::Blackman,
|
||||
WindowFunction::Rectangular,
|
||||
] {
|
||||
for &power in &[true, false] {
|
||||
let config = SpectrogramConfig {
|
||||
window_size: 128,
|
||||
hop_size: 37, // non-divisor hop to exercise frame edges
|
||||
window_fn,
|
||||
power,
|
||||
};
|
||||
|
||||
// AFTER: hoisted-plan path.
|
||||
let hoisted =
|
||||
compute_multi_subcarrier_spectrogram(&csi, sample_rate, &config).unwrap();
|
||||
|
||||
// BEFORE: independent per-subcarrier fresh-planner path.
|
||||
let reference: Vec<Spectrogram> = (0..n_sc)
|
||||
.map(|sc| {
|
||||
let col: Vec<f64> = csi.column(sc).to_vec();
|
||||
compute_spectrogram(&col, sample_rate, &config).unwrap()
|
||||
})
|
||||
.collect();
|
||||
|
||||
assert_eq!(hoisted.len(), reference.len());
|
||||
for (sc, (h, r)) in hoisted.iter().zip(reference.iter()).enumerate() {
|
||||
assert_eq!(h.data.dim(), r.data.dim(), "dim sc={sc} {window_fn:?}");
|
||||
for (a, b) in h.data.iter().zip(r.data.iter()) {
|
||||
assert_eq!(
|
||||
a.to_bits(),
|
||||
b.to_bits(),
|
||||
"bit mismatch sc={sc} {window_fn:?} power={power}: {a} vs {b}"
|
||||
);
|
||||
}
|
||||
assert_eq!(h.freq_resolution.to_bits(), r.freq_resolution.to_bits());
|
||||
assert_eq!(h.time_resolution.to_bits(), r.time_resolution.to_bits());
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
|
||||
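The up-front validation and the output shape used by the hoisted path can be checked without any FFT at all. The sketch below is a hypothetical helper (`stft_shape` and `ShapeError` are illustrative names mirroring `SpectrogramError`); it applies the same three checks in the same order as the hunk and then the same `n_frames` / `n_freq` arithmetic.

```rust
/// Illustrative error type mirroring the three `SpectrogramError` variants
/// that appear in the diff.
#[derive(Debug, PartialEq)]
enum ShapeError {
    SignalTooShort { signal_len: usize, window_size: usize },
    InvalidHopSize,
    InvalidWindowSize,
}

/// Returns `(n_freq, n_frames)` for a one-sided STFT, or the first failed check.
fn stft_shape(
    signal_len: usize,
    window_size: usize,
    hop_size: usize,
) -> Result<(usize, usize), ShapeError> {
    if signal_len < window_size {
        return Err(ShapeError::SignalTooShort { signal_len, window_size });
    }
    if hop_size == 0 {
        return Err(ShapeError::InvalidHopSize); // would divide by zero below
    }
    if window_size == 0 {
        return Err(ShapeError::InvalidWindowSize);
    }
    let n_frames = (signal_len - window_size) / hop_size + 1;
    let n_freq = window_size / 2 + 1; // one-sided spectrum
    Ok((n_freq, n_frames))
}
```

For the test fixture in the diff (600 samples, window 128, hop 37) this gives 65 frequency bins and 13 frames: `(600 - 128) / 37` is 12 under integer division, plus one for the first frame. Validating once before the loop is what lets the per-subcarrier helper return a plain `Spectrogram` instead of a `Result`.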
@@ -10,6 +10,11 @@
|
||||
// Helper math functions
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// LayerNorm numerical-stability epsilon added under the variance square root
|
||||
/// (`(x − μ)/√(σ² + ε)`). The standard transformer default (ADR-155 M2 §8:
|
||||
/// de-magicked from a bare `1e-5`; value unchanged, no behaviour change).
|
||||
const LAYER_NORM_EPS: f32 = 1e-5;
|
||||
|
||||
/// GELU activation (Hendrycks & Gimpel, 2016 approximation).
|
||||
pub fn gelu(x: f32) -> f32 {
|
||||
let c = (2.0_f32 / std::f32::consts::PI).sqrt();
|
||||
@@ -24,7 +29,7 @@ pub fn layer_norm(x: &[f32]) -> Vec<f32> {
|
||||
}
|
||||
let mean = x.iter().sum::<f32>() / n;
|
||||
let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
|
||||
let inv_std = 1.0 / (var + 1e-5_f32).sqrt();
|
||||
let inv_std = 1.0 / (var + LAYER_NORM_EPS).sqrt();
|
||||
x.iter().map(|v| (v - mean) * inv_std).collect()
|
||||
}
|
||||
|
||||
@@ -390,6 +395,13 @@ mod tests {
|
||||
assert!(layer_norm(&[]).is_empty());
|
||||
}
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked LayerNorm epsilon must equal the prior
|
||||
/// inline `1e-5` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn layer_norm_eps_unchanged_from_literal() {
|
||||
assert_eq!(LAYER_NORM_EPS, 1e-5_f32);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mean_pool_simple() {
|
||||
let p = global_mean_pool(&[1.0, 2.0, 3.0, 5.0, 6.0, 7.0], 2, 3);
|
||||
|
||||
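To see what the LayerNorm epsilon buys, the sketch below reassembles the function from the lines visible in the hunk. The empty-input early return is an assumption inferred from the existing `layer_norm(&[]).is_empty()` test; the constant is redeclared locally.

```rust
/// Local stand-in for the crate's `LAYER_NORM_EPS` (same value).
const LAYER_NORM_EPS: f32 = 1e-5;

/// LayerNorm without learned scale/shift: (x - mean) / sqrt(var + eps).
fn layer_norm(x: &[f32]) -> Vec<f32> {
    if x.is_empty() {
        return Vec::new(); // assumed guard, matching the existing empty-input test
    }
    let n = x.len() as f32;
    let mean = x.iter().sum::<f32>() / n;
    let var = x.iter().map(|v| (v - mean).powi(2)).sum::<f32>() / n;
    // eps keeps the square root strictly positive when var == 0.
    let inv_std = 1.0 / (var + LAYER_NORM_EPS).sqrt();
    x.iter().map(|v| (v - mean) * inv_std).collect()
}
```

For a constant input the variance is exactly zero; with the epsilon the result is a vector of zeros, whereas without it `inv_std` would be infinite and every element would become `0.0 * inf = NaN`.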
@@ -5,6 +5,12 @@
|
||||
|
||||
use std::collections::HashMap;
|
||||
|
||||
/// Smallest in-domain / few-shot MPJPE treated as positive before it divides a
|
||||
/// ratio. Below this the denominator is considered ≈0 and the ratio falls back
|
||||
/// to a sentinel (`1.0` or `INFINITY`) rather than dividing by ≈0 (ADR-155 M2
|
||||
/// §8: de-magicked from a bare `1e-10`; value unchanged, no behaviour change).
|
||||
const MIN_POSITIVE_MPJPE: f32 = 1e-10;
|
||||
|
||||
/// Aggregated cross-domain evaluation metrics.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct CrossDomainMetrics {
|
||||
@@ -79,14 +85,14 @@ impl CrossDomainEvaluator {
|
||||
} else {
|
||||
cross_dom
|
||||
};
|
||||
let gap = if in_dom > 1e-10 {
|
||||
let gap = if in_dom > MIN_POSITIVE_MPJPE {
|
||||
cross_dom / in_dom
|
||||
} else if cross_dom > 1e-10 {
|
||||
} else if cross_dom > MIN_POSITIVE_MPJPE {
|
||||
f32::INFINITY
|
||||
} else {
|
||||
1.0
|
||||
};
|
||||
let speedup = if few_shot > 1e-10 {
|
||||
let speedup = if few_shot > MIN_POSITIVE_MPJPE {
|
||||
cross_dom / few_shot
|
||||
} else {
|
||||
1.0
|
||||
@@ -132,6 +138,43 @@ fn mean_of(v: Option<&Vec<f32>>) -> f32 {
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked division-guard floor must equal the prior
|
||||
/// inline `1e-10` literal exactly (operating-value guard).
|
||||
#[test]
|
||||
fn eval_min_positive_mpjpe_unchanged_from_literal() {
|
||||
assert_eq!(MIN_POSITIVE_MPJPE, 1e-10_f32);
|
||||
}
|
||||
|
||||
/// Characterize the `in_dom ≈ 0` boundary: a perfect in-domain fit but
|
||||
/// nonzero cross-domain error yields the `INFINITY` gap sentinel (the
|
||||
/// middle branch), not a divide-by-≈0 NaN.
|
||||
#[test]
|
||||
fn domain_gap_infinite_when_in_domain_perfect_but_cross_nonzero() {
|
||||
let ev = CrossDomainEvaluator::new(1);
|
||||
let preds = vec![
|
||||
(vec![1.0, 2.0, 3.0], vec![1.0, 2.0, 3.0]), // dom 0: err 0
|
||||
(vec![0.0, 0.0, 0.0], vec![2.0, 0.0, 0.0]), // dom 1: err 2
|
||||
];
|
||||
let m = ev.evaluate(&preds, &[0, 1]);
|
||||
assert!((m.in_domain_mpjpe).abs() < MIN_POSITIVE_MPJPE);
|
||||
assert!(m.domain_gap_ratio.is_infinite());
|
||||
}
|
||||
|
||||
/// Characterize the all-perfect boundary: in-domain AND cross-domain both ≈0
|
||||
/// ⇒ gap falls back to the `1.0` sentinel (the final else branch), never NaN.
|
||||
#[test]
|
||||
fn domain_gap_unity_when_everything_perfect() {
|
||||
let ev = CrossDomainEvaluator::new(1);
|
||||
let preds = vec![
|
||||
(vec![1.0, 2.0, 3.0], vec![1.0, 2.0, 3.0]),
|
||||
(vec![4.0, 5.0, 6.0], vec![4.0, 5.0, 6.0]),
|
||||
];
|
||||
let m = ev.evaluate(&preds, &[0, 1]);
|
||||
assert!((m.domain_gap_ratio - 1.0).abs() < 1e-6);
|
||||
// few_shot derived = (0+0)/2 = 0 ⇒ speedup also falls back to 1.0.
|
||||
assert!((m.adaptation_speedup - 1.0).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mpjpe_known_value() {
|
||||
assert!((mpjpe(&[0.0, 0.0, 0.0], &[3.0, 4.0, 0.0], 1) - 5.0).abs() < 1e-6);
|
||||
|
||||
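The three-way sentinel logic for the domain-gap ratio, and the two-way logic for the adaptation speedup, can be sketched as standalone functions. The function names are illustrative; the branches and the constant follow the hunk above.

```rust
/// Local stand-in for the crate's `MIN_POSITIVE_MPJPE` (same value).
const MIN_POSITIVE_MPJPE: f32 = 1e-10;

/// cross-domain / in-domain error, with sentinels instead of dividing by ~0.
fn domain_gap_ratio(in_dom: f32, cross_dom: f32) -> f32 {
    if in_dom > MIN_POSITIVE_MPJPE {
        cross_dom / in_dom
    } else if cross_dom > MIN_POSITIVE_MPJPE {
        f32::INFINITY // perfect in-domain fit, nonzero cross-domain error
    } else {
        1.0 // both ~0: no measurable gap
    }
}

/// cross-domain / few-shot error, falling back to 1.0 when few-shot is ~0.
fn adaptation_speedup(cross_dom: f32, few_shot: f32) -> f32 {
    if few_shot > MIN_POSITIVE_MPJPE {
        cross_dom / few_shot
    } else {
        1.0
    }
}
```

The `1.0` sentinel for the all-perfect case avoids `0.0 / 0.0 = NaN`; a consumer that aggregates these ratios would still need to handle the `INFINITY` case explicitly.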
@@ -166,6 +166,13 @@ impl DeepSets {
|
||||
}
|
||||
|
||||
/// Encode a set of embeddings (each of length `geometry_dim`) into one vector.
|
||||
///
|
||||
/// # Panics
|
||||
///
|
||||
/// Panics if `ap_embeddings` is empty — a permutation-invariant mean-pool
|
||||
/// over zero elements is undefined. Callers with optional AP sets must guard
|
||||
/// for the empty case before calling (no behaviour change; documents the
|
||||
/// existing `assert!`).
|
||||
pub fn encode(&self, ap_embeddings: &[Vec<f32>]) -> Vec<f32> {
|
||||
assert!(
|
||||
!ap_embeddings.is_empty(),
|
||||
|
||||
@@ -50,6 +50,10 @@ pub mod error;
|
||||
pub mod eval;
|
||||
pub mod geometry;
|
||||
pub mod mae;
|
||||
/// Canonical pose-metric core (ADR-155 §Tier-1.1) — `pck_canonical` /
|
||||
/// `oks_canonical`, available **without** the `tch-backend` feature so the
|
||||
/// single metric definition is reachable from the workspace test gate.
|
||||
pub mod metrics_core;
|
||||
pub mod rapid_adapt;
|
||||
pub mod ruview_metrics;
|
||||
pub mod signal_features;
|
||||
@@ -79,6 +83,12 @@ pub mod occupancy_bench;
|
||||
pub mod trainer;
|
||||
|
||||
// Convenient re-exports at the crate root.
|
||||
// Canonical metric (ADR-155 §Tier-1.1) — re-exported un-gated so the single
|
||||
// source of truth is reachable with or without `tch-backend`.
|
||||
pub use metrics_core::{
|
||||
canonical_torso_size, oks_canonical, pck_canonical, CANON_LEFT_HIP, CANON_RIGHT_HIP,
|
||||
COCO_KP_SIGMAS,
|
||||
};
|
||||
pub use config::TrainingConfig;
|
||||
pub use dataset::{
|
||||
CsiDataset, CsiSample, DataLoader, MmFiDataset, SyntheticConfig, SyntheticCsiDataset,
|
||||
|
||||
@@ -4,7 +4,8 @@
|
||||
//!
|
||||
//! As of ADR-155 there is exactly **one** definition of PCK and one of OKS
|
||||
//! that may be used for any *reported / claimed* number. They live in the
|
||||
//! [`canonical`] region of this module:
|
||||
//! un-gated [`crate::metrics_core`] module (so the single definition is
|
||||
//! reachable with or without `tch-backend`) and are re-exported here:
|
||||
//!
|
||||
//! - [`pck_canonical`] — **PCK\@k, torso-normalized.** A keypoint `j` is
|
||||
//! correct iff `‖pred_j − gt_j‖₂ ≤ k · torso`, where
|
||||
@@ -47,177 +48,23 @@ use petgraph::visit::EdgeRef;
|
||||
use ruvector_mincut::{DynamicMinCut, MinCutBuilder};
|
||||
use std::collections::VecDeque;
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// COCO keypoint sigmas (17 joints)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Per-joint sigma values from the COCO keypoint evaluation standard.
|
||||
///
|
||||
/// These constants control the spread of the OKS Gaussian kernel for each
|
||||
/// of the 17 COCO-defined body joints.
|
||||
pub const COCO_KP_SIGMAS: [f32; 17] = [
|
||||
0.026, // 0 nose
|
||||
0.025, // 1 left_eye
|
||||
0.025, // 2 right_eye
|
||||
0.035, // 3 left_ear
|
||||
0.035, // 4 right_ear
|
||||
0.079, // 5 left_shoulder
|
||||
0.079, // 6 right_shoulder
|
||||
0.072, // 7 left_elbow
|
||||
0.072, // 8 right_elbow
|
||||
0.062, // 9 left_wrist
|
||||
0.062, // 10 right_wrist
|
||||
0.107, // 11 left_hip
|
||||
0.107, // 12 right_hip
|
||||
0.087, // 13 left_knee
|
||||
0.087, // 14 right_knee
|
||||
0.089, // 15 left_ankle
|
||||
0.089, // 16 right_ankle
|
||||
];
|
||||
|
||||
// ===========================================================================
|
||||
// CANONICAL METRIC — single source of truth (ADR-155 §Tier-1.1)
|
||||
// ===========================================================================
|
||||
//
|
||||
// The canonical metric core was hoisted to the **un-gated** `metrics_core`
|
||||
// module (ADR-155 Milestone-1) so the single PCK/OKS definition is reachable
|
||||
// from the workspace test gate (`--no-default-features`) — this whole `metrics`
|
||||
// module is gated behind `tch-backend`. Re-exporting here keeps every existing
|
||||
// call site (`MetricsAccumulator`, `compute_pck`, the deprecated v2 path, the
|
||||
// tch trainer) pointing at exactly **one** implementation.
|
||||
|
||||
/// COCO joint index of the left hip.
|
||||
pub const CANON_LEFT_HIP: usize = 11;
|
||||
/// COCO joint index of the right hip.
|
||||
pub const CANON_RIGHT_HIP: usize = 12;
|
||||
|
||||
/// Canonical torso normalizer used by [`pck_canonical`].
|
||||
///
|
||||
/// Returns `‖left_hip − right_hip‖₂` (COCO joints 11↔12) when both hips are
|
||||
/// visible; otherwise the diagonal of the visible-keypoint bounding box. The
|
||||
/// distance is computed in whatever coordinate space `kpts` is expressed in
|
||||
/// (the canonical PCK requires pred and gt to share that space).
|
||||
///
|
||||
/// Returns `None` when there is no positive-extent reference available (no
|
||||
/// visible hips *and* a degenerate/empty visible bbox), signalling the caller
|
||||
/// that the sample cannot be scored.
|
||||
pub fn canonical_torso_size(gt_kpts: &Array2<f32>, visibility: &Array1<f32>) -> Option<f32> {
|
||||
let n = gt_kpts.shape()[0].min(visibility.len());
|
||||
if CANON_LEFT_HIP < n
|
||||
&& CANON_RIGHT_HIP < n
|
||||
&& visibility[CANON_LEFT_HIP] >= 0.5
|
||||
&& visibility[CANON_RIGHT_HIP] >= 0.5
|
||||
{
|
||||
let dx = gt_kpts[[CANON_LEFT_HIP, 0]] - gt_kpts[[CANON_RIGHT_HIP, 0]];
|
||||
let dy = gt_kpts[[CANON_LEFT_HIP, 1]] - gt_kpts[[CANON_RIGHT_HIP, 1]];
|
||||
let torso = (dx * dx + dy * dy).sqrt();
|
||||
if torso > 1e-6 {
|
||||
return Some(torso);
|
||||
}
|
||||
}
|
||||
// Fallback: bounding-box diagonal of visible keypoints.
|
||||
let diag = bounding_box_diagonal(gt_kpts, visibility, n);
|
||||
if diag > 1e-6 {
|
||||
Some(diag)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
/// **CANONICAL PCK\@`threshold`** — the single definition used for every
|
||||
/// reported number (ADR-155 §Tier-1.1).
|
||||
///
|
||||
/// A keypoint `j` with `visibility[j] >= 0.5` is *correct* iff
|
||||
/// `‖pred_j − gt_j‖₂ ≤ threshold · torso`, where `torso` is
|
||||
/// [`canonical_torso_size`] in the keypoint coordinate space.
|
||||
///
|
||||
/// # Returns
|
||||
/// `(correct, total, pck)` where `pck ∈ [0,1]`. **`(0, 0, 0.0)` when no
|
||||
/// keypoint is visible or the torso reference is degenerate** — a sample with
|
||||
/// no measurable evidence scores 0, never 1 (closes the
|
||||
/// `MetricsAccumulator` false-perfect bug).
|
||||
pub fn pck_canonical(
|
||||
pred_kpts: &Array2<f32>,
|
||||
gt_kpts: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
threshold: f32,
|
||||
) -> (usize, usize, f32) {
|
||||
let n = pred_kpts.shape()[0]
|
||||
.min(gt_kpts.shape()[0])
|
||||
.min(visibility.len());
|
||||
let torso = match canonical_torso_size(gt_kpts, visibility) {
|
||||
Some(t) => t,
|
||||
// No measurable reference scale ⇒ cannot score ⇒ 0.0 (NOT trivially 1.0).
|
||||
None => return (0, 0, 0.0),
|
||||
};
|
||||
let dist_threshold = threshold * torso;
|
||||
|
||||
let mut correct = 0usize;
|
||||
let mut total = 0usize;
|
||||
for j in 0..n {
|
||||
if visibility[j] < 0.5 {
|
||||
continue;
|
||||
}
|
||||
total += 1;
|
||||
let dx = pred_kpts[[j, 0]] - gt_kpts[[j, 0]];
|
||||
let dy = pred_kpts[[j, 1]] - gt_kpts[[j, 1]];
|
||||
if (dx * dx + dy * dy).sqrt() <= dist_threshold {
|
||||
correct += 1;
|
||||
}
|
||||
}
|
||||
let pck = if total > 0 {
|
||||
correct as f32 / total as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
(correct, total, pck)
|
||||
}
|
||||
|
||||
/// **CANONICAL OKS** — COCO Object Keypoint Similarity (ADR-155 §Tier-1.1).
|
||||
///
|
||||
/// `OKS = Σⱼ exp(−dⱼ² / (2 s² kⱼ²)) · δ(vⱼ≥0.5) / Σⱼ δ(vⱼ≥0.5)` with
|
||||
/// `s = sqrt(area)` derived from the **GT keypoint bounding box in the
|
||||
/// keypoint coordinate space** (via [`canonical_torso_size`]² as a robust,
|
||||
/// always-positive proxy for area when an explicit bbox is unavailable).
|
||||
///
|
||||
/// Passing normalized [0,1] coordinates is fine *because the scale is derived
|
||||
/// from the pose itself* — there is no `s = 1.0` escape hatch that would make
|
||||
/// OKS ≈ 1.0 for any pose (the historical "fake Gold tier" bug).
|
||||
///
|
||||
/// Returns 0.0 when no keypoints are visible or the scale is degenerate.
|
||||
pub fn oks_canonical(
|
||||
pred_kpts: &Array2<f32>,
|
||||
gt_kpts: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
) -> f32 {
|
||||
let n = pred_kpts.shape()[0]
|
||||
.min(gt_kpts.shape()[0])
|
||||
.min(visibility.len());
|
||||
// Scale: area ≈ torso². Derived from the actual pose, never a fixed 1.0.
|
||||
let s = match canonical_torso_size(gt_kpts, visibility) {
|
||||
Some(t) => t,
|
||||
None => return 0.0,
|
||||
};
|
||||
let s_sq = s * s;
|
||||
if s_sq <= 0.0 {
|
||||
return 0.0;
|
||||
}
|
||||
let mut num = 0.0f32;
|
||||
let mut den = 0.0f32;
|
||||
for j in 0..n {
|
||||
if visibility[j] < 0.5 {
|
||||
continue;
|
||||
}
|
||||
den += 1.0;
|
||||
let dx = pred_kpts[[j, 0]] - gt_kpts[[j, 0]];
|
||||
let dy = pred_kpts[[j, 1]] - gt_kpts[[j, 1]];
|
||||
let d_sq = dx * dx + dy * dy;
|
||||
let k = if j < COCO_KP_SIGMAS.len() {
|
||||
COCO_KP_SIGMAS[j]
|
||||
} else {
|
||||
0.07
|
||||
};
|
||||
num += (-d_sq / (2.0 * s_sq * k * k)).exp();
|
||||
}
|
||||
if den > 0.0 {
|
||||
num / den
|
||||
} else {
|
||||
0.0
|
||||
}
|
||||
}
|
||||
pub use crate::metrics_core::{
|
||||
canonical_torso_size, oks_canonical, pck_canonical, CANON_LEFT_HIP, CANON_RIGHT_HIP,
|
||||
COCO_KP_SIGMAS,
|
||||
};
|
||||
// `bounding_box_diagonal` stays crate-internal (metrics_core); the only caller
|
||||
// here is a test, which references it via its full path.
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// MetricsResult
|
||||
@@ -400,39 +247,9 @@ impl MetricsAccumulator {
|
||||
// ---------------------------------------------------------------------------
|
||||
// Geometric helpers
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Compute the Euclidean diagonal of the bounding box of visible keypoints.
|
||||
///
|
||||
/// The bounding box is defined by the axis-aligned extent of all keypoints
|
||||
/// that have `visibility[j] >= 0.5`. Returns 0.0 if there are no visible
|
||||
/// keypoints or all are co-located.
|
||||
fn bounding_box_diagonal(kp: &Array2<f32>, visibility: &Array1<f32>, num_joints: usize) -> f32 {
|
||||
let mut x_min = f32::MAX;
|
||||
let mut x_max = f32::MIN;
|
||||
let mut y_min = f32::MAX;
|
||||
let mut y_max = f32::MIN;
|
||||
let mut any_visible = false;
|
||||
|
||||
for j in 0..num_joints {
|
||||
if visibility[j] >= 0.5 {
|
||||
let x = kp[[j, 0]];
|
||||
let y = kp[[j, 1]];
|
||||
x_min = x_min.min(x);
|
||||
x_max = x_max.max(x);
|
||||
y_min = y_min.min(y);
|
||||
y_max = y_max.max(y);
|
||||
any_visible = true;
|
||||
}
|
||||
}
|
||||
|
||||
if !any_visible {
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
let w = (x_max - x_min).max(0.0);
|
||||
let h = (y_max - y_min).max(0.0);
|
||||
(w * w + h * h).sqrt()
|
||||
}
|
||||
//
|
||||
// `bounding_box_diagonal` (the canonical normalizer's bbox fallback) now lives
|
||||
// in `metrics_core` alongside the canonical metric it supports.
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Per-sample PCK and OKS free functions (required by the training evaluator)
|
||||
@@ -1441,7 +1258,7 @@ mod tests {
|
||||
fn bbox_diagonal_unit_square() {
|
||||
let kp = array![[0.0_f32, 0.0], [1.0, 1.0]];
|
||||
let vis = array![2.0_f32, 2.0];
|
||||
let diag = bounding_box_diagonal(&kp, &vis, 2);
|
||||
let diag = crate::metrics_core::bounding_box_diagonal(&kp, &vis, 2);
|
||||
assert_abs_diff_eq!(diag, std::f32::consts::SQRT_2, epsilon = 1e-5);
|
||||
}
|
||||
|
||||
|
||||
@@ -0,0 +1,335 @@
|
||||
//! Canonical pose-metric core (ADR-155 §Tier-1.1) — the single source of truth
|
||||
//! for PCK and OKS, **available without the `tch-backend` feature**.
|
||||
//!
|
||||
//! # Why this module exists (ADR-155 Milestone-1, §8 backlog resolution)
|
||||
//!
|
||||
//! The full [`crate::metrics`] module is gated behind `tch-backend` (libtorch
|
||||
//! FFI) because it also hosts the trainer accumulators, min-cut matchers, and
|
||||
//! ndarray/petgraph machinery. But the *metric definition itself*
|
||||
//! ([`pck_canonical`], [`oks_canonical`], [`canonical_torso_size`]) depends only
|
||||
//! on `ndarray` — no tch. Hoisting those four functions here makes the canonical
|
||||
//! definition reachable from the workspace test gate
|
||||
//! (`cargo test --no-default-features`) so the integration test
|
||||
//! (`tests/test_metrics.rs`) can validate the **production** function against
|
||||
//! hand-computed fixtures, instead of testing an independent reimplementation
|
||||
//! that could be wrong the same way (the §8 "reference kernels" finding).
|
||||
//!
|
||||
//! [`crate::metrics`] re-exports every item here, so all existing call sites and
|
||||
//! the tch-gated trainer path are unchanged: there is still exactly **one**
|
||||
//! implementation of each metric, now in one *un-gated* place.
|
||||
//!
|
||||
//! # CANONICAL METRIC (the only definitions valid for a *reported* number)
|
||||
//!
|
||||
//! - [`pck_canonical`] — **PCK\@k, torso-normalized.** A keypoint `j` is correct
|
||||
//! iff `‖pred_j − gt_j‖₂ ≤ k · torso`, where
|
||||
//! `torso = ‖left_hip(11) − right_hip(12)‖₂` in the keypoint coordinate space,
|
||||
//! with a bounding-box-diagonal fallback when the hips are not both visible.
|
||||
//! **Zero visible joints ⇒ `(0, 0, 0.0)`** — no evidence scores 0, never 1.
|
||||
//! - [`oks_canonical`] — **COCO OKS** with `s = sqrt(area)` derived from the GT
|
||||
//! pose extent (never a fixed `1.0`); a degenerate pose returns 0.0.
|
||||
//!
|
||||
//! # No mock data
|
||||
//!
|
||||
//! All computations are grounded in real geometry following published metric
|
||||
//! definitions. No random or synthetic values are introduced at runtime.
|
||||
|
||||
use ndarray::{Array1, Array2};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// COCO keypoint sigmas (17 joints)
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Per-joint sigma values from the COCO keypoint evaluation standard.
|
||||
///
|
||||
/// These constants control the spread of the OKS Gaussian kernel for each
|
||||
/// of the 17 COCO-defined body joints.
|
||||
pub const COCO_KP_SIGMAS: [f32; 17] = [
|
||||
0.026, // 0 nose
|
||||
0.025, // 1 left_eye
|
||||
0.025, // 2 right_eye
|
||||
0.035, // 3 left_ear
|
||||
0.035, // 4 right_ear
|
||||
0.079, // 5 left_shoulder
|
||||
0.079, // 6 right_shoulder
|
||||
0.072, // 7 left_elbow
|
||||
0.072, // 8 right_elbow
|
||||
0.062, // 9 left_wrist
|
||||
0.062, // 10 right_wrist
|
||||
0.107, // 11 left_hip
|
||||
0.107, // 12 right_hip
|
||||
0.087, // 13 left_knee
|
||||
0.087, // 14 right_knee
|
||||
0.089, // 15 left_ankle
|
||||
0.089, // 16 right_ankle
|
||||
];
|
||||
|
||||
// ===========================================================================
|
||||
// CANONICAL METRIC — single source of truth (ADR-155 §Tier-1.1)
|
||||
// ===========================================================================
|
||||
|
||||
/// COCO joint index of the left hip.
|
||||
pub const CANON_LEFT_HIP: usize = 11;
|
||||
/// COCO joint index of the right hip.
|
||||
pub const CANON_RIGHT_HIP: usize = 12;
|
||||
|
||||
// --- Tuning constants (ADR-155 M2 §8: de-magicked from bare literals; values
|
||||
// are bit-identical to the prior inline literals — documentation only, no
|
||||
// behaviour change). ---
|
||||
|
||||
/// Visibility cutoff: a keypoint counts as *visible* iff `visibility[j] >= 0.5`.
|
||||
///
|
||||
/// This is the COCO convention (visibility flag 2 = "labelled and visible";
|
||||
/// any soft confidence ≥ 0.5 is treated as present). Used identically in
|
||||
/// [`bounding_box_diagonal`], [`canonical_torso_size`], [`pck_canonical`] and
|
||||
/// [`oks_canonical`].
|
||||
const VISIBILITY_THRESHOLD: f32 = 0.5;
|
||||
|
||||
/// Minimum positive extent for a usable reference scale (torso width or bbox
|
||||
/// diagonal). Below this the sample has no measurable evidence and is reported
|
||||
/// as unscoreable (PCK `(0,0,0.0)` / OKS `0.0`) rather than dividing by ≈0.
|
||||
const MIN_REFERENCE_EXTENT: f32 = 1e-6;
|
||||
|
||||
/// Fallback per-joint OKS sigma for joint indices beyond the 17 COCO-defined
|
||||
/// keypoints (defensive: the canonical path only ever scores `j < 17`). Mid-range
|
||||
/// of the COCO sigma band — see [`COCO_KP_SIGMAS`].
|
||||
const OKS_FALLBACK_SIGMA: f32 = 0.07;
|
||||
|
||||
/// Compute the Euclidean diagonal of the bounding box of visible keypoints.
|
||||
///
|
||||
/// The bounding box is defined by the axis-aligned extent of all keypoints
|
||||
/// that have `visibility[j] >= 0.5`. Returns 0.0 if there are no visible
|
||||
/// keypoints or all are co-located.
|
||||
pub(crate) fn bounding_box_diagonal(
|
||||
kp: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
num_joints: usize,
|
||||
) -> f32 {
|
||||
let mut x_min = f32::MAX;
|
||||
let mut x_max = f32::MIN;
|
||||
let mut y_min = f32::MAX;
|
||||
let mut y_max = f32::MIN;
|
||||
let mut any_visible = false;
|
||||
|
||||
for j in 0..num_joints {
|
||||
if visibility[j] >= VISIBILITY_THRESHOLD {
|
||||
let x = kp[[j, 0]];
|
||||
let y = kp[[j, 1]];
|
||||
x_min = x_min.min(x);
|
||||
x_max = x_max.max(x);
|
||||
y_min = y_min.min(y);
|
||||
y_max = y_max.max(y);
|
||||
any_visible = true;
|
||||
}
|
||||
}
|
||||
|
||||
if !any_visible {
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
let w = (x_max - x_min).max(0.0);
|
||||
let h = (y_max - y_min).max(0.0);
|
||||
(w * w + h * h).sqrt()
|
||||
}
|
||||
|
||||
/// Canonical torso normalizer used by [`pck_canonical`].
|
||||
///
|
||||
/// Returns `‖left_hip − right_hip‖₂` (COCO joints 11↔12) when both hips are
|
||||
/// visible; otherwise the diagonal of the visible-keypoint bounding box. The
|
||||
/// distance is computed in whatever coordinate space `gt_kpts` is expressed in
|
||||
/// (the canonical PCK requires pred and gt to share that space).
|
||||
///
|
||||
/// Returns `None` when there is no positive-extent reference available (no
|
||||
/// visible hips *and* a degenerate/empty visible bbox), signalling the caller
|
||||
/// that the sample cannot be scored.
|
||||
pub fn canonical_torso_size(gt_kpts: &Array2<f32>, visibility: &Array1<f32>) -> Option<f32> {
|
||||
let n = gt_kpts.shape()[0].min(visibility.len());
|
||||
if CANON_LEFT_HIP < n
|
||||
&& CANON_RIGHT_HIP < n
|
||||
&& visibility[CANON_LEFT_HIP] >= VISIBILITY_THRESHOLD
|
||||
&& visibility[CANON_RIGHT_HIP] >= VISIBILITY_THRESHOLD
|
||||
{
|
||||
let dx = gt_kpts[[CANON_LEFT_HIP, 0]] - gt_kpts[[CANON_RIGHT_HIP, 0]];
|
||||
let dy = gt_kpts[[CANON_LEFT_HIP, 1]] - gt_kpts[[CANON_RIGHT_HIP, 1]];
|
||||
let torso = (dx * dx + dy * dy).sqrt();
|
||||
if torso > MIN_REFERENCE_EXTENT {
|
||||
return Some(torso);
|
||||
}
|
||||
}
|
||||
// Fallback: bounding-box diagonal of visible keypoints.
|
||||
let diag = bounding_box_diagonal(gt_kpts, visibility, n);
|
||||
if diag > MIN_REFERENCE_EXTENT {
|
||||
Some(diag)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
/// **CANONICAL PCK\@`threshold`** — the single definition used for every
|
||||
/// reported number (ADR-155 §Tier-1.1).
|
||||
///
|
||||
/// A keypoint `j` with `visibility[j] >= 0.5` is *correct* iff
|
||||
/// `‖pred_j − gt_j‖₂ ≤ threshold · torso`, where `torso` is
|
||||
/// [`canonical_torso_size`] in the keypoint coordinate space.
|
||||
///
|
||||
/// # Returns
|
||||
/// `(correct, total, pck)` where `pck ∈ [0,1]`. **`(0, 0, 0.0)` when no
|
||||
/// keypoint is visible or the torso reference is degenerate** — a sample with
|
||||
/// no measurable evidence scores 0, never 1 (closes the
|
||||
/// `MetricsAccumulator` false-perfect bug).
|
||||
///
|
||||
/// # Normalization basis (vs other PCK definitions in the workspace)
|
||||
/// This is **hip↔hip torso WIDTH** normalized in the keypoint coordinate space.
|
||||
/// It is deliberately **distinct** from the live sensing-server's
|
||||
/// `compute_pck_torso_height` (torso-HEIGHT nose→hip, pixel-space) — see ADR-155
|
||||
/// §2.1 / §8. Those numbers must never be conflated.
|
||||
pub fn pck_canonical(
|
||||
pred_kpts: &Array2<f32>,
|
||||
gt_kpts: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
threshold: f32,
|
||||
) -> (usize, usize, f32) {
|
||||
let n = pred_kpts.shape()[0]
|
||||
.min(gt_kpts.shape()[0])
|
||||
.min(visibility.len());
|
||||
let torso = match canonical_torso_size(gt_kpts, visibility) {
|
||||
Some(t) => t,
|
||||
// No measurable reference scale ⇒ cannot score ⇒ 0.0 (NOT trivially 1.0).
|
||||
None => return (0, 0, 0.0),
|
||||
};
|
||||
let dist_threshold = threshold * torso;
|
||||
|
||||
let mut correct = 0usize;
|
||||
let mut total = 0usize;
|
||||
for j in 0..n {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
total += 1;
|
||||
let dx = pred_kpts[[j, 0]] - gt_kpts[[j, 0]];
|
||||
let dy = pred_kpts[[j, 1]] - gt_kpts[[j, 1]];
|
||||
if (dx * dx + dy * dy).sqrt() <= dist_threshold {
|
||||
correct += 1;
|
||||
}
|
||||
}
|
||||
let pck = if total > 0 {
|
||||
correct as f32 / total as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
(correct, total, pck)
|
||||
}
|
||||
|
||||
/// **CANONICAL OKS** — COCO Object Keypoint Similarity (ADR-155 §Tier-1.1).
|
||||
///
|
||||
/// `OKS = Σⱼ exp(−dⱼ² / (2 s² kⱼ²)) · δ(vⱼ≥0.5) / Σⱼ δ(vⱼ≥0.5)` with
|
||||
/// `s = sqrt(area)` derived from the **GT keypoint bounding box in the
|
||||
/// keypoint coordinate space** (via [`canonical_torso_size`]² as a robust,
|
||||
/// always-positive proxy for area when an explicit bbox is unavailable).
|
||||
///
|
||||
/// Passing normalized [0,1] coordinates is fine *because the scale is derived
|
||||
/// from the pose itself* — there is no `s = 1.0` escape hatch that would make
|
||||
/// OKS ≈ 1.0 for any pose (the historical "fake Gold tier" bug).
|
||||
///
|
||||
/// Returns 0.0 when no keypoints are visible or the scale is degenerate.
|
||||
pub fn oks_canonical(
|
||||
pred_kpts: &Array2<f32>,
|
||||
gt_kpts: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
) -> f32 {
|
||||
let n = pred_kpts.shape()[0]
|
||||
.min(gt_kpts.shape()[0])
|
||||
.min(visibility.len());
|
||||
// Scale: area ≈ torso². Derived from the actual pose, never a fixed 1.0.
|
||||
let s = match canonical_torso_size(gt_kpts, visibility) {
|
||||
Some(t) => t,
|
||||
None => return 0.0,
|
||||
};
|
||||
let s_sq = s * s;
|
||||
if s_sq <= 0.0 {
|
||||
return 0.0;
|
||||
}
|
||||
let mut num = 0.0f32;
|
||||
let mut den = 0.0f32;
|
||||
for j in 0..n {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
den += 1.0;
|
||||
let dx = pred_kpts[[j, 0]] - gt_kpts[[j, 0]];
|
||||
let dy = pred_kpts[[j, 1]] - gt_kpts[[j, 1]];
|
||||
let d_sq = dx * dx + dy * dy;
|
||||
let k = if j < COCO_KP_SIGMAS.len() {
|
||||
COCO_KP_SIGMAS[j]
|
||||
} else {
|
||||
OKS_FALLBACK_SIGMA
|
||||
};
|
||||
num += (-d_sq / (2.0 * s_sq * k * k)).exp();
|
||||
}
|
||||
if den > 0.0 {
|
||||
num / den
|
||||
} else {
|
||||
0.0
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod consts_tests {
|
||||
use super::*;
|
||||
|
||||
/// ADR-155 M2 §8: the de-magicked tuning consts must equal the prior inline
|
||||
/// literals exactly — this pins them so a future "tidy-up" cannot silently
|
||||
/// shift the metric definition (operating-value guard).
|
||||
#[test]
|
||||
fn metrics_core_consts_unchanged_from_literals() {
|
||||
assert_eq!(VISIBILITY_THRESHOLD, 0.5_f32);
|
||||
assert_eq!(MIN_REFERENCE_EXTENT, 1e-6_f32);
|
||||
assert_eq!(OKS_FALLBACK_SIGMA, 0.07_f32);
|
||||
assert_eq!(CANON_LEFT_HIP, 11);
|
||||
assert_eq!(CANON_RIGHT_HIP, 12);
|
||||
}
|
||||
|
||||
/// Characterize the visibility-threshold boundary: a keypoint at exactly the
|
||||
/// cutoff (vis == 0.5) is INCLUDED (`>=`), just below (0.499) is EXCLUDED.
|
||||
/// Pins current `>=`-inclusive behaviour at the edge.
|
||||
#[test]
|
||||
fn visibility_threshold_boundary_is_inclusive() {
|
||||
// Two GT hips give a positive torso; vary the (single) scored joint's
|
||||
// visibility around the 0.5 cutoff and confirm it flips total in/out.
|
||||
let gt = Array2::from_shape_vec(
|
||||
(13, 2),
|
||||
(0..13).flat_map(|j| [j as f32, 0.0]).collect::<Vec<_>>(),
|
||||
)
|
||||
.unwrap();
|
||||
// hips at 11,12 give torso = |11-12| = 1.0 along x.
|
||||
let pred = gt.clone();
|
||||
let mk_vis = |v0: f32| {
|
||||
let mut vis = Array1::<f32>::zeros(13);
|
||||
vis[CANON_LEFT_HIP] = 1.0;
|
||||
vis[CANON_RIGHT_HIP] = 1.0;
|
||||
vis[0] = v0; // joint 0 is the one we toggle
|
||||
vis
|
||||
};
|
||||
// At exactly 0.5 → joint 0 is counted (total includes it: 3 visible).
|
||||
let (_, total_at, _) = pck_canonical(&pred, &gt, &mk_vis(0.5), 0.2);
|
||||
assert_eq!(total_at, 3, "vis == 0.5 must be INCLUDED (>=)");
|
||||
// Just below → joint 0 excluded (only the 2 hips visible).
|
||||
let (_, total_below, _) = pck_canonical(&pred, &gt, &mk_vis(0.499), 0.2);
|
||||
assert_eq!(total_below, 2, "vis < 0.5 must be EXCLUDED");
|
||||
}
|
||||
|
||||
/// Characterize the reference-extent floor: a near-zero-extent GT pose (all
|
||||
/// keypoints coincident, hips coincident) is UNSCOREABLE → `(0,0,0.0)`,
|
||||
/// never a trivial perfect score. Pins the `MIN_REFERENCE_EXTENT` guard.
|
||||
#[test]
|
||||
fn degenerate_extent_below_floor_is_unscoreable() {
|
||||
// All 13 joints at the same point ⇒ torso ≈ 0, bbox diag ≈ 0 < 1e-6.
|
||||
let gt = Array2::<f32>::zeros((13, 2));
|
||||
let pred = gt.clone();
|
||||
let mut vis = Array1::<f32>::zeros(13);
|
||||
vis[CANON_LEFT_HIP] = 1.0;
|
||||
vis[CANON_RIGHT_HIP] = 1.0;
|
||||
assert!(canonical_torso_size(&gt, &vis).is_none());
|
||||
assert_eq!(pck_canonical(&pred, &gt, &vis, 0.2), (0, 0, 0.0));
|
||||
assert_eq!(oks_canonical(&pred, &gt, &vis), 0.0);
|
||||
}
|
||||
}
|
||||
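The canonical PCK and OKS definitions in `metrics_core` can be tried out without `ndarray` using the std-only sketch below. It is an illustration under stated assumptions, not the crate's code: the `*_sketch` function names, the `dist_sq` helper, and the plain `[[f32; 2]]` slice inputs are hypothetical stand-ins for the `Array2<f32>` / `Array1<f32>` API shown in the diff. The constants and the sigma table are copied from the diff.

```rust
// Illustrative, std-only sketch. Names ending in `_sketch` and `dist_sq` are
// hypothetical and not part of the crate; constants mirror the diff's values.
const VIS_MIN: f32 = 0.5; // visibility cutoff (inclusive, `>=`)
const MIN_EXTENT: f32 = 1e-6; // smallest usable reference scale
const LEFT_HIP: usize = 11;
const RIGHT_HIP: usize = 12;
const FALLBACK_SIGMA: f32 = 0.07; // used only for joint indices >= 17
const SIGMAS: [f32; 17] = [
    0.026, 0.025, 0.025, 0.035, 0.035, 0.079, 0.079, 0.072, 0.072,
    0.062, 0.062, 0.107, 0.107, 0.087, 0.087, 0.089, 0.089,
];

/// Squared Euclidean distance between two 2-D points.
fn dist_sq(a: [f32; 2], b: [f32; 2]) -> f32 {
    let (dx, dy) = (a[0] - b[0], a[1] - b[1]);
    dx * dx + dy * dy
}

/// Hip-to-hip width when both hips are visible, otherwise the diagonal of the
/// visible-keypoint bounding box; `None` when neither gives a positive extent.
fn torso_size_sketch(gt: &[[f32; 2]], vis: &[f32]) -> Option<f32> {
    let n = gt.len().min(vis.len());
    if LEFT_HIP < n && RIGHT_HIP < n && vis[LEFT_HIP] >= VIS_MIN && vis[RIGHT_HIP] >= VIS_MIN {
        let torso = dist_sq(gt[LEFT_HIP], gt[RIGHT_HIP]).sqrt();
        if torso > MIN_EXTENT {
            return Some(torso);
        }
    }
    // Fallback: axis-aligned bounding box over the visible joints.
    let (mut lo, mut hi) = ([f32::MAX; 2], [f32::MIN; 2]);
    let mut any_visible = false;
    for j in 0..n {
        if vis[j] >= VIS_MIN {
            for axis in 0..2 {
                lo[axis] = lo[axis].min(gt[j][axis]);
                hi[axis] = hi[axis].max(gt[j][axis]);
            }
            any_visible = true;
        }
    }
    if !any_visible {
        return None;
    }
    let diag = dist_sq(hi, lo).sqrt();
    if diag > MIN_EXTENT { Some(diag) } else { None }
}

/// Returns `(correct, total, pck)`; an unscoreable sample yields `(0, 0, 0.0)`.
fn pck_sketch(pred: &[[f32; 2]], gt: &[[f32; 2]], vis: &[f32], threshold: f32) -> (usize, usize, f32) {
    let n = pred.len().min(gt.len()).min(vis.len());
    let torso = match torso_size_sketch(gt, vis) {
        Some(t) => t,
        None => return (0, 0, 0.0),
    };
    let limit = threshold * torso;
    let (mut correct, mut total) = (0usize, 0usize);
    for j in (0..n).filter(|&j| vis[j] >= VIS_MIN) {
        total += 1;
        if dist_sq(pred[j], gt[j]).sqrt() <= limit {
            correct += 1;
        }
    }
    let pck = if total > 0 { correct as f32 / total as f32 } else { 0.0 };
    (correct, total, pck)
}

/// Mean of exp(-d^2 / (2 s^2 k^2)) over visible joints, with s = torso size.
fn oks_sketch(pred: &[[f32; 2]], gt: &[[f32; 2]], vis: &[f32]) -> f32 {
    let n = pred.len().min(gt.len()).min(vis.len());
    let s = match torso_size_sketch(gt, vis) {
        Some(s) => s,
        None => return 0.0,
    };
    let (mut num, mut den) = (0.0f32, 0.0f32);
    for j in (0..n).filter(|&j| vis[j] >= VIS_MIN) {
        den += 1.0;
        let k = SIGMAS.get(j).copied().unwrap_or(FALLBACK_SIGMA);
        num += (-dist_sq(pred[j], gt[j]) / (2.0 * s * s * k * k)).exp();
    }
    if den > 0.0 { num / den } else { 0.0 }
}
```

As a usage example, take the same fixture shape as the diff's boundary test: 13 joints placed at `(j, 0)`, so the hips at x = 11 and x = 12 give a torso of 1.0. With joint 0 and both hips visible and the prediction for joint 0 shifted by 0.5, `pck_sketch(&pred, &gt, &vis, 0.2)` counts two of three joints as correct. An all-zero pose has no positive extent, so both sketches report it as unscoreable rather than perfect.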
Some files were not shown because too many files have changed in this diff