fix(rvcsi): scale-relative baseline-drift thresholds + ESP32 end-to-end validation

BaselineDriftDetector compared `mean_amplitude` against its EWMA baseline with *absolute* thresholds (anomaly 1.0, drift 0.15). Fine for the synthetic unit tests (amplitudes ~1.0), but raw ESP32 CSI is int8 I/Q with amplitudes up to ~128, so window-to-window RMS distance is routinely 5-50 >> 1.0 and AnomalyDetected fired on ~96% of windows (319/331 on a real node-1 capture). Drift is now `||current - baseline||2 / ||baseline||2` (a fraction, with an eps floor that falls back to absolute for a degenerate near-zero baseline), so one tuning is valid across raw-int8 ESP32, int16-scaled Nexmon, and baseline-subtracted streams. AnomalyDetected drops to 40/331 on the same data; the existing detector tests still pass (their explicit configs are valid relative thresholds too); added baseline_drift_is_scale_invariant_ no_anomaly_storm. rvcsi-events 18 -> 19 tests; 162 rvcsi tests, 0 failures, clippy-clean. Surfaced by an end-to-end test against real ESP32 CSI on COM7: the device (ESP32-S3, node 1, ADR-018 firmware, WiFi "ruv.net" ch5 RSSI -39, CSI cb only because nothing listens at .156). rvcsi has no ESP32 adapter yet, so a 7,000-frame node-1 recording was transcoded to .rvcsi via the new scripts/esp32_jsonl_to_rvcsi.py (stand-in for `record --source esp32-jsonl`) and run through `rvcsi inspect`/`replay`/`calibrate`/`events` end-to-end. ADR-095 D13 and ADR-096 sections 2.1/5 updated; CHANGELOG entry added; rvcsi-adapter-esp32 (live serial/UDP source) noted as a follow-up. Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-09 10:13:17 +00:00 · 2026-05-12 22:16:27 -04:00
parent d40411e6d7
commit deb561bf9c
5 changed files with 294 additions and 12 deletions
@@ -111,7 +111,7 @@ Agents should observe RF state safely; device mutation and calibration change sy
 ### D13 — Quality scoring is mandatory

 CSI quality varies widely by chip, antenna, environment, channel, and interference. **Every frame, window, and event carries quality or confidence scoring.**
-*Consequences:* downstream systems can suppress weak evidence; easier debugging; requires calibration and thresholds.
+*Consequences:* downstream systems can suppress weak evidence; easier debugging; requires calibration and thresholds. Where a detector compares against a learned baseline (e.g. baseline-drift / anomaly), thresholds are expressed **relative to the baseline's magnitude**, not as absolute amplitude units, so a single tuning is valid across sources whose raw CSI scales differ by orders of magnitude (raw `int8` ESP32 vs. `int16`-scaled Nexmon vs. baseline-subtracted streams).

 ### D14 — Versioned calibration profiles