mirror of
https://github.com/ruvnet/RuView
synced 2026-06-28 13:23:19 +00:00
ef69a52624bb84b6618f601f44ed0df108106700
1047 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ef69a52624 |
chore: untrack ruvector.db runtime artifacts + gitignore at any depth
These are hook/runtime-generated databases (ruvector/intelligence store) that kept showing as dirty and don't belong in version control. Removed from the index (files kept on disk) and ignored globally. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
4001e9e178 |
feat(harness): npx @ruvnet/ruview operator harness + ADR-182 (#1123)
A host-portable RuView agent harness minted via MetaHarness and hardened per ADR-182. Published as @ruvnet/ruview@0.1.0 (bare `ruview` blocked by npm's typosquat filter → scoped fallback). What it does: - 6 fail-closed `ruview.*` tools (onboard, claim_check, verify, node_monitor, calibrate, node_flash) exposed as CLI verbs + a dependency-free MCP stdio server. - The "prove everything" rule made executable: `ruview.claim_check` flags untagged accuracy claims and the retracted "100%" framing. - 5 host-neutral skills (onboard/provision-node/calibrate-room/ train-pose/verify) + bundled .claude/ config + provenance manifest. Validated: 17/17 unit tests, live MCP handshake, `ruview.verify` ran the real verify.py to VERDICT: PASS, clean `npx @ruvnet/ruview` from registry. Packs to 16.7 kB / 21 files; kernel+host are optionalDependencies so the operator tools install lightweight. README: documented as the portable, multi-host companion to the in-repo plugins/ruview/ Claude Code plugin (not a replacement). |
||
|
|
65e29ef47a |
fix(display): no false display-detect on bare DevKit → CSI starves at MGMT-only (#1000) (#1121)
The SH8601 QSPI panel is write-only, so display_hal_init_panel() 'succeeds' even on a bare display-less board — display_is_active() then returned true and main.c skipped the #893/#906 MGMT->MGMT+DATA CSI filter upgrade (yield=0pps). Gate on the FT3168 touch I2C readback (always present on the Touch-AMOLED board, absent on a bare DevKit): if touch is absent, the panel 'success' was a false-positive — bail to headless before the display task starts, so display_is_active() stays false and CSI captures. Co-authored-by: ruv <ruvnet@gmail.com>v0.8.1-esp32 |
||
|
|
cb30988cf9 |
fix(mmwave): require validated MR60 header on probe — no false detect on empty UART (#1107) (#1119)
probe_at_baud counted bare 0x01 (SOF) bytes and declared MR60BHA2 on a single one. A floating UART1 with no sensor reads noise full of 0x01s → false 'Detected MR60BHA2 (caps=0x000f)'. Now a candidate must be a full 8-byte header with a valid header checksum (bytes 0..6) AND a known frame type (0x0A__ / 0x0F09), and clear the ≥3 threshold; removed the weak single-hit fallback. Real sensors stream valid frames continuously, so detection of present hardware is unaffected. Co-authored-by: ruv <ruvnet@gmail.com> |
||
|
|
128b129474 |
Merge pull request #1120 from ruvnet/fix/1007-paired-data-pipeline
fix(paired-data): 4 bugs in CSI recorder + ground-truth aligner (#1007) |
||
|
|
15a983b555 |
fix(paired-data): 4 bugs corrupting/blocking camera-supervised training data (#1007)
1. record-csi-udp.py stamped LOCAL time with a 'Z' (UTC) suffix → camera/CSI disagreed by the UTC offset → 0 aligned pairs. Now writes true UTC via datetime.now(timezone.utc). 2. align-ground-truth.js kept empty-keypoint (non-detection) records at confidence 0, collapsing window avgConf below threshold → all windows rejected. Now skipped at load. 3. extractCsiMatrix silently zero-padded/truncated mixed-subcarrier frames. Now frames are filtered to the session's modal subcarrier count before windowing — never padded. 4. CSI/feature matrices are filled frame-major but were labeled shape [nSc, nFrames] — transposed. Labels corrected to [nFrames, nSc] / [nFrames, dim]. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
c6e7667676 |
Merge pull request #1104 from ruvnet/fix/issue-1049-configurable-guard
fix(sensing-server): make multistatic guard interval configurable (closes #1049) |
||
|
|
d639c747df |
Merge pull request #1114 from ruvnet/examples/through-wall-tools
examples(through-wall): ESP32 sensor auto-detection + WiFlow analysis tools |
||
|
|
42c764652d |
examples(through-wall): ESP32 sensor auto-detection + WiFlow analysis tools
- wiflow_browser.html: auto-detect live ESP32 nodes from the /ws/sensing stream and lock them as the model schema (NODE_IDS/CSI_DIM dynamic), persisted + restorable - wiflow_ab.py: leakage-controlled A/B (chronological/random/blocked-gap/grouped-bucket, multi-seed) — the honest CSI→pose evaluation harness - wiflow_capture.py / wiflow_train.py / wiflow_infer.py: camera-paired capture + train + infer - pose.html: live WiFi-inferred skeleton viewer; serve.py: static server - gitignore the regenerable 1.5MB model.npz artifact Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
db02956c22 |
Merge pull request #1113 from ruvnet/chore/bump-ruv-neural-submodule
chore: bump ruv-neural submodule to current main |
||
|
|
c84ea39e62 |
chore: bump ruv-neural submodule → current main (web console, closed-loop neuromod, ruvn mention)
Advances the vendored ruv-neural submodule from the stale 'Initial' pin (1ece3af) to current main (81be9e1): the static web console UI, the closed-loop neuromodulation platform, repositioned README, and the @ruvnet/ruvn companion-tool mention. ruv-neural is not a v2 workspace member, so this does not affect the workspace build (cargo metadata resolves clean). Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
760d05026c |
Merge pull request #1112 from ruvnet/chore/extract-swarm-worldgraph-submodules
Extract ruview-swarm → ruvnet/ruv-drone and world crates → ruvnet/worldgraph (submodules) |
||
|
|
a784546918 |
ci(ruview-swarm): drop removed itar-unrestricted feature from test matrix
The industrial rescope (ruv-drone) removed the itar-unrestricted feature flag — formation/allocation/raft/flight-control are now default capabilities. Update the 'ruflo+itar' matrix entry to just '--features ruflo' so CI matches the new feature set. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
9c751d0d92 |
chore(worldgraph): bump submodule — README + metadata polish
Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
a13e9b66cb |
chore: bump ruv-drone + worldgraph submodules (LICENSE + CI polish)
Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
6db183bf3e |
chore(swarm): bump ruv-drone submodule — README cleanup
Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
f65d0f79e7 |
chore(swarm): bump ruv-drone submodule (rescope + stray-file cleanup)
Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
7fb3b88061 |
chore(swarm): bump ruv-drone submodule — industrial rescope (drop ITAR/USML gating)
Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
aeac5f5543 |
chore(worldgraph): extract geo+worldgraph+worldmodel to ruvnet/worldgraph submodule
- published as github.com/ruvnet/worldgraph (3-crate workspace, history via git-filter-repo) - replace the 3 in-tree crates with one submodule at v2/crates/worldgraph - parent workspace: drop the 3 members, exclude the submodule (it is its own workspace), repoint workspace.dependencies(worldmodel) + engine/sensing-server path-deps into it - cargo metadata resolves clean (geo/worldgraph/worldmodel consumed from the submodule) Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
c257e67c3d |
chore(swarm): extract ruview-swarm to ruvnet/ruv-drone submodule
- ruview-swarm published as github.com/ruvnet/ruv-drone (history preserved via subtree split) - replace the in-tree crate with a submodule at v2/crates/ruview-swarm (branch main) - standalone repo dropped the unused wifi-densepose-core path-dep; export-control NOTICE added there - workspace member path unchanged; cargo metadata resolves ruview-swarm from the submodule Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
a4d5ea88f3 |
feat(examples): in-browser WiFlow trainer + camera-supervised pipeline + ADR-180/181/181A
Tonight's real WiFlow work, all honest: - examples/through-wall/: live 2-node CSI demo (index.html), the WiFlow camera-supervised pipeline (wiflow_capture/train/infer.py — proven +9.4pp over mean-pose baseline on ruvultra), the live pose viewer (pose.html), and the COMPLETE in-browser trainer (wiflow_browser.html): 4-stage calibrate->capture->train->infer, TF.js WebGPU/WASM/WebGL, MediaPipe camera supervision, IndexedDB persistence, mean-pose-baseline honesty. - ADR-180 (through-wall hand-off demo), ADR-181 (full browser WiFlow, WASM+WebGPU, calibration phase, mobile/secure-context matrix), ADR-181A (binary CSI framing protocol). Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
ebe217569b |
feat(examples): real live WiFi-CSI through-wall sensing demo
Self-contained Three.js r128 demo at examples/through-wall/ that renders ONLY genuine data streamed from the running sensing-server over ws://localhost:8765/ws/sensing. No simulation, no fabricated frames, no fake skeleton. Renders, driven by real /ws/sensing frames: - 20x20 signal_field floor heatmap (real values) - coarse RF-localization puck from persons[0].position (labeled coarse, NOT pose; peak signal_field cell as fallback) - live motion/breathing/variance/rssi bars + motion sparkline - presence/confidence/estimated_persons/active-node/tick/Hz meters - 3D room with wall + doorway dividing office (node 9) / hallway (node 13) - honest mutually-exclusive banner: LIVE (source=esp32) / SIMULATED / NO SERVER, showing the real source verbatim - optional webcam tile (ground-truth-when-visible, separate from CSI) Reuses scene/lights/bloom/CSS + webcam path from examples/three.js/demos/05-skinned-realtime.html, the floor-heatmap idea from ui/observatory/js/, and the threaded no-cache server from examples/three.js/server/serve-demo.py (serve.py on :8080). Verified against the live server: real frame source=esp32, nodes [9,13], 400 signal_field values, persons[0].position present. Python proof PASS. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
c27d6cc98e |
fix(sensing-server): make multistatic guard interval operator-configurable (#1049)
Two ESP32-S3 nodes on WiFi/ESP-NOW sync drift 10-150 ms (~70 ms typ.), exceeding the 60 ms default guard → permanent trust demotion to Restricted, all pose output suppressed, 200k+ errors, no escape but a container restart. Add a direct WDP_GUARD_INTERVAL_US override (+ optional WDP_SOFT_GUARD_US) to multistatic_guard_config_from_env. Precedence (most-specific wins): direct override > WDP_TDM_SLOTS+WDP_TDM_SLOT_US schedule-derived > 60ms/20ms default. Soft band always clamped strictly below hard; malformed/zero ignored (falls back, never breaks fusion). Effective guard logged at startup. Pinned by 6 tests (multistatic_guard_config_tests). sensing-server bin tests 449 -> 455, 0 failed. Python proof PASS, hash unchanged (off signal path). Closes #1049. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
cafbeb1e81 |
fix(wasm-edge): sanitize non-finite host floats at the WASM↔host frame boundary (#1102)
Closing beyond-SOTA security review of wifi-densepose-wasm-edge (ADR-040,
~70 edge modules). The two WASM↔host boundaries (lib.rs::on_frame/on_timer
and bin/ghost_hunter.rs::on_frame) read raw IEEE-754 f32 from the csi_get_*
imports with no finiteness check — the crate had zero is_finite/is_nan
guards and its clamp helpers propagate NaN. A single non-finite host value
latches NaN into long-lived per-module accumulators (EMA / Welford / phasor
sums / anomaly baselines), after which detectors fail degraded (stuck gate
state, silently-disabled checks) — silent corruption, not a crash.
Add sanitize_host_f32() (non-finite -> 0.0, core-only for no_std) applied at
every host_get_* float read: one chokepoint covering all downstream modules,
mirroring the existing M-01 negative-n_subcarriers boundary clamp. LOW /
defense-in-depth (the Tier-2 DSP firmware supplies the imports, a semi-trusted
boundary).
Pinned by boundary_tests::{sanitize_passes_finite_values_through,
sanitize_maps_non_finite_to_zero,
coherence_monitor_nan_latches_without_sanitize_but_not_with} — the last
asserts on the current CoherenceMonitor that a raw NaN frame latches the
smoothed score while the sanitized path stays finite.
Other review dimensions attested clean with evidence (see CHANGELOG): no
hot-path panics (all unwrap/expect are test-only or std-gated RVF builder),
all bounds min()-clamped, all index-by-cast const-bounded or guarded, no
leaking closures (no move||/forget/leak), no secrets.
Verified: host `cargo test --features std,medical-experimental` 672 passed /
0 failed (+3 new tests); all three wasm32-unknown-unknown release artifacts
build clean (lib default no_std/panic=abort, ghost_hunter standalone-bin,
medical-experimental); Python proof VERDICT PASS, hash unchanged.
|
||
|
|
c859f6f743 |
security(occworld-candle): int32-checkpoint crash + degenerate-input guards + ADR-179 (closes Milestone #9) (#1101)
* fix(occworld-candle): security review fixes — int32 checkpoint crash + predict input validation Beyond-SOTA security + correctness review of wifi-densepose-occworld-candle (Milestone #9, crate 4/4 — the last ungated crate). Findings fixed: 1. HIGH (MEASURED) — checkpoint-load crash on any int32 tensor. model.rs mapped safetensors I32 -> candle DType::I64 and passed the raw int32 byte buffer (4 bytes/elem) to Tensor::from_raw_buffer(.., I64, ..). Candle derives elem_count = data.len() / dtype.size(), so the I64 path halved the count while keeping the original shape -> a tensor whose shape claims 2x its storage. Reading it PANICS (slice OOB: "range end index 6 out of range for slice of length 3") on any checkpoint containing an int32 tensor. Fixed: I32 -> DType::I32, I16 -> DType::I16 (both first-class candle dtypes). Reproduced on old code; pinned in tests/checkpoint_loading.rs. 2. LOW (MEASURED) — predict() lacked frame/batch validation at the input boundary. f_in > num_frames*2 over-indexed the temporal embedding (cryptic candle "gather" error); zero frame/batch fed a zero-element tensor in. Now rejected with a clear ShapeMismatch. Pinned in tests/input_validation.rs. 3. LOW (MEASURED) — divide-by-zero panic in the public VQCodebook::encode on a rank-0 / empty-last-dim tensor (last == 0). Now fails closed with a clear error. Pinned in vqvae.rs unit tests. Dimensions confirmed clean with evidence: panic surface (no unwrap/expect/ panic in prod paths), NaN-state-poisoning (N/A — stateless engine, u8 input), unbounded-alloc/shape-data mismatch (defended upstream by safetensors:: validate), secrets (none). unsafe_code = forbid. Validation (MEASURED, Windows): crate 31/31 pass; workspace 0 failed (lone desktop api_integration "Access is denied" file-lock flake passes 21/21 in isolation); Python proof VERDICT PASS, hash f8e76f21…446f7a unchanged. Warrants ADR slot 179 (parent to author). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-179 — occworld-candle checkpoint-load hardening (closes Milestone #9) Records the HIGH int32-checkpoint crash fix (I32→I64 dtype-widening → slice-OOB panic on load = DoS) + 2 LOW degenerate-input fixes from 5e77f47e5. Stateless engine (NaN-poisoning N/A), unsafe forbidden, safetensors validate() defends malloc upstream. occworld 31/31. Final ungated crate — Milestone #9 complete. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
10c813fde3 |
security(desktop): IPC serial-command-injection + over-broad shell capability + ADR-178 (#1100)
* fix(security): desktop IPC serial-command-injection + over-broad shell capability (ADR-178) Beyond-SOTA security review of wifi-densepose-desktop (Tauri v2). Two real findings, each MEASURED on Windows (crate builds + tests under --no-default-features): WDP-DESK-01 (MODERATE) — serial command injection via configure_esp32_wifi. The #[tauri::command] handler concatenated webview-supplied ssid/password into newline-terminated serial commands with no validation; a \r\n let a compromised webview inject an arbitrary follow-up firmware command (reboot/erase). Added validate_wifi_credentials() enforcing WPA2 length bounds and rejecting all control characters, called fail-closed before any serial write. Pinned by 3 new tests (rejects \r\n / \n / NUL injection, rejects out-of-range, accepts valid boundaries). WDP-DESK-02 (MODERATE) — removed unused shell:allow-execute / shell:allow-open from capabilities/default.json. The Rust backend spawns processes via std::process::Command (bypassing the allowlist) and the UI only uses dialog.open; the shell perms were unused privilege granting the webview arbitrary host command execution on compromise. Regenerated capabilities.json confirms only core:default + dialog perms remain. lib tests 18 -> 21 (+3 pins), integration 21 -> 21, 0 failed. Python deterministic proof unchanged (f8e76f21...46f7a; desktop off the signal path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr): ADR-178 — desktop IPC injection fix + capability least-privilege Records the 2 MEASURED MODERATE fixes in feddcde9d: WDP-DESK-01 (webview ssid/password \r\n-injected arbitrary firmware serial commands → validated fail-closed) and WDP-DESK-02 (unused shell:allow-execute/open capability granted to the webview → removed). 30-command IPC surface + capability scope audited; 6 dimensions clean-with-evidence. desktop 18→21. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
20ad75f30c |
feat(ADR-131): HOMECORE-UI dashboard + BFF gateway — review-fixed (supersedes #1082) (#1099)
* feat(ADR-131): HOMECORE-UI operational dashboard + BFF gateway Complete two-tier Cognitum operator dashboard (ADR-131), served by homecore-server at /homecore, plus the single-origin BFF gateway that wires it to real backends. Front-end (zero-dep vanilla TS/JS + CSS, exact Cognitum design tokens): - All 10 panels (§4.1-4.10): dashboard, SEED fleet + detail, fleet map, entities (live WS subscribe_events, never polls), rooms, COGs, calibration wizard, events + automation builder, witness/audit, settings. - §6 UX invariants in code: first-class provenance, prominent stale/veto/ fragility, null(not-trained) vs withheld vs error, --mono everywhere, Hailo vs CPU COG distinction. - api.js calls the gateway routes in production; mock demoted to a dev-only ?demo=1 fixture (no mock in prod); typed error states. - Tests under plain node: import-graph, boot, render-smoke (22), interaction (3), prod-errors (13) — 5 files green; bundle ~137 KB (~37x smaller than HA), <2 ms/cold-render. BFF gateway (homecore-server/src/gateway.rs, compiled + tested on Rust 1.89): - /api/cal/* reverse-proxy to the calibration API (ADR-151). - GET /api/homecore/rooms with the RoomState adapter (breathing->breathing_bpm, heartbeat:null->heart_bpm:null, injected anomaly.threshold/room_id). - GET /api/homecore/cogs supervisor over /var/lib/cognitum/apps/. - GET /api/homecore/appliance from /proc + TCP service probes. - SEED-device/appliance routes return typed 503 upstream_unavailable. - cargo test -p homecore-server = 12/12; run live (curl-verified); fixed a real double-v1 proxy-URL bug found during live testing. Honest scope: W1/W2/W4/W6-appliance functional; W3/W5/W6-Hailo/federation return typed 503 (depend on services/hardware not in this repo). Co-Authored-By: claude-flow <ruv@ruv.net> * fix(homecore-ui): resolve code-review findings — SSRF guard, CORS/trace coverage, §6 honesty, crash guards Addresses the high-effort review of PR #1082: - SECURITY: cal_proxy rejects path-traversal/confused-deputy SSRF (`.`/`..` segments, backslash, %2e%2e/%2f, absolute) on raw+decoded forms → 400, before attaching the server-side calibration bearer. - CORRECTNESS: /api/homecore/* + /api/cal/* now covered by the shared CORS allowlist (build_cors_layer, exported from homecore-api) + TraceLayer — previously merged outside router()'s layers (no CORS, no tracing). - §6 HONESTY (no fabricated data): dashboard renders '—' for null metrics (not "null%"/"null°C"); cogs Hailo pill reflects the REAL appliance probe (not hardcoded "connected"); room anomaly threshold passed through / null, not a fabricated 0.5. - ROBUSTNESS: cogs asArray(hef) guards a non-array manifest field; calibration progress guards target<=0 (no NaN%/Infinity%); restart clears the poll timer. - CLEANUP: mock.js is now a cached DYNAMIC import (demo-only) — never bundled in production (§2.2). - New ui/tests/unit-fixes.mjs pins the above; ADR-131 + CHANGELOG updated. Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: Nick Ruest <127058086+nicholas-ruest@users.noreply.github.com> |
||
|
|
1df6d1e1ee |
security(nvsim): guard degenerate input — config panic + NaN silent-corruption + ADR-177 (#1098)
* fix(nvsim): guard degenerate input — config-induced panic + NaN-state poisoning
Beyond-SOTA security review of the ADR-089 NV-diamond simulator (milestone #9,
crate 2 of 4). Two real degenerate-input findings, each pinned fails-on-old:
NVSIM-DT-01 (config panic/DoS, pipeline.rs): an external f_s_hz == 0 made
dt == +Inf, dt_us saturated to u64::MAX, and `sample * dt_us` panicked with
"attempt to multiply with overflow" at sample >= 2 (debug/WASM panic=abort;
garbage t_us in release). Fix: sanitise dt (non-finite/non-positive -> 1 µs
fallback), cap the u64 cast, and saturating_mul the timestamp.
NVSIM-NAN-01 (NaN-state poisoning, digitiser.rs): a non-finite scene parameter
(NaN dipole position / Inf moment / NaN loop radius) bypasses the near-field
clamp (NaN < R_MIN_M is false) and yields a NaN field; at the ADC `NaN as i32`
== 0 silently emitted b_pt=[0,0,0] with ADC_SATURATED CLEAR — indistinguishable
from a legit zero-field reading. Fix at the funnel: adc_quantise treats any
non-finite input as out-of-range -> clamps to code 0 AND raises the saturation
flag, so the corruption is visible downstream.
Determinism integrity, panic-free MagFrame deserialisation, and RNG seeding
confirmed clean with evidence. The published cross-machine witness
(cc8de9b0…93b4) is unchanged — guards only affect degenerate inputs.
cargo test -p nvsim --no-default-features: 50 -> 53 passed, 0 failed.
Workspace green; Python deterministic proof unchanged (f8e76f21…46f7a,
nvsim off the signal proof path). Needs ADR slot 177.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-177 — nvsim degenerate-input hardening
Records the 2 MEASURED MEDIUM fixes in
|
||
|
|
4a083999e5 |
security(ruview-swarm): fail-closed on NaN/Inf at the swarm-comm trust boundary + ADR-176 (#1096)
* fix(ruview-swarm): fail-closed on NaN/Inf at swarm-comm trust boundary (ADR-148)
Beyond-SOTA security review of the ADR-148 drone swarm control plane found
four IEEE-754 NaN/Inf fail-open / DoS bugs on data crossing the untrusted
swarm-comm boundary (receive_peer_state / receive_peer_detection accept full
DroneState/CsiDetection whose f64/f32 fields deserialize with no finite-check).
- HIGH: failsafe::tick collision-avoidance + battery checks fail-open on NaN
(NaN < threshold == false silently disabled collision avoidance / kept a
NaN-battery drone Nominal). Now fails closed to EmergencyDiverge / RTH.
- MED: geofence::check NaN-altitude bypass returned Safe through the
point-in-polygon path. Now leading non-finite-coordinate guard -> HardBreach.
- MED/DoS: antijamming FhssRadio panicked with "% 0" on an empty deserialized
channels_mhz. Now len==0 early-returns (benign 0.0 sentinel).
- LOW: multiview::fuse propagated a NaN victim_position into the fused
"confirmed victim" location. Now requires finite confidence + position.
Each fix pinned by a fails-on-old / passes-on-new test (MEASURED: old code
returned Nominal/Safe or panicked). cargo test -p ruview-swarm
--no-default-features: 117 -> 123 passed, 0 failed. Workspace green; Python
deterministic proof unchanged (f8e76f21...46f7a, off the signal path).
Documented-not-fixed (ADR slot 176): Raft AppendEntries lacks Log-Matching
consistency check (topology/raft.rs); MavlinkSigner::verify uses non-constant
-time tag compare + no replay-window rejection (already doc-flagged).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-176 — ruview-swarm NaN-fail-open safety review
Records the 4 MEASURED fail-open safety bugs fixed in
|
||
|
|
0f64d23516 |
feat(bench): int8 quantization of WiFlow-STD half pose model — MEASURED trade-off (ADR-175, honest negative) (#1095)
Sub-deliverable 8.2 of the benchmark/optimization milestone. Quantizes the 843,834-param "half" WiFlow-STD pose model (half_best.pth) to int8 two ways and MEASURES the accuracy/size trade-off vs fp32 under ONE locked normalization (ADR-173 torso-diameter PCK, upstream calculate_pck use_torso_norm=True), on the same seed-42 file-level 70/15/15 test split that produced the fp32 sweep numbers. MEASURED on ruvultra (RTX 5080, torch 2.11.0+cu128, fbgemm; clean test, torso-PCK): fp32 96.62% pck@20 99.47% pck@50 0.008981 mpjpe 3.351 MB int8 PTQ static 40.98% pck@20 94.98% pck@50 0.038262 mpjpe 1.046 MB (-55.64pp) int8 QAT (3 ep) 67.48% pck@20 98.69% pck@50 0.026548 mpjpe 1.043 MB (-29.15pp) Verdict (honest no): int8 is NOT a win at the strict PCK@20 edge target. Static PTQ collapses; QAT recovers a large share but still loses 29 pp @20 for a 3.2x size win — keep fp32/fp16 on the edge. Disclosed: QAT fake-quant val pck@20 was 83.45% but converted int8 scores 67.48% (~16pp convert_fx gap, reported honestly). Deliverables: - v2/crates/wifi-densepose-train/scripts/quantize_half_int8.py (reproducible: header carries the exact ssh command + run date; QAT primary, static PTQ fallback) - docs/adr/ADR-175-int8-quantization-half-pose-model-measured.md (MEASURED table, locked normalization, QAT-vs-PTQ labeling, verdict, reproduction, limitations) - CHANGELOG [Unreleased] ### Added entry No production Rust or signal-pipeline change. Python deterministic proof unchanged (f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a, bit-exact). |
||
|
|
b209b8b778 |
ci(bench): compile-verify regression gate for v2 criterion benches + ADR-174 (#1094)
* ci(bench): wire v2 criterion benches into CI as a compile-verify regression gate
Sub-deliverable 8.3 of the benchmark/optimization milestone (needs ADR slot 174).
The v2/ workspace ships 26 criterion benches across 18 crates, but benches are
not part of `cargo test`, so nothing in CI compiled them and they silently rot
when a public API they call changes.
Add `.github/workflows/bench-regression.yml`:
- bench-compile (HARD GATE): `cargo bench --workspace --no-default-features
--no-run` compiles + links every default-feature bench (no measurement) plus
the cir-gated cir_bench — a real, deterministic regression guard against
bench bit-rot.
- bench-fast-run (INFORMATIONAL, continue-on-error, never gates): runs a
curated pure-CPU subset (nvsim, ruvector sketch/fusion) in criterion
quick-mode and uploads logs as an artifact.
No timing-regression gate, by design: wall-clock on shared GitHub runners varies
2-3x run-to-run, so a hard threshold or cross-runner `criterion --baseline`
compare would manufacture false failures. The honest scope is compile-verify +
informational-run; the workflow header documents the self-hosted-runner
condition under which true timing-gating becomes honest. The crv-gated crv_bench
is excluded because its crates.io dep ruvector-crv 0.1.1 fails to build upstream.
Running the gate immediately caught one already-bit-rotted bench:
wifi-densepose-mat/detection_bench failed to compile (E0063: missing field
last_rssi in SensorPosition). Fixed (last_rssi: None) and re-verified.
Validation (MEASURED): mat detection_bench + cir_bench + nvsim + ruvector +
vitals + swarm benches compile under --no-default-features; fast subset runs;
`cargo test -p wifi-densepose-mat --no-default-features` 174 passed / 0 failed;
Python proof PASS, hash f8e76f21...46f7a unchanged.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-174 — CI bench-regression compile-verify gate
Records sub-deliverable 8.3 (bench-regression.yml, committed
|
||
|
|
90a88ada9a |
feat(train): metric-locked PCK/MPJPE accuracy harness + ADR-173 (resolve PCK-definition ambiguity) (#1092)
* feat(train): metric-locked PCK/MPJPE accuracy harness — resolve PCK-definition ambiguity
The SOTA brief (docs/research/sota-nn-train-benchmark-brief.md §1/§3.1/§4)
identifies metric ambiguity as the single biggest threat to any beyond-SOTA
claim: three PCK@20 numbers (96.09% WiFlow-STD image-normalized, 81.63%
AetherArena torso-PCK, 61.1% GraphPose-Fi standard PCK) cannot be lined up
because each silently uses a different normalization. The project was retracted
twice over this (a withdrawn 92.9% used absolute pixels, not torso).
New src/accuracy.rs makes the normalizer explicit, selectable, and carried with
every reported number:
- PckNormalization enum: TorsoDiameter (standard MM-Fi/GraphPose-Fi hip↔hip),
BoundingBoxDiagonal (looser WiFlow-STD image-normalized), AbsolutePixels(t)
(retracted convention, reproducible + clearly non-comparable).
- pck_at(pred, gt, vis, k, normalization) — one canonical PCK reusing the
metrics_core geometric primitives (no duplicate kernel).
- mpjpe(pred, gt, vis) — 2D/3D, mm.
- PoseAccuracy { pck_at: BTreeMap<u8,f32>, mpjpe, normalization, n_keypoints,
n_frames } via accuracy_report(frames, ks, normalization) — an unlabeled PCK
number is structurally impossible.
17 hand-computed deterministic tests (no GPU, no datasets) prove the harness
arithmetic, including the key proof that identical predictions score
0.50 / 1.00 / 0.75 under the three normalizations, plus graceful degenerate
handling (zero torso, empty frames, NaN coords — no panic, never false-perfect).
This is measurement infrastructure, NOT an accuracy claim. Public API worth an
ADR — needs ADR slot 173 (parent to write).
wifi-densepose-train lib 191→206, test_metrics 12→14, 0 failed; full workspace
green (exit 0); Python deterministic proof unchanged
(f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-173 — metric-locked PCK/MPJPE accuracy harness
Documents the accuracy harness (committed
|
||
|
|
cfd0ad76cf |
security(core,cli): pin CSI-deserialiser DoS-resistance + ADR-172 (clean-with-evidence) (#1091)
* test(core,cli): pin DoS-resistance of CSI deserialisers (ADR-127 security review)
Beyond-SOTA security review of wifi-densepose-core + wifi-densepose-cli.
Load-bearing-question verdict: the NaN-state-poisoning bug class does NOT
originate in core — core exposes no stateful accumulator (no Welford,
von-Mises, IIR, voxel grid, running mean); each downstream crate rolls its
own, so each fix is correctly local. Both crates confirmed clean on every
reviewed dimension (panic-on-adversarial-input, NaN handling, unbounded
memory, path traversal, secrets) — no production code changed.
Adds 4 regression pins locking in two existing-but-untested DoS guards:
- core: from_canonical_bytes shape guard (Vec::with_capacity bound) — proven
to fail with `capacity overflow` when the saturating-mul guard is removed.
- core: canonical decoder never panics on arbitrary/truncated bytes.
- cli: parse_csi_packet rejects an oversized n_antennas*n_subcarriers claim
before Array2 allocation (33 MB claim in a 2 KB datagram -> None).
- cli: parse_csi_packet never panics on arbitrary UDP bytes.
core: 35 -> 37 lib tests; cli: 24 -> 26 tests; 0 failed. Python proof
unchanged (f8e76f21…46f7a — off the signal path).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(adr): ADR-172 — wifi-densepose-cli + core CSI-deserialiser security review
Records the clean-with-evidence verdict + 4 DoS-resistance regression pins
(test-only, committed in
|
||
|
|
71e8756051 | docs(research): SOTA evidence brief for nn/train benchmark ADR (#1090) | ||
|
|
5287497a4a |
security(homecore-migrate): redact secret value from malformed secrets.yaml error (#1089)
* fix(homecore-migrate): redact secret value from malformed secrets.yaml error (secret-leak)
`read_secrets` wrapped serde_yaml's parse error into `MigrateError::YamlParse {
source }`. serde_yaml's message for a typed-tag coercion failure embeds the
offending scalar verbatim, e.g. `invalid value: string "<the-secret-value>"`.
That error propagates out of `read_secrets`, is `?`-returned by the
`InspectSecrets` CLI path in main.rs, and printed to stderr by anyhow — leaking
a secret value despite the CLI's deliberate `<redacted>` design.
Fix: secrets.yaml parse failures now map to a new redacting variant
`MigrateError::SecretsParse { path, line, column }` that carries only the file
path and a coarse location (from `serde_yaml::Error::location()`), never the
scalar content. Other (non-secret) YAML files keep `YamlParse`.
Pinned by `secrets::tests::malformed_secrets_error_never_contains_secret_value`
(asserts the rendered error AND its full #[source] chain never contain the
secret value; fails on the old `YamlParse` path) plus
`malformed_secrets_error_reports_location` (still fail-closed + locatable).
ADR-165 secret-handling rule: a secret value must never appear in output.
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(homecore-migrate): record secret-leak fix in ADR-165 + CHANGELOG
Note the secrets.yaml error-redaction fix and the review's clean dimensions
(read-only source / no traversal / no panic / fail-closed versioning / no
injection) in ADR-165 §2.4, bump the test-evidence count 19→21 in §2.6, and add
an [Unreleased] Security entry to CHANGELOG.
Co-Authored-By: claude-flow <ruv@ruv.net>
|
||
|
|
bf1dfe79fd |
fix(homecore core): TOCTOU race dropped/reordered state_changed events under concurrent writers (~93k→0) + 2 fail-closed hardenings (#1087)
* fix(homecore): atomic state set — close TOCTOU lost/reordered state_changed events StateMachine::set did get() (release shard lock) → compute next + no-op decision → insert() (re-acquire lock) → send(). The read-modify-write was not atomic w.r.t. a concurrent writer on the same entity: a writer that read a stale `old` could mis-classify a real transition as a no-op and drop its state_changed event (a missed automation trigger) or fire an event whose new_state duplicated the previously delivered one (a spurious trigger for any automation keyed on old_state != new_state). ADR-127 §2.1 promises "writer atomically replaces the map entry"; the implementation did not. Fix: hold the DashMap shard write-lock across the whole read→decide→insert→ fire sequence via entry()/insert_entry(). tx.send is non-blocking, non-async, and never re-enters the map, so firing under the shard lock cannot deadlock and keeps global event order in lock-step with global commit order. Pinned by concurrent_set_fires_no_duplicate_adjacent_events: 4 writers toggling one entity A/B; asserts no two consecutive fired events carry the same new_state (impossible under correct serialisation). Fails reliably on the old code (~365-476 duplicate-adjacent events on the first trial), passes on the fix across repeated runs. Co-Authored-By: claude-flow <ruv@ruv.net> * harden(homecore): bound entity_id length — close memory-DoS at the REST boundary homecore-api/src/rest.rs parses untrusted path segments straight through EntityId::parse (get/delete/set_state). With no length cap, an otherwise-valid id like "a." + many MB of [a-z0-9_] was accepted; a POST /api/states/<giant> would persist it into the DashMap state store, permanently growing memory (amplification across distinct ids). Fix: reject ids longer than MAX_ENTITY_ID_LEN (255, HA-compatible) up front in parse(), before any per-char scan, with a new EntityIdError::TooLong. Fails closed at the boundary type so every caller (REST, registry deserialize, automation) is protected. Pinned by entity_id_length_boundary: exactly-MAX accepted, MAX+1 rejected, 4 MiB id rejected as TooLong. Fails on old code (oversized parses Ok). Co-Authored-By: claude-flow <ruv@ruv.net> * harden(homecore): isolate panicking service handlers (catch_unwind) ServiceRegistry::call already ran handlers outside the registry lock (the Arc<dyn ServiceHandler> is cloned out of the read guard first), so a panic could never poison the RwLock or block other callers — good. But a panicking handler unwound through call() into the caller's task; the task driving the engine (e.g. an axum request handler invoking a service) could be aborted by one buggy integration. Fix: wrap the handler future in AssertUnwindSafe + FutureExt::catch_unwind and convert a panic into ServiceError::HandlerPanicked. Mirrors HA isolating service-handler exceptions. The registry stays fully usable afterwards. Pinned by panicking_handler_is_isolated_and_registry_survives: the panicking call returns HandlerPanicked (not an unwind), a sibling healthy service still returns its value, and the bad service remains registered. Fails on old code (the await point panics instead of returning Err). Co-Authored-By: claude-flow <ruv@ruv.net> * test(homecore): pin event-bus lag safety (bounded broadcast, no DoS) Documents-with-evidence that the core EventBus does NOT have the homecore-api WS broadcast-lag failure: with EVENT_CHANNEL_CAPACITY=4096, firing 3x capacity while a subscriber never drains keeps fire_* non-blocking (publisher never waits on slow receivers), gives the slow receiver a recoverable Lagged(n) (drop-oldest + re-sync) rather than a closed channel, and leaves the bus live for a fresh fast subscriber. No code change — pins the clean dimension. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(homecore): record ADR-127 §9 security+concurrency review + CHANGELOG Documents the three pinned fixes (HC-RACE-01 state-set TOCTOU, HC-EID-LEN-01 entity_id memory-DoS, HC-SVC-PANIC-01 service-handler isolation) and the clean dimensions (bounded event-bus lag handling, lock discipline / no lock-across-await, no panic-on-input) with their evidence. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
9b126e927e |
harden(assist security): bound untrusted utterance (DoS); cmd-injection/ReDoS/NaN/fail-open all proven clean with evidence (#1086)
* fix(homecore-assist): bound untrusted utterance length, fail closed (ADR-133 security)
The intent recognizers accept utterances from untrusted callers (voice
transcripts, the WebSocket `assist` command). Neither the regex nor the
semantic path bounded utterance length, so a pathological multi-megabyte
utterance forced an unbounded `to_lowercase()` clone plus a per-registered-
pattern scan (and, in the semantic path, full tokenisation + feature-hash
embedding) — an allocation/CPU amplification on attacker-controlled input.
The `regex` crate is linear-time (no catastrophic backtracking), so this was
a throughput/memory DoS rather than a hang, but it was still unbounded.
Fix: introduce MAX_UTTERANCE_BYTES (4 KiB — far above any real spoken
command) and check it at both recognizer boundaries BEFORE any allocation or
scan. An over-length utterance fails closed: Ok(None) (no intent, no action),
identical to an unrecognised phrase. No legitimate command is affected.
Pinned by fails-on-old tests:
- recognizer::over_length_utterance_fails_closed — an over-length utterance
that contains a valid command resolves to None (would have matched before)
- semantic_recognizer::over_length_utterance_fails_closed_semantic
Co-Authored-By: claude-flow <ruv@ruv.net>
* test(homecore-assist): pin clean security dimensions with evidence (ADR-133)
Adds regression tests documenting the dimensions reviewed and found clean,
so the properties cannot silently regress:
- runner: no subprocess surface exists. RufloRunnerOpts.{script_path,env}
are inert and never executed; even a hostile script_path/env spawns
nothing. And the entity_id capture class [a-z0-9_ .] strips every shell
metacharacter, so a resolved slot can never carry ; | & $ ` / etc into a
(future) argv — sanitisation by construction.
(shell_metachars_never_survive_into_a_resolved_slot,
runner_opts_are_inert_no_process_spawned)
- recognizer: the regex crate is a linear-time finite automaton; a classic
catastrophic-backtracking shape (a+)+$ on adversarial input completes in
bounded time — no ReDoS.
(pathological_backtracking_pattern_completes_in_bounded_time)
- embedding: embeddings are structurally finite (FNV feature-hash + guarded
L2 normalise, no external float input, no unguarded division), so a crafted
utterance cannot inject NaN/Inf to poison cosine k-NN; cosine against the
zero vector is a finite 0.0, never NaN.
(embeddings_are_structurally_finite, cosine_with_zero_vector_is_finite_not_nan,
empty_utterance_against_empty_index_no_panic_no_match)
- pipeline: injection-shaped utterances never deliver a metacharacter into a
service call; the worst case resolves to a clean entity token, and an
unrecognised utterance fails closed to not_understood (no action).
(pipeline_injection_shaped_utterance_carries_no_metachars_to_service)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(homecore-assist): record ADR-133 security review (HC-ASSIST-01 + clean dims)
CHANGELOG [Unreleased] Security entry + ADR-133 section 6 review notes for the
homecore-assist voice/intent pipeline review.
Co-Authored-By: claude-flow <ruv@ruv.net>
|
||
|
|
41bee64593 |
fix(recorder): bound history query (memory-DoS) + add missing transactional purge (disk-DoS); SQL-injection & NaN dims clean (#1084)
* fix(homecore-recorder): bound history query + add transactional purge (memory-DoS + disk-DoS)
Security review of the HA-compat state recorder (ADR-132) found two real
bounding bugs; SQL-injection and NaN-index dimensions confirmed clean.
(1) Memory-DoS: get_state_history carried no LIMIT — a wide [since,until]
window over a high-frequency entity loaded an unbounded row set into a
single in-memory Vec. Added LIMIT MAX_HISTORY_ROWS (1,000,000); the
sibling search paths were already k-bounded.
(2) Disk-DoS / documented-but-missing purge: README advertised
Recorder::purge(older_than) but no retention path existed -> unbounded
disk growth. Added a transactional purge with an EXCLUSIVE cutoff
(idempotent, no off-by-one) that deletes old states+events and
garbage-collects orphaned state_attributes blobs (dedup-shared blobs
are kept until their last referencing state is gone). All three deletes
run in one transaction so a mid-purge failure rolls back cleanly.
Pinning tests (homecore-recorder 19->25 no-default / 25->31 ruvector, 0 failed):
- malicious_entity_id_is_stored_literally_not_executed (SQL injection)
- like_metacharacters_in_query_are_literal_not_wildcards (LIKE escape)
- history_query_carries_a_limit_clause (memory-DoS bound)
- purge_keeps_boundary_row_and_drops_older (exclusive-cutoff, true pin)
- purge_gcs_orphaned_attributes_but_keeps_shared (dedup-safe GC)
- purge_also_removes_old_events
No behaviour change beyond the two fixes. Python deterministic proof
unchanged (recorder is off the signal proof path).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(homecore-recorder): record ADR-132 security review findings
Add a "3a. Security review" section to ADR-132 and a CHANGELOG [Unreleased]
Security entry covering the homecore-recorder review: SQL-injection and
NaN-index dimensions confirmed clean with evidence (every query bound; LIKE
pattern bound+escaped; SHA-256->i32->f32 embeddings always finite, empty
index/k=0 probed no-panic), plus the two fixes (unbounded history LIMIT,
transactional exclusive-cutoff purge with orphan-attribute GC).
Co-Authored-By: claude-flow <ruv@ruv.net>
|
||
|
|
5bc3b634b7 |
fix(automation security): template-bomb DoS (100MB/11s render → fuel-bounded, HIGH) + delay panic-on-config (MEDIUM) (#1083)
* fix(homecore-automation): bound template render to stop unbounded-expansion DoS (HC-SEC-01)
A `template:` condition / value_template comes straight from user
automation config and was rendered with MiniJinja's default (no
instruction budget, no output cap). A single condition such as
`{% for i in range(5000) %}{% for j in range(5000) %}xxxx{% endfor %}{% endfor %}`
rendered a 100 MB string over ~11 s on one render call (proven
empirically) — a CPU/memory denial of service, the bfld-class
"unbounded expansion".
Fix:
- Enable MiniJinja's `fuel` feature and set a per-render instruction
budget (`set_fuel(Some(1_000_000))`). A nested loop burns one unit
per iteration, so the budget caps total work regardless of nesting;
the attack now fails fast (~90 ms) with "engine ran out of fuel".
- Reject template sources over 64 KiB before compilation (defense in
depth so a pathological literal can neither compile nor emit verbatim).
Legitimate HA templates (a few dozen instructions) are unaffected.
Tests (fail on old — unbounded render / no rejection):
- nested_loop_template_is_bounded_not_unbounded_dos
- single_huge_repeat_template_is_bounded
- oversized_template_source_is_rejected
- legitimate_template_still_renders_within_fuel (no regression)
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(homecore-automation): stop crafted delay/timeout from panicking the run task (HC-SEC-02)
`Action::Delay { seconds }` and `Action::WaitForTrigger { timeout_seconds }`
fed the user-supplied float straight into `Duration::from_secs_f64`, which
PANICS on negative, NaN, infinite, or overflowing inputs. All of those are
reachable from a crafted (or simply typo'd) automation YAML —
`delay: {seconds: -1}`, `.nan`, `.inf`, `1e308` — so one hostile config
aborts the spawned automation task with a panic
("cannot convert float seconds to Duration: value is negative", proven
empirically).
Fix: a `safe_duration_from_secs` guard that saturates instead of panicking,
matching Home Assistant's lenient "non-positive delay = no delay":
- NaN / ±inf / negative -> Duration::ZERO
- absurdly large (would overflow) -> clamped to ~100 years (MAX_DELAY_SECS)
Tests (fail on old — panic = failure):
- delay_negative_seconds_does_not_panic
- delay_nan_seconds_does_not_panic
- delay_infinite_seconds_does_not_panic
- wait_for_trigger_negative_timeout_does_not_panic
- safe_duration_saturates_hostile_values (incl. overflow clamp)
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(homecore-automation): record HC-SEC-01/02 security review (CHANGELOG + ADR-129 §8a)
Document the two DoS findings (template unbounded-expansion HC-SEC-01,
delay panic-on-config HC-SEC-02) and the dimensions probed clean
(condition fail-closed, bounded run-modes, sandboxed read-only templates).
Co-Authored-By: claude-flow <ruv@ruv.net>
|
||
|
|
e1f4897269 |
fix(geo numerical): parse_hgt underflow/inf-grid (HIGH) + haversine asin-NaN; pointcloud confirmed-robust (NaN-poisoning class, 3rd find) (#1081)
* fix(geo numerical robustness): parse_hgt underflow panic + haversine asin-domain NaN Targeted numerical-robustness audit of wifi-densepose-geo (ADR-154-class sweep). Two real bugs, each pinned by a fails-on-old test: 1. terrain.rs parse_hgt — usize underflow panic on degenerate input. `side = sqrt(n_samples)`; for empty / sub-2x2 buffers side <= 1, so `1.0 / (side - 1)` underflows `usize` (panic "attempt to subtract with overflow" in debug; wraps to a huge value in release → garbage/inf cell_size_deg that poisons every ElevationGrid::get). A truncated HTTP body or a 404 HTML page reaches parse_hgt. Now bails with a clear error when side < 2. 2. coord.rs haversine — asin domain overflow → NaN for (near-)antipodal points. Floating rounding can push `h.sqrt()` to 1.0 + ~4e-16, and `asin(>1)` is NaN (verified: pair (-44.4994,-178.95722)→(44.49939999, 1.04278001) yields h=1.0000000000000004). A NaN distance silently breaks all downstream `<`/`>` comparisons. Clamp into [0,1] before asin. Also pins the ±90° pole-singularity (cos(lat)=0 division) as no-panic; the ENU transform itself is unchanged (no behavior change for valid inputs). Tests: wifi-densepose-geo 9→15 lib (6 new), 8 integration unchanged. 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * test(pointcloud robustness): pin NaN-state-poisoning resistance + degenerate voxel fusion Numerical-robustness audit of wifi-densepose-pointcloud. No bug found — the crate is confirmed-robust against the proven NaN-state-poisoning class that bit calibration/vitals. This adds regression pins documenting why: 1. csi_pipeline.rs — persistent auto-accumulating state (occupancy EMA, vitals) is provably self-healing. The UDP parser only emits finite amplitudes/phases (sqrt/atan2 of i8), and even an adversarial hand-built CsiFrame with NaN/inf amplitudes+phases cannot latch non-finite state: motion_score = (NaN/100).min(1.0) → 1.0; breathing path → 0 → clamp(5,40) → 5.0; tomography EMA uses only integer rssi. The new test injects 40 poisoned frames and asserts occupancy/vitals stay finite AND the pipeline recovers to an in-range estimate afterward — so a future refactor that drops a `.min`/`.clamp` self-heal would fail this pin. 2. fusion.rs — fuse_clouds voxel averaging is div-by-zero-safe (per-voxel count >= 1 by construction). Pins empty / single-point / all-coincident inputs as no-panic with finite output. No behavior change. Tests: wifi-densepose-pointcloud 18→22 (4 new), 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(geo/pointcloud robustness): CHANGELOG + ADR-154 sibling-crate sweep note Record the wifi-densepose-geo + wifi-densepose-pointcloud numerical-robustness audit under CHANGELOG [Unreleased] → Fixed, and a sibling-crate-extension note on the ADR-154 horizon ledger (these crates are outside ADR-154's signal scope but the sweep is the same ADR-154 class). Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
9f80b66ae3 |
harden(cog-ha-matter crypto): domain-separate witness signing + verify_strict (signing chain otherwise sound — P2 crypto core verified) (#1080)
* fix(cog-ha-matter): domain-separate witness signing chain + verify_strict (ADR-116 §2.2) Crypto review of the SHA-256 + Ed25519 witness chain that ADR-262 P2 reuses. The sibling wifi-densepose-engine bug class (unframed concatenation of operator-influenceable strings into a signed digest) is ABSENT here — canonical_bytes already length-prefixes kind/payload. Two real hardening gaps fixed: - CHM-WIT-01: add a versioned domain-separation tag (WITNESS_DOMAIN_TAG = b"cog-ha-matter/witness-event/v1\0") to canonical_bytes so the witness SHA-256 preimage / Ed25519 message cannot be replayed as a message for another signing context that shares key infrastructure (notably the manifest binary_signature). Completes the engine review's "domain-tag + length-prefix" rule. Witness bytes change by design (prior on-disk hashes/sigs invalidated); no in-repo crate consumes these bytes programmatically. - CHM-WIT-02: verify_signature uses VerifyingKey::verify_strict (rejects non-canonical encodings + small-order keys) for the audit-uniqueness property. Key stays caller-pinned (not read from the event). Pinned by fails-on-old tests: canonical_bytes_is_domain_separated, canonical_bytes_starts_with_domain_tag_then_prev_hash, witness_preimage_cannot_collide_with_a_bare_manifest_digest, signature_commits_to_domain_tag_not_bare_fields; key-pinning guarded by verify_uses_strict_path_and_pins_caller_key. cog-ha-matter 64 -> 68 tests, 0 failed. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(cog-ha-matter): record ADR-116 crypto review findings (CHM-WIT-01/02) CHANGELOG [Unreleased] Security entry + ADR-116 §4.1 review notes: engine-class signed-digest collision confirmed ABSENT (length-prefixing already correct), domain-separation tag added, verify_strict hardening, and the clean dimensions (verify-before-trust, key-handling, determinism, fail-closed parsing) with byte-layout evidence. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
02cb84e0bb |
fix(vitals safety): non-finite CSI frame permanently froze breathing+HR via IIR-state poisoning (self-heal) + noise-never-Valid pin (#1079)
* fix(vitals): self-heal IIR filters after non-finite CSI frame (ADR-021/ADR-158 §A1) The 2nd-order resonator bandpass_filter in BreathingExtractor and HeartRateExtractor latches each output y[n] into the filter state (y1/y2). A single non-finite amplitude residual from a corrupt CSI frame produced a NaN output that was written into the state. The existing extract() is_finite() guard dropped that one sample from the history buffer but never sanitized the poisoned filter state, so every subsequent output stayed NaN, was rejected too, and the sliding-window history never refilled: breathing AND heart-rate extraction went silently dead (returning None forever) until reset(). On the vitals alert path this is a safety-relevant denial of service — one bad frame stops monitoring with no error surfaced. Same class as the calibration NaN bug (ADR-154 §3) and the firmware vitals fixes (#998/#996/#987): prior hardening guarded the history boundary but not the filter-state boundary. Fix: when bandpass_filter computes a non-finite output it resets the IIR state to default and returns 0.0, so the resonator recovers on the next clean frame (the 0.0 is still dropped by the caller's finite-check, so no spurious sample enters history). Also de-magic the safety-critical HR physiological plausibility band into named HR_PLAUSIBLE_MIN_BPM/HR_PLAUSIBLE_MAX_BPM consts (value-identical 40/180 BPM). Pinned by: - breathing::tests::nan_frame_does_not_permanently_poison_filter (FAILS pre-fix) - breathing::tests::inf_mid_stream_does_not_freeze_history (FAILS pre-fix) - heartrate::tests::nan_frame_does_not_permanently_poison_filter (FAILS pre-fix) - heartrate::tests::pure_noise_is_never_reported_valid (fabricated-vital negative) - heartrate::tests::plausibility_band_constants_pinned (de-magic value pin) wifi-densepose-vitals --no-default-features: 55->60 lib tests, 0 failed. Workspace green (3370 passed, 0 failed). Python proof unchanged (vitals off the deterministic proof's signal path). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(vitals): record IIR NaN/inf self-heal fix (ADR-021, CHANGELOG) Document the wifi-densepose-vitals filter-state poisoning fix in ADR-021 Implementation Notes (parallel to the firmware #998/#996/#987 robustness class) and add a CHANGELOG [Unreleased] Fixed entry. Notes the confirmed clean dimensions with evidence (flat -> None; noise -> low-confidence Unreliable, never Valid; harmonic-rich breathing -> not a confident false HR; out-of-band BPM clamped). Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
ebfaee4437 |
fix(calibration): NaN-poisoning silently disabled presence specialist (Features::from_series unguarded) + de-magic (#1077)
* fix(calibration): drop non-finite samples in Features::from_series (ADR-151) A single NaN/inf scalar sample (corrupt CSI frame) poisoned mean/variance into NaN, which — baked into a persisted PresenceSpecialist::threshold — silently disabled presence detection (every `f.variance > NaN` is false), no error raised. extract.rs is the live-inference + training feature path, yet (unlike geometry_embedding.rs) had no non-finite guard. Fix at the production boundary: filter non-finite samples before computing any statistic; an all-non-finite series degrades to Features::ZERO, same as the empty series. Value-identical for all-finite input (full_loop + existing extract tests unchanged). Pinned by two fails-on-old tests. Co-Authored-By: claude-flow <ruv@ruv.net> * refactor(calibration): de-magic specialist thresholds to named consts (ADR-151) Promote the bare default min-score literals (breathing 0.25, heartbeat 0.3) and the anomaly score scale / label cutoff (2.0× spread, > 0.5) to documented named consts. Value-identical — pinned by characterization tests asserting the consts equal the prior literals and the gate boundary (score >= floor). Co-Authored-By: claude-flow <ruv@ruv.net> * docs(calibration): record ADR-151 review — NaN fix + clean dimensions CHANGELOG [Unreleased] Security entry and ADR-151 §6.1 review note for the beyond-SOTA correctness+security review: NaN-poisoning fail-closed fix, file/path (no I/O in crate), untrusted-load, receipt/hash (absent), and the clean numerical paths — all with evidence. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
db3d94a313 |
fix(homecore-api security): auth-gate GET /api/ (was unauthenticated) + recover WS subscription on broadcast lag (#1076)
* fix(homecore-api security): auth-gate GET /api/ (HC-API-AUTH-01, ADR-161)
`rest::api_root` took no headers and unconditionally returned
`200 {"message":"API running."}`, while every sibling REST route gates
on `BearerAuth::from_headers`. HA's `APIStatusView` inherits
`requires_auth = True`, so `/api/` must return 401 for a missing/wrong
bearer — HA clients use it as a token-validation probe, so a 200 told a
bad-token client its token was valid and let an unauthenticated party
confirm a live endpoint. LOW severity (static body, no data leak),
reported at true severity.
Fix: `api_root(headers, State)` validates the bearer like `get_config`.
Pinned by fails-on-old tests (200 -> assert 401):
- api_root_rejects_missing_bearer
- api_root_rejects_wrong_bearer
guarded by api_root_accepts_correct_bearer (still 200 with valid token).
Co-Authored-By: claude-flow <ruv@ruv.net>
* fix(homecore-api security): recover WS subscription on broadcast lag (HC-WS-LAG-01, ADR-161)
`subscribe_events`'s per-subscription task matched `Err(_) => break` on
both broadcast `recv()` arms. `RecvError::Lagged(n)` (a slow consumer
falling >EVENT_CHANNEL_CAPACITY=4,096 events behind) is recoverable —
the bus doc says "Lagged receivers must re-sync" and HA keeps the
subscription alive across a lag. The old code treated the first lag as
fatal, so after an event burst the client's stream went permanently
silent with no error frame — a self-inflicted event-delivery DoS under
load. LOW severity.
Fix: `Lagged(_) => continue` (skip dropped window, re-sync),
`Closed => break`, on both the system and domain arms.
Pinned by subscription_survives_broadcast_lag: subscribes, floods 6,000
filtered events past the 4,096 capacity to force a Lagged, then asserts
a subsequent subscribed event is still delivered (old code: 5s timeout).
Co-Authored-By: claude-flow <ruv@ruv.net>
* docs(homecore-api security): record HC-API-AUTH-01 + HC-WS-LAG-01 review (ADR-161)
CHANGELOG [Unreleased] Security entry + ADR-161 addendum documenting the
beyond-SOTA network-API review: two LOW bugs fixed (unauthenticated
GET /api/; WS subscription killed on broadcast lag) and the
auth/traversal/injection/info-leak/CORS dimensions confirmed clean with
evidence (no traversal surface — in-memory DashMap + EntityId allowlist;
HashSet token compare, not a byte-== timing oracle).
Co-Authored-By: claude-flow <ruv@ruv.net>
|
||
|
|
a369fbe66e |
fix(bfld security): close HIGH privacy-bypass in process_to_frame (identity surface leaked despite restrictive class) + JSON-injection (#1075)
* fix(bfld): route process_to_frame payload through PrivacyGate (ADR-141 privacy bypass) BfldPipeline::process_to_frame stamped the frame header with the active privacy class but serialized the caller-supplied BfldPayload UNCHANGED via BfldFrame::from_payload. This let a frame labeled Anonymous(2) or Restricted(3) carry the full identity-leaky compressed_angle_matrix (+ amplitude/phase proxies, csi_delta) that PrivacyGate::demote is documented and tested (privacy_gate_demote.rs) to strip at exactly those classes. A NetworkSink accepts class >= Derived(1), so such a frame would publish the beamforming angle matrix — the identity surface — across the node boundary despite its restrictive class byte. The class byte lied about payload content. Fix: after building the frame at the active class, apply PrivacyGate::demote to the same class. demote() strips sections by target-class threshold (independent of any class transition), so a same-class demote performs no class change but brings the payload into policy compliance. Research classes (Raw/Derived) keep the full payload — demote is a no-op there. Pinned by three fails-on-old tests in pipeline_to_frame.rs: - process_to_frame_at_anonymous_strips_identity_leaky_sections (FAILED pre-fix) - process_to_frame_in_privacy_mode_strips_amplitude_and_phase (FAILED pre-fix) - process_to_frame_at_derived_preserves_full_payload (guards against over-strip) The pre-existing round-trip test is updated to assert the gated payload. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(bfld): JSON-escape zone_id in MQTT state-topic payload render_events emitted the zone_activity payload as format!("\"{zone}\"") with no escaping, while ha_discovery.rs already escapes operator-controlled strings via push_str_field. A zone name containing a double-quote or backslash therefore produced malformed / injectable JSON on the state topic that Home Assistant parses (e.g. zone `a"b` -> payload `"a"b"`). Fix: add json_string_literal() mirroring ha_discovery's escaping (", \, \n, \r, \t, control chars) and use it for the zone payload. Value-identical for normal zone names (living_room etc.). Pinned by zone_payload_escapes_json_metacharacters (FAILED pre-fix); the existing zone_payload_is_json_string_with_quotes still passes unchanged. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-141): record bfld privacy+security review findings + CHANGELOG Document the two fixed bugs (process_to_frame privacy-bypass; zone_id JSON injection) and the dimensions confirmed clean (event-field gating, witness/hash framing, fail-closed) in ADR-141, plus CHANGELOG [Unreleased] Security/Fixed entries. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
d2089c342a |
fix(engine security): close witness domain-separation collision in governed-trust cycle + prove privacy monotonicity (#1074)
* fix(engine): length-prefix witness fields to close domain-separation collision The BLAKE3 trust witness concatenated model_version, calibration_version, and privacy_decision boundary-to-boundary, with the variable-length evidence list lacking an explicit count. A string straddling a field boundary (e.g. a per-room adapter id absorbing the leading bytes of the calibration epoch, or a model_version absorbing a trailing evidence ref) collided with a different trust decision — silently un-distinguishing two distinct privacy-relevant inputs and defeating the ADR-137 tamper/drift audit guarantee. model_version is operator-influenceable via the adapter id (ADR-150 §3.4), so the ambiguity was reachable. Fix: domain-tag the hash and length-prefix every field (8-byte LE length), plus an explicit evidence count. Pinned by two fails-on-old tests: witness_distinguishes_model_calibration_boundary and witness_distinguishes_evidence_model_boundary. Co-Authored-By: claude-flow <ruv@ruv.net> * test(engine): pin privacy monotonicity, fail-closed boundaries; de-magic constants Review hardening for the governed-trust cycle (no behavior change): - forced_contradiction_never_relaxes_class: property test over all 5 privacy modes proving a forced contradiction only ever raises the emitted class byte (more restrictive) and a clean cycle emits exactly the base class — the ADR-141/120 information-only-removed invariant. - empty_cycle_fails_closed: a zero-frame cycle errors (fusion NoFrames), emits no SemanticState, and does not advance the cycle counter. - single_node_cycle_is_well_formed: characterizes the n=1 boundary (no mesh, no directional, base class, witness still emitted) — documents single-node sensing as a valid non-demoting mode, not a bypass. - De-magicked the engine-construction literals (coherence accept gate, ADR-143 SLAM discovery + static-anchor thresholds) into named documented consts, value-identical, pinned by engine_constants_match_prior_values. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(engine-review): record witness domain-separation fix + monotonicity clean bill CHANGELOG [Unreleased] Security entry and review notes appended to ADR-137 (witness domain-separation fix) and ADR-141 (privacy monotonicity confirmed clean over all 5 modes, fail-closed boundaries pinned). Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
306d009e72 |
feat(rufield): rufield-viewer live-ingest mode (submodule bump) (#1072)
Bumps vendor/rufield to add --source live --upstream: the dashboard ingests RuView's /ws/field events, verifies each ed25519 receipt on ingest (forged events flagged, never fused), and renders real RuView FieldEvents through the same display path. Honest SYNTHETIC/LIVE/DISCONNECTED banner, mutually exclusive, never mislabeled (409 on /api/run in live mode). Closes the RuView↔RuField visual loop (ADR-262 surfaces). 26 tests, 0 failed. Co-authored-by: ruv <ruvnet@gmail.com> |
||
|
|
df617145d6 |
feat(ADR-262 P3): live /api/field + /ws/field — RuView sensing speaks RuField (fail-closed egress) (#1071)
* feat(ADR-262 P3): live RuField surface — RuView sensing speaks RuField on /api/field + /ws/field Wire the P1 `wifi-densepose-rufield` bridge into the live `wifi-densepose-sensing-server` so the governed sensing cycle emits real signed RuField `FieldEvent`s on two additive endpoints. - Cargo: add the `wifi-densepose-rufield` path dep (the single coupling point, ADR-262 §5.4 — no new RuView-internal coupling). - New `src/rufield_surface.rs` (kept out of the 8k-line main.rs): `FieldSurface` holds a dedicated ed25519 `Signer` + a bounded ring of recent events + the `/ws/field` broadcast topic; `GET /api/field` and `GET /ws/field` handlers; a standalone `router()` for isolated testing. - Signer (defers the P2 key decision, ADR-262 §8 Q1): a STANDALONE dev/sensing key from `WDP_RUFIELD_SIGNING_SEED`, else a deterministic dev default with a logged WARN. Reusing the `cog-ha-matter` Ed25519 key is the deferred P2 call — P3 does not pre-empt it. - Tap: at the ESP32 governed-trust cycle (`main.rs` ~5886 observe_cycle / ~5938 SensingUpdate build), `emit_rufield_event` joins the cycle's features/classification/signal_field with the engine's effective_class/demoted trust state into a `SensingSnapshot` and surfaces it via the bridge. Existing endpoints (`/ws/sensing` etc.) are unchanged — purely additive. - Privacy egress: `network_egress_allowed` is fail-closed for an unattended live surface — only P1/P2 leave the box; P0 raw and P3/P4/P5 (identity/biometric/aggregate) are held edge-local. A `Derived` cycle maps to P4/P5 and never surfaces. - No-phantom: `emit` drops no-presence cycles (no fabricated events). Gates (tests/rufield_surface_test.rs, tower::oneshot, 4/0): well-formed signed event (WifiCsi, P2 not P1, is_fusable, real timestamp); empty cycle → no phantom; Derived trust never surfaces; mixed stream surfaces only egress-safe events. Honesty (ADR-262 §0/§6): real plumbing on a live endpoint, NOT accuracy. Single-link CSI with its existing caveats (no validated room-coordinate accuracy); dedicated dev signing key pending the P2 ownership decision; no accuracy claim. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(ADR-262 P3): mark P1+P3 implemented; document /api/field + /ws/field; CHANGELOG - ADR-262 Status → "P1 + P3 implemented"; add a P3 implementation-status block (tap site, endpoints, dedicated dev signer deferring the §8 Q1 key decision, fail-closed egress, gates). Keep the honesty framing: real plumbing on a live endpoint, not accuracy. - CHANGELOG [Unreleased]: add the ADR-262 P3 entry. - user-guide: add `/api/field` to the REST table + a "RuField surface (ADR-262 P3)" section covering `/api/field` + `/ws/field`, the fail-closed P1/P2-only egress, the WDP_RUFIELD_SIGNING_SEED dev key, and the no-accuracy honesty note. Co-Authored-By: claude-flow <ruv@ruv.net> * ci: checkout submodules everywhere + Dockerfile copies vendor/rufield Making wifi-densepose-rufield (ADR-262 bridge) a v2 workspace member means EVERY cargo-on-workspace context must have the vendor/rufield submodule present (cargo loads all member manifests). P1 only fixed the rust-tests job; this adds `submodules: recursive` to all workflow checkouts that run cargo (mqtt-integration was failing on the missing submodule manifest), and makes Dockerfile.rust COPY vendor/rufield/ to /vendor/rufield (matches the bridge's ../../../vendor/rufield path-dep under the collapsed Docker layout). update-submodules.yml left alone (it manages submodules itself). Co-Authored-By: claude-flow <ruv@ruv.net> --------- Co-authored-by: ruv <ruvnet@gmail.com> |
||
|
|
f250149e94 |
feat(ADR-262 P1): wifi-densepose-rufield bridge — RuView sensing → signed RuField FieldEvents (fail-closed privacy map) (#1070)
* feat(rufield): ADR-262 P1 — wifi-densepose-rufield anti-corruption bridge New v2 workspace member that converts RuView WiFi-CSI sensing output into signed RuField FieldEvents. Path-deps the vendor/rufield submodule crates (rufield-core/-provenance/-privacy/-fusion); single coupling point between RuView and the standalone RuField MFS spec (ADR-262 §5.4). - SensingSnapshot: owned primitives mirroring SensingUpdate + TrustedOutput (no dependency on wifi-densepose-sensing-server). - snapshot_to_field_event(): builds a WifiCsi FieldTensor + Observation, derives a real position from the signal-field peak (never fabricated), real sha256 provenance + ed25519 signature (synthetic=false). - map_privacy() (§3.3 crux): maps by information content, NEVER byte value — Derived (byte 1) → P4/P5, never P1; fail-closed demotion floor to P2. P1 gates (tests/p1_gates.rs): round-trip serde, is_fusable verified receipt, RuFieldFusion::ingest accept + infer runs, privacy-safety (Derived never P1), full §3.3 table, fail-closed demotion, determinism, no-fabricated-position. 15 tests pass (5 unit + 9 integration + 1 doc), 0 failed. Honesty: P1 plumbing (tested conversion + safe privacy mapping), NOT wired into the live server (P3) and NOT an accuracy claim. Co-Authored-By: claude-flow <ruv@ruv.net> * docs(adr-262): mark P1 implemented + CI submodules:recursive + CHANGELOG/CLAUDE - ADR-262 Status → "Proposed — P1 implemented"; add §0.1 Implementation status (the bridge crate + the five P1 gates that pass; defers the provenance-carrier reuse, P3 live wiring, and P4 multi-modality). - ci.yml: add `submodules: recursive` to the rust-tests checkout so the new crate's `vendor/rufield` path-deps resolve in CI (they fail otherwise even though the workspace build passes locally with the submodule present). - CHANGELOG [Unreleased]: P1 bridge entry (kept alongside the upstream ADR-262 research entry). - CLAUDE.md: crate table row for `wifi-densepose-rufield`. Co-Authored-By: claude-flow <ruv@ruv.net>v1770 |
||
|
|
faca0530de |
docs(adr-262): RuField↔RuView integration design (Proposed) (#1069)
Researched integration ADR: thin wifi-densepose-rufield bridge crate (rvcsi pattern), live SensingServerAdapter emitting signed FieldEvents, vertical fusion composition (ruvsense within-WiFi → rufield cross-modal), and ONE canonical privacy/provenance model (RuView effective_class → RuField P0-P5 at egress; reuse cog-ha-matter SHA-256+Ed25519 receipt). Key finding: RuView has 2 privacy enums + 3 witness mechanisms; the Derived(byte=1)<Anonymous(byte=2)-but-carries-identity trap means the bridge must map by information content, not byte value. Plumbing architecture, not accuracy (real-CSI is unlabeled replay today). Co-authored-by: ruv <ruvnet@gmail.com>v1767 |