mirror of
https://github.com/ruvnet/RuView
synced 2026-06-16 11:23:19 +00:00
Commit: c27d6cc98e
@@ -7,6 +7,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Fixed
- **Multistatic fusion guard interval is now operator-configurable — fixes permanent trust demotion with WiFi-synced ESP32 nodes (#1049).** Two independently-clocked ESP32-S3 boards on ESP-NOW sync drift 10–150 ms (typ. ~70 ms) — the 100 ms beacon + WiFi-MAC jitter cannot hold them within the published 60 ms default guard, so the governed-trust cycle permanently demoted to `Restricted`, suppressed all pose output, and spun the error counter to 200k+ with **no escape hatch but a container restart**. Added a **direct `WDP_GUARD_INTERVAL_US` override** (+ optional `WDP_SOFT_GUARD_US`) to `multistatic_guard_config_from_env`, so a deployment can lift the hard guard past its measured spread (e.g. `WDP_GUARD_INTERVAL_US=200000`) without having to know its exact TDM schedule. Precedence is most-specific-wins: a direct override beats the existing `WDP_TDM_SLOTS`+`WDP_TDM_SLOT_US` schedule-derived guard, which beats the 60 ms/20 ms default; the override is applied on top of whichever base is selected, the soft band is always clamped strictly below the hard guard, and a malformed/zero value is ignored (falls back to the base rather than breaking fusion). The effective guard is now logged at startup. Pinned by 6 new tests (`multistatic_guard_config_tests`): direct-override-wins / beats-TDM-derived / soft-clamped-below-hard / lowering-hard-pulls-soft-down / malformed-or-zero-falls-back / default-when-unset. `wifi-densepose-sensing-server` bin tests **449 → 455**, 0 failed; Python proof VERDICT PASS, hash unchanged (off the signal proof path).
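
The override and clamping rules in the entry above can be sketched as follows. This is a simplified stand-in, not the server code: `GuardConfig`, `parse_positive`, and `resolve_guard` are hypothetical names, and the TDM-derived base is left out because the formula behind `MultistaticConfig::for_tdm_schedule` is not shown in this commit. Only the 60 ms / 20 ms default, the ignore-on-malformed-or-zero rule, and the "soft strictly below hard" clamp are taken from the entry.

```rust
/// Hypothetical stand-in for the server's guard config. Field names follow the
/// diff; the TDM-derived base is deliberately omitted (assumption: default base only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct GuardConfig {
    pub guard_interval_us: u64,
    pub soft_guard_us: u64,
}

impl Default for GuardConfig {
    fn default() -> Self {
        // Published default from the changelog entry: 60 ms hard / 20 ms soft.
        Self { guard_interval_us: 60_000, soft_guard_us: 20_000 }
    }
}

/// Parse a strictly positive integer. Empty, malformed, negative, fractional,
/// or zero input yields `None`, so the caller falls back to the base value.
fn parse_positive(raw: Option<&str>) -> Option<u64> {
    raw.and_then(|s| s.trim().parse::<u64>().ok()).filter(|&v| v >= 1)
}

/// Apply the direct hard-guard and soft-guard overrides on top of the default,
/// then clamp the soft band below the hard guard.
pub fn resolve_guard(guard_us: Option<&str>, soft_us: Option<&str>) -> GuardConfig {
    let mut cfg = GuardConfig::default();
    if let Some(g) = parse_positive(guard_us) {
        cfg.guard_interval_us = g;
    }
    if let Some(s) = parse_positive(soft_us) {
        cfg.soft_guard_us = s;
    }
    // Largest soft value that is still below the hard guard (never below 1).
    let ceiling = cfg.guard_interval_us.saturating_sub(1).max(1);
    cfg.soft_guard_us = cfg.soft_guard_us.min(ceiling);
    cfg
}
```

With these assumptions, `resolve_guard(Some("200000"), None)` keeps the 20 ms soft band under a 200 ms hard guard, while `resolve_guard(Some("10000"), None)` pulls the soft band down to 9 999 µs. One edge case falls out of the `max(1)` floor: a hard guard of 1 µs leaves the soft band equal to it rather than strictly below.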
### Security
- **`wifi-densepose-occworld-candle` — beyond-SOTA security + correctness review (Milestone #9, crate 4/4).** (1) **HIGH (MEASURED) — checkpoint-load crash on any int32 tensor** (`model.rs::safetensor_dtype_to_candle`). `safetensors::Dtype::I32` was mapped to `candle_core::DType::I64` and the raw int32 byte buffer (4 bytes/elem) was then handed to `Tensor::from_raw_buffer(.., I64, shape, ..)`. Candle derives `elem_count = data.len() / dtype.size_in_bytes()`, so the I64 path halved the element count while keeping the *original* shape — yielding a tensor whose declared shape claims twice as many elements as its backing storage holds. Reading it **panics** (`range end index 6 out of range for slice of length 3` — slice OOB inside candle-core) on any attacker-supplied or PyTorch-exported checkpoint containing an int32 tensor (common: index/buffer tensors). Fixed by mapping `I32 → DType::I32` (and `I16 → DType::I16`), both first-class candle dtypes. Reproduction recorded on old code; pinned by `tests/checkpoint_loading.rs::int32_tensor_loads_with_consistent_shape_and_values` (panics on old, passes on new) plus F32/I64/corrupt-file control cases. (2) **LOW (MEASURED) — `predict()` lacked frame/batch validation at the input boundary** (`inference.rs`). It validated H/W/D but not the externally-supplied frame count; an `f_in > num_frames*2` over-indexed the temporal positional embedding deep in the transformer and surfaced as a cryptic candle "gather" `InvalidIndex` (returned error, not a panic — candle bounds-checks), and a zero frame/batch dim fed a zero-element tensor into the pipeline. Now rejected at the boundary with a clear `ShapeMismatch`. Pinned by `predict_rejects_zero_frames` / `predict_rejects_too_many_frames` / `predict_accepts_frame_count_at_capacity`. (3) **LOW (MEASURED) — divide-by-zero panic on a degenerate input to the public `VQCodebook::encode`** (`vqvae.rs`): a rank-0 / empty-last-dim tensor made `last == 0` and panicked on `elem_count() / last`. 
Now fails closed with a clear error. Pinned by `encode_rejects_scalar_without_panicking`. **Dimensions confirmed CLEAN with evidence:** panic surface — zero `unwrap()`/`expect()`/`panic!`/`unreachable!` in production code paths (grep evidence; all error handling via `?`/`map_err`); NaN-state-poisoning — N/A (engine is stateless between `predict` calls, input is `u8` class indices so non-finite input is structurally impossible, no persistent world-model buffer to latch into); unbounded-alloc / shape-data mismatch from malformed weights — defended upstream by `safetensors::validate()` (overflow-checked `nelements*dtype.size()` vs declared byte range, rejected before reaching candle); secrets — none (grep clean, only `token_h`/`token_w` config fields match). `unsafe_code = forbid` in the crate manifest. **Build/validation status (MEASURED on Windows):** crate builds and tests under `cargo test -p wifi-densepose-occworld-candle --no-default-features` — **29/29 pass** (20 unit + 4 checkpoint_loading + 3 predict_honesty + 2 doc) after fixes; `cargo test --workspace --no-default-features` = 0 failed across all crates (lone `wifi-densepose-desktop` `api_integration` failure was a Windows "Access is denied (os error 5)" file-lock flake — re-ran in isolation **21/21 pass**); Python proof VERDICT PASS, hash `f8e76f21…446f7a` unchanged. *Warrants ADR slot 179 (parent to author).*
- **`wifi-densepose-wasm-edge` beyond-SOTA closing review — boundary NaN-state-poisoning guard + clean-with-evidence attestation (ADR-040 edge crate, ~70 modules).** Closing pass of the security campaign over the last untouched sizeable crate. **One real finding fixed (LOW / source-analysis + reproduced):** the two WASM↔host frame boundaries (`lib.rs::on_frame`/`on_timer` and `bin/ghost_hunter.rs::on_frame`) read raw IEEE-754 `f32` from the `csi_get_phase`/`csi_get_amplitude`/`csi_get_variance`/`csi_get_motion_energy` host imports **without any finiteness check** — the entire crate had **zero** `is_finite`/`is_nan` guards, and the in-crate `clamp` helpers propagate NaN (`NaN < lo` and `NaN > hi` are both false). A single non-finite value (firmware DSP bug, uninitialised buffer, or hostile host) latches NaN into the long-lived per-module accumulators (EMA, Welford, phasor sums, anomaly baselines); once latched, every downstream comparison evaluates `false`, so detectors fail **degraded** (stuck gate state, silently-disabled anomaly checks) — silent corruption, not a crash (WASM `panic=abort` is *not* tripped: no indexing/`unwrap` on the poisoned value). Threat model is a **semi-trusted** boundary (the Tier-2 DSP firmware supplies the imports, not direct network/JS), hence LOW severity / defense-in-depth. **Fix:** added `sanitize_host_f32()` (maps non-finite→`0.0`, `core`-only so it holds in `no_std`) applied at every `host_get_*` float read — a single chokepoint covering all ~70 downstream modules, mirroring the existing M-01 negative-`n_subcarriers` boundary clamp. **Pinned by** `boundary_tests::{sanitize_passes_finite_values_through, sanitize_maps_non_finite_to_zero, coherence_monitor_nan_latches_without_sanitize_but_not_with}` — the last asserts on the *current* `CoherenceMonitor` that a raw NaN frame latches the smoothed score (documents the hazard) while the boundary-sanitized path stays finite. 
**Dimensions attested CLEAN with evidence (source-analysis):** (a) **panic-on-input** — every non-test `unwrap()`/`expect()` is either `#[cfg(test)]` or in the `std`-gated RVF *builder* host tool writing to an in-memory `Vec` (infallible); no `panic!`/`unreachable!`/`todo!`/`get_unchecked` in any hot path. (b) **shape/bounds** — all frame-buffer access is `min()`-clamped (`MAX_SC=32`, `DTW_MAX_LEN`, `LCS_WINDOW`, `PATTERN_LEN`), all index-by-cast sites (`feature_id as usize`, `conclusion_id`, `minute_counter`, `plan_step`) are either compile-time-const-bounded or `if idx <`/`%`-guarded; negative `n_subcarriers` already mapped to 0 (M-01). (c) **memory/leak** — no `move ||` closures, no `mem::forget`/`Box::leak`/`.leak()`; the only `Box::new` is in the `std`-gated `skill_registry` (one-time init, bounded). (d) **secrets** — none (grep clean). **MEASURED build/test evidence:** host `cargo test --features std,medical-experimental` = **672 passed / 0 failed** (was 669 pre-fix; +3 new tests); the real deployment artifacts all build clean on the actual target — `cargo build --target wasm32-unknown-unknown --release` (no_std/panic=abort default lib), `--bin ghost_hunter --no-default-features --features standalone-bin`, and `--features medical-experimental` (toolchain 1.89 per `rust-toolchain.toml`). No ADR slot needed — a single LOW defense-in-depth boundary fix; CHANGELOG attestation suffices.
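
The int32 checkpoint finding in the `wifi-densepose-occworld-candle` entry comes down to byte arithmetic, which the sketch below reproduces without depending on candle. The helper names are hypothetical; the only facts taken from the entry are that the element count is derived as `data.len() / dtype.size_in_bytes()` and that the declared shape was kept unchanged.

```rust
/// Number of elements a raw buffer holds when read with a given element size
/// (the derivation quoted in the changelog entry).
pub fn elems_in_buffer(byte_len: usize, elem_size: usize) -> usize {
    byte_len / elem_size
}

/// Number of elements a shape declares: the product of its dimensions.
pub fn elems_in_shape(shape: &[usize]) -> usize {
    shape.iter().product()
}

/// True when the storage holds exactly as many elements as the shape claims.
/// A loader that checks this up front can return an error instead of panicking later.
pub fn shape_matches_storage(byte_len: usize, elem_size: usize, shape: &[usize]) -> bool {
    byte_len % elem_size == 0 && elems_in_buffer(byte_len, elem_size) == elems_in_shape(shape)
}
```

For a six-element int32 tensor the buffer is 24 bytes. Read as 8-byte I64 it holds 3 elements while the shape still declares 6, which is the same 6-versus-3 mismatch reported in the panic message; read as 4-byte I32 the two counts agree.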
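
The NaN-latching hazard described in the `wifi-densepose-wasm-edge` entry can be illustrated in a few lines. This is a sketch under assumptions: `sanitize_host_f32` is named in the entry, but its body here is inferred from the description (non-finite maps to `0.0`), and `naive_clamp` and `ema_step` are hypothetical stand-ins for the crate's clamp helpers and accumulators.

```rust
/// Map any non-finite value (NaN, +inf, -inf) to 0.0; finite values pass through.
/// Uses only `f32::is_finite`, which is available in `core`.
pub fn sanitize_host_f32(v: f32) -> f32 {
    if v.is_finite() { v } else { 0.0 }
}

/// Comparison-based clamp. Both `NaN < lo` and `NaN > hi` are false,
/// so a NaN input falls through to the final branch and is returned as-is.
pub fn naive_clamp(v: f32, lo: f32, hi: f32) -> f32 {
    if v < lo {
        lo
    } else if v > hi {
        hi
    } else {
        v
    }
}

/// One exponential-moving-average update. Any NaN in `prev` or `sample`
/// makes the result NaN, and that NaN then feeds every later update.
pub fn ema_step(prev: f32, sample: f32, alpha: f32) -> f32 {
    prev + alpha * (sample - prev)
}
```

Feeding a single `f32::NAN` sample through `ema_step` leaves the accumulator at NaN for every later finite sample, whereas passing the sample through `sanitize_host_f32` first keeps the state finite. Mapping to `0.0` does bias the accumulator for that one frame, a trade-off the entry accepts for a semi-trusted boundary.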
@@ -6391,32 +6391,71 @@ fn vitals_snapshots_from_sensing_json(
    }
}

/// Build the multistatic guard config, optionally derived from the TDM schedule
/// declared in the environment (#1031).
/// Build the multistatic guard config from the environment (#1031, #1049).
///
/// When both `WDP_TDM_SLOTS` and `WDP_TDM_SLOT_US` parse as positive integers,
/// the guard is derived via [`MultistaticConfig::for_tdm_schedule`] so a
/// deployment can match its exact schedule. Otherwise the published default
/// (60 ms hard / 20 ms soft) is returned. `min_nodes` is *not* set here — the
/// caller overrides it for single-node passthrough.
/// Three precedence layers, most-specific wins:
/// 1. `WDP_GUARD_INTERVAL_US` (+ optional `WDP_SOFT_GUARD_US`) — a **direct**
///    hard-guard override. This is the #1049 escape hatch: WiFi/ESP-NOW-synced
///    ESP32 nodes drift 10–150 ms (the 100 ms beacon + WiFi-MAC jitter cannot
///    hold two independently-clocked boards within the published default), so a
///    deployment can simply lift the guard past its measured spread (e.g.
///    `WDP_GUARD_INTERVAL_US=200000`) without knowing its exact TDM schedule.
/// 2. `WDP_TDM_SLOTS` + `WDP_TDM_SLOT_US` (both positive) — derive the guard
///    from the declared schedule via [`MultistaticConfig::for_tdm_schedule`].
/// 3. Otherwise the published default (60 ms hard / 20 ms soft).
///
/// The direct override (1) is applied **on top of** whichever base (2 or 3) is
/// selected, so `WDP_GUARD_INTERVAL_US` always wins for the hard guard while a
/// TDM-derived soft band is preserved unless it would exceed the new hard guard.
/// `min_nodes` is *not* set here — the caller overrides it for single-node
/// passthrough.
fn multistatic_guard_config_from_env() -> MultistaticConfig {
    multistatic_guard_config_from(
        std::env::var("WDP_TDM_SLOTS").ok().as_deref(),
        std::env::var("WDP_TDM_SLOT_US").ok().as_deref(),
        std::env::var("WDP_GUARD_INTERVAL_US").ok().as_deref(),
        std::env::var("WDP_SOFT_GUARD_US").ok().as_deref(),
    )
}

/// Pure core of [`multistatic_guard_config_from_env`] for testability.
fn multistatic_guard_config_from(slots: Option<&str>, slot_us: Option<&str>) -> MultistaticConfig {
    match (
fn multistatic_guard_config_from(
    slots: Option<&str>,
    slot_us: Option<&str>,
    guard_us: Option<&str>,
    soft_us: Option<&str>,
) -> MultistaticConfig {
    // Base: TDM-schedule-derived when both slot params are valid, else default.
    let mut cfg = match (
        slots.and_then(|s| s.trim().parse::<usize>().ok()),
        slot_us.and_then(|s| s.trim().parse::<u64>().ok()),
    ) {
        (Some(n), Some(us)) if n >= 1 && us >= 1 => {
            MultistaticConfig::for_tdm_schedule(n, us)
        }
        (Some(n), Some(us)) if n >= 1 && us >= 1 => MultistaticConfig::for_tdm_schedule(n, us),
        _ => MultistaticConfig::default(),
    };

    // Direct hard-guard override (#1049). Ignored when unset/zero/unparseable so
    // a malformed env var falls back to the base rather than breaking fusion.
    if let Some(g) = guard_us
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&g| g >= 1)
    {
        cfg.guard_interval_us = g;
        // Keep the soft band strictly below the (possibly lowered) hard guard.
        if cfg.soft_guard_us >= g {
            cfg.soft_guard_us = g.saturating_sub(1).max(1);
        }
    }

    // Optional explicit soft-guard override, always clamped strictly below hard.
    if let Some(s) = soft_us
        .and_then(|s| s.trim().parse::<u64>().ok())
        .filter(|&s| s >= 1)
    {
        cfg.soft_guard_us = s.min(cfg.guard_interval_us.saturating_sub(1).max(1));
    }

    cfg
}

/// Turn a `ProgressiveLoader::new` failure into an actionable diagnostic (#894).
@@ -7485,11 +7524,16 @@ async fn main() {
        pose_tracker: PoseTracker::new(),
        last_tracker_instant: None,
        multistatic_fuser: {
            // #1031: the default guard (60 ms hard / 20 ms soft) accommodates a
            // real TDM slot offset. A deployment can override it to match its
            // own schedule via WDP_TDM_SLOTS + WDP_TDM_SLOT_US (both set ⇒ derive
            // from the schedule), else the published default is used.
            // #1031/#1049: the default guard (60 ms hard / 20 ms soft)
            // accommodates a real TDM slot offset. A deployment overrides it via
            // WDP_GUARD_INTERVAL_US (direct, e.g. 200000 for WiFi/ESP-NOW sync —
            // #1049) or WDP_TDM_SLOTS + WDP_TDM_SLOT_US (derive from schedule).
            let cfg = multistatic_guard_config_from_env();
            info!(
                "Multistatic fusion guard: {} µs hard / {} µs soft (override via \
                 WDP_GUARD_INTERVAL_US / WDP_SOFT_GUARD_US, or WDP_TDM_SLOTS+WDP_TDM_SLOT_US)",
                cfg.guard_interval_us, cfg.soft_guard_us
            );
            let mut fuser = MultistaticFuser::with_config(MultistaticConfig {
                min_nodes: 1, // single-node passthrough
                ..cfg
@@ -7797,6 +7841,72 @@ async fn main() {
    info!("Server shut down cleanly");
}

#[cfg(test)]
mod multistatic_guard_config_tests {
    //! #1049 — the multistatic guard interval must be operator-configurable so a
    //! WiFi/ESP-NOW deployment (10–150 ms inter-node clock drift) can lift the
    //! guard past its measured timestamp spread instead of being permanently
    //! demoted to Restricted with no escape hatch.
    use super::*;

    #[test]
    fn default_guard_when_nothing_set() {
        let cfg = multistatic_guard_config_from(None, None, None, None);
        assert_eq!(cfg.guard_interval_us, MultistaticConfig::default().guard_interval_us);
        assert_eq!(cfg.soft_guard_us, MultistaticConfig::default().soft_guard_us);
    }

    #[test]
    fn direct_guard_override_wins_and_unblocks_wifi_spread() {
        // The #1049 reporter's measured ~70 ms spread exceeds the 60 ms default
        // → permanent demotion. A direct 200 ms override accepts it.
        let cfg = multistatic_guard_config_from(None, None, Some("200000"), None);
        assert_eq!(cfg.guard_interval_us, 200_000);
        assert!(cfg.soft_guard_us < cfg.guard_interval_us);
        // 70 ms spread now sits inside the guard.
        assert!(70_000 < cfg.guard_interval_us);
    }

    #[test]
    fn direct_guard_override_beats_tdm_derived() {
        // Both TDM params AND a direct override set → the direct hard guard wins,
        // the TDM-derived soft band is preserved (still strictly below hard).
        let cfg = multistatic_guard_config_from(Some("2"), Some("18000"), Some("200000"), None);
        assert_eq!(cfg.guard_interval_us, 200_000);
        assert!(cfg.soft_guard_us < cfg.guard_interval_us);
        assert!(cfg.soft_guard_us >= 1);
    }

    #[test]
    fn soft_override_is_clamped_strictly_below_hard() {
        // A soft guard ≥ hard would be nonsensical → clamped below the hard guard.
        let cfg = multistatic_guard_config_from(None, None, Some("50000"), Some("999999"));
        assert_eq!(cfg.guard_interval_us, 50_000);
        assert!(cfg.soft_guard_us < 50_000);
    }

    #[test]
    fn lowering_hard_below_default_soft_pulls_soft_down() {
        // Override hard to 10 ms (< default 20 ms soft) → soft drops below it.
        let cfg = multistatic_guard_config_from(None, None, Some("10000"), None);
        assert_eq!(cfg.guard_interval_us, 10_000);
        assert!(cfg.soft_guard_us < 10_000);
    }

    #[test]
    fn malformed_or_zero_override_falls_back_to_base() {
        // Garbage / zero must not break fusion — fall back to the base config.
        for bad in ["", "abc", "0", "-5", "12.5"] {
            let cfg = multistatic_guard_config_from(None, None, Some(bad), None);
            assert_eq!(
                cfg.guard_interval_us,
                MultistaticConfig::default().guard_interval_us,
                "override {bad:?} should be ignored"
            );
        }
    }
}

#[cfg(test)]
mod node_sync_snapshot_serialization_tests {
    //! ADR-110 iter 24 — JSON public-API contract for the iter 23