mirror of
https://github.com/ruvnet/RuView
synced 2026-06-13 10:53:20 +00:00
Compare commits (4 commits)

| SHA1 |
|---|
| 3fb40a9deb |
| 1a17cc5b06 |
| 7c13ec6a00 |
| d3606d51a7 |
@@ -55,6 +55,8 @@ trained checkpoint) so you can reproduce them yourself.
| zero-copy ORT input ~1.48× (ADR-155) | **MEASURED** | `cd v2 && cargo bench -p wifi-densepose-nn --features onnx --bench onnx_bench` |
| pointcloud splats 9→2 passes ~1.24× (ADR-160 research) | **MEASURED** | `cd v2 && cargo bench -p wifi-densepose-pointcloud --bench splats_bench` |
| native wlanapi multi-BSSID scan 9.74 Hz (vs netsh ~2 Hz) | **MEASURED (Windows)** | `cd v2 && cargo test -p wifi-densepose-wifiscan -- --ignored measure_native_scan_rate` |
| wasm-edge `process_frame` hot-path latency (host proxy, ADR-163) | **MEASURED-on-host** (NOT the ESP32/WASM3 budget; needs hardware) | `cd v2/crates/wifi-densepose-wasm-edge && cargo bench --features std` |
| cog steady-state CPU infer latency ~305 µs (ADR-163; NOT the manifest cold-start) | **MEASURED-on-host** | `cd v2 && cargo bench -p cog-person-count -p cog-pose-estimation --no-default-features --bench infer_bench` |

## What we do NOT claim (the honest negatives — the strongest anti-slop signal)
|
||||
|
||||
@@ -68,8 +70,9 @@ trained checkpoint) so you can reproduce them yourself.
|
||||
|
||||
## Provenance
|
||||
|
||||
Every claim above traces to a committed ADR (`docs/adr/ADR-154`…`ADR-160`), a
|
||||
test, a criterion bench, or `benchmarks/wiflow-std/RESULTS.md`. The history
|
||||
Every claim above traces to a committed ADR (`docs/adr/ADR-154`…`ADR-163`), a
|
||||
test, a criterion bench, `benchmarks/wiflow-std/RESULTS.md`, or
|
||||
`benchmarks/edge-latency/RESULTS.md`. The history
|
||||
includes published **retractions** (the 92.9% PCK retraction; the WiFlow-STD
|
||||
shipped-checkpoint refutation; the NV-diamond BOM reality check) — a faker hides
|
||||
failures; we commit them.
|
||||
|
||||
@@ -0,0 +1,137 @@
|
||||
# Edge-Latency Benchmark Results — ADR-163
|
||||
|
||||
Converting **CLAIMED** edge latency budgets into **MEASURED-on-host** numbers,
|
||||
closing the measurement debt flagged by Milestones 5/6 (ADR-159 / ADR-160).
|
||||
Benches + docs only — **no production-code behavior changed**.
|
||||
|
||||
## The honest caveat, up front (read before citing any number)
|
||||
|
||||
Two distinct gaps separate every number below from the figure it is converting:
|
||||
|
||||
1. **Host ≠ ESP32.** The wasm-edge skill modules document budgets *"on ESP32-S3
   WASM3"* (e.g. `exo_time_crystal`: "H (<10 ms)"). These benches run **native
   x86_64 on a development laptop**, not the Xtensa/WASM3 target. A native host
   median is a **proxy for the algorithm's cost and a lower bound on the ESP32
   latency**, not the ESP32 number. WASM3 interpretation on a ~240 MHz Xtensa
   core is typically 1–2 orders of magnitude slower than native `-O` host code,
   so a host median far under the budget **does NOT prove the ESP32 meets it.**
   *The ESP32 figure is NOT reproduced here; it needs hardware.*
|
||||
|
||||
2. **Bench ≠ the doc-claimed measurement.** For the cogs, the manifest cites a
|
||||
**cold-start** number (`cold_start_ms_avg`, weight-load included); these
|
||||
benches measure **steady-state** per-frame `infer` (warm, weights resident).
|
||||
Different measurements; we report both, labelled.
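
The cold-start versus steady-state distinction in item 2 comes down to where the timer starts. The following is a minimal sketch under stated assumptions: `ToyEngine`, `load`, `infer`, `cold_start` and `steady_state` are hypothetical names invented for illustration and are not the cog crates' API. It uses only `std`.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for an inference engine. NOT the cog API.
struct ToyEngine {
    weights: Vec<f32>,
}

impl ToyEngine {
    /// One-time setup: the kind of work a cold-start figure folds in.
    fn load(n: usize) -> Self {
        let weights = (0..n).map(|i| (i % 7) as f32 * 0.125).collect();
        ToyEngine { weights }
    }

    /// Recurring per-frame work: a dot product against resident weights.
    fn infer(&self, frame: &[f32]) -> f32 {
        self.weights.iter().zip(frame).map(|(w, x)| w * x).sum()
    }
}

/// Cold-start: the timer covers the load AND the first forward.
fn cold_start(n: usize, frame: &[f32]) -> (ToyEngine, Duration) {
    let t = Instant::now();
    let engine = ToyEngine::load(n);
    black_box(engine.infer(frame));
    (engine, t.elapsed())
}

/// Steady-state: the engine is already loaded; one untimed warm-up forward,
/// then the mean over `iters` timed forwards. `iters` must be non-zero.
fn steady_state(engine: &ToyEngine, frame: &[f32], iters: u32) -> Duration {
    black_box(engine.infer(frame)); // warm-up, not timed
    let t = Instant::now();
    for _ in 0..iters {
        black_box(engine.infer(black_box(frame)));
    }
    t.elapsed() / iters
}
```

The two functions time different things by construction, which is why their results should be reported under separate labels rather than compared directly.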
|
||||
|
||||
Grades (per `benchmarks/wiflow-std/RESULTS.md` / ADR-152 vocabulary):
|
||||
- **MEASURED-on-host** — reproduced in this repo on the machine below, exact
|
||||
command recorded. NOT the ESP32 / NOT the cold-start figure.
|
||||
- **CLAIMED (ESP32)** — the doc budget; UNMEASURED on hardware here.
|
||||
|
||||
## Machine

| Item | Value |
|---|---|
| Host | `ruvzen` (Windows 11, this dev box) |
| CPU | Intel Core Ultra 9 285H |
| Toolchain | `cargo 1.91.1`, `--release` (opt-level per crate profile) |
| Bench harness | criterion 0.5 (`time: [low **median** high]` reported below) |
| Date | 2026-06-12 |

Run-to-run spread on this box is non-trivial (criterion's low/high bracket the
|
||||
median by a few %); the medians below are single-session captures with the smoke
|
||||
settings `--warm-up-time 1 --measurement-time 2` (wasm-edge) / `3` (cogs). Re-run
|
||||
for your own machine — the absolute numbers are host-specific.
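
To compare your own re-runs against the committed medians, it can help to express the bracket as a fraction of the central value. This is a small sketch, assuming you have collected per-run timings yourself; `median` and `relative_spread` are hypothetical helpers and not part of criterion.

```rust
/// Median of a set of timing samples; `None` for an empty set.
fn median(samples: &[f64]) -> Option<f64> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 1 {
        sorted[mid]
    } else {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    })
}

/// Width of a low/high bracket relative to its central value,
/// e.g. 0.03 means the bracket spans about 3% of the center.
fn relative_spread(low: f64, center: f64, high: f64) -> f64 {
    (high - low) / center
}
```

A spread that grows well beyond a few percent between sessions suggests the box was not idle, and the absolute numbers from that session should be treated with extra caution.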
|
||||
|
||||
---
|
||||
|
||||
## T1 — wasm-edge `process_frame` hot paths (ADR-160 deferred item → DONE host)
|
||||
|
||||
The crate is **excluded from the v2 workspace**; bench from the crate dir.

```bash
cd v2/crates/wifi-densepose-wasm-edge
cargo bench --features std -- --warm-up-time 1 --measurement-time 2
# med_seizure_detect is medical-experimental-gated:
cargo bench --features std,medical-experimental -- --warm-up-time 1 --measurement-time 2 med_seizure
```

| Hot path (M6-audit-named) | Bench id | Host median | Grade | Doc budget (CLAIMED, ESP32) |
|---|---|---|---|---|
| `exo_time_crystal` 256-pt × 128-lag autocorrelation (full buffer) | `exo_time_crystal::process_frame[autocorr_256x128]` | **17.3 µs** | MEASURED-on-host | "H (<10 ms) on ESP32-S3 WASM3"; **NOT reproduced here (needs hardware)** |
| `exo_ghost_hunter` empty-room periodicity + hidden-breathing | `exo_ghost_hunter::process_frame[empty_room_periodicity]` | **1.44 µs** | MEASURED-on-host | research/exotic; no firm ESP32 figure; host proxy only |
| `sec_weapon_detect` per-subcarrier Welford (MAX_SC=32) | `sec_weapon_detect::process_frame[per_sc_welford]` | **0.42 µs** (420 ns) | MEASURED-on-host | research-grade; calibration-gated; host proxy only |
| `med_seizure_detect` clonic-phase rhythm path (steady-state frame) | `med_seizure_detect::process_frame[clonic_rhythm]` | **0.10 µs** (105 ns) | MEASURED-on-host (feature-gated) | doc budget "S (<5 ms) on ESP32"; **NOT reproduced here** |

Reading these honestly:
|
||||
|
||||
- `exo_time_crystal` at **17.3 µs host** is the only path whose host cost
  exceeds a thousandth of its ESP32 budget (17.3 µs is about 0.17% of 10 ms);
  it also does the most work (256 × 128 ≈ 32K MACs/frame). 17.3 µs native says
  the algorithm is cheap; it says **nothing** about whether WASM3-on-Xtensa
  lands under 10 ms. A naïve host→ESP32 extrapolation (assume a 100×
  interpreter+clock penalty) would put it near ~1.7 ms, comfortably under,
  **but that is an extrapolation, not a measurement**; it is recorded here only
  to show the host number is not obviously in tension with the budget. ESP32
  figure: **UNMEASURED**.
|
||||
- `med_seizure_detect`'s 105 ns is the **steady-state** per-frame cost; the
|
||||
expensive clonic autocorrelation only fires when the state machine is in the
|
||||
clonic phase, so this is a lower-bound on the heavy path, not the worst case.
|
||||
It is still a real, committed host datapoint.
|
||||
- The pre-existing `tests/budget_compliance.rs` already asserts the L/S/H
|
||||
wall-clock tiers (25 passing tests); these criterion benches add the
|
||||
regression-grade, reproducible median that ADR-160 deferred.
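
The extrapolation arithmetic in the first bullet can be written down explicitly. This is a sketch of the arithmetic only: the penalty factor is an assumption, not a measured property of WASM3 or the ESP32-S3, and the helper names are hypothetical.

```rust
/// Scale a host median (µs) by an ASSUMED slowdown factor.
fn extrapolate_us(host_us: f64, assumed_penalty: f64) -> f64 {
    host_us * assumed_penalty
}

/// Fraction of a budget that a latency consumes (both in µs).
fn budget_fraction(latency_us: f64, budget_us: f64) -> f64 {
    latency_us / budget_us
}

/// Slowdown factor at which the extrapolated latency would equal the budget.
fn break_even_penalty(host_us: f64, budget_us: f64) -> f64 {
    budget_us / host_us
}
```

With the table's 17.3 µs host median and the 10 ms (10 000 µs) budget, a 100× penalty gives about 1 730 µs, and the budget would be exhausted at a penalty of roughly 578×. None of this replaces a measurement on hardware; it only bounds how wrong the assumed penalty would have to be.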
|
||||
|
||||
---
|
||||
|
||||
## T2 — cog steady-state inference latency (ADR-159/160 deferred item → DONE)
|
||||
|
||||
Cog crates are normal workspace members; bench from `v2/`. Real weights
|
||||
(`count_v1.safetensors` / `pose_v1.safetensors`) ship in-repo under each cog's
|
||||
`cog/artifacts/`, so the bench measures the **real Candle CPU forward**, not the
|
||||
stub (the bench `assert!`s `backend().starts_with("candle-")`).

```bash
cd v2
cargo bench -p cog-person-count --no-default-features --bench infer_bench -- --warm-up-time 1 --measurement-time 3
cargo bench -p cog-pose-estimation --no-default-features --bench infer_bench -- --warm-up-time 1 --measurement-time 3
```

| Cog | Bench id | Host median (steady-state infer, CPU) | Grade | Manifest cold-start (CLAIMED, different measurement + machine) |
|---|---|---|---|---|
| cog-person-count | `cog_person_count::infer[cpu_real_weights_steady_state]` | **305 µs** (idle box) | MEASURED-on-host | n/a (person-count manifest carries comparable provenance) |
| cog-pose-estimation | `cog_pose_estimation::infer[cpu_real_weights_steady_state]` | **305 µs** (idle box) | MEASURED-on-host | `cold_start_ms_avg: 5.4` (30 invocations, **ruvultra/RTX 5080 host**, candle 0.9 cpu); **cold-start, NOT steady-state; NOT this machine** |

> Spread caveat (observed, honest): both medians above were captured with the box
> otherwise idle. A re-run of the validate-form command *while a second cargo job
> was loading the same cores* gave 385 µs (person-count) / 973 µs (pose), and
> the criterion low/high bracket widened to ~0.34–1.18 ms under contention. The
> 305 µs figures are the idle-box datapoints; the absolute number is host- and
> load-dependent (the ~3× pose swing, 305 µs → 973 µs, is core contention, not a
> code change).
|
||||
|
||||
Reading these honestly:
|
||||
|
||||
- **Steady-state ≠ cold-start.** The pose manifest's `5.4 ms` folds in one-time
|
||||
weight load / mmap / first-forward allocation. This bench warms the engine
|
||||
first and times only the recurring per-frame forward, on a *different
|
||||
machine*. The two numbers are not comparable and we do not claim this bench
|
||||
reproduces the 5.4 ms manifest figure.
|
||||
- Both cogs share the same conv encoder; person-count adds a count head +
|
||||
confidence head, pose adds a 256-wide MLP head. The host steady-state cost is
|
||||
dominated by the three dilated Conv1d layers (56→64→128→128) shared by both —
|
||||
which is why both land at ~305 µs.
|
||||
- **Empirical confirmation of the steady-state/cold-start gap:** pose
|
||||
steady-state (305 µs host) is ~18× *under* the manifest's 5.4 ms cold-start.
|
||||
Even accounting for the different machine, this is the expected shape — the
|
||||
bulk of cold-start is one-time setup, not the forward pass — and it is exactly
|
||||
why conflating the two would be dishonest.
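
The "~18×" figure is plain unit arithmetic on the two reported values. A minimal sketch using `std::time::Duration` to keep the units honest; `cold_to_steady_ratio` is a hypothetical helper.

```rust
use std::time::Duration;

/// Ratio of a cold-start latency to a steady-state latency.
/// Both are taken as `Duration` so ms and µs cannot be mixed up.
fn cold_to_steady_ratio(cold_start: Duration, steady_state: Duration) -> f64 {
    cold_start.as_secs_f64() / steady_state.as_secs_f64()
}
```

Calling it with `Duration::from_micros(5_400)` (the manifest's 5.4 ms) and `Duration::from_micros(305)` gives about 17.7, which rounds to the ~18× quoted above. The ratio compares two different measurements on two different machines, so it describes the shape of the gap, not a speedup.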
|
||||
|
||||
---
|
||||
|
||||
## Status vs the deferred items

| Deferred item | Was | Now |
|---|---|---|
| ADR-160 "Criterion benches for `process_frame` budget claims" | ACCEPTED-FUTURE | **DONE (host)**; ESP32-on-hardware still **PENDING** (needs the wasm32 target + a flashed ESP32-S3) |
| ADR-159/160 cog inference latency (`cold_start_ms_avg` uncommitted-benched) | CLAIMED | **MEASURED-on-host (steady-state)**; cold-start-on-ruvultra remains the manifest's separate claim |

Nothing here changes runtime behavior — these are benches + this results file
|
||||
only. No crate needs republishing.
|
||||
@@ -182,9 +182,15 @@ label or behavior change, consistent with leaving their claim surface intact.)
|
||||
sign-language claim requires labelled clinical/affective/ASL data and reference
|
||||
standards that do not exist in this repo. The disclaimers + feature gate are the
|
||||
honest stand-in. Nothing is claimed that is not measured.
|
||||
- **Criterion benches for `process_frame` budget claims** — **ACCEPTED-FUTURE**.
|
||||
`tests/budget_compliance.rs` asserts L/S/H tier wall-clock budgets (25 tests,
|
||||
passing), but a regression-grade criterion bench is not yet wired.
|
||||
- **Criterion benches for `process_frame` budget claims** — **DONE (host)**
|
||||
(ADR-163, 2026-06-12). `benches/process_frame_bench.rs` benches the heaviest
|
||||
hot paths (`exo_time_crystal` 256×128 autocorrelation, `exo_ghost_hunter`
|
||||
periodicity, `sec_weapon_detect` per-subcarrier Welford, `med_seizure_detect`
|
||||
clonic rhythm) and reports committed **host** medians
|
||||
(`benchmarks/edge-latency/RESULTS.md`). `tests/budget_compliance.rs` continues
|
||||
  to assert the L/S/H tier wall-clock budgets (25 tests, passing).
  **ESP32-on-hardware (Xtensa/WASM3) latency remains PENDING**: the host bench is
  an algorithm-cost proxy (a lower bound on the ESP32 latency), NOT the ESP32
  figure (needs hardware).
|
||||
- **`wasm32-unknown-unknown` `static_mut_refs` confirmation** — **ACCEPTED-FUTURE**
|
||||
(toolchain): the source pattern is eliminated; a CI job on the wasm target should
|
||||
assert zero `static_mut_refs` once the target is added to the build image.
|
||||
|
||||
@@ -0,0 +1,123 @@
|
||||
# ADR-163: Edge-Latency Measurement — CLAIMED budgets → MEASURED-on-host
|
||||
|
||||
- **Status**: accepted
|
||||
- **Date**: 2026-06-12
|
||||
- **Deciders**: ruv
|
||||
- **Tags**: edge-latency, wasm-edge, esp32, cog-inference, criterion, prove-everything, measurement-debt
|
||||
- **Amends**: ADR-160 (deferred "criterion benches for process_frame budget claims" line now DONE-on-host); ADR-159 (cog inference latency)
|
||||
|
||||
## Context — Milestone 9 of the beyond-SOTA sweep
|
||||
|
||||
Prior milestones (M5/M6, ADR-159/ADR-160) flagged **measurement debt**: edge
|
||||
latency budgets asserted in doc-comments and manifests but **never reproduced by
|
||||
a committed benchmark**. Specifically:
|
||||
|
||||
- Many `wifi-densepose-wasm-edge` skill modules document a timing budget *"on
|
||||
ESP32-S3 WASM3"* (e.g. `exo_time_crystal`: "H (heavy, <10 ms)"). These were
|
||||
**CLAIMED**, not benchmarked. ADR-160's deferred backlog named exactly this:
|
||||
*"Criterion benches for `process_frame` budget claims — ACCEPTED-FUTURE."*
|
||||
- `cog-pose-estimation`'s manifest cites `cold_start_ms_avg: 5.4`, but neither
|
||||
cog had a `benches/` directory or any committed inference-latency number.
|
||||
|
||||
Under the project's **prove-everything / anti-"AI-slop"** directive, a CLAIMED
|
||||
latency budget that a skeptic cannot reproduce is debt. M9 pays it down — benches
|
||||
and docs only, **no production-code behavior change** (so nothing republishes).
|
||||
|
||||
## Headline
|
||||
|
||||
**Converted the CLAIMED edge-latency budgets into MEASURED-on-host numbers, with
|
||||
the honest host-vs-ESP32 caveat stated everywhere.** Added committed criterion
|
||||
benches over the heaviest hot paths and a results file a skeptic can re-run. The
|
||||
ESP32-on-hardware figure remains explicitly **UNMEASURED** — this milestone does
|
||||
not pretend a laptop reproduces an Xtensa/WASM3 budget.
|
||||
|
||||
## Decision — benches landed
|
||||
|
||||
### T1 — wasm-edge `process_frame` budget benches
|
||||
|
||||
`v2/crates/wifi-densepose-wasm-edge/benches/process_frame_bench.rs` (criterion,
|
||||
`harness = false`, `required-features = ["std"]`). The crate is **excluded from
|
||||
the v2 workspace**, so it runs from the crate dir. Benches the M6-audit-named
|
||||
heaviest hot paths over a **fixed synthetic CSI frame**, each driven through the
|
||||
public `process_frame` after warming the relevant ring/phase buffers so the
|
||||
expensive path actually executes:
|
||||
|
||||
- `exo_time_crystal::process_frame` — full 256-pt × 128-lag autocorrelation.
|
||||
- `exo_ghost_hunter::process_frame` — empty-room periodicity / hidden-breathing.
|
||||
- `sec_weapon_detect::process_frame` — per-subcarrier (MAX_SC=32) Welford.
|
||||
- `med_seizure_detect::process_frame` — clonic-rhythm path (`#[cfg(feature =
|
||||
"medical-experimental")]`, only built/run with that gate).
|
||||
|
||||
The lib's `bench = false` was set so the libtest harness does not intercept
|
||||
criterion CLI flags; the `ghost_hunter` bin is already `standalone-bin`-gated and
|
||||
not built under `--features std`.
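
Why the buffers are warmed before timing can be shown with a toy detector. This is a sketch under stated assumptions: `ToyDetector`, its circular autocorrelation, and `warm` are hypothetical and are not the wasm-edge crate's implementation. Only the 256-sample / 128-lag sizes are taken from the text above.

```rust
const N: usize = 256; // samples held in the ring (the 256-pt figure above)
const LAGS: usize = 128; // lags evaluated once the ring is full

/// Hypothetical detector. NOT the wasm-edge implementation.
struct ToyDetector {
    ring: [f32; N],
    len: usize,
    head: usize,
}

impl ToyDetector {
    fn new() -> Self {
        ToyDetector { ring: [0.0; N], len: 0, head: 0 }
    }

    /// Push one sample. Returns `None` while the ring is still filling
    /// (cheap path) and `Some(mac_count)` once the full-buffer
    /// autocorrelation actually runs (expensive path).
    fn process_frame(&mut self, sample: f32) -> Option<usize> {
        self.ring[self.head] = sample;
        self.head = (self.head + 1) % N;
        if self.len < N {
            self.len += 1;
        }
        if self.len < N {
            return None;
        }
        let mut macs = 0usize;
        let mut acc = 0.0f32;
        for lag in 0..LAGS {
            for i in 0..N {
                acc += self.ring[i] * self.ring[(i + lag) % N];
                macs += 1;
            }
        }
        std::hint::black_box(acc);
        Some(macs)
    }
}

/// Fill the ring to one sample short of full, so the NEXT call
/// (the one a bench would time) exercises the expensive path.
fn warm(det: &mut ToyDetector) {
    for i in 0..N - 1 {
        let _ = det.process_frame((i as f32 * 0.1).sin());
    }
}
```

Timing `process_frame` on a fresh detector would measure only the early-out; after `warm`, every timed call performs the full 256 × 128 = 32 768 multiply-accumulates.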
|
||||
|
||||
**Measured host medians** (Intel Core Ultra 9 285H, native `--release`):
|
||||
`exo_time_crystal` **17.3 µs** · `exo_ghost_hunter` **1.44 µs** ·
|
||||
`sec_weapon_detect` **0.42 µs** · `med_seizure_detect` **0.10 µs**.
|
||||
|
||||
### T2 — cog inference latency benches
|
||||
|
||||
`v2/crates/cog-person-count/benches/infer_bench.rs` and
|
||||
`v2/crates/cog-pose-estimation/benches/infer_bench.rs` (criterion,
|
||||
`harness = false`). Each loads the **real** shipped weights from the in-repo
|
||||
`cog/artifacts/`, asserts the Candle CPU backend (so the stub can never be
|
||||
silently benched), warms one forward, then times steady-state
|
||||
`InferenceEngine::infer` over a fixed CSI window on `Device::Cpu`.
|
||||
|
||||
**Measured host medians:** cog-person-count **305 µs** · cog-pose-estimation
|
||||
**305 µs** (steady-state, CPU, real weights).
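
The "fixed CSI window" is a seed-stable LCG fill, so every run sees identical input. The sketch below reuses the multiplier, increment and seed from the bench sources; the `lcg_fill` name and the `mask_15_bits` switch are hypothetical additions for illustration. Note that on a `u32` state, `(s >> 16) as f32 / 32768.0` keeps 16 bits and therefore spans [0, 2); masking to 15 bits would give [0, 1).

```rust
/// Deterministic fill using the bench's LCG constants.
/// `mask_15_bits` is an illustrative switch, not present in the bench.
fn lcg_fill(seed: u32, n: usize, mask_15_bits: bool) -> Vec<f32> {
    let mut s = seed;
    (0..n)
        .map(|_| {
            s = s.wrapping_mul(1_103_515_245).wrapping_add(12_345);
            let hi = s >> 16; // 16 bits survive the shift: 0..=65535
            let hi = if mask_15_bits { hi & 0x7FFF } else { hi };
            hi as f32 / 32_768.0
        })
        .collect()
}
```

Two calls with the same seed and length return identical vectors, which is the property the bench relies on; the value range only matters if a consumer assumes normalised amplitudes.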
|
||||
|
||||
### T3 — results file
|
||||
|
||||
`benchmarks/edge-latency/RESULTS.md`, in the `benchmarks/wiflow-std/RESULTS.md`
|
||||
style: each number with its exact reproduce command, the machine, the
|
||||
MEASURED-on-host grade, and the honest caveat.
|
||||
|
||||
## The honest caveat (recorded, non-negotiable)
|
||||
|
||||
1. **Host ≠ ESP32.** The wasm-edge benches run native x86_64, not Xtensa/WASM3.
   A host median is an **algorithm-cost proxy and a lower bound on the ESP32
   latency**, not the ESP32 number; WASM3 interpretation on a ~240 MHz core is
   typically 1–2 orders of magnitude slower than native `-O`. A host median
   under budget does **not** prove the ESP32 meets it.
   **The ESP32 figure is NOT reproduced here; it needs hardware.**
|
||||
2. **Bench ≠ the doc-claimed measurement.** The cogs' manifest cites a
|
||||
**cold-start** number (weight-load included); these benches measure
|
||||
**steady-state** per-frame `infer`. We report both, labelled, and do not
|
||||
conflate them. Empirically, pose steady-state (305 µs host) is ~18× under the
|
||||
5.4 ms cold-start — the expected shape, and exactly why conflating would lie.
|
||||
|
||||
## Deferred / still-pending (nothing dropped)
|
||||
|
||||
- **ESP32-on-hardware `process_frame` latency** — **PENDING (hardware)**. Needs
|
||||
the `wasm32-unknown-unknown` target built + flashed to an ESP32-S3 and timed
|
||||
under WASM3. The host bench is the algorithm-cost proxy until then.
|
||||
- **Per-skill *accuracy*** remains **DATA-GATED** (unchanged from ADR-160) —
|
||||
this ADR measures latency only, never claims detection accuracy.
|
||||
|
||||
## Reproduction (MEASURED)

```bash
# T1: wasm-edge (workspace-excluded → run from the crate dir)
cd v2/crates/wifi-densepose-wasm-edge
cargo bench --features std -- --warm-up-time 1 --measurement-time 2
cargo bench --features std,medical-experimental -- --warm-up-time 1 --measurement-time 2 med_seizure

# T2: cogs (workspace members)
cd v2
cargo bench -p cog-person-count --no-default-features --bench infer_bench
cargo bench -p cog-pose-estimation --no-default-features --bench infer_bench

# existing tests still green (behavior unchanged)
cargo test -p cog-person-count -p cog-pose-estimation --no-default-features
```

## Consequences
|
||||
|
||||
- ADR-160's deferred *"Criterion benches for `process_frame` budget claims"* line
|
||||
is now **DONE (host)**; the ESP32-on-hardware confirmation is explicitly the
|
||||
one remaining pending item.
|
||||
- The cogs now ship committed, reproducible steady-state inference-latency
|
||||
numbers, cleanly distinguished from the manifest's cold-start claim.
|
||||
- No runtime behavior changed; no crate republishes. `PROOF.md`'s performance
|
||||
table and `scripts/prove.sh`'s gated section reference the new benches.
|
||||
@@ -131,6 +131,7 @@ else
|
||||
SKIP "named person-identity — DATA-GATED: needs a real enrollment feeding the AETHER/body-resonance channel (see docs/research/soul/)"
|
||||
SKIP "OccWorld trained accuracy — needs a trained checkpoint (predict() carries weights_trained=false until then)"
|
||||
SKIP "native wlanapi 9.74 Hz scan — Windows-only; run: cargo test -p wifi-densepose-wifiscan -- --ignored measure_native_scan_rate"
|
||||
SKIP "edge-latency benches (ADR-163) — host medians, not asserted here: (cd v2/crates/wifi-densepose-wasm-edge && cargo bench --features std) and (cd v2 && cargo bench -p cog-person-count -p cog-pose-estimation --no-default-features --bench infer_bench). HOST proxy only — the ESP32/WASM3 budget is NOT reproduced on a laptop; see benchmarks/edge-latency/RESULTS.md"
|
||||
echo " (re-run with --full to attempt the feature-gated subset where prereqs exist)"
|
||||
fi
|
||||
hr
|
||||
|
||||
@@ -1015,6 +1015,7 @@ dependencies = [
|
||||
"candle-core 0.9.2",
|
||||
"candle-nn 0.9.2",
|
||||
"clap",
|
||||
"criterion",
|
||||
"safetensors 0.4.5",
|
||||
"serde",
|
||||
"serde_json",
|
||||
@@ -1034,6 +1035,7 @@ dependencies = [
|
||||
"candle-core 0.9.2",
|
||||
"candle-nn 0.9.2",
|
||||
"clap",
|
||||
"criterion",
|
||||
"hex",
|
||||
"safetensors 0.4.5",
|
||||
"serde",
|
||||
|
||||
@@ -34,6 +34,12 @@ safetensors = "0.4"
|
||||
[dev-dependencies]
|
||||
tempfile = "3"
|
||||
approx = "0.5"
|
||||
# ADR-163: steady-state infer latency bench (real count_v1 weights, Device::Cpu).
|
||||
criterion = { version = "0.5", features = ["html_reports"] }
|
||||
|
||||
[[bench]]
|
||||
name = "infer_bench"
|
||||
harness = false
|
||||
|
||||
[features]
|
||||
default = []
|
||||
|
||||
@@ -0,0 +1,95 @@
|
||||
//! Criterion bench for `cog-person-count` steady-state inference latency
|
||||
//! (ADR-163, closing the ADR-159/160 deferred "cog inference latency bench" item).
|
||||
//!
|
||||
//! ## What this measures, and what the manifest's `cold_start_ms_avg` does NOT
|
||||
//!
|
||||
//! This benches **steady-state** `InferenceEngine::infer` over a FIXED CSI
|
||||
//! window on `Device::Cpu` with the **real** shipped `count_v1.safetensors`
|
||||
//! weights — i.e. the per-frame cost once the model is loaded and warm.
|
||||
//!
|
||||
//! The cog manifest's `build_metadata.cold_start_ms_avg` (in the pose cog;
|
||||
//! person-count's manifest carries comparable provenance) is a **DIFFERENT
|
||||
//! measurement**: it includes one-time weight load / mmap / first-forward
|
||||
//! allocation. Cold-start is a startup cost paid once; steady-state infer is the
|
||||
//! recurring per-frame cost. They are not comparable and we do not conflate them.
|
||||
//! `cold_start_ms_avg` was measured on ruvultra (RTX 5080 host, candle 0.9 cpu); this
|
||||
//! bench runs on whatever machine you run it on — see `benchmarks/edge-latency/RESULTS.md`
|
||||
//! for the host the committed numbers were taken on.
|
||||
//!
|
||||
//! If the weights file is absent the engine falls back to the zero-confidence
|
||||
//! stub; we skip the bench in that case rather than benchmark the stub (which
|
||||
//! would be a meaningless number) — the bench prints a notice and measures a
|
||||
//! no-op so criterion still produces a (clearly-labelled) datapoint.
|
||||
//!
|
||||
//! Run (cog crates are normal workspace members):
|
||||
//! cd v2 && cargo bench -p cog-person-count --no-default-features
|
||||
//! cd v2 && cargo bench -p cog-person-count --no-default-features -- --warm-up-time 1 --measurement-time 2
|
||||
|
||||
use std::hint::black_box;
|
||||
use std::path::Path;
|
||||
|
||||
use criterion::{criterion_group, criterion_main, Criterion};
|
||||
|
||||
use cog_person_count::inference::{CsiWindow, InferenceEngine, INPUT_SUBCARRIERS, INPUT_TIMESTEPS};
|
||||
|
||||
/// Deterministic fixed CSI window (seed-stable LCG), normalised-ish amplitudes.
|
||||
fn fixed_window() -> CsiWindow {
|
||||
let mut s = 0x00C0_FFEEu32;
|
||||
let data: Vec<f32> = (0..INPUT_SUBCARRIERS * INPUT_TIMESTEPS)
|
||||
.map(|_| {
|
||||
s = s.wrapping_mul(1103515245).wrapping_add(12345);
|
||||
(s >> 16) as f32 / 32768.0 // [0, 2): 16 bits survive the shift (max 65535/32768)
|
||||
})
|
||||
.collect();
|
||||
CsiWindow { data }
|
||||
}
|
||||
|
||||
/// Locate the real weights from the crate dir or the repo root.
|
||||
fn real_weights() -> Option<std::path::PathBuf> {
|
||||
let candidates = [
|
||||
"cog/artifacts/count_v1.safetensors",
|
||||
"v2/crates/cog-person-count/cog/artifacts/count_v1.safetensors",
|
||||
"crates/cog-person-count/cog/artifacts/count_v1.safetensors",
|
||||
];
|
||||
candidates
|
||||
.iter()
|
||||
.map(Path::new)
|
||||
.find(|p| p.exists())
|
||||
.map(|p| p.to_path_buf())
|
||||
}
|
||||
|
||||
fn bench_infer(c: &mut Criterion) {
|
||||
let window = fixed_window();
|
||||
|
||||
match real_weights() {
|
||||
Some(path) => {
|
||||
let engine =
|
||||
InferenceEngine::with_weights(Some(&path)).expect("load real count_v1 weights");
|
||||
assert!(
|
||||
engine.backend().starts_with("candle-"),
|
||||
"expected real Candle backend, got {} — bench would measure the stub",
|
||||
engine.backend()
|
||||
);
|
||||
// Sanity: one real inference before timing.
|
||||
let _ = engine.infer(&window).expect("warmup infer");
|
||||
|
||||
c.bench_function("cog_person_count::infer[cpu_real_weights_steady_state]", |b| {
|
||||
b.iter(|| {
|
||||
black_box(engine.infer(black_box(&window)).expect("infer"));
|
||||
});
|
||||
});
|
||||
}
|
||||
None => {
|
||||
eprintln!(
|
||||
"NOTE: count_v1.safetensors not found — skipping the real-weights infer bench. \
|
||||
(The committed RESULTS.md numbers require the in-repo weights.)"
|
||||
);
|
||||
c.bench_function("cog_person_count::infer[SKIPPED_no_weights]", |b| {
|
||||
b.iter(|| black_box(1 + 1));
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_infer);
|
||||
criterion_main!(benches);
|
||||
@@ -39,6 +39,12 @@ wifi-densepose-train = { version = "0.3.1", path = "../wifi-densepose-train", de
|
||||
|
||||
[dev-dependencies]
|
||||
tempfile = "3"
|
||||
# ADR-163: steady-state infer latency bench (real pose_v1 weights, Device::Cpu).
|
||||
criterion = { version = "0.5", features = ["html_reports"] }
|
||||
|
||||
[[bench]]
|
||||
name = "infer_bench"
|
||||
harness = false
|
||||
|
||||
[features]
|
||||
default = []
|
||||
|
||||
@@ -0,0 +1,89 @@
|
||||
//! Criterion bench for `cog-pose-estimation` steady-state inference latency
|
||||
//! (ADR-163, closing the ADR-159/160 deferred "cog inference latency bench" item).
|
||||
//!
|
||||
//! ## What this measures — and what the manifest's `cold_start_ms_avg` does NOT
|
||||
//!
|
||||
//! The pose cog's manifest (`cog/artifacts/manifests/x86_64/manifest.json`)
|
||||
//! cites `build_metadata.cold_start_ms_avg: 5.4` (30 invocations, measured on
|
||||
//! ruvultra / RTX 5080 host, candle 0.9 cpu). **That is a cold-start number** —
|
||||
//! it folds in one-time weight load / mmap / first-forward allocation.
|
||||
//!
|
||||
//! This bench measures the **steady-state** per-frame cost instead:
|
||||
//! `InferenceEngine::infer` over a FIXED CSI window on `Device::Cpu` with the
|
||||
//! **real** shipped `pose_v1.safetensors`, after a warm-up forward. Steady-state
|
||||
//! and cold-start are different measurements; we label both honestly and do not
|
||||
//! claim this reproduces the 5.4 ms manifest figure (different machine, different
|
||||
//! measurement). See `benchmarks/edge-latency/RESULTS.md`.
|
||||
//!
|
||||
//! Run (cog crates are normal workspace members):
|
||||
//! cd v2 && cargo bench -p cog-pose-estimation --no-default-features
|
||||
//! cd v2 && cargo bench -p cog-pose-estimation --no-default-features -- --warm-up-time 1 --measurement-time 2
|
||||
|
||||
use std::hint::black_box;
|
||||
use std::path::Path;
|
||||
|
||||
use criterion::{criterion_group, criterion_main, Criterion};
|
||||
|
||||
use cog_pose_estimation::inference::{
|
||||
CsiWindow, InferenceEngine, INPUT_SUBCARRIERS, INPUT_TIMESTEPS,
|
||||
};
|
||||
|
||||
/// Deterministic fixed CSI window (seed-stable LCG).
|
||||
fn fixed_window() -> CsiWindow {
|
||||
let mut s = 0x00C0_FFEEu32;
|
||||
let data: Vec<f32> = (0..INPUT_SUBCARRIERS * INPUT_TIMESTEPS)
|
||||
.map(|_| {
|
||||
s = s.wrapping_mul(1103515245).wrapping_add(12345);
|
||||
(s >> 16) as f32 / 32768.0 // [0, 2): 16 bits survive the shift (max 65535/32768)
|
||||
})
|
||||
.collect();
|
||||
CsiWindow { data }
|
||||
}
|
||||
|
||||
fn real_weights() -> Option<std::path::PathBuf> {
|
||||
let candidates = [
|
||||
"cog/artifacts/pose_v1.safetensors",
|
||||
"v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors",
|
||||
"crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors",
|
||||
];
|
||||
candidates
|
||||
.iter()
|
||||
.map(Path::new)
|
||||
.find(|p| p.exists())
|
||||
.map(|p| p.to_path_buf())
|
||||
}
|
||||
|
||||
fn bench_infer(c: &mut Criterion) {
|
||||
let window = fixed_window();
|
||||
|
||||
match real_weights() {
|
||||
Some(path) => {
|
||||
let engine =
|
||||
InferenceEngine::with_weights(Some(&path)).expect("load real pose_v1 weights");
|
||||
assert!(
|
||||
engine.backend().starts_with("candle-"),
|
||||
"expected real Candle backend, got {} — bench would measure the stub",
|
||||
engine.backend()
|
||||
);
|
||||
let _ = engine.infer(&window).expect("warmup infer");
|
||||
|
||||
c.bench_function("cog_pose_estimation::infer[cpu_real_weights_steady_state]", |b| {
|
||||
b.iter(|| {
|
||||
black_box(engine.infer(black_box(&window)).expect("infer"));
|
||||
});
|
||||
});
|
||||
}
|
||||
None => {
|
||||
eprintln!(
|
||||
"NOTE: pose_v1.safetensors not found — skipping the real-weights infer bench. \
|
||||
(The committed RESULTS.md numbers require the in-repo weights.)"
|
||||
);
|
||||
c.bench_function("cog_pose_estimation::infer[SKIPPED_no_weights]", |b| {
|
||||
b.iter(|| black_box(1 + 1));
|
||||
});
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
criterion_group!(benches, bench_infer);
|
||||
criterion_main!(benches);
|
||||
@@ -2,6 +2,33 @@
|
||||
# It is not intended for manual editing.
|
||||
version = 4
|
||||
|
||||
[[package]]
|
||||
name = "aho-corasick"
|
||||
version = "1.1.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ddd31a130427c27518df266943a5308ed92d4b226cc639f5a8f1002816174301"
|
||||
dependencies = [
|
||||
"memchr",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "anes"
|
||||
version = "0.1.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "4b46cbb362ab8752921c97e041f5e366ee6297bd428a31275b9fcf1e380f7299"
|
||||
|
||||
[[package]]
|
||||
name = "anstyle"
|
||||
version = "1.0.14"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "940b3a0ca603d1eade50a4846a2afffd5ef57a9feac2c0e2ec2e14f9ead76000"
|
||||
|
||||
[[package]]
|
||||
name = "autocfg"
|
||||
version = "1.5.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f2032f911046de80f0a198e0901378627c33f59ea0ac00e363d481118bd70a53"
|
||||
|
||||
[[package]]
|
||||
name = "block-buffer"
|
||||
version = "0.10.4"
|
||||
@@ -11,12 +38,76 @@ dependencies = [
|
||||
"generic-array",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "bumpalo"
|
||||
version = "3.20.3"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "72f5acc6cb2ba439de613abc23857ec3d78374d8ed5ac84e9d11336e87da8649"
|
||||
|
||||
[[package]]
|
||||
name = "cast"
|
||||
version = "0.3.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "37b2a672a2cb129a2e41c10b1224bb368f9f37a2b16b612598138befd7b37eb5"
|
||||
|
||||
[[package]]
|
||||
name = "cfg-if"
|
||||
version = "1.0.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9330f8b2ff13f34540b44e946ef35111825727b38d33286ef986142615121801"
|
||||
|
||||
[[package]]
|
||||
name = "ciborium"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "42e69ffd6f0917f5c029256a24d0161db17cea3997d185db0d35926308770f0e"
|
||||
dependencies = [
|
||||
"ciborium-io",
|
||||
"ciborium-ll",
|
||||
"serde",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "ciborium-io"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "05afea1e0a06c9be33d539b876f1ce3692f4afea2cb41f740e7743225ed1c757"
|
||||
|
||||
[[package]]
|
||||
name = "ciborium-ll"
|
||||
version = "0.2.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "57663b653d948a338bfb3eeba9bb2fd5fcfaecb9e199e87e1eda4d9e8b240fd9"
|
||||
dependencies = [
|
||||
"ciborium-io",
|
||||
"half",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap"
|
||||
version = "4.6.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1ddb117e43bbf7dacf0a4190fef4d345b9bad68dfc649cb349e7d17d28428e51"
|
||||
dependencies = [
|
||||
"clap_builder",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap_builder"
|
||||
version = "4.6.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "714a53001bf66416adb0e2ef5ac857140e7dc3a0c48fb28b2f10762fc4b5069f"
|
||||
dependencies = [
|
||||
"anstyle",
|
||||
"clap_lex",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "clap_lex"
|
||||
version = "1.1.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c8d4a3bb8b1e0c1050499d1815f5ab16d04f0959b233085fb31653fbfc9d98f9"
|
||||
|
||||
[[package]]
|
||||
name = "cpufeatures"
|
||||
version = "0.2.17"
|
||||
@@ -26,6 +117,73 @@ dependencies = [
|
||||
"libc",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "criterion"
|
||||
version = "0.5.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f2b12d017a929603d80db1831cd3a24082f8137ce19c69e6447f54f5fc8d692f"
|
||||
dependencies = [
|
||||
"anes",
|
||||
"cast",
|
||||
"ciborium",
|
||||
"clap",
|
||||
"criterion-plot",
|
||||
"is-terminal",
|
||||
"itertools",
|
||||
"num-traits",
|
||||
"once_cell",
|
||||
"oorandom",
|
||||
"plotters",
|
||||
"rayon",
|
||||
"regex",
|
||||
"serde",
|
||||
"serde_derive",
|
||||
"serde_json",
|
||||
"tinytemplate",
|
||||
"walkdir",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "criterion-plot"
|
||||
version = "0.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6b50826342786a51a89e2da3a28f1c32b06e387201bc2d19791f622c673706b1"
|
||||
dependencies = [
|
||||
"cast",
|
||||
"itertools",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-deque"
|
||||
version = "0.8.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9dd111b7b7f7d55b72c0a6ae361660ee5853c9af73f70c3c2ef6858b950e2e51"
|
||||
dependencies = [
|
||||
"crossbeam-epoch",
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-epoch"
|
||||
version = "0.9.18"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "5b82ac4a3c2ca9c3460964f020e1402edd5753411d7737aa39c3714ad1b5420e"
|
||||
dependencies = [
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "crossbeam-utils"
|
||||
version = "0.8.21"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d0a5c400df2834b80a4c3327b3aad3a4c4cd4de0629063962b03235697506a28"
|
||||
|
||||
[[package]]
|
||||
name = "crunchy"
|
||||
version = "0.2.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "460fbee9c2c2f33933d720630a6a0bac33ba7053db5344fac858d4b8952d77d5"
|
||||
|
||||
[[package]]
|
||||
name = "crypto-common"
|
||||
version = "0.1.7"
|
||||
@@ -46,6 +204,36 @@ dependencies = [
|
||||
"crypto-common",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "either"
|
||||
version = "1.16.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "91622ff5e7162018101f2fea40d6ebf4a78bbe5a49736a2020649edf9693679e"
|
||||
|
||||
[[package]]
|
||||
name = "futures-core"
|
||||
version = "0.3.32"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7e3450815272ef58cec6d564423f6e755e25379b217b0bc688e295ba24df6b1d"
|
||||
|
||||
[[package]]
|
||||
name = "futures-task"
|
||||
version = "0.3.32"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "037711b3d59c33004d3856fbdc83b99d4ff37a24768fa1be9ce3538a1cde4393"
|
||||
|
||||
[[package]]
|
||||
name = "futures-util"
|
||||
version = "0.3.32"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "389ca41296e6190b48053de0321d02a77f32f8a5d2461dd38762c0593805c6d6"
|
||||
dependencies = [
|
||||
"futures-core",
|
||||
"futures-task",
|
||||
"pin-project-lite",
|
||||
"slab",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "generic-array"
|
||||
version = "0.14.7"
|
||||
@@ -56,6 +244,60 @@ dependencies = [
|
||||
"version_check",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "half"
|
||||
version = "2.7.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6ea2d84b969582b4b1864a92dc5d27cd2b77b622a8d79306834f1be5ba20d84b"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"crunchy",
|
||||
"zerocopy",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "hermit-abi"
|
||||
version = "0.5.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "fc0fef456e4baa96da950455cd02c081ca953b141298e41db3fc7e36b1da849c"
|
||||
|
||||
[[package]]
|
||||
name = "is-terminal"
|
||||
version = "0.4.17"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "3640c1c38b8e4e43584d8df18be5fc6b0aa314ce6ebf51b53313d4306cca8e46"
|
||||
dependencies = [
|
||||
"hermit-abi",
|
||||
"libc",
|
||||
"windows-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "itertools"
|
||||
version = "0.10.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b0fd2260e829bddf4cb6ea802289de2f86d6a7a690192fbe91b3f46e0f2c8473"
|
||||
dependencies = [
|
||||
"either",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "itoa"
|
||||
version = "1.0.18"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8f42a60cbdf9a97f5d2305f08a87dc4e09308d1276d28c869c684d7777685682"
|
||||
|
||||
[[package]]
|
||||
name = "js-sys"
|
||||
version = "0.3.100"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f2025f20d7a4fa7785846e7b63d10a76d3f1cee98ee5cb79ea59703f95e42162"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"futures-util",
|
||||
"wasm-bindgen",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "libc"
|
||||
version = "0.2.182"
|
||||
@@ -68,6 +310,192 @@ version = "0.2.16"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b6d2cec3eae94f9f509c767b45932f1ada8350c4bdb85af2fcab4a3c14807981"
|
||||
|
||||
[[package]]
|
||||
name = "memchr"
|
||||
version = "2.8.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "88904434abc2901f197fe8cc55f0445e7ded921dba5911dad2e2b39b48e663c4"
|
||||
|
||||
[[package]]
|
||||
name = "num-traits"
|
||||
version = "0.2.19"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "071dfc062690e90b734c0b2273ce72ad0ffa95f0c74596bc250dcfd960262841"
|
||||
dependencies = [
|
||||
"autocfg",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "once_cell"
|
||||
version = "1.21.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9f7c3e4beb33f85d45ae3e3a1792185706c8e16d043238c593331cc7cd313b50"
|
||||
|
||||
[[package]]
|
||||
name = "oorandom"
|
||||
version = "11.1.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d6790f58c7ff633d8771f42965289203411a5e5c68388703c06e14f24770b41e"
|
||||
|
||||
[[package]]
|
||||
name = "pin-project-lite"
|
||||
version = "0.2.17"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "a89322df9ebe1c1578d689c92318e070967d1042b512afbe49518723f4e6d5cd"
|
||||
|
||||
[[package]]
|
||||
name = "plotters"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "5aeb6f403d7a4911efb1e33402027fc44f29b5bf6def3effcc22d7bb75f2b747"
|
||||
dependencies = [
|
||||
"num-traits",
|
||||
"plotters-backend",
|
||||
"plotters-svg",
|
||||
"wasm-bindgen",
|
||||
"web-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "plotters-backend"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "df42e13c12958a16b3f7f4386b9ab1f3e7933914ecea48da7139435263a4172a"
|
||||
|
||||
[[package]]
|
||||
name = "plotters-svg"
|
||||
version = "0.3.7"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "51bae2ac328883f7acdfea3d66a7c35751187f870bc81f94563733a154d7a670"
|
||||
dependencies = [
|
||||
"plotters-backend",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "proc-macro2"
|
||||
version = "1.0.106"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "8fd00f0bb2e90d81d1044c2b32617f68fcb9fa3bb7640c23e9c748e53fb30934"
|
||||
dependencies = [
|
||||
"unicode-ident",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "quote"
|
||||
version = "1.0.45"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "41f2619966050689382d2b44f664f4bc593e129785a36d6ee376ddf37259b924"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rayon"
|
||||
version = "1.12.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "fb39b166781f92d482534ef4b4b1b2568f42613b53e5b6c160e24cfbfa30926d"
|
||||
dependencies = [
|
||||
"either",
|
||||
"rayon-core",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "rayon-core"
|
||||
version = "1.13.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "22e18b0f0062d30d4230b2e85ff77fdfe4326feb054b9783a3460d8435c8ab91"
|
||||
dependencies = [
|
||||
"crossbeam-deque",
|
||||
"crossbeam-utils",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "regex"
|
||||
version = "1.12.4"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f1292b7759ae1cb9ec195452d1390a074f0cd8541ab7a5a8c31cd6db45d4a6ba"
|
||||
dependencies = [
|
||||
"aho-corasick",
|
||||
"memchr",
|
||||
"regex-automata",
|
||||
"regex-syntax",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "regex-automata"
|
||||
version = "0.4.14"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6e1dd4122fc1595e8162618945476892eefca7b88c52820e74af6262213cae8f"
|
||||
dependencies = [
|
||||
"aho-corasick",
|
||||
"memchr",
|
||||
"regex-syntax",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "regex-syntax"
|
||||
version = "0.8.11"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d6f6ff9a378485b298a5286656da665ba74413d36db0979633275d2e708145d4"
|
||||
|
||||
[[package]]
|
||||
name = "rustversion"
|
||||
version = "1.0.22"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b39cdef0fa800fc44525c84ccb54a029961a8215f9619753635a9c0d2538d46d"
|
||||
|
||||
[[package]]
|
||||
name = "same-file"
|
||||
version = "1.0.6"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "93fc1dc3aaa9bfed95e02e6eadabb4baf7e3078b0bd1b4d7b6b0b68378900502"
|
||||
dependencies = [
|
||||
"winapi-util",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "serde"
|
||||
version = "1.0.228"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "9a8e94ea7f378bd32cbbd37198a4a91436180c5bb472411e48b5ec2e2124ae9e"
|
||||
dependencies = [
|
||||
"serde_core",
|
||||
"serde_derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "serde_core"
|
||||
version = "1.0.228"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "41d385c7d4ca58e59fc732af25c3983b67ac852c1a25000afe1175de458b67ad"
|
||||
dependencies = [
|
||||
"serde_derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "serde_derive"
|
||||
version = "1.0.228"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "d540f220d3187173da220f885ab66608367b6574e925011a9353e4badda91d79"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "serde_json"
|
||||
version = "1.0.150"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e8014e44b4736ed0538adeecded0fce2a272f22dc9578a7eb6b2d9993c74cfb9"
|
||||
dependencies = [
|
||||
"itoa",
|
||||
"memchr",
|
||||
"serde",
|
||||
"serde_core",
|
||||
"zmij",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "sha2"
|
||||
version = "0.10.9"
|
||||
@@ -79,22 +507,171 @@ dependencies = [
|
||||
"digest",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "slab"
|
||||
version = "0.4.12"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0c790de23124f9ab44544d7ac05d60440adc586479ce501c1d6d7da3cd8c9cf5"
|
||||
|
||||
[[package]]
|
||||
name = "syn"
|
||||
version = "2.0.117"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e665b8803e7b1d2a727f4023456bbbbe74da67099c585258af0ad9c5013b9b99"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"unicode-ident",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "tinytemplate"
|
||||
version = "1.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "be4d6b5f19ff7664e8c98d03e2139cb510db9b0a60b55f8e8709b689d939b6bc"
|
||||
dependencies = [
|
||||
"serde",
|
||||
"serde_json",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "typenum"
|
||||
version = "1.19.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "562d481066bde0658276a35467c4af00bdc6ee726305698a55b86e61d7ad82bb"
|
||||
|
||||
[[package]]
|
||||
name = "unicode-ident"
|
||||
version = "1.0.24"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "e6e4313cd5fcd3dad5cafa179702e2b244f760991f45397d14d4ebf38247da75"
|
||||
|
||||
[[package]]
|
||||
name = "version_check"
|
||||
version = "0.9.5"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "0b928f33d975fc6ad9f86c8f283853ad26bdd5b10b7f1542aa2fa15e2289105a"
|
||||
|
||||
[[package]]
|
||||
name = "walkdir"
|
||||
version = "2.5.0"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "29790946404f91d9c5d06f9874efddea1dc06c5efe94541a7d6863108e3a5e4b"
|
||||
dependencies = [
|
||||
"same-file",
|
||||
"winapi-util",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wasm-bindgen"
|
||||
version = "0.2.123"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "a254a4b10c19a76f09a27640e7ffbf9bc30bf67e16a3bf28aaefa4920fe81563"
|
||||
dependencies = [
|
||||
"cfg-if",
|
||||
"once_cell",
|
||||
"rustversion",
|
||||
"wasm-bindgen-macro",
|
||||
"wasm-bindgen-shared",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wasm-bindgen-macro"
|
||||
version = "0.2.123"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "24a40fc75b0ec6f3746ceb10d36f53a93dcd68a93b11b6445983945d79eba0dc"
|
||||
dependencies = [
|
||||
"quote",
|
||||
"wasm-bindgen-macro-support",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wasm-bindgen-macro-support"
|
||||
version = "0.2.123"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "908f34bd9b9ce3d4caf07b72dfab63d61504d156856c6bd3cd87fa350cf3985b"
|
||||
dependencies = [
|
||||
"bumpalo",
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
"wasm-bindgen-shared",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wasm-bindgen-shared"
|
||||
version = "0.2.123"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "7acbf7616c27b194bbb550bf77ed0c2c3e5b7fd1260a93082b95fb7f47959b92"
|
||||
dependencies = [
|
||||
"unicode-ident",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "web-sys"
|
||||
version = "0.3.100"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "6e0871acf327f283dc6da28a1696cdc64fb355ba9f935d052021fa77f35cce69"
|
||||
dependencies = [
|
||||
"js-sys",
|
||||
"wasm-bindgen",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "wifi-densepose-wasm-edge"
|
||||
version = "0.3.0"
|
||||
dependencies = [
|
||||
"criterion",
|
||||
"libm",
|
||||
"sha2",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "winapi-util"
|
||||
version = "0.1.11"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "c2a7b1c03c876122aa43f3020e6c3c3ee5c05081c9a00739faf7503aeba10d22"
|
||||
dependencies = [
|
||||
"windows-sys",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "windows-link"
|
||||
version = "0.2.1"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "f0805222e57f7521d6a62e36fa9163bc891acd422f971defe97d64e70d0a4fe5"
|
||||
|
||||
[[package]]
|
||||
name = "windows-sys"
|
||||
version = "0.61.2"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ae137229bcbd6cdf0f7b80a31df61766145077ddf49416a728b02cb3921ff3fc"
|
||||
dependencies = [
|
||||
"windows-link",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerocopy"
|
||||
version = "0.8.52"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "ce1022995ff5ff5d841ad7d994facc23098cd40152f2c1d11cd607c6f530653f"
|
||||
dependencies = [
|
||||
"zerocopy-derive",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zerocopy-derive"
|
||||
version = "0.8.52"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "1ae7f38b72ec2a254e2b87ef277cf2cd4fb97cbebf944faa6f33354da0867930"
|
||||
dependencies = [
|
||||
"proc-macro2",
|
||||
"quote",
|
||||
"syn",
|
||||
]
|
||||
|
||||
[[package]]
|
||||
name = "zmij"
|
||||
version = "1.0.21"
|
||||
source = "registry+https://github.com/rust-lang/crates.io-index"
|
||||
checksum = "b8848ee67ecc8aedbaf3e4122217aff892639231befc6a1b58d29fff4c2cabaa"
|
||||
|
||||
@@ -11,6 +11,20 @@ categories = ["embedded", "wasm", "science"]
|
||||
|
||||
[lib]
|
||||
crate-type = ["cdylib", "rlib"]
|
||||
# The lib's libtest harness does not understand criterion CLI flags
|
||||
# (`--warm-up-time` etc.), so exclude it from `cargo bench` — only the criterion
|
||||
# bench target below should receive bench args (ADR-163).
|
||||
bench = false
|
||||
|
||||
# ADR-163: host-measured process_frame latency benches (closes the ADR-160
|
||||
# "criterion benches for process_frame budget claims" deferred item — HOST only;
|
||||
# the ESP32-S3 WASM3 budget remains unmeasured, see the bench header).
|
||||
# `std` is required (criterion is a host crate); the crate is workspace-EXCLUDED
|
||||
# so run from the crate dir: `cargo bench --features std`.
|
||||
[[bench]]
|
||||
name = "process_frame_bench"
|
||||
harness = false
|
||||
required-features = ["std"]
|
||||
|
||||
[dependencies]
|
||||
# no_std math
|
||||
@@ -18,6 +32,11 @@ libm = "0.2"
|
||||
# SHA-256 for RVF build hash (optional, used by builder)
|
||||
sha2 = { version = "0.10", optional = true, default-features = false }
|
||||
|
||||
[dev-dependencies]
|
||||
# Host-only latency regression benches (ADR-163). Pinned to match the rest of
|
||||
# the workspace's bench crates.
|
||||
criterion = { version = "0.5", features = ["html_reports"] }
|
||||
|
||||
[features]
|
||||
default = ["default-pipeline"]
|
||||
# Enable std for testing on host + RVF builder
|
||||
|
||||
@@ -0,0 +1,259 @@
|
||||
//! Criterion benches for the heaviest `process_frame` hot paths in the edge
|
||||
//! skill library (ADR-163, closing the ADR-160 §"Deferred Backlog" item
|
||||
//! "Criterion benches for process_frame budget claims").
|
||||
//!
|
||||
//! ## HONEST SCOPE — read this before citing any number here
|
||||
//!
|
||||
//! These benches measure **HOST** wall-clock latency on a development laptop.
|
||||
//! The per-module doc budgets (e.g. `exo_time_crystal` "H (heavy, <10ms) on
|
||||
//! ESP32-S3 WASM3") are **for a different target**: an Xtensa ESP32-S3 running
|
||||
//! the WASM3 interpreter. A native x86_64 host with `-O` is a **lower-bound
//! proxy for the ALGORITHM cost only**; it is NOT the ESP32 number and does NOT
|
||||
//! reproduce the ESP32 budget. WASM3 interpretation on a ~240 MHz Xtensa core is
|
||||
//! typically 1-2 orders of magnitude slower than native host code, so a host
|
||||
//! median well under the budget does NOT prove the ESP32 meets it — it only
|
||||
//! bounds the work. The ESP32 figure remains UNMEASURED (needs hardware).
|
||||
//!
|
||||
//! What these benches DO prove (MEASURED-on-host):
|
||||
//! * the hot paths run, on a fixed synthetic CSI frame, with a real median;
|
||||
//! * a regression guard exists so a future change that 10×'s the host cost
|
||||
//! is caught in CI/dev even before anyone reflashes an ESP32.
|
||||
//!
|
||||
//! Run (the crate is EXCLUDED from the v2 workspace — bench from the crate dir):
|
||||
//! cd v2/crates/wifi-densepose-wasm-edge
|
||||
//! cargo bench --features std
|
||||
//! # quick smoke:
|
||||
//! cargo bench --features std -- --warm-up-time 1 --measurement-time 2
|
||||
//!
|
||||
//! `med_seizure_detect` is gated behind `medical-experimental`; its bench is
|
||||
//! `#[cfg(feature = "medical-experimental")]` and only runs when that feature is
|
||||
//! also enabled:
|
||||
//! cargo bench --features std,medical-experimental
|
||||
|
||||
use criterion::{criterion_group, criterion_main, BatchSize, Criterion};
|
||||
use std::hint::black_box;
|
||||
|
||||
use wifi_densepose_wasm_edge::exo_ghost_hunter::GhostHunterDetector;
|
||||
use wifi_densepose_wasm_edge::exo_time_crystal::TimeCrystalDetector;
|
||||
use wifi_densepose_wasm_edge::sec_weapon_detect::WeaponDetector;
|
||||
|
||||
// ── Fixed synthetic CSI fixtures (deterministic LCG, seed-stable) ────────────
|
||||
|
||||
/// Deterministic pseudo-random in [0, 2) (top 16 bits / 32768) from a 32-bit
/// LCG, matching the generator style used by `tests/budget_compliance.rs`.
|
||||
fn lcg(seed: &mut u32) -> f32 {
|
||||
*seed = seed.wrapping_mul(1103515245).wrapping_add(12345);
|
||||
(*seed >> 16) as f32 / 32768.0
|
||||
}
|
||||
|
||||
fn synthetic_phases(n: usize, seed: u32) -> Vec<f32> {
|
||||
let mut s = seed;
|
||||
(0..n).map(|_| lcg(&mut s) * 6.2832 - 3.1416).collect()
|
||||
}
|
||||
|
||||
fn synthetic_amplitudes(n: usize, seed: u32) -> Vec<f32> {
|
||||
let mut s = seed;
|
||||
(0..n).map(|_| lcg(&mut s) * 10.0 + 0.1).collect()
|
||||
}
|
||||
|
||||
fn synthetic_variance(n: usize, seed: u32) -> Vec<f32> {
|
||||
let mut s = seed;
|
||||
(0..n).map(|_| lcg(&mut s) * 2.0 + 0.05).collect()
|
||||
}
|
||||
|
||||
const N_SC: usize = 32; // per-subcarrier width (matches both modules' MAX_SC)
|
||||
|
||||
// ── exo_time_crystal: compute_autocorrelation 256×128 hot path ───────────────
|
||||
//
|
||||
// `compute_autocorrelation` is private, so we drive it through the public
|
||||
// `process_frame`. To hit the full 256-point × 128-lag autocorrelation the
|
||||
// circular buffer must be FULL (≥256 samples) and the signal must be
|
||||
// non-constant (the module early-outs on `buf_var < 1e-8`). We pre-fill once
|
||||
// with a periodic-plus-noise motion-energy stream, then bench a single
|
||||
// `process_frame` (each call recomputes the full 256×128 autocorrelation =
|
||||
// ~32K multiply-accumulates, the M6-audit-named hot path).
|
||||
|
||||
fn prefilled_time_crystal() -> TimeCrystalDetector {
|
||||
let mut d = TimeCrystalDetector::new();
|
||||
let mut s = 0xC0FFEEu32;
|
||||
// 300 frames (> BUF_LEN=256) so the buffer is full and statistics are warm.
|
||||
for i in 0..300 {
|
||||
// period-10 square wave + small noise → guarantees buf_var > 0 and a
|
||||
// genuine autocorrelation structure (the expensive path runs).
|
||||
let base = if (i % 10) < 5 { 1.0 } else { 0.0 };
|
||||
let me = base + lcg(&mut s) * 0.05;
|
||||
black_box(d.process_frame(black_box(me)));
|
||||
}
|
||||
d
|
||||
}
|
||||
|
||||
fn bench_exo_time_crystal(c: &mut Criterion) {
|
||||
c.bench_function("exo_time_crystal::process_frame[autocorr_256x128]", |b| {
|
||||
let mut s = 0x1357_9BDFu32;
|
||||
b.iter_batched(
|
||||
prefilled_time_crystal,
|
||||
|mut d| {
|
||||
// One frame = one full 256×128 autocorrelation pass.
|
||||
let me = if (d.frame_count() % 10) < 5 { 1.0 } else { 0.0 } + lcg(&mut s) * 0.05;
|
||||
black_box(d.process_frame(black_box(me)));
|
||||
},
|
||||
BatchSize::SmallInput,
|
||||
);
|
||||
});
|
||||
}
|
||||
|
||||
// ── exo_ghost_hunter: periodicity + hidden-breathing hot path ────────────────
|
||||
//
|
||||
// Heaviest path runs only when the room is reported EMPTY (presence == 0):
|
||||
// per-group anomaly accumulation + aggregate-phase autocorrelation for hidden
|
||||
// periodic (breathing) signatures. We warm the noise floor + phase buffer first,
|
||||
// then bench one empty-room frame.
|
||||
|
||||
fn prefilled_ghost_hunter() -> GhostHunterDetector {
|
||||
let mut d = GhostHunterDetector::new();
|
||||
let mut s = 0xBADC0DEu32;
|
||||
// Warm the per-group EWMA noise floors + fill the phase buffer (PHASE_BUF_LEN=64)
|
||||
// with a periodic phase signal so the periodicity autocorrelation has structure.
|
||||
for i in 0..120u32 {
|
||||
let phases: Vec<f32> = (0..N_SC)
|
||||
.map(|k| libm::sinf(i as f32 * 0.4 + k as f32 * 0.1) * 0.3 + lcg(&mut s) * 0.02)
|
||||
.collect();
|
||||
let amps = synthetic_amplitudes(N_SC, 4000 + i);
|
||||
let var = synthetic_variance(N_SC, 4500 + i);
|
||||
black_box(d.process_frame(&phases, &amps, &var, 0, 0.05));
|
||||
}
|
||||
d
|
||||
}
|
||||
|
||||
fn bench_exo_ghost_hunter(c: &mut Criterion) {
|
||||
let amps = synthetic_amplitudes(N_SC, 9000);
|
||||
let var = synthetic_variance(N_SC, 9500);
|
||||
c.bench_function("exo_ghost_hunter::process_frame[empty_room_periodicity]", |b| {
|
||||
let mut s = 0x2468_ACE0u32;
|
||||
b.iter_batched(
|
||||
prefilled_ghost_hunter,
|
||||
|mut d| {
|
||||
let i = d.frame_count();
|
||||
let phases: Vec<f32> = (0..N_SC)
|
||||
.map(|k| libm::sinf(i as f32 * 0.4 + k as f32 * 0.1) * 0.3 + lcg(&mut s) * 0.02)
|
||||
.collect();
|
||||
black_box(d.process_frame(
|
||||
black_box(&phases),
|
||||
black_box(&amps),
|
||||
black_box(&var),
|
||||
black_box(0),
|
||||
black_box(0.05),
|
||||
));
|
||||
},
|
||||
BatchSize::SmallInput,
|
||||
);
|
||||
});
|
||||
}
|
||||
|
||||
// ── sec_weapon_detect: per-subcarrier Welford hot path ───────────────────────
|
||||
//
|
||||
// After calibration the detector runs a per-subcarrier online Welford update
|
||||
// over MAX_SC=32 subcarriers each frame (the M6-audit-named hot path). We
|
||||
// calibrate first (the early frames just accumulate baseline stats), then bench
|
||||
// one steady-state frame.
|
||||
|
||||
fn calibrated_weapon_detector() -> WeaponDetector {
|
||||
let mut d = WeaponDetector::new();
|
||||
// Drive enough empty-room frames to complete calibration + warm the running
|
||||
// Welford state. Calibration window is internal; 200 frames is comfortably
|
||||
// past it for MAX_SC=32.
|
||||
for i in 0..200u32 {
|
||||
let phases = synthetic_phases(N_SC, 6000 + i);
|
||||
let amps = synthetic_amplitudes(N_SC, 6500 + i);
|
||||
let var = synthetic_variance(N_SC, 7000 + i);
|
||||
black_box(d.process_frame(&phases, &amps, &var, 0.05, 0));
|
||||
}
|
||||
d
|
||||
}
|
||||
|
||||
fn bench_sec_weapon_detect(c: &mut Criterion) {
|
||||
c.bench_function("sec_weapon_detect::process_frame[per_sc_welford]", |b| {
|
||||
let mut seed = 8000u32;
|
||||
b.iter_batched(
|
||||
calibrated_weapon_detector,
|
||||
|mut d| {
|
||||
seed = seed.wrapping_add(1);
|
||||
let phases = synthetic_phases(N_SC, seed);
|
||||
let amps = synthetic_amplitudes(N_SC, seed.wrapping_add(500));
|
||||
let var = synthetic_variance(N_SC, seed.wrapping_add(1000));
|
||||
black_box(d.process_frame(
|
||||
black_box(&phases),
|
||||
black_box(&amps),
|
||||
black_box(&var),
|
||||
black_box(0.3),
|
||||
black_box(1),
|
||||
));
|
||||
},
|
||||
BatchSize::SmallInput,
|
||||
);
|
||||
});
|
||||
}
|
||||
|
||||
// ── med_seizure_detect: detect_rhythm / clonic autocorrelation hot path ──────
|
||||
//
|
||||
// Gated behind `medical-experimental` (ADR-160 §A1). The clonic-phase rhythm
|
||||
// detection autocorrelates the amplitude ring buffer (PHASE_WINDOW=100); we warm
|
||||
// the buffers with a high-energy rhythmic signal, then bench one frame.
|
||||
#[cfg(feature = "medical-experimental")]
|
||||
mod med {
|
||||
use super::*;
|
||||
use wifi_densepose_wasm_edge::med_seizure_detect::SeizureDetector;
|
||||
|
||||
fn warmed_seizure_detector() -> SeizureDetector {
|
||||
let mut d = SeizureDetector::new();
|
||||
let mut s = 0x5EE_D00Du32;
|
||||
// High-energy ~4 Hz rhythmic (period ~5 frames at 20 Hz) → exercises the
|
||||
// clonic-phase rhythm/autocorrelation path, with presence asserted.
|
||||
for i in 0..150u32 {
|
||||
let me = 2.5 + libm::sinf(i as f32 * 1.25) * 1.5;
|
||||
let amp = 1.0 + lcg(&mut s) * 0.2;
|
||||
black_box(d.process_frame(0.0, amp, me, 1));
|
||||
}
|
||||
d
|
||||
}
|
||||
|
||||
pub fn bench_med_seizure_detect(c: &mut Criterion) {
|
||||
c.bench_function("med_seizure_detect::process_frame[clonic_rhythm]", |b| {
|
||||
let mut s = 0x9A_BCDE_F0u32;
|
||||
b.iter_batched(
|
||||
warmed_seizure_detector,
|
||||
|mut d| {
|
||||
let i = d.frame_count();
|
||||
let me = 2.5 + libm::sinf(i as f32 * 1.25) * 1.5;
|
||||
let amp = 1.0 + lcg(&mut s) * 0.2;
|
||||
black_box(d.process_frame(
|
||||
black_box(0.0),
|
||||
black_box(amp),
|
||||
black_box(me),
|
||||
black_box(1),
|
||||
));
|
||||
},
|
||||
BatchSize::SmallInput,
|
||||
);
|
||||
});
|
||||
}
|
||||
}
|
||||
|
||||
#[cfg(feature = "medical-experimental")]
|
||||
criterion_group!(
|
||||
benches,
|
||||
bench_exo_time_crystal,
|
||||
bench_exo_ghost_hunter,
|
||||
bench_sec_weapon_detect,
|
||||
med::bench_med_seizure_detect,
|
||||
);
|
||||
|
||||
#[cfg(not(feature = "medical-experimental"))]
|
||||
criterion_group!(
|
||||
benches,
|
||||
bench_exo_time_crystal,
|
||||
bench_exo_ghost_hunter,
|
||||
bench_sec_weapon_detect,
|
||||
);
|
||||
|
||||
criterion_main!(benches);
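The `exo_time_crystal` bench comments describe a 256-point × 128-lag autocorrelation costing about 32K multiply-accumulates per frame. The real `compute_autocorrelation` is private and not shown in this diff, so the following is only a minimal sketch of where that figure comes from. It assumes a naive, mean-removed, circular autocorrelation; the function name `autocorrelation_with_mac_count` and the circular indexing are illustrative assumptions, not the crate's implementation.

```rust
/// Naive mean-removed circular autocorrelation over `max_lag` lags.
/// Returns the per-lag values together with the number of
/// multiply-accumulate (MAC) operations performed.
/// NOTE: hypothetical helper for illustration; not the crate's private
/// `compute_autocorrelation`.
fn autocorrelation_with_mac_count(samples: &[f32], max_lag: usize) -> (Vec<f32>, usize) {
    let n = samples.len();
    if n == 0 {
        return (Vec::new(), 0);
    }
    let mean = samples.iter().sum::<f32>() / n as f32;
    let mut macs = 0usize;
    let mut out = Vec::with_capacity(max_lag);
    for lag in 0..max_lag {
        let mut acc = 0.0f32;
        for i in 0..n {
            // Circular indexing (an assumption): every lag touches all n samples.
            let j = (i + lag) % n;
            acc += (samples[i] - mean) * (samples[j] - mean);
            macs += 1;
        }
        out.push(acc / n as f32);
    }
    (out, macs)
}
```

With 256 samples and 128 lags the inner statement runs 256 × 128 = 32,768 times, which matches the "~32K" figure in the bench comment. For a period-10 square wave like the one used to pre-fill the detector, the value at lag 10 comes out positive and the value at lag 5 negative.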
|
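The `sec_weapon_detect` bench comments refer to a per-subcarrier online Welford update over `MAX_SC=32` subcarriers. The detector's internals are not part of this diff, so the sketch below only shows the general technique (Welford's running mean and variance, one accumulator per subcarrier). The `Welford` struct, its field names, and `update_all` are hypothetical names chosen for illustration.

```rust
/// Running mean/variance accumulator (Welford's online algorithm).
/// Hypothetical type for illustration; not the crate's internal state.
#[derive(Clone, Copy, Default)]
struct Welford {
    count: u32,
    mean: f32,
    m2: f32, // sum of squared deviations from the running mean
}

impl Welford {
    fn update(&mut self, x: f32) {
        self.count += 1;
        let delta = x - self.mean;
        self.mean += delta / self.count as f32;
        // Uses the *updated* mean, which is what keeps the update stable.
        self.m2 += delta * (x - self.mean);
    }

    /// Sample variance; 0.0 until at least two samples have been seen.
    fn variance(&self) -> f32 {
        if self.count < 2 {
            0.0
        } else {
            self.m2 / (self.count - 1) as f32
        }
    }
}

const MAX_SC: usize = 32; // matches the subcarrier count named in the bench

/// One frame: update every subcarrier's accumulator with its amplitude.
fn update_all(stats: &mut [Welford; MAX_SC], amplitudes: &[f32]) {
    for (st, &a) in stats.iter_mut().zip(amplitudes) {
        st.update(a);
    }
}
```

Each frame costs a constant number of floating-point operations per subcarrier and needs no sample history, which is why this style of update suits a small embedded target.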
||||
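A note on the fixture generator: as written in the bench, `lcg` divides the top 16 bits of the state by 32768, so its output spans [0, 2) rather than [0, 1). The sketch below reproduces that function verbatim and adds a hypothetical `lcg_unit` variant, which masks to 15 bits, for comparison. Whether the wider range is intended is not stated in the source; `lcg_unit` and `observed_range` are illustrative names, not part of the crate.

```rust
/// Same generator as the bench fixture: top 16 bits / 32768 → [0, 2).
fn lcg(seed: &mut u32) -> f32 {
    *seed = seed.wrapping_mul(1103515245).wrapping_add(12345);
    (*seed >> 16) as f32 / 32768.0
}

/// Hypothetical variant: keep only 15 bits so the result lies in [0, 1).
fn lcg_unit(seed: &mut u32) -> f32 {
    *seed = seed.wrapping_mul(1103515245).wrapping_add(12345);
    ((*seed >> 16) & 0x7FFF) as f32 / 32768.0
}

/// Draw `n` values from `sample` and return the (min, max) observed.
fn observed_range(sample: fn(&mut u32) -> f32, seed: u32, n: usize) -> (f32, f32) {
    let mut s = seed;
    let mut lo = f32::INFINITY;
    let mut hi = f32::NEG_INFINITY;
    for _ in 0..n {
        let v = sample(&mut s);
        lo = lo.min(v);
        hi = hi.max(v);
    }
    (lo, hi)
}
```

For example, starting from seed 2 the first `lcg` draw is already above 1.0, so `synthetic_phases` (which computes `lcg * 6.2832 - 3.1416`) can produce values above π. That is harmless for a latency bench, but worth knowing before reusing the fixtures where the phase range matters.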