Compare commits

...

22 Commits

Author SHA1 Message Date
rUv 573b00fd98 perf(ci): drop dead uvicorn start from perf job (#917)
Since #915 the perf job gates only on test_frame_budget.py, which drives
the CSIProcessor pipeline in-process and makes no HTTP calls. The
"Start application" step (uvicorn + `sleep 10`) was therefore dead weight:
it existed only for the now-excluded api_throughput/inference_speed tests,
wasted ~10-15 s per main-push run, and dumped ~50 misleading
"router requires hardware setup" ERROR lines into every CI log for a
server no test touched. MOCK_POSE_DATA is server-only, unused here.

Removed the step and the vestigial env. The gated test is unchanged and
passes (verified locally, 3/3).
2026-06-02 19:01:08 +02:00
rUv 91b0e625bd docs(#882): complete the "100% presence" retraction across all docs (#916)
The v1 "100% presence accuracy" headline was already retracted in the
README / user-guide intro / proof-of-capabilities — but 6 secondary
spots still flatly claimed "100% accuracy, never false alarms", which
made proof-of-capabilities.md's "replaced everywhere" assertion untrue.

Completed the retraction in-place with the honest label-free metric
(82.3% held-out temporal-triplet; v1 was a single-class recording where
a constant "yes" scores ~99.98%):

- docs/readme-details.md — 2 benchmark tables + the pre-trained-model row
- docs/user-guide.md — capability table, model-file comment, applications list
- CHANGELOG.md — annotated the historical entry in-place (kept as public
  record per built-in-public ethos, not rewritten)

Verified: no remaining flat "100% presence/accuracy" claim lacks a
retraction marker; proof-of-capabilities.md "replaced everywhere" is now
accurate.
2026-06-02 18:50:39 +02:00
rUv 88b835dd89 fix(ci): perf job gates on the real frame-budget guard, not TDD stubs (#915)
After #914 fixed collection, the perf job actually ran the suite and
exposed that test_api_throughput.py / test_inference_speed.py are TDD
red-phase stubs (every test suffixed `_should_fail_initially`) that time
a *mock that sleeps* — not a real perf signal. They carry machine-
dependent wall-clock asserts (actual_rps >= 40, batch_time < individual_time)
that are inherently flaky on shared CI runners, plus a cross-class
fixture-scope bug (`fixture 'standard_model' not found`). Result: 3 failed,
10 errored — by design, not a regression.

Forcing those green would manufacture a false signal. Instead, gate only
on test_frame_budget.py, which times the *real* CSIProcessor pipeline
against the ADR 50 ms per-frame budget (single-frame, p95/100-frames,
+Doppler) — a genuine regression guard. Verified locally: 3 passed.

The stub files remain in-repo for local TDD; they re-enter CI when their
features are implemented and the mock-timing asserts are made deterministic.
2026-06-02 18:31:55 +02:00
rUv f8f08076eb fix(ci): perf tests — use python -m pytest so src import resolves (#914)
The Performance Tests job collected 26 items then aborted with
`ModuleNotFoundError: No module named 'src'` on test_frame_budget.py,
which does `from src.core.csi_processor import CSIProcessor`. The bare
`pytest` console script does not put the cwd (archive/v1) on sys.path;
`python -m pytest` does. pytest aborts the whole session on a collection
error, so this one import masked the entire (otherwise mock-based,
self-contained) perf suite.

Verified locally: bare-script path reproduces the exact error; `-m`
resolves it and test_frame_budget.py passes 3/3. The other two files
(test_api_throughput.py mock server, test_inference_speed.py MockPoseModel
+psutil) are fully self-contained — no test hits the running server.

Closes the last red job in the v1-API CI chain (#910/#911/#913).
2026-06-02 18:12:00 +02:00
rUv 55f6a74e1e Merge pull request #913 from ruvnet/fix/ci-v1-api-perms-locust
ci(v1-api): fix gh-pages 403 + run real pytest perf suite
2026-06-02 17:36:43 +02:00
ruv b5a91c5635 ci(v1-api): install pytest, drop root --cov addopts for perf suite, ascii comment 2026-06-02 17:29:04 +02:00
ruv 308d2fc89d ci(v1-api): fix gh-pages 403 + run real perf suite — green main CI
Two more latent v1-API CI bugs surfaced once #910/#911 let the jobs reach
their later steps:

- API Documentation: openapi generation now succeeds (psutil fix), but the
  gh-pages deploy failed with HTTP 403 — the job had no `permissions` block
  and GITHUB_TOKEN is read-only by default. Add `permissions: contents:
  write`, and make the deploy `continue-on-error` (the openapi generation is
  the real validation; Pages may be disabled).
- Performance Tests: ran `locust -f tests/performance/locustfile.py`, but
  there is no locustfile — the suite is pytest (test_api_throughput.py,
  test_frame_budget.py, test_inference_speed.py). Run pytest instead, with
  working-directory: archive/v1 and MOCK_POSE_DATA=true.

ci.yml validated as well-formed YAML.
2026-06-02 17:26:39 +02:00
rUv 5038e3c8e1 Merge pull request #911 from ruvnet/fix/ci-v1-api-mock-mode
ci(v1-api): MOCK_POSE_DATA + declare psutil — green Performance Tests & API Docs
2026-06-02 06:20:21 -04:00
ruv e239af3636 fix(deps): declare psutil in requirements.txt — green API Documentation CI
The API Documentation job (and any env without locust) failed with
`ModuleNotFoundError: No module named 'psutil'` when importing the app:
psutil is imported by src/api/routers/health.py, services/metrics.py,
commands/status.py, and tasks/monitoring.py, but was never declared as a
dependency — it only happened to be present where locust (Performance
Tests) pulled it in transitively. Declare it explicitly (psutil>=5.9.0).

Verified locally: `from src.api.main import app; app.openapi()` (the exact
docs-job operation) now succeeds.
2026-06-02 12:11:55 +02:00
ruv 4856afbd0c ci(v1-api): run Performance Tests + API Docs with MOCK_POSE_DATA=true
After the DensePoseHead startup fix (#910), the v1 API starts, but the
Performance Tests load-hit the pose endpoints which error "requires real
CSI data" (no hardware in CI, mock_pose_data defaults False), and the
API-docs job imports the app the same way. Set MOCK_POSE_DATA=true on both
jobs so they exercise the mock path. Verified: the env var maps to
settings.mock_pose_data=True (pydantic, no env_prefix).

(Note: Performance Tests is continue-on-error so this is cleanup, not a
run-blocker; the run-level red on main has been transient Docker Hub pull
timeouts on Tests/docker-build, which are infra flakes that pass on re-run.)
2026-06-02 12:04:58 +02:00
rUv 4d205a05c4 Merge pull request #910 from ruvnet/fix/v1-pose-service-densepose-config
fix(v1-api): pass required config to DensePoseHead — green main CI
2026-06-02 05:50:25 -04:00
ruv bc42ae7903 fix(v1-api): pass required config to DensePoseHead — green main CI
The "Continuous Integration" workflow (Performance Tests + API
Documentation jobs) has failed on every main commit since the API start
path was exercised: pose_service._initialize_models() called
`DensePoseHead()` with no args, but DensePoseHead.__init__ requires a
config dict → "TypeError: DensePoseHead.__init__() missing 1 required
positional argument: 'config'" → uvicorn "Application startup failed".

Pass a config: input_channels=256 (matches the modality translator's
output), num_body_parts=24 (DensePose standard), num_uv_coordinates=2.
Both call sites (with/without pose_model_path) fixed.

Verified locally: DensePoseHead(config) + ModalityTranslationNetwork(config)
both construct + eval, clearing the startup TypeError.
2026-06-02 11:42:52 +02:00
rUv b7b8c1109b Merge pull request #908 from ruvnet/fix/893-release-bins-refresh
release(firmware): refresh release_bins with the #893 CSI fix → v0.6.7
2026-06-02 05:35:34 -04:00
ruv 786e834dae release(firmware): refresh release_bins with the #893 CSI fix → v0.6.7
The pre-built binaries in release_bins/ were v0.6.6 (May 21) and shipped
the MGMT-only promiscuous filter, so display-less boards flashed from them
got yield=0pps (#893/#866/#897 — the root cause of the "can't reproduce /
it's fake" reports). Rebuilt every flashable variant from main (which has
the #893 display-gated DATA-frame fix) and refreshed the binaries:

- top-level ESP32-S3 8MB (sdkconfig.defaults) — esp32-csi-node.bin +
  bootloader (partition-table/ota_data unchanged — code-only fix)
- esp32-csi-node-4mb.bin (ESP32-S3 4MB, sdkconfig.defaults.4mb)
- c6-adr110/ (ESP32-C6, sdkconfig.defaults.esp32c6) — the exact firmware
  hardware-verified on COM6 (CSI yield 0→27 pps, presence/motion alive,
  no #396 crash)
- s3-adr110/ (same production S3 8MB config)

Left untouched: s3-fair-adr110/ (a non-production size-comparison build,
features stripped — not a board anyone flashes for sensing).

version.txt → 0.6.7; SHA256SUMS regenerated for the changed variant dirs.
Display boards keep MGMT-only (preserves the #396 crash protection);
display-less boards now capture DATA frames and stream CSI.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-02 11:18:03 +02:00
rUv 8703ade9b6 Merge pull request #907 from ruvnet/fix/894-occupancy-cap
fix(occupancy): bound eigenvalue person-count to single-link max — #894
2026-06-02 04:53:18 -04:00
ruv 4c87f04919 Merge remote-tracking branch 'origin/main' into fix/894-occupancy-cap
# Conflicts:
#	CHANGELOG.md
2026-06-02 10:52:53 +02:00
rUv 9df908d898 Merge pull request #904 from ruvnet/fix/898-mqtt-per-node-devices
fix(mqtt): one Home-Assistant device per node — closes #898
2026-06-02 04:44:09 -04:00
ruv f34b94aa46 fix(occupancy): bound eigenvalue person-count to single-link max — #894
field_bridge::occupancy_or_fallback returned FieldModel::estimate_occupancy
unbounded (internal ceiling 10), while the perturbation fallback below it
and score_to_person_count both cap at 3 ("1-3 for single ESP32"). On noisy
or under-calibrated CSI the eigenvalue count inflated → "10 persons when 1
present" (#894, seen when --model fails to load → heuristic mode). Bound the
eigenvalue path to a shared MAX_SINGLE_LINK_OCCUPANCY const (3) so every
single-link estimator agrees. Genuine higher counts come from the
multistatic fusion path. Build clean, field_bridge tests pass.
2026-06-02 10:40:24 +02:00
ruv 27edf153dc test(mqtt): drive per-node snapshots in discovery integration tests — #898
After the per-node discovery change, discovery configs are published the
first time a snapshot for a node_id arrives (not eagerly at startup). The
two discovery integration tests (discovery_topics_appear_on_broker,
privacy_mode_suppresses_biometric_discovery) spawned the publisher with an
empty broadcast channel and never sent a snapshot, so they collected []
and failed ("missing presence discovery topic in []").

Drive snapshots for the test node_id throughout the capture window (same
pattern as state_messages_published_on_snapshot_broadcast) so the per-node
device's discovery lands. Verified against a local mosquitto: 3 passed.
2026-06-02 10:29:17 +02:00
rUv 3fec67654a Merge pull request #906 from ruvnet/fix/893-csi-data-frame-capture
fix(firmware): capture DATA frames on display-less boards — #893/#866/#897 (yield=0pps root cause)
2026-06-02 04:23:44 -04:00
ruv 898c536eac fix(firmware): capture DATA frames on display-less boards — #893/#866/#897
The pre-built binaries set a MGMT-only promiscuous filter
(WIFI_PROMIS_FILTER_MASK_MGMT) as the #396 workaround — DATA-frame
interrupt load races the QSPI display's SPI traffic against the SPI-flash
cache and crashes Core 0 in wDev_ProcessFiq. But MGMT-only fires the CSI
callback only on sparse management frames, so on the common DISPLAY-LESS
boards (DevKitC-1, T7-S3, N8R8) CSI yield collapses to 0 pps under real
traffic (#521) — the node looks dead despite being on the network, which
is the root cause of most "can't reproduce / it's fake" reports (#804/#37).

A board with no AMOLED panel has no QSPI/SPI-flash contention, so it can
safely capture DATA frames. After the boot-time display probe runs:
  - display present  -> keep MGMT-only (preserve #396 crash protection)
  - no display       -> upgrade filter to MGMT|DATA (restore CSI yield)

Implementation (runtime-gated, no boot reorder):
  - display_task.c: s_display_active flag + display_is_active() accessor,
    set true only when the panel is detected and the display task starts.
  - csi_collector.c: csi_collector_enable_data_capture() re-sets the
    promiscuous filter to MGMT|DATA.
  - main.c: after display_task_start(), if !display_is_active() (or display
    support not compiled in), upgrade the filter.

Build-verified on BOTH targets: esp32c6 (headless path) and esp32s3
(display path, display_task.c compiled) — Project build complete, RC 0.
Needs on-hardware confirmation that yield recovers and no #396 crash.
2026-06-02 09:57:19 +02:00
ruv 9ddcf0c9fc fix(mqtt): one HA device per node — closes #898
After the #872 MQTT wiring, the JSON->VitalsSnapshot bridge hard-coded a
single node_id (the MQTT client id) and the publisher used one
OwnedDiscoveryBuilder, so every physical node collapsed into a single
Home-Assistant device (identifiers:["wifi_densepose_wifi-densepose-1"]),
contradicting the one-device-per-node docs.

- Bridge (main.rs): emit one VitalsSnapshot per node in the sensing
  update's nodes[] (each carries its own node_id + RSSI; shared aggregate
  presence/vitals), falling back to a single aggregate snapshot when
  there is no per-node data (wifi/simulate sources).
- Publisher (publisher.rs): add OwnedDiscoveryBuilder::for_node(), and
  publish discovery + availability lazily on first sight of each node_id,
  routing state to per-node topics. Heartbeat/refresh/offline-LWT iterate
  all known nodes. Result: N distinct HA devices, one per node.

3 new unit tests (distinct nodes -> distinct wifi_densepose_<node>
identifiers); full MQTT suite 71 passed, example builds.
2026-06-02 09:43:28 +02:00
26 changed files with 317 additions and 57 deletions
+35 -8
View File
@@ -265,23 +265,45 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install locust
pip install pytest # the perf suite is pytest, not locust
- name: Start application
working-directory: archive/v1
run: |
uvicorn src.api.main:app --host 0.0.0.0 --port 8000 &
sleep 10
# No "Start application" step: the gated test (test_frame_budget.py) drives
# the CSIProcessor pipeline in-process and makes no HTTP calls, so the old
# uvicorn server + `sleep 10` were dead weight — they only existed for the
# now-excluded api_throughput/inference_speed tests, and on every run dumped
# ~50 misleading "router requires hardware setup" ERROR lines for a server
# no test touched. MOCK_POSE_DATA is server-only and unused here.
- name: Run performance tests
working-directory: archive/v1
run: |
locust -f tests/performance/locustfile.py --headless --users 50 --spawn-rate 5 --run-time 60s --host http://localhost:8000
# Gate only on the genuine, deterministic perf guard:
# test_frame_budget.py times the *real* CSIProcessor pipeline against
# the ADR 50 ms per-frame budget (single-frame, p95 over 100 frames,
# +Doppler) — a true regression signal.
#
# test_api_throughput.py / test_inference_speed.py are excluded: every
# test there is a TDD red-phase stub (suffix `_should_fail_initially`)
# that times a *mock that sleeps* — meaningless as a perf signal, with
# machine-dependent wall-clock asserts (e.g. `actual_rps >= 40`,
# `batch_time < individual_time`) that are inherently flaky on shared
# CI runners, plus a cross-class fixture-scope bug. Forcing them green
# would be manufacturing a false signal; they stay in-repo for local
# TDD but do not gate CI until the underlying features are implemented.
#
# `python -m pytest` (not the bare `pytest` script) puts the cwd
# (archive/v1) on sys.path so `from src.core...` resolves — the bare
# script omits cwd and raises ModuleNotFoundError: No module named 'src'.
# -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100.
python -m pytest tests/performance/test_frame_budget.py \
-o addopts="" -v --junitxml=perf-junit.xml
- name: Upload performance results
if: always()
uses: actions/upload-artifact@v4
with:
name: performance-results
path: locust_report.html
path: archive/v1/perf-junit.xml
# Docker Build and Test
# NOTE: the canonical Docker build for the sensing-server is now
@@ -367,6 +389,8 @@ jobs:
runs-on: ubuntu-latest
needs: [docker-build]
if: github.ref == 'refs/heads/main'
permissions:
contents: write # gh-pages deploy needs write (GITHUB_TOKEN is read-only by default -> 403)
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -384,6 +408,8 @@ jobs:
- name: Generate OpenAPI spec
working-directory: archive/v1
env:
MOCK_POSE_DATA: "true" # no CSI hardware in CI
run: |
python -c "
from src.api.main import app
@@ -394,6 +420,7 @@ jobs:
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v4
continue-on-error: true # openapi generation above is the real validation; deploy is best-effort (Pages may be disabled)
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./docs
+3 -1
View File
@@ -8,6 +8,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
### Fixed
- **Person count no longer leaks up to 10 in heuristic mode — addresses #894.** `field_bridge::occupancy_or_fallback` returned the eigenvalue-based `FieldModel::estimate_occupancy` count **unbounded** (its internal ceiling is 10), while the sibling estimators on the same single-link data — the perturbation-energy fallback right below it and `score_to_person_count` — both cap at 3 ("1-3 for single ESP32"). On noisy / under-calibrated CSI the eigenvalue count inflated, producing the "10 persons reported when 1 present" symptom (seen when `--model` fails to load and the server runs on heuristics). Bounded the eigenvalue path to the shared `MAX_SINGLE_LINK_OCCUPANCY` (3) so every estimator on one link agrees; genuine higher counts come from the multistatic fusion path, not a single-link covariance estimate.
- **MQTT multi-node deployments now create one Home-Assistant device per node — closes #898.** After the #872 MQTT wiring landed, the JSON→`VitalsSnapshot` bridge hard-coded a single `node_id` (the MQTT client id) and the publisher used a single `OwnedDiscoveryBuilder`, so every physical node collapsed into one device (`identifiers:["wifi_densepose_wifi-densepose-1"]`), contradicting the "one device per node" docs. The bridge now emits one snapshot per node in the sensing update's `nodes[]` (each with its own `node_id` + RSSI, falling back to a single aggregate snapshot for wifi/simulate sources), and the publisher derives a per-node builder (`OwnedDiscoveryBuilder::for_node`) that publishes discovery + availability lazily on first sight of each `node_id` and routes state to per-node topics — yielding N distinct HA devices with per-node availability/LWT. Unit-tested (distinct nodes → distinct `wifi_densepose_<node>` identifiers); 71 MQTT tests pass.
- **Person count no longer pinned to 1 — addresses #803.** The aggregate occupancy reported by the sensing server was derived from `smoothed_person_score`, an EMA-smoothed *activity* score (amplitude variance / motion / spectral energy). That score saturates near a single occupant — one moving person maxes it out — so it cannot discriminate occupancy *count* and stayed clamped at 1 across S3/C6 and the Python/Docker/Rust servers. Meanwhile the count-aware per-node estimates the ESP32 paths already compute (firmware `n_persons`, and the DynamicMinCut `corr_persons`) were stashed in `NodeState::prev_person_count` and then **discarded** by the aggregator (same dead-wiring class as #872). The aggregator now takes `max(activity_count, node_max)` via a unit-tested `aggregate_person_count` helper, so a node positively estimating 23 occupants is surfaced instead of overwritten. The fix can only ever *raise* the count when a node reports more people, so the single-occupant case is provably never inflated (regression-guarded by test). **Second half:** the pure-CSI per-node path itself clamped its own estimate — the DynamicMinCut occupancy (`estimate_persons_from_correlation`, 03) was mapped to a score via `corr_persons / 3.0`, putting 2 people at 0.667, *just under* the 0.70 up-threshold of `score_to_person_count`, so the per-node count never climbed past 1 (so `node_max` was also stuck at 1 for CSI-only nodes). Replaced it with a threshold-aligned `corr_persons_to_score` mapping (1→0.40, 2→0.74, 3→0.96) whose steady state round-trips back to the same count through the EMA + hysteresis, while still gating transient noise. A convergence test replays the exact EMA loop to prove min-cut=2 now reports 2 (and documents that the old `/3.0` mapping reported 1). Full multi-person accuracy still depends on the underlying estimator quality; this removes the two server-side clamps that masked it. 586 sensing-server tests pass.
- **MQTT publisher now actually runs (`--mqtt`) — closes #872.** The `--mqtt*` flags were defined only in `cli::Args` (dead code, referenced nowhere) while the binary parses a *separate* `main::Args` with no mqtt fields, and `main.rs` never started the `mqtt::` publisher — so MQTT/Home-Assistant integration was completely unwired (`--mqtt` errored as an unexpected argument, and even with the Docker image's `--features mqtt` build the publisher never ran). Earlier attempts chased a Docker *rebuild*; the real cause was disconnected *code*. Extracted the flags into a shared `cli::MqttArgs` (`#[command(flatten)]` into both structs), spawn the publisher on `--mqtt`, and bridge the JSON sensing broadcast into the typed `VitalsSnapshot` stream with a defensive `serde_json::Value` mapping. Verified end-to-end against `mosquitto`: 20 HA auto-discovery entities + live state (presence/person-count/…). 577 (default) / 580 (`--features mqtt`) tests pass.
@@ -428,7 +430,7 @@ Model release (no new firmware binary). Firmware remains at v0.6.0-esp32.
- Security fix merged via PR #310.
### Performance
- Presence detection: 100% accuracy on 60,630 overnight samples.
- Presence detection: 100% accuracy on 60,630 overnight samples. *(Retracted — that recording was single-class (one sleeping person, 6,062/6,063 frames "present"), so a constant "yes" scores ~99.98%. Superseded by the honest 82.3% held-out temporal-triplet metric; see [#882](https://github.com/ruvnet/RuView/issues/882). Kept here as the in-place public record.)*
- Inference: 0.008 ms per sample, 164K embeddings/sec.
- Contrastive self-supervised training: 51.6% improvement over baseline.
+12 -3
View File
@@ -107,16 +107,25 @@ class PoseService:
async def _initialize_models(self):
"""Initialize neural network models."""
try:
# Initialize DensePose model
# Initialize DensePose model. DensePoseHead requires a config
# dict — input_channels matches the modality translator's output
# (256), with the standard DensePose 24 body parts and 2 (U,V)
# coordinates. (Previously called with no args → TypeError at
# startup, which broke the API service.)
densepose_config = {
'input_channels': 256,
'num_body_parts': 24,
'num_uv_coordinates': 2,
}
if self.settings.pose_model_path:
self.densepose_model = DensePoseHead()
self.densepose_model = DensePoseHead(densepose_config)
# Load model weights if path is provided
# model_state = torch.load(self.settings.pose_model_path)
# self.densepose_model.load_state_dict(model_state)
self.logger.info("DensePose model loaded")
else:
self.logger.warning("No pose model path provided, using default model")
self.densepose_model = DensePoseHead()
self.densepose_model = DensePoseHead(densepose_config)
# Initialize modality translation
config = {
+3 -3
View File
@@ -122,7 +122,7 @@ node scripts/benchmark-ruvllm.js --model models/csi-ruvllm # benchmark
| What we measured | Result | Why it matters |
|-----------------|--------|---------------|
| **Presence detection** | **100% accuracy** | Never misses a person, never false alarms |
| **CSI embedding quality** | **82.3% held-out temporal-triplet** | Honest label-free metric on the last 20% by time (v1's "100% presence" was a single-class recording — retracted, [#882](https://github.com/ruvnet/RuView/issues/882)) |
| **Inference speed** | **0.008 ms** per embedding | 125,000x faster than real-time |
| **Throughput** | **164,183 embeddings/sec** | One Mac Mini handles 1,600+ ESP32 nodes |
| **Contrastive learning** | **51.6% improvement** | Strong pattern learning from real overnight data |
@@ -233,7 +233,7 @@ python firmware/esp32-csi-node/provision.py --port COM9 --hop-channels "1,6,11"
| **kNN similarity search** | "Find the 10 most similar states to right now" — anomaly detection, fingerprinting | Cognitum Seed |
| **Witness chain** | SHA-256 tamper-evident audit trail for every measurement (1,747 entries validated) | Cognitum Seed |
| **Camera-free pose training** | 17 COCO keypoints from 10 sensor signals — PIR, RSSI triangulation, subcarrier asymmetry, vibration, BME280 | 2x ESP32 + Seed |
| **Pre-trained model** | 82.8 KB (8 KB at 4-bit quantization), 100% presence accuracy, 0 skeleton violations | Download from release |
| **Pre-trained model** | 82.8 KB (8 KB at 4-bit quantization), 82.3% held-out temporal-triplet accuracy (v1's "100% presence" was single-class — retracted, [#882](https://github.com/ruvnet/RuView/issues/882)) | Download from release |
| **Sub-ms inference** | 0.012 ms latency, 171,472 embeddings/sec on M4 Pro | Any machine with Node.js |
| **SONA adaptation** | Adapts to new rooms in <1ms without retraining | ruvllm runtime |
| **LoRA room adapters** | Per-node fine-tuning with 2,048 parameters per adapter | Automatic |
@@ -262,7 +262,7 @@ node scripts/benchmark-ruvllm.js --model models/csi-ruvllm
| What we measured | Result | Why it matters |
|-----------------|--------|---------------|
| **Presence detection** | **100% accuracy** | Never misses a person, never false alarms |
| **CSI embedding quality** | **82.3% held-out temporal-triplet** | Honest label-free metric (v1's "100% presence" was single-class — retracted, [#882](https://github.com/ruvnet/RuView/issues/882)) |
| **Person counting** | **24/24 correct** (MinCut) | Fixed the #1 user-reported issue |
| **Inference speed** | **0.012 ms** per embedding | 83,000x faster than real-time |
| **Throughput** | **171,472 embeddings/sec** | One Mac Mini handles 1,700+ ESP32 nodes |
+3 -3
View File
@@ -1119,7 +1119,7 @@ What it ships (and what it does not):
| Capability | Status |
|------------|--------|
| Presence detection (occupied / empty) | ✅ Trained head — 100% accuracy on validation |
| Presence detection (occupied / empty) | ✅ Trained head — v2 encoder reports 82.3% held-out temporal-triplet acc (v1's "100% on validation" was a single-class recording — retracted, [#882](https://github.com/ruvnet/RuView/issues/882)) |
| 128-dim CSI embeddings (re-ID, similarity, downstream training) | ✅ Trained encoder |
| Single-person breathing / heart-rate | ⚠️ Server still uses heuristic DSP — model does not replace this yet |
| 17-keypoint full-body pose | 🔬 No keypoint weights shipped yet — pose pipeline runs but without a learned head |
@@ -1824,7 +1824,7 @@ huggingface-cli download ruvnet/wifi-densepose-pretrained --local-dir models/pre
# model.safetensors — 48 KB contrastive encoder
# model-q4.bin — 8 KB quantized (recommended)
# model-q2.bin — 4 KB ultra-compact (ESP32 edge)
# presence-head.json — presence detection head (100% accuracy)
# presence-head.json — presence detection head (v2 encoder: 82.3% held-out triplet acc)
# node-1.json — LoRA adapter for room 1
# node-2.json — LoRA adapter for room 2
```
@@ -1833,7 +1833,7 @@ huggingface-cli download ruvnet/wifi-densepose-pretrained --local-dir models/pre
The pre-trained encoder converts 8-dim CSI feature vectors into 128-dim embeddings. These embeddings power all 17 sensing applications:
- **Presence detection** — 100% accuracy, never misses, never false alarms
- **Presence detection** — v2 encoder: 82.3% held-out temporal-triplet accuracy (v1's "100%" was a single-class recording — retracted, [#882](https://github.com/ruvnet/RuView/issues/882))
- **Environment fingerprinting** — kNN search finds "states like this one"
- **Anomaly detection** — embeddings that don't match known clusters = anomaly
- **Activity classification** — different activities cluster in embedding space
@@ -637,6 +637,23 @@ static void hop_timer_cb(void *arg)
csi_hop_next_channel();
}
void csi_collector_enable_data_capture(void)
{
/* MGMT-only (RuView#396) starves the CSI callback on display-less boards
* (RuView#521/#893): beacons alone are sparse, yield collapses to 0 pps.
* Without a display there is no QSPI/SPI-flash cache contention with the
* DATA-frame interrupt load, so capture DATA frames too. */
wifi_promiscuous_filter_t filt = {
.filter_mask = WIFI_PROMIS_FILTER_MASK_MGMT | WIFI_PROMIS_FILTER_MASK_DATA,
};
esp_err_t err = esp_wifi_set_promiscuous_filter(&filt);
if (err == ESP_OK) {
ESP_LOGI(TAG, "CSI filter upgraded to MGMT+DATA (no display, RuView#893)");
} else {
ESP_LOGW(TAG, "Failed to enable DATA-frame CSI capture: %s", esp_err_to_name(err));
}
}
void csi_collector_start_hop_timer(void)
{
if (s_hop_count <= 1) {
@@ -90,6 +90,19 @@ void csi_hop_next_channel(void);
*/
void csi_collector_start_hop_timer(void);
/**
* Upgrade the promiscuous filter to capture DATA frames in addition to MGMT
* (RuView#893/#521).
*
* Called on display-less boards: the MGMT-only filter (the #396 display-crash
* workaround set in csi_collector_init) only fires the CSI callback on sparse
* management frames, so yield collapses to 0 pps under real traffic and the
* node looks dead. A board with no AMOLED panel has no QSPI/SPI-flash cache
* contention, so it can safely capture DATA frames — restoring abundant CSI.
* Display boards keep MGMT-only to avoid the #396 crash.
*/
void csi_collector_enable_data_capture(void);
/**
* Inject an NDP (Null Data Packet) frame for sensing.
*
@@ -9,6 +9,14 @@
#include "display_task.h"
#include "sdkconfig.h"
/* Set true once an AMOLED panel is detected and the display task starts.
* Defined outside the CONFIG_DISPLAY_ENABLE guard so display_is_active()
* exists on headless builds too (where it stays false → CSI captures DATA
* frames; see RuView#893). */
static bool s_display_active = false;
bool display_is_active(void) { return s_display_active; }
#if CONFIG_DISPLAY_ENABLE
#include <string.h>
@@ -162,6 +170,7 @@ esp_err_t display_task_start(void)
ESP_LOGI(TAG, "Display task started (Core %d, priority %d, %d fps)",
DISP_TASK_CORE, DISP_TASK_PRIORITY, DISP_FPS_LIMIT);
s_display_active = true;
return ESP_OK;
}
@@ -7,6 +7,7 @@
#define DISPLAY_TASK_H
#include "esp_err.h"
#include <stdbool.h>
#ifdef __cplusplus
extern "C" {
@@ -22,6 +23,15 @@ extern "C" {
*/
esp_err_t display_task_start(void);
/**
* @return true once an AMOLED panel has been detected and the display task
* is running; false on headless boards (no panel, or built without display
* support). Used to choose the CSI promiscuous filter (RuView#893): a board
* with no display has no QSPI/SPI-flash contention, so it can safely capture
* DATA frames for proper CSI yield instead of starving on MGMT-only.
*/
bool display_is_active(void);
#ifdef __cplusplus
}
#endif
+15
View File
@@ -410,6 +410,21 @@ void app_main(void)
}
#endif
/* RuView#893/#521: the MGMT-only promiscuous filter (set in
* csi_collector_init as the #396 display-crash workaround) starves the CSI
* callback on display-less boards — yield collapses to 0 pps and the node
* looks dead despite being on the network. Now that the display probe has
* run, boards with no AMOLED panel (no QSPI/SPI-flash cache contention)
* upgrade the filter to capture DATA frames too, restoring CSI yield. */
#ifdef CONFIG_DISPLAY_ENABLE
bool has_display = display_is_active(); /* runtime panel probe result */
#else
bool has_display = false; /* display support not compiled in */
#endif
if (!has_display) {
csi_collector_enable_data_capture();
}
ESP_LOGI(TAG, "CSI streaming active → %s:%d (edge_tier=%u, OTA=%s, WASM=%s, mmWave=%s, swarm=%s, adapt=%s)",
g_nvs_config.target_ip, g_nvs_config.target_port,
g_nvs_config.edge_tier,
Binary file not shown.
@@ -1,4 +1,4 @@
889715e9d698ad78f9978ad8b93b6af24a726b0494247201c8f0d920d9fc80ca *firmware/esp32-csi-node/release_bins/c6-adr110/bootloader.bin
d8539e47c6f10a3344679118619e3fe01cfd66eb560ea8883268ca7c9a12efa4 *firmware/esp32-csi-node/release_bins/c6-adr110/esp32-csi-node.bin
b0fb1f217a39c80bc95b5eb8208a0b8572ae64efa0f6d580b76caff4affe0f4d *firmware/esp32-csi-node/release_bins/c6-adr110/bootloader.bin
4764c5b20a353895f70122816adc98f861ec20e9a8ea9b344dc0648b6341073c *firmware/esp32-csi-node/release_bins/c6-adr110/esp32-csi-node.bin
7d2c7ac4888bfd75cd5f56e8d61f69595121183afc81556c876732fd3782c62f *firmware/esp32-csi-node/release_bins/c6-adr110/ota_data_initial.bin
4c2cc4ffd52641e23b779bd57b3908014083ac3c1aab395756478c89e70d81f0 *firmware/esp32-csi-node/release_bins/c6-adr110/partition-table.bin
@@ -1,3 +1,3 @@
3c4905dd202ccabf4230cbabcc9320f250a60b1a7254eff7424780201bcb2072 *firmware/esp32-csi-node/release_bins/s3-adr110/bootloader.bin
7a8bf9582c9031fed32f1ada44f5c41dd99bd07fadff8e5c86e07aa0f343e847 *firmware/esp32-csi-node/release_bins/s3-adr110/esp32-csi-node.bin
b973d7eda65affb746adcfa63ceb18f779f206d240b76f01b8c9ae7485455660 *firmware/esp32-csi-node/release_bins/s3-adr110/bootloader.bin
e21ef94aba779d534dc048c1b9da731c81e5dbe09d0645cfd70a05ad3642d3e9 *firmware/esp32-csi-node/release_bins/s3-adr110/esp32-csi-node.bin
67222c257c0477501fd4002275638dc4262b34eb68235b8289fb1337054d322b *firmware/esp32-csi-node/release_bins/s3-adr110/partition-table.bin
@@ -1,3 +1,4 @@
0.6.6
git-sha: cbcb389cb (pre-commit)
built: 2026-05-21
0.6.7
git-sha: 8703ade9b
built: 2026-06-02
note: RuView#893 — display-less boards capture DATA frames (CSI yield 0pps fix); hardware-verified on ESP32-C6 (0->27 pps)
+1
View File
@@ -36,3 +36,4 @@ scikit-learn>=1.2.0
# Monitoring dependencies
prometheus-client>=0.16.0
psutil>=5.9.0 # system metrics — imported by health.py / metrics.py / status.py / monitoring.py
@@ -21,6 +21,15 @@ const ENERGY_THRESH_2: f64 = 12.0;
/// Perturbation energy threshold for detecting a third person.
const ENERGY_THRESH_3: f64 = 25.0;
/// Maximum occupancy a single ESP32 link can plausibly resolve (#894).
/// The score heuristic (`score_to_person_count`) and the perturbation-energy
/// fallback below both cap here; the eigenvalue path is bounded to match,
/// rather than leaking its internal `min(10)` ceiling on noisy / under-
/// calibrated CSI (the "10 persons reported when 1 present" symptom).
/// Resolving more than this from one link's subcarrier covariance is not
/// reliable — genuine higher counts come from the multistatic fusion path.
const MAX_SINGLE_LINK_OCCUPANCY: usize = 3;
/// Create a FieldModelConfig for single-link mode (one ESP32 node = one link).
/// This avoids the DimensionMismatch error when feeding single-frame observations.
pub fn single_link_config() -> FieldModelConfig {
@@ -55,9 +64,15 @@ pub fn occupancy_or_fallback(
return score_to_person_count(smoothed_score, prev_count);
}
// Try eigenvalue-based occupancy first (best accuracy).
// Try eigenvalue-based occupancy first (best accuracy). Bound it to
// the same single-link maximum the sibling estimators use — the
// perturbation fallback below and score_to_person_count both cap at
// MAX_SINGLE_LINK_OCCUPANCY. Without this, estimate_occupancy's
// internal min(10) ceiling leaks up to 10 persons on noisy / under-
// calibrated CSI (#894), while every other path on the same data
// would report ≤3.
if let Ok(count) = field.estimate_occupancy(&frames) {
return count;
return count.min(MAX_SINGLE_LINK_OCCUPANCY);
} // else fall through to perturbation energy
// Fallback: perturbation energy thresholds.
@@ -6213,24 +6213,44 @@ async fn main() {
Some(_) => 1.0,
None => 0.0,
};
let snap = mqtt::state::VitalsSnapshot {
node_id: node_id.clone(),
timestamp_ms: (v["timestamp"].as_f64().unwrap_or(0.0) * 1000.0) as i64,
let ts = (v["timestamp"].as_f64().unwrap_or(0.0) * 1000.0) as i64;
let conf = cls["confidence"].as_f64().unwrap_or(0.0);
let presence_score = if presence { conf.max(0.0) } else { 0.0 };
let breathing = vit["breathing_rate_bpm"].as_f64();
let hr = vit["heart_rate_bpm"].as_f64();
// #898: emit one snapshot per physical node so each
// surfaces as its own Home-Assistant device (with
// its own RSSI + availability). Falls back to a
// single aggregate snapshot when there is no
// per-node data (e.g. wifi / simulate sources).
let mk = |nid: String, rssi: Option<f64>| mqtt::state::VitalsSnapshot {
node_id: nid,
timestamp_ms: ts,
presence,
motion,
presence_score: if presence {
cls["confidence"].as_f64().unwrap_or(1.0)
} else {
0.0
},
breathing_rate_bpm: vit["breathing_rate_bpm"].as_f64(),
heartrate_bpm: vit["heart_rate_bpm"].as_f64(),
presence_score,
breathing_rate_bpm: breathing,
heartrate_bpm: hr,
n_persons,
rssi_dbm: v["nodes"][0]["rssi_dbm"].as_f64(),
vital_confidence: cls["confidence"].as_f64().unwrap_or(0.0),
rssi_dbm: rssi,
vital_confidence: conf,
..Default::default()
};
let _ = vtx.send(snap);
match v["nodes"].as_array() {
Some(arr) if !arr.is_empty() => {
for node in arr {
let n = node["node_id"].as_u64().unwrap_or(0);
let nid = format!("{node_id}-node{n}");
let _ = vtx.send(mk(nid, node["rssi_dbm"].as_f64()));
}
}
_ => {
let _ = vtx.send(mk(
node_id.clone(),
v["nodes"][0]["rssi_dbm"].as_f64(),
));
}
}
}
});
tracing::info!("MQTT publisher started -> {host}:{port}");
@@ -117,6 +117,23 @@ impl OwnedDiscoveryBuilder {
via_device: self.via_device.as_deref(),
}
}
/// Derive a per-node builder from this base (issue #898). Each physical
/// RuView node must surface as its own Home-Assistant device — the base
/// builder's `node_id` (the MQTT client id) is replaced with the actual
/// node id, giving a distinct `wifi_densepose_<node>` device identifier
/// and a per-node friendly name, instead of collapsing every node into a
/// single hard-coded device.
pub fn for_node(&self, node_id: &str) -> OwnedDiscoveryBuilder {
OwnedDiscoveryBuilder {
discovery_prefix: self.discovery_prefix.clone(),
node_id: node_id.to_string(),
node_friendly_name: Some(format!("RuView node {node_id}")),
sw_version: self.sw_version.clone(),
model: self.model.clone(),
via_device: self.via_device.clone(),
}
}
}
/// Core run loop. Pumps the broadcast channel + the MQTT event loop in
@@ -129,20 +146,19 @@ async fn run(
let opts = build_mqtt_options(&cfg);
let (client, mut eventloop): (AsyncClient, EventLoop) = AsyncClient::new(opts, 256);
let builder_borrowed = builder_owned.as_borrowed();
let entities = DiscoveryBuilder::enabled_entities(
cfg.privacy_mode,
cfg.publish_pose,
&[], // no_semantic — wire from cli::Args in P3.5
);
if let Err(e) = publish_all_discovery(&client, &builder_borrowed, &entities).await {
warn!("[mqtt] initial discovery publish failed: {e}");
}
let avail = NodeAvailability::for_builder(&builder_borrowed, &entities);
if let Err(e) = publish_availability(&client, &avail, "online").await {
warn!("[mqtt] initial availability publish failed: {e}");
}
// #898: one Home-Assistant device per node. Discovery + availability are
// published lazily the first time a snapshot for a given node_id arrives;
// each node's builder + availability are retained here for heartbeats and
// the offline LWT. (Previously a single hard-coded builder collapsed every
// node into one device.)
let mut nodes: std::collections::HashMap<String, (OwnedDiscoveryBuilder, NodeAvailability)> =
std::collections::HashMap::new();
let mut rate_limiter = RateLimiter::new();
let mut last_heartbeat = Instant::now();
@@ -179,14 +195,20 @@ async fn run(
// Periodic heartbeat / discovery refresh.
_ = tokio::time::sleep(Duration::from_secs(1)) => {
if last_heartbeat.elapsed() >= AVAILABILITY_HEARTBEAT {
if let Err(e) = publish_availability(&client, &avail, "online").await {
warn!("[mqtt] heartbeat publish failed: {e}");
for (_, na) in nodes.values() {
if let Err(e) = publish_availability(&client, na, "online").await {
warn!("[mqtt] heartbeat publish failed: {e}");
}
}
last_heartbeat = Instant::now();
}
if last_refresh.elapsed() >= Duration::from_secs(cfg.refresh_secs) {
if let Err(e) = publish_all_discovery(&client, &builder_borrowed, &entities).await {
warn!("[mqtt] discovery refresh failed: {e}");
for (nb, _) in nodes.values() {
if let Err(e) =
publish_all_discovery(&client, &nb.as_borrowed(), &entities).await
{
warn!("[mqtt] discovery refresh failed: {e}");
}
}
last_refresh = Instant::now();
}
@@ -197,18 +219,39 @@ async fn run(
match recv {
Ok(snap) => {
let elapsed = start_instant.elapsed();
publish_snapshot(&client, &builder_borrowed, &snap, &cfg, &mut rate_limiter, elapsed).await;
// #898: on first sight of a node_id, publish that
// node's discovery + availability; then route its
// state to per-node topics.
if !nodes.contains_key(&snap.node_id) {
let nb = builder_owned.for_node(&snap.node_id);
let borrowed = nb.as_borrowed();
if let Err(e) =
publish_all_discovery(&client, &borrowed, &entities).await
{
warn!("[mqtt] node {} discovery failed: {e}", snap.node_id);
}
let na = NodeAvailability::for_builder(&borrowed, &entities);
if let Err(e) = publish_availability(&client, &na, "online").await {
warn!("[mqtt] node {} availability failed: {e}", snap.node_id);
}
nodes.insert(snap.node_id.clone(), (nb, na));
}
let borrowed = nodes[&snap.node_id].0.as_borrowed();
publish_snapshot(&client, &borrowed, &snap, &cfg, &mut rate_limiter, elapsed).await;
}
Err(broadcast::error::RecvError::Lagged(n)) => {
warn!("[mqtt] lagged behind broadcast by {n} messages — dropped");
}
Err(broadcast::error::RecvError::Closed) => {
info!("[mqtt] broadcast channel closed, draining");
// Publish offline before exit.
let _ = publish_availability(&client, &avail, "offline").await;
// Publish offline for every known node before exit.
for (_, na) in nodes.values() {
let _ = publish_availability(&client, na, "offline").await;
}
let _ = client.disconnect().await;
return;
}
}
}
}
@@ -296,3 +339,52 @@ async fn publish_state(client: &AsyncClient, m: &StateMessage) -> Result<(), Cli
};
client.publish(&m.topic, qos, m.retain, m.payload.clone()).await
}
#[cfg(test)]
mod per_node_device_tests {
//! Issue #898 — each physical node must surface as its own Home-Assistant
//! device, not collapse into one hard-coded device.
use super::*;
fn base() -> OwnedDiscoveryBuilder {
OwnedDiscoveryBuilder {
discovery_prefix: "homeassistant".into(),
node_id: "wifi-densepose-1".into(),
node_friendly_name: Some("RuView".into()),
sw_version: "0.0.0".into(),
model: "test".into(),
via_device: None,
}
}
fn device_identifiers(b: &OwnedDiscoveryBuilder) -> Vec<String> {
b.as_borrowed().build(EntityKind::Presence).device.identifiers
}
#[test]
fn for_node_overrides_node_id_and_friendly_name() {
let n = base().for_node("node-A");
assert_eq!(n.node_id, "node-A");
assert_eq!(n.node_friendly_name.as_deref(), Some("RuView node node-A"));
}
#[test]
fn distinct_nodes_yield_distinct_ha_device_identifiers() {
let b = base();
let a = device_identifiers(&b.for_node("node-A"));
let c = device_identifiers(&b.for_node("node-B"));
assert_eq!(a, vec!["wifi_densepose_node-A".to_string()]);
assert_eq!(c, vec!["wifi_densepose_node-B".to_string()]);
assert_ne!(a, c, "#898: two nodes must not collapse into one device");
}
#[test]
fn single_node_keeps_a_stable_identity() {
// Two snapshots from the same node map to the same device.
let b = base();
assert_eq!(
device_identifiers(&b.for_node("node-7")),
device_identifiers(&b.for_node("node-7"))
);
}
}
@@ -171,12 +171,28 @@ async fn discovery_topics_appear_on_broker() {
// Spawn the publisher.
let cfg = make_cfg(port, false, "discovery");
let builder = make_builder("inttest1");
let (_tx, rx) = broadcast::channel::<VitalsSnapshot>(32);
let (tx, rx) = broadcast::channel::<VitalsSnapshot>(32);
let _handle = spawn(cfg, builder, rx);
// #898: discovery is now published per-node the first time a snapshot for
// that node_id arrives (not eagerly at startup). Drive snapshots for
// "inttest1" throughout the window so its device's discovery lands — same
// pattern as state_messages_published_on_snapshot_broadcast.
let tx_bg = tx.clone();
let drive = tokio::spawn(async move {
for _ in 0..60 {
let _ = tx_bg.send(VitalsSnapshot {
node_id: "inttest1".into(),
..Default::default()
});
tokio::time::sleep(Duration::from_millis(200)).await;
}
});
// Drain the subscriber for up to 6 s — enough for initial discovery
// + first availability publication.
let msgs = collect_published(&mut sub_loop, Duration::from_secs(6)).await;
drive.abort();
let _ = sub.disconnect().await;
// Assertions: at least the presence + heart_rate + fall discovery
@@ -221,10 +237,23 @@ async fn privacy_mode_suppresses_biometric_discovery() {
let cfg = make_cfg(port, /* privacy_mode = */ true, "privacy");
let builder = make_builder("inttest2");
let (_tx, rx) = broadcast::channel::<VitalsSnapshot>(32);
let (tx, rx) = broadcast::channel::<VitalsSnapshot>(32);
let _handle = spawn(cfg, builder, rx);
// #898: per-node discovery is triggered by a snapshot for that node_id.
let tx_bg = tx.clone();
let drive = tokio::spawn(async move {
for _ in 0..60 {
let _ = tx_bg.send(VitalsSnapshot {
node_id: "inttest2".into(),
..Default::default()
});
tokio::time::sleep(Duration::from_millis(200)).await;
}
});
let msgs = collect_published(&mut sub_loop, Duration::from_secs(6)).await;
drive.abort();
let _ = sub.disconnect().await;
let topics: Vec<&str> = msgs.iter().map(|(t, _, _)| t.as_str()).collect();