Mirror of https://github.com/ruvnet/RuView, synced 2026-06-09 10:13:17 +00:00
Compare commits
5 Commits
| Author | SHA1 | Date | |
|---|---|---|---|
| | c353255672 | | |
| | 872d7593bb | | |
| | 2c136aca74 | | |
| | 69e61e3437 | | |
| | d9e87e13b4 | | |
@@ -46,7 +46,10 @@ jobs:

     - name: Run Bandit security scan
       run: |
-        bandit -r src/ -f sarif -o bandit-results.sarif
+        # The Python codebase lives under archive/v1/src (it moved there when
+        # the runtime was rewritten in Rust). Scanning `src/` matched nothing,
+        # so this SAST step was a silent no-op.
+        bandit -r archive/v1/src/ -f sarif -o bandit-results.sarif
       continue-on-error: true

     - name: Upload Bandit results to GitHub Security
@@ -57,22 +60,20 @@ jobs:
         sarif_file: bandit-results.sarif
         category: bandit

-    - name: Run Semgrep security scan
-      continue-on-error: true
-      uses: returntocorp/semgrep-action@v1
-      with:
-        config: >-
-          p/security-audit
-          p/secrets
-          p/python
-          p/docker
-          p/kubernetes
-      env:
-        SEMGREP_APP_TOKEN: ${{ secrets.SEMGREP_APP_TOKEN }}
-
-    - name: Generate Semgrep SARIF
+    # Removed the deprecated `returntocorp/semgrep-action@v1` step: it was
+    # redundant (the pip `semgrep --sarif` below is what feeds GitHub Security;
+    # the action only pushed to the Semgrep cloud app via SEMGREP_APP_TOKEN) and
+    # it pulled `returntocorp/semgrep-agent:v1` from Docker Hub on every run,
+    # which intermittently timed out and turned this check red. The pip semgrep
+    # (installed above) needs no Docker pull. The action's `p/docker` +
+    # `p/kubernetes` rulesets are folded into the command below so coverage is
+    # preserved.
+    - name: Run Semgrep + generate SARIF
       run: |
-        semgrep --config=p/security-audit --config=p/secrets --config=p/python --sarif --output=semgrep.sarif src/
+        semgrep \
+          --config=p/security-audit --config=p/secrets --config=p/python \
+          --config=p/docker --config=p/kubernetes \
+          --sarif --output=semgrep.sarif archive/v1/src/
       continue-on-error: true

     - name: Upload Semgrep results to GitHub Security

@@ -12,6 +12,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - **MQTT multi-node deployments now create one Home-Assistant device per node — closes #898.** After the #872 MQTT wiring landed, the JSON→`VitalsSnapshot` bridge hard-coded a single `node_id` (the MQTT client id) and the publisher used a single `OwnedDiscoveryBuilder`, so every physical node collapsed into one device (`identifiers:["wifi_densepose_wifi-densepose-1"]`), contradicting the "one device per node" docs. The bridge now emits one snapshot per node in the sensing update's `nodes[]` (each with its own `node_id` + RSSI, falling back to a single aggregate snapshot for wifi/simulate sources), and the publisher derives a per-node builder (`OwnedDiscoveryBuilder::for_node`) that publishes discovery + availability lazily on first sight of each `node_id` and routes state to per-node topics — yielding N distinct HA devices with per-node availability/LWT. Unit-tested (distinct nodes → distinct `wifi_densepose_<node>` identifiers); 71 MQTT tests pass.
 - **Person count no longer pinned to 1 — addresses #803.** The aggregate occupancy reported by the sensing server was derived from `smoothed_person_score`, an EMA-smoothed *activity* score (amplitude variance / motion / spectral energy). That score saturates near a single occupant — one moving person maxes it out — so it cannot discriminate occupancy *count* and stayed clamped at 1 across S3/C6 and the Python/Docker/Rust servers. Meanwhile the count-aware per-node estimates the ESP32 paths already compute (firmware `n_persons`, and the DynamicMinCut `corr_persons`) were stashed in `NodeState::prev_person_count` and then **discarded** by the aggregator (same dead-wiring class as #872). The aggregator now takes `max(activity_count, node_max)` via a unit-tested `aggregate_person_count` helper, so a node positively estimating 2–3 occupants is surfaced instead of overwritten. The fix can only ever *raise* the count when a node reports more people, so the single-occupant case is provably never inflated (regression-guarded by test). **Second half:** the pure-CSI per-node path itself clamped its own estimate — the DynamicMinCut occupancy (`estimate_persons_from_correlation`, 0–3) was mapped to a score via `corr_persons / 3.0`, putting 2 people at 0.667, *just under* the 0.70 up-threshold of `score_to_person_count`, so the per-node count never climbed past 1 (so `node_max` was also stuck at 1 for CSI-only nodes). Replaced it with a threshold-aligned `corr_persons_to_score` mapping (1→0.40, 2→0.74, 3→0.96) whose steady state round-trips back to the same count through the EMA + hysteresis, while still gating transient noise. A convergence test replays the exact EMA loop to prove min-cut=2 now reports 2 (and documents that the old `/3.0` mapping reported 1). Full multi-person accuracy still depends on the underlying estimator quality; this removes the two server-side clamps that masked it. 586 sensing-server tests pass.
 - **MQTT publisher now actually runs (`--mqtt`) — closes #872.** The `--mqtt*` flags were defined only in `cli::Args` (dead code, referenced nowhere) while the binary parses a *separate* `main::Args` with no mqtt fields, and `main.rs` never started the `mqtt::` publisher — so MQTT/Home-Assistant integration was completely unwired (`--mqtt` errored as an unexpected argument, and even with the Docker image's `--features mqtt` build the publisher never ran). Earlier attempts chased a Docker *rebuild*; the real cause was disconnected *code*. Extracted the flags into a shared `cli::MqttArgs` (`#[command(flatten)]` into both structs), spawn the publisher on `--mqtt`, and bridge the JSON sensing broadcast into the typed `VitalsSnapshot` stream with a defensive `serde_json::Value` mapping. Verified end-to-end against `mosquitto`: 20 HA auto-discovery entities + live state (presence/person-count/…). 577 (default) / 580 (`--features mqtt`) tests pass.
+- **Mass Casualty triage never reports a survivor with a heartbeat as Deceased (safety) — PR #926.** Both triage paths in `wifi-densepose-mat` — `TriageCalculator::calculate` (`combine_assessments(Absent, None) ⇒ Deceased`) and the detection path `EnsembleClassifier::determine_triage` (`!has_breathing && !has_movement ⇒ Deceased`) — ignored the `heartbeat` field. A survivor with a detectable **pulse** but no sensed breathing/movement (respiratory arrest — the most time-critical *savable* state, Immediate/Red) was therefore reported **Deceased (Black)** and deprioritized for rescue. The domain path was in fact only reachable *because* a heartbeat made `has_vitals()` true, so every "Deceased" was a live person. Both paths now escalate to **Immediate** when a heartbeat is present; total absence of breathing, movement *and* heartbeat is unchanged (domain → `Unknown`, ensemble → `Deceased`). 2 safety regression tests; full MAT suite (177) green.
+- **Per-node Home-Assistant devices now report each node's *own* presence/motion — PR #918.** After the one-device-per-node fan-out landed, the MQTT bridge still applied the *room-level aggregate* `classification` to every node, so in a multi-node deployment a node watching an empty corner inherited another node's "present" (and `motion_level: "absent"` was mis-mapped to full motion). Each node in the broadcast `nodes[]` already carries its own `classification`; the bridge now reads it per node (extracted into a testable `vitals_snapshots_from_sensing_json`), keeping vitals + person count room-level. 4 unit tests.
+- **`--model` gives an actionable diagnostic instead of a cryptic magic error — PR #919 (refs #894).** Passing a HuggingFace `ruvnet/wifi-densepose-pretrained` file (`model.safetensors` / `model-q4.bin` / `model.rvf.jsonl`) to `--model` produced `invalid magic at offset 0: … got 0x77455735`, then a silent fall back to heuristics. The load-failure path now detects the format (safetensors / quantized blob / JSONL manifest) and explains that those files are a different format **and** encoder architecture than the RVF binary container the progressive loader expects, pointing to #894. Pure `diagnose_model_load_error` + 4 tests.
+- **`--export-rvf` no longer silently produces a placeholder model — PR #920.** The `--export-rvf` handler ran *before* `--train`/`--pretrain` and unconditionally wrote placeholder sine-wave weights, so the documented `--train … --export-rvf <path>` workflow short-circuited to a fake model and never trained (while printing "exported successfully"). It now emits the placeholder **container-format demo** only standalone (with a clear warning), and falls through to real training when `--train`/`--pretrain` is set; docs point to `--save-rvf` for the real model. 3 guard tests.

 ### Added
 - **WiFi-CSI pose: efficiency frontier + per-room calibration service** (ADR-150 §3.2–3.6). Two beyond-SOTA results on the MM-Fi benchmark, plus the deployment mechanism that resolves real-world generalization:
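Reviewer note on the person-count entry in the hunk above: the arithmetic behind "0.667 sits just under the 0.70 up-threshold" can be checked with a small sketch. Only the mapping values (1→0.40, 2→0.74, 3→0.96), the old `/ 3.0` mapping, the 0.70 threshold and the `max(activity_count, node_max)` rule come from the changelog. The smoothing factor, the `>=` comparison, the 0→0.0 case and the helper names `old_score`, `ema_step`, `steady_state` and `reaches_two` are assumptions made for illustration, not the project's actual code.

```rust
/// Old mapping described in the changelog: corr_persons / 3.0.
fn old_score(corr_persons: u8) -> f32 {
    corr_persons as f32 / 3.0
}

/// Threshold-aligned mapping quoted in the changelog (1→0.40, 2→0.74, 3→0.96).
/// The value for 0 is an assumption.
fn corr_persons_to_score(corr_persons: u8) -> f32 {
    match corr_persons {
        0 => 0.0,
        1 => 0.40,
        2 => 0.74,
        _ => 0.96,
    }
}

/// One exponential-moving-average step. `alpha` is an assumed smoothing factor.
fn ema_step(prev: f32, sample: f32, alpha: f32) -> f32 {
    alpha * sample + (1.0 - alpha) * prev
}

/// Feed the same sample repeatedly, starting from 0.0, and return the smoothed score.
fn steady_state(sample: f32, alpha: f32, steps: usize) -> f32 {
    let mut score = 0.0;
    for _ in 0..steps {
        score = ema_step(score, sample, alpha);
    }
    score
}

/// Simplified stand-in for the 1→2 transition of `score_to_person_count`.
/// Only the 0.70 value comes from the changelog; `>=` is an assumption.
fn reaches_two(score: f32) -> bool {
    score >= 0.70
}

/// Aggregation rule from the changelog: a node estimate can only raise the count.
fn aggregate_person_count(activity_count: u32, node_max: u32) -> u32 {
    activity_count.max(node_max)
}
```

Because an EMA fed a constant input converges to that input, the choice of `alpha` only changes how fast the score settles, not where: `steady_state(old_score(2), 0.1, 200)` settles near 0.667 and never crosses 0.70, while `steady_state(corr_persons_to_score(2), 0.1, 200)` settles near 0.74 and does.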
@@ -33,6 +37,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Security
 - **ESP32 OTA upload now fails closed when no PSK is provisioned** (#596 audit finding — critical, **breaking change for unprovisioned nodes**). `ota_check_auth()` previously returned `true` when `s_ota_psk[0] == '\0'`, so a freshly-flashed node would accept attacker-controlled firmware over plain HTTP on port 8032 from any host on the WiFi. No Secure Boot V2, no signed-image verification — a single LAN call could brick or backdoor a node. The fix rejects every OTA upload until a PSK is written to NVS (the OTA HTTP server still starts so operators can run `provision.py --ota-psk <hex>` over USB-CDC without reflashing). **Operators affected**: any deployment that relied on the unauthenticated OTA endpoint working out of the box now needs to provision a PSK before subsequent OTA pushes will succeed. Boot-time `ESP_LOGW` makes the new posture visible.
+- **Bearer-token auth accepts the scheme case-insensitively (RFC 6750) — PR #929.** `require_bearer` parsed the `Authorization` header with a case-sensitive `strip_prefix("Bearer ")`, so a *correct* `RUVIEW_API_TOKEN` sent as `Authorization: bearer <token>` (or `BEARER`, or with extra whitespace) was rejected with a confusing 401 — needless friction when enabling auth. The scheme is now matched with `eq_ignore_ascii_case` (per RFC 6750 §2.1 / RFC 7235 §2.1); the token compare is unchanged — still exact and constant-time (`ct_eq`) — so a wrong token or a non-Bearer scheme (`Basic …`) still returns 401. Audited the surrounding code while here: `ct_eq` correctly rejects length mismatch (no prefix-auth bypass) and the middleware fails closed. New `accepts_case_insensitive_bearer_scheme` test.
 - **Path-traversal vulnerabilities patched in five sensing-server endpoints** (closes #615 — critical). New `wifi_densepose_sensing_server::path_safety::safe_id()` enforces `[A-Za-z0-9._-]` only (no leading `.`, max 64 chars) before any user-controlled identifier reaches a `format!()` building a filesystem path. Applied at:
   - `POST /api/v1/recording/start` (`recording.rs` — `session_name`)
   - `GET /api/v1/recording/download/:id` (`recording.rs` — `id`)

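Reviewer note on the Bearer-token entry in the hunk above: a minimal sketch of the described behaviour, assuming a standalone header-parsing helper. `str::eq_ignore_ascii_case` and `str::split_once` are standard-library calls; the functions `bearer_token`, `is_authorized` and this hand-rolled `ct_eq` are hypothetical stand-ins and are not taken from the project's `require_bearer` middleware.

```rust
/// Hypothetical stand-in for the `ct_eq` the changelog mentions: compares every
/// byte regardless of where the first mismatch is, and rejects length mismatch.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Extract the token from an `Authorization` header value, matching the
/// scheme name case-insensitively and tolerating extra whitespace.
fn bearer_token(header: &str) -> Option<&str> {
    let (scheme, rest) = header.trim().split_once(char::is_whitespace)?;
    if !scheme.eq_ignore_ascii_case("Bearer") {
        return None;
    }
    let token = rest.trim();
    if token.is_empty() {
        None
    } else {
        Some(token)
    }
}

/// Fail closed: a missing header, a non-Bearer scheme or a wrong token is rejected.
fn is_authorized(header: Option<&str>, expected: &str) -> bool {
    match header.and_then(bearer_token) {
        Some(token) => ct_eq(token.as_bytes(), expected.as_bytes()),
        None => false,
    }
}
```

Only the scheme comparison is relaxed; the token comparison stays exact. A production implementation would more likely rely on a vetted constant-time primitive than on a hand-written loop, since compilers are free to optimize the loop above.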
@@ -15,6 +15,52 @@
 # MODELS_DIR — directory to scan for .rvf model files (default: data/models)
 set -e

+# ── Issue #864: fail-closed on default posture ───────────────────────────────
+# The pre-fix default was: empty RUVIEW_API_TOKEN (auth off) + --bind-addr
+# 0.0.0.0 + docker-compose publishing :3000/:3001/:5005 → an unauthenticated
+# attacker on any reachable network segment could read /api/v1/sensing/latest
+# and the /ws/sensing live stream. That posture is unsafe on guest WiFi,
+# untrusted LANs, accidentally-port-forwarded hosts, or any reverse-proxied
+# deployment. Refuse to start with this combination.
+#
+# Escape hatches (operator must opt in explicitly):
+# * Set RUVIEW_API_TOKEN to a strong secret → auth enabled on /api/v1/*.
+# * Set RUVIEW_ALLOW_UNAUTHENTICATED=1 → preserves the pre-fix behaviour;
+# only safe on an isolated trust boundary.
+# * Set RUVIEW_BIND_ADDR to a loopback / private interface → unauth is fine
+# when the socket isn't reachable. The auto-bind nudges toward 127.0.0.1.
+#
+# This check runs only for the default sensing-server path (no args + flag-only
+# args). The `cog-ha-matter` / `homecore` routes below are excluded because
+# they own their own auth lifecycle.
+case "${1:-}" in
+    cog-ha-matter|ha-matter|homecore|homecore-server) ;;
+    *)
+        if [ -z "${RUVIEW_API_TOKEN:-}" ] && [ "${RUVIEW_ALLOW_UNAUTHENTICATED:-}" != "1" ]; then
+            # If the operator hasn't overridden the bind, refuse outright on
+            # the default 0.0.0.0. If they've nailed it to loopback (or a
+            # specific private address they trust), let it run.
+            __bind_default="${RUVIEW_BIND_ADDR:-0.0.0.0}"
+            case "$__bind_default" in
+                127.*|localhost|::1)
+                    : ;; # loopback bind is safe even without a token
+                *)
+                    echo "[entrypoint] ERROR: refusing to start sensing-server with default" >&2
+                    echo "[entrypoint] posture: RUVIEW_API_TOKEN is unset AND bind is" >&2
+                    echo "[entrypoint] ${__bind_default}. /ws/sensing streams live sensing" >&2
+                    echo "[entrypoint] frames; that data would be readable by anyone who" >&2
+                    echo "[entrypoint] can reach this host. Pick one:" >&2
+                    echo "[entrypoint] docker run -e RUVIEW_API_TOKEN=\$(openssl rand -hex 32) ..." >&2
+                    echo "[entrypoint] docker run -e RUVIEW_BIND_ADDR=127.0.0.1 ..." >&2
+                    echo "[entrypoint] docker run -e RUVIEW_ALLOW_UNAUTHENTICATED=1 ... # only on trusted network" >&2
+                    echo "[entrypoint] See https://github.com/ruvnet/RuView/issues/864" >&2
+                    exit 64
+                    ;;
+            esac
+        fi
+        ;;
+esac
+
 # Route to cog-ha-matter (ADR-116) when invoked as:
 # docker run <image> cog-ha-matter [--flags]
 # or via the short alias `ha-matter`. Strips the keyword and execs the
@@ -48,7 +94,7 @@ if [ "${1#-}" != "$1" ] || [ -z "$1" ]; then
         --ui-path /app/ui \
         --http-port 3000 \
         --ws-port 3001 \
-        --bind-addr 0.0.0.0 \
+        --bind-addr "${RUVIEW_BIND_ADDR:-0.0.0.0}" \
         "$@"
 fi


@@ -65,6 +65,15 @@ target_compile_definitions(${COMPONENT_LIB} PUBLIC
     d_m3LogOutput=0 # Disable WASM3 stdout logging (use ESP_LOG)
     d_m3FixedHeap=0 # Use dynamic allocation (PSRAM-friendly)
     WASM3_AVAILABLE=1 # Flag for conditional compilation
+    # Issue #946: GCC 15.2.0 for Xtensa (ESP-IDF v6.0.1) rejects wasm3's
+    # `M3_MUSTTAIL` aggressive tail-call attribute with
+    # "cannot tail-call: machine description does not have a sibcall_epilogue
+    # instruction pattern". wasm3 falls back to a regular call sequence when
+    # M3_NO_MUSTTAIL is defined — slightly slower per opcode but functionally
+    # identical. Forcing it off unconditionally on Xtensa is fine because the
+    # tail-call optimisation was never reliable on this target anyway. Older
+    # IDF/GCC builds also accept the define (it just becomes a no-op).
+    M3_NO_MUSTTAIL=1
 )

 # Suppress warnings from third-party code.

@@ -220,11 +220,20 @@ static void fast_loop_cb(TimerHandle_t t)
     adaptive_controller_decide(&s_cfg, s_state, &obs, &dec);
     apply_decision(&dec);

-    /* ADR-081 Layer 4/5: emit compact feature state on every fast tick
-     * (default 200 ms → 5 Hz, within the 1–10 Hz spec). Replaces raw
-     * ADR-018 CSI as the default upstream; raw remains available as a
-     * debug stream gated by the channel plan. */
-    emit_feature_state();
+    /* ADR-081 Layer 4/5: emit compact feature state at 1 Hz (the spec's
+     * 1–10 Hz floor). Was previously emitted on every fast tick (~5 Hz at
+     * the default 200 ms fast period), which combined with CSI promiscuous
+     * RX saturated the WiFi TX airtime — measured live on COM8 (S3) and
+     * COM9 (C6): every adaptive cycle showed `sendto ENOMEM — backing off
+     * for 100 ms`, and bumping LWIP/WiFi buffer pools to 4× had no effect
+     * on the rate because the bottleneck was radio TX time, not pool size.
+     * Dropping to 1 Hz (5× less feature_state traffic) frees the TX queue
+     * for CSI sends and lands well within the spec. */
+    static uint8_t s_emit_divider = 0;
+    if (++s_emit_divider >= 5) {
+        s_emit_divider = 0;
+        emit_feature_state();
+    }
 }

 static void medium_loop_cb(TimerHandle_t t)

@@ -21,6 +21,7 @@
 #include "esp_wifi.h"
 #include "esp_mac.h"
 #include "esp_timer.h"
+#include "esp_idf_version.h"
 #include "freertos/FreeRTOS.h"
 #include "freertos/timers.h"
 #include <string.h>
@@ -144,11 +145,27 @@ static void on_recv(const uint8_t *src_mac, const uint8_t *data, int len)
     }
 }

+/* Issue #944: ESP-IDF v6.0 changed `esp_now_send_cb_t` from
+ * void (*)(const uint8_t *mac, esp_now_send_status_t status)
+ * to
+ * void (*)(const esp_now_send_info_t *tx_info, esp_now_send_status_t status)
+ * Both signatures ignore the address-side argument here — we only inspect
+ * `status` to bump the TX-fail counter — so the body is identical; only the
+ * function-pointer type differs. ESP_IDF_VERSION_MAJOR is the canonical guard.
+ */
+#if ESP_IDF_VERSION_MAJOR >= 6
+static void on_send(const esp_now_send_info_t *tx_info, esp_now_send_status_t status)
+{
+    (void)tx_info;
+    if (status != ESP_NOW_SEND_SUCCESS) s_tx_fail++;
+}
+#else
 static void on_send(const uint8_t *mac, esp_now_send_status_t status)
 {
     (void)mac;
     if (status != ESP_NOW_SEND_SUCCESS) s_tx_fail++;
 }
+#endif

 static void beacon_timer_cb(TimerHandle_t t)
 {

@@ -12,7 +12,8 @@
  * 0xC5110003 — ADR-069 feature vector (edge_processing.h)
  * 0xC5110004 — ADR-063 fused vitals (edge_processing.h)
  * 0xC5110005 — ADR-039 compressed CSI (edge_processing.h)
- * 0xC5110006 — ADR-081 feature state (this file) ← new
+ * 0xC5110006 — ADR-081 feature state (this file)
+ * 0xC5110007 — ADR-040 WASM output (wasm_runtime.h, reassigned per issue #928)
  */

 #ifndef RV_FEATURE_STATE_H

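Reviewer note on the magic registry in the hunk above: since every packet type is identified by its leading little-endian `u32`, the host can route on that value before any type-specific parsing. The sketch below covers only the five magics visible in this hunk (the registry evidently starts earlier, but those entries are outside the diff context). The `PacketKind` enum and `classify` function are hypothetical names used for illustration, not the project's API.

```rust
/// Packet kinds for the magics listed in the registry comment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PacketKind {
    FeatureVector, // 0xC5110003, ADR-069
    FusedVitals,   // 0xC5110004, ADR-063
    CompressedCsi, // 0xC5110005, ADR-039
    FeatureState,  // 0xC5110006, ADR-081
    WasmOutput,    // 0xC5110007, ADR-040 (reassigned per issue #928)
}

/// Read the little-endian magic from the first four bytes and map it to a kind.
/// Returns `None` for short buffers and for magics not covered by this sketch.
fn classify(buf: &[u8]) -> Option<PacketKind> {
    let head = buf.get(0..4)?;
    let magic = u32::from_le_bytes([head[0], head[1], head[2], head[3]]);
    match magic {
        0xC511_0003 => Some(PacketKind::FeatureVector),
        0xC511_0004 => Some(PacketKind::FusedVitals),
        0xC511_0005 => Some(PacketKind::CompressedCsi),
        0xC511_0006 => Some(PacketKind::FeatureState),
        0xC511_0007 => Some(PacketKind::WasmOutput),
        _ => None,
    }
}
```

With one magic per packet shape, the match is unambiguous; the collision fixed in issue #928 was precisely two shapes sharing the `0xC5110004` arm.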
@@ -23,7 +23,16 @@
 static const char *TAG = "swarm";

 /* ---- Task parameters ---- */
-#define SWARM_TASK_STACK 3072 /**< 3 KB stack — HTTP client uses ~2.5 KB. */
+/* Issue #949: 3 KB was sized for plain HTTP (~2.5 KB). The bug reporter
+ * configured `--seed-url https://…` which exercises TLS — mbedTLS handshake
+ * alone needs 4-6 KB on the stack (cipher suite + cert chain + ECDH), and on
+ * top of that esp_http_client adds another 1.5-2 KB. The task panicked with
+ * `0xa5a5a5a5` (FreeRTOS stack-fill sentinel) immediately after "bridge init
+ * OK". 8 KB comfortably fits TLS with margin for the cert chain + headers;
+ * confirmed against mbedTLS's stack analyser. Plain-HTTP deployments waste
+ * ~5 KB of headroom but that's <0.1 % of PSRAM, an acceptable cost for the
+ * bug class this prevents. */
+#define SWARM_TASK_STACK 8192 /**< 8 KB stack — fits mbedTLS handshake. */
 #define SWARM_TASK_PRIO 3
 #define SWARM_TASK_CORE 0
 #define SWARM_HTTP_TIMEOUT 3000 /**< HTTP timeout in ms (Seed responds <100ms on LAN). */

@@ -43,7 +43,16 @@

 #define WASM_MAX_MODULE_SIZE (128 * 1024) /**< Max .wasm binary size (128 KB). */
 #define WASM_STACK_SIZE (8 * 1024) /**< WASM execution stack (8 KB). */
-#define WASM_OUTPUT_MAGIC 0xC5110004 /**< WASM output packet magic. */
+/* Issue #928: WASM output was originally 0xC5110004, but that magic is
+ * canonically owned by ADR-063 fused vitals (edge_processing.h). Both packets
+ * were transmitted on the same magic, and the host parser only knew the WASM
+ * shape, so on the ESP32-C6 + MR60BHA2 mmWave config the 48-byte fused-vitals
+ * packet was being read as garbage WASM events. Reassigned to 0xC5110007 (next
+ * free slot in the registry — see rv_feature_state.h). Firmware older than
+ * this commit will silently lose its WASM event stream against an updated host
+ * — that's the deliberate "fail loud" choice over silent misparsing.
+ */
+#define WASM_OUTPUT_MAGIC 0xC5110007 /**< WASM output packet magic (post-#928). */
 #define WASM_MAX_EVENTS 16 /**< Max events per output packet. */

 /* ---- WASM Event (5 bytes: u8 type + f32 value) ---- */
@@ -54,7 +63,7 @@ typedef struct __attribute__((packed)) {

 /* ---- WASM Output Packet ---- */
 typedef struct __attribute__((packed)) {
-    uint32_t magic; /**< WASM_OUTPUT_MAGIC = 0xC5110004. */
+    uint32_t magic; /**< WASM_OUTPUT_MAGIC = 0xC5110007 (issue #928). */
     uint8_t node_id; /**< ESP32 node identifier. */
     uint8_t module_id; /**< Module slot index. */
     uint16_t event_count; /**< Number of events in this packet. */

@@ -29,6 +29,30 @@ CONFIG_LOG_DEFAULT_LEVEL_INFO=y
 # LWIP: enable extended socket options for UDP multicast
 CONFIG_LWIP_SO_RCVBUF=y

+# Issue (sibling of #946/#949/#864 cluster): UDP `sendto` returned ENOMEM
+# in a tight loop on both ESP32-S3 (COM8) and ESP32-C6 (COM9) at the v0.7.0
+# CSI packet rate (CSI cb + status + sync + feature_state all sharing the
+# LWIP/WiFi pools). stream_sender.c has a cooldown path so the device
+# doesn't crash, but ~90 % of CSI frames were dropped before reaching the
+# host — boot trace showed `sendto ENOMEM — backing off 100 ms` repeating
+# every capture cycle. Stock IDF v5.4 defaults: UDP recv mbox=6, TCPIP
+# mbox=32, WiFi dynamic TX buffers=32 — too small once CSI promiscuous
+# mode is active. These bumps roughly quadruple the relevant pools at
+# ~3 KB extra heap cost, measured live on both targets Jun 8 2026.
+CONFIG_LWIP_UDP_RECVMBOX_SIZE=32
+CONFIG_LWIP_TCPIP_RECVMBOX_SIZE=64
+CONFIG_ESP_WIFI_DYNAMIC_TX_BUFFER_NUM=64
+# NOTE: Empirical 25 s measurements on the S3 at COM8 showed these bumps
+# eliminate the csi_collector.sendto failure path (`fail #1..5` →
+# `fail #0`) — real improvement — but do NOT eliminate the broader
+# `feature_state emit` ENOMEM at ~10/s. That residual is the WiFi
+# radio's TX airtime saturating under CSI promiscuous RX, and bigger
+# buffers cap out at the 100 ms backoff window regardless of size
+# (verified at WIFI_DYNAMIC_TX=128 + PBUF_POOL=32 — identical count).
+# The proper fix is rate-limiting adaptive_controller.c's emit cadence
+# from ~50 ms to the intended 1 Hz, which is a code refactor tracked
+# in a separate follow-up issue.
+
 # FreeRTOS: increase task stack for CSI processing
 CONFIG_ESP_MAIN_TASK_STACK_SIZE=8192


@@ -45,13 +45,14 @@ pub fn parse_esp32_vitals(buf: &[u8]) -> Option<Esp32VitalsPacket> {
     })
 }

-/// Parse a WASM output packet (magic 0xC511_0004).
+/// Parse a WASM output packet (magic 0xC511_0007 — reassigned per issue #928;
+/// the original 0xC511_0004 collided with ADR-063 fused vitals).
 pub fn parse_wasm_output(buf: &[u8]) -> Option<WasmOutputPacket> {
     if buf.len() < 8 {
         return None;
     }
     let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
-    if magic != 0xC511_0004 {
+    if magic != 0xC511_0007 {
         return None;
     }


@@ -1114,7 +1114,7 @@ fn parse_esp32_vitals(buf: &[u8]) -> Option<Esp32VitalsPacket> {
     })
 }

-// ── ADR-040: WASM Output Packet (magic 0xC511_0004) ───────────────────────────
+// ── ADR-040: WASM Output Packet (magic 0xC511_0007 — reassigned per #928) ─────

 /// Single WASM event (type + value).
 #[derive(Debug, Clone, Serialize)]
@@ -1131,13 +1131,14 @@ struct WasmOutputPacket {
     events: Vec<WasmEvent>,
 }

-/// Parse a WASM output packet (magic 0xC511_0004).
+/// Parse a WASM output packet (magic 0xC511_0007 — reassigned per issue #928;
+/// the original 0xC511_0004 was a collision with ADR-063 fused vitals).
 fn parse_wasm_output(buf: &[u8]) -> Option<WasmOutputPacket> {
     if buf.len() < 8 {
         return None;
     }
     let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
-    if magic != 0xC511_0004 {
+    if magic != 0xC511_0007 {
         return None;
     }

@@ -1169,6 +1170,187 @@ fn parse_wasm_output(buf: &[u8]) -> Option<WasmOutputPacket> {
     })
 }

+// ── ADR-063: Edge Fused Vitals Packet (magic 0xC511_0004) ─────────────────────
+//
+// 48-byte packed struct emitted by the ESP32-C6 + MR60BHA2 mmWave config when
+// `mmwave_sensor_get_state().detected` is true. Byte layout from
+// `firmware/esp32-csi-node/main/edge_processing.h` line 129 — kept in lockstep
+// with the firmware's `_Static_assert(sizeof(edge_fused_vitals_pkt_t) == 48)`.
+// Issue #928 surfaced that this magic was being parsed as WASM output and the
+// fused vitals were silently lost. Adding the proper parser here.
+
+#[derive(Debug, Clone, Serialize)]
+struct EdgeFusedVitalsPacket {
+    node_id: u8,
+    /// Bit0=presence, Bit1=fall, Bit2=motion, Bit3=mmwave_present.
+    flags: u8,
+    /// Fused breathing rate in BPM (firmware sends BPM*100; we scale here).
+    breathing_rate_bpm: f32,
+    /// Fused heartrate in BPM (firmware sends BPM*10000; we scale here).
+    heartrate_bpm: f32,
+    rssi: i8,
+    n_persons: u8,
+    /// `mmwave_type_t` enum value from firmware.
+    mmwave_type: u8,
+    /// 0-100 fusion quality score.
+    fusion_confidence: u8,
+    motion_energy: f32,
+    presence_score: f32,
+    timestamp_ms: u32,
+    /// Raw mmWave heart rate (BPM).
+    mmwave_hr_bpm: f32,
+    /// Raw mmWave breathing rate (BPM).
+    mmwave_br_bpm: f32,
+    /// Distance to nearest target (cm).
+    mmwave_distance_cm: f32,
+    /// Target count from mmWave.
+    mmwave_targets: u8,
+    /// mmWave signal quality 0-100.
+    mmwave_confidence: u8,
+}
+
+/// Parse an ADR-063 edge fused vitals packet (magic 0xC511_0004, 48 bytes).
+fn parse_edge_fused_vitals(buf: &[u8]) -> Option<EdgeFusedVitalsPacket> {
+    if buf.len() < 48 {
+        return None;
+    }
+    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
+    if magic != 0xC511_0004 {
+        return None;
+    }
+
+    let node_id = buf[4];
+    let flags = buf[5];
+    let breathing_raw = u16::from_le_bytes([buf[6], buf[7]]);
+    let heartrate_raw = u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]);
+    let rssi = buf[12] as i8;
+    let n_persons = buf[13];
+    let mmwave_type = buf[14];
+    let fusion_confidence = buf[15];
+    let motion_energy = f32::from_le_bytes([buf[16], buf[17], buf[18], buf[19]]);
+    let presence_score = f32::from_le_bytes([buf[20], buf[21], buf[22], buf[23]]);
+    let timestamp_ms = u32::from_le_bytes([buf[24], buf[25], buf[26], buf[27]]);
+    let mmwave_hr_bpm = f32::from_le_bytes([buf[28], buf[29], buf[30], buf[31]]);
+    let mmwave_br_bpm = f32::from_le_bytes([buf[32], buf[33], buf[34], buf[35]]);
+    let mmwave_distance_cm = f32::from_le_bytes([buf[36], buf[37], buf[38], buf[39]]);
+    let mmwave_targets = buf[40];
+    let mmwave_confidence = buf[41];
+    // buf[42..48] are firmware reserved fields (reserved3 u16 + reserved4 u32).
+
+    Some(EdgeFusedVitalsPacket {
+        node_id,
+        flags,
+        breathing_rate_bpm: breathing_raw as f32 / 100.0,
+        heartrate_bpm: heartrate_raw as f32 / 10000.0,
+        rssi,
+        n_persons,
+        mmwave_type,
+        fusion_confidence,
+        motion_energy,
+        presence_score,
+        timestamp_ms,
+        mmwave_hr_bpm,
+        mmwave_br_bpm,
+        mmwave_distance_cm,
+        mmwave_targets,
+        mmwave_confidence,
+    })
+}
+
+#[cfg(test)]
+mod issue_928_magic_collision_tests {
+    //! Issue #928 — `0xC511_0004` was being parsed as WASM output, eating the
+    //! C6+mmWave fused-vitals packets. After this fix, `0xC511_0004` routes to
+    //! `parse_edge_fused_vitals` and WASM output owns the freshly-allocated
+    //! `0xC511_0007` slot. Tests guard both halves of the swap.
+    use super::*;
+
+    /// Build a 48-byte synthetic fused-vitals packet matching the firmware's
+    /// `edge_fused_vitals_pkt_t` layout from `edge_processing.h:129`.
+    fn build_fused_vitals_packet() -> Vec<u8> {
+        let mut buf = vec![0u8; 48];
+        buf[0..4].copy_from_slice(&0xC511_0004u32.to_le_bytes());
+        buf[4] = 9; // node_id
+        buf[5] = 0b0000_1001; // flags: presence | mmwave_present
+        buf[6..8].copy_from_slice(&1600u16.to_le_bytes()); // breathing 16.00 BPM
+        buf[8..12].copy_from_slice(&720_000u32.to_le_bytes()); // heartrate 72.0 BPM
+        buf[12] = (-55i8) as u8; // rssi
+        buf[13] = 1; // n_persons
+        buf[14] = 2; // mmwave_type
+        buf[15] = 85; // fusion_confidence
+        buf[16..20].copy_from_slice(&0.42f32.to_le_bytes()); // motion_energy
+        buf[20..24].copy_from_slice(&0.95f32.to_le_bytes()); // presence_score
+        buf[24..28].copy_from_slice(&1_234_567u32.to_le_bytes()); // timestamp_ms
+        buf[28..32].copy_from_slice(&71.5f32.to_le_bytes()); // mmwave_hr_bpm
+        buf[32..36].copy_from_slice(&15.8f32.to_le_bytes()); // mmwave_br_bpm
+        buf[36..40].copy_from_slice(&182.0f32.to_le_bytes()); // mmwave_distance_cm
+        buf[40] = 1; // mmwave_targets
+        buf[41] = 90; // mmwave_confidence
+        // bytes 42..48 — firmware reserved fields, left as zero
+        buf
+    }
+
+    #[test]
+    fn parse_edge_fused_vitals_extracts_fields_correctly() {
+        let buf = build_fused_vitals_packet();
+        let pkt = parse_edge_fused_vitals(&buf).expect("must parse a well-formed packet");
+        assert_eq!(pkt.node_id, 9);
+        assert_eq!(pkt.flags, 0b0000_1001);
+        assert!((pkt.breathing_rate_bpm - 16.0).abs() < 1e-3, "breathing scale 100");
+        assert!((pkt.heartrate_bpm - 72.0).abs() < 1e-3, "heartrate scale 10000");
+        assert_eq!(pkt.rssi, -55);
+        assert_eq!(pkt.n_persons, 1);
+        assert_eq!(pkt.mmwave_type, 2);
+        assert_eq!(pkt.fusion_confidence, 85);
+        assert!((pkt.motion_energy - 0.42).abs() < 1e-6);
+        assert!((pkt.presence_score - 0.95).abs() < 1e-6);
+        assert_eq!(pkt.timestamp_ms, 1_234_567);
+        assert!((pkt.mmwave_hr_bpm - 71.5).abs() < 1e-6);
+        assert!((pkt.mmwave_br_bpm - 15.8).abs() < 1e-3);
+        assert!((pkt.mmwave_distance_cm - 182.0).abs() < 1e-6);
+        assert_eq!(pkt.mmwave_targets, 1);
+        assert_eq!(pkt.mmwave_confidence, 90);
+    }
+
+    #[test]
+    fn parse_edge_fused_vitals_rejects_short_buffer() {
+        let buf = build_fused_vitals_packet();
+        // Truncate to 47 bytes — one short of the 48-byte minimum.
+        assert!(parse_edge_fused_vitals(&buf[..47]).is_none());
+    }
+
+    #[test]
+    fn parse_edge_fused_vitals_rejects_wrong_magic() {
+        let mut buf = build_fused_vitals_packet();
+        buf[0..4].copy_from_slice(&0xC511_0007u32.to_le_bytes()); // WASM magic, not fused
+        assert!(parse_edge_fused_vitals(&buf).is_none());
+    }
+
+    #[test]
+    fn parse_wasm_output_rejects_legacy_0004_magic() {
+        // The old WASM magic collided with fused vitals — must no longer be
+        // accepted. A real fused-vitals packet starts with 0xC511_0004 and
+        // would have been misparsed before this fix.
+        let buf = build_fused_vitals_packet();
+        assert!(parse_wasm_output(&buf).is_none(),
+            "issue #928: WASM parser must NOT accept 0xC511_0004");
+    }
+
+    #[test]
+    fn parse_wasm_output_accepts_new_0007_magic() {
+        // Build a tiny well-formed WASM output packet on the new magic.
+        let mut buf = vec![0u8; 8];
+        buf[0..4].copy_from_slice(&0xC511_0007u32.to_le_bytes());
+        buf[4] = 5; // node_id
+        buf[5] = 1; // module_id
+        buf[6..8].copy_from_slice(&0u16.to_le_bytes()); // event_count = 0
+        let pkt = parse_wasm_output(&buf).expect("0xC511_0007 must parse");
+        assert_eq!(pkt.node_id, 5);
+        assert_eq!(pkt.module_id, 1);
+        assert!(pkt.events.is_empty());
+    }
+}
+
 // ── ESP32 UDP frame parser ───────────────────────────────────────────────────

 fn parse_esp32_frame(buf: &[u8]) -> Option<Esp32Frame> {
@@ -4979,7 +5161,45 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                }
            }

            // ADR-040: Try WASM output packet (magic 0xC511_0004).
            // ADR-063: Try edge fused vitals packet (magic 0xC511_0004).
            // Must come BEFORE the WASM parser — issue #928: these two
            // packet types shared a magic and the WASM parser was eating
            // fused-vitals frames on the C6+mmWave config. The reassign of
            // WASM_OUTPUT_MAGIC → 0xC511_0007 (firmware side) plus this
            // dedicated parser resolve the collision.
            if let Some(fused) = parse_edge_fused_vitals(&buf[..len]) {
                debug!(
                    "Edge fused vitals from {src}: node={} br={:.1} hr={:.1} \
                     mmwave_targets={} fusion_conf={}",
                    fused.node_id, fused.breathing_rate_bpm, fused.heartrate_bpm,
                    fused.mmwave_targets, fused.fusion_confidence,
                );
                let s = state.write().await;
                if let Ok(json) = serde_json::to_string(&serde_json::json!({
                    "type": "edge_fused_vitals",
                    "node_id": fused.node_id,
                    "breathing_rate_bpm": fused.breathing_rate_bpm,
                    "heartrate_bpm": fused.heartrate_bpm,
                    "n_persons": fused.n_persons,
                    "fusion_confidence": fused.fusion_confidence,
                    "mmwave": {
                        "hr_bpm": fused.mmwave_hr_bpm,
                        "br_bpm": fused.mmwave_br_bpm,
                        "distance_cm": fused.mmwave_distance_cm,
                        "targets": fused.mmwave_targets,
                        "confidence": fused.mmwave_confidence,
                        "type": fused.mmwave_type,
                    },
                    "motion_energy": fused.motion_energy,
                    "presence_score": fused.presence_score,
                    "timestamp_ms": fused.timestamp_ms,
                })) {
                    let _ = s.tx.send(json);
                }
                continue;
            }

            // ADR-040: Try WASM output packet (magic 0xC511_0007 post-#928).
            if let Some(wasm_output) = parse_wasm_output(&buf[..len]) {
                debug!(
                    "WASM output from {src}: node={} module={} events={}",
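The collision described in the hunk above comes down to two packet types being told apart by a leading little-endian `u32` magic. A minimal sketch of that dispatch follows, assuming only the two magic values quoted in the diff. `PacketKind`, `classify_packet`, and `EDGE_FUSED_VITALS_MAGIC` are hypothetical names introduced for illustration and are not taken from the repository; `WASM_OUTPUT_MAGIC` is the name used in the diff's comment.

```rust
/// Which parser a datagram would be routed to, judged by its magic alone.
/// (Hypothetical type, not part of the diff.)
#[derive(Debug, PartialEq, Eq)]
pub enum PacketKind {
    EdgeFusedVitals,
    WasmOutput,
    Unknown,
}

/// Magic quoted in the diff for fused vitals (constant name is an assumption).
pub const EDGE_FUSED_VITALS_MAGIC: u32 = 0xC511_0004;
/// Magic quoted in the diff for WASM output after issue #928.
pub const WASM_OUTPUT_MAGIC: u32 = 0xC511_0007;

/// Read the first four bytes as a little-endian u32 and classify the packet.
/// Buffers shorter than four bytes cannot carry a magic and are `Unknown`.
pub fn classify_packet(buf: &[u8]) -> PacketKind {
    if buf.len() < 4 {
        return PacketKind::Unknown;
    }
    let magic = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]);
    match magic {
        EDGE_FUSED_VITALS_MAGIC => PacketKind::EdgeFusedVitals,
        WASM_OUTPUT_MAGIC => PacketKind::WasmOutput,
        _ => PacketKind::Unknown,
    }
}
```

The real parsers in the diff also validate length and payload, so this sketch covers only the magic check that the new tests exercise.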
@@ -276,6 +276,13 @@ pub struct FieldNormalMode {
    pub geometry_hash: u64,
    /// Baseline eigenvalue count above Marcenko-Pastur threshold (empty-room).
    pub baseline_eigenvalue_count: usize,
    /// Baseline noise variance estimate (median of bottom-half positive
    /// eigenvalues from the calibration covariance). Persisted so that
    /// `estimate_occupancy` can anchor its Marcenko-Pastur threshold to the
    /// calibration noise floor instead of letting it drift with the
    /// per-window sample size. Defaults to 0.0 in the diagonal-fallback path.
    /// Issue #942.
    pub baseline_noise_var: f64,
}

/// Body perturbation extracted from a CSI observation.
@@ -504,7 +511,11 @@ impl FieldModel {
        let baseline: Vec<Vec<f64>> = self.link_stats.iter().map(|ls| ls.mean_vector()).collect();

        // --- True eigenvalue decomposition (with diagonal fallback) ---
        let (mode_energies, environmental_modes, baseline_eig_count) =
        // Returns: (energies, modes, baseline_count, baseline_noise_var).
        // The noise_var slot is 0.0 in the diagonal-fallback paths; the
        // estimation hot path treats 0.0 as "no anchored noise floor" and
        // falls back to per-window noise_var, preserving pre-#942 behavior.
        let (mode_energies, environmental_modes, baseline_eig_count, baseline_noise_var) =
            if let Some(ref cov_sum) = self.covariance_sum {
                if self.covariance_count > 1 {
                    // Compute sample covariance from raw outer products:
@@ -588,23 +599,28 @@ impl FieldModel {
                            let baseline_count =
                                eigenvalues.iter().filter(|&&ev| ev > mp_threshold).count();

                            (energies, modes, baseline_count)
                            (energies, modes, baseline_count, noise_var)
                        }
                        Err(_) => {
                            // Fallback to diagonal approximation on SVD failure
                            diagonal_fallback(&self.link_stats, n_sc, n_modes)
                            let (e, m, b) =
                                diagonal_fallback(&self.link_stats, n_sc, n_modes);
                            (e, m, b, 0.0_f64)
                        }
                    }
                    // When eigenvalue feature is disabled, use diagonal fallback
                    #[cfg(not(feature = "eigenvalue"))]
                    {
                        diagonal_fallback(&self.link_stats, n_sc, n_modes)
                        let (e, m, b) = diagonal_fallback(&self.link_stats, n_sc, n_modes);
                        (e, m, b, 0.0_f64)
                    }
                } else {
                    diagonal_fallback(&self.link_stats, n_sc, n_modes)
                    let (e, m, b) = diagonal_fallback(&self.link_stats, n_sc, n_modes);
                    (e, m, b, 0.0_f64)
                }
            } else {
                diagonal_fallback(&self.link_stats, n_sc, n_modes)
                let (e, m, b) = diagonal_fallback(&self.link_stats, n_sc, n_modes);
                (e, m, b, 0.0_f64)
            };

        // Compute variance explained using the same centered covariance as modes.
@@ -648,6 +664,7 @@ impl FieldModel {
            calibrated_at_us: timestamp_us,
            geometry_hash,
            baseline_eigenvalue_count: baseline_eig_count,
            baseline_noise_var,
        };

        self.modes = Some(field_mode);
@@ -794,7 +811,7 @@ impl FieldModel {
        // Marcenko-Pastur noise estimate: median of POSITIVE eigenvalues
        // in the bottom half. Excludes zeros from rank-deficient matrices
        // (common when n_subcarriers > n_frames, e.g. 56 subcarriers / 50 frames).
        let noise_var = {
        let local_noise_var = {
            let mut positive: Vec<f64> =
                eigenvalues.iter().copied().filter(|&e| e > 1e-10).collect();
            positive.sort_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));
@@ -807,6 +824,22 @@ impl FieldModel {
                return Ok(0); // All zero eigenvalues — can't estimate
            }
        };

        // Issue #942: anchor the noise floor to the calibration's noise_var
        // when it's available. Per-window noise_var drifts with sample size —
        // a short estimation window can produce a small local_noise_var that
        // inflates `significant` and breaks the test_estimate_occupancy_noise_only
        // invariant. The max of (calibration noise, local noise) keeps the
        // threshold from collapsing on small windows while still letting the
        // per-window noise dominate when it's the larger estimate. Falls back
        // to local_noise_var when baseline_noise_var == 0 (diagonal-fallback
        // calibration path, or pre-#942 stored modes).
        let noise_var = if modes.baseline_noise_var > 0.0 {
            local_noise_var.max(modes.baseline_noise_var)
        } else {
            local_noise_var
        };

        let ratio = n as f64 / count as f64;
        let mp_threshold = noise_var * (1.0 + ratio.sqrt()).powi(2);
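The noise-floor anchoring in the last hunk can be pulled out into a standalone function to show how the threshold behaves. This is a sketch under stated assumptions: `anchored_mp_threshold` is a hypothetical helper that does not exist in the diff, and `n` and `count` are taken to be the covariance dimension and the number of frames in the window, which the excerpt does not define.

```rust
/// Marcenko-Pastur upper-edge threshold with an optional calibration-anchored
/// noise floor. Mirrors the arithmetic in the hunk above; the function itself
/// is hypothetical.
///
/// * `local_noise_var`    - noise estimate from the current window
/// * `baseline_noise_var` - noise estimate stored at calibration, or 0.0 when
///                          none was stored (diagonal-fallback path)
/// * `n`, `count`         - assumed: matrix dimension and sample count
pub fn anchored_mp_threshold(
    local_noise_var: f64,
    baseline_noise_var: f64,
    n: usize,
    count: usize,
) -> f64 {
    // Use the larger of the two estimates when a calibration value exists,
    // so a short window cannot shrink the floor below the calibrated one.
    let noise_var = if baseline_noise_var > 0.0 {
        local_noise_var.max(baseline_noise_var)
    } else {
        local_noise_var
    };
    let ratio = n as f64 / count as f64;
    noise_var * (1.0 + ratio.sqrt()).powi(2)
}
```

With `n == count` the multiplier is `(1 + 1)^2 = 4`, so a local estimate of 0.5 against a calibrated 2.0 gives a threshold of 8.0, whereas the same window without a stored baseline gives 2.0.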