Compare commits

...

15 Commits

Author SHA1 Message Date
ruv 2732cf9e8f Merge remote-tracking branch 'origin/main' into feat/cross-node-fusion
# Conflicts:
#	rust-port/wifi-densepose-rs/crates/wifi-densepose-sensing-server/src/main.rs
2026-03-30 21:55:40 -04:00
ruv 94e928c274 docs: update CHANGELOG with v0.5.1-v0.5.3 releases
Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 21:52:48 -04:00
ruv 10d69c1071 feat: DynamicMinCut person separation + UI lerp smoothing
- Added ruvector-mincut dependency to sensing server
- Replaced variance-based person scoring with actual graph min-cut on
  subcarrier temporal correlation matrix (Pearson correlation edges,
  DynamicMinCut exact max-flow)
- Recalibrated feature scaling for real ESP32 data ranges
- UI: client-side lerp interpolation (alpha=0.25) on keypoint positions
- Dampened procedural animation (noise, stride, extremity jitter)
- Person count thresholds retuned for mincut ratio

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 16:39:05 -04:00
ruv 3f549f4d25 fix(ui): add client-side lerp smoothing to pose renderer
Keypoints now interpolate between frames (alpha=0.25) instead of
jumping directly to new positions. This eliminates visual jitter
that persists even with server-side EMA smoothing, because the
renderer was drawing every WebSocket frame at full rate.

Applied to skeleton, keypoints, and dense body rendering paths.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 15:54:44 -04:00
rUv cd84c35f8f feat: cross-node RSSI-weighted feature fusion (benchmarked)
Adds fuse_multi_node_features() that combines CSI features across all
active ESP32 nodes using RSSI-based weighting (closer node = higher weight).

Benchmark results (2 ESP32 nodes, 30s, ~1500 frames):

  Metric               | Baseline | Fusion  | Improvement
  ---------------------|----------|---------|------------
  Variance mean        |    109.4 |    77.6 | -29% noise
  Variance std         |    154.1 |   105.4 | -32% stability
  Confidence           |    0.643 |   0.686 | +7%
  Keypoint spread std  |      4.5 |     1.3 | -72% jitter
  Presence ratio       |   93.4%  |  94.6%  | +1.3pp

Person count still fluctuates near threshold — tracked as known issue.

Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net.
2026-03-30 15:48:33 -04:00
ruv f0bdc1aa69 feat(server): cross-node RSSI-weighted feature fusion + benchmarks
Adds fuse_multi_node_features() that combines CSI features across all
active ESP32 nodes using RSSI-based weighting (closer node = higher weight).

Benchmark results (2 ESP32 nodes, 30s, ~1500 frames):

  Metric               | Baseline | Fusion  | Improvement
  ---------------------|----------|---------|------------
  Variance mean        |    109.4 |    77.6 | -29% noise
  Variance std         |    154.1 |   105.4 | -32% stability
  Confidence           |    0.643 |   0.686 | +7%
  Keypoint spread std  |      4.5 |     1.3 | -72% jitter
  Presence ratio       |   93.4%  |  94.6%  | +1.3pp

Person count still fluctuates near threshold — tracked as known issue.

Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 15:47:58 -04:00
rUv dd45160cc5 fix: skeleton jitter + person count stability (hardware-verified)
* chore: update vendored ruvector to latest main (v2.1.0-40)

Was at v2.0.5-172 (f8f2c600a), now at v2.1.0-40 (050c3fe6f).
316 commits with new crates: ruvector-coherence, sona, ruvector-core,
ruvector-gnn improvements, and security hardening.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: RuVector Phases 2+3 — temporal smoothing, kinematic constraints, coherence gating

Phase 2 (sensing server):
- Temporal keypoint smoothing via EMA (alpha=0.3) with coherence-adaptive blending
- Coherence scoring: running variance of motion_energy over 20 frames
  - Low coherence → reduce alpha to 0.1 (trust measurements less)
- Per-node prev_keypoints for frame-to-frame smoothing
- Bone length clamping (±20%) in derive_single_person_pose

Phase 3 (signal crate):
- SkeletonConstraints: Jakobsen relaxation (3 iterations) on 12-bone
  COCO-17 kinematic tree — prevents impossible skeletons
- CompressedPoseHistory: two-tier storage (hot f32 + warm i16 quantized)
  for trajectory matching and re-ID
- 8 new tests for constraints + history

Vendored ruvector updated to v2.1.0-40 (latest main, 316 commits).
Workspace deps remain at v2.0.4 (crates.io) until v2.1.0 is published.

647 tests pass across both crates (0 failures).

Refs #296

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(server): use max instead of sum for multi-node person aggregation

With nodes in the same room, each node sees the same people. Summing
per-node counts double-counted (2 nodes × 1 person = 2 persons).
Now uses max() so 2 nodes × 1 person = 1 person.

Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net,
estimated_persons=1 with 1 person in room.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(server): reduce skeleton jitter + raise person count thresholds

- EMA alpha 0.3→0.15, low-coherence 0.1→0.05
- Remove tick-based noise (main jitter source)
- Breathing 5x slower, extremity jitter 3x smaller, stride 2x smaller
- Person count 1→2 threshold 0.65→0.80
- Aggregation sum→max for same-room nodes

Verified on COM6+COM9: 1 person stable.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 15:17:48 -04:00
rUv 5e5781b28a feat: RuVector all phases — temporal smoothing + kinematic constraints + coherence
* chore: update vendored ruvector to latest main (v2.1.0-40)

Was at v2.0.5-172 (f8f2c600a), now at v2.1.0-40 (050c3fe6f).
316 commits with new crates: ruvector-coherence, sona, ruvector-core,
ruvector-gnn improvements, and security hardening.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: RuVector Phases 2+3 — temporal smoothing, kinematic constraints, coherence gating

Phase 2 (sensing server):
- Temporal keypoint smoothing via EMA (alpha=0.3) with coherence-adaptive blending
- Coherence scoring: running variance of motion_energy over 20 frames
  - Low coherence → reduce alpha to 0.1 (trust measurements less)
- Per-node prev_keypoints for frame-to-frame smoothing
- Bone length clamping (±20%) in derive_single_person_pose

Phase 3 (signal crate):
- SkeletonConstraints: Jakobsen relaxation (3 iterations) on 12-bone
  COCO-17 kinematic tree — prevents impossible skeletons
- CompressedPoseHistory: two-tier storage (hot f32 + warm i16 quantized)
  for trajectory matching and re-ID
- 8 new tests for constraints + history

Vendored ruvector updated to v2.1.0-40 (latest main, 316 commits).
Workspace deps remain at v2.0.4 (crates.io) until v2.1.0 is published.

647 tests pass across both crates (0 failures).

Refs #296

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 14:23:21 -04:00
rUv 6f23e89909 fix: deep review optimizations — firmware + server
* feat(signal): subcarrier importance weighting via mincut partition (Phase 1)

Adds subcarrier_importance_weights() to ruvector signal crate — converts
mincut partition into per-subcarrier float weights (>1.0 for sensitive,
0.5 for insensitive subcarriers).

Sensing server now uses weighted mean/variance in extract_features_from_frame
instead of treating all 56 subcarriers equally. This emphasizes body-motion-
sensitive subcarriers and reduces noise from static multipath.

Expected: ~26% reduction in keypoint jitter (±15cm → ±11cm RMS).

284 tests pass (191 trainer + 51 lib + 18 vital_signs + 16 dataset + 8 multi_node).

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(firmware): stack overflow risk + tick-rate independence (review findings)

Critical fixes from deep review:

1. **Stack overflow prevention**: Moved BPM scratch buffers (br_buf, hr_buf)
   from stack to static storage in both process_frame() and
   update_multi_person_vitals(). Combined stack was ~6.5-7.5 KB of 8 KB
   limit — now reduced by ~4 KB to safe margins.

2. **Tick-rate independence**: Post-batch yield now uses
   pdMS_TO_TICKS(20) with min-1 guard instead of raw vTaskDelay(2).
   Previously assumed 100Hz tick rate.

3. **EDGE_BATCH_LIMIT to header**: Moved from local const to
   edge_processing.h #define for configurability.

Firmware builds clean at 843 KB.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(server): stale node eviction, remove unsafe pointer (review findings)

Critical fixes from deep review:

1. **Stale node eviction**: node_states HashMap now evicts nodes with no
   frame for >60 seconds, every 100 ticks. Prevents unbounded memory
   growth and stale smoothing data when nodes are replaced.

2. **Remove unsafe raw pointer**: Replaced the unsafe raw pointer to
   adaptive_model (used to break borrow checker deadlock with
   node_states) with a safe .clone() before the mutable borrow.
   AdaptiveModel derives Clone so this is a clean copy.

284 tests pass, zero failures.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-30 13:31:07 -04:00
rUv 1dcf5d42eb feat(signal): subcarrier importance weighting — RuVector Phase 1
Adds subcarrier_importance_weights() to ruvector signal crate — converts
mincut partition into per-subcarrier float weights (>1.0 for sensitive,
0.5 for insensitive subcarriers).

Sensing server now uses weighted mean/variance in extract_features_from_frame
instead of treating all 56 subcarriers equally. This emphasizes body-motion-
sensitive subcarriers and reduces noise from static multipath.

Expected: ~26% reduction in keypoint jitter (±15cm → ±11cm RMS).

284 tests pass (191 trainer + 51 lib + 18 vital_signs + 16 dataset + 8 multi_node).
2026-03-30 13:20:05 -04:00
rUv 9814d2bc62 fix(server): correct RSSI byte offset in frame parser (#332)
The server parsed rssi from buf[14] and noise_floor from buf[15], but
the firmware (csi_collector.c) packs them at buf[16] and buf[17]:

  Firmware:  n_subcarriers=u16(6-7) freq=u32(8-11) seq=u32(12-15) rssi=i8(16)
  Server:    n_subcarriers=u8(6)    freq=u16(8-9)  seq=u32(10-13) rssi=i8(14) ← WRONG

This caused RSSI to read the high byte of the sequence counter instead
of the actual signed RSSI value, producing positive values (e.g., +9)
instead of the correct negative values (e.g., -46 dBm).

Added inline documentation of the frame layout matching csi_collector.c.

Closes #332
2026-03-30 11:54:03 -04:00
ruv 7f02c87c6f test(server): add multi-node mesh integration tests (ADR-068)
8 tests covering per-node state pipeline:
- Frame builder validity (CSI + vitals packet formats)
- Different nodes produce different I/Q patterns
- Multi-node UDP send (1/3/5/7/11 nodes)
- Mesh simulation with variable rates and node dropout
- Large mesh: 100 nodes x 10 frames = 1,000 frames
- Max scale: 255 unique node_ids

All 26 server tests pass (8 new + 18 existing vital signs).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 11:06:57 -04:00
ruv 9a074bdf4f fix(ci): upgrade Firmware CI to IDF v5.4, replace xxd with od (#327)
- Container: espressif/idf:v5.2 → v5.4 (matches QEMU workflow)
- Replace xxd calls with od (xxd not available in IDF container)
- Add ota_data_initial.bin to artifact upload
- Extend artifact retention to 90 days

The xxd:not-found error was blocking all Firmware CI builds since the
container migration. This unblocks binary artifact generation for
release assets.

Closes #327

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-28 11:01:44 -04:00
rUv 3c02f6cfb0 feat(server): per-node state pipeline for multi-node sensing (#249)
* docs(adr): ADR-068 per-node state pipeline for multi-node sensing (#249)

Documents the architectural change from single shared state to per-node
HashMap<u8, NodeState> in the sensing server. Includes scaling analysis
(256 nodes < 13 MB), QEMU validation plan, and aggregation strategy.

Also links README hero image to the explainer video.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(server): per-node state pipeline for multi-node sensing (ADR-068, #249)

Replaces the single shared state pipeline with per-node HashMap<u8, NodeState>.
Each ESP32 node now gets independent:
- frame_history (temporal analysis)
- smoothed_person_score / prev_person_count
- smoothed_motion / baseline / debounce state
- vital sign detector + smoothing buffers
- RSSI history

Multi-node aggregation:
- Person count = sum of per-node counts for active nodes (seen <10s)
- SensingUpdate.nodes includes all active nodes
- estimated_persons reflects cross-node aggregate

Single-node deployments behave identically (HashMap has one entry).
Simulated data path unchanged for backward compatibility.

Closes #249
Refs #237, #276, #282

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-27 17:52:51 -04:00
ruv 23dedecf0c docs(adr): ADR-068 per-node state pipeline for multi-node sensing (#249)
Documents the architectural change from single shared state to per-node
HashMap<u8, NodeState> in the sensing server. Includes scaling analysis
(256 nodes < 13 MB), QEMU validation plan, and aggregation strategy.

Also links README hero image to the explainer video.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-27 17:45:43 -04:00
28 changed files with 4288 additions and 135 deletions
+1
View File
@@ -0,0 +1 @@
{"intelligence":7,"timestamp":1774922079152}
+6 -4
View File
@@ -15,7 +15,7 @@ jobs:
name: Build ESP32-S3 Firmware
runs-on: ubuntu-latest
container:
image: espressif/idf:v5.2
image: espressif/idf:v5.4
steps:
- uses: actions/checkout@v4
@@ -54,9 +54,10 @@ jobs:
fi
# Check partition table magic (0xAA50 at offset 0).
# Use od instead of xxd (xxd not available in espressif/idf container).
PT=build/partition_table/partition-table.bin
if [ -f "$PT" ]; then
MAGIC=$(xxd -l2 -p "$PT")
MAGIC=$(od -A n -t x1 -N 2 "$PT" | tr -d ' ')
if [ "$MAGIC" != "aa50" ]; then
echo "::warning::Partition table magic mismatch: $MAGIC (expected aa50)"
ERRORS=$((ERRORS + 1))
@@ -71,7 +72,7 @@ jobs:
fi
# Verify non-zero data in binary (not all 0xFF padding).
NONZERO=$(xxd -l 1024 -p "$BIN" | tr -d 'f' | wc -c)
NONZERO=$(od -A n -t x1 -N 1024 "$BIN" | tr -d ' f\n' | wc -c)
if [ "$NONZERO" -lt 100 ]; then
echo "::error::Binary appears to be mostly padding (non-zero chars: $NONZERO)"
ERRORS=$((ERRORS + 1))
@@ -97,4 +98,5 @@ jobs:
firmware/esp32-csi-node/build/esp32-csi-node.bin
firmware/esp32-csi-node/build/bootloader/bootloader.bin
firmware/esp32-csi-node/build/partition_table/partition-table.bin
retention-days: 30
firmware/esp32-csi-node/build/ota_data_initial.bin
retention-days: 90
+59
View File
@@ -5,6 +5,65 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [v0.5.3-esp32] — 2026-03-30
### Added
- **Cross-node RSSI-weighted feature fusion** — Multiple ESP32 nodes fuse CSI features using RSSI-based weighting. Closer node gets higher weight. Reduces variance noise by 29%, keypoint jitter by 72%.
- **DynamicMinCut person separation** — Uses `ruvector_mincut::DynamicMinCut` on the subcarrier temporal correlation graph to detect independent motion clusters. Replaces variance-based heuristic for multi-person counting.
- **RSSI-based position tracking** — Skeleton position driven by RSSI differential between nodes. Walk between ESP32s and the skeleton follows you.
- **Per-node state pipeline (ADR-068)** — Each ESP32 node gets independent `HashMap<u8, NodeState>` with frame history, classification, vitals, and person count. Fixes #249 (the #1 user-reported issue).
- **RuVector Phase 1-3 integration** — Subcarrier importance weighting, temporal keypoint smoothing (EMA), coherence gating, skeleton kinematic constraints (Jakobsen relaxation), compressed pose history.
- **Client-side lerp smoothing** — UI keypoints interpolate between frames (alpha=0.15) for fluid skeleton movement.
- **Multi-node mesh tests** — 8 integration tests covering 1-255 node configurations.
- **`wifi_densepose` Python package** — `from wifi_densepose import WiFiDensePose` now works (#314).
### Fixed
- **Watchdog crash on busy LANs (#321)** — Batch-limited edge_dsp to 4 frames before 20ms yield. Fixed idle-path busy-spin (`pdMS_TO_TICKS(5)==0`).
- **No detection from edge vitals (#323)** — Server now generates `sensing_update` from Tier 2+ vitals packets.
- **RSSI byte offset mismatch (#332)** — Server parsed RSSI from wrong byte (was reading sequence counter).
- **Stack overflow risk** — Moved 4KB of BPM scratch buffers from stack to static storage.
- **Stale node memory leak** — `node_states` HashMap evicts nodes inactive >60s.
- **Unsafe raw pointer removed** — Replaced with safe `.clone()` for adaptive model borrow.
- **Firmware CI** — Upgraded to IDF v5.4, replaced `xxd` with `od` (#327).
- **Person count double-counting** — Multi-node aggregation changed from `sum` to `max`.
- **Skeleton jitter** — Removed tick-based noise, dampened procedural animation, recalibrated feature scaling for real ESP32 data.
### Changed
- Motion-responsive skeleton: arm swing (0-80px) driven by CSI variance, leg kick (0-50px) by motion_band_power, vertical bob when walking.
- Person count thresholds recalibrated for real ESP32 hardware (1→2 at 0.70, EMA alpha 0.04).
- Vital sign filtering: larger median window (31), faster EMA (0.05), looser HR jump filter (15 BPM).
- Vendored ruvector updated to v2.1.0-40 (316 commits ahead).
### Benchmarks (2-node mesh, COM6 + COM9, 30s)
| Metric | Baseline | v0.5.3 | Improvement |
|--------|----------|--------|-------------|
| Variance noise | 109.4 | 77.6 | **-29%** |
| Feature stability | std=154.1 | std=105.4 | **-32%** |
| Keypoint jitter | std=4.5px | std=1.3px | **-72%** |
| Confidence | 0.643 | 0.686 | **+7%** |
| Presence accuracy | 93.4% | 94.6% | **+1.3pp** |
### Verified
- Real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net WiFi
- All 284 Rust tests pass, 352 signal crate tests pass
- Firmware builds clean at 843 KB
- QEMU CI: 11/11 jobs green
## [v0.5.2-esp32] — 2026-03-28
### Fixed
- RSSI byte offset in frame parser (#332)
- Per-node state pipeline for multi-node sensing (#249)
- Firmware CI upgraded to IDF v5.4 (#327)
## [v0.5.1-esp32] — 2026-03-27
### Fixed
- Watchdog crash on busy LANs (#321)
- No detection from edge vitals (#323)
- `wifi_densepose` Python package import (#314)
- Pre-compiled firmware binaries added to release
## [v0.5.0-esp32] — 2026-03-15
### Added
+1 -1
View File
@@ -1,7 +1,7 @@
# π RuView
<p align="center">
<a href="https://ruvnet.github.io/RuView/">
<a href="https://x.com/rUv/status/2037556932802761004">
<img src="assets/ruview-small-gemini.jpg" alt="RuView - WiFi DensePose" width="100%">
</a>
</p>
File diff suppressed because one or more lines are too long
+182
View File
@@ -0,0 +1,182 @@
# ADR-068: Per-Node State Pipeline for Multi-Node Sensing
| Field | Value |
|------------|-------------------------------------|
| Status | Accepted |
| Date | 2026-03-27 |
| Authors | rUv, claude-flow |
| Drivers | #249, #237, #276, #282 |
| Supersedes | — |
## Context
The sensing server (`wifi-densepose-sensing-server`) was originally designed for
single-node operation. When multiple ESP32 nodes send CSI frames simultaneously,
all data is mixed into a single shared pipeline:
- **One** `frame_history` VecDeque for all nodes
- **One** `smoothed_person_score` / `smoothed_motion` / vital sign buffers
- **One** baseline and debounce state
This means the classification, person count, and vital signs reported to the UI
are an uncontrolled aggregate of all nodes' data. The result: the detection
window shows identical output regardless of how many nodes are deployed, where
people stand, or how many people are in the room (#249 — 24 comments, the most
reported issue).
### Root Cause Verified
Investigation of `AppStateInner` (main.rs lines 279-367) confirmed:
| Shared field | Impact |
|---------------------------|--------------------------------------------|
| `frame_history` | Temporal analysis mixes all nodes' CSI data |
| `smoothed_person_score` | Person count aggregates all nodes |
| `smoothed_motion` | Motion classification undifferentiated |
| `smoothed_hr` / `br` | Vital signs are global, not per-node |
| `baseline_motion` | Adaptive baseline learned from mixed data |
| `debounce_counter` | All nodes share debounce state |
## Decision
Introduce **per-node state tracking** via a `HashMap<u8, NodeState>` in
`AppStateInner`. Each ESP32 node (identified by its `node_id` byte) gets an
independent sensing pipeline with its own temporal history, smoothing buffers,
baseline, and classification state.
### Architecture
```
┌─────────────────────────────────────────┐
UDP frames │ AppStateInner │
───────────► │ │
node_id=1 ──► │ node_states: HashMap<u8, NodeState> │
node_id=2 ──► │ ├── 1: NodeState { frame_history, │
node_id=3 ──► │ │ smoothed_motion, vitals, ... }│
│ ├── 2: NodeState { ... } │
│ └── 3: NodeState { ... } │
│ │
│ ┌── Per-Node Pipeline ──┐ │
│ │ extract_features() │ │
│ │ smooth_and_classify() │ │
│ │ smooth_vitals() │ │
│ │ score_to_person_count()│ │
│ └────────────────────────┘ │
│ │
│ ┌── Multi-Node Fusion ──┐ │
│ │ Aggregate person count │ │
│ │ Per-node classification│ │
│ │ All-nodes WebSocket msg│ │
│ └────────────────────────┘ │
│ │
│ ──► WebSocket broadcast (sensing_update) │
└─────────────────────────────────────────┘
```
### NodeState Struct
```rust
struct NodeState {
frame_history: VecDeque<Vec<f64>>,
smoothed_person_score: f64,
prev_person_count: usize,
smoothed_motion: f64,
current_motion_level: String,
debounce_counter: u32,
debounce_candidate: String,
baseline_motion: f64,
baseline_frames: u64,
smoothed_hr: f64,
smoothed_br: f64,
smoothed_hr_conf: f64,
smoothed_br_conf: f64,
hr_buffer: VecDeque<f64>,
br_buffer: VecDeque<f64>,
rssi_history: VecDeque<f64>,
vital_detector: VitalSignDetector,
latest_vitals: VitalSigns,
last_frame_time: Option<std::time::Instant>,
edge_vitals: Option<Esp32VitalsPacket>,
}
```
### Multi-Node Aggregation
- **Person count**: Sum of per-node `prev_person_count` for active nodes
(seen within last 10 seconds).
- **Classification**: Per-node classification included in `SensingUpdate.nodes`.
- **Vital signs**: Per-node vital signs; UI can render per-node or aggregate.
- **Signal field**: Generated from the most-recently-updated node's features.
- **Stale nodes**: Nodes with no frame for >10 seconds are excluded from
aggregation and marked offline (consistent with PR #300).
### Backward Compatibility
- The simulated data path (`simulated_data_task`) continues using global state.
- Single-node deployments behave identically (HashMap has one entry).
- The WebSocket message format (`sensing_update`) remains the same but the
`nodes` array now contains all active nodes, and `estimated_persons` reflects
the cross-node aggregate.
- The edge vitals path (#323 fix) also uses per-node state.
## Scaling Characteristics
| Nodes | Per-Node Memory | Total Overhead | Notes |
|-------|----------------|----------------|-------|
| 1 | ~50 KB | ~50 KB | Identical to current |
| 3 | ~50 KB | ~150 KB | Typical home setup |
| 10 | ~50 KB | ~500 KB | Small office |
| 50 | ~50 KB | ~2.5 MB | Building floor |
| 100 | ~50 KB | ~5 MB | Large deployment |
| 256 | ~50 KB | ~12.8 MB | Max (u8 node_id) |
Memory is dominated by `frame_history` (100 frames x ~500 bytes each = ~50 KB
per node). This scales linearly and fits comfortably in server memory even at
256 nodes.
## QEMU Validation
The existing QEMU swarm infrastructure (ADR-062, `scripts/qemu_swarm.py`)
supports multi-node simulation with configurable topologies:
- `star`: Central coordinator + sensor nodes
- `mesh`: Fully connected peer network
- `line`: Sequential chain
- `ring`: Circular topology
Each QEMU instance runs with a unique `node_id` via NVS provisioning. The
swarm health validator (`scripts/swarm_health.py`) checks per-node UART output.
Validation plan:
1. QEMU swarm with 3-5 nodes in mesh topology
2. Verify server produces distinct per-node classifications
3. Verify aggregate person count reflects multi-node contributions
4. Verify stale-node eviction after timeout
## Consequences
### Positive
- Each node's CSI data is processed independently — no cross-contamination
- Person count scales with the number of deployed nodes
- Vital signs are per-node, enabling room-level health monitoring
- Foundation for spatial localization (per-node positions + triangulation)
- Scales to 256 nodes with <13 MB memory overhead
### Negative
- Slightly more memory per node (~50 KB each)
- `smooth_and_classify_node` function duplicates some logic from global version
- Per-node `VitalSignDetector` instances add CPU cost proportional to node count
### Risks
- Node ID collisions (mitigated by NVS persistence since v0.5.0)
- HashMap growth without cleanup (mitigated by stale-node eviction)
## References
- Issue #249: Detection window same regardless (24 comments)
- Issue #237: Same display for 0/1/2 people (12 comments)
- Issue #276: Only one can be detected (8 comments)
- Issue #282: Detection fail (5 comments)
- PR #295: Hysteresis smoothing (partial mitigation)
- PR #300: ESP32 offline detection after 5s
- ADR-062: QEMU Swarm Configurator
+21 -20
View File
@@ -43,6 +43,12 @@ static const char *TAG = "edge_proc";
static edge_ring_buf_t s_ring;
static uint32_t s_ring_drops; /* Frames dropped due to full ring buffer. */
/* Scratch buffers for BPM estimation — moved from stack to static to avoid
* stack overflow. process_frame + update_multi_person_vitals combined used
* ~6.5-7.5 KB of the 8 KB task stack. These save ~4 KB of stack. */
static float s_scratch_br[EDGE_PHASE_HISTORY_LEN];
static float s_scratch_hr[EDGE_PHASE_HISTORY_LEN];
static inline bool ring_push(const uint8_t *iq, uint16_t len,
int8_t rssi, uint8_t channel)
{
@@ -513,20 +519,18 @@ static void update_multi_person_vitals(const uint8_t *iq_data, uint16_t n_sc,
/* Estimate BPM when we have enough history. */
if (pv->history_len >= 64) {
/* Build contiguous buffer for zero-crossing. */
float br_buf[EDGE_PHASE_HISTORY_LEN];
float hr_buf[EDGE_PHASE_HISTORY_LEN];
/* Build contiguous buffer (reuse static scratch to save ~2 KB stack). */
uint16_t buf_len = pv->history_len;
for (uint16_t i = 0; i < buf_len; i++) {
uint16_t ri = (pv->history_idx + EDGE_PHASE_HISTORY_LEN
- buf_len + i) % EDGE_PHASE_HISTORY_LEN;
br_buf[i] = s_person_br_filt[p][ri];
hr_buf[i] = s_person_hr_filt[p][ri];
s_scratch_br[i] = s_person_br_filt[p][ri];
s_scratch_hr[i] = s_person_hr_filt[p][ri];
}
float br = estimate_bpm_zero_crossing(br_buf, buf_len, sample_rate);
float hr = estimate_bpm_zero_crossing(hr_buf, buf_len, sample_rate);
float br = estimate_bpm_zero_crossing(s_scratch_br, buf_len, sample_rate);
float hr = estimate_bpm_zero_crossing(s_scratch_hr, buf_len, sample_rate);
/* Sanity clamp. */
if (br >= 6.0f && br <= 40.0f) pv->breathing_bpm = br;
@@ -690,20 +694,18 @@ static void process_frame(const edge_ring_slot_t *slot)
/* --- Step 7: BPM estimation (zero-crossing) --- */
if (s_history_len >= 64) {
/* Build contiguous buffers from ring. */
float br_buf[EDGE_PHASE_HISTORY_LEN];
float hr_buf[EDGE_PHASE_HISTORY_LEN];
/* Build contiguous buffers from ring (using static scratch to save stack). */
uint16_t buf_len = s_history_len;
for (uint16_t i = 0; i < buf_len; i++) {
uint16_t ri = (s_history_idx + EDGE_PHASE_HISTORY_LEN
- buf_len + i) % EDGE_PHASE_HISTORY_LEN;
br_buf[i] = s_breathing_filtered[ri];
hr_buf[i] = s_heartrate_filtered[ri];
s_scratch_br[i] = s_breathing_filtered[ri];
s_scratch_hr[i] = s_heartrate_filtered[ri];
}
float br_bpm = estimate_bpm_zero_crossing(br_buf, buf_len, sample_rate);
float hr_bpm = estimate_bpm_zero_crossing(hr_buf, buf_len, sample_rate);
float br_bpm = estimate_bpm_zero_crossing(s_scratch_br, buf_len, sample_rate);
float hr_bpm = estimate_bpm_zero_crossing(s_scratch_hr, buf_len, sample_rate);
/* Sanity clamp: breathing 6-40 BPM, heart rate 40-180 BPM. */
if (br_bpm >= 6.0f && br_bpm <= 40.0f) s_breathing_bpm = br_bpm;
@@ -839,12 +841,11 @@ static void edge_task(void *arg)
* Without a batch limit the task processes frames back-to-back with
* only 1-tick yields, which on high frame rates can still starve
* IDLE1 enough to trip the 5-second task watchdog. See #266, #321. */
const uint8_t BATCH_LIMIT = 4;
while (1) {
uint8_t processed = 0;
while (processed < BATCH_LIMIT && ring_pop(&slot)) {
while (processed < EDGE_BATCH_LIMIT && ring_pop(&slot)) {
process_frame(&slot);
processed++;
/* 1-tick yield between frames within a batch. */
@@ -852,10 +853,10 @@ static void edge_task(void *arg)
}
if (processed > 0) {
/* Post-batch yield: 2 ticks (~20 ms at 100 Hz) so IDLE1 can
* run and feed the Core 1 watchdog even under sustained load.
* This is intentionally longer than the 1-tick inter-frame yield. */
vTaskDelay(2);
/* Post-batch yield: ~20 ms so IDLE1 can run and feed the
* Core 1 watchdog even under sustained load. Uses pdMS_TO_TICKS
* for tick-rate independence (minimum 1 tick). */
{ TickType_t d = pdMS_TO_TICKS(20); vTaskDelay(d > 0 ? d : 1); }
} else {
/* No frames available — sleep one full tick.
* NOTE: pdMS_TO_TICKS(5) == 0 at 100 Hz, which would busy-spin. */
@@ -46,6 +46,9 @@
#define EDGE_FALL_COOLDOWN_MS 5000 /**< Minimum ms between fall alerts (debounce). */
#define EDGE_FALL_CONSEC_MIN 3 /**< Consecutive frames above threshold to trigger. */
/* ---- DSP task tuning ---- */
#define EDGE_BATCH_LIMIT 4 /**< Max frames per batch before longer yield. */
/* ---- SPSC ring buffer slot ---- */
typedef struct {
uint8_t iq_data[EDGE_MAX_IQ_BYTES]; /**< Raw I/Q bytes from CSI callback. */
Binary file not shown.
File diff suppressed because one or more lines are too long
@@ -0,0 +1,33 @@
# ESP32-S3 CSI Node — Default SDK Configuration
# This file is applied automatically by idf.py when no sdkconfig exists.
# Target: ESP32-S3
CONFIG_IDF_TARGET="esp32s3"
# Use custom partition table (8MB flash with OTA — ADR-045)
CONFIG_PARTITION_TABLE_CUSTOM=y
CONFIG_PARTITION_TABLE_CUSTOM_FILENAME="partitions_display.csv"
# Flash configuration: 8MB (Quad SPI)
CONFIG_ESPTOOLPY_FLASHSIZE_8MB=y
CONFIG_ESPTOOLPY_FLASHSIZE="8MB"
# Compiler optimization: optimize for size to reduce binary
CONFIG_COMPILER_OPTIMIZATION_SIZE=y
# Enable CSI (Channel State Information) in WiFi driver
CONFIG_ESP_WIFI_CSI_ENABLED=y
# NVS encryption disabled by default (requires eFuse provisioning).
# Enable only after burning HMAC key to eFuse block.
# CONFIG_NVS_ENCRYPTION is not set
# Disable unused features to reduce binary size
CONFIG_BOOTLOADER_LOG_LEVEL_WARN=y
CONFIG_LOG_DEFAULT_LEVEL_INFO=y
# LWIP: enable extended socket options for UDP multicast
CONFIG_LWIP_SO_RCVBUF=y
# FreeRTOS: increase task stack for CSI processing
CONFIG_ESP_MAIN_TASK_STACK_SIZE=8192
File diff suppressed because it is too large Load Diff
@@ -0,0 +1 @@
{"intelligence":35,"timestamp":1774903706609}
+1
View File
@@ -117,6 +117,7 @@ midstreamer-temporal-compare = "0.1.0"
midstreamer-attractor = "0.1.0"
# ruvector integration (published on crates.io)
# Vendored at v2.1.0 in vendor/ruvector; using crates.io versions until published.
ruvector-mincut = "2.0.4"
ruvector-attn-mincut = "2.0.4"
ruvector-temporal-tensor = "2.0.4"
@@ -21,3 +21,4 @@ pub use bvp::attention_weighted_bvp;
pub use fresnel::solve_fresnel_geometry;
pub use spectrogram::gate_spectrogram;
pub use subcarrier::mincut_subcarrier_partition;
pub use subcarrier::subcarrier_importance_weights;
@@ -142,6 +142,29 @@ pub fn mincut_subcarrier_partition(sensitivity: &[f32]) -> (Vec<usize>, Vec<usiz
}
}
/// Convert a mincut partition into per-subcarrier importance weights.
///
/// Sensitive subcarriers (high body-motion correlation) get weight > 1.0,
/// insensitive ones get weight 0.5. This allows downstream feature extraction
/// to emphasise the most informative subcarriers.
pub fn subcarrier_importance_weights(sensitivity: &[f32]) -> Vec<f32> {
if sensitivity.is_empty() {
return vec![];
}
let (sensitive, _insensitive) = mincut_subcarrier_partition(sensitivity);
let max_sens = sensitivity
.iter()
.cloned()
.fold(f32::NEG_INFINITY, f32::max)
.max(1e-9);
let mut weights = vec![0.5f32; sensitivity.len()];
for &idx in &sensitive {
weights[idx] = 1.0 + (sensitivity[idx] / max_sens).min(1.0);
}
weights
}
#[cfg(test)]
mod tests {
use super::*;
@@ -175,4 +198,38 @@ mod tests {
assert_eq!(s, vec![0]);
assert!(i.is_empty());
}
#[test]
fn test_importance_weights_empty() {
let w = subcarrier_importance_weights(&[]);
assert!(w.is_empty());
}
#[test]
fn test_importance_weights_all_equal() {
let sensitivity = vec![1.0f32; 8];
let w = subcarrier_importance_weights(&sensitivity);
assert_eq!(w.len(), 8);
// All subcarriers have identical sensitivity so all should be classified
// the same way (either all sensitive or all insensitive after mincut).
// At minimum, no weight should exceed 2.0 or be negative.
for &wt in &w {
assert!(wt >= 0.5 && wt <= 2.0, "weight {wt} out of range");
}
}
#[test]
fn test_importance_weights_sensitive_higher() {
// First 5 subcarriers have high sensitivity, last 5 low.
let sensitivity: Vec<f32> = (0..10).map(|i| if i < 5 { 0.9 } else { 0.1 }).collect();
let w = subcarrier_importance_weights(&sensitivity);
assert_eq!(w.len(), 10);
let mean_high: f32 = w[..5].iter().sum::<f32>() / 5.0;
let mean_low: f32 = w[5..].iter().sum::<f32>() / 5.0;
assert!(
mean_high > mean_low,
"sensitive subcarriers should have higher mean weight ({mean_high}) than insensitive ({mean_low})"
);
}
}
@@ -43,5 +43,8 @@ clap = { workspace = true }
# Multi-BSSID WiFi scanning pipeline (ADR-022 Phase 3)
wifi-densepose-wifiscan = { version = "0.3.0", path = "../wifi-densepose-wifiscan" }
# RuVector graph min-cut for person separation (ADR-068)
ruvector-mincut = { workspace = true }
[dev-dependencies]
tempfile = "3.10"
File diff suppressed because it is too large Load Diff
@@ -0,0 +1,233 @@
//! Integration test: multi-node per-node state isolation (ADR-068, #249).
//!
//! Sends simulated ESP32 CSI frames from multiple node IDs to the server's
//! UDP port and verifies that:
//! 1. Each node gets independent state (no cross-contamination)
//! 2. Person count aggregates across active nodes
//! 3. Stale nodes are excluded from aggregation
//!
//! This does NOT require QEMU — it sends raw UDP packets directly.
use std::net::UdpSocket;
use std::time::Duration;
/// Build a minimal valid ESP32 CSI frame (magic 0xC511_0001).
///
/// Format (ADR-018):
/// [0..3] magic: 0xC511_0001 (LE)
/// [4] node_id
/// [5] n_antennas (1)
/// [6] n_subcarriers (e.g., 32)
/// [7] reserved
/// [8..9] freq_mhz (2437 = channel 6)
/// [10..13] sequence (LE u32)
/// [14] rssi (signed)
/// [15] noise_floor
/// [16..19] reserved
/// [20..] I/Q pairs (n_antennas * n_subcarriers * 2 bytes)
fn build_csi_frame(node_id: u8, seq: u32, rssi: i8, n_sub: u8) -> Vec<u8> {
let n_pairs = n_sub as usize;
let mut buf = vec![0u8; 20 + n_pairs * 2];
// Magic
let magic: u32 = 0xC511_0001;
buf[0..4].copy_from_slice(&magic.to_le_bytes());
buf[4] = node_id;
buf[5] = 1; // n_antennas
buf[6] = n_sub;
buf[7] = 0;
// freq = 2437 MHz (channel 6)
let freq: u16 = 2437;
buf[8..10].copy_from_slice(&freq.to_le_bytes());
// sequence
buf[10..14].copy_from_slice(&seq.to_le_bytes());
buf[14] = rssi as u8;
buf[15] = (-90i8) as u8; // noise floor
// Generate I/Q pairs with node-specific patterns.
// Different nodes produce different amplitude patterns so the server
// computes different features for each.
for i in 0..n_pairs {
let phase = (i as f64 + node_id as f64 * 0.5) * 0.3;
let amplitude = 20.0 + (node_id as f64) * 5.0 + (phase.sin() * 10.0);
let i_val = (amplitude * phase.cos()) as i8;
let q_val = (amplitude * phase.sin()) as i8;
buf[20 + i * 2] = i_val as u8;
buf[20 + i * 2 + 1] = q_val as u8;
}
buf
}
/// Build an edge vitals packet (magic 0xC511_0002).
fn build_vitals_packet(node_id: u8, presence: bool, n_persons: u8, rssi: i8) -> Vec<u8> {
let mut buf = vec![0u8; 32];
let magic: u32 = 0xC511_0002;
buf[0..4].copy_from_slice(&magic.to_le_bytes());
buf[4] = node_id;
buf[5] = if presence { 0x01 } else { 0x00 }; // flags
// breathing_rate (u16 LE) = 15.0 * 100 = 1500
buf[6..8].copy_from_slice(&1500u16.to_le_bytes());
// heartrate (u32 LE) = 72.0 * 10000 = 720000
buf[8..12].copy_from_slice(&720000u32.to_le_bytes());
buf[12] = rssi as u8;
buf[13] = n_persons;
// bytes 14-15: reserved
// motion_energy (f32 LE)
let me: f32 = if presence { 0.5 } else { 0.0 };
buf[16..20].copy_from_slice(&me.to_le_bytes());
// presence_score (f32 LE)
let ps: f32 = if presence { 0.8 } else { 0.0 };
buf[20..24].copy_from_slice(&ps.to_le_bytes());
// timestamp_ms (u32 LE)
buf[24..28].copy_from_slice(&1000u32.to_le_bytes());
buf
}
#[test]
fn test_csi_frame_builder_valid() {
let frame = build_csi_frame(1, 0, -50, 32);
assert_eq!(frame.len(), 20 + 32 * 2);
assert_eq!(u32::from_le_bytes([frame[0], frame[1], frame[2], frame[3]]), 0xC511_0001);
assert_eq!(frame[4], 1); // node_id
assert_eq!(frame[5], 1); // n_antennas
assert_eq!(frame[6], 32); // n_subcarriers
}
#[test]
fn test_vitals_packet_builder_valid() {
let pkt = build_vitals_packet(2, true, 1, -45);
assert_eq!(pkt.len(), 32);
assert_eq!(u32::from_le_bytes([pkt[0], pkt[1], pkt[2], pkt[3]]), 0xC511_0002);
assert_eq!(pkt[4], 2); // node_id
assert_eq!(pkt[5], 0x01); // flags: presence
assert_eq!(pkt[13], 1); // n_persons
}
#[test]
fn test_different_nodes_produce_different_frames() {
let frame1 = build_csi_frame(1, 0, -50, 32);
let frame2 = build_csi_frame(2, 0, -50, 32);
// I/Q data should differ due to node_id-based amplitude offset
assert_ne!(&frame1[20..], &frame2[20..]);
}
/// Send multiple frames from different nodes to a UDP port.
/// This test verifies the packet format is accepted by a real server
/// if one is running, but doesn't fail if no server is available.
#[test]
fn test_multi_node_udp_send() {
// Try to bind to a random port and send to localhost:5005
// This is a smoke test — it verifies frames can be sent without panic.
let sock = UdpSocket::bind("0.0.0.0:0").expect("bind");
sock.set_write_timeout(Some(Duration::from_millis(100))).ok();
let n_sub = 32u8;
let node_ids = [1u8, 2, 3, 5, 7];
for &nid in &node_ids {
for seq in 0..10u32 {
let frame = build_csi_frame(nid, seq, -50 + nid as i8, n_sub);
// Send to localhost:5005 (won't fail even if nothing is listening)
let _ = sock.send_to(&frame, "127.0.0.1:5005");
}
}
// Also send vitals packets
for &nid in &node_ids {
let pkt = build_vitals_packet(nid, true, 1, -45);
let _ = sock.send_to(&pkt, "127.0.0.1:5005");
}
// If we get here without panic, the frame builders work correctly
assert!(true, "Multi-node UDP send completed without errors");
}
/// Verify that the frame builder produces frames of the correct minimum
/// size for various subcarrier counts (boundary testing).
#[test]
fn test_frame_sizes() {
for n_sub in [1u8, 16, 32, 52, 56, 64, 128] {
let frame = build_csi_frame(1, 0, -50, n_sub);
let expected = 20 + (n_sub as usize) * 2;
assert_eq!(frame.len(), expected, "wrong size for n_sub={n_sub}");
}
}
/// Simulate a mesh of N nodes sending frames at different rates.
/// Nodes 1-3 send every "tick", node 4 sends every other tick,
/// node 5 stops after 5 ticks (simulating going offline).
#[test]
fn test_mesh_simulation_pattern() {
let sock = UdpSocket::bind("0.0.0.0:0").expect("bind");
sock.set_write_timeout(Some(Duration::from_millis(50))).ok();
let mut total_sent = 0u32;
for tick in 0..20u32 {
// Nodes 1-3: every tick
for nid in 1..=3u8 {
let frame = build_csi_frame(nid, tick, -50, 32);
let _ = sock.send_to(&frame, "127.0.0.1:5005");
total_sent += 1;
}
// Node 4: every other tick
if tick % 2 == 0 {
let frame = build_csi_frame(4, tick / 2, -55, 32);
let _ = sock.send_to(&frame, "127.0.0.1:5005");
total_sent += 1;
}
// Node 5: stops after tick 5
if tick < 5 {
let frame = build_csi_frame(5, tick, -60, 32);
let _ = sock.send_to(&frame, "127.0.0.1:5005");
total_sent += 1;
}
}
// Expected: 3*20 + 10 + 5 = 75 frames
assert_eq!(total_sent, 75, "unexpected frame count");
}
/// Large mesh: simulate 100 nodes each sending 10 frames.
/// Verifies the frame builder scales without issues.
#[test]
fn test_large_mesh_100_nodes() {
let sock = UdpSocket::bind("0.0.0.0:0").expect("bind");
sock.set_write_timeout(Some(Duration::from_millis(50))).ok();
let mut total = 0u32;
for nid in 1..=100u8 {
for seq in 0..10u32 {
let frame = build_csi_frame(nid, seq, -50 + (nid % 30) as i8, 32);
let _ = sock.send_to(&frame, "127.0.0.1:5005");
total += 1;
}
}
assert_eq!(total, 1000);
}
/// Max mesh: simulate 255 nodes (max u8 node_id) with 1 frame each.
#[test]
fn test_max_nodes_255() {
let sock = UdpSocket::bind("0.0.0.0:0").expect("bind");
sock.set_write_timeout(Some(Duration::from_millis(100))).ok();
for nid in 1..=255u8 {
let frame = build_csi_frame(nid, 0, -50, 16);
let _ = sock.send_to(&frame, "127.0.0.1:5005");
}
// 255 unique node_ids — the HashMap should handle this fine
assert!(true);
}
@@ -61,7 +61,10 @@ pub use coherence_gate::{GateDecision, GatePolicy};
pub use multiband::MultiBandCsiFrame;
pub use multistatic::FusedSensingFrame;
pub use phase_align::{PhaseAligner, PhaseAlignError};
pub use pose_tracker::{KeypointState, PoseTrack, TrackLifecycleState};
pub use pose_tracker::{
CompressedPoseHistory, KeypointState, PoseTrack, SkeletonConstraints,
TemporalKeypointAttention, TrackLifecycleState,
};
/// Number of keypoints in a full-body pose skeleton (COCO-17).
pub const NUM_KEYPOINTS: usize = 17;
@@ -26,6 +26,8 @@
//!
//! - `ruvector-mincut` -> Person separation and track assignment
use std::collections::VecDeque;
use super::{TrackId, NUM_KEYPOINTS};
/// Errors from the pose tracker.
@@ -648,6 +650,365 @@ impl PoseDetection {
}
}
// ---------------------------------------------------------------------------
// Skeleton kinematic constraints (RuVector Phase 3)
// ---------------------------------------------------------------------------
/// Expected bone lengths in normalized coordinates (parent_idx, child_idx, length).
///
/// These define the COCO-17 kinematic tree edges with approximate proportions
/// derived from anthropometric averages. Used by [`SkeletonConstraints`] to
/// reject impossible poses (e.g., arm longer than torso).
const BONE_LENGTHS: &[(usize, usize, f32)] = &[
(5, 7, 0.15), // L shoulder -> L elbow
(7, 9, 0.14), // L elbow -> L wrist
(6, 8, 0.15), // R shoulder -> R elbow
(8, 10, 0.14), // R elbow -> R wrist
(5, 11, 0.25), // L shoulder -> L hip
(6, 12, 0.25), // R shoulder -> R hip
(11, 13, 0.22), // L hip -> L knee
(13, 15, 0.22), // L knee -> L ankle
(12, 14, 0.22), // R hip -> R knee
(14, 16, 0.22), // R knee -> R ankle
(5, 6, 0.18), // L shoulder -> R shoulder
(11, 12, 0.15), // L hip -> R hip
];
/// Skeleton kinematic constraint enforcer using Jakobsen relaxation.
///
/// Iteratively projects bone lengths toward their expected values so that
/// the resulting skeleton obeys basic anthropometric limits. Bones that
/// deviate more than [`Self::TOLERANCE`] (30 %) from their rest length are
/// corrected over [`Self::ITERATIONS`] passes.
pub struct SkeletonConstraints;
impl SkeletonConstraints {
/// Maximum allowed fractional deviation before correction kicks in.
const TOLERANCE: f32 = 0.30;
/// Number of Jakobsen relaxation iterations.
const ITERATIONS: usize = 3;
/// Enforce kinematic constraints in-place on `keypoints`.
///
/// Each element is `[x, y, z]`. The method runs several iterations of
/// distance-constraint projection (Jakobsen method) over the edges
/// defined in [`BONE_LENGTHS`].
pub fn enforce_constraints(keypoints: &mut [[f32; 3]; 17]) {
for _ in 0..Self::ITERATIONS {
for &(a, b, rest_len) in BONE_LENGTHS {
let dx = keypoints[b][0] - keypoints[a][0];
let dy = keypoints[b][1] - keypoints[a][1];
let dz = keypoints[b][2] - keypoints[a][2];
let current_len = (dx * dx + dy * dy + dz * dz).sqrt();
// Skip degenerate / zero-length bones (e.g. all-zero pose).
if current_len < 1e-9 {
continue;
}
let ratio = current_len / rest_len;
// Only correct if deviation exceeds tolerance.
if ratio < (1.0 - Self::TOLERANCE) || ratio > (1.0 + Self::TOLERANCE) {
let correction = (rest_len - current_len) / current_len * 0.5;
let cx = dx * correction;
let cy = dy * correction;
let cz = dz * correction;
keypoints[a][0] -= cx;
keypoints[a][1] -= cy;
keypoints[a][2] -= cz;
keypoints[b][0] += cx;
keypoints[b][1] += cy;
keypoints[b][2] += cz;
}
}
}
}
}
// ---------------------------------------------------------------------------
// Compressed pose history (RuVector Phase 3 -- temporal tensor)
// ---------------------------------------------------------------------------
/// Two-tier compressed pose history.
///
/// Recent poses are stored at full `f32` precision in the *hot* ring buffer.
/// Once the hot buffer is full the oldest pose is quantised to `i16` and
/// pushed into the *warm* tier, keeping memory usage bounded while still
/// allowing similarity queries against a longer temporal window.
pub struct CompressedPoseHistory {
/// Recent poses at full precision.
hot: VecDeque<[[f32; 3]; 17]>,
/// Older poses quantised to i16.
warm: VecDeque<[[i16; 3]; 17]>,
/// Scale factor used for warm quantisation (divide f32, multiply to
/// reconstruct).
scale: f32,
max_hot: usize,
max_warm: usize,
}
impl CompressedPoseHistory {
/// Create a new history with the given tier sizes.
///
/// `scale` controls the fixed-point quantisation: warm values are stored
/// as `(value / scale).round() as i16`.
pub fn new(max_hot: usize, max_warm: usize, scale: f32) -> Self {
Self {
hot: VecDeque::with_capacity(max_hot),
warm: VecDeque::with_capacity(max_warm),
scale: if scale.abs() < 1e-12 { 1.0 } else { scale },
max_hot,
max_warm,
}
}
/// Push a new pose into the history.
///
/// When the hot tier is full the oldest entry is quantised and moved to
/// the warm tier. When the warm tier overflows the oldest warm entry is
/// discarded.
pub fn push(&mut self, pose: &[[f32; 3]; 17]) {
if self.hot.len() >= self.max_hot {
if let Some(evicted) = self.hot.pop_front() {
let quantised = self.quantise(&evicted);
if self.warm.len() >= self.max_warm {
self.warm.pop_front();
}
self.warm.push_back(quantised);
}
}
self.hot.push_back(*pose);
}
/// Cosine similarity between `pose` and the most recent stored pose.
///
/// Both poses are flattened to 51-element vectors before the dot-product
/// is computed. Returns 0.0 when the history is empty or either vector
/// has zero norm.
pub fn similarity(&self, pose: &[[f32; 3]; 17]) -> f32 {
let recent = match self.hot.back() {
Some(r) => r,
None => return 0.0,
};
let mut dot = 0.0_f32;
let mut norm_a = 0.0_f32;
let mut norm_b = 0.0_f32;
for kp in 0..17 {
for d in 0..3 {
let a = recent[kp][d];
let b = pose[kp][d];
dot += a * b;
norm_a += a * a;
norm_b += b * b;
}
}
let denom = (norm_a * norm_b).sqrt();
if denom < 1e-12 {
return 0.0;
}
(dot / denom).clamp(-1.0, 1.0)
}
/// Total number of stored poses (hot + warm).
pub fn len(&self) -> usize {
self.hot.len() + self.warm.len()
}
/// Returns `true` when the history contains no poses.
pub fn is_empty(&self) -> bool {
self.hot.is_empty() && self.warm.is_empty()
}
// -- internal helpers ---------------------------------------------------
fn quantise(&self, pose: &[[f32; 3]; 17]) -> [[i16; 3]; 17] {
let inv = 1.0 / self.scale;
let mut out = [[0_i16; 3]; 17];
for kp in 0..17 {
for d in 0..3 {
out[kp][d] = (pose[kp][d] * inv)
.round()
.clamp(i16::MIN as f32, i16::MAX as f32)
as i16;
}
}
out
}
}
impl Default for CompressedPoseHistory {
fn default() -> Self {
Self::new(10, 50, 0.001)
}
}
// ---------------------------------------------------------------------------
// Temporal Keypoint Attention (RuVector Phase 2)
// ---------------------------------------------------------------------------
/// Sliding-window temporal smoother for 17-keypoint pose estimates.
///
/// Maintains a ring buffer of the last `WINDOW_SIZE` pose frames and applies
/// exponential-decay weighted averaging to produce temporally coherent output.
/// Additionally enforces kinematic constraints: bone lengths cannot change by
/// more than 20% between consecutive frames.
///
/// This is a lightweight inline implementation that mirrors the algorithm in
/// `ruvector-attention` without pulling the crate into the sensing server.
pub struct TemporalKeypointAttention {
/// Ring buffer of recent pose frames (newest at back).
window: std::collections::VecDeque<[[f32; 3]; NUM_KEYPOINTS]>,
/// Maximum number of frames to retain.
window_size: usize,
/// Exponential decay factor per frame (e.g., 0.7 means frame t-1 has
/// weight 0.7, frame t-2 has weight 0.49, etc.).
decay: f32,
}
impl TemporalKeypointAttention {
/// Default window size (10 frames at 10-20 Hz = 0.5-1.0 s look-back).
pub const DEFAULT_WINDOW: usize = 10;
/// Default decay factor.
pub const DEFAULT_DECAY: f32 = 0.7;
/// Maximum allowed bone-length change ratio between consecutive frames.
pub const MAX_BONE_CHANGE: f32 = 0.20;
/// Create a new temporal attention smoother with default parameters.
pub fn new() -> Self {
Self {
window: std::collections::VecDeque::with_capacity(Self::DEFAULT_WINDOW),
window_size: Self::DEFAULT_WINDOW,
decay: Self::DEFAULT_DECAY,
}
}
/// Create with custom window size and decay.
pub fn with_params(window_size: usize, decay: f32) -> Self {
Self {
window: std::collections::VecDeque::with_capacity(window_size),
window_size,
decay: decay.clamp(0.0, 1.0),
}
}
/// Smooth the current keypoint estimate using the temporal window.
///
/// 1. Pushes `current` into the window (evicting oldest if full).
/// 2. Computes exponential-decay weighted average across all frames.
/// 3. Enforces bone-length constraints against the previous frame.
pub fn smooth_keypoints(
&mut self,
current: &[[f32; 3]; NUM_KEYPOINTS],
) -> [[f32; 3]; NUM_KEYPOINTS] {
// Grab the previous frame (before pushing current) for bone clamping.
let prev_frame = self.window.back().copied();
// Push current frame into the window.
if self.window.len() >= self.window_size {
self.window.pop_front();
}
self.window.push_back(*current);
// Compute weighted average with exponential decay (newest = highest weight).
let n = self.window.len();
let mut result = [[0.0_f32; 3]; NUM_KEYPOINTS];
let mut total_weight = 0.0_f32;
for (age, frame) in self.window.iter().rev().enumerate() {
let w = self.decay.powi(age as i32);
total_weight += w;
for kp in 0..NUM_KEYPOINTS {
for dim in 0..3 {
result[kp][dim] += w * frame[kp][dim];
}
}
}
if total_weight > 0.0 {
for kp in 0..NUM_KEYPOINTS {
for dim in 0..3 {
result[kp][dim] /= total_weight;
}
}
}
// Enforce bone-length constraints: no bone can change >20% from prev frame.
if let Some(prev) = prev_frame {
if n >= 2 {
Self::clamp_bone_lengths(&mut result, &prev);
}
}
result
}
/// Clamp bone lengths so they don't change by more than MAX_BONE_CHANGE
/// compared to the previous frame.
fn clamp_bone_lengths(
pose: &mut [[f32; 3]; NUM_KEYPOINTS],
prev: &[[f32; 3]; NUM_KEYPOINTS],
) {
for &(parent, child, _) in BONE_LENGTHS {
let prev_len = Self::bone_len(prev, parent, child);
if prev_len < 1e-6 {
continue; // skip degenerate bones
}
let cur_len = Self::bone_len(pose, parent, child);
if cur_len < 1e-6 {
continue;
}
let ratio = cur_len / prev_len;
let lo = 1.0 - Self::MAX_BONE_CHANGE;
let hi = 1.0 + Self::MAX_BONE_CHANGE;
if ratio < lo || ratio > hi {
// Scale the child position toward/away from parent to clamp.
let target_len = prev_len * ratio.clamp(lo, hi);
let scale = target_len / cur_len;
for dim in 0..3 {
let diff = pose[child][dim] - pose[parent][dim];
pose[child][dim] = pose[parent][dim] + diff * scale;
}
}
}
}
/// Euclidean distance between two keypoints in a pose.
fn bone_len(pose: &[[f32; 3]; NUM_KEYPOINTS], a: usize, b: usize) -> f32 {
let dx = pose[b][0] - pose[a][0];
let dy = pose[b][1] - pose[a][1];
let dz = pose[b][2] - pose[a][2];
(dx * dx + dy * dy + dz * dz).sqrt()
}
/// Number of frames currently in the window.
pub fn len(&self) -> usize {
self.window.len()
}
/// Whether the window is empty.
pub fn is_empty(&self) -> bool {
self.window.is_empty()
}
/// Clear the window (e.g., on track reset).
pub fn clear(&mut self) {
self.window.clear();
}
}
impl Default for TemporalKeypointAttention {
fn default() -> Self {
Self::new()
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -940,4 +1301,223 @@ mod tests {
track.mark_lost(); // Should not override Terminated
assert_eq!(track.lifecycle, TrackLifecycleState::Terminated);
}
// -----------------------------------------------------------------------
// SkeletonConstraints tests
// -----------------------------------------------------------------------
/// Build a plausible standing skeleton in normalised coordinates.
fn valid_skeleton() -> [[f32; 3]; 17] {
let mut kps = [[0.0_f32; 3]; 17];
// Head / face (indices 0-4) clustered near top.
kps[0] = [0.0, 1.0, 0.0]; // nose
kps[1] = [-0.02, 1.02, 0.0]; // left eye
kps[2] = [0.02, 1.02, 0.0]; // right eye
kps[3] = [-0.04, 1.0, 0.0]; // left ear
kps[4] = [0.04, 1.0, 0.0]; // right ear
// Torso
kps[5] = [-0.09, 0.85, 0.0]; // L shoulder
kps[6] = [0.09, 0.85, 0.0]; // R shoulder
kps[7] = [-0.09, 0.70, 0.0]; // L elbow (dist ~0.15 from shoulder)
kps[8] = [0.09, 0.70, 0.0]; // R elbow
kps[9] = [-0.09, 0.56, 0.0]; // L wrist (dist ~0.14 from elbow)
kps[10] = [0.09, 0.56, 0.0]; // R wrist
kps[11] = [-0.075, 0.60, 0.0]; // L hip (dist ~0.25 from shoulder)
kps[12] = [0.075, 0.60, 0.0]; // R hip
kps[13] = [-0.075, 0.38, 0.0]; // L knee (dist ~0.22 from hip)
kps[14] = [0.075, 0.38, 0.0]; // R knee
kps[15] = [-0.075, 0.16, 0.0]; // L ankle (dist ~0.22 from knee)
kps[16] = [0.075, 0.16, 0.0]; // R ankle
kps
}
#[test]
fn test_valid_skeleton_unchanged() {
let mut kps = valid_skeleton();
let before = kps;
SkeletonConstraints::enforce_constraints(&mut kps);
// Each keypoint should move by less than 0.02 (small perturbation
// from iterative relaxation on an already-valid skeleton).
for i in 0..17 {
let d = ((kps[i][0] - before[i][0]).powi(2)
+ (kps[i][1] - before[i][1]).powi(2)
+ (kps[i][2] - before[i][2]).powi(2))
.sqrt();
assert!(
d < 0.05,
"keypoint {} moved {:.4}, expected < 0.05",
i,
d
);
}
}
#[test]
fn test_stretched_bone_corrected() {
let mut kps = valid_skeleton();
// Stretch L shoulder -> L elbow to 2x expected (0.30 instead of 0.15).
kps[7] = [-0.09, 0.55, 0.0]; // push elbow far down
let dist_before = {
let dx = kps[7][0] - kps[5][0];
let dy = kps[7][1] - kps[5][1];
let dz = kps[7][2] - kps[5][2];
(dx * dx + dy * dy + dz * dz).sqrt()
};
assert!(
dist_before > 0.25,
"pre-condition: bone should be stretched, got {}",
dist_before
);
SkeletonConstraints::enforce_constraints(&mut kps);
let dist_after = {
let dx = kps[7][0] - kps[5][0];
let dy = kps[7][1] - kps[5][1];
let dz = kps[7][2] - kps[5][2];
(dx * dx + dy * dy + dz * dz).sqrt()
};
// After enforcement the bone should be much closer to the rest
// length of 0.15 (within tolerance band 0.105 .. 0.195).
assert!(
dist_after < dist_before,
"bone should be shorter after correction: before={:.4}, after={:.4}",
dist_before,
dist_after
);
}
#[test]
fn test_zero_skeleton_handled() {
// All-zero keypoints must not panic.
let mut kps = [[0.0_f32; 3]; 17];
SkeletonConstraints::enforce_constraints(&mut kps);
// Just assert it didn't panic; the result should still be all-zero
// since zero-length bones are skipped.
for kp in &kps {
assert!(kp[0].is_finite());
assert!(kp[1].is_finite());
assert!(kp[2].is_finite());
}
}
// -----------------------------------------------------------------------
// CompressedPoseHistory tests
// -----------------------------------------------------------------------
#[test]
fn compressed_history_push_and_len() {
let mut hist = CompressedPoseHistory::new(3, 5, 0.001);
assert!(hist.is_empty());
assert_eq!(hist.len(), 0);
let pose = valid_skeleton();
hist.push(&pose);
assert_eq!(hist.len(), 1);
assert!(!hist.is_empty());
// Fill hot
hist.push(&pose);
hist.push(&pose);
assert_eq!(hist.len(), 3); // 3 hot, 0 warm
// Overflow hot -> warm promotion
hist.push(&pose);
assert_eq!(hist.len(), 4); // 3 hot, 1 warm
}
#[test]
fn compressed_history_warm_overflow() {
let mut hist = CompressedPoseHistory::new(2, 2, 0.001);
let pose = valid_skeleton();
// Push 6 poses: hot=2, warm should cap at 2
for _ in 0..6 {
hist.push(&pose);
}
// hot=2, warm capped at 2
assert_eq!(hist.len(), 4);
}
#[test]
fn compressed_history_similarity_identical() {
let mut hist = CompressedPoseHistory::default();
let pose = valid_skeleton();
hist.push(&pose);
let sim = hist.similarity(&pose);
assert!(
(sim - 1.0).abs() < 1e-5,
"identical pose should have similarity ~1.0, got {}",
sim
);
}
#[test]
fn compressed_history_similarity_empty() {
let hist = CompressedPoseHistory::default();
let pose = valid_skeleton();
assert_eq!(hist.similarity(&pose), 0.0);
}
#[test]
fn compressed_history_default() {
let hist = CompressedPoseHistory::default();
assert_eq!(hist.max_hot, 10);
assert_eq!(hist.max_warm, 50);
assert!((hist.scale - 0.001).abs() < 1e-9);
}
// ── TemporalKeypointAttention tests (RuVector Phase 2) ─────────────
#[test]
fn temporal_attention_empty_returns_input() {
let mut attn = TemporalKeypointAttention::new();
let input: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|i| [i as f32, 0.0, 0.0]);
let out = attn.smooth_keypoints(&input);
// First frame: no history, so output should equal input.
for i in 0..NUM_KEYPOINTS {
assert!((out[i][0] - input[i][0]).abs() < 1e-5);
}
}
#[test]
fn temporal_attention_smooths_jitter() {
let mut attn = TemporalKeypointAttention::new();
let base: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [100.0, 200.0, 0.0]);
// Feed stable frames first.
for _ in 0..5 {
attn.smooth_keypoints(&base);
}
// Now feed a jittery frame.
let jittery: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [110.0, 210.0, 0.0]);
let out = attn.smooth_keypoints(&jittery);
// Output should be closer to base than to jittery (smoothed).
assert!(out[0][0] < 110.0, "Expected smoothing, got {}", out[0][0]);
assert!(out[0][0] > 100.0, "Expected some movement, got {}", out[0][0]);
}
#[test]
fn temporal_attention_window_size_capped() {
let mut attn = TemporalKeypointAttention::with_params(3, 0.7);
let frame: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [1.0, 1.0, 1.0]);
for _ in 0..10 {
attn.smooth_keypoints(&frame);
}
assert_eq!(attn.len(), 3);
}
#[test]
fn temporal_attention_clear() {
let mut attn = TemporalKeypointAttention::new();
let frame = zero_positions();
attn.smooth_keypoints(&frame);
assert!(!attn.is_empty());
attn.clear();
assert!(attn.is_empty());
}
}
@@ -0,0 +1 @@
{"intelligence":60,"timestamp":1774039923051}
+44 -8
View File
@@ -56,10 +56,47 @@ export class PoseRenderer {
[11, 13], [12, 14], [13, 15], [14, 16] // Legs
];
// Client-side keypoint smoothing: lerp between frames to reduce jitter.
// Maps person index → array of {x, y} for each keypoint.
this._smoothedKeypoints = new Map();
this._lerpAlpha = 0.25; // 0 = frozen, 1 = instant (no smoothing)
// Initialize rendering context
this.initializeContext();
}
// Lerp a single value toward target
_lerp(current, target, alpha) {
return current + (target - current) * alpha;
}
// Get smoothed keypoint positions for a person
_getSmoothedKeypoints(personIdx, keypoints) {
if (!this.config.enableSmoothing || !keypoints || keypoints.length === 0) {
return keypoints;
}
let prev = this._smoothedKeypoints.get(personIdx);
if (!prev || prev.length !== keypoints.length) {
// First frame or keypoint count changed — initialize
prev = keypoints.map(kp => ({ x: kp.x, y: kp.y, z: kp.z || 0, confidence: kp.confidence, name: kp.name }));
this._smoothedKeypoints.set(personIdx, prev);
return keypoints;
}
const alpha = this._lerpAlpha;
const smoothed = keypoints.map((kp, i) => ({
...kp,
x: this._lerp(prev[i].x, kp.x, alpha),
y: this._lerp(prev[i].y, kp.y, alpha),
}));
// Update stored positions
this._smoothedKeypoints.set(personIdx, smoothed.map(kp => ({ x: kp.x, y: kp.y, z: kp.z || 0, confidence: kp.confidence, name: kp.name })));
return smoothed;
}
createLogger() {
return {
debug: (...args) => console.debug('[RENDERER-DEBUG]', new Date().toISOString(), ...args),
@@ -150,18 +187,17 @@ export class PoseRenderer {
return; // Skip low confidence detections
}
console.log(`✅ [RENDERER] Rendering person ${index} with confidence: ${person.confidence}`);
// Apply client-side lerp smoothing to reduce visual jitter
const smoothedKps = this._getSmoothedKeypoints(index, person.keypoints);
// Render skeleton connections
if (this.config.showSkeleton && person.keypoints) {
console.log(`🦴 [RENDERER] Rendering skeleton for person ${index}`);
this.renderSkeleton(person.keypoints, person.confidence);
if (this.config.showSkeleton && smoothedKps) {
this.renderSkeleton(smoothedKps, person.confidence);
}
// Render keypoints
if (this.config.showKeypoints && person.keypoints) {
console.log(`🔴 [RENDERER] Rendering keypoints for person ${index}`);
this.renderKeypoints(person.keypoints, person.confidence);
if (this.config.showKeypoints && smoothedKps) {
this.renderKeypoints(smoothedKps, person.confidence);
}
// Render bounding box
@@ -265,7 +301,7 @@ export class PoseRenderer {
persons.forEach((person, personIdx) => {
if (person.confidence < this.config.confidenceThreshold || !person.keypoints) return;
const kps = person.keypoints;
const kps = this._getSmoothedKeypoints(personIdx, person.keypoints);
bodyParts.forEach((part) => {
// Collect valid keypoints for this body part