Merge remote-tracking branch 'origin/main' into feat/cross-node-fusion

# Conflicts: # rust-port/wifi-densepose-rs/crates/wifi-densepose-sensing-server/src/main.rs
docs: update CHANGELOG with v0.5.1-v0.5.3 releases
2026-06-13 10:53:20 +00:00 · 2026-03-30 21:55:40 -04:00 · 2026-03-30 21:52:48 -04:00 · 2026-03-30 16:39:05 -04:00 · 2026-03-30 15:54:44 -04:00 · 2026-03-30 15:48:33 -04:00
24 changed files with 3598 additions and 94 deletions
@@ -0,0 +1 @@
+{"intelligence":7,"timestamp":1774922079152}
@@ -5,6 +5,65 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+## [v0.5.3-esp32] — 2026-03-30
+
+### Added
+- **Cross-node RSSI-weighted feature fusion** — Multiple ESP32 nodes fuse CSI features using RSSI-based weighting. Closer node gets higher weight. Reduces variance noise by 29%, keypoint jitter by 72%.
+- **DynamicMinCut person separation** — Uses `ruvector_mincut::DynamicMinCut` on the subcarrier temporal correlation graph to detect independent motion clusters. Replaces variance-based heuristic for multi-person counting.
+- **RSSI-based position tracking** — Skeleton position driven by RSSI differential between nodes. Walk between ESP32s and the skeleton follows you.
+- **Per-node state pipeline (ADR-068)** — Each ESP32 node gets independent `HashMap<u8, NodeState>` with frame history, classification, vitals, and person count. Fixes #249 (the #1 user-reported issue).
+- **RuVector Phase 1-3 integration** — Subcarrier importance weighting, temporal keypoint smoothing (EMA), coherence gating, skeleton kinematic constraints (Jakobsen relaxation), compressed pose history.
+- **Client-side lerp smoothing** — UI keypoints interpolate between frames (alpha=0.15) for fluid skeleton movement.
+- **Multi-node mesh tests** — 8 integration tests covering 1-255 node configurations.
+- **`wifi_densepose` Python package** — `from wifi_densepose import WiFiDensePose` now works (#314).
+
+### Fixed
+- **Watchdog crash on busy LANs (#321)** — Batch-limited edge_dsp to 4 frames before 20ms yield. Fixed idle-path busy-spin (`pdMS_TO_TICKS(5)==0`).
+- **No detection from edge vitals (#323)** — Server now generates `sensing_update` from Tier 2+ vitals packets.
+- **RSSI byte offset mismatch (#332)** — Server parsed RSSI from wrong byte (was reading sequence counter).
+- **Stack overflow risk** — Moved 4KB of BPM scratch buffers from stack to static storage.
+- **Stale node memory leak** — `node_states` HashMap evicts nodes inactive >60s.
+- **Unsafe raw pointer removed** — Replaced with safe `.clone()` for adaptive model borrow.
+- **Firmware CI** — Upgraded to IDF v5.4, replaced `xxd` with `od` (#327).
+- **Person count double-counting** — Multi-node aggregation changed from `sum` to `max`.
+- **Skeleton jitter** — Removed tick-based noise, dampened procedural animation, recalibrated feature scaling for real ESP32 data.
+
+### Changed
+- Motion-responsive skeleton: arm swing (0-80px) driven by CSI variance, leg kick (0-50px) by motion_band_power, vertical bob when walking.
+- Person count thresholds recalibrated for real ESP32 hardware (1→2 at 0.70, EMA alpha 0.04).
+- Vital sign filtering: larger median window (31), faster EMA (0.05), looser HR jump filter (15 BPM).
+- Vendored ruvector updated to v2.1.0-40 (316 commits ahead).
+
+### Benchmarks (2-node mesh, COM6 + COM9, 30s)
+| Metric | Baseline | v0.5.3 | Improvement |
+|--------|----------|--------|-------------|
+| Variance noise | 109.4 | 77.6 | **-29%** |
+| Feature stability | std=154.1 | std=105.4 | **-32%** |
+| Keypoint jitter | std=4.5px | std=1.3px | **-72%** |
+| Confidence | 0.643 | 0.686 | **+7%** |
+| Presence accuracy | 93.4% | 94.6% | **+1.3pp** |
+
+### Verified
+- Real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net WiFi
+- All 284 Rust tests pass, 352 signal crate tests pass
+- Firmware builds clean at 843 KB
+- QEMU CI: 11/11 jobs green
+
+## [v0.5.2-esp32] — 2026-03-28
+
+### Fixed
+- RSSI byte offset in frame parser (#332)
+- Per-node state pipeline for multi-node sensing (#249)
+- Firmware CI upgraded to IDF v5.4 (#327)
+
+## [v0.5.1-esp32] — 2026-03-27
+
+### Fixed
+- Watchdog crash on busy LANs (#321)
+- No detection from edge vitals (#323)
+- `wifi_densepose` Python package import (#314)
+- Pre-compiled firmware binaries added to release
+
 ## [v0.5.0-esp32] — 2026-03-15

 ### Added
@@ -43,6 +43,12 @@ static const char *TAG = "edge_proc";
 static edge_ring_buf_t s_ring;
 static uint32_t s_ring_drops;  /* Frames dropped due to full ring buffer. */

+/* Scratch buffers for BPM estimation — moved from stack to static to avoid
+ * stack overflow.  process_frame + update_multi_person_vitals combined used
+ * ~6.5-7.5 KB of the 8 KB task stack.  These save ~4 KB of stack. */
+static float s_scratch_br[EDGE_PHASE_HISTORY_LEN];
+static float s_scratch_hr[EDGE_PHASE_HISTORY_LEN];
+
 static inline bool ring_push(const uint8_t *iq, uint16_t len,
                             int8_t rssi, uint8_t channel)
 {
@@ -513,20 +519,18 @@ static void update_multi_person_vitals(const uint8_t *iq_data, uint16_t n_sc,

        /* Estimate BPM when we have enough history. */
        if (pv->history_len >= 64) {
-            /* Build contiguous buffer for zero-crossing. */
-            float br_buf[EDGE_PHASE_HISTORY_LEN];
-            float hr_buf[EDGE_PHASE_HISTORY_LEN];
+            /* Build contiguous buffer (reuse static scratch to save ~2 KB stack). */
            uint16_t buf_len = pv->history_len;

            for (uint16_t i = 0; i < buf_len; i++) {
                uint16_t ri = (pv->history_idx + EDGE_PHASE_HISTORY_LEN
                               - buf_len + i) % EDGE_PHASE_HISTORY_LEN;
-                br_buf[i] = s_person_br_filt[p][ri];
-                hr_buf[i] = s_person_hr_filt[p][ri];
+                s_scratch_br[i] = s_person_br_filt[p][ri];
+                s_scratch_hr[i] = s_person_hr_filt[p][ri];
            }

-            float br = estimate_bpm_zero_crossing(br_buf, buf_len, sample_rate);
-            float hr = estimate_bpm_zero_crossing(hr_buf, buf_len, sample_rate);
+            float br = estimate_bpm_zero_crossing(s_scratch_br, buf_len, sample_rate);
+            float hr = estimate_bpm_zero_crossing(s_scratch_hr, buf_len, sample_rate);

            /* Sanity clamp. */
            if (br >= 6.0f && br <= 40.0f) pv->breathing_bpm = br;
@@ -690,20 +694,18 @@ static void process_frame(const edge_ring_slot_t *slot)

    /* --- Step 7: BPM estimation (zero-crossing) --- */
    if (s_history_len >= 64) {
-        /* Build contiguous buffers from ring. */
-        float br_buf[EDGE_PHASE_HISTORY_LEN];
-        float hr_buf[EDGE_PHASE_HISTORY_LEN];
+        /* Build contiguous buffers from ring (using static scratch to save stack). */
        uint16_t buf_len = s_history_len;

        for (uint16_t i = 0; i < buf_len; i++) {
            uint16_t ri = (s_history_idx + EDGE_PHASE_HISTORY_LEN
                           - buf_len + i) % EDGE_PHASE_HISTORY_LEN;
-            br_buf[i] = s_breathing_filtered[ri];
-            hr_buf[i] = s_heartrate_filtered[ri];
+            s_scratch_br[i] = s_breathing_filtered[ri];
+            s_scratch_hr[i] = s_heartrate_filtered[ri];
        }

-        float br_bpm = estimate_bpm_zero_crossing(br_buf, buf_len, sample_rate);
-        float hr_bpm = estimate_bpm_zero_crossing(hr_buf, buf_len, sample_rate);
+        float br_bpm = estimate_bpm_zero_crossing(s_scratch_br, buf_len, sample_rate);
+        float hr_bpm = estimate_bpm_zero_crossing(s_scratch_hr, buf_len, sample_rate);

        /* Sanity clamp: breathing 6-40 BPM, heart rate 40-180 BPM. */
        if (br_bpm >= 6.0f && br_bpm <= 40.0f) s_breathing_bpm = br_bpm;
@@ -839,12 +841,11 @@ static void edge_task(void *arg)
     * Without a batch limit the task processes frames back-to-back with
     * only 1-tick yields, which on high frame rates can still starve
     * IDLE1 enough to trip the 5-second task watchdog.  See #266, #321. */
-    const uint8_t BATCH_LIMIT = 4;

    while (1) {
        uint8_t processed = 0;

-        while (processed < BATCH_LIMIT && ring_pop(&slot)) {
+        while (processed < EDGE_BATCH_LIMIT && ring_pop(&slot)) {
            process_frame(&slot);
            processed++;
            /* 1-tick yield between frames within a batch. */
@@ -852,10 +853,10 @@ static void edge_task(void *arg)
        }

        if (processed > 0) {
-            /* Post-batch yield: 2 ticks (~20 ms at 100 Hz) so IDLE1 can
-             * run and feed the Core 1 watchdog even under sustained load.
-             * This is intentionally longer than the 1-tick inter-frame yield. */
-            vTaskDelay(2);
+            /* Post-batch yield: ~20 ms so IDLE1 can run and feed the
+             * Core 1 watchdog even under sustained load.  Uses pdMS_TO_TICKS
+             * for tick-rate independence (minimum 1 tick). */
+            { TickType_t d = pdMS_TO_TICKS(20); vTaskDelay(d > 0 ? d : 1); }
        } else {
            /* No frames available — sleep one full tick.
             * NOTE: pdMS_TO_TICKS(5) == 0 at 100 Hz, which would busy-spin. */
@@ -46,6 +46,9 @@
 #define EDGE_FALL_COOLDOWN_MS 5000  /**< Minimum ms between fall alerts (debounce). */
 #define EDGE_FALL_CONSEC_MIN  3     /**< Consecutive frames above threshold to trigger. */

+/* ---- DSP task tuning ---- */
+#define EDGE_BATCH_LIMIT      4     /**< Max frames per batch before longer yield. */
+
 /* ---- SPSC ring buffer slot ---- */
 typedef struct {
    uint8_t  iq_data[EDGE_MAX_IQ_BYTES]; /**< Raw I/Q bytes from CSI callback. */
@@ -0,0 +1,33 @@
+# ESP32-S3 CSI Node — Default SDK Configuration
+# This file is applied automatically by idf.py when no sdkconfig exists.
+
+# Target: ESP32-S3
+CONFIG_IDF_TARGET="esp32s3"
+
+# Use custom partition table (8MB flash with OTA — ADR-045)
+CONFIG_PARTITION_TABLE_CUSTOM=y
+CONFIG_PARTITION_TABLE_CUSTOM_FILENAME="partitions_display.csv"
+
+# Flash configuration: 8MB (Quad SPI)
+CONFIG_ESPTOOLPY_FLASHSIZE_8MB=y
+CONFIG_ESPTOOLPY_FLASHSIZE="8MB"
+
+# Compiler optimization: optimize for size to reduce binary
+CONFIG_COMPILER_OPTIMIZATION_SIZE=y
+
+# Enable CSI (Channel State Information) in WiFi driver
+CONFIG_ESP_WIFI_CSI_ENABLED=y
+
+# NVS encryption disabled by default (requires eFuse provisioning).
+# Enable only after burning HMAC key to eFuse block.
+# CONFIG_NVS_ENCRYPTION is not set
+
+# Disable unused features to reduce binary size
+CONFIG_BOOTLOADER_LOG_LEVEL_WARN=y
+CONFIG_LOG_DEFAULT_LEVEL_INFO=y
+
+# LWIP: enable extended socket options for UDP multicast
+CONFIG_LWIP_SO_RCVBUF=y
+
+# FreeRTOS: increase task stack for CSI processing
+CONFIG_ESP_MAIN_TASK_STACK_SIZE=8192
@@ -0,0 +1 @@
+{"intelligence":35,"timestamp":1774903706609}
@@ -117,6 +117,7 @@ midstreamer-temporal-compare = "0.1.0"
 midstreamer-attractor = "0.1.0"

 # ruvector integration (published on crates.io)
+# Vendored at v2.1.0 in vendor/ruvector; using crates.io versions until published.
 ruvector-mincut = "2.0.4"
 ruvector-attn-mincut = "2.0.4"
 ruvector-temporal-tensor = "2.0.4"
@@ -21,3 +21,4 @@ pub use bvp::attention_weighted_bvp;
 pub use fresnel::solve_fresnel_geometry;
 pub use spectrogram::gate_spectrogram;
 pub use subcarrier::mincut_subcarrier_partition;
+pub use subcarrier::subcarrier_importance_weights;
@@ -142,6 +142,29 @@ pub fn mincut_subcarrier_partition(sensitivity: &[f32]) -> (Vec<usize>, Vec<usiz
    }
 }

+/// Convert a mincut partition into per-subcarrier importance weights.
+///
+/// Sensitive subcarriers (high body-motion correlation) get weight > 1.0,
+/// insensitive ones get weight 0.5. This allows downstream feature extraction
+/// to emphasise the most informative subcarriers.
+pub fn subcarrier_importance_weights(sensitivity: &[f32]) -> Vec<f32> {
+    if sensitivity.is_empty() {
+        return vec![];
+    }
+    let (sensitive, _insensitive) = mincut_subcarrier_partition(sensitivity);
+    let max_sens = sensitivity
+        .iter()
+        .cloned()
+        .fold(f32::NEG_INFINITY, f32::max)
+        .max(1e-9);
+
+    let mut weights = vec![0.5f32; sensitivity.len()];
+    for &idx in &sensitive {
+        weights[idx] = 1.0 + (sensitivity[idx] / max_sens).min(1.0);
+    }
+    weights
+}
+
 #[cfg(test)]
 mod tests {
    use super::*;
@@ -175,4 +198,38 @@ mod tests {
        assert_eq!(s, vec![0]);
        assert!(i.is_empty());
    }
+
+    #[test]
+    fn test_importance_weights_empty() {
+        let w = subcarrier_importance_weights(&[]);
+        assert!(w.is_empty());
+    }
+
+    #[test]
+    fn test_importance_weights_all_equal() {
+        let sensitivity = vec![1.0f32; 8];
+        let w = subcarrier_importance_weights(&sensitivity);
+        assert_eq!(w.len(), 8);
+        // All subcarriers have identical sensitivity so all should be classified
+        // the same way (either all sensitive or all insensitive after mincut).
+        // At minimum, no weight should exceed 2.0 or be negative.
+        for &wt in &w {
+            assert!(wt >= 0.5 && wt <= 2.0, "weight {wt} out of range");
+        }
+    }
+
+    #[test]
+    fn test_importance_weights_sensitive_higher() {
+        // First 5 subcarriers have high sensitivity, last 5 low.
+        let sensitivity: Vec<f32> = (0..10).map(|i| if i < 5 { 0.9 } else { 0.1 }).collect();
+        let w = subcarrier_importance_weights(&sensitivity);
+        assert_eq!(w.len(), 10);
+
+        let mean_high: f32 = w[..5].iter().sum::<f32>() / 5.0;
+        let mean_low: f32 = w[5..].iter().sum::<f32>() / 5.0;
+        assert!(
+            mean_high > mean_low,
+            "sensitive subcarriers should have higher mean weight ({mean_high}) than insensitive ({mean_low})"
+        );
+    }
 }
@@ -43,5 +43,8 @@ clap = { workspace = true }
 # Multi-BSSID WiFi scanning pipeline (ADR-022 Phase 3)
 wifi-densepose-wifiscan = { version = "0.3.0", path = "../wifi-densepose-wifiscan" }

+# RuVector graph min-cut for person separation (ADR-068)
+ruvector-mincut = { workspace = true }
+
 [dev-dependencies]
 tempfile = "3.10"
@@ -17,6 +17,7 @@ mod vital_signs;
 use wifi_densepose_sensing_server::{graph_transformer, trainer, dataset, embedding};

 use std::collections::{HashMap, VecDeque};
+use ruvector_mincut::{DynamicMinCut, MinCutBuilder};
 use std::net::SocketAddr;
 use std::path::PathBuf;
 use std::sync::Arc;
@@ -299,8 +300,30 @@ struct NodeState {
    latest_vitals: VitalSigns,
    last_frame_time: Option<std::time::Instant>,
    edge_vitals: Option<Esp32VitalsPacket>,
+    /// Latest extracted features for cross-node fusion.
+    latest_features: Option<FeatureInfo>,
+    // ── RuVector Phase 2: Temporal smoothing & coherence gating ──
+    /// Previous frame's smoothed keypoint positions for EMA temporal smoothing.
+    prev_keypoints: Option<Vec<[f64; 3]>>,
+    /// Rolling buffer of motion_energy values for coherence scoring (last 20 frames).
+    motion_energy_history: VecDeque<f64>,
+    /// Coherence score [0.0, 1.0]: low variance in motion_energy = high coherence.
+    coherence_score: f64,
 }

+/// Default EMA alpha for temporal keypoint smoothing (RuVector Phase 2).
+/// Lower = smoother (more history, less jitter). 0.15 balances responsiveness
+/// with stability for WiFi CSI where per-frame noise is high.
+const TEMPORAL_EMA_ALPHA_DEFAULT: f64 = 0.15;
+/// Reduced EMA alpha when coherence is low (trust measurements less).
+const TEMPORAL_EMA_ALPHA_LOW_COHERENCE: f64 = 0.05;
+/// Coherence threshold below which we reduce EMA alpha.
+const COHERENCE_LOW_THRESHOLD: f64 = 0.3;
+/// Maximum allowed bone-length change ratio between frames (20%).
+const MAX_BONE_CHANGE_RATIO: f64 = 0.20;
+/// Number of motion_energy frames to track for coherence scoring.
+const COHERENCE_WINDOW: usize = 20;
+
 impl NodeState {
    fn new() -> Self {
        Self {
@@ -324,6 +347,44 @@ impl NodeState {
            latest_vitals: VitalSigns::default(),
            last_frame_time: None,
            edge_vitals: None,
+            latest_features: None,
+            prev_keypoints: None,
+            motion_energy_history: VecDeque::with_capacity(COHERENCE_WINDOW),
+            coherence_score: 1.0, // assume stable initially
+        }
+    }
+
+    /// Update the coherence score from the latest motion_energy value.
+    ///
+    /// Coherence is computed as 1.0 / (1.0 + running_variance) so that
+    /// low motion-energy variance maps to high coherence ([0, 1]).
+    fn update_coherence(&mut self, motion_energy: f64) {
+        if self.motion_energy_history.len() >= COHERENCE_WINDOW {
+            self.motion_energy_history.pop_front();
+        }
+        self.motion_energy_history.push_back(motion_energy);
+
+        let n = self.motion_energy_history.len();
+        if n < 2 {
+            self.coherence_score = 1.0;
+            return;
+        }
+
+        let mean: f64 = self.motion_energy_history.iter().sum::<f64>() / n as f64;
+        let variance: f64 = self.motion_energy_history.iter()
+            .map(|v| (v - mean) * (v - mean))
+            .sum::<f64>() / (n - 1) as f64;
+
+        // Map variance to [0, 1] coherence: higher variance = lower coherence.
+        self.coherence_score = (1.0 / (1.0 + variance)).clamp(0.0, 1.0);
+    }
+
+    /// Choose the EMA alpha based on current coherence score.
+    fn ema_alpha(&self) -> f64 {
+        if self.coherence_score < COHERENCE_LOW_THRESHOLD {
+            TEMPORAL_EMA_ALPHA_LOW_COHERENCE
+        } else {
+            TEMPORAL_EMA_ALPHA_DEFAULT
        }
    }
 }
@@ -565,13 +626,25 @@ fn parse_esp32_frame(buf: &[u8]) -> Option<Esp32Frame> {
        return None;
    }

+    // Frame layout (must match firmware csi_collector.c):
+    //   [0..3]   magic (u32 LE)
+    //   [4]      node_id (u8)
+    //   [5]      n_antennas (u8)
+    //   [6..7]   n_subcarriers (u16 LE)
+    //   [8..11]  freq_mhz (u32 LE)
+    //   [12..15] sequence (u32 LE)
+    //   [16]     rssi (i8)
+    //   [17]     noise_floor (i8)
+    //   [18..19] reserved
+    //   [20..]   I/Q data
    let node_id = buf[4];
    let n_antennas = buf[5];
-    let n_subcarriers = buf[6];
-    let freq_mhz = u16::from_le_bytes([buf[8], buf[9]]);
-    let sequence = u32::from_le_bytes([buf[10], buf[11], buf[12], buf[13]]);
-    let rssi = buf[14] as i8;
-    let noise_floor = buf[15] as i8;
+    let n_subcarriers_u16 = u16::from_le_bytes([buf[6], buf[7]]);
+    let n_subcarriers = n_subcarriers_u16 as u8; // truncate to u8 for Esp32Frame compat
+    let freq_mhz = u16::from_le_bytes([buf[8], buf[9]]); // low 16 bits of u32
+    let sequence = u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]);
+    let rssi = buf[16] as i8;       // #332: was buf[14], 2 bytes off
+    let noise_floor = buf[17] as i8; // #332: was buf[15], 2 bytes off

    let iq_start = 20;
    let n_pairs = n_antennas as usize * n_subcarriers as usize;
@@ -792,6 +865,40 @@ fn estimate_breathing_rate_hz(frame_history: &VecDeque<Vec<f64>>, sample_rate_hz
 /// For each subcarrier index `k`, returns `Var[A_k]` over all stored frames.
 /// This captures spatial signal variation; subcarriers whose amplitude fluctuates
 /// heavily across time correspond to directions with motion.
+/// Compute per-subcarrier importance weights using a simple sensitivity split.
+///
+/// Subcarriers whose sensitivity (amplitude magnitude) is above the median are
+/// considered "sensitive" and receive weight `1.0 + (sens / max_sens)` (range 1.0–2.0).
+/// The rest receive a baseline weight of 0.5. This mirrors the RuVector mincut
+/// partition logic without requiring the graph dependency.
+fn compute_subcarrier_importance_weights(sensitivity: &[f64]) -> Vec<f64> {
+    let n = sensitivity.len();
+    if n == 0 {
+        return vec![];
+    }
+    let max_sens = sensitivity.iter().cloned().fold(f64::NEG_INFINITY, f64::max).max(1e-9);
+
+    // Compute median via a sorted copy.
+    let mut sorted = sensitivity.to_vec();
+    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap_or(std::cmp::Ordering::Equal));
+    let median = if n % 2 == 0 {
+        (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0
+    } else {
+        sorted[n / 2]
+    };
+
+    sensitivity
+        .iter()
+        .map(|&s| {
+            if s >= median {
+                1.0 + (s / max_sens).min(1.0)
+            } else {
+                0.5
+            }
+        })
+        .collect()
+}
+
 fn compute_subcarrier_variances(frame_history: &VecDeque<Vec<f64>>, n_sub: usize) -> Vec<f64> {
    if frame_history.is_empty() || n_sub == 0 {
        return vec![0.0; n_sub];
@@ -840,13 +947,34 @@ fn extract_features_from_frame(
 ) -> (FeatureInfo, ClassificationInfo, f64, Vec<f64>, f64) {
    let n_sub = frame.amplitudes.len().max(1);
    let n = n_sub as f64;
-    let mean_amp: f64 = frame.amplitudes.iter().sum::<f64>() / n;
    let mean_rssi = frame.rssi as f64;

-    // ── Intra-frame subcarrier variance (spatial spread across subcarriers) ──
-    let intra_variance: f64 = frame.amplitudes.iter()
-        .map(|a| (a - mean_amp).powi(2))
-        .sum::<f64>() / n;
+    // ── RuVector Phase 1: subcarrier importance weighting ──
+    // Compute per-subcarrier sensitivity from amplitude magnitude, then weight
+    // sensitive subcarriers higher (>1.0) and insensitive ones lower (0.5).
+    // This emphasises body-motion-correlated subcarriers in all downstream metrics.
+    let sub_sensitivity: Vec<f64> = frame.amplitudes.iter().map(|a| a.abs()).collect();
+    let importance_weights = compute_subcarrier_importance_weights(&sub_sensitivity);
+
+    let weight_sum: f64 = importance_weights.iter().sum::<f64>();
+    let mean_amp: f64 = if weight_sum > 0.0 {
+        frame.amplitudes.iter().zip(importance_weights.iter())
+            .map(|(a, w)| a * w)
+            .sum::<f64>() / weight_sum
+    } else {
+        frame.amplitudes.iter().sum::<f64>() / n
+    };
+
+    // ── Intra-frame subcarrier variance (weighted by importance) ──
+    let intra_variance: f64 = if weight_sum > 0.0 {
+        frame.amplitudes.iter().zip(importance_weights.iter())
+            .map(|(a, w)| w * (a - mean_amp).powi(2))
+            .sum::<f64>() / weight_sum
+    } else {
+        frame.amplitudes.iter()
+            .map(|a| (a - mean_amp).powi(2))
+            .sum::<f64>() / n
+    };

    // ── Temporal (sliding-window) per-subcarrier variance ──
    let sub_variances = compute_subcarrier_variances(frame_history, n_sub);
@@ -1864,6 +1992,61 @@ async fn latest(State(state): State<SharedState>) -> Json<serde_json::Value> {
 /// with a stride-swing pattern applied to arms and legs.
 // ── Multi-person estimation (issue #97) ──────────────────────────────────────

+/// Fuse features across all active nodes for higher SNR.
+///
+/// When multiple ESP32 nodes observe the same room, their CSI features
+/// can be combined:
+/// - Variance: use max (most sensitive node dominates)
+/// - Motion/breathing/spectral power: weighted average by RSSI (closer node = higher weight)
+/// - Dominant frequency: weighted average
+/// - Change points: keep current node's value (not meaningful to average)
+/// - Mean RSSI: use max (best signal)
+fn fuse_multi_node_features(
+    current_features: &FeatureInfo,
+    node_states: &HashMap<u8, NodeState>,
+) -> FeatureInfo {
+    let now = std::time::Instant::now();
+    let active: Vec<(&FeatureInfo, f64)> = node_states.values()
+        .filter(|ns| ns.last_frame_time.map_or(false, |t| now.duration_since(t).as_secs() < 10))
+        .filter_map(|ns| {
+            let feat = ns.latest_features.as_ref()?;
+            let rssi = ns.rssi_history.back().copied().unwrap_or(-80.0);
+            Some((feat, rssi))
+        })
+        .collect();
+
+    if active.len() <= 1 {
+        return current_features.clone();
+    }
+
+    // RSSI-based weights: higher RSSI = closer to person = more weight.
+    // Map RSSI relative to best node into [0.1, 1.0].
+    let max_rssi = active.iter().map(|(_, r)| *r).fold(f64::NEG_INFINITY, f64::max);
+    let weights: Vec<f64> = active.iter()
+        .map(|(_, r)| (1.0 + (r - max_rssi + 20.0) / 20.0).clamp(0.1, 1.0))
+        .collect();
+    let w_sum: f64 = weights.iter().sum::<f64>().max(1e-9);
+
+    FeatureInfo {
+        // Weighted average variance (not max — max inflates person score
+        // and causes count flips between 1↔2 persons).
+        variance: active.iter().zip(&weights)
+            .map(|((f, _), w)| f.variance * w).sum::<f64>() / w_sum,
+        // Weighted average for motion/breathing/spectral
+        motion_band_power: active.iter().zip(&weights)
+            .map(|((f, _), w)| f.motion_band_power * w).sum::<f64>() / w_sum,
+        breathing_band_power: active.iter().zip(&weights)
+            .map(|((f, _), w)| f.breathing_band_power * w).sum::<f64>() / w_sum,
+        spectral_power: active.iter().zip(&weights)
+            .map(|((f, _), w)| f.spectral_power * w).sum::<f64>() / w_sum,
+        dominant_freq_hz: active.iter().zip(&weights)
+            .map(|((f, _), w)| f.dominant_freq_hz * w).sum::<f64>() / w_sum,
+        change_points: current_features.change_points, // keep current node's value
+        // Best RSSI across nodes
+        mean_rssi: active.iter().map(|(f, _)| f.mean_rssi).fold(f64::NEG_INFINITY, f64::max),
+    }
+}
+
 /// Estimate person count from CSI features using a weighted composite heuristic.
 ///
 /// Single ESP32 link limitations: variance-based detection can reliably detect
@@ -1872,27 +2055,137 @@ async fn latest(State(state): State<SharedState>) -> Json<serde_json::Value> {
 /// Returns a raw score (0.0..1.0) that the caller converts to person count
 /// after temporal smoothing.
 fn compute_person_score(feat: &FeatureInfo) -> f64 {
-    // Normalize each feature to [0, 1] using calibrated ranges:
-    //
-    //   variance: intra-frame amp variance. 1-person ~2-15, 2-person ~15-60,
-    //     real ESP32 can go higher. Use 30.0 as scaling midpoint.
-    let var_norm = (feat.variance / 30.0).clamp(0.0, 1.0);
-
-    //   change_points: threshold crossings in 56 subcarriers. 1-person ~5-15,
-    //     2-person ~15-30. Scale by 30.0 (half of max 55).
+    // Normalize each feature to [0, 1] using ranges calibrated from real
+    // ESP32 hardware (COM6/COM9 on ruv.net, March 2026).
+    let var_norm = (feat.variance / 300.0).clamp(0.0, 1.0);
    let cp_norm = (feat.change_points as f64 / 30.0).clamp(0.0, 1.0);
-
-    //   motion_band_power: upper-half subcarrier variance. 1-person ~1-8,
-    //     2-person ~8-25. Scale by 20.0.
-    let motion_norm = (feat.motion_band_power / 20.0).clamp(0.0, 1.0);
-
-    //   spectral_power: mean squared amplitude. Highly variable (~100-1000+).
-    //     Use relative change indicator: high spectral_power with high variance
-    //     suggests multiple reflectors. Scale by 500.0.
+    let motion_norm = (feat.motion_band_power / 250.0).clamp(0.0, 1.0);
    let sp_norm = (feat.spectral_power / 500.0).clamp(0.0, 1.0);
+    var_norm * 0.40 + cp_norm * 0.20 + motion_norm * 0.25 + sp_norm * 0.15
+}

-    // Weighted composite — variance and change_points carry the most signal.
-    var_norm * 0.35 + cp_norm * 0.30 + motion_norm * 0.20 + sp_norm * 0.15
+/// Estimate person count via ruvector DynamicMinCut on the subcarrier
+/// temporal correlation graph.
+///
+/// Builds a graph where:
+/// - Nodes = active subcarriers (variance > noise floor)
+/// - Edges = Pearson correlation between subcarrier time series
+///   (weight = correlation coefficient; high correlation = heavy edge)
+/// - Source = virtual node connected to the most active subcarrier
+/// - Sink = virtual node connected to the least correlated subcarrier
+///
+/// The min-cut value indicates how many independent motion clusters exist:
+/// - High min-cut (relative to total edge weight) → one tightly coupled
+///   group → 1 person
+/// - Low min-cut → two loosely coupled groups → 2 persons
+///
+/// Uses `ruvector_mincut::DynamicMinCut` for O(V²E) exact max-flow.
+fn estimate_persons_from_correlation(frame_history: &VecDeque<Vec<f64>>) -> usize {
+    let n_frames = frame_history.len();
+    if n_frames < 10 {
+        return 1;
+    }
+
+    let window: Vec<&Vec<f64>> = frame_history.iter().rev().take(20).collect();
+    let n_sub = window[0].len().min(56);
+    if n_sub < 4 {
+        return 1;
+    }
+    let k = window.len() as f64;
+
+    // Per-subcarrier mean and variance
+    let mut means = vec![0.0f64; n_sub];
+    let mut variances = vec![0.0f64; n_sub];
+    for frame in &window {
+        for sc in 0..n_sub.min(frame.len()) {
+            means[sc] += frame[sc] / k;
+        }
+    }
+    for frame in &window {
+        for sc in 0..n_sub.min(frame.len()) {
+            variances[sc] += (frame[sc] - means[sc]).powi(2) / k;
+        }
+    }
+
+    // Active subcarriers: variance above noise floor
+    let noise_floor = 1.0;
+    let active: Vec<usize> = (0..n_sub).filter(|&sc| variances[sc] > noise_floor).collect();
+    let m = active.len();
+    if m < 3 {
+        return if m == 0 { 0 } else { 1 };
+    }
+
+    // Build correlation graph edges between active subcarriers.
+    // Edge weight = |Pearson correlation|. High correlation → same person.
+    let mut edges: Vec<(u64, u64, f64)> = Vec::new();
+    let source = m as u64;
+    let sink = (m + 1) as u64;
+
+    // Precompute std devs
+    let stds: Vec<f64> = active.iter().map(|&sc| variances[sc].sqrt().max(1e-9)).collect();
+
+    for i in 0..m {
+        for j in (i + 1)..m {
+            // Pearson correlation between subcarriers i and j
+            let mut cov = 0.0f64;
+            for frame in &window {
+                let si = active[i];
+                let sj = active[j];
+                if si < frame.len() && sj < frame.len() {
+                    cov += (frame[si] - means[si]) * (frame[sj] - means[sj]) / k;
+                }
+            }
+            let corr = (cov / (stds[i] * stds[j])).abs();
+            if corr > 0.1 {
+                // Bidirectional edges for flow network
+                let weight = corr * 10.0; // Scale up for integer-like flow
+                edges.push((i as u64, j as u64, weight));
+                edges.push((j as u64, i as u64, weight));
+            }
+        }
+    }
+
+    // Source → highest-variance subcarrier, Sink → lowest-variance
+    let (max_var_idx, _) = active.iter().enumerate()
+        .max_by(|(_, &a), (_, &b)| variances[a].partial_cmp(&variances[b]).unwrap())
+        .unwrap_or((0, &0));
+    let (min_var_idx, _) = active.iter().enumerate()
+        .min_by(|(_, &a), (_, &b)| variances[a].partial_cmp(&variances[b]).unwrap())
+        .unwrap_or((0, &0));
+
+    if max_var_idx == min_var_idx {
+        return 1;
+    }
+
+    edges.push((source, max_var_idx as u64, 100.0));
+    edges.push((min_var_idx as u64, sink, 100.0));
+
+    // Run min-cut
+    let mc: DynamicMinCut = match MinCutBuilder::new().exact().with_edges(edges.clone()).build() {
+        Ok(mc) => mc,
+        Err(_) => return 1,
+    };
+
+    let cut_value = mc.min_cut_value();
+    let total_edge_weight: f64 = edges.iter()
+        .filter(|(s, t, _)| *s != source && *s != sink && *t != source && *t != sink)
+        .map(|(_, _, w)| w)
+        .sum::<f64>() / 2.0; // bidirectional → halve
+
+    if total_edge_weight < 1e-9 {
+        return 1;
+    }
+
+    // Normalized cut ratio: low = easy to split = multiple people
+    let cut_ratio = cut_value / total_edge_weight;
+
+    if cut_ratio > 0.4 {
+        1 // Tightly coupled — one person
+    } else if cut_ratio > 0.15 {
+        2 // Moderately separable — two people
+    } else {
+        3 // Highly separable — three+ people
+    }
 }

 /// Convert smoothed person score to discrete count with hysteresis.
@@ -1902,25 +2195,26 @@ fn compute_person_score(feat: &FeatureInfo) -> f64 {
 /// (the #1 user-reported issue — see #237, #249, #280, #292).
 fn score_to_person_count(smoothed_score: f64, prev_count: usize) -> usize {
    // Up-thresholds (must exceed to increase count):
-    //   1→2: 0.65  (raised from 0.50 — multipath in small rooms hit 0.50 easily)
-    //   2→3: 0.85  (raised from 0.80 — 3 persons needs strong sustained signal)
+    //   1→2: 0.80  (raised from 0.65 — single-person movement in multipath
+    //               rooms easily hits 0.65, causing false 2-person detection)
+    //   2→3: 0.92  (raised from 0.85 — 3 persons needs very strong signal)
    // Down-thresholds (must drop below to decrease count):
-    //   2→1: 0.45  (hysteresis gap of 0.20)
-    //   3→2: 0.70  (hysteresis gap of 0.15)
+    //   2→1: 0.55  (hysteresis gap of 0.25)
+    //   3→2: 0.78  (hysteresis gap of 0.14)
    match prev_count {
        0 | 1 => {
            if smoothed_score > 0.85 {
                3
-            } else if smoothed_score > 0.65 {
+            } else if smoothed_score > 0.70 {
                2
            } else {
                1
            }
        }
        2 => {
-            if smoothed_score > 0.85 {
+            if smoothed_score > 0.92 {
                3
-            } else if smoothed_score < 0.45 {
+            } else if smoothed_score < 0.55 {
                1
            } else {
                2 // hold — within hysteresis band
@@ -1928,9 +2222,9 @@ fn score_to_person_count(smoothed_score: f64, prev_count: usize) -> usize {
        }
        _ => {
            // prev_count >= 3
-            if smoothed_score < 0.45 {
+            if smoothed_score < 0.55 {
                1
-            } else if smoothed_score < 0.70 {
+            } else if smoothed_score < 0.78 {
                2
            } else {
                3 // hold
@@ -1970,23 +2264,27 @@ fn derive_single_person_pose(
    let breath_phase = if let Some(ref vs) = update.vital_signs {
        let bpm = vs.breathing_rate_bpm.unwrap_or(15.0);
        let freq = (bpm / 60.0).clamp(0.1, 0.5);
-        (update.tick as f64 * freq * 0.1 * std::f64::consts::TAU + phase_offset).sin()
+        // Slow tick rate (0.02) for gentle breathing, not jerky oscillation.
+        (update.tick as f64 * freq * 0.02 * std::f64::consts::TAU + phase_offset).sin()
    } else {
-        (update.tick as f64 * 0.08 + feat.breathing_band_power + phase_offset).sin()
+        (update.tick as f64 * 0.02 + phase_offset).sin()
    };

    let lean_x = (feat.dominant_freq_hz / 5.0 - 1.0).clamp(-1.0, 1.0) * 18.0;

    let stride_x = if is_walking {
-        let stride_phase = (feat.motion_band_power * 0.7 + update.tick as f64 * 0.12 + phase_offset).sin();
-        stride_phase * 45.0 * motion_score
+        let stride_phase = (feat.motion_band_power * 0.7 + update.tick as f64 * 0.06 + phase_offset).sin();
+        stride_phase * 20.0 * motion_score
    } else {
        0.0
    };

-    let burst = (feat.change_points as f64 / 8.0).clamp(0.0, 1.0);
+    // Dampen burst and noise to reduce jitter.  The original used
+    // tick*17.3 which changed wildly every frame.  Now use slow tick
+    // rate and minimal burst scaling for a stable skeleton.
+    let burst = (feat.change_points as f64 / 20.0).clamp(0.0, 0.3);

-    let noise_seed = feat.variance * 31.7 + update.tick as f64 * 17.3 + person_idx as f64 * 97.1;
+    let noise_seed = person_idx as f64 * 97.1; // stable per-person, no tick
    let noise_val = (noise_seed.sin() * 43758.545).fract();

    let snr_factor = ((feat.variance - 0.5) / 10.0).clamp(0.0, 1.0);
@@ -2047,9 +2345,10 @@ fn derive_single_person_pose(

            let extremity_jitter = if EXTREMITY_KP.contains(&i) {
                let phase = noise_seed + i as f64 * 2.399;
+                // Dampened from 12/8 to 4/3 to reduce visual jumping.
                (
-                    phase.sin() * burst * motion_score * 12.0,
-                    (phase * 1.31).cos() * burst * motion_score * 8.0,
+                    phase.sin() * burst * motion_score * 4.0,
+                    (phase * 1.31).cos() * burst * motion_score * 3.0,
                )
            } else {
                (0.0, 0.0)
@@ -2128,6 +2427,95 @@ fn derive_pose_from_sensing(update: &SensingUpdate) -> Vec<PersonDetection> {
        .collect()
 }

+// ── RuVector Phase 2: Temporal EMA smoothing for keypoints ──────────────────
+
+/// Expected bone lengths in pixel-space for the COCO-17 skeleton as used by
+/// `derive_single_person_pose`. Pairs are (parent_idx, child_idx).
+const POSE_BONE_PAIRS: &[(usize, usize)] = &[
+    (5, 7), (7, 9), (6, 8), (8, 10),   // arms
+    (5, 11), (6, 12),                     // torso
+    (11, 13), (13, 15), (12, 14), (14, 16), // legs
+    (5, 6), (11, 12),                     // shoulders, hips
+];
+
+/// Apply temporal EMA smoothing and bone-length clamping to person detections.
+///
+/// For the *first* person (index 0) this uses the per-node `prev_keypoints`
+/// state. Multi-person smoothing is left for a future phase.
+fn apply_temporal_smoothing(persons: &mut [PersonDetection], ns: &mut NodeState) {
+    if persons.is_empty() {
+        return;
+    }
+
+    let alpha = ns.ema_alpha();
+    let person = &mut persons[0]; // smooth primary person only
+
+    let current_kps: Vec<[f64; 3]> = person.keypoints.iter()
+        .map(|kp| [kp.x, kp.y, kp.z])
+        .collect();
+
+    let smoothed = if let Some(ref prev) = ns.prev_keypoints {
+        let mut out = Vec::with_capacity(current_kps.len());
+        for (cur, prv) in current_kps.iter().zip(prev.iter()) {
+            out.push([
+                alpha * cur[0] + (1.0 - alpha) * prv[0],
+                alpha * cur[1] + (1.0 - alpha) * prv[1],
+                alpha * cur[2] + (1.0 - alpha) * prv[2],
+            ]);
+        }
+        // Clamp bone lengths to ±20% of previous frame.
+        clamp_bone_lengths_f64(&mut out, prev);
+        out
+    } else {
+        current_kps.clone()
+    };
+
+    // Write smoothed keypoints back into the person detection.
+    for (kp, s) in person.keypoints.iter_mut().zip(smoothed.iter()) {
+        kp.x = s[0];
+        kp.y = s[1];
+        kp.z = s[2];
+    }
+
+    ns.prev_keypoints = Some(smoothed);
+}
+
+/// Clamp bone lengths so no bone changes by more than MAX_BONE_CHANGE_RATIO
+/// compared to the previous frame.
+fn clamp_bone_lengths_f64(pose: &mut Vec<[f64; 3]>, prev: &[[f64; 3]]) {
+    for &(p, c) in POSE_BONE_PAIRS {
+        if p >= pose.len() || c >= pose.len() {
+            continue;
+        }
+        let prev_len = dist_f64(&prev[p], &prev[c]);
+        if prev_len < 1e-6 {
+            continue;
+        }
+        let cur_len = dist_f64(&pose[p], &pose[c]);
+        if cur_len < 1e-6 {
+            continue;
+        }
+        let ratio = cur_len / prev_len;
+        let lo = 1.0 - MAX_BONE_CHANGE_RATIO;
+        let hi = 1.0 + MAX_BONE_CHANGE_RATIO;
+        if ratio < lo || ratio > hi {
+            let target = prev_len * ratio.clamp(lo, hi);
+            let scale = target / cur_len;
+            for dim in 0..3 {
+                let diff = pose[c][dim] - pose[p][dim];
+                pose[c][dim] = pose[p][dim] + diff * scale;
+            }
+        }
+    }
+}
+
+fn dist_f64(a: &[f64; 3], b: &[f64; 3]) -> f64 {
+    let dx = b[0] - a[0];
+    let dy = b[1] - a[1];
+    let dz = b[2] - a[2];
+    (dx * dx + dy * dy + dz * dz).sqrt()
+}
+
 // ── DensePose-compatible REST endpoints ─────────────────────────────────────

 async fn health_live(State(state): State<SharedState>) -> Json<serde_json::Value> {
@@ -2999,11 +3387,14 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        else { 0.05 };

                    // Aggregate person count across all active nodes.
+                    // Use max (not sum) because nodes in the same room see the
+                    // same people — summing would double-count.
                    let now = std::time::Instant::now();
                    let total_persons: usize = s.node_states.values()
                        .filter(|n| n.last_frame_time.map_or(false, |t| now.duration_since(t).as_secs() < 10))
                        .map(|n| n.prev_person_count)
-                        .sum();
+                        .max()
+                        .unwrap_or(0);

                    // Build nodes array with all active nodes.
                    let active_nodes: Vec<NodeInfo> = s.node_states.iter()
@@ -3026,13 +3417,31 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        change_points: 0,
                        spectral_power: vitals.motion_energy as f64,
                    };
-                    let classification = ClassificationInfo {
+
+                    // Store latest features on node for cross-node fusion.
+                    s.node_states.get_mut(&node_id)
+                        .map(|ns| ns.latest_features = Some(features.clone()));
+
+                    // Cross-node fusion: combine features from all active nodes.
+                    let fused_features = fuse_multi_node_features(&features, &s.node_states);
+
+                    let mut classification = ClassificationInfo {
                        motion_level: motion_level.to_string(),
                        presence: vitals.presence,
                        confidence: vitals.presence_score as f64,
                    };
+
+                    // Boost classification confidence with multi-node coverage.
+                    let n_active = s.node_states.values()
+                        .filter(|ns| ns.last_frame_time.map_or(false, |t| now.duration_since(t).as_secs() < 10))
+                        .count();
+                    if n_active > 1 {
+                        classification.confidence = (classification.confidence
+                            * (1.0 + 0.15 * (n_active as f64 - 1.0))).clamp(0.0, 1.0);
+                    }
+
                    let signal_field = generate_signal_field(
-                        vitals.rssi as f64, motion_score, vitals.breathing_rate_bpm / 60.0,
+                        fused_features.mean_rssi, motion_score, vitals.breathing_rate_bpm / 60.0,
                        (vitals.presence_score as f64).min(1.0), &[],
                    );

@@ -3042,7 +3451,7 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        source: "esp32".to_string(),
                        tick,
                        nodes: active_nodes,
-                        features: features.clone(),
+                        features: fused_features.clone(),
                        classification,
                        signal_field,
                        vital_signs: Some(VitalSigns {
@@ -3064,7 +3473,13 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        estimated_persons: if total_persons > 0 { Some(total_persons) } else { None },
                    };

-                    let persons = derive_pose_from_sensing(&update);
+                    let mut persons = derive_pose_from_sensing(&update);
+                    // RuVector Phase 2: temporal smoothing + coherence gating
+                    {
+                        let ns = s.node_states.entry(node_id).or_insert_with(NodeState::new);
+                        ns.update_coherence(vitals.motion_energy as f64);
+                        apply_temporal_smoothing(&mut persons, ns);
+                    }
                    if !persons.is_empty() {
                        update.persons = Some(persons);
                    }
@@ -3117,7 +3532,10 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                    // We scope the mutable borrow of node_states so we can
                    // access other AppStateInner fields afterward.
                    let node_id = frame.node_id;
-                    let adaptive_model_ref = s.adaptive_model.as_ref().map(|m| m as *const _);
+                    // Clone adaptive model before mutable borrow of node_states
+                    // to avoid unsafe raw pointer (review finding #2).
+                    let adaptive_model_clone = s.adaptive_model.clone();
+
                    let ns = s.node_states.entry(node_id).or_insert_with(NodeState::new);
                    ns.last_frame_time = Some(std::time::Instant::now());

@@ -3131,12 +3549,8 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        extract_features_from_frame(&frame, &ns.frame_history, sample_rate_hz);
                    smooth_and_classify_node(ns, &mut classification, raw_motion);

-                    // SAFETY: adaptive_model_ref points into s which we hold
-                    // via write lock; the model is not mutated here. We use a
-                    // raw pointer to break the borrow-checker deadlock between
-                    // node_states and adaptive_model (both inside s).
-                    if let Some(model_ptr) = adaptive_model_ref {
-                        let model: &adaptive_classifier::AdaptiveModel = unsafe { &*model_ptr };
+                    // Adaptive override using cloned model (safe, no raw pointers).
+                    if let Some(ref model) = adaptive_model_clone {
                        let amps = ns.frame_history.back()
                            .map(|v| v.as_slice())
                            .unwrap_or(&[]);
@@ -3170,8 +3584,10 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                    let vitals = smooth_vitals_node(ns, &raw_vitals);
                    ns.latest_vitals = vitals.clone();

-                    let raw_score = compute_person_score(&features);
-                    ns.smoothed_person_score = ns.smoothed_person_score * 0.90 + raw_score * 0.10;
+                    // DynamicMinCut person estimation from subcarrier correlation.
+                    let corr_persons = estimate_persons_from_correlation(&ns.frame_history);
+                    let raw_score = corr_persons as f64 / 3.0;
+                    ns.smoothed_person_score = ns.smoothed_person_score * 0.92 + raw_score * 0.08;
                    if classification.presence {
                        let count = score_to_person_count(ns.smoothed_person_score, ns.prev_person_count);
                        ns.prev_person_count = count;
@@ -3179,6 +3595,9 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        ns.prev_person_count = 0;
                    }

+                    // Store latest features on node for cross-node fusion.
+                    ns.latest_features = Some(features.clone());
+
                    // Done with per-node mutable borrow; now read aggregated
                    // state from all nodes (the borrow of `ns` ends here).
                    // (We re-borrow node_states immutably via `s` below.)
@@ -3189,6 +3608,9 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                    }
                    s.latest_vitals = vitals.clone();

+                    // Cross-node fusion: combine features from all active nodes.
+                    let fused_features = fuse_multi_node_features(&features, &s.node_states);
+
                    s.tick += 1;
                    let tick = s.tick;

@@ -3197,11 +3619,23 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        else { 0.05 };

                    // Aggregate person count across all active nodes.
+                    // Use max (not sum) because nodes in the same room see the
+                    // same people — summing would double-count.
                    let now = std::time::Instant::now();
                    let total_persons: usize = s.node_states.values()
                        .filter(|n| n.last_frame_time.map_or(false, |t| now.duration_since(t).as_secs() < 10))
                        .map(|n| n.prev_person_count)
-                        .sum();
+                        .max()
+                        .unwrap_or(0);
+
+                    // Boost classification confidence with multi-node coverage.
+                    let n_active = s.node_states.values()
+                        .filter(|ns| ns.last_frame_time.map_or(false, |t| now.duration_since(t).as_secs() < 10))
+                        .count();
+                    if n_active > 1 {
+                        classification.confidence = (classification.confidence
+                            * (1.0 + 0.15 * (n_active as f64 - 1.0))).clamp(0.0, 1.0);
+                    }

                    // Build nodes array with all active nodes.
                    let active_nodes: Vec<NodeInfo> = s.node_states.iter()
@@ -3223,11 +3657,11 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        source: "esp32".to_string(),
                        tick,
                        nodes: active_nodes,
-                        features: features.clone(),
+                        features: fused_features.clone(),
                        classification,
                        signal_field: generate_signal_field(
-                            features.mean_rssi, motion_score, breathing_rate_hz,
-                            features.variance.min(1.0), &sub_variances,
+                            fused_features.mean_rssi, motion_score, breathing_rate_hz,
+                            fused_features.variance.min(1.0), &sub_variances,
                        ),
                        vital_signs: Some(vitals),
                        enhanced_motion: None,
@@ -3242,7 +3676,13 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        estimated_persons: if total_persons > 0 { Some(total_persons) } else { None },
                    };

-                    let persons = derive_pose_from_sensing(&update);
+                    let mut persons = derive_pose_from_sensing(&update);
+                    // RuVector Phase 2: temporal smoothing + coherence gating
+                    {
+                        let ns = s.node_states.entry(node_id).or_insert_with(NodeState::new);
+                        ns.update_coherence(features.motion_band_power);
+                        apply_temporal_smoothing(&mut persons, ns);
+                    }
                    if !persons.is_empty() {
                        update.persons = Some(persons);
                    }
@@ -3251,6 +3691,19 @@ async fn udp_receiver_task(state: SharedState, udp_port: u16) {
                        let _ = s.tx.send(json);
                    }
                    s.latest_update = Some(update);
+
+                    // Evict stale nodes every 100 ticks to prevent memory leak.
+                    if tick % 100 == 0 {
+                        let stale = Duration::from_secs(60);
+                        let before = s.node_states.len();
+                        s.node_states.retain(|_id, ns| {
+                            ns.last_frame_time.map_or(false, |t| now.duration_since(t) < stale)
+                        });
+                        let evicted = before - s.node_states.len();
+                        if evicted > 0 {
+                            info!("Evicted {} stale node(s), {} active", evicted, s.node_states.len());
+                        }
+                    }
                }
            }
            Err(e) => {
@@ -61,7 +61,10 @@ pub use coherence_gate::{GateDecision, GatePolicy};
 pub use multiband::MultiBandCsiFrame;
 pub use multistatic::FusedSensingFrame;
 pub use phase_align::{PhaseAligner, PhaseAlignError};
-pub use pose_tracker::{KeypointState, PoseTrack, TrackLifecycleState};
+pub use pose_tracker::{
+    CompressedPoseHistory, KeypointState, PoseTrack, SkeletonConstraints,
+    TemporalKeypointAttention, TrackLifecycleState,
+};

 /// Number of keypoints in a full-body pose skeleton (COCO-17).
 pub const NUM_KEYPOINTS: usize = 17;
@@ -26,6 +26,8 @@
 //!
 //! - `ruvector-mincut` -> Person separation and track assignment

+use std::collections::VecDeque;
+
 use super::{TrackId, NUM_KEYPOINTS};

 /// Errors from the pose tracker.
@@ -648,6 +650,365 @@ impl PoseDetection {
    }
 }

+// ---------------------------------------------------------------------------
+// Skeleton kinematic constraints (RuVector Phase 3)
+// ---------------------------------------------------------------------------
+
+/// Expected bone lengths in normalized coordinates (parent_idx, child_idx, length).
+///
+/// These define the COCO-17 kinematic tree edges with approximate proportions
+/// derived from anthropometric averages. Used by [`SkeletonConstraints`] to
+/// reject impossible poses (e.g., arm longer than torso).
+const BONE_LENGTHS: &[(usize, usize, f32)] = &[
+    (5, 7, 0.15),   // L shoulder -> L elbow
+    (7, 9, 0.14),   // L elbow -> L wrist
+    (6, 8, 0.15),   // R shoulder -> R elbow
+    (8, 10, 0.14),  // R elbow -> R wrist
+    (5, 11, 0.25),  // L shoulder -> L hip
+    (6, 12, 0.25),  // R shoulder -> R hip
+    (11, 13, 0.22), // L hip -> L knee
+    (13, 15, 0.22), // L knee -> L ankle
+    (12, 14, 0.22), // R hip -> R knee
+    (14, 16, 0.22), // R knee -> R ankle
+    (5, 6, 0.18),   // L shoulder -> R shoulder
+    (11, 12, 0.15), // L hip -> R hip
+];
+
+/// Skeleton kinematic constraint enforcer using Jakobsen relaxation.
+///
+/// Iteratively projects bone lengths toward their expected values so that
+/// the resulting skeleton obeys basic anthropometric limits. Bones that
+/// deviate more than [`Self::TOLERANCE`] (30 %) from their rest length are
+/// corrected over [`Self::ITERATIONS`] passes.
+pub struct SkeletonConstraints;
+
+impl SkeletonConstraints {
+    /// Maximum allowed fractional deviation before correction kicks in.
+    const TOLERANCE: f32 = 0.30;
+
+    /// Number of Jakobsen relaxation iterations.
+    const ITERATIONS: usize = 3;
+
+    /// Enforce kinematic constraints in-place on `keypoints`.
+    ///
+    /// Each element is `[x, y, z]`. The method runs several iterations of
+    /// distance-constraint projection (Jakobsen method) over the edges
+    /// defined in [`BONE_LENGTHS`].
+    pub fn enforce_constraints(keypoints: &mut [[f32; 3]; 17]) {
+        for _ in 0..Self::ITERATIONS {
+            for &(a, b, rest_len) in BONE_LENGTHS {
+                let dx = keypoints[b][0] - keypoints[a][0];
+                let dy = keypoints[b][1] - keypoints[a][1];
+                let dz = keypoints[b][2] - keypoints[a][2];
+                let current_len = (dx * dx + dy * dy + dz * dz).sqrt();
+
+                // Skip degenerate / zero-length bones (e.g. all-zero pose).
+                if current_len < 1e-9 {
+                    continue;
+                }
+
+                let ratio = current_len / rest_len;
+                // Only correct if deviation exceeds tolerance.
+                if ratio < (1.0 - Self::TOLERANCE) || ratio > (1.0 + Self::TOLERANCE) {
+                    let correction = (rest_len - current_len) / current_len * 0.5;
+                    let cx = dx * correction;
+                    let cy = dy * correction;
+                    let cz = dz * correction;
+
+                    keypoints[a][0] -= cx;
+                    keypoints[a][1] -= cy;
+                    keypoints[a][2] -= cz;
+                    keypoints[b][0] += cx;
+                    keypoints[b][1] += cy;
+                    keypoints[b][2] += cz;
+                }
+            }
+        }
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Compressed pose history (RuVector Phase 3 -- temporal tensor)
+// ---------------------------------------------------------------------------
+
+/// Two-tier compressed pose history.
+///
+/// Recent poses are stored at full `f32` precision in the *hot* ring buffer.
+/// Once the hot buffer is full the oldest pose is quantised to `i16` and
+/// pushed into the *warm* tier, keeping memory usage bounded while still
+/// allowing similarity queries against a longer temporal window.
+pub struct CompressedPoseHistory {
+    /// Recent poses at full precision.
+    hot: VecDeque<[[f32; 3]; 17]>,
+    /// Older poses quantised to i16.
+    warm: VecDeque<[[i16; 3]; 17]>,
+    /// Scale factor used for warm quantisation (divide f32, multiply to
+    /// reconstruct).
+    scale: f32,
+    max_hot: usize,
+    max_warm: usize,
+}
+
+impl CompressedPoseHistory {
+    /// Create a new history with the given tier sizes.
+    ///
+    /// `scale` controls the fixed-point quantisation: warm values are stored
+    /// as `(value / scale).round() as i16`.
+    pub fn new(max_hot: usize, max_warm: usize, scale: f32) -> Self {
+        Self {
+            hot: VecDeque::with_capacity(max_hot),
+            warm: VecDeque::with_capacity(max_warm),
+            scale: if scale.abs() < 1e-12 { 1.0 } else { scale },
+            max_hot,
+            max_warm,
+        }
+    }
+
+    /// Push a new pose into the history.
+    ///
+    /// When the hot tier is full the oldest entry is quantised and moved to
+    /// the warm tier. When the warm tier overflows the oldest warm entry is
+    /// discarded.
+    pub fn push(&mut self, pose: &[[f32; 3]; 17]) {
+        if self.hot.len() >= self.max_hot {
+            if let Some(evicted) = self.hot.pop_front() {
+                let quantised = self.quantise(&evicted);
+                if self.warm.len() >= self.max_warm {
+                    self.warm.pop_front();
+                }
+                self.warm.push_back(quantised);
+            }
+        }
+        self.hot.push_back(*pose);
+    }
+
+    /// Cosine similarity between `pose` and the most recent stored pose.
+    ///
+    /// Both poses are flattened to 51-element vectors before the dot-product
+    /// is computed. Returns 0.0 when the history is empty or either vector
+    /// has zero norm.
+    pub fn similarity(&self, pose: &[[f32; 3]; 17]) -> f32 {
+        let recent = match self.hot.back() {
+            Some(r) => r,
+            None => return 0.0,
+        };
+
+        let mut dot = 0.0_f32;
+        let mut norm_a = 0.0_f32;
+        let mut norm_b = 0.0_f32;
+
+        for kp in 0..17 {
+            for d in 0..3 {
+                let a = recent[kp][d];
+                let b = pose[kp][d];
+                dot += a * b;
+                norm_a += a * a;
+                norm_b += b * b;
+            }
+        }
+
+        let denom = (norm_a * norm_b).sqrt();
+        if denom < 1e-12 {
+            return 0.0;
+        }
+        (dot / denom).clamp(-1.0, 1.0)
+    }
+
+    /// Total number of stored poses (hot + warm).
+    pub fn len(&self) -> usize {
+        self.hot.len() + self.warm.len()
+    }
+
+    /// Returns `true` when the history contains no poses.
+    pub fn is_empty(&self) -> bool {
+        self.hot.is_empty() && self.warm.is_empty()
+    }
+
+    // -- internal helpers ---------------------------------------------------
+
+    fn quantise(&self, pose: &[[f32; 3]; 17]) -> [[i16; 3]; 17] {
+        let inv = 1.0 / self.scale;
+        let mut out = [[0_i16; 3]; 17];
+        for kp in 0..17 {
+            for d in 0..3 {
+                out[kp][d] = (pose[kp][d] * inv)
+                    .round()
+                    .clamp(i16::MIN as f32, i16::MAX as f32)
+                    as i16;
+            }
+        }
+        out
+    }
+}
+
+impl Default for CompressedPoseHistory {
+    fn default() -> Self {
+        Self::new(10, 50, 0.001)
+    }
+}
+
+// ---------------------------------------------------------------------------
+// Temporal Keypoint Attention (RuVector Phase 2)
+// ---------------------------------------------------------------------------
+
+/// Sliding-window temporal smoother for 17-keypoint pose estimates.
+///
+/// Maintains a ring buffer of the last `WINDOW_SIZE` pose frames and applies
+/// exponential-decay weighted averaging to produce temporally coherent output.
+/// Additionally enforces kinematic constraints: bone lengths cannot change by
+/// more than 20% between consecutive frames.
+///
+/// This is a lightweight inline implementation that mirrors the algorithm in
+/// `ruvector-attention` without pulling the crate into the sensing server.
+pub struct TemporalKeypointAttention {
+    /// Ring buffer of recent pose frames (newest at back).
+    window: std::collections::VecDeque<[[f32; 3]; NUM_KEYPOINTS]>,
+    /// Maximum number of frames to retain.
+    window_size: usize,
+    /// Exponential decay factor per frame (e.g., 0.7 means frame t-1 has
+    /// weight 0.7, frame t-2 has weight 0.49, etc.).
+    decay: f32,
+}
+
+impl TemporalKeypointAttention {
+    /// Default window size (10 frames at 10-20 Hz = 0.5-1.0 s look-back).
+    pub const DEFAULT_WINDOW: usize = 10;
+    /// Default decay factor.
+    pub const DEFAULT_DECAY: f32 = 0.7;
+    /// Maximum allowed bone-length change ratio between consecutive frames.
+    pub const MAX_BONE_CHANGE: f32 = 0.20;
+
+    /// Create a new temporal attention smoother with default parameters.
+    pub fn new() -> Self {
+        Self {
+            window: std::collections::VecDeque::with_capacity(Self::DEFAULT_WINDOW),
+            window_size: Self::DEFAULT_WINDOW,
+            decay: Self::DEFAULT_DECAY,
+        }
+    }
+
+    /// Create with custom window size and decay.
+    pub fn with_params(window_size: usize, decay: f32) -> Self {
+        Self {
+            window: std::collections::VecDeque::with_capacity(window_size),
+            window_size,
+            decay: decay.clamp(0.0, 1.0),
+        }
+    }
+
+    /// Smooth the current keypoint estimate using the temporal window.
+    ///
+    /// 1. Pushes `current` into the window (evicting oldest if full).
+    /// 2. Computes exponential-decay weighted average across all frames.
+    /// 3. Enforces bone-length constraints against the previous frame.
+    pub fn smooth_keypoints(
+        &mut self,
+        current: &[[f32; 3]; NUM_KEYPOINTS],
+    ) -> [[f32; 3]; NUM_KEYPOINTS] {
+        // Grab the previous frame (before pushing current) for bone clamping.
+        let prev_frame = self.window.back().copied();
+
+        // Push current frame into the window.
+        if self.window.len() >= self.window_size {
+            self.window.pop_front();
+        }
+        self.window.push_back(*current);
+
+        // Compute weighted average with exponential decay (newest = highest weight).
+        let n = self.window.len();
+        let mut result = [[0.0_f32; 3]; NUM_KEYPOINTS];
+        let mut total_weight = 0.0_f32;
+
+        for (age, frame) in self.window.iter().rev().enumerate() {
+            let w = self.decay.powi(age as i32);
+            total_weight += w;
+            for kp in 0..NUM_KEYPOINTS {
+                for dim in 0..3 {
+                    result[kp][dim] += w * frame[kp][dim];
+                }
+            }
+        }
+
+        if total_weight > 0.0 {
+            for kp in 0..NUM_KEYPOINTS {
+                for dim in 0..3 {
+                    result[kp][dim] /= total_weight;
+                }
+            }
+        }
+
+        // Enforce bone-length constraints: no bone can change >20% from prev frame.
+        if let Some(prev) = prev_frame {
+            if n >= 2 {
+                Self::clamp_bone_lengths(&mut result, &prev);
+            }
+        }
+
+        result
+    }
+
+    /// Clamp bone lengths so they don't change by more than MAX_BONE_CHANGE
+    /// compared to the previous frame.
+    fn clamp_bone_lengths(
+        pose: &mut [[f32; 3]; NUM_KEYPOINTS],
+        prev: &[[f32; 3]; NUM_KEYPOINTS],
+    ) {
+        for &(parent, child, _) in BONE_LENGTHS {
+            let prev_len = Self::bone_len(prev, parent, child);
+            if prev_len < 1e-6 {
+                continue; // skip degenerate bones
+            }
+            let cur_len = Self::bone_len(pose, parent, child);
+            if cur_len < 1e-6 {
+                continue;
+            }
+
+            let ratio = cur_len / prev_len;
+            let lo = 1.0 - Self::MAX_BONE_CHANGE;
+            let hi = 1.0 + Self::MAX_BONE_CHANGE;
+
+            if ratio < lo || ratio > hi {
+                // Scale the child position toward/away from parent to clamp.
+                let target_len = prev_len * ratio.clamp(lo, hi);
+                let scale = target_len / cur_len;
+                for dim in 0..3 {
+                    let diff = pose[child][dim] - pose[parent][dim];
+                    pose[child][dim] = pose[parent][dim] + diff * scale;
+                }
+            }
+        }
+    }
+
+    /// Euclidean distance between two keypoints in a pose.
+    fn bone_len(pose: &[[f32; 3]; NUM_KEYPOINTS], a: usize, b: usize) -> f32 {
+        let dx = pose[b][0] - pose[a][0];
+        let dy = pose[b][1] - pose[a][1];
+        let dz = pose[b][2] - pose[a][2];
+        (dx * dx + dy * dy + dz * dz).sqrt()
+    }
+
+    /// Number of frames currently in the window.
+    pub fn len(&self) -> usize {
+        self.window.len()
+    }
+
+    /// Whether the window is empty.
+    pub fn is_empty(&self) -> bool {
+        self.window.is_empty()
+    }
+
+    /// Clear the window (e.g., on track reset).
+    pub fn clear(&mut self) {
+        self.window.clear();
+    }
+}
+
+impl Default for TemporalKeypointAttention {
+    fn default() -> Self {
+        Self::new()
+    }
+}
+
 #[cfg(test)]
 mod tests {
    use super::*;
@@ -940,4 +1301,223 @@ mod tests {
        track.mark_lost(); // Should not override Terminated
        assert_eq!(track.lifecycle, TrackLifecycleState::Terminated);
    }
+
+    // -----------------------------------------------------------------------
+    // SkeletonConstraints tests
+    // -----------------------------------------------------------------------
+
+    /// Build a plausible standing skeleton in normalised coordinates.
+    fn valid_skeleton() -> [[f32; 3]; 17] {
+        let mut kps = [[0.0_f32; 3]; 17];
+        // Head / face (indices 0-4) clustered near top.
+        kps[0] = [0.0, 1.0, 0.0];   // nose
+        kps[1] = [-0.02, 1.02, 0.0]; // left eye
+        kps[2] = [0.02, 1.02, 0.0];  // right eye
+        kps[3] = [-0.04, 1.0, 0.0];  // left ear
+        kps[4] = [0.04, 1.0, 0.0];   // right ear
+        // Torso
+        kps[5] = [-0.09, 0.85, 0.0]; // L shoulder
+        kps[6] = [0.09, 0.85, 0.0];  // R shoulder
+        kps[7] = [-0.09, 0.70, 0.0]; // L elbow  (dist ~0.15 from shoulder)
+        kps[8] = [0.09, 0.70, 0.0];  // R elbow
+        kps[9] = [-0.09, 0.56, 0.0]; // L wrist  (dist ~0.14 from elbow)
+        kps[10] = [0.09, 0.56, 0.0]; // R wrist
+        kps[11] = [-0.075, 0.60, 0.0]; // L hip  (dist ~0.25 from shoulder)
+        kps[12] = [0.075, 0.60, 0.0];  // R hip
+        kps[13] = [-0.075, 0.38, 0.0]; // L knee (dist ~0.22 from hip)
+        kps[14] = [0.075, 0.38, 0.0];  // R knee
+        kps[15] = [-0.075, 0.16, 0.0]; // L ankle (dist ~0.22 from knee)
+        kps[16] = [0.075, 0.16, 0.0];  // R ankle
+        kps
+    }
+
+    #[test]
+    fn test_valid_skeleton_unchanged() {
+        let mut kps = valid_skeleton();
+        let before = kps;
+        SkeletonConstraints::enforce_constraints(&mut kps);
+
+        // Each keypoint should move by less than 0.02 (small perturbation
+        // from iterative relaxation on an already-valid skeleton).
+        for i in 0..17 {
+            let d = ((kps[i][0] - before[i][0]).powi(2)
+                + (kps[i][1] - before[i][1]).powi(2)
+                + (kps[i][2] - before[i][2]).powi(2))
+            .sqrt();
+            assert!(
+                d < 0.05,
+                "keypoint {} moved {:.4}, expected < 0.05",
+                i,
+                d
+            );
+        }
+    }
+
+    #[test]
+    fn test_stretched_bone_corrected() {
+        let mut kps = valid_skeleton();
+
+        // Stretch L shoulder -> L elbow to 2x expected (0.30 instead of 0.15).
+        kps[7] = [-0.09, 0.55, 0.0]; // push elbow far down
+
+        let dist_before = {
+            let dx = kps[7][0] - kps[5][0];
+            let dy = kps[7][1] - kps[5][1];
+            let dz = kps[7][2] - kps[5][2];
+            (dx * dx + dy * dy + dz * dz).sqrt()
+        };
+        assert!(
+            dist_before > 0.25,
+            "pre-condition: bone should be stretched, got {}",
+            dist_before
+        );
+
+        SkeletonConstraints::enforce_constraints(&mut kps);
+
+        let dist_after = {
+            let dx = kps[7][0] - kps[5][0];
+            let dy = kps[7][1] - kps[5][1];
+            let dz = kps[7][2] - kps[5][2];
+            (dx * dx + dy * dy + dz * dz).sqrt()
+        };
+
+        // After enforcement the bone should be much closer to the rest
+        // length of 0.15 (within tolerance band 0.105 .. 0.195).
+        assert!(
+            dist_after < dist_before,
+            "bone should be shorter after correction: before={:.4}, after={:.4}",
+            dist_before,
+            dist_after
+        );
+    }
+
+    #[test]
+    fn test_zero_skeleton_handled() {
+        // All-zero keypoints must not panic.
+        let mut kps = [[0.0_f32; 3]; 17];
+        SkeletonConstraints::enforce_constraints(&mut kps);
+        // Just assert it didn't panic; the result should still be all-zero
+        // since zero-length bones are skipped.
+        for kp in &kps {
+            assert!(kp[0].is_finite());
+            assert!(kp[1].is_finite());
+            assert!(kp[2].is_finite());
+        }
+    }
+
+    // -----------------------------------------------------------------------
+    // CompressedPoseHistory tests
+    // -----------------------------------------------------------------------
+
+    #[test]
+    fn compressed_history_push_and_len() {
+        let mut hist = CompressedPoseHistory::new(3, 5, 0.001);
+        assert!(hist.is_empty());
+        assert_eq!(hist.len(), 0);
+
+        let pose = valid_skeleton();
+        hist.push(&pose);
+        assert_eq!(hist.len(), 1);
+        assert!(!hist.is_empty());
+
+        // Fill hot
+        hist.push(&pose);
+        hist.push(&pose);
+        assert_eq!(hist.len(), 3); // 3 hot, 0 warm
+
+        // Overflow hot -> warm promotion
+        hist.push(&pose);
+        assert_eq!(hist.len(), 4); // 3 hot, 1 warm
+    }
+
+    #[test]
+    fn compressed_history_warm_overflow() {
+        let mut hist = CompressedPoseHistory::new(2, 2, 0.001);
+        let pose = valid_skeleton();
+
+        // Push 6 poses: hot=2, warm should cap at 2
+        for _ in 0..6 {
+            hist.push(&pose);
+        }
+        // hot=2, warm capped at 2
+        assert_eq!(hist.len(), 4);
+    }
+
+    #[test]
+    fn compressed_history_similarity_identical() {
+        let mut hist = CompressedPoseHistory::default();
+        let pose = valid_skeleton();
+        hist.push(&pose);
+
+        let sim = hist.similarity(&pose);
+        assert!(
+            (sim - 1.0).abs() < 1e-5,
+            "identical pose should have similarity ~1.0, got {}",
+            sim
+        );
+    }
+
+    #[test]
+    fn compressed_history_similarity_empty() {
+        let hist = CompressedPoseHistory::default();
+        let pose = valid_skeleton();
+        assert_eq!(hist.similarity(&pose), 0.0);
+    }
+
+    #[test]
+    fn compressed_history_default() {
+        let hist = CompressedPoseHistory::default();
+        assert_eq!(hist.max_hot, 10);
+        assert_eq!(hist.max_warm, 50);
+        assert!((hist.scale - 0.001).abs() < 1e-9);
+    }
+
+    // ── TemporalKeypointAttention tests (RuVector Phase 2) ─────────────
+
+    #[test]
+    fn temporal_attention_empty_returns_input() {
+        let mut attn = TemporalKeypointAttention::new();
+        let input: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|i| [i as f32, 0.0, 0.0]);
+        let out = attn.smooth_keypoints(&input);
+        // First frame: no history, so output should equal input.
+        for i in 0..NUM_KEYPOINTS {
+            assert!((out[i][0] - input[i][0]).abs() < 1e-5);
+        }
+    }
+
+    #[test]
+    fn temporal_attention_smooths_jitter() {
+        let mut attn = TemporalKeypointAttention::new();
+        let base: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [100.0, 200.0, 0.0]);
+        // Feed stable frames first.
+        for _ in 0..5 {
+            attn.smooth_keypoints(&base);
+        }
+        // Now feed a jittery frame.
+        let jittery: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [110.0, 210.0, 0.0]);
+        let out = attn.smooth_keypoints(&jittery);
+        // Output should be closer to base than to jittery (smoothed).
+        assert!(out[0][0] < 110.0, "Expected smoothing, got {}", out[0][0]);
+        assert!(out[0][0] > 100.0, "Expected some movement, got {}", out[0][0]);
+    }
+
+    #[test]
+    fn temporal_attention_window_size_capped() {
+        let mut attn = TemporalKeypointAttention::with_params(3, 0.7);
+        let frame: [[f32; 3]; NUM_KEYPOINTS] = std::array::from_fn(|_| [1.0, 1.0, 1.0]);
+        for _ in 0..10 {
+            attn.smooth_keypoints(&frame);
+        }
+        assert_eq!(attn.len(), 3);
+    }
+
+    #[test]
+    fn temporal_attention_clear() {
+        let mut attn = TemporalKeypointAttention::new();
+        let frame = zero_positions();
+        attn.smooth_keypoints(&frame);
+        assert!(!attn.is_empty());
+        attn.clear();
+        assert!(attn.is_empty());
+    }
 }
@@ -0,0 +1 @@
+{"intelligence":60,"timestamp":1774039923051}
@@ -56,10 +56,47 @@ export class PoseRenderer {
      [11, 13], [12, 14], [13, 15], [14, 16] // Legs
    ];
    
+    // Client-side keypoint smoothing: lerp between frames to reduce jitter.
+    // Maps person index → array of {x, y} for each keypoint.
+    this._smoothedKeypoints = new Map();
+    this._lerpAlpha = 0.25; // 0 = frozen, 1 = instant (no smoothing)
+
    // Initialize rendering context
    this.initializeContext();
  }

+  // Lerp a single value toward target
+  _lerp(current, target, alpha) {
+    return current + (target - current) * alpha;
+  }
+
+  // Get smoothed keypoint positions for a person
+  _getSmoothedKeypoints(personIdx, keypoints) {
+    if (!this.config.enableSmoothing || !keypoints || keypoints.length === 0) {
+      return keypoints;
+    }
+
+    let prev = this._smoothedKeypoints.get(personIdx);
+    if (!prev || prev.length !== keypoints.length) {
+      // First frame or keypoint count changed — initialize
+      prev = keypoints.map(kp => ({ x: kp.x, y: kp.y, z: kp.z || 0, confidence: kp.confidence, name: kp.name }));
+      this._smoothedKeypoints.set(personIdx, prev);
+      return keypoints;
+    }
+
+    const alpha = this._lerpAlpha;
+    const smoothed = keypoints.map((kp, i) => ({
+      ...kp,
+      x: this._lerp(prev[i].x, kp.x, alpha),
+      y: this._lerp(prev[i].y, kp.y, alpha),
+    }));
+
+    // Update stored positions
+    this._smoothedKeypoints.set(personIdx, smoothed.map(kp => ({ x: kp.x, y: kp.y, z: kp.z || 0, confidence: kp.confidence, name: kp.name })));
+
+    return smoothed;
+  }
+
  createLogger() {
    return {
      debug: (...args) => console.debug('[RENDERER-DEBUG]', new Date().toISOString(), ...args),
@@ -150,18 +187,17 @@ export class PoseRenderer {
        return; // Skip low confidence detections
      }

-      console.log(`✅ [RENDERER] Rendering person ${index} with confidence: ${person.confidence}`);
+      // Apply client-side lerp smoothing to reduce visual jitter
+      const smoothedKps = this._getSmoothedKeypoints(index, person.keypoints);

      // Render skeleton connections
-      if (this.config.showSkeleton && person.keypoints) {
-        console.log(`🦴 [RENDERER] Rendering skeleton for person ${index}`);
-        this.renderSkeleton(person.keypoints, person.confidence);
+      if (this.config.showSkeleton && smoothedKps) {
+        this.renderSkeleton(smoothedKps, person.confidence);
      }

      // Render keypoints
-      if (this.config.showKeypoints && person.keypoints) {
-        console.log(`🔴 [RENDERER] Rendering keypoints for person ${index}`);
-        this.renderKeypoints(person.keypoints, person.confidence);
+      if (this.config.showKeypoints && smoothedKps) {
+        this.renderKeypoints(smoothedKps, person.confidence);
      }

      // Render bounding box
@@ -265,7 +301,7 @@ export class PoseRenderer {
    persons.forEach((person, personIdx) => {
      if (person.confidence < this.config.confidenceThreshold || !person.keypoints) return;

-      const kps = person.keypoints;
+      const kps = this._getSmoothedKeypoints(personIdx, person.keypoints);

      bodyParts.forEach((part) => {
        // Collect valid keypoints for this body part
Author	SHA1	Message	Date
ruv	2732cf9e8f	Merge remote-tracking branch 'origin/main' into feat/cross-node-fusion # Conflicts: # rust-port/wifi-densepose-rs/crates/wifi-densepose-sensing-server/src/main.rs	2026-03-30 21:55:40 -04:00
ruv	94e928c274	docs: update CHANGELOG with v0.5.1-v0.5.3 releases Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 21:52:48 -04:00
ruv	10d69c1071	feat: DynamicMinCut person separation + UI lerp smoothing - Added ruvector-mincut dependency to sensing server - Replaced variance-based person scoring with actual graph min-cut on subcarrier temporal correlation matrix (Pearson correlation edges, DynamicMinCut exact max-flow) - Recalibrated feature scaling for real ESP32 data ranges - UI: client-side lerp interpolation (alpha=0.25) on keypoint positions - Dampened procedural animation (noise, stride, extremity jitter) - Person count thresholds retuned for mincut ratio Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 16:39:05 -04:00
ruv	3f549f4d25	fix(ui): add client-side lerp smoothing to pose renderer Keypoints now interpolate between frames (alpha=0.25) instead of jumping directly to new positions. This eliminates visual jitter that persists even with server-side EMA smoothing, because the renderer was drawing every WebSocket frame at full rate. Applied to skeleton, keypoints, and dense body rendering paths. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 15:54:44 -04:00
rUv	cd84c35f8f	feat: cross-node RSSI-weighted feature fusion (benchmarked) Adds fuse_multi_node_features() that combines CSI features across all active ESP32 nodes using RSSI-based weighting (closer node = higher weight). Benchmark results (2 ESP32 nodes, 30s, ~1500 frames): Metric \| Baseline \| Fusion \| Improvement ---------------------\|----------\|---------\|------------ Variance mean \| 109.4 \| 77.6 \| -29% noise Variance std \| 154.1 \| 105.4 \| -32% stability Confidence \| 0.643 \| 0.686 \| +7% Keypoint spread std \| 4.5 \| 1.3 \| -72% jitter Presence ratio \| 93.4% \| 94.6% \| +1.3pp Person count still fluctuates near threshold — tracked as known issue. Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net.	2026-03-30 15:48:33 -04:00
ruv	f0bdc1aa69	feat(server): cross-node RSSI-weighted feature fusion + benchmarks Adds fuse_multi_node_features() that combines CSI features across all active ESP32 nodes using RSSI-based weighting (closer node = higher weight). Benchmark results (2 ESP32 nodes, 30s, ~1500 frames): Metric \| Baseline \| Fusion \| Improvement ---------------------\|----------\|---------\|------------ Variance mean \| 109.4 \| 77.6 \| -29% noise Variance std \| 154.1 \| 105.4 \| -32% stability Confidence \| 0.643 \| 0.686 \| +7% Keypoint spread std \| 4.5 \| 1.3 \| -72% jitter Presence ratio \| 93.4% \| 94.6% \| +1.3pp Person count still fluctuates near threshold — tracked as known issue. Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 15:47:58 -04:00
rUv	dd45160cc5	fix: skeleton jitter + person count stability (hardware-verified) * chore: update vendored ruvector to latest main (v2.1.0-40) Was at v2.0.5-172 (f8f2c600a), now at v2.1.0-40 (050c3fe6f). 316 commits with new crates: ruvector-coherence, sona, ruvector-core, ruvector-gnn improvements, and security hardening. Co-Authored-By: claude-flow <ruv@ruv.net> * feat: RuVector Phases 2+3 — temporal smoothing, kinematic constraints, coherence gating Phase 2 (sensing server): - Temporal keypoint smoothing via EMA (alpha=0.3) with coherence-adaptive blending - Coherence scoring: running variance of motion_energy over 20 frames - Low coherence → reduce alpha to 0.1 (trust measurements less) - Per-node prev_keypoints for frame-to-frame smoothing - Bone length clamping (±20%) in derive_single_person_pose Phase 3 (signal crate): - SkeletonConstraints: Jakobsen relaxation (3 iterations) on 12-bone COCO-17 kinematic tree — prevents impossible skeletons - CompressedPoseHistory: two-tier storage (hot f32 + warm i16 quantized) for trajectory matching and re-ID - 8 new tests for constraints + history Vendored ruvector updated to v2.1.0-40 (latest main, 316 commits). Workspace deps remain at v2.0.4 (crates.io) until v2.1.0 is published. 647 tests pass across both crates (0 failures). Refs #296 Co-Authored-By: claude-flow <ruv@ruv.net> * fix(server): use max instead of sum for multi-node person aggregation With nodes in the same room, each node sees the same people. Summing per-node counts double-counted (2 nodes × 1 person = 2 persons). Now uses max() so 2 nodes × 1 person = 1 person. Verified on real hardware: COM6 (node 1) + COM9 (node 2) on ruv.net, estimated_persons=1 with 1 person in room. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(server): reduce skeleton jitter + raise person count thresholds - EMA alpha 0.3→0.15, low-coherence 0.1→0.05 - Remove tick-based noise (main jitter source) - Breathing 5x slower, extremity jitter 3x smaller, stride 2x smaller - Person count 1→2 threshold 0.65→0.80 - Aggregation sum→max for same-room nodes Verified on COM6+COM9: 1 person stable. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 15:17:48 -04:00
rUv	5e5781b28a	feat: RuVector all phases — temporal smoothing + kinematic constraints + coherence * chore: update vendored ruvector to latest main (v2.1.0-40) Was at v2.0.5-172 (f8f2c600a), now at v2.1.0-40 (050c3fe6f). 316 commits with new crates: ruvector-coherence, sona, ruvector-core, ruvector-gnn improvements, and security hardening. Co-Authored-By: claude-flow <ruv@ruv.net> * feat: RuVector Phases 2+3 — temporal smoothing, kinematic constraints, coherence gating Phase 2 (sensing server): - Temporal keypoint smoothing via EMA (alpha=0.3) with coherence-adaptive blending - Coherence scoring: running variance of motion_energy over 20 frames - Low coherence → reduce alpha to 0.1 (trust measurements less) - Per-node prev_keypoints for frame-to-frame smoothing - Bone length clamping (±20%) in derive_single_person_pose Phase 3 (signal crate): - SkeletonConstraints: Jakobsen relaxation (3 iterations) on 12-bone COCO-17 kinematic tree — prevents impossible skeletons - CompressedPoseHistory: two-tier storage (hot f32 + warm i16 quantized) for trajectory matching and re-ID - 8 new tests for constraints + history Vendored ruvector updated to v2.1.0-40 (latest main, 316 commits). Workspace deps remain at v2.0.4 (crates.io) until v2.1.0 is published. 647 tests pass across both crates (0 failures). Refs #296 Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 14:23:21 -04:00
rUv	6f23e89909	fix: deep review optimizations — firmware + server * feat(signal): subcarrier importance weighting via mincut partition (Phase 1) Adds subcarrier_importance_weights() to ruvector signal crate — converts mincut partition into per-subcarrier float weights (>1.0 for sensitive, 0.5 for insensitive subcarriers). Sensing server now uses weighted mean/variance in extract_features_from_frame instead of treating all 56 subcarriers equally. This emphasizes body-motion- sensitive subcarriers and reduces noise from static multipath. Expected: ~26% reduction in keypoint jitter (±15cm → ±11cm RMS). 284 tests pass (191 trainer + 51 lib + 18 vital_signs + 16 dataset + 8 multi_node). Co-Authored-By: claude-flow <ruv@ruv.net> * fix(firmware): stack overflow risk + tick-rate independence (review findings) Critical fixes from deep review: 1. Stack overflow prevention: Moved BPM scratch buffers (br_buf, hr_buf) from stack to static storage in both process_frame() and update_multi_person_vitals(). Combined stack was ~6.5-7.5 KB of 8 KB limit — now reduced by ~4 KB to safe margins. 2. Tick-rate independence: Post-batch yield now uses pdMS_TO_TICKS(20) with min-1 guard instead of raw vTaskDelay(2). Previously assumed 100Hz tick rate. 3. EDGE_BATCH_LIMIT to header: Moved from local const to edge_processing.h #define for configurability. Firmware builds clean at 843 KB. Co-Authored-By: claude-flow <ruv@ruv.net> * fix(server): stale node eviction, remove unsafe pointer (review findings) Critical fixes from deep review: 1. Stale node eviction: node_states HashMap now evicts nodes with no frame for >60 seconds, every 100 ticks. Prevents unbounded memory growth and stale smoothing data when nodes are replaced. 2. Remove unsafe raw pointer: Replaced the unsafe raw pointer to adaptive_model (used to break borrow checker deadlock with node_states) with a safe .clone() before the mutable borrow. AdaptiveModel derives Clone so this is a clean copy. 284 tests pass, zero failures. Co-Authored-By: claude-flow <ruv@ruv.net>	2026-03-30 13:31:07 -04:00
rUv	1dcf5d42eb	feat(signal): subcarrier importance weighting — RuVector Phase 1 Adds subcarrier_importance_weights() to ruvector signal crate — converts mincut partition into per-subcarrier float weights (>1.0 for sensitive, 0.5 for insensitive subcarriers). Sensing server now uses weighted mean/variance in extract_features_from_frame instead of treating all 56 subcarriers equally. This emphasizes body-motion- sensitive subcarriers and reduces noise from static multipath. Expected: ~26% reduction in keypoint jitter (±15cm → ±11cm RMS). 284 tests pass (191 trainer + 51 lib + 18 vital_signs + 16 dataset + 8 multi_node).	2026-03-30 13:20:05 -04:00
rUv	9814d2bc62	fix(server): correct RSSI byte offset in frame parser (#332 ) The server parsed rssi from buf[14] and noise_floor from buf[15], but the firmware (csi_collector.c) packs them at buf[16] and buf[17]: Firmware: n_subcarriers=u16(6-7) freq=u32(8-11) seq=u32(12-15) rssi=i8(16) Server: n_subcarriers=u8(6) freq=u16(8-9) seq=u32(10-13) rssi=i8(14) ← WRONG This caused RSSI to read the high byte of the sequence counter instead of the actual signed RSSI value, producing positive values (e.g., +9) instead of the correct negative values (e.g., -46 dBm). Added inline documentation of the frame layout matching csi_collector.c. Closes #332	2026-03-30 11:54:03 -04:00
				`@@ -0,0 +1 @@`
				`{"intelligence":7,"timestamp":1774922079152}`
				`@@ -0,0 +1 @@`
				`{"intelligence":35,"timestamp":1774903706609}`
				`@@ -0,0 +1 @@`
				`{"intelligence":60,"timestamp":1774039923051}`