Compare commits

..

1 Commits

Author SHA1 Message Date
ruv b0b203d088 docs(adr): ADR-180 — through-wall camera↔CSI hand-off demo ("Behind the Wall")
Proposed design for the HTML demo: camera-supervised CSI model infers a full
skeleton, hands off camera→RF when you walk behind a wall, and keeps inferring
the skeleton through the wall (S3 + C6 mmWave + Pi5 nexmon multistatic fusion +
AETHER re-ID). Dead-reckoning Kalman smoother (reuses pose_tracker.rs) keeps the
figure fluid through dropped CSI with bounded extrapolation → LOST, never a
phantom. Honesty mechanism: a far-side camera (cognitum-v0) provides ground
truth behind the wall so the through-wall skeleton PCK is MEASURED + published
(metric-locked, ADR-173), not claimed. Reuses ADR-079 supervision, the
multistatic fuser, the calibration crate, and the Observatory UI — new code is a
hand-off module + dead-reckoning smoother + a single-file HTML viewer.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-06-15 15:30:59 -04:00
@@ -0,0 +1,272 @@
# ADR-180: Through-Wall Camera↔CSI Hand-off Demo ("Behind the Wall")
| Field | Value |
|-------|-------|
| **Status** | Proposed |
| **Date** | 2026-06-15 |
| **Deciders** | ruv |
| **Codename** | **BEHIND-THE-WALL** |
| **Builds on** | ADR-079 (camera ground-truth training), ADR-031 (sensing-first RF mode), ADR-134 (CSI→CIR multipath), ADR-029/030 (RuvSense multistatic + persistent field), ADR-024 (AETHER re-ID), ADR-151 (per-room calibration), ADR-173 (metric-locked PCK), ADR-095/096 (rvcsi nexmon) |
## Context
### The demo we want
A single self-contained **HTML page** that tells one honest, visceral story:
1. You stand in front of the laptop. The camera tracks your **full skeletal pose**;
the WiFi-CSI model, trained on *your* movements moments earlier, infers the **same
skeleton** in parallel — a side-by-side "camera vs RF agree" view.
2. You **walk out the door and behind the wall**. The camera **goes blind** (you are
occluded — it honestly shows "no person in frame"). The CSI model **keeps inferring
your skeleton** from the WiFi signal alone — the 3D figure keeps walking, behind the
wall, smoothly. A badge flips from `CAMERA` to `RF-INFERRED (through-wall)`.
3. You **walk back into view**. The camera **re-acquires**; the badge flips back to
`CAMERA`, and the RF-inferred and camera skeletons reconverge.
This is the "WiFi sees through walls" demo — and the user explicitly wants the **inferred
skeleton through the wall**, not just a blob. The project's "prove everything / no AI-slop"
bar means we make that claim **only because we measure it**: a second camera on the far side
of the wall records ground-truth pose *behind* the wall, so the through-wall skeleton's
accuracy is a **reported, reproducible number** — never an unfalsifiable "trust me."
### Honest capability framing (the load-bearing section)
Through-wall **per-joint skeletal inference from WiFi CSI is not a generally-validated
capability** in open settings — WiFi-DensePose (CMU) is camera-*co-located*. What makes it
defensible *here* is the tightly-controlled regime and the measurement:
- **Controlled regime:** one room, one subject (you), one doorway, a model **camera-supervised
on your exact gait and your exact through-door transition** (ADR-079) minutes earlier. This
is in-distribution for *this* demo, not a universal claim.
- **Measured, not asserted:** a far-side camera (cognitum-v0 has 17 `/dev/video*` nodes — use
one, or a phone) records ground-truth pose behind the wall. The through-wall CSI skeleton is
scored against it with the metric-locked PCK harness (ADR-173). **We publish the number.**
- **Uncertainty is rendered, not hidden:** the through-wall skeleton is drawn **translucent**,
with a live **per-joint confidence** and an explicit `RF-INFERRED` badge. High-confidence
joints render solid; low-confidence joints fade. It never masquerades as the camera's
ground-truth pose.
| While… | Camera | WiFi CSI (S3 / Pi5 nexmon, fused) | 60 GHz mmWave (C6 + MR60BHA2) |
|--------|--------|-----------------------------------|-------------------------------|
| In frame | **Full 17-kpt pose** — ground truth | full skeleton (supervised model) — *agrees with camera* | presence + range + micro-motion |
| Behind a **drywall** | nothing (occluded) | **inferred full skeleton** (camera-supervised model + multistatic fusion), confidence-scored, **measured vs far-side camera** | presence + range + breathing — independent through-thin-wall confirm |
| Behind **brick/metal** | nothing | degrades to coarse motion/position only — report honestly | blocked |
**The claim — stated precisely:** *"A WiFi-CSI model, camera-supervised on this subject and
room, infers a continuous skeletal pose that tracks the subject through a drywall partition;
through-wall accuracy is measured at X% PCK@k against a far-side camera (declared, not
claimed)."* If X turns out low, that is the **honest result we report** — the skeleton is still
rendered (the user wants it) but flagged with its true confidence, and the headline number is
whatever we measured, good or bad.
### Why multistatic + supervision is the enabler
A single node behind a wall sees only "something moved." Three spatially-diverse vantage points
around the doorway (RuvSense multistatic + cross-viewpoint fusion, ADR-029/030) triangulate the
moving scatterer — drywall attenuates and diffracts 2.4/5 GHz but does not block it — giving the
model a rich enough multipath signature to regress a skeleton it was *trained* to associate with
your through-door motion. AETHER re-ID embeddings (ADR-024) keep it locked to **you** across the
camera→RF→camera hand-off.
### Available hardware (the user's actual rig)
| Role | Device | Where | Stream |
|------|--------|-------|--------|
| Near ground truth (visible) | Laptop / USB camera | front of workstation (ruvzen) | MediaPipe pose → keypoints |
| **Far ground truth (validation)** | cognitum-v0 camera (1 of 17 `/dev/video*`) or a phone | **behind the wall** | MediaPipe pose → keypoints (for MEASURING the through-wall skeleton) |
| CSI node A | ESP32-S3 (8 MB) | COM9 (ruvzen) | UDP CSI :5005 |
| CSI + mmWave node B | ESP32-C6 + Seeed MR60BHA2 | COM12 (ruvzen) | WiFi CSI + 60 GHz FMCW presence/range |
| CSI node C (through-wall vantage) | Pi 5, BCM43455c0 | cognitum-v0 (other room) | nexmon_csi `.pcap` → rvcsi → CsiFrame |
| Fusion + serving | sensing-server | ruvzen :3000/:8765 | `/ws/sensing`, `/ws/pose`, new `/ws/handoff` |
Place **node C (Pi 5) and the far camera on the far side of the wall** — the Pi 5 gives the
fuser a vantage the camera lacks, and the far camera turns the through-wall claim into a
measurement.
## Decision
Build a **camera↔CSI hand-off demo** as a thin, additive layer over existing components (no new
heavy crate). Five parts: a multi-source capture plane, a camera-supervised calibration walk
that **learns to infer the skeleton through the wall**, a **hand-off state machine**, a
**dead-reckoning smoother** so dropped CSI never makes the figure jump, and a single-file HTML
viewer that renders the inferred skeleton with honest confidence.
### 1. Capture plane (reuse, don't rebuild)
- **Near camera:** `scripts/collect-ground-truth.py` already does MediaPipe pose + ESP32 CSI
paired capture (ADR-079). Extend it to also subscribe to the Pi 5 nexmon stream (rvcsi), the
C6 mmWave presence, **and the far camera**, so every frame is
`(near_pose|null, far_pose|null, csi_S3, csi_C6, mmwave_C6, csi_Pi5, t)`.
- **CSI nodes:** S3 over UDP :5005, Pi 5 via `rvcsi` (vendor/rvcsi nexmon adapter → `CsiFrame`),
C6 WiFi CSI + the MR60BHA2 60 GHz presence/range/breathing.
- **Fusion:** all CSI sources into the existing `MultistaticFuser`
(`signal/src/ruvsense/multistatic.rs`); node positions around the doorway via
`--node-positions` (geometric-diversity index drives confidence). **#1049:** with 3
independently-clocked nodes set `WDP_GUARD_INTERVAL_US` to the real inter-node spread or
fusion demotes.
### 2. Calibration walk — "it learns my movements **and infers them through the wall**" (ADR-079)
A 35 minute guided routine. The HTML page scripts the walk: stand, step left/right, walk to the
door, **cross fully behind the wall and back**, repeat — covering the visible AND the occluded
zone, because **both cameras label ground truth**:
- **Visible-zone supervision:** near camera labels pose; synchronized CSI window is the input.
- **Through-wall supervision (the key part):** while you are behind the wall, the **far camera**
labels your pose. So the CSI→skeleton model is trained on *real behind-wall poses* paired with
the *behind-wall multistatic CSI* — the model genuinely learns to infer your skeleton through
the wall, supervised by ground truth, not extrapolated blindly.
- Train/fine-tune on `ruvultra` (RTX 5080) if available, else the local recipe. Persist as a
per-room calibration bank (ADR-151 `baseline → enroll → extract → train`). AETHER re-ID
embeddings (ADR-024) bind the track to you across the hand-off.
- **Held-out split:** reserve some behind-wall passes for evaluation so through-wall PCK is
measured on data the model never trained on (no leakage — the ADR-152 measurement discipline).
### 3. Hand-off state machine (`sensing-server/src/handoff.rs`, < 300 lines)
States: `CAMERA``HANDOFF_OUT``RF_INFERRED``HANDOFF_IN``CAMERA` (+ `LOST`).
- **`CAMERA`** — near camera has a confident pose → render it; RF-inferred skeleton ghosted
alongside for the "they agree" effect.
- **`HANDOFF_OUT`** — near-camera confidence drops at the doorway **while** CSI motion stays high
and the multistatic track heads into the door zone → cross-fade source camera→RF.
- **`RF_INFERRED`** — no camera pose; the CSI model emits a **full 17-kpt skeleton** + per-joint
confidence; AETHER confirms it is still you. Render the translucent skeleton + confidence,
badge `RF-INFERRED (through-wall)`. (When fusion confidence is too low for a credible skeleton,
degrade gracefully to a coarse marker rather than a flailing one — honest fallback.)
- **`HANDOFF_IN`** — near camera re-acquires a pose positionally consistent with the last RF
skeleton (continuity gate) → cross-fade RF→camera.
- **`LOST`** — neither source for N cycles → "no track," never invented.
Fail-closed: `RF_INFERRED` requires real multistatic motion energy + an AETHER identity match
above calibrated floors; absent that → `LOST`, never a phantom. Mirrors the governed-trust gate
(ADR-031 / ADR-141).
### 4. Dead reckoning & smoothing — fluid, never jumpy (the user's requirement)
CSI does **not** arrive cleanly: UDP frames drop, nexmon `.pcap` has gaps, the fuser skips
cycles when the #1049 guard rejects a spread, and the model's per-frame skeleton jitters. Render
only on real frames and the figure teleports and shakes — which also *reads as fake*. A
**predict/correct (dead-reckoning) layer** keeps the skeleton continuous and smooth between
measurements, with **bounded** extrapolation so we never invent motion that didn't happen:
- **Per-joint constant-velocity Kalman filter** — reuse `signal/src/ruvsense/pose_tracker.rs`
(the project's existing 17-keypoint Kalman tracker with AETHER re-ID). The renderer runs at a
**fixed ~30 Hz, decoupled from CSI arrival**:
- **Measurement this tick** → Kalman *update* (correct) each joint with the new inferred pose.
- **Dropped CSI this tick** → Kalman *predict* only: advance each joint by `x += v·dt`, so the
skeleton keeps moving along its trajectory instead of freezing then snapping. **This is the
dead reckoning** — the limbs keep their motion through a dropout.
- **Confidence decay (honesty governor):** every predict-only tick multiplies confidence and
widens covariance. Dead reckoning is trusted for a **bounded** horizon (default ≤ ~500 ms,
`WDP_DEADRECKON_MAX_MS`); past it, confidence hits the floor → state machine → `LOST`. **We
coast briefly to stay smooth; we never coast forever to fake a track.** Someone who actually
stopped behind the wall converges to a still pose then `LOST`, not perpetual phantom walking.
- **Re-acquire smoothing:** a returning measurement after a gap is blended in with a
critically-damped step (no overshoot) over 23 ticks, so the skeleton eases onto truth.
- **Client render smoothing (already present):** `ui/observatory/js/figure-pool.js`
`applyKeypoints` already `lerp`s joints with a small velocity overshoot for secondary motion;
the hand-off viewer reuses it. The camera↔RF cross-fade is an alpha-lerp over ~300 ms.
**Dead-reckoning honesty invariants (testable):**
1. Predicted-only frames carry `"dead_reckoned": true` + `"age_ms"`; the UI dims them —
extrapolation is never shown as a fresh measurement.
2. Confidence is **monotonically non-increasing** across consecutive predict-only ticks.
3. After `WDP_DEADRECKON_MAX_MS` of silence the state **must** become `LOST` (pinned test:
measurements then silence → assert transition within the horizon; no perpetual motion).
4. Dead reckoning extrapolates an **existing** track only — no measurement ever ⇒ no track ⇒
`LOST`, never a phantom from zero.
### 5. The HTML demo (single file, vanilla — mirrors the Observatory)
`ui/through-wall/index.html` (+ a small JS bundle, zero build step, like `ui/observatory/`):
- **Left:** near camera feed with the MediaPipe skeleton overlaid while visible; greys to
"CAMERA BLIND" when occluded. (Optional second tile: the far camera, shown only in a
"validation" view, not the hero view.)
- **Right:** a top-down 3D room (Three.js) with the **wall** drawn, the doorway, the three
sensor positions, and the figure: a **solid skeleton** in `CAMERA`, a **translucent skeleton
with per-joint confidence fade** in `RF_INFERRED`, eased by the dead-reckoning smoother.
- **Banner / `BannerState`** (strict, mirrors rufield-viewer): `CAMERA` / `RF-INFERRED — through
wall (conf X%, measured Y% PCK@k)` / `DEAD-RECKONED (age N ms)` / `LOST` — mutually exclusive,
with a one-line honesty caption. The measured through-wall PCK is shown, not invented.
- Consumes a new `GET /ws/handoff` WS/SSE topic of `HandoffFrame`s; `?demo=1` replays a recorded
session badged `REPLAY`.
### Output contract (`HandoffFrame`, JSON)
```jsonc
{
"t_ns": 1718400000000,
"state": "RF_INFERRED", // CAMERA | HANDOFF_OUT | RF_INFERRED | HANDOFF_IN | LOST
"source": "fused_csi", // camera | fused_csi | mmwave | dead_reckoned
"pose": [[x,y,z,conf], …×17], // inferred skeleton WITH per-joint confidence (present in CAMERA/HANDOFF/RF_INFERRED)
"pose_confidence": 0.58, // aggregate; the rendered translucency
"identity_match": 0.81, // AETHER re-ID — is it still you?
"coarse": { "cell":[x,y], "zone":"behind_wall", "heading_deg":95, "node_diversity":0.48 },
"dead_reckoned": false, // true on predict-only (extrapolated) ticks
"age_ms": 0, // ms since the last real measurement (0 = fresh)
"camera_blind": true,
"measured_pck": { "k": 20, "value": null }, // filled from the far-camera validation run; null until measured
"caption": "RF-inferred skeleton — model camera-supervised on this room; through-wall PCK measured separately"
}
```
## Phased plan (each phase independently demoable + falsifiable)
- **P1 — wiring (no claim):** 3-source CSI capture (S3+C6+Pi5) + near camera into the multistatic
fuser. Gate: `/ws/sensing` shows ≥3 active nodes + a fused position with the camera running.
- **P2 — supervised calibration + through-wall training:** the guided walk with **both cameras**;
fine-tune CSI→skeleton on visible AND far-camera-labeled behind-wall poses (ADR-079). Gate:
while-visible PCK declared (metric-locked, ADR-173) on a held-out segment.
- **P3 — MEASURE the through-wall skeleton:** score the RF-inferred skeleton against the far
camera on held-out behind-wall passes → **publish the through-wall PCK@k** (good or bad). Gate:
a committed eval script reproduces the number; honest negative if low.
- **P4 — hand-off + dead reckoning + HTML:** the camera→RF→camera transition renders end-to-end,
smooth through dropped CSI. Gate: a recorded live walk where the camera goes blind, the inferred
skeleton keeps walking fluidly behind the wall, dead-reckons through dropouts without jumps, and
re-acquisition is position-continuous. **This is the demo.**
- **P5 — multi-modal corroboration (optional):** overlay C6 60 GHz presence/range as an
independent through-thin-wall confirm (two physics, one conclusion).
## Consequences
### Positive
- A genuinely compelling demo that does what the user asked — **infers and renders the skeleton
through the wall** — while staying honest because the through-wall accuracy is **measured**
against a far-side camera, not claimed. Reuses the multistatic fuser, ADR-079 supervision, the
Kalman pose tracker, AETHER re-ID, the calibration crate, and the Observatory UI: the new code
is a hand-off module + dead-reckoning smoother + an HTML page.
### Negative / Risks
- **Through-wall skeletal accuracy may be modest or poor.** That is acceptable *iff* reported
honestly — the headline is the measured PCK, whatever it is; the skeleton renders with its true
per-joint confidence (low-confidence joints fade), never as fake certainty.
- **Material dependence:** drywall good; brick/metal degrades to coarse-only — shoot on drywall
and say so.
- **3-node clock sync** is the #1049 hazard — tune `WDP_GUARD_INTERVAL_US`.
- **Per-room, per-subject:** the model that "learned your movements" does not transfer without
re-calibration — stated on the page.
- **Over-claiming is the failure mode.** Mitigations baked in: translucent confidence-faded
skeleton, `dead_reckoned`/`age_ms` flags, the measured-PCK banner, bounded extrapolation→`LOST`.
### Neutral
- No new heavy crate; signal-path proof (`verify.py`) untouched — capture/fusion/UI orchestration
over hardened, already-reviewed components.
## Acceptance criteria (falsifiable — "prove the haters wrong")
On a recorded live session, all must hold:
1. A contiguous window where the **near camera reports no person** (verifiable from raw frames)
**and** the system renders an `RF_INFERRED` skeleton.
2. The inferred skeleton's **gross motion matches reality** — direction of travel and rough gait
phase — confirmed against the **far camera** (not eyeballed).
3. **Through-wall per-joint accuracy is MEASURED** against the far camera and **reported** as
PCK@k from a committed script. Low is fine *if* honestly published; fabricated is not.
4. The figure is **smooth through dropped CSI** — no teleports/jitter — and every predicted-only
frame is flagged `dead_reckoned`; after `WDP_DEADRECKON_MAX_MS` of silence it goes `LOST`.
5. Re-acquisition is **position-continuous** (camera re-detects within a cell of the last RF
position), and AETHER confirms identity across the hand-off.
6. Every number (visible PCK, through-wall PCK, confidences) is MEASURED and reproducible — no
hand-typed metrics.
A demo that cannot meet (1)(2) and (4)(5) on the available hardware is reported as a **negative
result** (honest), not dressed up; a poor (3) is published as the real number.
## Links
- ADR-079 — camera ground-truth training (supervision pipeline; extended here to a far camera)
- ADR-031 — sensing-first RF mode / coherence gate (fail-closed honesty pattern)
- ADR-134 — CSI→CIR multipath (through-wall multipath physics)
- ADR-029 / ADR-030 — RuvSense multistatic + persistent field (the localization engine)
- ADR-024 — AETHER contrastive re-ID (identity lock across the hand-off)
- ADR-151 — per-room calibration crate (bank persistence)
- ADR-152 / ADR-173 — measurement discipline + metric-locked PCK (the honest accuracy readout)
- ADR-095 / ADR-096 — rvcsi nexmon (Pi 5 BCM43455c0 capture)
- `signal/src/ruvsense/pose_tracker.rs` — 17-kpt Kalman tracker reused for dead reckoning
- `ui/observatory/` — the vanilla-JS 3D viewer pattern this demo mirrors