mirror of
https://github.com/ruvnet/RuView
synced 2026-06-09 10:13:17 +00:00
Compare commits (5 commits)

| SHA1 |
|---|
| a85d4e31e4 |
| b16d7431bc |
| b3a5012dbd |
| e6a5df36eb |
| 5c914e63c7 |
@@ -2,6 +2,66 @@
|
||||
|
||||
Append-only log of every published count_v1 training run per ADR-103. New runs add a section; never overwrite history.
|
||||
|
||||
## v0.0.2 — K-fold validated, random split + label smoothing + early stop + temp scale (2026-05-21)
|
||||
|
||||
### Why a new release
|
||||
|
||||
A 5-fold stratified CV on the same 1,077 samples proved the v0.0.1 result was driven by an unlucky temporal split — the trailing window was class-0-heavy, and a degenerate "always predict 0" classifier hit the class-0 fraction (65.1%) trivially.

| Metric | v0.0.1 (temporal) | **5-fold random CV** (diagnostic) |
|---|---|---|
| Overall accuracy | 65.1% | 62.2% ± 1.9% |
| Class 1 accuracy | **0%** | **57.1%** ✓ |
| Confidence Spearman | 0.023 | 0.160 ± 0.029 |

The architecture has real ~57% class-1 capacity under fair splits.
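
To make the degenerate case concrete, the sketch below scores a constant "always predict 0" classifier. The eval window here is hypothetical (140 class-0 and 75 class-1 samples, chosen only so that the class-0 fraction comes out at 65.1%); the real window composition is not stated in this log.

```python
from collections import Counter

def per_class_accuracy(y_true, y_pred):
    """Return (overall accuracy, {class: accuracy}) for two label lists."""
    hits = Counter()
    support = Counter(y_true)
    for t, p in zip(y_true, y_pred):
        if t == p:
            hits[t] += 1
    overall = sum(hits.values()) / len(y_true)
    return overall, {c: hits[c] / n for c, n in support.items()}

# Hypothetical eval window: 140 empty-room samples, 75 one-person samples.
y_true = [0] * 140 + [1] * 75
y_pred = [0] * len(y_true)  # degenerate classifier: always predict 0

overall, by_class = per_class_accuracy(y_true, y_pred)
print(f"overall={overall:.3f} class0={by_class[0]:.3f} class1={by_class[1]:.3f}")
# prints: overall=0.651 class0=1.000 class1=0.000
```

Overall accuracy alone cannot distinguish this classifier from a useful one, which is why the tables report per-class accuracy.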
|
||||
|
||||
### v0.0.2 results
|
||||
|
||||
Architecture unchanged. Training changes only:

- **Random 80/20 split** (seed=42): temporal split eliminated.
- **Label smoothing 0.1** on cross-entropy.
- **Class-balanced multinomial sampler** with replacement.
- **Early stopping** with patience 20 (exited at epoch 29 of 400 max).
- **Temperature scaling** of the conf head via LBFGS: T = **0.9262**, shipped as a `count_v1.temperature` sidecar.
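
As an illustration of the label-smoothing item, here is a minimal NumPy sketch of cross-entropy against smoothed targets. It assumes the convention PyTorch documents for `label_smoothing` (mix the one-hot target with a uniform distribution over the C classes); the function name is hypothetical and this is not the code from `scripts/train-count.py`.

```python
import numpy as np

def smoothed_cross_entropy(logits, targets, smoothing=0.1):
    """Mean cross-entropy against label-smoothed one-hot targets.

    logits: [B, C] float array, targets: [B] int array.
    The true class gets 1 - smoothing + smoothing / C, every other
    class gets smoothing / C.
    """
    B, C = logits.shape
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    soft = np.full((B, C), smoothing / C)
    soft[np.arange(B), targets] += 1.0 - smoothing
    return float(-(soft * log_probs).sum(axis=1).mean())

# Two very confident, correct predictions on a 2-class toy problem.
logits = np.array([[10.0, 0.0], [0.0, 10.0]])
targets = np.array([0, 1])
plain = smoothed_cross_entropy(logits, targets, smoothing=0.0)
smooth = smoothed_cross_entropy(logits, targets, smoothing=0.1)
print(f"plain={plain:.5f} smoothed={smooth:.5f}")
```

With smoothing enabled, the confident-and-correct predictions are penalised, which discourages the logits from growing without bound.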

| Metric | v0.0.1 | **v0.0.2** | K-fold ref |
|---|---|---|---|
| Overall accuracy | 65.1% | **62.3%** | 62.2% ± 1.9% |
| Class 0 accuracy | 100% (cheating) | **86.2%** | 67.4% |
| **Class 1 accuracy** | **0%** | **34.3%** ✓ | 57.1% |
| MAE | 0.349 | 0.377 | 0.378 |
| Confidence Spearman (post-temp) | 0.023 | 0.013 | 0.160 |
| Wall time | 5.6 s (400 ep) | **0.7 s (29 ep)** | 7.5 s (5×100) |

### Honest read
|
||||
|
||||
**Class-1 accuracy 0% → 34.3% is the headline.** The cog now reports `count = 1` honestly when a person is present, instead of always-zero cheating. This single random draw lands below the K-fold class-1 mean of 57.1%; we read that gap as run-to-run variance, not a missing improvement. Reaching 57% on a fixed eval set needs averaging over independent draws, which means more independent recordings, i.e. multi-room data (#645), not another training trick.
|
||||
|
||||
Confidence calibration didn't move. Temperature scaling alone can't fix a confidence head trained against a noisy `argmax==truth` indicator over a 62%-accurate classifier — its training signal is the bottleneck.
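
The sketch below shows what temperature scaling of a binary confidence head does, under stated assumptions: the data are synthetic, the function names are hypothetical, and a golden-section search over log T stands in for the LBFGS fit used in the release.

```python
import numpy as np

def bce_at_temperature(logits, correct, T):
    """Mean binary cross-entropy of sigmoid(logits / T) against 0/1 labels."""
    z = logits / T
    # log(1 + exp(z)) - correct * z, computed stably via logaddexp.
    return float(np.mean(np.logaddexp(0.0, z) - correct * z))

def fit_temperature(logits, correct, lo=0.05, hi=20.0, iters=80):
    """Golden-section search over log T for the BCE-minimising temperature."""
    a, b = np.log(lo), np.log(hi)
    g = (np.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iters):
        c = b - g * (b - a)
        d = a + g * (b - a)
        if bce_at_temperature(logits, correct, np.exp(c)) < bce_at_temperature(logits, correct, np.exp(d)):
            b = d
        else:
            a = c
    return float(np.exp(0.5 * (a + b)))

# Synthetic head whose logits are twice as large as they should be.
rng = np.random.default_rng(0)
true_logits = rng.normal(size=20000)
correct = (rng.random(20000) < 1.0 / (1.0 + np.exp(-true_logits))).astype(float)
overconfident = 2.0 * true_logits

T = fit_temperature(overconfident, correct)
print(f"fitted T={T:.3f}")
print(f"BCE before={bce_at_temperature(overconfident, correct, 1.0):.4f} "
      f"after={bce_at_temperature(overconfident, correct, T):.4f}")
```

Dividing a logit by a positive T is monotone, so the ranking of confidences, and with it any rank correlation such as Spearman, is unchanged by construction. Temperature scaling can only move calibration metrics such as BCE or Brier.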
|
||||
|
||||
### Release artifacts (live on cognitum-v0)

```
gs://cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors
sha256: 32996433516891a37c63c600db8b95e42192a53bd538c088c82cd6a85e55513c
bytes: 392,088
```

Binaries themselves unchanged from v0.0.1 — weights load at runtime via mmap. Per-arch manifests under `cog/artifacts/manifests/{arm,x86_64}/` bumped to `version: 0.0.2`, weights_sha256 + build_metadata caveats updated.
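
A downloaded copy of the weights can be checked against the digest and size above with a few lines of standard-library Python. This is a minimal sketch: the helper names and the local file name in the usage note are assumptions, not part of the release tooling.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_artifact(path: Path, expected_sha256: str, expected_bytes: int) -> bool:
    """True only if both the byte count and the digest match."""
    if path.stat().st_size != expected_bytes:
        return False
    return sha256_of(path) == expected_sha256.lower()
```

For example, `verify_artifact(Path("count_v1.safetensors"), "32996433516891a37c63c600db8b95e42192a53bd538c088c82cd6a85e55513c", 392088)` should return `True` for an intact local copy saved under that (hypothetical) file name.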
|
||||
|
||||
### Reproducibility

```bash
python3 scripts/train-count.py --paired data/paired/wiflow-p7-1779210883.paired.jsonl \
  --k-fold 5 --epochs 100 --out-results kfold_results.json

python3 scripts/train-count.py --paired data/paired/wiflow-p7-1779210883.paired.jsonl \
  --v2 --epochs 400 \
  --out-safetensors count_v1.safetensors --out-onnx count_v1.onnx \
  --out-results count_train_results.json
```

## v0.0.1 — first measured run (2026-05-21)
|
||||
|
||||
### Setup
|
||||
|
||||
@@ -0,0 +1,68 @@
|
||||
# SOTA Research Loop — 2026-05-22
|
||||
|
||||
Started: 2026-05-21 ~20:00 ET. **Auto-stops: 2026-05-22 08:00 ET.** Cron `d6e5c473` (`*/10 * * * *`).
|
||||
|
||||
## Mandate
|
||||
|
||||
Push WiFi-CSI sensing past 2026 published SOTA in three axes:
|
||||
|
||||
1. **Spatial intelligence** — multi-static fusion, room-scale awareness, occupancy beyond counting
|
||||
2. **RF feature engineering** — phase, ToA, subcarrier dynamics, Fresnel zones
|
||||
3. **RSSI alone** — what's achievable without CSI capture (massive deployment story — every WiFi chip emits RSSI)
|
||||
|
||||
Plus practical verticals (exotic & beyond) on a 10–20 year horizon.
|
||||
|
||||
Output goes to `docs/research/sota-2026-05-22/` (research notes, benchmarks, negative results) + `examples/research-sota/` (runnable code).
|
||||
|
||||
## Working principle
|
||||
|
||||
Each loop tick picks ONE **unfinished thread** from below and produces ONE concrete artifact:
|
||||
- a research note (Markdown with sources + measured numbers if possible)
|
||||
- an experiment / micro-benchmark
|
||||
- a working example under `examples/research-sota/`
|
||||
- a negative result ("X doesn't work because Y, here's the data")
|
||||
- an ADR if the thread is mature enough to land
|
||||
|
||||
Stay within 8 minutes per tick. Commit + PR + auto-merge per piece. Future ticks re-enter via this PROGRESS.md.
|
||||
|
||||
## Research vectors
|
||||
|
||||
### Spatial Intelligence
|
||||
|
||||
- [ ] **R1. Multi-static Time-of-Arrival (ToA) from OFDM phase coherence.** Three or more ESP32-S3s with a shared time base reconstruct a person's (x, y) by multilateration on phase-derived time of flight. 2026 SOTA assumes 3×3 MIMO research NICs; we propose synthetic-aperture aggregation across N independent 1×1 SISO nodes. The open problems are subcarrier-level phase unwrapping and per-node clock-offset estimation.
|
||||
- [ ] **R2. Persistent room field model — eigenstructure perturbation.** Already in `wifi-densepose-signal/src/ruvsense/field_model.rs` (SVD on empty-room CSI). Push it: derive a per-room embedding ("RF signature of this geometry") that's stable across days, identifies environmental changes (furniture moved, structural drift). Vertical: building-integrity monitoring.
|
||||
- [ ] **R3. Cross-room re-identification via gait CSI signatures.** Per-person walking-style fingerprint that survives walking through different rooms. Different from `AETHER` (in-room re-ID) — this is *inter*-room continuity.
|
||||
- [ ] **R4. Federated learning of room models.** Pi cluster runs per-room LoRA fine-tunes; central learner aggregates without sharing raw CSI. Privacy-preserving spatial intelligence.
|
||||
|
||||
### RF Feature Engineering
|
||||
|
||||
- [ ] **R5. Subcarrier attention over time → "RF saliency map".** Visualize which subcarriers carry the most information per task. ADR-097 hints at this; nothing in repo computes it. Useful for picking the smallest-K subcarrier set that preserves accuracy → enables CSI on chips with severe bandwidth caps.
|
||||
- [ ] **R6. Fresnel-zone forward model for through-wall sensing.** Code in `wifi-densepose-signal/src/ruvsense/tomography.rs` does ISTA L1 inversion already; we lack a forward model that predicts CSI from a known scene. Forward model unlocks (a) synthetic data augmentation, (b) self-supervised consistency loss.
|
||||
- [ ] **R7. Quantum-inspired Stoer-Wagner sampling for adversarial robustness.** Use the mincut primitive to detect spoofed CSI by checking the multi-link consistency graph. Lands in `cognitum-rvcsi` if it works.
|
||||
|
||||
### RSSI Alone (no CSI)
|
||||
|
||||
- [ ] **R8. RSSI-only presence + vitals.** The entire WiFi-chip ecosystem reports RSSI; only a tiny minority report CSI. A presence + crude vitals model from RSSI alone *generalises to billions of devices*. Hard problem (very low information rate) but enormous downstream value. Start with literature survey + first model experiment.
|
||||
- [ ] **R9. RSSI fingerprint topology — graph neural network on WiFi-scan beacons.** Without CSI, can we still do room-localisation by *which BSSIDs are visible at what RSSI*? Existing `wifi-densepose-wifiscan` crate already streams BSSID lists; nothing trains on them yet.
|
||||
|
||||
### Exotic & Future (10–20 year)
|
||||
|
||||
- [ ] **R10. Through-foliage wildlife sensing.** Same physics as through-wall, but at much lower SNR. Gait recognition on a per-species basis. Practical: non-invasive population monitoring without cameras.
|
||||
- [ ] **R11. Through-bulkhead maritime crew tracking.** Steel attenuates but doesn't eliminate WiFi multipath. Limited range, requires per-vessel calibration.
|
||||
- [ ] **R12. RF "weather" mapping.** Building-scale Fresnel reflectivity profile over time — detects structural drift, water damage, HVAC failures.
|
||||
- [ ] **R13. Contactless blood pressure from sub-mm chest displacement.** Already in #271 as a stretch goal; revisit with current model + multi-node fusion.
|
||||
- [ ] **R14. Empathic appliances.** Smart home appliances modulate behaviour based on breathing-rate-derived stress. Long-horizon — needs both the sensing accuracy *and* an ethical framework.
|
||||
- [ ] **R15. RF biometric across rooms.** Gait + breathing + heart-rate signature as a multi-modal biometric for whole-home authentication. Replaces fingerprint/face on the home-network layer.
|
||||
|
||||
## Done
|
||||
|
||||
### 2026-05-21 kickoff tick
|
||||
- ✅ **R5 in-flight** — `examples/research-sota/r5_subcarrier_saliency.py` runs; first measurement on `cog-person-count` v0.0.2 ships: top-8 subcarriers spread across the band, max/mean ratio 2.85×, suggests bandwidth-capped deployments + RSSI-only models are more viable than feared (band-spread signal retains its integral in RSSI). See `R5-subcarrier-saliency.md` §"First measurement" + §"Implications".
|
||||
|
||||
## Negative results
|
||||
|
||||
(populated when we discover something doesn't work — these are explicit, not failures)
|
||||
|
||||
## Index by date
|
||||
|
||||
- 2026-05-21 — kickoff (this file)
|
||||
@@ -0,0 +1,70 @@
|
||||
# R5 — Subcarrier saliency: which CSI dimensions actually carry the signal?
|
||||
|
||||
**Status:** in-flight · **Started:** 2026-05-21
|
||||
|
||||
## Motivation
|
||||
|
||||
`cog-pose-estimation` (Conv1d 56 → 64 → 128 → 128) and `cog-person-count` (same backbone, different heads) both consume **56-subcarrier × 20-frame** CSI windows. The 56 came from the upstream `align-ground-truth.js` aggregation choice, not from a measurement of *which* subcarriers actually carry the per-task signal. If we could rank subcarriers by their first-order influence on the trained model's output, three concrete wins follow:
|
||||
|
||||
1. **Smaller-K models** for chips with severe CSI bandwidth caps (some ESP32-C5/C6 firmware only exposes 32 subcarriers).
|
||||
2. **Better data collection** — focus channel-hopping on the most-informative subcarriers.
|
||||
3. **Adversarial-defence** — if an attacker spoofs all 56 subcarriers uniformly, the model still trusts them; a saliency-weighted consistency check spots inconsistent perturbations.
|
||||
|
||||
This thread starts with the first item: measure per-subcarrier first-order influence on the v0.0.2 count model + the v0.0.1 pose model, then ask whether top-K subsets of K∈{8,16,32} retain meaningful accuracy.
|
||||
|
||||
## Method (single-tick scope)
|
||||
|
||||
For each model:

1. Load the trained safetensors (`cog/artifacts/count_v1.safetensors` and `cog/artifacts/pose_v1.safetensors`).
2. Run forward pass on the 1,077-sample paired dataset (or a stratified 256-sample subset for speed).
3. Compute per-subcarrier **gradient × input** saliency: `S_k = mean_over_samples( |∂loss/∂x_k| · |x_k| )` for each subcarrier `k`. This is the first-order shortcut to Integrated Gradients (Sundararajan et al.): it drops the path integral, which makes it faster and still a decent first-order approximation.
4. Plot the 56-element saliency vector for each model. Identify top-K.
5. Re-train each model on the top-K subcarriers only (K ∈ {8, 16, 32}). Compare accuracy.

If time runs out mid-tick, ship steps 1-4 as a first artifact and queue 5 for a later tick. Steps 1-4 alone produce a real result (a ranked-subcarrier list per task).
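
Step 3 can be sanity-checked on a toy model where the gradient is known in closed form. The sketch below is not the cog model: it assumes a hypothetical linear-softmax classifier over 6 toy "subcarriers" with no time axis, estimates gradient × input by central differences, and compares the result with the analytic gradient.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def nll(X, y, W):
    """Per-sample negative log-likelihood of a linear-softmax toy model."""
    p = softmax(X @ W)
    return -np.log(p[np.arange(len(y)), y])

def grad_x_input_saliency(X, y, W, eps=1e-4):
    """S_k = mean_n |dL_n/dx_k| * |x_nk|, gradient via central differences."""
    N, K = X.shape
    S = np.zeros(K)
    for k in range(K):
        step = np.zeros(K)
        step[k] = eps
        grad_k = (nll(X + step, y, W) - nll(X - step, y, W)) / (2 * eps)
        S[k] = np.mean(np.abs(grad_k) * np.abs(X[:, k]))
    return S

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 6))   # 64 samples, 6 toy "subcarriers"
W = rng.normal(size=(6, 3))
W[4] = 0.0                     # feature 4 cannot influence the output
y = rng.integers(0, 3, size=64)

S = grad_x_input_saliency(X, y, W)

# Analytic check: dL/dx = (p - onehot(y)) @ W.T for a linear-softmax model.
P = softmax(X @ W)
P[np.arange(64), y] -= 1.0
S_exact = np.mean(np.abs(P @ W.T) * np.abs(X), axis=0)
print(np.round(S, 4))
print("least salient feature:", int(np.argmin(S)))
```

The feature whose weights were zeroed gets zero saliency, and the finite-difference estimate agrees with the closed form to well within the step-size error, which is the behaviour the ranking in step 4 relies on.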
|
||||
|
||||
## Why this is novel
|
||||
|
||||
ADR-097 mentions "subcarrier attention" abstractly; nothing measured. Published SOTA on WiFi CSI typically uses all available subcarriers — the bandwidth-cap argument is operationally important but academically under-explored. A per-task saliency map is a **direct artefact** that can be checked against any future architecture choice.
|
||||
|
||||
## Connections

- Feeds R7 (adversarial multi-link consistency): top-K subcarriers are the ones a defender most needs to corroborate.
- Feeds R8 (RSSI-only): if a small top-K set carries most of the signal, RSSI's information ceiling is sharply lower than full CSI's, which puts a hard bound on R8's achievable accuracy.

## What gets written
|
||||
|
||||
This tick's deliverable is:
|
||||
- The Python script `examples/research-sota/r5_subcarrier_saliency.py` that computes the saliency vector for either model.
|
||||
- A first measurement (text + JSON) of saliency for the count model.
|
||||
|
||||
Step 5 (retrain on top-K) is queued for a subsequent tick.
|
||||
|
||||
## First measurement — `cog-person-count` v0.0.2 (this tick, 128 samples)

| Rank | Subcarrier | Saliency |
|-----:|-----------:|---------:|
| 1 | **41** | 0.0128 |
| 2 | **52** | 0.0120 |
| 3 | **30** | 0.0100 |
| 4 | 31 | 0.0097 |
| 5 | 10 | 0.0088 |
| 6 | 35 | 0.0088 |
| 7 | 2 | 0.0087 |
| 8 | 38 | 0.0083 |

**Max-to-mean ratio: 2.85×** — meaningful but moderate concentration. Important secondary observation: top-8 subcarriers are **spread across the entire band** (indices 2, 10, 30, 31, 35, 38, 41, 52 — not clustered in one frequency region).
|
||||
|
||||
## Implications

1. **Bandwidth-cap deployment is viable.** Even at K=8 we retain the highest-saliency subcarriers across the full band, so a 32-subcarrier ESP32-C6/C5 build should retain most of the count-task signal. Retraining at K=8/16/32 is the next-tick experiment.
2. **R8 (RSSI alone) is feasible-but-bounded.** RSSI is a band-aggregate scalar that loses per-subcarrier resolution. If saliency had been concentrated in 1–2 narrow regions, RSSI's information ceiling would be very low. Because the signal is *band-spread*, RSSI retains the integral and the ceiling is meaningfully higher than feared. First-order estimate: ~60% of full-CSI accuracy as an upper bound, based on this saliency distribution.
3. **R7 (adversarial defence) priority list.** The top-8 saliency subcarriers are the ones a defender most needs to corroborate across nodes; an attacker who spoofs uniformly is most easily caught here.
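
To illustrate what "band-aggregate scalar" means in implication 2, the sketch below collapses a CSI amplitude window to one power value per frame. This is a simplified proxy and an assumption: real RSSI also passes through chip-specific gain control and quantisation, and the function name is hypothetical.

```python
import numpy as np

def rssi_proxy_db(csi_amp):
    """Collapse CSI amplitudes [n_sub, n_frames] to one power value per frame (dB)."""
    power = (np.asarray(csi_amp, dtype=float) ** 2).sum(axis=0)
    return 10.0 * np.log10(power)

# Two single-frame spectra with the same total energy but different shapes.
flat = np.ones((56, 1))            # energy spread evenly over the band
peaky = np.zeros((56, 1))
peaky[41, 0] = np.sqrt(56.0)       # same energy on one subcarrier

print(rssi_proxy_db(flat), rssi_proxy_db(peaky))
```

Both spectra map to the same scalar, so the proxy keeps the band integral and discards where in the band the energy sits.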
|
||||
|
||||
## Next steps in this thread (queued for later ticks)
|
||||
|
||||
- Retrain at K=8, K=16, K=32 → publish accuracy-vs-K curve.
|
||||
- Same saliency map for the pose model.
|
||||
- Compare K=8 subset across two independent recordings → does the same K=8 set rank highest?
|
||||
- Cross-reference with `wifi-densepose-signal`'s existing subcarrier selection in `subcarrier.rs`.
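
For the cross-recording comparison, one simple stability measure is the Jaccard overlap of the two top-K index sets. The sketch below is an assumption about how that check could be scored, not something the thread has committed to; the function name is hypothetical.

```python
def top_k_overlap(saliency_a, saliency_b, k=8):
    """Jaccard overlap of the top-k index sets of two saliency vectors."""
    def top(s):
        return set(sorted(range(len(s)), key=lambda i: s[i], reverse=True)[:k])
    a, b = top(saliency_a), top(saliency_b)
    return len(a & b) / len(a | b)

# Toy vectors: identical rankings overlap fully, reversed rankings not at all.
ramp = [float(i) for i in range(56)]
print(top_k_overlap(ramp, ramp))        # 1.0
print(top_k_overlap(ramp, ramp[::-1]))  # 0.0
```

A value near 1.0 across recordings would suggest the top-K set is a property of the task and not of one capture session.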
|
||||
@@ -0,0 +1,232 @@
|
||||
#!/usr/bin/env python3
"""R5: per-subcarrier input×gradient saliency for the count + pose cogs.

See docs/research/sota-2026-05-22/R5-subcarrier-saliency.md for context.

Usage:
    python examples/research-sota/r5_subcarrier_saliency.py \
        --paired data/paired/wiflow-p7-1779210883.paired.jsonl \
        --model v2/crates/cog-person-count/cog/artifacts/count_v1.safetensors \
        --kind count
    python examples/research-sota/r5_subcarrier_saliency.py \
        --paired data/paired/wiflow-p7-1779210883.paired.jsonl \
        --model v2/crates/cog-pose-estimation/cog/artifacts/pose_v1.safetensors \
        --kind pose

Output:
    <dirname-of-model>/saliency.json    per-subcarrier saliency + top-K lists
    stdout                              summary table

Method (per ADR/research note):
    S_k = E_samples[ |dL/dx_k| * |x_k| ]
"""

from __future__ import annotations

import argparse
import json
import struct
from pathlib import Path
from typing import Tuple

import numpy as np


N_SUB, N_FRAMES = 56, 20


def load_paired(path: Path, kind: str, max_samples: int | None = None) -> Tuple[np.ndarray, np.ndarray]:
    """Returns (X, y): X is [N, 56, 20] float32, y depends on kind.

    kind="count" → y is [N] int64 in {0..7}
    kind="pose"  → y is [N, 17, 2] float32 in [0, 1]
    """
    csis, ys = [], []
    with path.open(encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            d = json.loads(line)
            shape = d.get("csi_shape", [N_SUB, N_FRAMES])
            if shape != [N_SUB, N_FRAMES]:
                continue
            csi = np.asarray(d["csi"], dtype=np.float32).reshape(N_SUB, N_FRAMES)
            csis.append(csi)
            if kind == "count":
                ys.append(int(d.get("n_persons_mode", 0)))
            elif kind == "pose":
                ys.append(np.asarray(d.get("kp", []), dtype=np.float32))
            else:
                raise ValueError(f"unknown kind: {kind}")
            if max_samples and len(csis) >= max_samples:
                break
    return np.stack(csis), np.asarray(ys, dtype=(np.int64 if kind == "count" else np.float32))


def load_safetensors(path: Path) -> dict[str, np.ndarray]:
    """Pure-python safetensors reader. Returns {name: ndarray}."""
    with path.open("rb") as f:
        hlen = struct.unpack("<Q", f.read(8))[0]
        header = json.loads(f.read(hlen).decode("utf-8"))
        out = {}
        for name, meta in header.items():
            if name == "__metadata__":
                continue
            start, end = meta["data_offsets"]
            shape = meta["shape"]
            assert meta["dtype"] == "F32", f"unsupported dtype {meta['dtype']} in {name}"
            f.seek(8 + hlen + start)
            buf = f.read(end - start)
            arr = np.frombuffer(buf, dtype=np.float32).copy().reshape(shape)
            out[name] = arr
    return out


def conv1d_forward(x: np.ndarray, w: np.ndarray, b: np.ndarray, padding: int, dilation: int) -> np.ndarray:
    """Pure-numpy Conv1d forward. x: [B, Cin, T], w: [Cout, Cin, K]. Returns [B, Cout, T']."""
    B, Cin, T = x.shape
    Cout, _, K = w.shape
    # Pad
    xp = np.pad(x, ((0, 0), (0, 0), (padding, padding)), mode="constant")
    Tp = xp.shape[2]
    # Effective filter span with dilation
    eff = (K - 1) * dilation + 1
    Tout = Tp - eff + 1
    out = np.zeros((B, Cout, Tout), dtype=np.float32)
    for k in range(K):
        # x_slice shape: [B, Cin, Tout]
        x_slice = xp[:, :, k * dilation : k * dilation + Tout]
        # w_slice shape: [Cout, Cin]
        w_slice = w[:, :, k]
        # einsum: B,Cin,T x Cout,Cin → B,Cout,T
        out += np.einsum("bct,oc->bot", x_slice, w_slice)
    return out + b[None, :, None]


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(x, 0.0)


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    m = x.max(axis=axis, keepdims=True)
    e = np.exp(x - m)
    return e / e.sum(axis=axis, keepdims=True)


def forward_count(x: np.ndarray, w: dict[str, np.ndarray]) -> np.ndarray:
    """CountNet forward. x: [B, 56, 20] → probs [B, 8]."""
    h = conv1d_forward(x, w["enc.c1.weight"], w["enc.c1.bias"], padding=1, dilation=1)
    h = relu(h)
    h = conv1d_forward(h, w["enc.c2.weight"], w["enc.c2.bias"], padding=2, dilation=2)
    h = relu(h)
    h = conv1d_forward(h, w["enc.c3.weight"], w["enc.c3.bias"], padding=4, dilation=4)
    h = relu(h)
    h = h.mean(axis=2)  # [B, 128]
    # count head
    z = relu(h @ w["count_head.fc1.weight"].T + w["count_head.fc1.bias"])
    z = z @ w["count_head.fc2.weight"].T + w["count_head.fc2.bias"]
    return softmax(z, axis=-1)


def saliency_input_gradient(
    X: np.ndarray,
    y: np.ndarray,
    weights: dict[str, np.ndarray],
    kind: str,
    eps: float = 1e-3,
) -> np.ndarray:
    """Per-subcarrier saliency: S_k = E[|dL/dx_k| * |x_k|].

    Uses a central-difference numerical gradient. Subcarrier k is shifted by
    ±eps on all time frames at once, so dL/dx_k here is the derivative with
    respect to a uniform shift of that subcarrier, not a per-frame gradient.
    For a 56-subcarrier input that is 2 × 56 batched forward passes. The
    result is an approximation (truncation error O(eps²) plus float32
    rounding); it is slow but only runs once per saliency map.
    """
    B, N_sub, T = X.shape
    saliency = np.zeros(N_sub, dtype=np.float64)

    if kind == "count":
        # Loss = -log(p_true), evaluated at x + eps and x - eps per subcarrier.
        for k in range(N_sub):
            x_plus = X.copy()
            x_plus[:, k, :] += eps
            x_minus = X.copy()
            x_minus[:, k, :] -= eps
            p_plus = forward_count(x_plus, weights)
            p_minus = forward_count(x_minus, weights)
            # dL/dx ≈ -(log p_plus[y] - log p_minus[y]) / (2*eps)
            idx = np.arange(B)
            lp_plus = np.log(p_plus[idx, y] + 1e-12)
            lp_minus = np.log(p_minus[idx, y] + 1e-12)
            grad_k = -(lp_plus - lp_minus) / (2 * eps)  # [B]
            # |dL/dx_k| * |x_k|: x_k is a vector over time; use its mean magnitude
            x_k_mag = np.abs(X[:, k, :]).mean(axis=1)  # [B]
            saliency[k] += float((np.abs(grad_k) * x_k_mag).mean())
    else:
        raise NotImplementedError("pose kind not yet wired — count first")

    return saliency


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--paired", required=True)
    parser.add_argument("--model", required=True)
    parser.add_argument("--kind", choices=["count", "pose"], default="count")
    parser.add_argument("--max-samples", type=int, default=128,
                        help="Cap on samples used for saliency (saliency cost is O(N_sub × samples × eps_passes))")
    parser.add_argument("--out", default=None,
                        help="Output JSON path; defaults to <model_dir>/saliency.json")
    args = parser.parse_args()

    print(f"Loading paired data from {args.paired} (kind={args.kind})")
    X, y = load_paired(Path(args.paired), kind=args.kind, max_samples=args.max_samples)
    print(f" X: {X.shape}, y: {y.shape}")
    if args.kind == "count":
        unique, counts = np.unique(y, return_counts=True)
        print(f" label distribution: {dict(zip(unique.tolist(), counts.tolist()))}")

    # Standardise per subcarrier using THIS subset's stats, which only
    # approximates the train-time normalisation. Input×gradient is not
    # invariant to input shifts, so the saliency values depend on this choice.
    mu = X.mean(axis=(0, 2), keepdims=True)
    sd = X.std(axis=(0, 2), keepdims=True) + 1e-6
    X_norm = (X - mu) / sd

    print(f"Loading weights from {args.model}")
    weights = load_safetensors(Path(args.model))
    print(f" loaded {len(weights)} tensors: {sorted(list(weights.keys()))[:6]}...")

    print(f"Computing input×gradient saliency over {X.shape[0]} samples × 56 subcarriers...")
    saliency = saliency_input_gradient(X_norm, y, weights, kind=args.kind, eps=1e-3)

    order = np.argsort(saliency)[::-1]  # descending
    top_k = {k: order[:k].tolist() for k in (8, 16, 32)}

    out = {
        "kind": args.kind,
        "model": str(args.model),
        "n_samples": int(X.shape[0]),
        "saliency_per_subcarrier": saliency.tolist(),
        "ranking_high_to_low": order.tolist(),
        "top_k_subcarriers": top_k,
        "saliency_summary": {
            "min": float(saliency.min()),
            "max": float(saliency.max()),
            "mean": float(saliency.mean()),
            "std": float(saliency.std()),
            "max_to_mean_ratio": float(saliency.max() / max(saliency.mean(), 1e-12)),
        },
    }

    out_path = Path(args.out) if args.out else Path(args.model).parent / "saliency.json"
    out_path.write_text(json.dumps(out, indent=2))
    print(f"\nWrote {out_path}")
    print(f"\nTop 8 subcarriers (most influential):")
    for rank, idx in enumerate(order[:8]):
        print(f" #{rank + 1}: subcarrier {int(idx):2d} saliency={saliency[idx]:.4f}")
    print(f"\nMax/mean ratio: {out['saliency_summary']['max_to_mean_ratio']:.2f}× "
          f"(higher = signal more concentrated in a few subcarriers)")


if __name__ == "__main__":
    main()

@@ -95,6 +95,29 @@ def temporal_split(X: np.ndarray, y: np.ndarray, eval_frac: float = 0.2):
|
||||
)
|
||||
|
||||
|
||||
def stratified_k_fold(X: np.ndarray, y: np.ndarray, k: int = 5):
    """Stratified k-fold cross-validation splits (hand-rolled, no sklearn).

    Per class: shuffle the indices (deterministic seed 42), split into k
    near-equal chunks, then assemble fold i by taking chunk i from every
    class. Yields (X_train, y_train, X_val, y_val) per fold, with class
    distribution preserved within ±1.
    """
    rng = np.random.default_rng(seed=42)
    classes = np.unique(y)
    per_class_folds = {}
    for c in classes:
        idx = np.where(y == c)[0]
        rng.shuffle(idx)
        per_class_folds[c] = np.array_split(idx, k)
    for fold in range(k):
        val_idx = np.concatenate([per_class_folds[c][fold] for c in classes])
        train_idx = np.concatenate(
            [per_class_folds[c][f] for c in classes for f in range(k) if f != fold]
        )
        yield X[train_idx], y[train_idx], X[val_idx], y[val_idx]


def standardise(X_train: np.ndarray, X_eval: np.ndarray):
    """Z-score per subcarrier over samples and time. Eval uses train stats."""
    mu = X_train.mean(axis=(0, 2), keepdims=True)

@@ -154,6 +177,12 @@ def main():
|
||||
    parser.add_argument("--batch-size", type=int, default=64)
    parser.add_argument("--lr", type=float, default=1e-3)
    parser.add_argument("--weight-decay", type=float, default=0.01)
    parser.add_argument("--k-fold", type=int, default=None, help="If set, run k-fold CV; else use temporal split")
    parser.add_argument("--v2", action="store_true",
                        help="v0.0.2 training: random 80/20 split + label smoothing + early stopping "
                             "+ balanced sampling + temperature-scaled confidence head.")
    parser.add_argument("--label-smoothing", type=float, default=0.1)
    parser.add_argument("--patience", type=int, default=20)
    args = parser.parse_args()

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

@@ -163,6 +192,378 @@ def main():
|
||||
    print(f"loaded {X.shape[0]} samples, X shape {X.shape}, "
          f"label distribution: {dict(Counter(y.tolist()).most_common())}")

    # K-fold cross-validation mode
    if args.k_fold is not None:
        print(f"\n=== {args.k_fold}-fold cross-validation ===")
        fold_results = []
        overall_t0 = time.perf_counter()

        for fold_idx, (X_train, y_train, X_val, y_val) in enumerate(stratified_k_fold(X, y, k=args.k_fold)):
            print(f"\nFold {fold_idx + 1}/{args.k_fold}")
            X_train, X_val = standardise(X_train, X_val)

            cls_counts = np.bincount(y_train, minlength=COUNT_CLASSES).astype(np.float32)
            cls_counts = np.where(cls_counts > 0, cls_counts, 1.0)
            cls_weight = (1.0 / cls_counts) / (1.0 / cls_counts).sum() * COUNT_CLASSES
            cls_weight_t = torch.from_numpy(cls_weight).to(device)

            Xt = torch.from_numpy(X_train).to(device)
            yt = torch.from_numpy(y_train).to(device)
            Xv = torch.from_numpy(X_val).to(device)
            yv = torch.from_numpy(y_val).to(device)

            model = CountNet().to(device)
            opt = torch.optim.AdamW(model.parameters(), lr=args.lr, weight_decay=args.weight_decay)
            sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=50, T_mult=1)

            n_train = X_train.shape[0]
            best_eval_acc = 0.0
            best_state = None

            for epoch in range(args.epochs):
                model.train()
                perm = torch.randperm(n_train, device=device)
                train_loss = 0.0
                train_correct = 0
                n_batches = 0
                for i in range(0, n_train, args.batch_size):
                    idx = perm[i : i + args.batch_size]
                    xb = Xt[idx]
                    yb = yt[idx]
                    opt.zero_grad()
                    count_logits, conf_logits = model(xb)
                    ce = F.cross_entropy(count_logits, yb, weight=cls_weight_t)
                    with torch.no_grad():
                        pred = count_logits.argmax(dim=1)
                        correct_indicator = (pred == yb).float().unsqueeze(1)
                    bce = F.binary_cross_entropy_with_logits(conf_logits, correct_indicator)
                    # Note: conf_sigm is computed under no_grad, so the Brier term
                    # contributes no gradient; it only shifts the reported loss.
                    with torch.no_grad():
                        conf_sigm = torch.sigmoid(conf_logits)
                        brier = ((conf_sigm - correct_indicator) ** 2).mean()
                    loss = ce + 0.3 * bce + 0.1 * brier
                    loss.backward()
                    opt.step()
                    train_loss += loss.item()
                    train_correct += (pred == yb).sum().item()
                    n_batches += 1

                sched.step()

                model.eval()
                with torch.no_grad():
                    cl_v, _ = model(Xv)
                    eval_pred = cl_v.argmax(dim=1)
                    eval_acc = (eval_pred == yv).float().mean().item()

                if eval_acc > best_eval_acc:
                    best_eval_acc = eval_acc
                    best_state = {k: v.detach().cpu().clone() for k, v in model.state_dict().items()}

            # Restore best checkpoint and final eval
            if best_state is not None:
                model.load_state_dict(best_state)

            model.eval()
            with torch.no_grad():
                cl_v, conf_v = model(Xv)
                pred_v = cl_v.argmax(dim=1)
                acc = (pred_v == yv).float().mean().item()
                within1 = ((pred_v - yv).abs() <= 1).float().mean().item()
                mae = (pred_v - yv).abs().float().mean().item()

            # Per-class accuracy
            per_class = {}
            for k in range(COUNT_CLASSES):
                mask = yv == k
                n = mask.sum().item()
                if n > 0:
                    per_class[k] = {
                        "support": int(n),
                        "accuracy": ((pred_v == yv) & mask).sum().item() / n,
                    }

            # Spearman. Ranks come from argsort-of-argsort, which breaks ties
            # arbitrarily; `correct` is binary, so this approximates a
            # tie-aware Spearman.
            conf_sigm = torch.sigmoid(conf_v).squeeze(-1)
            correct = (pred_v == yv).float()
            c_rank = conf_sigm.argsort().argsort().float()
            r_rank = correct.argsort().argsort().float()
            c_centered = c_rank - c_rank.mean()
            r_centered = r_rank - r_rank.mean()
            denom = (c_centered.norm() * r_centered.norm()).item()
            spearman = (c_centered * r_centered).sum().item() / denom if denom > 0 else 0.0

            fold_results.append({
                "fold": fold_idx + 1,
                "accuracy": acc,
                "within_pm1": within1,
                "mae": mae,
                "spearman": spearman,
                "per_class_accuracy": per_class,
            })
            print(f" accuracy={acc:.3f} within±1={within1:.3f} mae={mae:.3f} spearman={spearman:.3f}")

# K-fold summary
|
||||
total_time = time.perf_counter() - overall_t0
|
||||
accs = [r["accuracy"] for r in fold_results]
|
||||
within1s = [r["within_pm1"] for r in fold_results]
|
||||
maes = [r["mae"] for r in fold_results]
|
||||
spears = [r["spearman"] for r in fold_results]
|
||||
|
||||
print(f"\n=== {args.k_fold}-fold summary ({total_time:.1f} s) ===")
|
||||
print(f" accuracy: {np.mean(accs):.3f} ± {np.std(accs):.3f}")
|
||||
print(f" within ±1: {np.mean(within1s):.3f} ± {np.std(within1s):.3f}")
|
||||
print(f" MAE: {np.mean(maes):.3f} ± {np.std(maes):.3f}")
|
||||
print(f" conf↔correct Spearman: {np.mean(spears):.3f} ± {np.std(spears):.3f}")
|
||||
|
||||
# Per-class summary across folds
|
||||
for k in range(COUNT_CLASSES):
|
||||
accs_k = [r["per_class_accuracy"].get(k, {}).get("accuracy", 0.0) for r in fold_results]
|
||||
n_k = [r["per_class_accuracy"].get(k, {}).get("support", 0) for r in fold_results]
|
||||
if any(n > 0 for n in n_k):
|
||||
print(f" class {k}: {np.mean(accs_k):.3f} mean accuracy (support: {n_k})")
|
||||
|
||||
# Write k-fold results to JSON
|
||||
results = {
|
||||
"mode": "k_fold_cv",
|
||||
"k": args.k_fold,
|
||||
"backend": "pytorch-cuda" if device.type == "cuda" else "pytorch-cpu",
|
||||
"total_time_s": total_time,
|
||||
"fold_results": fold_results,
|
||||
"summary": {
|
||||
"mean_accuracy": float(np.mean(accs)),
|
||||
"std_accuracy": float(np.std(accs)),
|
||||
"mean_within_pm1": float(np.mean(within1s)),
|
||||
"std_within_pm1": float(np.std(within1s)),
|
||||
"mean_mae": float(np.mean(maes)),
|
||||
"std_mae": float(np.std(maes)),
|
||||
"mean_spearman": float(np.mean(spears)),
|
||||
"std_spearman": float(np.std(spears)),
|
||||
},
|
||||
"hyperparameters": {
|
||||
"optimizer": "AdamW",
|
||||
"lr": args.lr,
|
||||
"weight_decay": args.weight_decay,
|
||||
"batch_size": args.batch_size,
|
||||
"schedule": "cosine_warm_restarts",
|
||||
"epochs": args.epochs,
|
||||
},
|
||||
}
|
||||
Path(args.out_results).write_text(json.dumps(results, indent=2))
|
||||
print(f"\nwrote {args.out_results}")
|
||||
return
|
||||
|
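The per-fold Spearman above derives ranks with `argsort().argsort()`, which hands every element a distinct rank. Because the correctness indicator is binary, most of its values are tied, and the order inside each tied group depends on the sort rather than on the data. The textbook definition of Spearman assigns tied values the mean of their rank positions. A minimal pure-Python sketch of that variant follows; the helper names are hypothetical and this is not the script's code.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(a, b):
    """Pearson correlation of tie-averaged ranks; 0.0 if either side is constant."""
    ra, rb = average_ranks(a), average_ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    ca = [x - ma for x in ra]
    cb = [y - mb for y in rb]
    denom = sqrt(sum(x * x for x in ca)) * sqrt(sum(y * y for y in cb))
    return sum(x * y for x, y in zip(ca, cb)) / denom if denom > 0 else 0.0

# Toy example: confidence is higher exactly where the prediction was correct.
conf = [0.1, 0.9, 0.8, 0.2]
correct = [0, 1, 1, 0]
rho = spearman(conf, correct)
print(f"tie-aware Spearman = {rho:.3f}")
```

Whether tie handling would change the Spearman values reported for this model is not something the diff settles; the sketch only shows what the tie-aware computation looks like.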
||||
# ---------------------------------------------------------------
|
||||
# v0.0.2 training path: random 80/20 + label smoothing + early
|
||||
# stopping + class-balanced batch sampling + temperature scaling.
|
||||
# ---------------------------------------------------------------
|
||||
if args.v2:
|
||||
rng = np.random.default_rng(seed=42)
|
||||
idx = np.arange(X.shape[0])
|
||||
rng.shuffle(idx)
|
||||
n_eval = int(round(0.2 * X.shape[0]))
|
||||
eval_idx, train_idx = idx[:n_eval], idx[n_eval:]
|
||||
X_train, X_eval = X[train_idx], X[eval_idx]
|
||||
y_train, y_eval = y[train_idx], y[eval_idx]
|
||||
X_train, X_eval = standardise(X_train, X_eval)
|
||||
print(f"v0.0.2 mode — random 80/20 split: train={len(y_train)} eval={len(y_eval)}")
|
||||
print(f" train class dist: {dict(Counter(y_train.tolist()).most_common())}")
|
||||
print(f" eval class dist: {dict(Counter(y_eval.tolist()).most_common())}")
|
||||
|
||||
Xt = torch.from_numpy(X_train).to(device)
|
||||
yt = torch.from_numpy(y_train).to(device)
|
||||
Xe = torch.from_numpy(X_eval).to(device)
|
||||
ye = torch.from_numpy(y_eval).to(device)
|
||||
|
||||
# Class-balanced sampler: for each batch, sample with replacement
|
||||
# so each class has equal expected count regardless of dataset
|
||||
# distribution. With our ~533/544 split this is nearly a no-op
|
||||
# but it generalises to imbalanced multi-room data later.
|
||||
cls_counts = np.bincount(y_train, minlength=COUNT_CLASSES).astype(np.float32)
|
||||
cls_counts = np.where(cls_counts > 0, cls_counts, 1.0)
|
||||
per_sample_weight = (1.0 / cls_counts[y_train])
|
||||
per_sample_weight_t = torch.from_numpy(per_sample_weight.astype(np.float32)).to(device)
|
||||
|
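The weighting above gives each sample a weight of one over its class count, so every class carries the same total sampling mass. A standard-library sketch of the same idea follows, with `random.choices` standing in for `torch.multinomial` and a deliberately imbalanced toy label list; the function names are hypothetical.

```python
import random
from collections import Counter

def balanced_weights(labels):
    """Weight each sample by 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]

def sample_batch(labels, batch_size, rng):
    """Draw sample indices with replacement, proportional to the balanced weights."""
    weights = balanced_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=batch_size)

# Toy data: 90 samples of class 0 and 10 samples of class 1.
labels = [0] * 90 + [1] * 10
weights = balanced_weights(labels)
rng = random.Random(42)
batch = sample_batch(labels, 10_000, rng)
share_class1 = sum(labels[i] for i in batch) / len(batch)
print(f"class-1 share in sampled batch: {share_class1:.3f}")
```

Although class 1 is only 10% of the toy dataset, it fills roughly half of the sampled indices, which is the behaviour the comment above describes for imbalanced data.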
||||
model = CountNet().to(device)
|
||||
opt = torch.optim.AdamW(model.parameters(), lr=args.lr, weight_decay=args.weight_decay)
|
||||
sched = torch.optim.lr_scheduler.CosineAnnealingWarmRestarts(opt, T_0=50, T_mult=1)
|
||||
|
||||
n_train = X_train.shape[0]
|
||||
batches_per_epoch = max(1, n_train // args.batch_size)
|
||||
epoch_losses = []
|
||||
t0 = time.perf_counter()
|
||||
best_eval_acc = 0.0
|
||||
best_state = None
|
||||
epochs_without_improvement = 0
|
||||
|
||||
for epoch in range(args.epochs):
|
||||
model.train()
|
||||
train_loss = 0.0; train_correct = 0; n_batches = 0
|
||||
for _ in range(batches_per_epoch):
|
||||
# Balanced sample with replacement
|
||||
idx_t = torch.multinomial(per_sample_weight_t, args.batch_size, replacement=True)
|
||||
xb = Xt[idx_t]; yb = yt[idx_t]
|
||||
opt.zero_grad()
|
||||
count_logits, conf_logits = model(xb)
|
||||
ce = F.cross_entropy(count_logits, yb, label_smoothing=args.label_smoothing)
|
||||
with torch.no_grad():
|
||||
pred = count_logits.argmax(dim=1)
|
||||
correct_indicator = (pred == yb).float().unsqueeze(1)
|
||||
bce = F.binary_cross_entropy_with_logits(conf_logits, correct_indicator)
|
||||
with torch.no_grad():
|
||||
conf_sigm = torch.sigmoid(conf_logits)
|
||||
brier = ((conf_sigm - correct_indicator) ** 2).mean()
|
||||
loss = ce + 0.3 * bce + 0.1 * brier  # brier is built from no_grad tensors: it shifts the logged loss but contributes no gradient
|
||||
loss.backward()
|
||||
opt.step()
|
||||
train_loss += loss.item()
|
||||
train_correct += (pred == yb).sum().item()
|
||||
n_batches += 1
|
||||
sched.step()
|
||||
|
||||
model.eval()
|
||||
with torch.no_grad():
|
||||
cl_e, _ = model(Xe)
|
||||
eval_loss = F.cross_entropy(cl_e, ye).item()
|
||||
eval_pred = cl_e.argmax(dim=1)
|
||||
eval_acc = (eval_pred == ye).float().mean().item()
|
||||
epoch_losses.append({
|
||||
"epoch": epoch,
|
||||
"train_loss": train_loss / max(1, n_batches),
|
||||
"train_acc": train_correct / max(1, n_batches * args.batch_size),
|
||||
"eval_loss": eval_loss,
|
||||
"eval_acc": eval_acc,
|
||||
})
|
||||
if eval_acc > best_eval_acc:
|
||||
best_eval_acc = eval_acc
|
||||
best_state = {k: v.detach().cpu().clone() for k, v in model.state_dict().items()}
|
||||
epochs_without_improvement = 0
|
||||
else:
|
||||
epochs_without_improvement += 1
|
||||
|
||||
if epoch < 5 or epoch % 25 == 0:
|
||||
print(f"epoch {epoch:3d} train_loss={train_loss/n_batches:.4f} "
|
||||
f"train_acc={train_correct/(n_batches*args.batch_size):.3f} "
|
||||
f"eval_loss={eval_loss:.4f} eval_acc={eval_acc:.3f} "
|
||||
f"epochs_no_improve={epochs_without_improvement}")
|
||||
if epochs_without_improvement >= args.patience:
|
||||
print(f"early stopping at epoch {epoch} (no improvement for {args.patience} epochs)")
|
||||
break
|
||||
|
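The early-stopping rule in the loop above can be isolated into a small function: an epoch counts as an improvement only if eval accuracy strictly exceeds the best seen so far, and training stops once `patience` consecutive epochs fail to improve. The sketch below replays that rule over a list of accuracies; the function name and the sample values are hypothetical.

```python
def early_stop_trace(eval_accs, patience):
    """Return (best_epoch, last_epoch_run) under a strict-improvement patience rule."""
    best_acc = 0.0
    best_epoch = None
    stale = 0
    for epoch, acc in enumerate(eval_accs):
        if acc > best_acc:
            best_acc, best_epoch, stale = acc, epoch, 0
        else:
            stale += 1  # a tie with the best counts as no improvement
        if stale >= patience:
            return best_epoch, epoch
    return best_epoch, len(eval_accs) - 1

# Hypothetical eval accuracies: the best value arrives at epoch 1.
trace = early_stop_trace([0.50, 0.60, 0.55, 0.58, 0.59], patience=3)
print(trace)
```

Under this rule the restored checkpoint always sits `patience` epochs before the stopping epoch whenever the patience limit is what ends training.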
||||
train_time = time.perf_counter() - t0
|
||||
print(f"\ntrained {epoch + 1} epochs in {train_time:.1f} s (best eval_acc {best_eval_acc:.3f})")
|
||||
if best_state is not None:
|
||||
model.load_state_dict(best_state)
|
||||
|
||||
# Temperature scaling on the confidence head — fit a scalar T s.t.
|
||||
# sigmoid(conf_logits / T) is best-calibrated on the eval set.
|
||||
model.eval()
|
||||
with torch.no_grad():
|
||||
cl_e, conf_e = model(Xe)
|
||||
pred_e = cl_e.argmax(dim=1)
|
||||
correct_indicator = (pred_e == ye).float()
|
||||
# 1D optimisation over T via LBFGS.
|
||||
T = torch.nn.Parameter(torch.ones(1, device=device))
|
||||
opt_t = torch.optim.LBFGS([T], lr=0.1, max_iter=50)
|
||||
def eval_t():
|
||||
opt_t.zero_grad()
|
||||
scaled = conf_e.squeeze(-1) / T
|
||||
loss_t = F.binary_cross_entropy_with_logits(scaled, correct_indicator)
|
||||
loss_t.backward()
|
||||
return loss_t
|
||||
opt_t.step(eval_t)
|
||||
T_val = float(T.detach().cpu().item())
|
||||
print(f" temperature scale T = {T_val:.4f}")
|
||||
|
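To make the temperature fit concrete, here is a minimal sketch that minimises the same objective, binary cross-entropy of `sigmoid(logit / T)` against the correctness indicator, but with a plain grid search instead of LBFGS. The grid search, the function names, and the toy logits are assumptions made for illustration.

```python
import math

def bce_with_logits(z, y):
    """Numerically stable binary cross-entropy for one logit z and label y in {0, 1}."""
    return max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))

def fit_temperature(logits, labels, lo=0.05, hi=10.0, steps=2000):
    """Grid-search the scalar T > 0 that minimises mean BCE of logits / T."""
    best_t, best_loss = 1.0, float("inf")
    for i in range(steps + 1):
        t = lo + (hi - lo) * i / steps
        loss = sum(bce_with_logits(z / t, y) for z, y in zip(logits, labels)) / len(logits)
        if loss < best_loss:
            best_t, best_loss = t, loss
    return best_t

# Toy over-confident head: logits of +/-4 claim ~98% certainty,
# but only 75% of those claims are actually right.
logits = [4.0] * 8 + [-4.0] * 8
labels = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 6
t_fit = fit_temperature(logits, labels)
print(f"fitted T = {t_fit:.3f}")
```

In this toy case the optimum is where `sigmoid(4 / T)` equals 0.75, so T is well above 1 and softens the confidences. Dividing by a positive T is monotone, so it changes calibration but not the rank order of the confidences.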
||||
# Final eval with temperature applied.
|
||||
with torch.no_grad():
|
||||
cl_e, conf_e = model(Xe)
|
||||
probs_e = F.softmax(cl_e, dim=1)
|
||||
pred_e = cl_e.argmax(dim=1)
|
||||
acc = (pred_e == ye).float().mean().item()
|
||||
within1 = ((pred_e - ye).abs() <= 1).float().mean().item()
|
||||
mae = (pred_e - ye).abs().float().mean().item()
|
||||
per_class = {}
|
||||
for k in range(COUNT_CLASSES):
|
||||
mask = ye == k
|
||||
n = mask.sum().item()
|
||||
if n > 0:
|
||||
per_class[k] = {
|
||||
"support": int(n),
|
||||
"accuracy": ((pred_e == ye) & mask).sum().item() / n,
|
||||
}
|
||||
conf_sigm = torch.sigmoid(conf_e.squeeze(-1) / T_val)
|
||||
correct = (pred_e == ye).float()
|
||||
c_rank = conf_sigm.argsort().argsort().float()
|
||||
r_rank = correct.argsort().argsort().float()
|
||||
c_centered = c_rank - c_rank.mean()
|
||||
r_centered = r_rank - r_rank.mean()
|
||||
denom = (c_centered.norm() * r_centered.norm()).item()
|
||||
spearman = (c_centered * r_centered).sum().item() / denom if denom > 0 else 0.0
|
||||
|
||||
print(f"\n=== v0.0.2 final eval ===")
|
||||
print(f" accuracy: {acc:.3f}")
|
||||
print(f" within ±1: {within1:.3f}")
|
||||
print(f" MAE: {mae:.3f}")
|
||||
print(f" conf↔correct Spearman (post-temp): {spearman:.3f}")
|
||||
for k, v in per_class.items():
|
||||
print(f" class {k}: {v['accuracy']:.3f} accuracy on {v['support']} samples")
|
||||
|
||||
write_safetensors(model, Path(args.out_safetensors))
|
||||
# Also record the temperature scalar so the cog can apply it.
|
||||
# The safetensors file itself is left untouched; the scalar is
|
||||
# written as plain text to a sidecar file named
|
||||
# `<out_safetensors>.temperature`, for consumption by the Rust
|
||||
# cog inference path.
|
||||
Path(args.out_safetensors + ".temperature").write_text(f"{T_val}\n")
|
||||
print(f"wrote {args.out_safetensors} ({Path(args.out_safetensors).stat().st_size} bytes)")
|
||||
print(f"wrote {args.out_safetensors}.temperature ({T_val})")
|
||||
|
||||
# ONNX
|
||||
dummy = torch.zeros(1, N_SUB, N_FRAMES, device=device)
|
||||
try:
|
||||
torch.onnx.export(model, dummy, args.out_onnx, opset_version=18,
|
||||
input_names=["csi_window"],
|
||||
output_names=["count_logits", "conf_logits"],
|
||||
dynamic_axes={"csi_window": {0: "batch"},
|
||||
"count_logits": {0: "batch"},
|
||||
"conf_logits": {0: "batch"}},
|
||||
export_params=True, do_constant_folding=True)
|
||||
print(f"wrote {args.out_onnx} ({Path(args.out_onnx).stat().st_size} bytes)")
|
||||
except Exception as e:
|
||||
print(f"WARN: ONNX export failed: {e}")
|
||||
|
||||
results = {
|
||||
"mode": "v0.0.2",
|
||||
"backend": "pytorch-cuda" if device.type == "cuda" else "pytorch-cpu",
|
||||
"epochs_trained": epoch + 1,
|
||||
"train_time_s": train_time,
|
||||
"best_eval_acc": best_eval_acc,
|
||||
"final_eval_acc": acc,
|
||||
"final_eval_within_pm1": within1,
|
||||
"final_eval_mae": mae,
|
||||
"temperature_scale": T_val,
|
||||
"conf_correctness_spearman_post_temp": spearman,
|
||||
"per_class_accuracy": per_class,
|
||||
"hyperparameters": {
|
||||
"optimizer": "AdamW",
|
||||
"lr": args.lr,
|
||||
"weight_decay": args.weight_decay,
|
||||
"batch_size": args.batch_size,
|
||||
"schedule": "cosine_warm_restarts",
|
||||
"epochs_max": args.epochs,
|
||||
"label_smoothing": args.label_smoothing,
|
||||
"patience": args.patience,
|
||||
"split": "random_80_20_seed_42",
|
||||
"balanced_sampler": True,
|
||||
"temperature_scaling": True,
|
||||
},
|
||||
"epoch_losses": epoch_losses,
|
||||
}
|
||||
Path(args.out_results).write_text(json.dumps(results, indent=2))
|
||||
print(f"wrote {args.out_results}")
|
||||
return
|
||||
|
||||
# Original temporal-split mode (kept for v0.0.1 reproducibility).
|
||||
X_train, y_train, X_eval, y_eval = temporal_split(X, y, eval_frac=0.2)
|
||||
X_train, X_eval = standardise(X_train, X_eval)
|
||||
|
||||
|
||||
@@ -47,6 +47,17 @@ Downstream consumers can render the **most-likely count** when confidence is hig
|
||||
|
||||
`cog-person-count health` will load the real safetensors and report `backend: candle-cpu` rather than `backend: stub`, so the cog-gateway can verify the model loaded — but operators should treat the v0.0.1 count outputs as scaffold-validation rather than production data. The 2.36 MB binary + 392 KB weights + 16 KB ONNX are all real and reusable as soon as more data lands.
|
||||
|
||||
## Relationship to the in-process `csi.rs::score_to_person_count` heuristic
|
||||
|
||||
This Cog runs **out-of-process** alongside `wifi-densepose-sensing-server`. The two are complementary, not competing:
|
||||
|
||||
- The sensing-server keeps emitting its existing slot-count heuristic from `csi.rs::score_to_person_count` (PR #491's RollingP95 + `dedup_factor`). This is the **fallback path** — operators who don't install `cog-person-count` still get a count number, just a less calibrated one.
|
||||
- `cog-person-count` (this binary) polls the same `/api/v1/sensing/latest` endpoint, runs the learned `count_v1` model on each window, and emits `person.count` events on stdout. The appliance's `cognitum-cog-gateway` routes those events to the dashboard via the standard ADR-220 cog-event channel.
|
||||
|
||||
Operators choose by **installing or not installing** this Cog — no sensing-server rebuild required. Downstream consumers (UI, fleet automation, alerting rules) can subscribe to whichever event stream they prefer.
|
||||
|
||||
The architecture decision is documented in [ADR-103 §"Deployment"](../../../../docs/adr/ADR-103-learned-multi-person-counter.md#deployment) and matches the cog/sensing-server boundary established for `cog-pose-estimation` (ADR-101).
|
||||
|
||||
## Security
|
||||
|
||||
The cog has a very small attack surface — by design, it's a pure consumer of CSI data, not a server:
|
||||
|
||||
File diff suppressed because it is too large
Binary file not shown.
Binary file not shown.
@@ -0,0 +1 @@
|
||||
0.9261822700500488
|
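The one-line file above is the temperature sidecar written by the training script. A consumer would read it and divide the confidence logit by it before the sigmoid. The sketch below shows that in Python; the function names, the fallback to 1.0 when the sidecar is missing, and the validity check are assumptions, not the Rust cog's actual behaviour.

```python
import math
import tempfile
from pathlib import Path

def load_temperature(weights_path, default=1.0):
    """Read `<weights_path>.temperature`; fall back to `default` if it is absent."""
    sidecar = Path(str(weights_path) + ".temperature")
    if not sidecar.exists():
        return default
    t = float(sidecar.read_text().strip())
    if not math.isfinite(t) or t <= 0.0:
        raise ValueError(f"invalid temperature {t!r} in {sidecar}")
    return t

def calibrated_confidence(conf_logit, temperature):
    """Stable sigmoid(conf_logit / temperature)."""
    z = conf_logit / temperature
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

# Demo against a throwaway directory that mimics the artifact layout.
with tempfile.TemporaryDirectory() as tmp:
    weights = Path(tmp) / "count_v1.safetensors"
    Path(str(weights) + ".temperature").write_text("0.9261822700500488\n")
    loaded = load_temperature(weights)
    missing = load_temperature(Path(tmp) / "other.safetensors")

print(loaded, calibrated_confidence(1.0, loaded))
```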
||||
@@ -1,25 +1,27 @@
|
||||
{
|
||||
"id": "person-count",
|
||||
"version": "0.0.1",
|
||||
"binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-arm",
|
||||
"binary_bytes": 2168816,
|
||||
"binary_sha256": "36bc0bb0ece894350377d5f93d46cd29378cb289b3773530611c0d47b507b3c3",
|
||||
"binary_signature": "R/00xdzHriyr/2rzr4wmPJ/Ken60A+RNdi8r0g2HYJNTXBaFtr46ExfNbiHlgYWadQXzTZdfJoyJK+a6k71NDg==",
|
||||
"weights_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors",
|
||||
"weights_bytes": 392088,
|
||||
"weights_sha256": "dacb0551fd3887958db19696d90d811ab08faa44703e6e04ff56d15c3a65a9ff",
|
||||
"arch": "arm",
|
||||
"target_triple": "aarch64-unknown-linux-gnu",
|
||||
"installed_at": 0,
|
||||
"status": "installed",
|
||||
"signed_by": "COGNITUM_OWNER_SIGNING_KEY",
|
||||
"sig_algo": "Ed25519",
|
||||
"binary_bytes": 3807456,
|
||||
"binary_sha256": "15c2fbac19741298ad1cbaf119c633a42db0a273099561fd57d8afce27728ea5",
|
||||
"binary_signature": "gyV2CDhJo5nqBnREA08KnztGsS7AFOuXCse+2/+wul8DAzerHs9p4L6eUgl8QeiDS9rdQZs33XRxH5WTbkT0Ag==",
|
||||
"binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-arm",
|
||||
"build_metadata": {
|
||||
"rust": "1.95.0",
|
||||
"candle": "0.9 cpu",
|
||||
"cog_person_count_version": "0.3.0",
|
||||
"training_eval_accuracy": 0.651,
|
||||
"rust": "1.95.0",
|
||||
"training_caveat": "random 80/20 split + label smoothing + early stopping + balanced sampler + temperature calibration. K-fold reference: class-1 mean 57.1% across 5 folds.",
|
||||
"training_class1_accuracy": 0.343,
|
||||
"training_eval_accuracy": 0.623,
|
||||
"training_eval_mae": 0.349,
|
||||
"training_caveat": "single-session data; class-1 accuracy 0% — see docs/benchmarks/person-count-cog.md"
|
||||
}
|
||||
}
|
||||
"training_temperature_scale": 0.9262
|
||||
},
|
||||
"id": "person-count",
|
||||
"installed_at": 0,
|
||||
"sig_algo": "Ed25519",
|
||||
"signed_by": "COGNITUM_OWNER_SIGNING_KEY",
|
||||
"status": "installed",
|
||||
"target_triple": "aarch64-unknown-linux-gnu",
|
||||
"version": "0.0.2",
|
||||
"weights_bytes": 392088,
|
||||
"weights_sha256": "32996433516891a37c63c600db8b95e42192a53bd538c088c82cd6a85e55513c",
|
||||
"weights_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors"
|
||||
}
|
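The manifest pins each artifact by byte length and SHA-256, so a downloader can check both before installing. A minimal sketch of that check follows, assuming the artifact has already been fetched to a local path. The function name is hypothetical, and the Ed25519 signature check is left out because the Python standard library has no Ed25519 support.

```python
import hashlib
import tempfile
from pathlib import Path

def verify_artifact(path, expected_sha256, expected_bytes):
    """Return True only if both the size and the SHA-256 digest match the manifest."""
    data = Path(path).read_bytes()
    if len(data) != expected_bytes:
        return False
    return hashlib.sha256(data).hexdigest() == expected_sha256.lower()

# Demo with a throwaway file standing in for the downloaded weights.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "weights.bin"
    payload = b"demo weights"
    artifact.write_bytes(payload)
    digest = hashlib.sha256(payload).hexdigest()

    ok = verify_artifact(artifact, digest, len(payload))
    wrong_size = verify_artifact(artifact, digest, len(payload) + 1)
    artifact.write_bytes(b"demo weightz")  # same length, different content
    tampered = verify_artifact(artifact, digest, len(payload))

print(ok, wrong_size, tampered)
```

For the real manifest the call would use the `weights_sha256` and `weights_bytes` fields (or the `binary_*` pair) as the expected values.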
||||
@@ -1,25 +1,27 @@
|
||||
{
|
||||
"id": "person-count",
|
||||
"version": "0.0.1",
|
||||
"binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/x86_64/cog-person-count-x86_64",
|
||||
"binary_bytes": 2615528,
|
||||
"binary_sha256": "76cdd1ec40211add90b4942a09f79939aa28210a27e931de67122357392b01db",
|
||||
"binary_signature": "QB+8cnGSMQmubSt/KWVu1+JMg37AKnQXDsFQi/vi+jqpW9rVrGMtnxQpWEWZPeWU1AJ6pl3O2V+7ZtTNIQ2rDg==",
|
||||
"weights_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors",
|
||||
"weights_bytes": 392088,
|
||||
"weights_sha256": "dacb0551fd3887958db19696d90d811ab08faa44703e6e04ff56d15c3a65a9ff",
|
||||
"arch": "x86_64",
|
||||
"target_triple": "x86_64-unknown-linux-gnu",
|
||||
"installed_at": 0,
|
||||
"status": "installed",
|
||||
"signed_by": "COGNITUM_OWNER_SIGNING_KEY",
|
||||
"sig_algo": "Ed25519",
|
||||
"binary_bytes": 4502960,
|
||||
"binary_sha256": "051614ce6ba63df704fae848a67ad095df4bb88862fdff05ef3c0419cc8388b3",
|
||||
"binary_signature": "P9txCcsqCoFN6LyZS+Hl33pYZxiP/nXJMTI6s4bt26cc+Cteidz7ymajCQIfuq0mx0cnWaQ6eKZUjzq5AIgoBw==",
|
||||
"binary_url": "https://storage.googleapis.com/cognitum-apps/cogs/x86_64/cog-person-count-x86_64",
|
||||
"build_metadata": {
|
||||
"rust": "1.95.0",
|
||||
"candle": "0.9 cpu",
|
||||
"cog_person_count_version": "0.3.0",
|
||||
"training_eval_accuracy": 0.651,
|
||||
"rust": "1.95.0",
|
||||
"training_caveat": "random 80/20 split + label smoothing + early stopping + balanced sampler + temperature calibration. K-fold reference: class-1 mean 57.1% across 5 folds.",
|
||||
"training_class1_accuracy": 0.343,
|
||||
"training_eval_accuracy": 0.623,
|
||||
"training_eval_mae": 0.349,
|
||||
"training_caveat": "single-session data; class-1 accuracy 0% — see docs/benchmarks/person-count-cog.md"
|
||||
}
|
||||
}
|
||||
"training_temperature_scale": 0.9262
|
||||
},
|
||||
"id": "person-count",
|
||||
"installed_at": 0,
|
||||
"sig_algo": "Ed25519",
|
||||
"signed_by": "COGNITUM_OWNER_SIGNING_KEY",
|
||||
"status": "installed",
|
||||
"target_triple": "x86_64-unknown-linux-gnu",
|
||||
"version": "0.0.2",
|
||||
"weights_bytes": 392088,
|
||||
"weights_sha256": "32996433516891a37c63c600db8b95e42192a53bd538c088c82cd6a85e55513c",
|
||||
"weights_url": "https://storage.googleapis.com/cognitum-apps/cogs/arm/cog-person-count-count_v1.safetensors"
|
||||
}
|
||||
@@ -0,0 +1,192 @@
|
||||
{
|
||||
"kind": "count",
|
||||
"model": "v2/crates/cog-person-count/cog/artifacts/count_v1.safetensors",
|
||||
"n_samples": 128,
|
||||
"saliency_per_subcarrier": [
|
||||
0.0022704999428242445,
|
||||
0.003454199293628335,
|
||||
0.008727867156267166,
|
||||
0.006414174102246761,
|
||||
0.007945921272039413,
|
||||
0.005371364764869213,
|
||||
0.002526703756302595,
|
||||
0.003480477025732398,
|
||||
0.0029449211433529854,
|
||||
0.0013240973930805922,
|
||||
0.008836368098855019,
|
||||
0.0049454583786427975,
|
||||
0.003213808871805668,
|
||||
0.0017830731812864542,
|
||||
0.0015325949061661959,
|
||||
0.00322981970384717,
|
||||
0.00265303160995245,
|
||||
0.0015145435463637114,
|
||||
0.004348318092525005,
|
||||
0.003088578814640641,
|
||||
0.007093404419720173,
|
||||
0.00518156960606575,
|
||||
0.004933001007884741,
|
||||
0.0023939507082104683,
|
||||
0.004226110875606537,
|
||||
0.004997228272259235,
|
||||
0.0018603518838062882,
|
||||
0.0030096496921032667,
|
||||
0.0012774590868502855,
|
||||
0.0014232051325961947,
|
||||
0.009996140375733376,
|
||||
0.009672785177826881,
|
||||
0.0048093050718307495,
|
||||
0.0034254370257258415,
|
||||
0.002622435335069895,
|
||||
0.00878047849982977,
|
||||
0.006196534726768732,
|
||||
0.004779303912073374,
|
||||
0.008283626288175583,
|
||||
0.002107388572767377,
|
||||
0.004639340564608574,
|
||||
0.01281243097037077,
|
||||
0.001995982602238655,
|
||||
0.0019312826916575432,
|
||||
0.004808980971574783,
|
||||
0.0033761016093194485,
|
||||
0.0031302704010158777,
|
||||
0.0016994723118841648,
|
||||
0.004999841097742319,
|
||||
0.006001387722790241,
|
||||
0.00319978641346097,
|
||||
0.004073913209140301,
|
||||
0.011981681920588017,
|
||||
0.002540081739425659,
|
||||
0.0021413916256278753,
|
||||
0.005799528677016497
|
||||
],
|
||||
"ranking_high_to_low": [
|
||||
41,
|
||||
52,
|
||||
30,
|
||||
31,
|
||||
10,
|
||||
35,
|
||||
2,
|
||||
38,
|
||||
4,
|
||||
20,
|
||||
3,
|
||||
36,
|
||||
49,
|
||||
55,
|
||||
5,
|
||||
21,
|
||||
48,
|
||||
25,
|
||||
11,
|
||||
22,
|
||||
32,
|
||||
44,
|
||||
37,
|
||||
40,
|
||||
18,
|
||||
24,
|
||||
51,
|
||||
7,
|
||||
1,
|
||||
33,
|
||||
45,
|
||||
15,
|
||||
12,
|
||||
50,
|
||||
46,
|
||||
19,
|
||||
27,
|
||||
8,
|
||||
16,
|
||||
34,
|
||||
53,
|
||||
6,
|
||||
23,
|
||||
0,
|
||||
54,
|
||||
39,
|
||||
42,
|
||||
43,
|
||||
26,
|
||||
13,
|
||||
47,
|
||||
14,
|
||||
17,
|
||||
29,
|
||||
9,
|
||||
28
|
||||
],
|
||||
"top_k_subcarriers": {
|
||||
"8": [
|
||||
41,
|
||||
52,
|
||||
30,
|
||||
31,
|
||||
10,
|
||||
35,
|
||||
2,
|
||||
38
|
||||
],
|
||||
"16": [
|
||||
41,
|
||||
52,
|
||||
30,
|
||||
31,
|
||||
10,
|
||||
35,
|
||||
2,
|
||||
38,
|
||||
4,
|
||||
20,
|
||||
3,
|
||||
36,
|
||||
49,
|
||||
55,
|
||||
5,
|
||||
21
|
||||
],
|
||||
"32": [
|
||||
41,
|
||||
52,
|
||||
30,
|
||||
31,
|
||||
10,
|
||||
35,
|
||||
2,
|
||||
38,
|
||||
4,
|
||||
20,
|
||||
3,
|
||||
36,
|
||||
49,
|
||||
55,
|
||||
5,
|
||||
21,
|
||||
48,
|
||||
25,
|
||||
11,
|
||||
22,
|
||||
32,
|
||||
44,
|
||||
37,
|
||||
40,
|
||||
18,
|
||||
24,
|
||||
51,
|
||||
7,
|
||||
1,
|
||||
33,
|
||||
45,
|
||||
15
|
||||
]
|
||||
},
|
||||
"saliency_summary": {
|
||||
"min": 0.0012774590868502855,
|
||||
"max": 0.01281243097037077,
|
||||
"mean": 0.004496547522389197,
|
||||
"std": 0.002736047675826084,
|
||||
"max_to_mean_ratio": 2.8493929857463196
|
||||
}
|
||||
}
|
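The derived fields in this file are consistent with the per-subcarrier scores: `ranking_high_to_low` lists indices by descending saliency, each `top_k_subcarriers` entry is a prefix of that ranking, and `max_to_mean_ratio` is the maximum divided by the mean (0.012812 / 0.004497 ≈ 2.849). The sketch below recomputes those fields from a score list; the function name is hypothetical and the generator that actually produced the file is not shown in this diff.

```python
def summarise_saliency(saliency, ks=(8, 16, 32)):
    """Rank subcarrier indices by descending saliency and derive summary fields."""
    ranking = sorted(range(len(saliency)), key=lambda i: saliency[i], reverse=True)
    mean = sum(saliency) / len(saliency)
    return {
        "ranking_high_to_low": ranking,
        # Each top-k list is simply the first k entries of the ranking.
        "top_k_subcarriers": {str(k): ranking[:k] for k in ks if k <= len(saliency)},
        "max_to_mean_ratio": max(saliency) / mean,
    }

# Toy three-subcarrier example.
summary = summarise_saliency([0.1, 0.4, 0.2], ks=(2,))
print(summary)
```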
||||
@@ -10,6 +10,7 @@
|
||||
pub mod fusion;
|
||||
pub mod inference;
|
||||
pub mod publisher;
|
||||
pub mod runtime;
|
||||
|
||||
pub const COG_ID: &str = "person-count";
|
||||
pub const COG_VERSION: &str = env!("CARGO_PKG_VERSION");
|
||||
|
||||
@@ -103,10 +103,31 @@ fn cmd_health() -> Result<(), Box<dyn std::error::Error>> {
|
||||
Ok(())
|
||||
}
|
||||
|
||||
fn cmd_run(_config_path: PathBuf) -> Result<(), Box<dyn std::error::Error>> {
|
||||
// Long-running mode is wired in the v0.0.1 release follow-up — same
|
||||
// approach as cog-pose-estimation's runtime.rs. For now, the cog
|
||||
// satisfies the four-verb contract; downstream consumers integrate
|
||||
// via the in-process `InferenceEngine` API.
|
||||
Err("`run` subcommand wiring is pending v0.0.1 — for now consume via the InferenceEngine library API".into())
|
||||
fn cmd_run(config_path: PathBuf) -> Result<(), Box<dyn std::error::Error>> {
|
||||
let raw = std::fs::read_to_string(&config_path)
|
||||
.map_err(|e| format!("failed to read config at {}: {}", config_path.display(), e))?;
|
||||
let cfg: RunConfig = serde_json::from_str(&raw)
|
||||
.map_err(|e| format!("failed to parse config at {}: {}", config_path.display(), e))?;
|
||||
|
||||
let engine = InferenceEngine::with_weights(cfg.model_path.as_deref())?;
|
||||
publisher::run_started(
|
||||
COG_ID,
|
||||
&cfg.sensing_url,
|
||||
cfg.poll_ms,
|
||||
&cfg.model_path
|
||||
.as_ref()
|
||||
.map(|p| p.display().to_string())
|
||||
.unwrap_or_else(|| "(auto-discover)".to_string()),
|
||||
);
|
||||
|
||||
let rt = tokio::runtime::Builder::new_multi_thread()
|
||||
.enable_all()
|
||||
.build()?;
|
||||
rt.block_on(cog_person_count::runtime::run_loop(
|
||||
cog_person_count::runtime::RunConfig {
|
||||
sensing_url: cfg.sensing_url,
|
||||
poll_ms: cfg.poll_ms,
|
||||
},
|
||||
engine,
|
||||
))
|
||||
}
|
||||
|
||||
@@ -0,0 +1,77 @@
|
||||
//! Long-running inference loop. Polls the appliance's sensing-server,
|
||||
//! slides a CSI window, runs the count head, and emits `person.count`
|
||||
//! events. Same shape as `cog-pose-estimation::runtime`.
|
||||
//!
|
||||
//! v0.0.1 is single-node only, with no multi-node fusion: the appliance's
|
||||
//! `/api/v1/sensing/latest` endpoint already aggregates across nodes
|
||||
//! before serving, so per-cog fusion is deferred until each node ships
|
||||
//! raw frames separately (ADR-103 §"Multi-node fusion" v0.2.0).
|
||||
|
||||
use crate::inference::{CsiWindow, InferenceEngine, INPUT_SUBCARRIERS, INPUT_TIMESTEPS};
|
||||
use crate::publisher;
|
||||
use std::time::Duration;
|
||||
use tokio::time::sleep;
|
||||
|
||||
pub struct RunConfig {
|
||||
pub sensing_url: String,
|
||||
pub poll_ms: u64,
|
||||
}
|
||||
|
||||
pub async fn run_loop(
|
||||
cfg: RunConfig,
|
||||
engine: InferenceEngine,
|
||||
) -> Result<(), Box<dyn std::error::Error>> {
|
||||
let mut buffer: Vec<f32> = Vec::with_capacity(INPUT_SUBCARRIERS * INPUT_TIMESTEPS);
|
||||
let cap = INPUT_SUBCARRIERS * INPUT_TIMESTEPS;
|
||||
let mut tick: u64 = 0;
|
||||
|
||||
loop {
|
||||
match fetch_frame(&cfg.sensing_url).await {
|
||||
Ok(amplitudes) => {
|
||||
tick += 1;
|
||||
buffer.extend(amplitudes);
|
||||
while buffer.len() > 2 * cap {
|
||||
let extra = buffer.len() - cap;
|
||||
buffer.drain(0..extra);
|
||||
}
|
||||
if buffer.len() >= cap {
|
||||
let window = CsiWindow { data: buffer[buffer.len() - cap..].to_vec() };
|
||||
if let Ok(pred) = engine.infer(&window) {
|
||||
// v0.0.1 ships single-node — fusion is a no-op for
|
||||
// N=1. v0.2.0 will append additional per-node
|
||||
// predictions to a vec and call
|
||||
// `fusion::fuse_confidence_weighted` before emit.
|
||||
publisher::person_count(tick, &pred, 1);
|
||||
}
|
||||
}
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::warn!(error = %e, "sensing-server fetch failed");
|
||||
}
|
||||
}
|
||||
sleep(Duration::from_millis(cfg.poll_ms)).await;
|
||||
}
|
||||
}
|
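The buffer handling in `run_loop` is easier to follow in isolation: append the new frame, trim back to the newest `cap` samples once more than `2 * cap` have accumulated, and emit the newest `cap` samples as the window whenever enough are available. The sketch below mirrors that logic in Python, to stay consistent with the training script; the function name and the tiny `cap` are illustration choices.

```python
def push_frame(buffer, frame, cap):
    """Append a frame, bound the buffer, and return the newest `cap` samples or None."""
    buffer.extend(frame)
    if len(buffer) > 2 * cap:
        # Drop the oldest samples so exactly `cap` remain.
        del buffer[: len(buffer) - cap]
    if len(buffer) >= cap:
        return list(buffer[-cap:])
    return None

# Toy sizes: 3 values per frame, window of 6 values (two frames).
cap = 6
buffer = []
windows = []
for start in (1, 4, 7, 10, 13):
    windows.append(push_frame(buffer, [start, start + 1, start + 2], cap))

print(windows[-1], len(buffer))
```

The window is cut from a flat sample buffer, so it stays aligned to frame boundaries only if every frame contributes the same number of amplitudes.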
||||
|
||||
async fn fetch_frame(url: &str) -> Result<Vec<f32>, Box<dyn std::error::Error>> {
|
||||
let url = url.to_string();
|
||||
let body = tokio::task::spawn_blocking(move || -> Result<String, ureq::Error> {
|
||||
Ok(ureq::get(&url).call()?.into_string()?)
|
||||
})
|
||||
.await??;
|
||||
let json: serde_json::Value = serde_json::from_str(&body)?;
|
||||
let snapshot = json.get("snapshot").unwrap_or(&json);
|
||||
let nodes = snapshot
|
||||
.get("nodes")
|
||||
.and_then(|v| v.as_array())
|
||||
.ok_or("missing nodes[]")?;
|
||||
let amplitude = nodes
|
||||
.first()
|
||||
.and_then(|n| n.get("amplitude"))
|
||||
.and_then(|v| v.as_array())
|
||||
.ok_or("missing nodes[0].amplitude[]")?;
|
||||
Ok(amplitude
|
||||
.iter()
|
||||
.filter_map(|v| v.as_f64().map(|f| f as f32))
|
||||
.collect())
|
||||
}
|
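The parsing in `fetch_frame` accepts a payload with or without a `snapshot` wrapper, takes the first entry of `nodes`, and keeps only the numeric entries of its `amplitude` array. A Python sketch of the same extraction follows. The payload shape is inferred from the Rust code above, and the sample JSON and function name are made up for illustration.

```python
import json

def extract_amplitudes(body):
    """Return nodes[0].amplitude as floats, skipping non-numeric entries."""
    doc = json.loads(body)
    snapshot = doc.get("snapshot", doc) if isinstance(doc, dict) else doc
    nodes = snapshot.get("nodes") if isinstance(snapshot, dict) else None
    if not isinstance(nodes, list):
        raise ValueError("missing nodes[]")
    first = nodes[0] if nodes else None
    amplitude = first.get("amplitude") if isinstance(first, dict) else None
    if not isinstance(amplitude, list):
        raise ValueError("missing nodes[0].amplitude[]")
    # bool is a subclass of int in Python, so exclude it explicitly.
    return [float(v) for v in amplitude
            if isinstance(v, (int, float)) and not isinstance(v, bool)]

wrapped = extract_amplitudes('{"snapshot": {"nodes": [{"amplitude": [1, 2.5, "x", null]}]}}')
bare = extract_amplitudes('{"nodes": [{"amplitude": [0.5]}]}')
print(wrapped, bare)
```

Non-numeric entries are skipped silently, so a malformed frame comes back shorter than expected instead of raising an error.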
||||