feat(cog-person-count): train count_v1.safetensors — honest v0.0.1 (ADR-103) (#695)

Phase 2 of ADR-103: trained the count head on the existing 1,077
paired samples (the same data that produced pose_v1 yesterday).

Honest result: 65.1% eval accuracy / 100% within ±1 / MAE 0.349 on
the held-out time-window. Per-class: 100% on "empty room" / 0% on
"1 person". The model overfit by epoch 100 (train_acc → 1.0,
eval_loss climbed 0.67 → 7.8), and the "best" checkpoint is just the
snapshot whose accuracy equals the eval window's majority-class share
(140/215 = 65.1%, matching eval_acc exactly). That is consistent with
predicting "empty room" for every window. Confidence head
Spearman = 0.023 ⇒ uncalibrated. Same data-bound failure mode as
pose_v1 (#645), limited by single-session training data; same fix
path (multi-room).
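
The headline numbers can be cross-checked with a little arithmetic. The sketch below is not part of the training pipeline. It assumes the eval window holds only the two classes named in the per-class breakdown (140 "empty room" and 75 "1 person" windows, from the 140/215 split) and scores a hypothetical predictor that always answers "empty room". The helper name `evalMetrics` is illustrative.

```javascript
// Assumed eval window: 140 windows labelled 0 ("empty room") and 75
// labelled 1 ("1 person"). The 140/215 split comes from the commit
// message; the absence of other classes is an assumption.
const labels = [...Array(140).fill(0), ...Array(75).fill(1)];

// Hypothetical degenerate predictor: always outputs class 0.
const predictions = labels.map(() => 0);

// Exact-match accuracy, fraction within ±1 of the label, and mean
// absolute error over integer count labels.
function evalMetrics(yTrue, yPred) {
  const n = yTrue.length;
  let correct = 0;
  let withinOne = 0;
  let absErr = 0;
  for (let i = 0; i < n; i++) {
    const err = Math.abs(yTrue[i] - yPred[i]);
    if (err === 0) correct++;
    if (err <= 1) withinOne++;
    absErr += err;
  }
  return { accuracy: correct / n, withinOne: withinOne / n, mae: absErr / n };
}

const metrics = evalMetrics(labels, predictions);
console.log(metrics.accuracy.toFixed(3)); // 0.651
console.log(metrics.withinOne.toFixed(3)); // 1.000
console.log(metrics.mae.toFixed(3)); // 0.349
```

Under these assumptions a constant predictor reproduces all three reported figures (65.1%, 100% within ±1, MAE 0.349), which is why the result says more about the eval window's label mix than about the model.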

What v0.0.1 still validates end-to-end:
* PyTorch → safetensors → Candle Rust loads cleanly on first try.
  `cog-person-count health` reports `backend: candle-cpu` and emits
  real per-frame predictions instead of the stub backend's hard-coded
  {1 person, 0 confidence}. Architecture parity between train-count.py
  and src/inference.rs::CountNet is bit-exact.
* ONNX export bit-clean (16 KB, opset 18, dynamic batch axis).
* Training wall time: 5.6 s for 400 epochs on RTX 5080.
* Binary size unchanged (2.36 MB stripped), model loads via mmap at
  runtime.

This commit ships:

* scripts/align-ground-truth.js: extended to emit n_persons_mode +
  n_persons_max per window so the training pipeline has count
  labels. Backwards-compatible (additive fields).
* scripts/train-count.py: new — mirrors CountNet architecture
  exactly, loads paired.jsonl, trains 400 epochs with
  CE+BCE+Brier loss, exports safetensors + ONNX + per-epoch JSON.
* v2/.../cog/artifacts/{count_v1.safetensors,count_v1.onnx,
  count_train_results.json}: the trained artifacts.
* v2/.../cog/README.md: Status table updated with the v0.0.1 numbers
  + an Honest Caveat section explaining the data-bound result.
* docs/benchmarks/person-count-cog.md: new — full v0.0.1 benchmark
  log mirroring the format docs/benchmarks/pose-estimation-cog.md
  established. Includes comparison to ADR-103 v0.1.0 acceptance
  gates and per-class breakdown.
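
As a rough illustration of how the new additive fields might be consumed, the sketch below parses JSONL text shaped like paired.jsonl and tallies count labels per class. It is not the logic of train-count.py. The helper names, skipping records that lack `n_persons_mode`, and clamping labels into {0..7} are all assumptions made for this example.

```javascript
// Number of classes in the categorical {0..7} count head.
const NUM_CLASSES = 8;

// Hypothetical helper: turn JSONL text into { label, maxSeen, tsStart }.
// Records written before the new fields existed are skipped rather than
// silently labelled 0 (an assumption, not documented behaviour).
function parseCountSamples(jsonlText) {
  const samples = [];
  for (const rawLine of jsonlText.split("\n")) {
    const line = rawLine.trim();
    if (line.length === 0) continue;
    const rec = JSON.parse(line);
    if (!Number.isInteger(rec.n_persons_mode)) continue;
    // Clamp into the head's range (assumed handling of out-of-range counts).
    const label = Math.min(Math.max(rec.n_persons_mode, 0), NUM_CLASSES - 1);
    samples.push({
      label,
      maxSeen: rec.n_persons_max ?? label,
      tsStart: rec.ts_start,
    });
  }
  return samples;
}

// Per-class sample counts, handy for spotting label imbalance early.
function classHistogram(samples) {
  const hist = new Array(NUM_CLASSES).fill(0);
  for (const s of samples) hist[s.label]++;
  return hist;
}

// Made-up records for demonstration only.
const demoText = [
  '{"n_persons_mode": 0, "n_persons_max": 1, "ts_start": "2026-05-20T10:00:00.000Z"}',
  '{"n_persons_mode": 1, "n_persons_max": 1, "ts_start": "2026-05-20T10:00:01.000Z"}',
  '{"conf": 0.9, "ts_start": "2026-05-20T10:00:02.000Z"}',
  '{"n_persons_mode": 9, "n_persons_max": 9, "ts_start": "2026-05-20T10:00:03.000Z"}',
].join("\n");

const demoSamples = parseCountSamples(demoText);
console.log(classHistogram(demoSamples)); // [ 1, 1, 0, 0, 0, 0, 0, 1 ]
```

A histogram like this, printed before training, would surface a two-class, single-session label mix of the kind described in the result summary.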

Still pending:
* `run` subcommand wiring (long-running polling loop, same as pose)
* Cross-compile + sign + GCS upload (mirror of pose cog pipeline)
* Live install on cognitum-v0
* v0.2.0: re-train on multi-room data, LoRA per-room adapters,
  Stoer-Wagner min-cut clip in fusion stage
rUv
2026-05-21 18:56:52 -04:00
committed by GitHub
parent 6959a42312
commit 6b4994e105
7 changed files with 3719 additions and 6 deletions
@@ -481,12 +481,33 @@ function align() {
      ? extractCsiMatrix(window)
      : extractFeatureMatrix(window);

    // ADR-103: aggregate `n_persons` per window so the cog-person-count
    // training pipeline has count labels. Two summaries:
    //   - `n_persons_mode` — modal value across the camera frames in
    //                        the window. Robust to single-frame noise;
    //                        this is the supervised label for the
    //                        categorical {0..7} count head.
    //   - `n_persons_max`  — the maximum value seen in the window.
    //                        Useful as a soft upper bound (e.g. for
    //                        dynamic dropout weighting during training).
    const personCounts = matched.map(f => f.nPersons ?? 0);
    const counts = new Map();
    for (const v of personCounts) counts.set(v, (counts.get(v) ?? 0) + 1);
    let modeVal = 0;
    let modeCount = -1;
    for (const [v, n] of counts) {
      if (n > modeCount) { modeVal = v; modeCount = n; }
    }
    const maxVal = personCounts.reduce((a, b) => Math.max(a, b), 0);

    paired.push({
      csi: csiMatrix.data,
      csi_shape: csiMatrix.shape,
      kp: keypoints,
      conf: Math.round(avgConfidence * 1000) / 1000,
      n_camera_frames: matched.length,
      n_persons_mode: modeVal,
      n_persons_max: maxVal,
      ts_start: new Date(tStartMs).toISOString(),
      ts_end: new Date(tEndMs).toISOString(),
    });
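
To see how the aggregation in this hunk behaves on edge cases, here is a standalone sketch that wraps the same mode/max logic in a function. The wrapper name `summarizePersonCounts` and the sample frames are hypothetical; only the loop body mirrors the diff.

```javascript
// Standalone copy of the per-window aggregation, wrapped in a
// hypothetical helper so it can be run in isolation.
function summarizePersonCounts(frames) {
  // Frames without an nPersons field count as 0, as in the diff.
  const personCounts = frames.map(f => f.nPersons ?? 0);

  // Tally how often each count value occurs.
  const tally = new Map();
  for (const v of personCounts) tally.set(v, (tally.get(v) ?? 0) + 1);

  // Map iterates in insertion order and the comparison is strict, so
  // on a tie the value that appeared first in the window wins.
  let modeVal = 0;
  let modeCount = -1;
  for (const [v, n] of tally) {
    if (n > modeCount) { modeVal = v; modeCount = n; }
  }

  const maxVal = personCounts.reduce((a, b) => Math.max(a, b), 0);
  return { n_persons_mode: modeVal, n_persons_max: maxVal };
}

// One noisy frame reporting 2 people does not move the mode.
const noisy = summarizePersonCounts([
  { nPersons: 1 }, { nPersons: 1 }, { nPersons: 2 }, {},
]);
console.log(noisy); // { n_persons_mode: 1, n_persons_max: 2 }

// Tie between 2 and 0: the first-seen value (2) is kept.
const tied = summarizePersonCounts([{ nPersons: 2 }, { nPersons: 0 }]);
console.log(tied); // { n_persons_mode: 2, n_persons_max: 2 }

// An empty window falls back to 0 for both summaries.
const empty = summarizePersonCounts([]);
console.log(empty); // { n_persons_mode: 0, n_persons_max: 0 }
```

The tie case is worth noting: with an even number of camera frames split between two counts, the label depends on frame order, which may or may not be what the training pipeline wants.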