mirror of
https://github.com/ruvnet/RuView
synced 2026-06-18 11:43:19 +00:00
17471e93ff
* feat(calibration): NodeGeometry transceiver-geometry recording (ADR-152 §2.1.1)

  PerceptAlign-motivated geometry capture at enrollment: per-node optional records (position, antenna orientation, inter-node distances, acquisition method) -- recorded when known, never required. Event-sourced via EnrollmentEvent::GeometryRecorded (latest recording wins); persisted on SpecialistBank with serde defaults so pre-ADR-152 bank JSON loads cleanly (fixture-proven, and geometry-free banks serialize byte-shape-identical to the old schema); threaded through MultiNodeMixture as data only -- the learned geometry embeddings and algorithmic fusion use are §2.1.2, deliberately deferred until the ADR-151 P6 LoRA heads exist. Geometry recorded from now on means banks captured today remain usable for layout-conditioned training later -- you can't retroactively add geometry to data you didn't record.

  8 new tests (3 geometry, 2 anchor, 2 bank, 1 multistatic) + full-loop extension (2-node geometry, one tape-measured + one unknown, surviving the bank JSON round-trip the runtime loads from). 50/50 calibration (both feature configs) + 23 CLI tests green.

  Co-Authored-By: RuFlo <ruv@ruv.net>

* feat(training): two-checkerboard camera↔room calibration for ADR-079 labels (ADR-152 §2.1.3)

  Defends the camera-supervised pipeline against PerceptAlign's "coordinate overfitting": MediaPipe keypoints were emitted in raw camera coordinates with no shared frame and no transceiver-geometry metadata -- the exact label shape that memorizes deployment layout and collapses cross-layout.

  - scripts/calibrate-camera-room.py + calibration_lib.py: OpenCV two-checkerboard calibration → versioned bundle JSON (intrinsics, camera→room extrinsics, checkerboard spec, transceiver geometry, sha256 calibration_id). Intrinsics resolve from file > cache > multi-view computation > loud-warning 2-view fallback.
  - collect-ground-truth.py --calibration <bundle>: every sample gains keypoints_room (unit bearing rays from the camera center in the room frame -- documented projective alignment; raw image coords preserved so training chooses), camera_origin_room, calibration_id, and the transceiver geometry stamp. Without the flag, output is byte-identical to before (tested) + a one-line ADR-152 warning.

  Design finding (recorded for ADR-152): a single planar checkerboard's corner grid is centrosymmetric -- the reversed corner ordering fits a ghost camera pose with IDENTICAL reprojection error, so per-board flip disambiguation is mathematically ill-posed. solve_two_board_extrinsics solves the joint wall+floor set over all 4 flip combinations, where the minimum is unique -- an independent reason the TWO-checkerboard method is required, beyond what PerceptAlign states.

  15 headless pytest tests green (synthetic corners: extrinsics recovery incl. ghost resolution, bundle round-trip + hash stability, ray transforms w/ distortion + cross-resolution, no-calibration byte identity).

  Co-Authored-By: RuFlo <ruv@ruv.net>

* feat(benchmarks): WiFlow-STD reproduction harness + measurement (a) results (ADR-152 §2.2)

  Shipped checkpoint REFUTED (0.08% PCK@20, wrong keypoint normalization); 6 reproducibility defects documented (broken imports, corrupted dataset tail with float32-max garbage that NaN-poisons fp16 BatchNorm, unreachable test phase). After repairs, retraining with upstream defaults reproduces 96.09% PCK@20 full-test / 96.61% corruption-free (published 97.25%) on RTX 5080. Claims graded MEASURED-EQUIVALENT; 2.23M params + ~0.055 GFLOPs verified. Third-party code/weights/data stay out of tree (gitignored).
  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: ADR-152 Rust integrations + ADR-153 802.11bf protocol model

  - calibration: GeometryEmbedding -- 32-slot permutation-invariant NodeGeometry featurization for future LoRA-head conditioning (ADR-152 §2.1.2); derived SpecialistBank::geometry_embedding() accessor; 59 tests
  - train: MaePretrainConfig + patchify/random-mask with UNSW measured recipe (80% masking, (30,3) patches; ADR-152 §2.3, arXiv 2511.18792); strict no-truncate/no-NaN policy; proptest properties
  - train: WiFlowStdModel -- tch-gated port of the verified ~96%-PCK@20 WiFlow-STD architecture (ADR-152 §2.2 beyond-SOTA); ungated param formula pinned to 2,225,042; 15/17-keypoint support; 239 crate tests
  - hardware: ieee80211bf forward-compatibility protocol model (ADR-153): SpecProfile gates, SensingCapabilities negotiation, required ConsentMode, session FSM, SensingTransport + SimTransport + OpportunisticCsiBridge; full acceptance checklist covered; 156+4 tests
  - deps: ruvector bumps per ADR-152 §2.6 survey (mincut/solver 2.0.6, attention 2.1.0, gnn 2.2.0); vendor/ruvector synced to a083bd77f
  - docs: ADR-153 accepted; ADR-152 §2.2 status, §2.4 amendment, §2.6 added

  Workspace: 162 test suites green (--no-default-features); Python proof PASS. Known pre-existing flake: homecore-api env_empty_falls_back_to_defaults (unserialized env-var mutation) -- untouched, follow-up.
  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: CHANGELOG + CLAUDE.md entries for ADR-152 integrations and ADR-153

  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(train): repair tch-backend bit-rot -- gated path compiles and tests run again

  Mechanical API refresh against current tch: Vec::from(Tensor) -> try_from (+ explicit flatten), numel() usize cast, Rem/div ops -> remainder() / divide_scalar_mode(floor) -- the latter fixed a silent true-division bug in heatmap argmax decoding; clamp(1.0, f64::MAX) -> clamp_min (torch 2.x scalar overflow panic); petgraph EdgeRef import; missing EvalMetrics and verify_checkpoint_dir APIs that tests documented. wiflow_std roundtrip test uses safetensors (.pt _save_parameters roundtrip broken in torch 2.11 Windows). Gated: 349 passed (incl. all 20 wiflow_std); ungated: unchanged. Known pre-existing: gaussian-heatmap convention mismatch (2 tests), proof seed race under parallel threads -- documented, deliberate follow-ups.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(train): WiFlow-STD PyTorch->tch weight import + numerical parity proof

  export_to_safetensors.py maps the retrained checkpoint (295 tensors -> 248 mapped, param sum exactly 2,225,042; num_batches_tracked dropped) into a tch-loadable safetensors plus a deterministic parity fixture. Gated #[ignore] integration test loads it strictly and asserts forward-pass agreement: max abs diff 1.192e-7 on the seed-42 fixture. dump_variable_names test makes the tch name layout authoritative. Zero architecture discrepancies found.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: workflow-review findings -- BN gamma init, ThresholdParams serde, init docs

  Concurrent validation workflow (2 review lanes + adversarial verification, 13 agents): 5 confirmed findings, 3 refuted.
  Fixes:
  - wiflow_std: pin BatchNorm gamma to 1.0 (tch default draws Uniform(0,1) -- silently halves activations in from-scratch training; loaded checkpoints unaffected, parity re-verified after the change)
  - wiflow_std: document the conv-init divergences vs the reference's effective kaiming_normal(fan_out) re-init (from-scratch dynamics only)
  - ieee80211bf: ThresholdParams deserialization validates via try_from so the <=100 invariant holds for untrusted payloads (+ rejection test)

  Benchmarks (release, ruvzen): GeometryEmbedding 1.84us/call (542k/s), MAE tokenization 7.38us/window (135k/s), 802.11bf FSM 8.9M events/s -- nothing suspicious.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr): ADR-152 §2.1.4 gate resolved -- PerceptAlign repo MIT, dataset on HF

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): edge optimization measured + measurement (b) blocked + 92.9% retraction

  Edge optimization (ADR-152 optimize track): ONNX Runtime fp32 is the CPU latency win (3.2 ms/window, ~3.4x faster than torch, parity 2.4e-7); ORT dynamic int8 reaches 2.44 MB (paper's ~2.2 MB claim plausible only via conv-capable toolchains; -0.16pt PCK@20, +18% MPJPE, 2x slower); torch dynamic quant converts 0% of this conv-only model; fp16 halves storage free but is slower on CPU.

  Measurement (b) BLOCKED-ON-DATA: only 1,077 paired ESP32 windows exist (stop rule <2k). Forensic recheck of the surviving April holdout RETRACTS the ADR-079 '92.9% PCK@20' figure: constant-output model, absolute (not torso) threshold, 69 near-static frames -- mean predictor scores 100% under that protocol; torso-PCK@20 is 19.1%. Corroborates PR #535. Stale citations removed from user-guide, readme-details, ADR-152 §2.1.3; no-citation rule extended to ADR-079 accuracy claims. Unblock: >=2k-window multi-pose paired session + torso-PCK re-baseline.
  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(user-guide): corrected camera-supervised collection tutorial

  Step 0 CSI-rate check + session-length math (window yield = frames/20 -- the May session's 8x under-delivery was a ~12 Hz CSI rate, not an aligner bug); two-checkerboard calibration step (ADR-152 §2.1.3); pose-variety and confidence guidance; torso-normalized PCK + temporal-split + pred-variance eval protocol (lessons from the 92.9% retraction); scale presets re-keyed to realistic window counts.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): static PTQ int8 (calibrated) results + overnight capture script

  Conv-only static QDQ beats dynamic int8 on accuracy (PCK@20 96.61-96.63% vs 96.52%, MPJPE +10% vs +18% over fp32) at ~equal size/latency; all-ops QDQ strictly worse (int8 activations through attention glue). Entropy calibration verified bit-identical to MinMax on this data. Deployment: ONNX fp32 for speed (3.2ms), static conv-only QDQ for smallest (2.53MB). Also: scripts/overnight-empty-capture.py -- segmented UDP CSI recorder for empty-room baselines (no glob collisions, detach-safe).

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): measurement (b) MEASURED -- optimization transfer only, mean-pose baseline wins

  WiFlow-STD fine-tuned on 2,046 fresh single-room ESP32 paired windows (temporal 70/15/15, 70->540 adapter, K=17): pretrained-init 65% PCK@20 vs scratch 0% (optimization transfer) but frozen-trunk ~0% (no feature transfer), and NOTHING beats the mean-pose baseline (95.9% PCK@20 -- single subject, near-static normalized coords). Honesty gates held: pred std 0.0113 (non-constant model) but mean-baseline dominance means no citable CSI->pose capability from this data. ADR-152 open question 1 answered partially; definitive answer needs multi-subject/position data.
  Two new aligner findings: heterogeneous csi_shape with silent zero-padding (~20%), and extractCsiMatrix's transposed shape label (frame-major data, [nSc, nFrames] label) -- fixes pending.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): efficiency sweep MEASURED -- half model dominates full reference

  Compact WiFlow-STD variants on the same data/split/protocol: half (843,834 params, 0.38x) strictly dominates the 2.23M reference (PCK@20 96.62 vs 96.61, PCK@50 99.47 vs 99.11, MPJPE 0.00898 vs 0.0094) -- the published architecture is over-parameterized for its own benchmark. quarter (338k) 96.05%; tiny (56,290 params, 1/39.5) holds 94.11% -- a ~220KB fp32 edge candidate. In-domain caveats recorded; cross-domain untested.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(train): compact WiFlow-STD presets in Rust + tiny edge artifact (ADR-152)

  WiFlowStdConfig gains half()/quarter()/tiny() mirroring the overnight sweep exactly: TcnGroupsMode (Fixed/Gcd/Depthwise), input_pw_groups, derived stride schedule and decoder-mid (all default to upstream behavior; legacy serde JSON unaffected). Param formulas pin to trained ground truth first try: 843,834 / 338,600 / 56,290; default 2,225,042 pin and 1.192e-7 parity unchanged. 248 tests green.

  Tiny edge artifact (tiny_edge_bench.py): ONNX fp32 = 295 KB, 0.66 ms/win (~1,500/s CPU), 94.11% PCK@20 (matches sweep clean-test exactly; parity 1.49e-7). Static int8 is a bad trade at this scale (-1.43pt, +19% MPJPE, -16% size, slower) -- recorded as negative result. Export note: width-16 breaks AdaptiveAvgPool((15,1)) TorchScript export; replaced by exact mean+matmul equivalent, proven by parity.
  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: resolve all 10 confirmed code-review findings (7-angle review, 20/20 verified)

  wiflow_std: min_feature_width (default 15) replaces the keypoints->stride coupling -- for_keypoints(17) now provably builds the trained [2,2,2,2] graph and pools 15->17, matching the validated Python protocol (pinned by tests); param_count() total on invalid configs; random_mask returns Result and rejects non-finite/out-of-range ratios; trainer checkpoints switched to safetensors (.pt VarStore roundtrip broken on Windows torch 2.11).

  ieee80211bf: SBP proxy now re-triggers instances and relays reports via Action::RelaySbpReport -> SensingFrame::SbpReport (clients consume via their existing path); missed_instances reset on success = consecutive semantics; SessionTable gains a guarded SBP entry point + unknown-id drop counter; initiator-role sessions reject inbound setup/SBP requests (RejectedNotSupported) closing the idle hijack; StartSetup/StartSbp outside Idle return InvalidStateForCommand; SBP validation unified through evaluate_setup with a 1:1 SetupStatus->SbpStatus mapping. events.rs split out to honor the 500-line cap.

  calibration/cli: enrollment geometry now actually reaches trained banks -- both production call sites attach .with_geometry; --geometry flag on train-room and POST /enroll/geometry + train-body geometry on calibrate-serve give production a recording surface; geometry-free banks log the ADR-152 §2.1.2 note.

  benchmarks: corruption masks committed as ground truth (unregenerable after in-place cleaning; verified bit-identical regeneration from the pristine copy) + generate_corruption_masks.py producer; _bench_common.py dedups the 5x-copied shim/evaluate/seed/remap (post-refactor PCK@20 re-verified equal to the last digit); remote scripts get the mmap patch; tiny_edge --calib validated multiple-of-64; onnx_bench --help no longer executes (and overwrote) the export -- artifact restored byte-exact.
  Workspace: 2,963 tests passed, 0 failed; Python proof PASS.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* ci: build workspace tests without debuginfo -- runner disk exhaustion

  The combined 38-crate debug target exceeds the GitHub runner's disk ('final link failed: No space left on device'); the same tree measured 151GB locally with full debuginfo. CARGO_PROFILE_{DEV,TEST}_DEBUG=0 shrinks the target ~5-10x; debuginfo serves no purpose in CI test runs.

  Co-Authored-By: claude-flow <ruv@ruv.net>
417 lines
16 KiB
Python
#!/usr/bin/env python3
"""Camera-room calibration library for WiFi pose ground truth (ADR-152 S2.1.3).

Implements the PerceptAlign-style two-checkerboard alignment adopted in
ADR-152 S2.1.3 to defend the ADR-079 camera-supervised pipeline against
"coordinate overfitting" (arXiv 2601.12252, MobiCom'26): models regressing
CSI to raw camera-frame coordinates memorize the deployment layout and
collapse cross-layout. The fix is to express camera AND WiFi transceivers
in one shared 3D room frame, and stamp every training label with the
calibration + transceiver geometry that produced it.

Used by:
  scripts/calibrate-camera-room.py (produces the calibration bundle)
  scripts/collect-ground-truth.py (consumes it via --calibration)

Room frame convention (right-handed, meters):
  origin = a designated wall/floor corner of the room
  +x = along the origin wall
  +y = into the room (away from the origin wall)
  +z = up

No-depth limitation (IMPORTANT): a single 2D camera keypoint constrains
only a *ray* in the room frame, not a 3D point. The transform helpers here
therefore return unit bearing rays from the camera center -- a projective
alignment. Consumers that need metric 3D points must supply a depth
assumption downstream (floor-plane intersection, known subject height,
multi-view triangulation, ...). Raw image coordinates are always preserved
alongside the room-frame rays so training can choose either representation.
"""

from __future__ import annotations

import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

import cv2
import numpy as np

BUNDLE_SCHEMA_VERSION = 1
BUNDLE_METHOD = "two-checkerboard"

# Default checkerboard: 9x6 inner corners, 25 mm squares (a common print).
DEFAULT_BOARD_COLS = 9
DEFAULT_BOARD_ROWS = 6
DEFAULT_SQUARE_SIZE_MM = 25.0

_AXIS_TOKENS = {
    "+x": (1.0, 0.0, 0.0), "-x": (-1.0, 0.0, 0.0),
    "+y": (0.0, 1.0, 0.0), "-y": (0.0, -1.0, 0.0),
    "+z": (0.0, 0.0, 1.0), "-z": (0.0, 0.0, -1.0),
}


def parse_axis(token: str) -> np.ndarray:
    """Parse an axis token like '+x' or '-z' into a room-frame unit vector."""
    key = token.strip().lower()
    if key in _AXIS_TOKENS:
        return np.array(_AXIS_TOKENS[key], dtype=np.float64)
    raise ValueError(f"Invalid axis token {token!r}; expected one of {sorted(_AXIS_TOKENS)}")


# ---------------------------------------------------------------------------
# Checkerboard geometry
# ---------------------------------------------------------------------------

def board_object_points(cols: int, rows: int, square_size_m: float) -> np.ndarray:
    """Inner-corner positions in the board's own frame (z=0 plane), row-major.

    Matches the corner ordering of cv2.findChessboardCorners for a
    (cols, rows) pattern: cols varies fastest.
    """
    pts = np.zeros((rows * cols, 3), dtype=np.float64)
    grid = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2)  # (rows*cols, 2), cols fastest
    pts[:, :2] = grid * square_size_m
    return pts


def board_room_points(
    cols: int,
    rows: int,
    square_size_m: float,
    origin: np.ndarray,
    u_axis: np.ndarray,
    v_axis: np.ndarray,
) -> np.ndarray:
    """Inner-corner positions in ROOM coordinates for a board placed at a
    known position: first corner at `origin`, columns stepping along
    `u_axis`, rows stepping along `v_axis` (both room-frame unit vectors).
    """
    local = board_object_points(cols, rows, square_size_m)
    origin = np.asarray(origin, dtype=np.float64)
    u = np.asarray(u_axis, dtype=np.float64)
    v = np.asarray(v_axis, dtype=np.float64)
    return origin[None, :] + local[:, 0:1] * u[None, :] + local[:, 1:2] * v[None, :]


def find_board_corners(image: np.ndarray, cols: int, rows: int) -> np.ndarray | None:
    """Detect and sub-pixel-refine checkerboard inner corners.

    Returns (cols*rows, 2) float64 pixel coordinates, or None if not found.
    """
    gray = image if image.ndim == 2 else cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    flags = cv2.CALIB_CB_ADAPTIVE_THRESH | cv2.CALIB_CB_NORMALIZE_IMAGE
    found, corners = cv2.findChessboardCorners(gray, (cols, rows), flags=flags)
    if not found:
        return None
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    corners = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
    return corners.reshape(-1, 2).astype(np.float64)


# ---------------------------------------------------------------------------
# Intrinsics
# ---------------------------------------------------------------------------

def compute_intrinsics(
    corner_sets: list[np.ndarray],
    image_size: tuple[int, int],
    cols: int,
    rows: int,
    square_size_m: float,
) -> dict:
    """Camera intrinsics from N checkerboard views via cv2.calibrateCamera.

    corner_sets: list of (cols*rows, 2) pixel corner arrays.
    image_size: (width, height) of the calibration images.
    """
    obj = board_object_points(cols, rows, square_size_m).astype(np.float32)
    obj_pts = [obj for _ in corner_sets]
    img_pts = [c.reshape(-1, 1, 2).astype(np.float32) for c in corner_sets]
    rms, camera_matrix, dist_coeffs, _, _ = cv2.calibrateCamera(
        obj_pts, img_pts, tuple(image_size), None, None
    )
    return {
        "image_size": [int(image_size[0]), int(image_size[1])],
        "camera_matrix": camera_matrix.tolist(),
        "dist_coeffs": dist_coeffs.ravel().tolist(),
        "reprojection_error_px": float(rms),
        "source": "computed",
    }


def load_intrinsics(path: Path) -> dict:
    """Load a pre-computed intrinsics JSON ({camera_matrix, dist_coeffs, image_size})."""
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # Accept either a bare intrinsics dict or a full calibration bundle.
    intr = data.get("camera_intrinsics", data)
    for key in ("camera_matrix", "dist_coeffs", "image_size"):
        if key not in intr:
            raise ValueError(f"Intrinsics file {path} missing key {key!r}")
    intr = dict(intr)
    intr["source"] = "file"
    return intr


# ---------------------------------------------------------------------------
# Extrinsics (camera -> room rigid transform)
# ---------------------------------------------------------------------------

def reprojection_rmse(
    room_points: np.ndarray,
    image_points: np.ndarray,
    rvec: np.ndarray,
    tvec: np.ndarray,
    camera_matrix: np.ndarray,
    dist_coeffs: np.ndarray,
) -> float:
    """RMS reprojection error in pixels: project `room_points` with the
    room->camera pose (rvec, tvec) and compare against `image_points`.
    """
    proj, _ = cv2.projectPoints(room_points, rvec, tvec, camera_matrix, dist_coeffs)
    err = proj.reshape(-1, 2) - image_points.reshape(-1, 2)
    return float(np.sqrt(np.mean(np.sum(err**2, axis=1))))


def _solve_pnp(
    room_points: np.ndarray,
    image_points: np.ndarray,
    camera_matrix: np.ndarray,
    dist_coeffs: np.ndarray,
) -> dict | None:
    """One solvePnP run (room->camera), inverted to camera->room. Returns
    {rotation (3x3 camera->room), translation_m (camera center in room
    frame), rmse_px} or None on failure.
    """
    ok, rvec, tvec = cv2.solvePnP(
        room_points.reshape(-1, 1, 3),
        image_points.reshape(-1, 1, 2),
        camera_matrix,
        dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE,
    )
    if not ok:
        return None
    rmse = reprojection_rmse(room_points, image_points, rvec, tvec, camera_matrix, dist_coeffs)
    r_room_to_cam, _ = cv2.Rodrigues(rvec)
    # Invert the rigid transform: R_cam_to_room = R^T, camera center = -R^T t.
    r_cam_to_room = r_room_to_cam.T
    camera_center_room = (-r_cam_to_room @ tvec).ravel()
    return {
        "rotation": r_cam_to_room.tolist(),
        "translation_m": camera_center_room.tolist(),
        "rmse_px": rmse,
    }


def solve_extrinsics(
    room_points: np.ndarray,
    image_points: np.ndarray,
    camera_matrix: np.ndarray,
    dist_coeffs: np.ndarray,
) -> dict:
    """Solve the camera->room rigid transform from 3D room-frame points and
    their 2D pixel observations.

    NOTE: the corner grid of a single planar checkerboard is centrosymmetric,
    so the corner ordering returned by findChessboardCorners (which may
    enumerate from either board end) cannot be disambiguated from one board
    alone -- the reversed ordering fits a ghost pose with identical
    reprojection error. Use solve_two_board_extrinsics for the full
    two-checkerboard procedure, where the joint point set breaks the symmetry.
    """
    ext = _solve_pnp(room_points, image_points, camera_matrix, dist_coeffs)
    if ext is None:
        raise RuntimeError("solvePnP failed")
    return ext


def solve_two_board_extrinsics(
    wall_room: np.ndarray,
    wall_image: np.ndarray,
    floor_room: np.ndarray,
    floor_image: np.ndarray,
    camera_matrix: np.ndarray,
    dist_coeffs: np.ndarray,
) -> dict:
    """Joint camera->room solve over both checkerboards (the ADR-152 S2.1.3
    two-checkerboard method).

    Tries all 4 per-board corner-ordering combinations: each board's ordering
    is individually ambiguous (centrosymmetric grid), but the combined
    wall+floor point set is not, so exactly one combination reaches minimal
    reprojection error. Returns the solve_extrinsics dict plus
    {wall_flipped, floor_flipped, per_board: {wall|floor: {rmse_px}}}.
    """
    best = None
    for wall_flipped in (False, True):
        for floor_flipped in (False, True):
            wi = wall_image[::-1].copy() if wall_flipped else wall_image
            fi = floor_image[::-1].copy() if floor_flipped else floor_image
            room = np.concatenate([wall_room, floor_room], axis=0)
            img = np.concatenate([wi, fi], axis=0)
            ext = _solve_pnp(room, img, camera_matrix, dist_coeffs)
            if ext is None:
                continue
            if best is None or ext["rmse_px"] < best[0]["rmse_px"]:
                ext["wall_flipped"] = wall_flipped
                ext["floor_flipped"] = floor_flipped
                # Recover the room->camera (rvec, tvec) from the stored
                # camera->room pose to report the error of each board alone.
                rvec, _ = cv2.Rodrigues(np.asarray(ext["rotation"]).T)
                tvec = -np.asarray(ext["rotation"]).T @ np.asarray(ext["translation_m"])
                ext["per_board"] = {
                    "wall": {"rmse_px": reprojection_rmse(
                        wall_room, wi, rvec, tvec, camera_matrix, dist_coeffs)},
                    "floor": {"rmse_px": reprojection_rmse(
                        floor_room, fi, rvec, tvec, camera_matrix, dist_coeffs)},
                }
                best = (ext,)
    if best is None:
        raise RuntimeError("solvePnP failed for all corner-ordering combinations")
    return best[0]


def extrinsics_consistency(ext_a: dict, ext_b: dict) -> dict:
    """Angular + translational disagreement between two extrinsic solutions
    (the two single-board solves). Large values mean a mis-entered board
    placement or a bad corner detection.
    """
    ra = np.asarray(ext_a["rotation"])
    rb = np.asarray(ext_b["rotation"])
    r_delta = ra.T @ rb
    angle = float(np.degrees(np.arccos(np.clip((np.trace(r_delta) - 1.0) / 2.0, -1.0, 1.0))))
    t_delta = float(
        np.linalg.norm(np.asarray(ext_a["translation_m"]) - np.asarray(ext_b["translation_m"]))
    )
    return {"rotation_deg": angle, "translation_m": t_delta}


# ---------------------------------------------------------------------------
# Calibration bundle (the artifact written to disk)
# ---------------------------------------------------------------------------

def make_bundle(
    camera_intrinsics: dict,
    camera_to_room_extrinsics: dict,
    checkerboard_spec: dict,
    transceiver_geometry: dict,
) -> dict:
    """Assemble the calibration bundle dict, stamped with the current UTC time."""
    return {
        "schema_version": BUNDLE_SCHEMA_VERSION,
        "method": BUNDLE_METHOD,
        "calibrated_at": datetime.now(timezone.utc).isoformat(),
        "room_frame": {
            "description": "right-handed; origin at wall/floor corner; "
                           "+x along origin wall, +y into room, +z up",
            "units": "meters",
        },
        "checkerboard_spec": checkerboard_spec,
        "camera_intrinsics": camera_intrinsics,
        "camera_to_room_extrinsics": camera_to_room_extrinsics,
        "transceiver_geometry": transceiver_geometry,
    }


def calibration_id(bundle: dict) -> str:
    """Stable content hash of a bundle -- stamped onto every emitted sample
    so a label can always be traced to the exact calibration that framed it.
    """
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save_bundle(bundle: dict, path: Path) -> None:
    """Write the bundle as indented JSON, creating parent directories as needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(bundle, f, indent=2)
        f.write("\n")


def load_bundle(path: Path) -> dict:
    """Load a bundle JSON and check that its required top-level keys are present."""
    with open(path, "r", encoding="utf-8") as f:
        bundle = json.load(f)
    for key in ("camera_intrinsics", "camera_to_room_extrinsics", "transceiver_geometry"):
        if key not in bundle:
            raise ValueError(f"Calibration bundle {path} missing key {key!r}")
    return bundle


# ---------------------------------------------------------------------------
# Keypoint transform (image -> room-frame bearing rays)
# ---------------------------------------------------------------------------

class CalibrationContext:
    """Pre-computed transform state for a collection session.

    Scales the bundle's intrinsics to the live capture resolution (MediaPipe
    keypoints are normalized [0,1], so we need the actual frame size to get
    back to pixels before undistorting).
    """

    def __init__(self, bundle: dict, frame_w: int, frame_h: int):
        self.bundle = bundle
        self.calibration_id = calibration_id(bundle)
        self.transceiver_geometry = bundle["transceiver_geometry"]
        self.frame_w = int(frame_w)
        self.frame_h = int(frame_h)

        intr = bundle["camera_intrinsics"]
        k = np.asarray(intr["camera_matrix"], dtype=np.float64)
        cal_w, cal_h = intr["image_size"]
        # Rescale focal lengths and principal point from the calibration
        # resolution to the live frame resolution.
        sx = self.frame_w / float(cal_w)
        sy = self.frame_h / float(cal_h)
        k = k.copy()
        k[0, 0] *= sx
        k[0, 2] *= sx
        k[1, 1] *= sy
        k[1, 2] *= sy
        self.camera_matrix = k
        self.dist_coeffs = np.asarray(intr["dist_coeffs"], dtype=np.float64)

        ext = bundle["camera_to_room_extrinsics"]
        self.r_cam_to_room = np.asarray(ext["rotation"], dtype=np.float64)
        self.origin_room = np.asarray(ext["translation_m"], dtype=np.float64)

    def transform_keypoints(self, keypoints_norm: list[list[float]]) -> tuple[np.ndarray, np.ndarray]:
        """Normalized [0,1] image keypoints -> unit bearing rays in the room
        frame, anchored at the camera center.

        Projective alignment ONLY (no depth): each returned ray is the locus
        of room positions consistent with the 2D observation. Returns
        (camera_origin_room (3,), ray_dirs (N, 3) unit vectors).
        """
        pts = np.asarray(keypoints_norm, dtype=np.float64)
        pts_px = pts * np.array([self.frame_w, self.frame_h], dtype=np.float64)
        undist = cv2.undistortPoints(
            pts_px.reshape(-1, 1, 2), self.camera_matrix, self.dist_coeffs
        ).reshape(-1, 2)
        rays_cam = np.concatenate([undist, np.ones((len(undist), 1))], axis=1)
        rays_cam /= np.linalg.norm(rays_cam, axis=1, keepdims=True)
        rays_room = (self.r_cam_to_room @ rays_cam.T).T
        return self.origin_room, rays_room


def load_calibration_context(path: Path, frame_w: int, frame_h: int) -> CalibrationContext:
    """Load a bundle from `path` and build a CalibrationContext for the given frame size."""
    return CalibrationContext(load_bundle(path), frame_w, frame_h)


def augment_record(record: dict, ctx: CalibrationContext | None) -> dict:
    """Stamp a ground-truth record with room-frame rays + calibration metadata.

    With ctx=None this is the identity -- the record (and hence the emitted
    JSONL line) is byte-identical to the pre-calibration ADR-079 format.
    Raw image-coordinate keypoints are kept untouched in both cases; the
    room-frame representation is ADDED, never substituted, so training can
    choose either (ADR-152 S2.1.3).
    """
    if ctx is None:
        return record
    if record.get("keypoints"):
        _, rays = ctx.transform_keypoints(record["keypoints"])
        record["keypoints_room"] = [[round(float(v), 5) for v in ray] for ray in rays]
    else:
        record["keypoints_room"] = []
    record["camera_origin_room"] = [round(float(v), 5) for v in ctx.origin_room]
    record["calibration_id"] = ctx.calibration_id
    record["transceiver_geometry"] = ctx.transceiver_geometry
    return record
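
The module docstring's no-depth limitation names floor-plane intersection as one depth assumption a consumer could supply downstream. The sketch below illustrates that idea only; it is not part of calibration_lib. The function name `intersect_rays_with_floor` and the `floor_z` parameter are hypothetical, and it assumes the room-frame convention stated in the docstring (+z up, meters) with inputs shaped like the `(camera_origin_room, ray_dirs)` pair that `transform_keypoints` returns.

```python
import numpy as np


def intersect_rays_with_floor(origin_room, ray_dirs, floor_z=0.0):
    """Intersect bearing rays with the horizontal plane z = floor_z.

    origin_room: (3,) camera center in the room frame (meters).
    ray_dirs:    (N, 3) direction vectors in the room frame.
    Returns (N, 3) points; a row is NaN when its ray is parallel to the
    plane or would only reach it behind the camera.
    """
    origin = np.asarray(origin_room, dtype=np.float64)
    dirs = np.asarray(ray_dirs, dtype=np.float64).reshape(-1, 3)
    points = np.full(dirs.shape, np.nan)
    # Solve origin_z + t * dir_z = floor_z for the ray parameter t.
    with np.errstate(divide="ignore", invalid="ignore"):
        t = (floor_z - origin[2]) / dirs[:, 2]
    valid = np.isfinite(t) & (t > 0.0)
    points[valid] = origin + t[valid, None] * dirs[valid]
    return points


# Hypothetical setup: camera 2 m above the floor. The first ray points
# 45 degrees downward along +y, the second is horizontal and never
# reaches the floor.
origin = np.array([1.0, 0.0, 2.0])
rays = np.array([
    np.array([0.0, 1.0, -1.0]) / np.sqrt(2.0),
    np.array([0.0, 1.0, 0.0]),
])
floor_points = intersect_rays_with_floor(origin, rays)
print(floor_points)
```

The first ray meets the floor 2 m in front of the camera, at roughly (1, 2, 0); the horizontal ray yields a NaN row. Floor intersection is only meaningful for keypoints that actually touch the floor (feet, for example), so other joints would need a different depth assumption.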
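
The note in `solve_extrinsics` says a single planar checkerboard's corner grid is centrosymmetric, so a reversed corner ordering cannot be told apart from the true one. A small NumPy check makes the geometry concrete: reversing the corner order gives exactly the same 3D points as rotating the board 180 degrees about its normal through the grid center. The helper `grid_points` is a hypothetical stand-in written for this sketch (it mirrors the cols-fastest ordering described for `board_object_points`), and the 9x6 / 25 mm values are the file's defaults.

```python
import numpy as np


def grid_points(cols, rows, square_size_m):
    """Planar inner-corner grid in the z = 0 plane, cols varying fastest."""
    xs, ys = np.meshgrid(np.arange(cols), np.arange(rows))
    pts = np.zeros((rows * cols, 3))
    pts[:, 0] = xs.ravel() * square_size_m
    pts[:, 1] = ys.ravel() * square_size_m
    return pts


pts = grid_points(9, 6, 0.025)
center = pts.mean(axis=0)

# Rotate the board by 180 degrees about its normal (z) through the center.
rot180 = np.diag([-1.0, -1.0, 1.0])
rotated = (pts - center) @ rot180.T + center

# Reversing the corner ordering lands on the same 3D points, index by index.
reversed_order = pts[::-1]
same_as_rotation = bool(np.allclose(rotated, reversed_order))
print("reversed ordering == 180-degree rotation:", same_as_rotation)
```

Because the reversed labelling equals a rigid motion of the board, a pose solve fed the reversed pixel ordering can fit a different camera pose equally well, which matches the ghost pose described in the docstring. With a second board on another plane, each board's 180-degree rotation is about a different axis, so no single rigid motion explains both reversals; that is the intuition behind searching all 4 flip combinations in `solve_two_board_extrinsics`.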