mirror of
https://github.com/ruvnet/RuView
synced 2026-06-22 12:23:18 +00:00
17471e93ff
* feat(calibration): NodeGeometry transceiver-geometry recording (ADR-152 §2.1.1)

  PerceptAlign-motivated geometry capture at enrollment: per-node optional records (position, antenna orientation, inter-node distances, acquisition method), recorded when known, never required. Event-sourced via EnrollmentEvent::GeometryRecorded (latest recording wins); persisted on SpecialistBank with serde defaults so pre-ADR-152 bank JSON loads cleanly (fixture-proven, and geometry-free banks serialize byte-shape-identical to the old schema); threaded through MultiNodeMixture as data only: the learned geometry embeddings and algorithmic fusion use are §2.1.2, deliberately deferred until the ADR-151 P6 LoRA heads exist. Geometry recorded from now on means banks captured today remain usable for layout-conditioned training later; you can't retroactively add geometry to data you didn't record.

  8 new tests (3 geometry, 2 anchor, 2 bank, 1 multistatic) + full-loop extension (2-node geometry, one tape-measured + one unknown, surviving the bank JSON round-trip the runtime loads from). 50/50 calibration (both feature configs) + 23 CLI tests green.

  Co-Authored-By: RuFlo <ruv@ruv.net>

* feat(training): two-checkerboard camera↔room calibration for ADR-079 labels (ADR-152 §2.1.3)

  Defends the camera-supervised pipeline against PerceptAlign's "coordinate overfitting": MediaPipe keypoints were emitted in raw camera coordinates with no shared frame and no transceiver-geometry metadata, the exact label shape that memorizes deployment layout and collapses cross-layout.

  - scripts/calibrate-camera-room.py + calibration_lib.py: OpenCV two-checkerboard calibration → versioned bundle JSON (intrinsics, camera→room extrinsics, checkerboard spec, transceiver geometry, sha256 calibration_id). Intrinsics resolve from file > cache > multi-view computation > loud-warning 2-view fallback.
  - collect-ground-truth.py --calibration <bundle>: every sample gains keypoints_room (unit bearing rays from the camera center in the room frame; documented projective alignment; raw image coords preserved so training chooses), camera_origin_room, calibration_id, and the transceiver geometry stamp. Without the flag, output is byte-identical to before (tested) + a one-line ADR-152 warning.

  Design finding (recorded for ADR-152): a single planar checkerboard's corner grid is centrosymmetric, so the reversed corner ordering fits a ghost camera pose with IDENTICAL reprojection error and per-board flip disambiguation is mathematically ill-posed. solve_two_board_extrinsics solves the joint wall+floor set over all 4 flip combinations, where the minimum is unique: an independent reason the TWO-checkerboard method is required, beyond what PerceptAlign states.

  15 headless pytest tests green (synthetic corners: extrinsics recovery incl. ghost resolution, bundle round-trip + hash stability, ray transforms w/ distortion + cross-resolution, no-calibration byte identity).

  Co-Authored-By: RuFlo <ruv@ruv.net>

* feat(benchmarks): WiFlow-STD reproduction harness + measurement (a) results (ADR-152 §2.2)

  Shipped checkpoint REFUTED (0.08% PCK@20, wrong keypoint normalization); 6 reproducibility defects documented (broken imports, corrupted dataset tail with float32-max garbage that NaN-poisons fp16 BatchNorm, unreachable test phase). After repairs, retraining with upstream defaults reproduces 96.09% PCK@20 full-test / 96.61% corruption-free (published 97.25%) on RTX 5080. Claims graded MEASURED-EQUIVALENT; 2.23M params + ~0.055 GFLOPs verified. Third-party code/weights/data stay out of tree (gitignored).

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: ADR-152 Rust integrations + ADR-153 802.11bf protocol model

  - calibration: GeometryEmbedding, a 32-slot permutation-invariant NodeGeometry featurization for future LoRA-head conditioning (ADR-152 §2.1.2); derived SpecialistBank::geometry_embedding() accessor; 59 tests
  - train: MaePretrainConfig + patchify/random-mask with UNSW measured recipe (80% masking, (30,3) patches; ADR-152 §2.3, arXiv 2511.18792); strict no-truncate/no-NaN policy; proptest properties
  - train: WiFlowStdModel, a tch-gated port of the verified ~96%-PCK@20 WiFlow-STD architecture (ADR-152 §2.2 beyond-SOTA); ungated param formula pinned to 2,225,042; 15/17-keypoint support; 239 crate tests
  - hardware: ieee80211bf forward-compatibility protocol model (ADR-153): SpecProfile gates, SensingCapabilities negotiation, required ConsentMode, session FSM, SensingTransport + SimTransport + OpportunisticCsiBridge; full acceptance checklist covered; 156+4 tests
  - deps: ruvector bumps per ADR-152 §2.6 survey (mincut/solver 2.0.6, attention 2.1.0, gnn 2.2.0); vendor/ruvector synced to a083bd77f
  - docs: ADR-153 accepted; ADR-152 §2.2 status, §2.4 amendment, §2.6 added

  Workspace: 162 test suites green (--no-default-features); Python proof PASS. Known pre-existing flake: homecore-api env_empty_falls_back_to_defaults (unserialized env-var mutation); untouched, follow-up.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: CHANGELOG + CLAUDE.md entries for ADR-152 integrations and ADR-153

  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(train): repair tch-backend bit-rot; gated path compiles and tests run again

  Mechanical API refresh against current tch: Vec::from(Tensor) -> try_from (+ explicit flatten), numel() usize cast, Rem/div ops -> remainder() / divide_scalar_mode(floor) (the latter fixed a silent true-division bug in heatmap argmax decoding); clamp(1.0, f64::MAX) -> clamp_min (torch 2.x scalar overflow panic); petgraph EdgeRef import; missing EvalMetrics and verify_checkpoint_dir APIs that tests documented. wiflow_std roundtrip test uses safetensors (.pt _save_parameters roundtrip broken in torch 2.11 Windows). Gated: 349 passed (incl. all 20 wiflow_std); ungated: unchanged. Known pre-existing: gaussian-heatmap convention mismatch (2 tests), proof seed race under parallel threads (documented, deliberate follow-ups).

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(train): WiFlow-STD PyTorch->tch weight import + numerical parity proof

  export_to_safetensors.py maps the retrained checkpoint (295 tensors -> 248 mapped, param sum exactly 2,225,042; num_batches_tracked dropped) into a tch-loadable safetensors plus a deterministic parity fixture. Gated #[ignore] integration test loads it strictly and asserts forward-pass agreement: max abs diff 1.192e-7 on the seed-42 fixture. dump_variable_names test makes the tch name layout authoritative. Zero architecture discrepancies found.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: workflow-review findings (BN gamma init, ThresholdParams serde, init docs)

  Concurrent validation workflow (2 review lanes + adversarial verification, 13 agents): 5 confirmed findings, 3 refuted.

  Fixes:
  - wiflow_std: pin BatchNorm gamma to 1.0 (tch default draws Uniform(0,1), which silently halves activations in from-scratch training; loaded checkpoints unaffected, parity re-verified after the change)
  - wiflow_std: document the conv-init divergences vs the reference's effective kaiming_normal(fan_out) re-init (from-scratch dynamics only)
  - ieee80211bf: ThresholdParams deserialization validates via try_from so the <=100 invariant holds for untrusted payloads (+ rejection test)

  Benchmarks (release, ruvzen): GeometryEmbedding 1.84us/call (542k/s), MAE tokenization 7.38us/window (135k/s), 802.11bf FSM 8.9M events/s; nothing suspicious.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr): ADR-152 §2.1.4 gate resolved (PerceptAlign repo MIT, dataset on HF)

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): edge optimization measured + measurement (b) blocked + 92.9% retraction

  Edge optimization (ADR-152 optimize track): ONNX Runtime fp32 is the CPU latency win (3.2 ms/window, ~3.4x faster than torch, parity 2.4e-7); ORT dynamic int8 reaches 2.44 MB (paper's ~2.2 MB claim plausible only via conv-capable toolchains; -0.16pt PCK@20, +18% MPJPE, 2x slower); torch dynamic quant converts 0% of this conv-only model; fp16 halves storage free but is slower on CPU.

  Measurement (b) BLOCKED-ON-DATA: only 1,077 paired ESP32 windows exist (stop rule <2k). Forensic recheck of the surviving April holdout RETRACTS the ADR-079 '92.9% PCK@20' figure: constant-output model, absolute (not torso) threshold, 69 near-static frames. Mean predictor scores 100% under that protocol; torso-PCK@20 is 19.1%. Corroborates PR #535. Stale citations removed from user-guide, readme-details, ADR-152 §2.1.3; no-citation rule extended to ADR-079 accuracy claims. Unblock: >=2k-window multi-pose paired session + torso-PCK re-baseline.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(user-guide): corrected camera-supervised collection tutorial

  Step 0 CSI-rate check + session-length math (window yield = frames/20: the May session's 8x under-delivery was a ~12 Hz CSI rate, not an aligner bug); two-checkerboard calibration step (ADR-152 §2.1.3); pose-variety and confidence guidance; torso-normalized PCK + temporal-split + pred-variance eval protocol (lessons from the 92.9% retraction); scale presets re-keyed to realistic window counts.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): static PTQ int8 (calibrated) results + overnight capture script

  Conv-only static QDQ beats dynamic int8 on accuracy (PCK@20 96.61-96.63% vs 96.52%, MPJPE +10% vs +18% over fp32) at ~equal size/latency; all-ops QDQ strictly worse (int8 activations through attention glue). Entropy calibration verified bit-identical to MinMax on this data. Deployment: ONNX fp32 for speed (3.2ms), static conv-only QDQ for smallest (2.53MB). Also: scripts/overnight-empty-capture.py, a segmented UDP CSI recorder for empty-room baselines (no glob collisions, detach-safe).

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): measurement (b) MEASURED: optimization transfer only, mean-pose baseline wins

  WiFlow-STD fine-tuned on 2,046 fresh single-room ESP32 paired windows (temporal 70/15/15, 70->540 adapter, K=17): pretrained-init 65% PCK@20 vs scratch 0% (optimization transfer) but frozen-trunk ~0% (no feature transfer), and NOTHING beats the mean-pose baseline (95.9% PCK@20; single subject, near-static normalized coords). Honesty gates held: pred std 0.0113 (non-constant model) but mean-baseline dominance means no citable CSI->pose capability from this data. ADR-152 open question 1 answered partially; definitive answer needs multi-subject/position data.

  Two new aligner findings: heterogeneous csi_shape with silent zero-padding (~20%), and extractCsiMatrix's transposed shape label (frame-major data, [nSc, nFrames] label); fixes pending.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(benchmarks): efficiency sweep MEASURED: half model dominates full reference

  Compact WiFlow-STD variants on the same data/split/protocol: half (843,834 params, 0.38x) strictly dominates the 2.23M reference (PCK@20 96.62 vs 96.61, PCK@50 99.47 vs 99.11, MPJPE 0.00898 vs 0.0094); the published architecture is over-parameterized for its own benchmark. quarter (338k) 96.05%; tiny (56,290 params, 1/39.5) holds 94.11%, a ~220KB fp32 edge candidate. In-domain caveats recorded; cross-domain untested.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(train): compact WiFlow-STD presets in Rust + tiny edge artifact (ADR-152)

  WiFlowStdConfig gains half()/quarter()/tiny() mirroring the overnight sweep exactly: TcnGroupsMode (Fixed/Gcd/Depthwise), input_pw_groups, derived stride schedule and decoder-mid (all default to upstream behavior; legacy serde JSON unaffected). Param formulas pin to trained ground truth first try: 843,834 / 338,600 / 56,290; default 2,225,042 pin and 1.192e-7 parity unchanged. 248 tests green.

  Tiny edge artifact (tiny_edge_bench.py): ONNX fp32 = 295 KB, 0.66 ms/win (~1,500/s CPU), 94.11% PCK@20 (matches sweep clean-test exactly; parity 1.49e-7). Static int8 is a bad trade at this scale (-1.43pt, +19% MPJPE, -16% size, slower), recorded as negative result. Export note: width-16 breaks AdaptiveAvgPool((15,1)) TorchScript export; replaced by exact mean+matmul equivalent, proven by parity.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: resolve all 10 confirmed code-review findings (7-angle review, 20/20 verified)

  wiflow_std: min_feature_width (default 15) replaces the keypoints->stride coupling: for_keypoints(17) now provably builds the trained [2,2,2,2] graph and pools 15->17, matching the validated Python protocol (pinned by tests); param_count() total on invalid configs; random_mask returns Result and rejects non-finite/out-of-range ratios; trainer checkpoints switched to safetensors (.pt VarStore roundtrip broken on Windows torch 2.11).

  ieee80211bf: SBP proxy now re-triggers instances and relays reports via Action::RelaySbpReport -> SensingFrame::SbpReport (clients consume via their existing path); missed_instances reset on success = consecutive semantics; SessionTable gains a guarded SBP entry point + unknown-id drop counter; initiator-role sessions reject inbound setup/SBP requests (RejectedNotSupported) closing the idle hijack; StartSetup/StartSbp outside Idle return InvalidStateForCommand; SBP validation unified through evaluate_setup with a 1:1 SetupStatus->SbpStatus mapping. events.rs split out to honor the 500-line cap.

  calibration/cli: enrollment geometry now actually reaches trained banks: both production call sites attach .with_geometry; --geometry flag on train-room and POST /enroll/geometry + train-body geometry on calibrate-serve give production a recording surface; geometry-free banks log the ADR-152 §2.1.2 note.

  benchmarks: corruption masks committed as ground truth (unregenerable after in-place cleaning; verified bit-identical regeneration from the pristine copy) + generate_corruption_masks.py producer; _bench_common.py dedups the 5x-copied shim/evaluate/seed/remap (post-refactor PCK@20 re-verified equal to the last digit); remote scripts get the mmap patch; tiny_edge --calib validated multiple-of-64; onnx_bench --help no longer executes (and overwrote) the export (artifact restored byte-exact).

  Workspace: 2,963 tests passed, 0 failed; Python proof PASS.

  Co-Authored-By: claude-flow <ruv@ruv.net>

* ci: build workspace tests without debuginfo (runner disk exhaustion)

  The combined 38-crate debug target exceeds the GitHub runner's disk ('final link failed: No space left on device'); the same tree measured 151GB locally with full debuginfo. CARGO_PROFILE_{DEV,TEST}_DEBUG=0 shrinks the target ~5-10x; debuginfo serves no purpose in CI test runs.

  Co-Authored-By: claude-flow <ruv@ruv.net>
487 lines
17 KiB
YAML
name: Continuous Integration

on:
  push:
    branches: [ main, develop, 'feature/*', 'feat/*', 'hotfix/*' ]
  pull_request:
    branches: [ main, develop ]
  workflow_dispatch:

env:
  PYTHON_VERSION: '3.11'
  NODE_VERSION: '18'
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # Code Quality and Security Checks
  # The Python codebase moved to `archive/v1/` when the runtime was rewritten in
  # Rust under `v2/`. The lint/format/type/scan checks below still run against
  # the archive for hygiene, but with `continue-on-error: true` everywhere: the
  # archive is frozen reference code, not active development, so a stale lint
  # rule shouldn't gate PRs to the Rust workspace.
  code-quality:
    name: Code Quality & Security
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - name: Checkout code
        continue-on-error: true
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up Python
        continue-on-error: true
        uses: actions/setup-python@v6
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Install dependencies
        continue-on-error: true
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install black flake8 mypy bandit safety

      - name: Code formatting check (Black)
        continue-on-error: true
        run: black --check --diff archive/v1/src archive/v1/tests

      - name: Linting (Flake8)
        continue-on-error: true
        run: flake8 archive/v1/src archive/v1/tests --max-line-length=88 --extend-ignore=E203,W503

      - name: Type checking (MyPy)
        continue-on-error: true
        run: mypy archive/v1/src --ignore-missing-imports

      - name: Security scan (Bandit)
        run: bandit -r archive/v1/src -f json -o bandit-report.json
        continue-on-error: true

      - name: Dependency vulnerability scan (Safety)
        run: safety check --json --output safety-report.json
        continue-on-error: true

      - name: Upload security reports
        continue-on-error: true
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: security-reports
          path: |
            bandit-report.json
            safety-report.json

  # Rust Workspace Tests
  rust-tests:
    name: Rust Workspace Tests
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      # `wifi-densepose-desktop` is a Tauri v2 app: `glib-sys`, `gtk-sys`,
      # `webkit2gtk-sys`, etc. need the Linux dev libraries via pkg-config or the
      # workspace test fails at the build step before any test runs (every recent
      # main CI run has been red on this for exactly this reason). Install the
      # standard Tauri-on-Ubuntu set.
      - name: Install Tauri / GTK / serial system dev libraries
        run: |
          sudo apt-get update
          sudo apt-get install -y --no-install-recommends \
            libglib2.0-dev \
            libgtk-3-dev \
            libsoup-3.0-dev \
            libjavascriptcoregtk-4.1-dev \
            libwebkit2gtk-4.1-dev \
            libayatana-appindicator3-dev \
            librsvg2-dev \
            libxdo-dev \
            libudev-dev \
            libdbus-1-dev \
            libssl-dev \
            pkg-config

      - name: Install Rust toolchain
        uses: dtolnay/rust-toolchain@stable

      # Swatinem/rust-cache replaces a naive `actions/cache` of the whole
      # `v2/target`. That manual cache of a 38-crate target dir (multi-GB) was an
      # intermittent failure source; several CI runs this cycle died at the
      # cache/setup step (after toolchain install, before "Run Rust tests"),
      # needing a rerun. rust-cache is purpose-built for Rust: it caches the
      # registry + git + a pruned target, evicts stale deps, and restores far more
      # reliably (and faster) on large workspaces. `workspaces: v2` points it at
      # the v2/ cargo workspace (keys on v2/Cargo.lock, caches v2/target).
      - name: Cache cargo (Swatinem/rust-cache)
        uses: Swatinem/rust-cache@v2
        with:
          workspaces: v2

      # The 38-crate workspace debug build exhausts the runner's disk when built
      # with full debuginfo (observed: "final link failed: No space left on
      # device" once the engine/benchmark crates landed; the same tree's local
      # debug target measured 151 GB). Debuginfo is useless in CI (tests either
      # pass or print their failure), so build without it; target shrinks ~5-10x.
      - name: Run Rust tests
        working-directory: v2
        env:
          CARGO_PROFILE_DEV_DEBUG: "0"
          CARGO_PROFILE_TEST_DEBUG: "0"
        run: cargo test --workspace --no-default-features

      - name: Run ADR-147 worldmodel tests
        working-directory: v2
        env:
          CARGO_PROFILE_DEV_DEBUG: "0"
          CARGO_PROFILE_TEST_DEBUG: "0"
        run: cargo test -p wifi-densepose-worldmodel --no-default-features

      # ADR-134 CIR tests are behind the `cir` feature so the bench dependency
      # (Criterion) only pulls when actually exercised. Run them as a separate
      # step so a CIR-only regression is unambiguously attributable.
      - name: Run ADR-134 CIR tests
        working-directory: v2
        run: cargo test -p wifi-densepose-signal --no-default-features --features cir --tests

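      # Sketch, not part of the original workflow (an assumption about intent):
      # the CIR step above does not set the CARGO_PROFILE_*_DEBUG overrides that
      # the two preceding test steps use, so its `--features cir` build falls
      # back to the default dev/test profile, which includes debuginfo. If the
      # debuginfo-free setting is meant to cover every cargo invocation in this
      # job, one option is to declare it once at job level instead of per step:
      #
      #   rust-tests:
      #     env:
      #       CARGO_PROFILE_DEV_DEBUG: "0"
      #       CARGO_PROFILE_TEST_DEBUG: "0"
      #
      # Job-level `env` is inherited by all steps in the job, so any later step
      # that invokes cargo would pick up the same profile settings.
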
      # ADR-134 + ADR-028 witness guard. The CIR proof runner produces a
      # bit-deterministic SHA-256 over CirEstimator output on the synthetic
      # reference signal. Any algorithmic regression (changes to ISTA
      # convergence, sensing matrix construction, soft-thresholding, or input
      # padding) breaks the hash and fails the build. To regenerate after an
      # *intentional* change:
      #   cd v2 && cargo run -p wifi-densepose-signal --bin cir_proof_runner \
      #     --release --no-default-features -- --generate-hash \
      #     > ../archive/v1/data/proof/expected_cir_features.sha256
      - name: ADR-134 CIR witness proof (determinism guard)
        run: bash scripts/verify-cir-proof.sh

      - name: ADR-135 calibration witness proof (determinism guard)
        run: bash scripts/verify-calibration-proof.sh

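      # Sketch only, under a stated assumption: the contents of
      # scripts/verify-cir-proof.sh are not shown here, so this is not a
      # description of that script. Because the regenerate command in the
      # comment above writes the runner's stdout to the committed .sha256 file,
      # a hypothetical inline guard could regenerate the hash and diff it
      # against that file:
      #
      #   - name: CIR witness proof (hypothetical inline variant)
      #     working-directory: v2
      #     run: |
      #       cargo run -p wifi-densepose-signal --bin cir_proof_runner \
      #         --release --no-default-features -- --generate-hash \
      #         | diff - ../archive/v1/data/proof/expected_cir_features.sha256
      #
      # `diff` exits non-zero on any mismatch, which fails the step.
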
  # Unit and Integration Tests
  # Python pytest matrix: runs against the archived v1 Python tree.
  # `continue-on-error: true` for the same reason as code-quality above:
  # the archive is frozen reference, not blocking the Rust workspace PRs.
  test:
    name: Tests
    runs-on: ubuntu-latest
    continue-on-error: true
    strategy:
      fail-fast: false
      matrix:
        python-version: ['3.10', '3.11', '3.12']
    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test_wifi_densepose
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 5432:5432

      redis:
        image: redis:7
        options: >-
          --health-cmd "redis-cli ping"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
          - 6379:6379

    steps:
      - name: Checkout code
        continue-on-error: true
        uses: actions/checkout@v4

      - name: Set up Python ${{ matrix.python-version }}
        continue-on-error: true
        uses: actions/setup-python@v6
        with:
          python-version: ${{ matrix.python-version }}
          cache: 'pip'

      - name: Install dependencies
        continue-on-error: true
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest-cov pytest-xdist

      - name: Run unit tests
        continue-on-error: true
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_wifi_densepose
          REDIS_URL: redis://localhost:6379/0
          ENVIRONMENT: test
        run: |
          pytest archive/v1/tests/unit/ -v --cov=archive/v1/src --cov-report=xml --cov-report=html --junitxml=junit.xml

      - name: Run integration tests
        continue-on-error: true
        env:
          DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_wifi_densepose
          REDIS_URL: redis://localhost:6379/0
          ENVIRONMENT: test
        run: |
          pytest archive/v1/tests/integration/ -v --junitxml=integration-junit.xml

      - name: Upload coverage reports
        continue-on-error: true
        uses: codecov/codecov-action@v6
        with:
          file: ./coverage.xml
          flags: unittests
          name: codecov-umbrella

      - name: Upload test results
        continue-on-error: true
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: test-results-${{ matrix.python-version }}
          path: |
            junit.xml
            integration-junit.xml
            htmlcov/

  # Performance and Load Tests
  # NOTE: tests/performance/locustfile.py and the src.api.main app path both
  # predate the v1→archive/v1 reorganisation. continue-on-error: true until a
  # proper locust suite is added under archive/v1/tests/performance/.
  performance-test:
    name: Performance Tests
    runs-on: ubuntu-latest
    needs: [test]
    continue-on-error: true
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest # the perf suite is pytest, not locust

      # No "Start application" step: the gated test (test_frame_budget.py) drives
      # the CSIProcessor pipeline in-process and makes no HTTP calls, so the old
      # uvicorn server + `sleep 10` were dead weight: they only existed for the
      # now-excluded api_throughput/inference_speed tests, and on every run dumped
      # ~50 misleading "router requires hardware setup" ERROR lines for a server
      # no test touched. MOCK_POSE_DATA is server-only and unused here.

      - name: Run performance tests
        working-directory: archive/v1
        run: |
          # Gate only on the genuine, deterministic perf guard:
          # test_frame_budget.py times the *real* CSIProcessor pipeline against
          # the ADR 50 ms per-frame budget (single-frame, p95 over 100 frames,
          # +Doppler), a true regression signal.
          #
          # test_api_throughput.py / test_inference_speed.py are excluded: every
          # test there is a TDD red-phase stub (suffix `_should_fail_initially`)
          # that times a *mock that sleeps* (meaningless as a perf signal), with
          # machine-dependent wall-clock asserts (e.g. `actual_rps >= 40`,
          # `batch_time < individual_time`) that are inherently flaky on shared
          # CI runners, plus a cross-class fixture-scope bug. Forcing them green
          # would be manufacturing a false signal; they stay in-repo for local
          # TDD but do not gate CI until the underlying features are implemented.
          #
          # `python -m pytest` (not the bare `pytest` script) puts the cwd
          # (archive/v1) on sys.path so `from src.core...` resolves; the bare
          # script omits cwd and raises ModuleNotFoundError: No module named 'src'.
          # -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100.
          python -m pytest tests/performance/test_frame_budget.py \
            -o addopts="" -v --junitxml=perf-junit.xml

      - name: Upload performance results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: performance-results
          path: archive/v1/perf-junit.xml

  # Docker Build and Test
  # NOTE: the canonical Docker build for the sensing-server is now
  # `.github/workflows/sensing-server-docker.yml` (multi-registry push, asset
  # smoke tests, bearer-auth smoke tests; #520/#514/#443). This job predates
  # that workflow, points at a non-existent root `Dockerfile` with a
  # non-existent `target: production`, and pushes to a mis-cased image name.
  # It is `continue-on-error: true` until it's deleted or rewired to call the
  # new workflow, so it doesn't gate the rest of the pipeline.
  docker-build:
    name: Docker Build & Test
    runs-on: ubuntu-latest
    needs: [code-quality, test, rust-tests]
    continue-on-error: true
    steps:
      - name: Checkout code
        continue-on-error: true
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        continue-on-error: true
        uses: docker/setup-buildx-action@v3

      - name: Log in to Container Registry
        continue-on-error: true
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        continue-on-error: true
        id: meta
        uses: docker/metadata-action@v6
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=ref,event=pr
            type=sha,prefix={{branch}}-
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push Docker image
        continue-on-error: true
        uses: docker/build-push-action@v7
        with:
          context: .
          target: production
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          platforms: linux/amd64,linux/arm64

      - name: Test Docker image
        continue-on-error: true
        run: |
          docker run --rm -d --name test-container -p 8000:8000 ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          sleep 10
          curl -f http://localhost:8000/health || exit 1
          docker stop test-container

      - name: Run container security scan
        continue-on-error: true
        uses: aquasecurity/trivy-action@ed142fd0673e97e23eac54620cfb913e5ce36c25 # v0.36.0
        with:
          image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
          format: 'sarif'
          output: 'trivy-results.sarif'

      - name: Upload Trivy scan results
        continue-on-error: true
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: 'trivy-results.sarif'

  # API Documentation
  docs:
    name: API Documentation
    runs-on: ubuntu-latest
    needs: [docker-build]
    if: github.ref == 'refs/heads/main'
    permissions:
      contents: write # gh-pages deploy needs write (GITHUB_TOKEN is read-only by default -> 403)
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: ${{ env.PYTHON_VERSION }}
          cache: 'pip'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt

      - name: Generate OpenAPI spec
        working-directory: archive/v1
        env:
          MOCK_POSE_DATA: "true" # no CSI hardware in CI
        run: |
          python -c "
          from src.api.main import app
          import json
          with open('openapi.json', 'w') as f:
              json.dump(app.openapi(), f, indent=2)
          "

      - name: Deploy to GitHub Pages
        uses: peaceiris/actions-gh-pages@v4
        continue-on-error: true # openapi generation above is the real validation; deploy is best-effort (Pages may be disabled)
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./docs
          destination_dir: api-docs

  # Notification
  notify:
    name: Notify
    runs-on: ubuntu-latest
    needs: [code-quality, test, rust-tests, performance-test, docker-build, docs]
    if: always()
    permissions:
      contents: write # required by softprops/action-gh-release
    # GitHub Actions does not allow `secrets.X` directly in step-level `if:`
    # expressions, only `env.X`. Promote the secret to env at job scope so
    # the gating expression below is parseable.
    env:
      SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    steps:
      - name: Notify Slack on success
        if: ${{ env.SLACK_WEBHOOK_URL != '' && needs.code-quality.result == 'success' && needs.test.result == 'success' && needs.docker-build.result == 'success' }}
        uses: 8398a7/action-slack@v3
        with:
          status: success
          channel: '#ci-cd'
          text: '✅ CI pipeline completed successfully for ${{ github.ref }}'

      - name: Notify Slack on failure
        if: ${{ env.SLACK_WEBHOOK_URL != '' && (needs.code-quality.result == 'failure' || needs.test.result == 'failure' || needs.docker-build.result == 'failure') }}
        uses: 8398a7/action-slack@v3
        with:
          status: failure
          channel: '#ci-cd'
          text: '❌ CI pipeline failed for ${{ github.ref }}'

      - name: Create GitHub Release
        if: github.ref == 'refs/heads/main' && needs.docker-build.result == 'success'
        uses: softprops/action-gh-release@v2
        with:
          tag_name: v${{ github.run_number }}
          name: Release v${{ github.run_number }}
          body: |
            Automated release from CI pipeline

            **Changes:**
            ${{ github.event.head_commit.message }}

            **Docker Image:**
            `${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}`
          draft: false
          prerelease: false