Compare commits

...

10 Commits

Author SHA1 Message Date
rUv 88b835dd89 fix(ci): perf job gates on the real frame-budget guard, not TDD stubs (#915)
After #914 fixed collection, the perf job actually ran the suite and
exposed that test_api_throughput.py / test_inference_speed.py are TDD
red-phase stubs (every test suffixed `_should_fail_initially`) that time
a *mock that sleeps* — not a real perf signal. They carry machine-
dependent wall-clock asserts (actual_rps >= 40, batch_time < individual_time)
that are inherently flaky on shared CI runners, plus a cross-class
fixture-scope bug (`fixture 'standard_model' not found`). Result: 3 failed,
10 errored — by design, not a regression.

Forcing those green would manufacture a false signal. Instead, gate only
on test_frame_budget.py, which times the *real* CSIProcessor pipeline
against the ADR 50 ms per-frame budget (single-frame, p95/100-frames,
+Doppler) — a genuine regression guard. Verified locally: 3 passed.

The stub files remain in-repo for local TDD; they re-enter CI when their
features are implemented and the mock-timing asserts are made deterministic.
2026-06-02 18:31:55 +02:00
rUv f8f08076eb fix(ci): perf tests — use python -m pytest so src import resolves (#914)
The Performance Tests job collected 26 items then aborted with
`ModuleNotFoundError: No module named 'src'` on test_frame_budget.py,
which does `from src.core.csi_processor import CSIProcessor`. The bare
`pytest` console script does not put the cwd (archive/v1) on sys.path;
`python -m pytest` does. pytest aborts the whole session on a collection
error, so this one import masked the entire (otherwise mock-based,
self-contained) perf suite.

Verified locally: bare-script path reproduces the exact error; `-m`
resolves it and test_frame_budget.py passes 3/3. The other two files
(test_api_throughput.py mock server, test_inference_speed.py MockPoseModel
+psutil) are fully self-contained — no test hits the running server.

Closes the last red job in the v1-API CI chain (#910/#911/#913).
2026-06-02 18:12:00 +02:00
rUv 55f6a74e1e Merge pull request #913 from ruvnet/fix/ci-v1-api-perms-locust
ci(v1-api): fix gh-pages 403 + run real pytest perf suite
2026-06-02 17:36:43 +02:00
ruv b5a91c5635 ci(v1-api): install pytest, drop root --cov addopts for perf suite, ascii comment 2026-06-02 17:29:04 +02:00
ruv 308d2fc89d ci(v1-api): fix gh-pages 403 + run real perf suite — green main CI
Two more latent v1-API CI bugs surfaced once #910/#911 let the jobs reach
their later steps:

- API Documentation: openapi generation now succeeds (psutil fix), but the
  gh-pages deploy failed with HTTP 403 — the job had no `permissions` block
  and GITHUB_TOKEN is read-only by default. Add `permissions: contents:
  write`, and make the deploy `continue-on-error` (the openapi generation is
  the real validation; Pages may be disabled).
- Performance Tests: ran `locust -f tests/performance/locustfile.py`, but
  there is no locustfile — the suite is pytest (test_api_throughput.py,
  test_frame_budget.py, test_inference_speed.py). Run pytest instead, with
  working-directory: archive/v1 and MOCK_POSE_DATA=true.

ci.yml validated as well-formed YAML.
2026-06-02 17:26:39 +02:00
rUv 5038e3c8e1 Merge pull request #911 from ruvnet/fix/ci-v1-api-mock-mode
ci(v1-api): MOCK_POSE_DATA + declare psutil — green Performance Tests & API Docs
2026-06-02 06:20:21 -04:00
ruv e239af3636 fix(deps): declare psutil in requirements.txt — green API Documentation CI
The API Documentation job (and any env without locust) failed with
`ModuleNotFoundError: No module named 'psutil'` when importing the app:
psutil is imported by src/api/routers/health.py, services/metrics.py,
commands/status.py, and tasks/monitoring.py, but was never declared as a
dependency — it only happened to be present where locust (Performance
Tests) pulled it in transitively. Declare it explicitly (psutil>=5.9.0).

Verified locally: `from src.api.main import app; app.openapi()` (the exact
docs-job operation) now succeeds.
2026-06-02 12:11:55 +02:00
ruv 4856afbd0c ci(v1-api): run Performance Tests + API Docs with MOCK_POSE_DATA=true
After the DensePoseHead startup fix (#910), the v1 API starts, but the
Performance Tests load-hit the pose endpoints which error "requires real
CSI data" (no hardware in CI, mock_pose_data defaults False), and the
API-docs job imports the app the same way. Set MOCK_POSE_DATA=true on both
jobs so they exercise the mock path. Verified: the env var maps to
settings.mock_pose_data=True (pydantic, no env_prefix).

(Note: Performance Tests is continue-on-error so this is cleanup, not a
run-blocker; the run-level red on main has been transient Docker Hub pull
timeouts on Tests/docker-build, which are infra flakes that pass on re-run.)
2026-06-02 12:04:58 +02:00
rUv 4d205a05c4 Merge pull request #910 from ruvnet/fix/v1-pose-service-densepose-config
fix(v1-api): pass required config to DensePoseHead — green main CI
2026-06-02 05:50:25 -04:00
ruv bc42ae7903 fix(v1-api): pass required config to DensePoseHead — green main CI
The "Continuous Integration" workflow (Performance Tests + API
Documentation jobs) has failed on every main commit since the API start
path was exercised: pose_service._initialize_models() called
`DensePoseHead()` with no args, but DensePoseHead.__init__ requires a
config dict → "TypeError: DensePoseHead.__init__() missing 1 required
positional argument: 'config'" → uvicorn "Application startup failed".

Pass a config: input_channels=256 (matches the modality translator's
output), num_body_parts=24 (DensePose standard), num_uv_coordinates=2.
Both call sites (with/without pose_model_path) fixed.

Verified locally: DensePoseHead(config) + ModalityTranslationNetwork(config)
both construct + eval, clearing the startup TypeError.
2026-06-02 11:42:52 +02:00
3 changed files with 48 additions and 6 deletions
+35 -3
View File
@@ -265,23 +265,50 @@ jobs:
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install locust
pip install pytest # the perf suite is pytest, not locust
- name: Start application
working-directory: archive/v1
env:
# No CSI hardware in CI — serve mock pose data so the pose endpoints
# respond 200 under load instead of erroring "requires real CSI data".
MOCK_POSE_DATA: "true"
run: |
uvicorn src.api.main:app --host 0.0.0.0 --port 8000 &
sleep 10
- name: Run performance tests
working-directory: archive/v1
env:
MOCK_POSE_DATA: "true"
run: |
locust -f tests/performance/locustfile.py --headless --users 50 --spawn-rate 5 --run-time 60s --host http://localhost:8000
# Gate only on the genuine, deterministic perf guard:
# test_frame_budget.py times the *real* CSIProcessor pipeline against
# the ADR 50 ms per-frame budget (single-frame, p95 over 100 frames,
# +Doppler) — a true regression signal.
#
# test_api_throughput.py / test_inference_speed.py are excluded: every
# test there is a TDD red-phase stub (suffix `_should_fail_initially`)
# that times a *mock that sleeps* — meaningless as a perf signal, with
# machine-dependent wall-clock asserts (e.g. `actual_rps >= 40`,
# `batch_time < individual_time`) that are inherently flaky on shared
# CI runners, plus a cross-class fixture-scope bug. Forcing them green
# would be manufacturing a false signal; they stay in-repo for local
# TDD but do not gate CI until the underlying features are implemented.
#
# `python -m pytest` (not the bare `pytest` script) puts the cwd
# (archive/v1) on sys.path so `from src.core...` resolves — the bare
# script omits cwd and raises ModuleNotFoundError: No module named 'src'.
# -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100.
python -m pytest tests/performance/test_frame_budget.py \
-o addopts="" -v --junitxml=perf-junit.xml
- name: Upload performance results
if: always()
uses: actions/upload-artifact@v4
with:
name: performance-results
path: locust_report.html
path: archive/v1/perf-junit.xml
# Docker Build and Test
# NOTE: the canonical Docker build for the sensing-server is now
@@ -367,6 +394,8 @@ jobs:
runs-on: ubuntu-latest
needs: [docker-build]
if: github.ref == 'refs/heads/main'
permissions:
contents: write # gh-pages deploy needs write (GITHUB_TOKEN is read-only by default -> 403)
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -384,6 +413,8 @@ jobs:
- name: Generate OpenAPI spec
working-directory: archive/v1
env:
MOCK_POSE_DATA: "true" # no CSI hardware in CI
run: |
python -c "
from src.api.main import app
@@ -394,6 +425,7 @@ jobs:
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v4
continue-on-error: true # openapi generation above is the real validation; deploy is best-effort (Pages may be disabled)
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./docs
+12 -3
View File
@@ -107,16 +107,25 @@ class PoseService:
async def _initialize_models(self):
"""Initialize neural network models."""
try:
# Initialize DensePose model
# Initialize DensePose model. DensePoseHead requires a config
# dict — input_channels matches the modality translator's output
# (256), with the standard DensePose 24 body parts and 2 (U,V)
# coordinates. (Previously called with no args → TypeError at
# startup, which broke the API service.)
densepose_config = {
'input_channels': 256,
'num_body_parts': 24,
'num_uv_coordinates': 2,
}
if self.settings.pose_model_path:
self.densepose_model = DensePoseHead()
self.densepose_model = DensePoseHead(densepose_config)
# Load model weights if path is provided
# model_state = torch.load(self.settings.pose_model_path)
# self.densepose_model.load_state_dict(model_state)
self.logger.info("DensePose model loaded")
else:
self.logger.warning("No pose model path provided, using default model")
self.densepose_model = DensePoseHead()
self.densepose_model = DensePoseHead(densepose_config)
# Initialize modality translation
config = {
+1
View File
@@ -36,3 +36,4 @@ scikit-learn>=1.2.0
# Monitoring dependencies
prometheus-client>=0.16.0
psutil>=5.9.0 # system metrics — imported by health.py / metrics.py / status.py / monitoring.py