ruvnet--RuView/.github/workflows/ci.yml
Workflow config file is invalid. Please check your config file: model.ReadWorkflow: yaml: unmarshal errors: line 91: mapping key "with" already defined at line 86
rUv e6f26e9ac9 docs(adr): deep review of the RuView npm surface — ADR-263/264/265 optimization strategies (#1229)
* docs(adr): deep review of the RuView npm surface — ADR-263/264/265 optimization strategies

ADR-263 — @ruvnet/ruview@0.1.0 harness review (O1–O9):
- HIGH: claim-check CLI fails open on empty input (no --text/--file -> PASS exit 0)
- HIGH: MCP stdio server head-of-line blocking (spawnSync verify/calibrate up to 600s)
- MEASURED: optionalDependencies triple the cold npx install (4 pkgs/620kB/71 files
  vs 1 pkg/172kB/22 files with --omit=optional) for a path that never imports them
- maxBuffer truncation, python -c port interpolation, version drift, duplicate skills,
  guardrail METRIC_TERMS substring false positives ('map'/'F1' — found by dogfooding
  claim-check on these very ADRs), zero CI

ADR-264 — @ruvnet/rvagent@0.1.0 + @ruv/ruview-cli review (O1–O9), verified against
the published registry tarball:
- HIGH: exports.require -> dist/index.cjs which is never built nor published
- MEASURED: 44 dead source-map files = 62,698B of the 188kB unpacked payload
- stdio-only server described as dual-transport; mixed dot/underscore tool names;
  double Zod validation + hand-duplicated advertised schemas; 2-fd leak per training
  job; unbounded body in the unwired HTTP scaffold; dead detectCogBinary candidates;
  ruview bin-name collision

ADR-265 — cross-cutting npm distribution strategy: npm-packages.yml CI matrix
(test + pack-content/size gate + tarball-install smoke test), publish-from-CI-only
with npm provenance, version single-sourcing from package.json, bin/namespace
ownership (ruview bin belongs to @ruvnet/ruview), claim-check on package READMEs.

Docs only — no runtime code changed. Index/CHANGELOG/CLAUDE.md/README counts updated.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01WrGfTGKv1oWZ6iwXZACULz

* fix(npm): implement ADR-263/264/265 — harness fail-closed + async MCP, rvagent packaging/transport/naming, npm CI+provenance gate

ADR-263 (@ruvnet/ruview 0.2.0), O1-O9:
- claim-check fails closed on empty input (CLI exit 2, empty_text tool error)
- MCP stdio server dispatches tools/call asynchronously (promise-based spawn);
  ping answers while a 3s fake verify runs — pinned by new e2e test
- optionalDependencies dropped: cold npx installs exactly 1 package
  (MEASURED: was 4 pkgs/620kB/71 files via npm i in a clean prefix)
- bounded rolling output tails replace spawnSync 1MiB maxBuffer
- node_monitor port passed via sys.argv, never spliced into python -c source
- serverInfo.version read from package.json; resources/prompts stubs
- skills single-sourced: prepack sync script generates .claude/skills/ copies
- which() = memoized dep-free PATH scan
- tools underscore-canonical (ruview_claim_check, ...) + dotted aliases
- guardrail precision: word-boundary map/f1/auc/iou, code-span + F1/O2 label
  scrubbing, quantitative-claims-only; packaging reproducer hints
- 30/30 tests (was 17), incl. concurrency e2e + fail-open regression pins
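
The guardrail precision item above (word-boundary matching for map/f1/auc/iou) can be illustrated with a small sketch. This is not the package's actual Node code: the term list, function names, and the choice of Python are assumptions made only to contrast substring matching with word-boundary matching.

```python
import re

# Hypothetical term list; the real METRIC_TERMS contents are not shown in this commit.
METRIC_TERMS = ("map", "f1", "auc", "iou")

# \b anchors each term at word boundaries, so "map" cannot match inside "roadmap".
_TERM_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in METRIC_TERMS) + r")\b",
    re.IGNORECASE,
)


def substring_hits(text):
    """Naive matching: flags any term that appears anywhere in the text."""
    lowered = text.lower()
    return [t for t in METRIC_TERMS if t in lowered]


def word_boundary_hits(text):
    """Stricter matching: flags a term only when it stands alone as a word."""
    return [m.group(0).lower() for m in _TERM_RE.finditer(text)]
```

With the naive version, a sentence that merely mentions a roadmap is flagged as a metric claim; the word-boundary version flags only standalone metric tokens.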

ADR-264 (@ruvnet/rvagent 0.2.0), O1-O9:
- exports fixed: types-first, phantom dist/index.cjs require target removed
- tarball map-free: 127,704B unpacked / 46 files / 0 maps (MEASURED,
  npm pack --dry-run; was 188kB incl. 44 maps referencing unshipped src)
- Streamable HTTP actually wired behind RVAGENT_HTTP_PORT: one transport +
  one MCP server per session (mcp-session-id routing), 1MiB body cap (413),
  port-aware localhost origin gate; dual-transport description now true
- tools renamed underscore-canonical with dotted router-only aliases
- single Zod validation gate; advertised inputSchema generated from the same
  Zod source (zod-to-json-schema)
- train_count: parent log fds closed (was leaking 2/job); job records
  persisted to <jobsDir>/<id>.json (job_status survives restarts); bounded
  log-tail reads
- detectCogBinary probes its candidates instead of dead-coding them
- version from package.json; @types/express dropped; @types/jest -> 29
- README rewritten to match reality (no phantom subcommands/policy layer)
- 99/99 jest tests (incl. new session/body-cap suite + previously-broken
  manifest suite); stdio handshake + HTTP session flow smoke-tested live
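
The "bounded log-tail reads" item above can be sketched as follows. This is an illustration in Python, not the rvagent implementation: the function names and the 64 KiB default are assumptions, and the real cap is not stated in this commit.

```python
import os
import tempfile


def read_log_tail(path, max_bytes=64 * 1024):
    """Return at most the last max_bytes bytes of a log file, decoded as text.

    The 64 KiB default is an arbitrary value chosen for this sketch.
    """
    with open(path, "rb") as fh:
        fh.seek(0, os.SEEK_END)
        size = fh.tell()
        # Seek to the start of the tail window, never before the start of the file.
        fh.seek(max(0, size - max_bytes))
        data = fh.read(max_bytes)
    # The cut can land mid-character, so replace undecodable bytes instead of raising.
    return data.decode("utf-8", errors="replace")


def demo_tail():
    """Write a small log file and read back only its last 5 bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "job.log")
        with open(path, "wb") as fh:
            fh.write(b"x" * 1000 + b"done\n")
        return read_log_tail(path, max_bytes=5)
```

Memory use stays proportional to `max_bytes` rather than to the size of the log, which is the point of bounding the read.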

ADR-265 D1-D4:
- .github/workflows/npm-packages.yml: 3-package x Node 20/22 gate — tests,
  version-literal grep (D3), pack-content/size gate, tarball-install smoke
  test (catches the ADR-264 F1 class), README claim-check (D4)
- .github/workflows/ruview-npm-release.yml: publish from CI only with
  npm publish --provenance
- @ruv/ruview-cli bin renamed ruview-cli (ruview bin belongs to
  @ruvnet/ruview); version single-sourced
- ci.yml NODE_VERSION 18 -> 20

ADR statuses updated to Accepted/implemented; harness manifest re-pinned;
ADR-263/264/265 + both package READMEs pass claim-check.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01WrGfTGKv1oWZ6iwXZACULz

* perf(rvagent): lazy-load HTTP transport + memoize generated tool schemas

stdio time-to-first-response ~242ms -> ~189ms (-22%; MEASURED, median of
repeated initialize round-trips against dist/index.js in this container).

- ./http-transport.js now imported lazily inside the RVAGENT_HTTP_PORT
  branch: it chain-loads the MCP SDK streamableHttp module (~48ms MEASURED
  via per-module import() timing) which the default stdio path never uses
- toolInputJsonSchema memoized per tool: schemas are static for the process
  lifetime; under the session-per-server HTTP model every session calls
  tools/list, so stop re-walking the Zod tree each time
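
A rough sketch of the two techniques above (lazy loading of the HTTP transport and per-tool schema memoization), written in Python rather than the package's own Node code. The tool table, the schema shape, and the use of `http.server` as a stand-in for the heavy transport module are all assumptions for illustration.

```python
import functools
import importlib

# Hypothetical tool table; the real schemas are generated from Zod definitions.
TOOL_FIELDS = {"train_count": {"epochs": "integer"}}

# Counts how often a schema is actually built, to make the memoization visible.
BUILD_CALLS = {"count": 0}


@functools.lru_cache(maxsize=None)
def tool_input_schema(tool_name):
    """Build the advertised schema once per tool, then serve it from the cache.

    Callers share the cached dict, so they must treat it as read-only.
    """
    BUILD_CALLS["count"] += 1
    fields = TOOL_FIELDS[tool_name]
    return {
        "type": "object",
        "properties": {name: {"type": kind} for name, kind in fields.items()},
    }


def start_server(env):
    """Import the HTTP transport only when the HTTP branch is taken."""
    if env.get("RVAGENT_HTTP_PORT"):
        # Stand-in module: the default stdio path never pays this import cost.
        transport = importlib.import_module("http.server")
        return transport.__name__
    return "stdio"
```

Repeated `tools/list` calls then reuse the cached schema object, and a stdio-only process never loads the transport module.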

No behavior change: 99/99 jest tests; HTTP session flow re-smoke-tested
through the lazy import path (initialize -> 200 + mcp-session-id).

Profiled @ruvnet/ruview too and left it alone: 50ms CLI startup vs ~29ms
bare 'node -e ""' floor on the same box (MEASURED) — already near the
interpreter floor with zero dependencies.

Co-Authored-By: claude-flow <ruv@ruv.net>
Claude-Session: https://claude.ai/code/session_01WrGfTGKv1oWZ6iwXZACULz

* ci(ruview-cli): pass jest --passWithNoTests so the private no-test package doesn't fail the npm-packages matrix

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(npm): address 10 verified review findings in harness + rvagent before 0.2.0 publish

harness/ruview (@ruvnet/ruview):
- guardrails: digit gate now sees numbers inside code spans; F1-style
  metric tokens followed by ':' or a nearby number are no longer scrubbed
  (fail-open regressions in the honesty gate)
- mcp-server: tools/call requests serialize through a FIFO promise chain
  (hardware/mutating tools never overlap) while ping/tools/list stay
  immediate; stdin close drains in-flight responses before exit
- tools: which() no longer memoizes negative lookups
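
The mcp-server item above (tools/call serialized through a FIFO chain while ping stays immediate) can be sketched with asyncio. This is a Python illustration under stated assumptions, not the harness's promise-based Node code: the `Dispatcher` class, its method names, and the demo timings are hypothetical.

```python
import asyncio


class Dispatcher:
    """Runs tool calls strictly one after another; ping bypasses the chain."""

    def __init__(self):
        self._tail = None  # most recently queued tool-call task, if any

    async def _run_after(self, previous, job):
        if previous is not None:
            # Wait for the previous call; swallow its failure so the chain keeps moving.
            await asyncio.gather(previous, return_exceptions=True)
        return await job()

    def call_tool(self, job):
        """Queue a tool call behind every call queued before it."""
        task = asyncio.create_task(self._run_after(self._tail, job))
        self._tail = task
        return task

    async def ping(self):
        """Answer immediately, regardless of queued tool calls."""
        return "pong"


async def demo():
    dispatcher = Dispatcher()
    order = []

    async def slow():
        order.append("slow:start")
        await asyncio.sleep(0.2)
        order.append("slow:end")
        return "slow"

    async def fast():
        order.append("fast:start")
        return "fast"

    first = dispatcher.call_tool(slow)
    second = dispatcher.call_tool(fast)
    await asyncio.sleep(0.01)  # let the slow call start running
    order.append("ping:" + await dispatcher.ping())
    results = await asyncio.gather(first, second)
    return order, results
```

Running `asyncio.run(demo())` shows the ping answered while the slow call is still in flight, and the second tool call starting only after the first has finished.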

tools/ruview-mcp (@ruvnet/rvagent):
- index: realpath invoked-directly guard — library import no longer
  connects a stdio transport to the consumer's process
- http-transport: explicit allowedOrigins is exact-match only (localhost
  any-port convenience applies only with no configured allowlist);
  session map gains maxSessions=64 + 5min idle TTL sweep
- train-count: job records persist the child pid and reconcile stale
  'running' status after a server restart (exit-code marker or dead pid)
- config: cog binary candidates ordered by process.arch
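
The http-transport item above (maxSessions=64 plus a 5 min idle TTL sweep) can be sketched as a small bounded session map. This is a Python illustration, not the rvagent code: the class and method names are hypothetical, and rejecting new sessions once the cap is reached is an assumption, since the commit does not say what happens at the limit.

```python
import time


class SessionMap:
    """Bounded session store with idle-time eviction."""

    def __init__(self, max_sessions=64, idle_ttl_s=300.0, clock=time.monotonic):
        self._max = max_sessions
        self._ttl = idle_ttl_s
        self._clock = clock  # injectable so tests can control time
        self._sessions = {}  # session_id -> (value, last_seen)

    def sweep(self):
        """Drop sessions idle for longer than the TTL; return how many were dropped."""
        now = self._clock()
        stale = [sid for sid, (_, seen) in self._sessions.items() if now - seen > self._ttl]
        for sid in stale:
            del self._sessions[sid]
        return len(stale)

    def add(self, session_id, value):
        """Store a new session; return False when the cap is reached (assumed policy)."""
        self.sweep()
        if len(self._sessions) >= self._max:
            return False
        self._sessions[session_id] = (value, self._clock())
        return True

    def get(self, session_id):
        """Return the session value and refresh its idle timer, or None if absent."""
        entry = self._sessions.get(session_id)
        if entry is None:
            return None
        value, _ = entry
        self._sessions[session_id] = (value, self._clock())
        return value

    def __len__(self):
        return len(self._sessions)
```

Sweeping on insert keeps the sketch free of background timers; a periodic sweep would behave the same way for idle sessions.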

.github/workflows/ruview-npm-release.yml: port the full ADR-265 D1 gate
(version-literal check, unpacked-size budget, tarball-install smoke test)
from npm-packages.yml so the publish path enforces what the header claims.

Tests: harness 30→36, rvagent 99→112, all passing.

Co-Authored-By: claude-flow <ruv@ruv.net>

---------

Co-authored-by: Claude <noreply@anthropic.com>
2026-07-02 13:11:15 -04:00


name: Continuous Integration
on:
push:
branches: [ main, develop, 'feature/*', 'feat/*', 'hotfix/*' ]
pull_request:
branches: [ main, develop ]
workflow_dispatch:
env:
PYTHON_VERSION: '3.11'
NODE_VERSION: '20' # ADR-265: all Node packages in this repo declare engines >= 20
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
# Code Quality and Security Checks
# The Python codebase moved to `archive/v1/` when the runtime was rewritten in
# Rust under `v2/`. The lint/format/type/scan checks below still run against
# the archive for hygiene, but with `continue-on-error: true` everywhere — the
# archive is frozen reference code, not active development, so a stale lint
# rule shouldn't gate PRs to the Rust workspace.
code-quality:
name: Code Quality & Security
runs-on: ubuntu-latest
continue-on-error: true
steps:
- name: Checkout code
continue-on-error: true
uses: actions/checkout@v4
with:
submodules: recursive
fetch-depth: 0
- name: Set up Python
continue-on-error: true
uses: actions/setup-python@v6
with:
python-version: ${{ env.PYTHON_VERSION }}
cache: 'pip'
- name: Install dependencies
continue-on-error: true
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install black flake8 mypy bandit safety
- name: Code formatting check (Black)
continue-on-error: true
run: black --check --diff archive/v1/src archive/v1/tests
- name: Linting (Flake8)
continue-on-error: true
run: flake8 archive/v1/src archive/v1/tests --max-line-length=88 --extend-ignore=E203,W503
- name: Type checking (MyPy)
continue-on-error: true
run: mypy archive/v1/src --ignore-missing-imports
- name: Security scan (Bandit)
run: bandit -r archive/v1/src -f json -o bandit-report.json
continue-on-error: true
- name: Dependency vulnerability scan (Safety)
run: safety check --json --output safety-report.json
continue-on-error: true
- name: Upload security reports
continue-on-error: true
uses: actions/upload-artifact@v4
if: always()
with:
name: security-reports
path: |
bandit-report.json
safety-report.json
# Rust Workspace Tests
rust-tests:
name: Rust Workspace Tests
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
# ADR-262 P1: `wifi-densepose-rufield` path-deps the `vendor/rufield`
# submodule. Without a recursive checkout the workspace build fails to
# resolve those path deps in CI even though it passes locally.
with:
submodules: recursive
# `wifi-densepose-desktop` is a Tauri v2 app — `glib-sys`, `gtk-sys`,
# `webkit2gtk-sys`, etc. need the Linux dev libraries via pkg-config or the
# workspace test fails at the build step before any test runs (every recent
# main CI run has been red on this for exactly this reason). Install the
# standard Tauri-on-Ubuntu set.
- name: Install Tauri / GTK / serial system dev libraries
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
libglib2.0-dev \
libgtk-3-dev \
libsoup-3.0-dev \
libjavascriptcoregtk-4.1-dev \
libwebkit2gtk-4.1-dev \
libayatana-appindicator3-dev \
librsvg2-dev \
libxdo-dev \
libudev-dev \
libdbus-1-dev \
libssl-dev \
pkg-config
- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@stable
# Swatinem/rust-cache replaces a naive `actions/cache` of the whole
# `v2/target`. That manual cache of a 38-crate target dir (multi-GB) was an
# intermittent failure source — several CI runs this cycle died at the
# cache/setup step (after toolchain install, before "Run Rust tests"),
# needing a rerun. rust-cache is purpose-built for Rust: it caches the
# registry + git + a pruned target, evicts stale deps, and restores far more
# reliably (and faster) on large workspaces. `workspaces: v2` points it at
# the v2/ cargo workspace (keys on v2/Cargo.lock, caches v2/target).
- name: Cache cargo (Swatinem/rust-cache)
uses: Swatinem/rust-cache@v2
with:
workspaces: v2
# The 38-crate workspace debug build exhausts the runner's disk when built
# with full debuginfo (observed: "final link failed: No space left on
# device" once the engine/benchmark crates landed; the same tree's local
# debug target measured 151 GB). Debuginfo is useless in CI — tests either
# pass or print their failure — so build without it; target shrinks ~5-10x.
- name: Run Rust tests
working-directory: v2
env:
CARGO_PROFILE_DEV_DEBUG: "0"
CARGO_PROFILE_TEST_DEBUG: "0"
run: cargo test --workspace --no-default-features
- name: Run ADR-147 worldmodel tests
working-directory: v2
env:
CARGO_PROFILE_DEV_DEBUG: "0"
CARGO_PROFILE_TEST_DEBUG: "0"
run: cargo test -p wifi-densepose-worldmodel --no-default-features
# ADR-134 CIR tests are behind the `cir` feature so the bench dependency
# (Criterion) only pulls when actually exercised. Run them as a separate
# step so a CIR-only regression is unambiguously attributable.
- name: Run ADR-134 CIR tests
working-directory: v2
run: cargo test -p wifi-densepose-signal --no-default-features --features cir --tests
# ADR-134 + ADR-028 witness guard. The CIR proof runner produces a
# bit-deterministic SHA-256 over CirEstimator output on the synthetic
# reference signal. Any algorithmic regression — changes to ISTA
# convergence, sensing matrix construction, soft-thresholding, or input
# padding — breaks the hash and fails the build. To regenerate after an
# *intentional* change:
# cd v2 && cargo run -p wifi-densepose-signal --bin cir_proof_runner \
# --release --no-default-features -- --generate-hash \
# > ../archive/v1/data/proof/expected_cir_features.sha256
- name: ADR-134 CIR witness proof (determinism guard)
run: bash scripts/verify-cir-proof.sh
- name: ADR-135 calibration witness proof (determinism guard)
run: bash scripts/verify-calibration-proof.sh
# Unit and Integration Tests
# Python pytest matrix — runs against the archived v1 Python tree.
# `continue-on-error: true` for the same reason as code-quality above:
# the archive is frozen reference, not blocking the Rust workspace PRs.
test:
name: Tests
runs-on: ubuntu-latest
continue-on-error: true
strategy:
fail-fast: false
matrix:
python-version: ['3.10', '3.11', '3.12']
services:
postgres:
image: postgres:15
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: test_wifi_densepose
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 5432:5432
redis:
image: redis:7
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 6379:6379
steps:
- name: Checkout code
continue-on-error: true
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Python ${{ matrix.python-version }}
continue-on-error: true
uses: actions/setup-python@v6
with:
python-version: ${{ matrix.python-version }}
cache: 'pip'
- name: Install dependencies
continue-on-error: true
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest-cov pytest-xdist
- name: Run unit tests
continue-on-error: true
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_wifi_densepose
REDIS_URL: redis://localhost:6379/0
ENVIRONMENT: test
run: |
pytest archive/v1/tests/unit/ -v --cov=archive/v1/src --cov-report=xml --cov-report=html --junitxml=junit.xml
- name: Run integration tests
continue-on-error: true
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test_wifi_densepose
REDIS_URL: redis://localhost:6379/0
ENVIRONMENT: test
run: |
pytest archive/v1/tests/integration/ -v --junitxml=integration-junit.xml
- name: Upload coverage reports
continue-on-error: true
uses: codecov/codecov-action@v6
with:
files: ./coverage.xml
flags: unittests
name: codecov-umbrella
- name: Upload test results
continue-on-error: true
uses: actions/upload-artifact@v4
if: always()
with:
name: test-results-${{ matrix.python-version }}
path: |
junit.xml
integration-junit.xml
htmlcov/
# Performance and Load Tests
# NOTE: tests/performance/locustfile.py and the src.api.main app path both
# predate the v1→archive/v1 reorganisation. continue-on-error: true until a
# proper locust suite is added under archive/v1/tests/performance/.
performance-test:
name: Performance Tests
runs-on: ubuntu-latest
needs: [test]
continue-on-error: true
if: github.event_name == 'push' && github.ref == 'refs/heads/main'
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: ${{ env.PYTHON_VERSION }}
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
pip install pytest # the perf suite is pytest, not locust
# No "Start application" step: the gated test (test_frame_budget.py) drives
# the CSIProcessor pipeline in-process and makes no HTTP calls, so the old
# uvicorn server + `sleep 10` were dead weight — they only existed for the
# now-excluded api_throughput/inference_speed tests, and on every run dumped
# ~50 misleading "router requires hardware setup" ERROR lines for a server
# no test touched. MOCK_POSE_DATA is server-only and unused here.
- name: Run performance tests
working-directory: archive/v1
run: |
# Gate only on the genuine, deterministic perf guard:
# test_frame_budget.py times the *real* CSIProcessor pipeline against
# the ADR 50 ms per-frame budget (single-frame, p95 over 100 frames,
# +Doppler) — a true regression signal.
#
# test_api_throughput.py / test_inference_speed.py are excluded: every
# test there is a TDD red-phase stub (suffix `_should_fail_initially`)
# that times a *mock that sleeps* — meaningless as a perf signal, with
# machine-dependent wall-clock asserts (e.g. `actual_rps >= 40`,
# `batch_time < individual_time`) that are inherently flaky on shared
# CI runners, plus a cross-class fixture-scope bug. Forcing them green
# would be manufacturing a false signal; they stay in-repo for local
# TDD but do not gate CI until the underlying features are implemented.
#
# `python -m pytest` (not the bare `pytest` script) puts the cwd
# (archive/v1) on sys.path so `from src.core...` resolves — the bare
# script omits cwd and raises ModuleNotFoundError: No module named 'src'.
# -o addopts="" drops the root pyproject's --cov/--cov-fail-under=100.
python -m pytest tests/performance/test_frame_budget.py \
-o addopts="" -v --junitxml=perf-junit.xml
- name: Upload performance results
if: always()
uses: actions/upload-artifact@v4
with:
name: performance-results
path: archive/v1/perf-junit.xml
# Docker Build and Test
# NOTE: the canonical Docker build for the sensing-server is now
# `.github/workflows/sensing-server-docker.yml` (multi-registry push, asset
# smoke tests, bearer-auth smoke tests — #520/#514/#443). This job predates
# that workflow, points at a non-existent root `Dockerfile` with a
# non-existent `target: production`, and pushes to a mis-cased image name —
# `continue-on-error: true` until it's deleted or rewired to call the new
# workflow, so it doesn't gate the rest of the pipeline.
docker-build:
name: Docker Build & Test
runs-on: ubuntu-latest
needs: [code-quality, test, rust-tests]
continue-on-error: true
steps:
- name: Checkout code
continue-on-error: true
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Docker Buildx
continue-on-error: true
uses: docker/setup-buildx-action@v3
- name: Log in to Container Registry
continue-on-error: true
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
continue-on-error: true
id: meta
uses: docker/metadata-action@v6
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=ref,event=pr
type=sha,prefix={{branch}}-
type=raw,value=latest,enable={{is_default_branch}}
- name: Build and push Docker image
continue-on-error: true
uses: docker/build-push-action@v7
with:
context: .
target: production
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
platforms: linux/amd64,linux/arm64
- name: Test Docker image
continue-on-error: true
run: |
docker run --rm -d --name test-container -p 8000:8000 ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
sleep 10
curl -f http://localhost:8000/health || exit 1
docker stop test-container
- name: Run container security scan
continue-on-error: true
uses: aquasecurity/trivy-action@ed142fd0673e97e23eac54620cfb913e5ce36c25 # v0.36.0
with:
image-ref: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
format: 'sarif'
output: 'trivy-results.sarif'
- name: Upload Trivy scan results
continue-on-error: true
uses: github/codeql-action/upload-sarif@v3
if: always()
with:
sarif_file: 'trivy-results.sarif'
# API Documentation
docs:
name: API Documentation
runs-on: ubuntu-latest
needs: [docker-build]
if: github.ref == 'refs/heads/main'
permissions:
contents: write # gh-pages deploy needs write (GITHUB_TOKEN is read-only by default -> 403)
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
submodules: recursive
- name: Set up Python
uses: actions/setup-python@v6
with:
python-version: ${{ env.PYTHON_VERSION }}
cache: 'pip'
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r requirements.txt
- name: Generate OpenAPI spec
working-directory: archive/v1
env:
MOCK_POSE_DATA: "true" # no CSI hardware in CI
run: |
python -c "
from src.api.main import app
import json
with open('openapi.json', 'w') as f:
json.dump(app.openapi(), f, indent=2)
"
- name: Deploy to GitHub Pages
uses: peaceiris/actions-gh-pages@v4
continue-on-error: true # openapi generation above is the real validation; deploy is best-effort (Pages may be disabled)
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./docs
destination_dir: api-docs
# Notification
notify:
name: Notify
runs-on: ubuntu-latest
needs: [code-quality, test, rust-tests, performance-test, docker-build, docs]
if: always()
permissions:
contents: write # required by softprops/action-gh-release
# GitHub Actions does not allow `secrets.X` directly in step-level `if:`
# expressions — only `env.X`. Promote the secret to env at job scope so
# the gating expression below is parseable.
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
steps:
- name: Notify Slack on success
if: ${{ env.SLACK_WEBHOOK_URL != '' && needs.code-quality.result == 'success' && needs.test.result == 'success' && needs.docker-build.result == 'success' }}
uses: 8398a7/action-slack@v3
with:
status: success
channel: '#ci-cd'
text: '✅ CI pipeline completed successfully for ${{ github.ref }}'
- name: Notify Slack on failure
if: ${{ env.SLACK_WEBHOOK_URL != '' && (needs.code-quality.result == 'failure' || needs.test.result == 'failure' || needs.docker-build.result == 'failure') }}
uses: 8398a7/action-slack@v3
with:
status: failure
channel: '#ci-cd'
text: '❌ CI pipeline failed for ${{ github.ref }}'
- name: Create GitHub Release
if: github.ref == 'refs/heads/main' && needs.docker-build.result == 'success'
uses: softprops/action-gh-release@v2
with:
tag_name: v${{ github.run_number }}
name: Release v${{ github.run_number }}
body: |
Automated release from CI pipeline
**Changes:**
${{ github.event.head_commit.message }}
**Docker Image:**
`${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}`
draft: false
prerelease: false