Mirror of https://github.com/ruvnet/RuView, synced 2026-06-09 10:13:17 +00:00
main
15 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b6420ac9ba |
fix(server): make synthetic CSI opt-in only (sibling fix to #937) (#979)
Background

Issue #937 in the cognitum-v0 appliance repo flagged that the `cognitum-csi-capture` systemd unit shipped `--simulate` by default, silently serving synthetic CSI tagged as production telemetry on `/api/v1/sensor/stream`. That's a textbook trust-eroding pattern: the single most-cited "where's the real data?" evidence external reviewers (#943, #934) point at when they call the project AI-slop.

A grep across THIS tree surfaced the exact same anti-pattern in three places:

    docker/docker-compose.yml:27 # auto (default) — probe ESP32, fall back to simulation
    docker/docker-entrypoint.sh:14 # CSI_SOURCE — data source: auto (default), ...
    main.rs:6435 info!("No hardware detected, using simulation"); "simulate"

The sensing-server's `auto` source resolver at main.rs:6425-6440 silently fell back to synthetic with only an `info!` log line as the signal. Downstream consumers calling `/api/v1/sensing/latest` or `/ws/sensing` had no in-band way to know they were being served fake data.

Fix

`auto` now refuses to fall back. When neither ESP32 UDP nor host WiFi is detected, the server logs a clear `error!` explaining the situation and exits 78 (EX_CONFIG). The error message names the two ways to proceed: provision real hardware, or set `--source simulated` / `CSI_SOURCE=simulated` explicitly.

Existing operators who already use `--source simulated` (or its legacy `simulate` alias) are unaffected; the alias is preserved for back-compat.

Docker entrypoint comment, docker-compose comment, and the Tauri desktop app's source-default path also updated to reflect the new posture. The desktop app keeps its `simulated` default because it's an explicit demo product: the value passed downstream is the *explicit* `simulated`, not `auto`, so the server tags it correctly and never lies about its data source.

Validation

    cargo build -p wifi-densepose-sensing-server --no-default-features
    cargo test -p wifi-densepose-sensing-server --no-default-features

→ 122 / 122 pass, build clean (existing pre-fix warnings unchanged).

Deployment

⚠ Breaking change for unattended deployments that relied on the `auto → simulated` silent fallback. That is exactly the failure mode this PR fixes: pretending to serve real sensing data when the source is fake. Operators who genuinely want demo mode set `CSI_SOURCE=simulated` explicitly; the error message and the docker-compose comment both point them there. |
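
The resolver behaviour described in this commit can be sketched as follows, in Python for brevity (the server itself is Rust). The function name `resolve_source` and the concrete labels `esp32` and `wifi` are assumptions for illustration; only `auto`, `simulated`, the legacy `simulate` alias, and exit code 78 come from the commit message.

```python
import sys

EX_CONFIG = 78  # sysexits.h value for a configuration error


def resolve_source(requested, esp32_detected, wifi_detected):
    """Map the requested source to a concrete one; never fall back silently.

    `requested` is the value of --source / CSI_SOURCE. The detection flags
    stand in for whatever hardware probing the real server performs.
    """
    if requested in ("simulated", "simulate"):  # legacy alias kept for back-compat
        return "simulated"
    if requested != "auto":
        return requested  # any other explicit choice is passed through
    if esp32_detected:
        return "esp32"  # hypothetical label
    if wifi_detected:
        return "wifi"  # hypothetical label
    print(
        "error: no ESP32 UDP or host WiFi source detected; provision hardware "
        "or set --source simulated / CSI_SOURCE=simulated explicitly",
        file=sys.stderr,
    )
    sys.exit(EX_CONFIG)
```

Under these assumptions, synthetic data is only ever served when the operator asked for it by name, so the source label reported downstream always matches what was requested.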
||
|
|
c353255672 |
fix: firmware cluster — wasm3 IDF v6.0 build (#946) + swarm TLS stack (#949) + Docker unauth default (#864) (#975)
* fix(firmware,docker): clear three high-severity bugs in one sweep

Closes #946: wasm3 fails on Xtensa GCC 15.2.0 (ESP-IDF v6.0.1)

    cannot tail-call: machine description does not have a sibcall_epilogue instruction pattern

wasm3's `M3_MUSTTAIL return jumpOpImpl(...)` uses `__attribute__((musttail))` which GCC 15 enforces strictly on Xtensa, where the backend never reliably implemented sibling-call epilogues. Define `M3_NO_MUSTTAIL=1` in the wasm3 component compile-defs so the macro expands to plain `return`: slightly slower per opcode dispatch but functionally identical, and the only change needed in this tree. Older IDF / GCC builds accept the define as a no-op so the IDF v5.4 CI build is unchanged.

Closes #949: swarm task stack overflow on Seed TLS init

The reporter provisioned with `--seed-url https://...` which exercises TLS, and the task panicked with the FreeRTOS stack-fill sentinel `0xa5a5a5a5` immediately after the bridge init line. `SWARM_TASK_STACK` was 3 KB ("HTTP client uses ~2.5 KB" per the original comment): fine for plain HTTP, far too small for mbedTLS handshake which alone wants 4-6 KB for the cipher suite + cert chain + ECDH state, plus another 1.5-2 KB for esp_http_client. Bumped to 8192 with the why in the comment. Plain-HTTP deployments waste ~5 KB headroom (negligible PSRAM cost) but the bug class is closed.

Closes #864: Docker default exposes unauthenticated sensing API + WS

`docker-entrypoint.sh` started the sensing-server with `--bind-addr 0.0.0.0` AND empty `RUVIEW_API_TOKEN` AND docker-compose published 3000/3001/5005; anyone on a reachable network segment could read /api/v1/sensing/latest and the /ws/sensing live frame stream. Now the entrypoint refuses to start when:

    RUVIEW_API_TOKEN is empty
    AND RUVIEW_ALLOW_UNAUTHENTICATED is not "1"
    AND RUVIEW_BIND_ADDR is not loopback / localhost / ::1

…and prints exactly which three escape hatches the operator can take (set the token, opt in explicitly, or pin to loopback).

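
The three-way refusal condition above can be sketched as a pure function. This is an illustration in Python, not the shell logic in `docker-entrypoint.sh`; the helper names `is_loopback` and `must_refuse_start` are hypothetical, and the `0.0.0.0` default for an unset `RUVIEW_BIND_ADDR` is an assumption based on the bind address the entrypoint previously passed.

```python
import ipaddress


def is_loopback(bind_addr):
    """True for 'localhost' and for any loopback IP such as 127.0.0.1 or ::1."""
    if bind_addr == "localhost":
        return True
    try:
        return ipaddress.ip_address(bind_addr).is_loopback
    except ValueError:
        return False  # not an IP literal, so it cannot be treated as loopback


def must_refuse_start(env):
    """Refuse only when all three conditions from the commit message hold."""
    token = env.get("RUVIEW_API_TOKEN", "")
    allow = env.get("RUVIEW_ALLOW_UNAUTHENTICATED", "")
    bind_addr = env.get("RUVIEW_BIND_ADDR", "0.0.0.0")  # assumed default
    return token == "" and allow != "1" and not is_loopback(bind_addr)
```

Each escape hatch negates exactly one of the three conjuncts, which is why any single one of them is enough to let the server start.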
Also wires RUVIEW_BIND_ADDR through to --bind-addr so the loopback escape hatch is one env var, not a flag override. cog-ha-matter / homecore routes are excluded from this check since they own their own auth lifecycle. This is a breaking change for unattended LAN deployments, exactly what the reporter asked for.

Validation

* `idf.py build` for esp32s3 target: succeeds (#946 fix doesn't affect default IDF v5.4 build path).
* `idf.py set-target esp32c6 && idf.py build`: succeeds, binary 1015 KB / 45% partition free.
* Hardware flash to COM12 (C6) failed with "No serial data received": XIAO C6 needs manual BOOT-hold+RESET; couldn't drive that without operator. Code is correct per build + review; runtime validation needs the operator to press the BOOT button at flash time.
* docker-entrypoint.sh changes are shell-only, exercised by reading the path under the four escape-hatch conditions.

Out of scope: cross-repo issues

Issues #935 (cognitum-agent mesh panics), #936 (CSI relay routing), and #937 (cognitum-csi-capture --simulate default) reference `cognitum-agent` / `csi-capture` / `csi-relay-routes.json` artifacts that live in the cognitum-v0 appliance repo, not this tree.

Issue #954 (CSI callback never fires on S3 v0.6.5/v0.7.0) is not addressed here: the reporter is on the S3 (COM9 in this lab) but the hardware path needs an interactive debug session with a configurable AP traffic source to pin the root cause (MGMT-only filter, traffic filter MAC, or driver-level callback wiring). Will tackle in a follow-up.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(firmware): bump LWIP UDP / WiFi TX buffer pools to ease ENOMEM

Hardware validation on COM8 (S3) and COM9 (C6) surfaced a v0.7.0 regression not captured in the existing issue tracker: stock IDF v5.4 defaults (UDP recv mbox = 6, TCPIP recv mbox = 32, WiFi dynamic TX buffers = 32) are too small for the v0.7.0 packet mix once CSI promiscuous mode is active.
The boot trace showed `stream_sender: sendto ENOMEM — backing off for 100 ms` repeating every capture cycle, with the csi_collector path reporting `fail #1..5` within seconds of associating to an AP.

Modest bumps applied (~3 KB extra heap each):

    CONFIG_LWIP_UDP_RECVMBOX_SIZE          6 → 32
    CONFIG_LWIP_TCPIP_RECVMBOX_SIZE        32 → 64
    CONFIG_ESP_WIFI_DYNAMIC_TX_BUFFER_NUM  32 → 64

Empirical 25 s measurement on S3 / COM8 post-fix:

    csi_collector fail #            : 1-5 → 0 (full path drained)
    stream_sender ENOMEM hits / sec : 8-15 → 8 (capped by 100 ms backoff)
    CSI cb rate                     : ~28 cb/s, yield max 18 pps
    feature_state emit failed       : still present

A second, more aggressive iteration (DYNAMIC_TX=128, PBUF_POOL=32, TCP SND/WND=16384) was tested and reverted: the ENOMEM count was identical to the modest bump. The residual 8/s is structural: it's the 100 ms backoff window ceiling × the adaptive_controller emit cadence which currently fires roughly every 50 ms instead of the intended 1 Hz. Bigger buffers don't fix that; only rate-limiting the emitter does. Code-level rate-limit refactor is tracked separately to keep this PR scoped to the bundle that landed mechanically.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(firmware): rate-limit feature_state emit from 5 Hz → 1 Hz

Completes the ENOMEM cure that the LWIP/WiFi buffer bumps started.

Root cause (verified on COM8 / S3 + COM9 / C6)

`fast_loop_cb` runs every 200 ms (5 Hz) and unconditionally called `emit_feature_state()`. Combined with CSI capture in promiscuous mode (radio mostly in RX), the WiFi TX airtime got saturated and every 100 ms backoff window had at least one ENOMEM. Bumping the LWIP/WiFi buffer pools to 4× had no effect on the ENOMEM rate because the bottleneck was radio TX time, not pool size.

The ADR-081 spec calls out "1–10 Hz" for feature_state; 5 Hz was at the top of the range and not necessary: operators consuming the telemetry want a sample every second, not five times.
Dropping to 1 Hz frees ~80 % of the feature_state TX traffic.

Measurement on COM8 (25 s windows, otherwise-idle environment)

    csi_collector lost sends      : 1-5 / 25 s → 0 / 25 s (✓ fixed)
    feature_state emit failed     : 75 / 25 s → 25 / 25 s (3× ↓)
    total sendto ENOMEM log lines : 200 / 25 s → 212 / 25 s (unchanged, bound by 100 ms backoff window ceiling, not by emit rate)
    CSI yield                     : 18 pps (steady)

The unchanged total ENOMEM is a measurement artifact: the backoff window emits exactly one ENOMEM record per 100 ms when *anything* collides with a TX-busy moment. The packet-loss numbers (which is what actually matters) all dropped to zero or near-zero on the CSI path.

Implementation

Pure-static `s_emit_divider` counter in `fast_loop_cb`. Every 5th tick calls the emit. Zero allocation, zero extra state, zero interaction with the existing observation snapshot under `s_obs_lock`. Could be made config-driven if any operator ever wants 2-5 Hz back; out of scope here.

Co-Authored-By: claude-flow <ruv@ruv.net> |
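
The divider described under "Implementation" can be sketched as below. This is a Python illustration of the counting pattern only; the firmware is C, and the class name `FastLoop`, the constant name, and the choice to emit on the 5th, 10th, ... tick (rather than the 1st, 6th, ...) are assumptions.

```python
EMIT_EVERY_N_TICKS = 5  # 5 Hz fast loop divided by 5 gives a 1 Hz emit


class FastLoop:
    """Calls `emit` on every 5th tick; the other ticks do no TX work."""

    def __init__(self, emit):
        self._emit = emit
        self._divider = 0  # plays the role of the static s_emit_divider

    def tick(self):
        # Invoked once per 200 ms fast-loop period in this sketch.
        self._divider += 1
        if self._divider >= EMIT_EVERY_N_TICKS:
            self._divider = 0
            self._emit()
```

Over 25 ticks (5 s at 5 Hz) this calls `emit` 5 times, which is the 80 % reduction in feature_state traffic the commit message cites.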
||
|
|
8cb8a37dc4 |
feat(docker): bundle homecore-server (HOMECORE / ADRs 126-134) in the image
The HOMECORE native Rust port of Home Assistant landed in v0.10.0 (PR #800). The published Docker image now ships its binary alongside sensing-server and cog-ha-matter so a single `docker run` brings up the full RuView + HA-wire-compatible stack.

Dockerfile.rust:
- cargo build --release -p homecore-server in the build stage
- strip the new binary
- copy /app/homecore-server in the runtime stage
- sanity-check: image build now fails if /app/homecore-server isn't executable (same guard pattern that already covers sensing-server and cog-ha-matter)
- EXPOSE 8123 (HA-compat REST + WebSocket port; homecore-api binds 0.0.0.0:8123 by default per its --bind CLI flag)

docker-entrypoint.sh:
- new dispatch keyword: `homecore` or `homecore-server`
  Usage: docker run --network host ruvnet/wifi-densepose:latest homecore
  Defaults --bind to 0.0.0.0:8123 (overridable via HOMECORE_BIND env)

The existing two dispatch paths (no arg → sensing-server, `cog-ha-matter` → HA + Matter cog) keep working unchanged. Three-binary image, one entrypoint, operator picks the role at run time.

Triggers a workflow rebuild on push to main per the docker workflow's path filter; the multi-arch (amd64 + arm64) image will be published to Docker Hub as `ruvnet/wifi-densepose:latest` after CI green.

Refs ADRs 126-134, v0.10.0 release.

Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
9fda90f3e5 |
fix(docker): bump rust:1.85 → 1.89 (matches workspace rust-toolchain.toml)
Build failed on the multi-arch run: `time@0.3.47 requires rustc 1.88.0` and the workspace toolchain pin is already 1.89 (needed for ruvector-core's avx512f target_feature, mmap-rs edition 2024, hnsw_rs is_multiple_of). Dockerfile lagged on 1.85. Refs #794. Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
2154b6931c |
fix(docker): include HA-DISCO MQTT + cog-ha-matter; restores #794
Three changes:

1. Dockerfile.rust now builds sensing-server with `--features mqtt` (ADR-115 HA-DISCO publisher) and also builds + ships the cog-ha-matter binary (ADR-116 Home Assistant + Matter cog with mDNS, embedded broker, RuVector-backed thresholds, Ed25519 witness). Adds EXPOSE 1883 for the embedded MQTT broker.

2. docker-entrypoint.sh routes `docker run <image> cog-ha-matter ...` (or `ha-matter`) to /app/cog-ha-matter, defaulting --sensing-url to http://127.0.0.1:3000 so a docker-compose deployment works out of the box. The default entrypoint (no first arg) still launches sensing-server unchanged.

3. Workflow path filter now also fires on changes to v2/crates/wifi-densepose-bfld/** and v2/crates/cog-ha-matter/** so future iteration on those crates rebuilds the image.

DOCKERHUB_TOKEN rotated separately (was expired since 2026-05-13, which is why the last 5 workflow runs failed at the Docker Hub login step and `latest` on Docker Hub has stayed amd64-only despite #631 being merged). With this commit + rotated token, the next CI run should land a multi-arch `:latest` with HA-DISCO + cog-ha-matter + BFLD support.

Reproduced kutayozdur's pull failure on ruv-mac-mini (Apple Silicon, Darwin arm64) via Tailscale before fixing.

Refs #794, #631, ADR-115, ADR-116, ADR-118.

Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
d33962eff2 |
fix(docker): UDP relay for multi-source ESP32 on Docker Desktop Windows (#502)
Docker Desktop on Windows demultiplexes inbound UDP from multiple source IPs onto a single virtual socket, silently dropping packets from all but one ESP32 node. This makes multi-node sensing setups appear to work (WebSocket connects, packets flow on the host) while only one node's CSI ever reaches the container.

Adds scripts/udp-relay.py (stdlib only) which collapses multi-source UDP to a single loopback source so Docker's forwarding accepts every packet. Verified locally: 6 packets from 3 distinct source ports all arrive at the receiver from a single relay socket.

Updates docker/docker-compose.yml with an inline comment pointing Windows users at the relay + 5006:5005 mapping. Linux/macOS hosts are unaffected and need no changes.

Also documents the workaround alongside fixes for #188 (UI 404 from relative --ui-path) and #438 (boot loop on --edge-tier 1/2 against pre-v0.4.3.1 firmware) as new sections 9-11 of docs/TROUBLESHOOTING.md.

Supersedes the docs-only PR #413.

Closes #374, #386
Refs #188, #438, #301 |
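
The relay idea can be sketched with the standard library alone. This is not the contents of `scripts/udp-relay.py`, which are not shown here; the function names `open_udp` and `relay`, and the bounded `max_packets` loop, are assumptions chosen to keep the example small and testable.

```python
import socket


def open_udp(bind_addr):
    """Create a UDP socket bound to (host, port); port 0 picks a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(bind_addr)
    return sock


def relay(listen_sock, out_sock, target, max_packets):
    """Forward datagrams from any sender to `target` through one socket.

    Every forwarded packet leaves from out_sock, so the receiver sees a
    single source address no matter how many nodes sent the originals.
    """
    forwarded = 0
    while forwarded < max_packets:
        payload, _source = listen_sock.recvfrom(65535)
        out_sock.sendto(payload, target)
        forwarded += 1
    return forwarded
```

A real relay would presumably loop indefinitely rather than stop after `max_packets`. Note that the original sender address is discarded in this sketch, so any per-node identity has to travel inside the payload.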
||
|
|
c641fc44ae |
feat(docker+sensing-server): refresh Docker publish + opt-in bearer-token API auth
Closes #520, #514, #443.

## #520 / #514: stale Docker image, missing UI assets

`ruvnet/wifi-densepose:latest` was published before `ui/observatory*` and `ui/pose-fusion*` were added; users see /app/ui missing those files and the v0.6+ packet format doesn't reach the server. Two fixes:

1. `docker/Dockerfile.rust` now `RUN`s a build-time guard after `COPY ui/` that fails the build if `index.html` / `observatory.html` / `pose-fusion.html` / `viz.html` (or the `observatory/` / `pose-fusion/` / `components/` / `services/` directories) are missing, plus an exec-bit check on `/app/sensing-server`. A stale image can never be silently produced again.

2. New `.github/workflows/sensing-server-docker.yml` rebuilds + pushes on every change to the Dockerfile, the server crate, the signal/vitals/wifiscan crates, the workspace manifests, the `ui/` tree, or itself, plus `v*` tags and manual dispatch. Pushes to both `docker.io/ruvnet/wifi-densepose` AND `ghcr.io/ruvnet/wifi-densepose` with `latest` + `vX.Y.Z` + `sha-<short>` tags, then post-push smoke-tests the artifact: /health, /api/v1/info, the observatory + pose-fusion HTML, AND the bearer-auth path (no token → 401, wrong → 401, correct → 200). Uses the `DOCKERHUB_USERNAME`/`DOCKERHUB_TOKEN` repo secrets; ghcr.io rides on the workflow's GITHUB_TOKEN.

## #443: sensing-server REST API auth model

QE security audit raised that 40+ /api/v1/* routes have no auth layer with a default `0.0.0.0` bind. New `wifi_densepose_sensing_server::bearer_auth` module + middleware:

- Env-var-gated: `RUVIEW_API_TOKEN` unset/empty ⇒ middleware is a no-op (current LAN-mode behaviour preserved, **no default change**); set ⇒ every `/api/v1/*` request must carry `Authorization: Bearer <token>` or the server returns 401.
- Constant-time byte compare via local `ct_eq` (no new dep).
- `/health*`, `/ws/sensing`, and `/ui/*` are intentionally never gated (orchestrator probes + local browsers).
- Startup logs which mode is active and warns when auth is ON with a `0.0.0.0` bind.
- 8 unit tests on the middleware via `tower::ServiceExt::oneshot` (sensing-server lib tests 191 → 199, 0 failures).

Verified locally: `cargo build --workspace --no-default-features` ✓, `cargo test -p wifi-densepose-sensing-server --no-default-features` ✓.

Co-Authored-By: claude-flow <ruv@ruv.net> |
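
The gating rules listed for #443 can be sketched as a small decision function. This is a Python illustration, not the Rust middleware: `check_request` is a hypothetical name, and this `ct_eq` only mirrors the idea of comparing without an early exit (the Rust function of the same name is not shown in the commit message).

```python
def ct_eq(a: bytes, b: bytes) -> bool:
    """Compare two byte strings without stopping at the first mismatch."""
    if len(a) != len(b):
        return False
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences instead of returning early
    return diff == 0


def check_request(path, authorization, token):
    """Return the HTTP status the gate would produce: 200 passes, 401 rejects."""
    if not token:
        return 200  # RUVIEW_API_TOKEN unset or empty: the gate is a no-op
    if not path.startswith("/api/v1/"):
        return 200  # /health*, /ws/sensing and /ui/* are never gated
    prefix = "Bearer "
    if authorization is None or not authorization.startswith(prefix):
        return 401
    presented = authorization[len(prefix):].encode()
    return 200 if ct_eq(presented, token.encode()) else 401
```

In production Python code the standard library's `hmac.compare_digest` would be the usual choice for the comparison; the hand-written loop is here only to show the shape of the technique.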
||
|
|
81cc241b9e |
chore(repo): move v1/ → archive/v1/ + add archive/README.md (#430)
The Rust port at v2/ has been the primary codebase since the rename in #427. The Python implementation at v1/ is no longer the active target; the only load-bearing path is the deterministic proof bundle at v1/data/proof/ (per ADR-011 / ADR-028 witness verification).

Move the whole Python tree into archive/v1/ and document the policy in archive/README.md: no new features, bug fixes only when they affect a still-load-bearing path (currently just the proof), CI continues to verify the proof on every push and PR.

Path references updated in 26 files via path-pattern sed (only matches v1/<known-child> patterns, never bare v1 or API URLs like /api/v1/). Two double-prefix typos (archive/archive/v1/) caught and hand-fixed in verify-pipeline.yml and ADR-011.

Validated:
- Python proof verify.py imports cleanly at archive/v1/data/proof/ (numpy/scipy still required; CI installs requirements-lock.txt from archive/v1/ now)
- cargo test --workspace --no-default-features → 1,539 passed, 0 failed, 8 ignored (unaffected by Python tree relocation)
- ESP32-S3 on COM7 untouched (no firmware paths changed)

After-merge: contributors should re-run any local `python v1/...` commands as `python archive/v1/...` (CLAUDE.md and CHANGELOG already updated). |
||
|
|
f49c722764 |
chore(repo): rename rust-port/wifi-densepose-rs → v2/ (flatten to one level) (#427)
The Rust port lived two directories deep (rust-port/wifi-densepose-rs/) without any sibling under rust-port/ that warranted the extra level. Move the whole workspace up to v2/ to match v1/ (Python) at the same depth and shorten every cd / build command across the repo.

git mv preserves history for all tracked files. 60 files updated for path references (CI workflows, ADRs, docs, scripts, READMEs, internal .claude-flow state). Two manual fixes for relative-cd paths in CLAUDE.md and ADR-043 that became wrong after the depth change (cd ../.. → cd ..).

Validated:
- cargo check --workspace --no-default-features → clean (after target/ nuke; the gitignored target/ was carried by the OS rename and had hard-coded old paths in build scripts)
- cargo test --workspace --no-default-features → 1,539 passed, 0 failed, 8 ignored (same totals as pre-rename)
- ESP32-S3 on COM7 → still streaming live CSI (cb #40300, RSSI -64 dBm)

After-merge follow-up: contributors should `rm -rf v2/target` once and let cargo regenerate from the new path. |
||
|
|
e38c0f4dcc |
fix: Docker entrypoint arg handling + configurable model directory
Fixes #384: docker run with --source/--tick-ms flags now works correctly.
Fixes #399: model files in mounted volumes are now discoverable via MODELS_DIR env var.

Root cause (issue #384): The Dockerfile used ENTRYPOINT ["/bin/sh", "-c"] with a shell-form CMD. When users passed flags like `--source wifi --tick-ms 500` as docker run arguments, Docker replaced CMD entirely, resulting in `/bin/sh -c "--source wifi --tick-ms 500"` which executes `--source` as a shell command → `--source: not found`.

Root cause (issue #399): Model directory was hardcoded to the relative path `data/models`. When Docker users mounted models to `/app/models/`, the scan looked in the wrong place.

Changes:

1. docker/docker-entrypoint.sh (new):
   - Proper entrypoint script that handles both env-var-based defaults and user-passed CLI flags
   - No arguments → starts server with CSI_SOURCE env var as --source
   - Flag arguments (start with -) → prepends /app/sensing-server + defaults, appends user flags (clap last-wins allows overrides)
   - Non-flag first arg → exec passthrough (e.g., /bin/sh for debugging)
   - Sets --bind-addr 0.0.0.0 (was 127.0.0.1 which blocks container access)

2. docker/Dockerfile.rust:
   - Switch from ENTRYPOINT ["/bin/sh", "-c"] to exec-form entrypoint
   - Add MODELS_DIR env var (default: data/models)
   - COPY the entrypoint script into the image

3. docker/docker-compose.yml:
   - Remove shell-form command (entrypoint handles defaults)
   - Add MODELS_DIR env var

4. model_manager.rs + main.rs:
   - Replace hardcoded `data/models` path with `effective_models_dir()` / `models_dir()` that reads MODELS_DIR env var at runtime
   - Docker users can now: docker run -v /host/models:/app/models -e MODELS_DIR=/app/models

5. tests/test_docker_entrypoint.sh (new, 17 tests):
   - Default CSI_SOURCE substitution (6 assertions)
   - Custom CSI_SOURCE propagation
   - User-passed flag arguments (--source, --tick-ms, --model)
   - Unset CSI_SOURCE defaults to auto
   - Explicit command passthrough
   - MODELS_DIR env var propagation |
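
The three dispatch cases listed for `docker-entrypoint.sh` can be sketched as a function that builds the final argv. This is a Python illustration of the decision logic only, not the shell script; `build_command` is a hypothetical name and the default flag set is deliberately reduced to the two flags the commit message names.

```python
SERVER = "/app/sensing-server"


def build_command(args, env):
    """Return the argv the entrypoint would exec for the given docker-run args."""
    defaults = [
        SERVER,
        "--source", env.get("CSI_SOURCE", "auto"),  # unset CSI_SOURCE means auto
        "--bind-addr", "0.0.0.0",
    ]
    if not args:
        return defaults  # no arguments: server with env-derived defaults
    if args[0].startswith("-"):
        # Flags only: defaults first, user flags last so they take precedence
        # (the commit message relies on clap's last-wins behaviour for this).
        return defaults + list(args)
    return list(args)  # explicit command such as /bin/sh: pass through as-is
```

With `args = ["--source", "wifi", "--tick-ms", "500"]` the result starts with the server binary and its defaults and ends with the user's flags, so `--source` is never interpreted as a shell command, which was the failure in #384.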
||
|
|
c193cd4299 |
Merge pull request #88 from Harshit10j2004/harshit_1001
Update the dockerfile.python 1 by disabling Python bytecode generation |
||
|
|
8166d8d822 |
fix: live demo static pose & inaccurate sensing data (issue #86)
- Docker default changed from --source simulated to --source auto (auto-detects ESP32 on UDP 5005, falls back to simulation)
- Pose derivation now driven by real sensing features: motion_band_power, breathing_band_power, variance, dominant_freq_hz, change_points
- Temporal feature extraction: 100-frame circular buffer, Goertzel breathing rate estimation (0.1-0.5 Hz), frame-to-frame L2 motion detection, SNR-based signal quality metric
- Signal field driven by subcarrier variance spatial mapping instead of fixed animation circle
- UI data source indicators: LIVE/RECONNECTING/SIMULATED banner on sensing tab, estimation mode badge on live demo tab
- Setup guide panel explaining ESP32 count requirements for each capability level (1x: presence, 3x: localization, 4x+: full pose)
- Tick rate improved from 500ms to 100ms (2fps to 10fps)
- Fixed Option<f64> division bug from PR #83
- ADR-035 documents all decisions

Closes #86

Co-Authored-By: claude-flow <ruv@ruv.net> |
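
A Goertzel-based breathing-rate estimate over a 100-frame window can be sketched as below. This is an illustration in Python, not the server's Rust code: the function names, the 10 Hz sample rate (inferred from the 100 ms tick), the 0.05 Hz candidate grid, and the mean removal step are all assumptions; only the 0.1-0.5 Hz band and the 100-frame buffer come from the commit message.

```python
import math


def goertzel_power(samples, freq_hz, sample_rate_hz):
    """Squared magnitude of the signal's spectrum at a single frequency."""
    omega = 2.0 * math.pi * freq_hz / sample_rate_hz
    coeff = 2.0 * math.cos(omega)
    s_prev = 0.0
    s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2


def estimate_breathing_hz(samples, sample_rate_hz=10.0, step_hz=0.05):
    """Pick the frequency in the 0.1-0.5 Hz band with the most power."""
    mean = sum(samples) / len(samples)
    centered = [x - mean for x in samples]  # drop DC so it cannot dominate
    steps = int(round((0.5 - 0.1) / step_hz))
    candidates = [0.1 + i * step_hz for i in range(steps + 1)]
    return max(candidates, key=lambda f: goertzel_power(centered, f, sample_rate_hz))
```

Goertzel evaluates one frequency at a time with constant memory, which suits a narrow band like this one better than a full FFT. With 100 frames at 10 Hz the window is 10 s long, so frequencies closer together than roughly 0.1 Hz are hard to separate regardless of the grid step.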
||
|
|
8a46fff6b0 | Update the dockerfile.python 1 by disabling Python bytecode generation with PYTHONDONTWRITEBYTECODE=1, which stops Python from writing .pyc files | ||
|
|
44b9c30dbc |
fix: Docker port mismatch — server now binds 3000/3001 as documented
The sensing server defaults to HTTP :8080 and WS :8765, but Docker exposes :3000/:3001. Added --http-port 3000 --ws-port 3001 to CMD in both Dockerfile.rust and docker-compose.yml.

Verified both images build and run:
- Rust: 133 MB, all endpoints responding (health, sensing/latest, vital-signs, pose/current, info, model/info, UI)
- Python: 569 MB, all packages importable (websockets, fastapi)
- RVF file: 13 KB, valid RVFS magic bytes

Also fixed README Quick Start endpoints to match actual routes:
- /api/v1/health → /health
- /api/v1/sensing → /api/v1/sensing/latest
- Added /api/v1/pose/current and /api/v1/info examples
- Added port mapping note for Docker vs local dev

Co-Authored-By: claude-flow <ruv@ruv.net> |
||
|
|
add9f192aa |
feat: Docker images, RVF export, and README update
- Add docker/ folder with Dockerfile.rust (132MB), Dockerfile.python (569MB), and docker-compose.yml
- Remove stale root-level Dockerfile and docker-compose files
- Implement --export-rvf CLI flag for standalone RVF package generation
- Generate wifi-densepose-v1.rvf (13KB) with model weights, vital config, SONA profile, and training provenance
- Update README with Docker pull/run commands and RVF export instructions
- Update test count to 542+ and fix Docker port mappings
- Reply to issues #43, #44, #45 with Docker/RVF availability

Co-Authored-By: claude-flow <ruv@ruv.net> |