Compare commits

3 Commits

Author SHA1 Message Date
ruv 84638314a4 fix(docker): bump rust 1.85 -> 1.90 + enforce LF on shell scripts
Two real bugs found while pushing the v0.8.0 image to Docker Hub:

## Rust 1.85 -> 1.90

`hnsw_rs 0.3.4` (transitive via wifi-densepose-ruvector ->
ruvector-attn-mincut -> hnsw_rs) calls `nbp.is_multiple_of(500_000)`.
`is_multiple_of` on unsigned integers was stabilised in Rust 1.87
(rust-lang/rust#128101 — RFC 3565). On 1.85 the compile fails with:

  error[E0658]: use of unstable library feature `unsigned_is_multiple_of`
   --> hnsw_rs-0.3.4/src/hnswio.rs:736:20

Pinned to 1.90 for reproducibility — a comment in the Dockerfile flags
the 1.87 MSRV requirement so a future downgrade can't quietly break it.
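
A guard along these lines could run before the image build to catch a toolchain below the 1.87 floor. It is a minimal sketch, not part of this commit: the function names and the idea of a pre-build check are assumptions, and it relies only on the `rustc <major>.<minor>.<patch>` prefix of `rustc --version` output.

```python
import re

# Floor implied by hnsw_rs 0.3.4's use of is_multiple_of (see above).
MIN_RUST = (1, 87, 0)


def parse_rustc_version(version_line: str) -> tuple:
    """Extract (major, minor, patch) from a `rustc --version` line."""
    match = re.match(r"rustc (\d+)\.(\d+)\.(\d+)", version_line)
    if match is None:
        raise ValueError(f"unrecognised rustc version line: {version_line!r}")
    return tuple(int(part) for part in match.groups())


def meets_msrv(version_line: str, minimum: tuple = MIN_RUST) -> bool:
    """Return True when the reported toolchain is at least `minimum`."""
    return parse_rustc_version(version_line) >= minimum
```

Comparing integer tuples avoids the string-comparison trap where "1.9" would sort above "1.87".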

## .gitattributes — force LF on shell scripts + Dockerfile

Without a `.gitattributes`, `core.autocrlf=true` (the Git for Windows
installer default) converts shell scripts to CRLF on checkout. `COPY`ing
`docker/docker-entrypoint.sh` into a Linux image then preserves CRLF.
The shebang line `#!/bin/sh\r\n` causes `exec /app/docker-entrypoint.sh`
to fail with:

  exec /app/docker-entrypoint.sh: no such file or directory

The kernel tries to look up an interpreter literally named `/bin/sh\r`,
which doesn't exist. Container exits immediately. The first v0.8.0
image push (digest sha256:7957…44fa) suffered exactly this; the
re-pushed image (digest sha256:e9f4…d38315) was built on a
renormalised tree.
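
The failure mode can be illustrated with a small check. This is a sketch under assumptions: the helper names are hypothetical, and `interpreter_path` only approximates how the kernel reads a shebang (it takes everything up to the first space, keeping a trailing `\r`).

```python
def shebang_has_crlf(script: bytes) -> bool:
    """Return True if the first line is a shebang that ends in CRLF."""
    first_line, _, _ = script.partition(b"\n")
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


def interpreter_path(script: bytes) -> bytes:
    """Return the interpreter path roughly as the kernel would read it."""
    first_line, _, _ = script.partition(b"\n")
    if not first_line.startswith(b"#!"):
        raise ValueError("not a shebang script")
    # Strip only spaces and tabs: a trailing "\r" stays part of the path,
    # which is exactly why the lookup fails.
    return first_line[2:].strip(b" \t").split(b" ", 1)[0]
```

For a CRLF checkout of the entrypoint, `interpreter_path` yields `b"/bin/sh\r"`, which matches the "no such file or directory" error above.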

The .gitattributes rule forces LF for:
  - *.sh / *.bash
  - Dockerfile*
  - docker/* (covers docker-entrypoint.sh + docker-compose.yml)
  - scripts/*
  - `verify` (the proof-replay wrapper, which would fail the same way
    if it were checked out with CRLF)

Binary file globs (*.bin, *.wasm, *.rvf, *.pcap, etc.) are explicitly
marked `binary` so text normalisation never touches them.

## CHANGELOG — drop the false `--introspection` flag claim

The CHANGELOG entry for v0.8.0 said the introspection endpoints were
"off by default, enabled via `--introspection`". That isn't true:
`sensing-server --help` has no such flag. The routes are mounted
unconditionally in `main.rs`. The per-frame `update()` p99 of
0.041 ms (~24× under D4's 1 ms budget) makes always-on viable; the
"off by default" framing came from an earlier draft of ADR-099 that
the implementation outgrew. Corrected.

## Verification

End-to-end smoke test of the pushed image:

  docker run -d -p 13000:3000 -e CSI_SOURCE=simulated     -e SENSING_BIND_ADDR=0.0.0.0 ruvnet/wifi-densepose:v0.8.0

  /health -> {"status":"ok","source":"simulated",...}
  /api/v1/info -> {"backend":"rust","features":{"ruvector":true,"signal_processing":true,...}}
  /api/v1/introspection/snapshot -> {"regime":"unknown",
    "regime_changed":false,"top_k_similarity":[]} (ADR-099 shape exact)
  /ui/observatory.html -> HTTP 200, 15 KB
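
The snapshot check could be automated roughly as follows. This is a sketch: the function name is hypothetical, and the expected keys and types are only the three visible in the snapshot output above, not the full ADR-099 schema.

```python
import json

# Keys and types taken from the snapshot shown above; nothing else is checked.
EXPECTED_SNAPSHOT_FIELDS = {
    "regime": str,
    "regime_changed": bool,
    "top_k_similarity": list,
}


def snapshot_problems(body: str) -> list:
    """Return human-readable problems; an empty list means the shape matches."""
    snapshot = json.loads(body)
    problems = []
    for key, expected_type in EXPECTED_SNAPSHOT_FIELDS.items():
        if key not in snapshot:
            problems.append(f"missing key: {key}")
        elif not isinstance(snapshot[key], expected_type):
            problems.append(f"{key} should be {expected_type.__name__}")
    return problems
```

Returning a list of problems rather than a boolean keeps the smoke-test output useful when more than one field is off.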

Published manifest digests:
  ruvnet/wifi-densepose:v0.8.0 -> sha256:e9f4c5af…d38315
  ruvnet/wifi-densepose:latest -> sha256:e9f4c5af…d38315

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-14 13:38:58 -04:00
ruv f396c44751 ci(verify-pipeline): fix stale v1/ working-directory + SECRET_KEY env
Same drift as #559 but in CI: the workflow ran `working-directory: v1`
on the two verify steps, but the Python codebase moved to `archive/v1/`
ages ago. The job failed with:

  An error occurred trying to start process '/usr/bin/bash' with
  working directory '/home/runner/work/RuView/RuView/v1'.
  No such file or directory

Fixed both occurrences (working-directory: v1 -> working-directory:
archive/v1).

Also added `SECRET_KEY` env var to both steps — `verify.py` transitively
imports `src.app` -> `src.config.settings` (since PR #547 introduced
pydantic-settings with a required `secret_key` field). The value is
never used for any auth path in the proof pipeline; it just needs to
satisfy the import chain. Same env-var workaround used locally to make
`./verify` pass.
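
Why a missing variable breaks a plain import can be shown with a standard-library stand-in. This is a sketch only: the `Settings` class below is hypothetical and does not reproduce the real pydantic-settings class in `src.config.settings`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Hypothetical stand-in for a settings class with one required field."""

    secret_key: str

    @classmethod
    def from_env(cls, environ=None) -> "Settings":
        """Build settings from the environment, failing if SECRET_KEY is absent."""
        environ = os.environ if environ is None else environ
        try:
            return cls(secret_key=environ["SECRET_KEY"])
        except KeyError:
            raise RuntimeError("SECRET_KEY is required but not set") from None
```

If a module calls something like `Settings.from_env()` at import time, every importer fails when the variable is absent, even when the value is never read afterwards. That is why a placeholder value is enough for the proof steps.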

After this commit, "Verify Pipeline Determinism (3.11)" should go green
on this PR.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-14 11:45:03 -04:00
ruv 86f38c4fc6 fix: first-run breakage (closes #559, #561) + #560 platform-aware diagnosis
Three related fixes — a fresh-clone user hitting any of these would
conclude the project doesn't work; #557's "feels like mock" narrative
is fed in part by these breakages.

## #559 — `./verify` pointed at removed `v1/` paths

The wrapper hard-coded `v1/data/proof` / `v1/src`, but the proof scripts
moved to `archive/v1/` long ago. A fresh clone failed before the
pipeline could even run. User `Fewmanism` provided the exact diff in
the issue. Applied verbatim across four hits (PROOF_DIR, V1_SRC, the
Phase 3 scan-message, and the SKIP-state recovery hint).

  ./verify  # now PASS end-to-end

## #561 — firmware README would misflash and point at the wrong provisioner

Two real bring-up bugs:

1. Manual flash command put the app at `0x10000`. The partition tables
   (`partitions_display.csv`, `partitions_4mb.csv`) define `ota_0` at
   `0x20000`. `0x10000` is the start of `phy_init` data — flashing
   the app binary there would corrupt the PHY init data and the app
   would never run. The QEMU section already had the right `0x20000`,
   so this was an internal contradiction. Both occurrences fixed.

   Also added `0xf000 ota_data_initial.bin` to the manual flash
   command — the release bundle ships this binary and without it the
   bootloader can refuse to boot after a factory wipe.

2. `python scripts/provision.py` referenced the wrong file. There are
   actually TWO `provision.py` files in the repo (`scripts/` — 275
   lines, stale; `firmware/esp32-csi-node/` — 348 lines, has the
   issue #391 full-replace semantics fix). The canonical one is in
   the firmware dir. Both README occurrences fixed to point at the
   canonical path. (The stale `scripts/provision.py` is a separate
   cleanup; the historical ADRs that reference it are intentionally
   not touched.)
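
The offset check from item 1 can be sketched as a small lint over the `write_flash` arguments. The function name is hypothetical, and the expected offsets are copied from the corrected README command rather than parsed from the partition CSVs.

```python
# Offsets as they appear in the corrected flash command.
EXPECTED_OFFSETS = {
    "bootloader.bin": 0x0,
    "partition-table.bin": 0x8000,
    "ota_data_initial.bin": 0xF000,
    "esp32-csi-node.bin": 0x20000,  # start of ota_0
}


def flash_plan_problems(args: list) -> list:
    """Check alternating `<offset> <path>` arguments against EXPECTED_OFFSETS."""
    problems = []
    seen = set()
    for offset_text, path in zip(args[0::2], args[1::2]):
        name = path.rsplit("/", 1)[-1]
        offset = int(offset_text, 16)
        seen.add(name)
        expected = EXPECTED_OFFSETS.get(name)
        if expected is not None and offset != expected:
            problems.append(f"{name}: flashed at {offset:#x}, expected {expected:#x}")
    for name in EXPECTED_OFFSETS:
        if name not in seen:
            problems.append(f"{name}: missing from flash command")
    return problems
```

Run against the old README command, this flags both the `0x10000` app offset and the missing `ota_data_initial.bin`.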

## #560 — proof hash mismatches on macOS arm64 / Accelerate

User `Fewmanism` reports that with the same pinned `numpy 1.26.4` /
`scipy 1.14.1` on macOS arm64, the proof's SHA-256 differs from the
published expected hash. The proof passes on linux-x86_64 and
windows-x86_64 (where wheels ship OpenBLAS); it mismatches on
darwin-arm64 (where numpy/scipy use Accelerate.framework). That is
not a code bug: Accelerate's FFT and BLAS produce output that differs
from OpenBLAS's at the bit level on identical IEEE 754 inputs, so the
proof's bit-exact contract cannot hold across backends.
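
The general mechanism is easy to reproduce. The sketch below does not model Accelerate's or OpenBLAS's actual kernels. It only shows that floating-point addition is not associative, so two libraries that order the same operations differently can disagree in the last bit, and a hash over the raw bits then diverges. The helper name is hypothetical.

```python
import hashlib
import struct


def sha256_of_floats(values: list) -> str:
    """Hash the exact IEEE 754 double bit patterns of `values`."""
    packed = struct.pack(f"<{len(values)}d", *values)
    return hashlib.sha256(packed).hexdigest()


# The same three inputs, summed in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Numerically the results agree to about 1e-16, but their bits differ,
# so the two digests below are unrelated.
digest_a = sha256_of_floats([left_to_right])
digest_b = sha256_of_floats([right_to_left])
```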

What this commit changes:

- `verify.py` now prints a RUNTIME ENVIRONMENT block before the
  pipeline runs: platform, machine, Python version, numpy BLAS
  backend. Users on a non-reference backend see the cause up front.
- The FAIL message reorders causes: platform BLAS/FFT backend is
  now the *primary* suspect (not "unlikely"), with a pointer to
  the printed RUNTIME ENVIRONMENT block.
- New `archive/v1/data/proof/REFERENCE_PLATFORMS.md` documents the
  reference platforms (linux-x86_64 + windows-x86_64 with OpenBLAS),
  the expected-MISMATCH platforms (darwin-arm64 with Accelerate,
  any MKL install), and three workable responses for users hitting
  a non-reference backend (run on a reference platform, generate a
  local-reference hash, or use tolerance-based comparison — that
  last one is the roadmap path).
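
A compact version of the environment probe from the first bullet might look like this. It is a sketch, not the code in `verify.py`: the function name and returned keys are assumptions, and the BLAS lookup relies on `np.show_config(mode="dicts")`, which older numpy releases do not accept.

```python
import platform


def describe_runtime() -> dict:
    """Collect platform details plus the numpy BLAS backend name, if discoverable."""
    info = {
        "platform": platform.platform(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        "numpy_blas": "unknown",
    }
    try:
        import numpy as np

        # Returns a nested dict on recent numpy; older releases raise
        # TypeError for the unknown keyword and fall through to "unknown".
        config = np.show_config(mode="dicts")
        blas = config.get("Build Dependencies", {}).get("blas", {})
        info["numpy_blas"] = str(blas.get("name", "unknown"))
    except Exception:
        # numpy missing, or too old to report its build config this way.
        pass
    return info
```

Returning a dict instead of printing directly makes the same data reusable in the FAIL message and in issue reports.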

This converts #560 from "the proof is broken on my Mac" to "the proof
has a documented single-backend contract".
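
The tolerance-based comparison on the roadmap could take roughly this shape. It is a standard-library sketch with a hypothetical function name. The `1e-10` default mirrors the `atol` suggested in REFERENCE_PLATFORMS.md, but unlike `np.allclose` it applies no relative tolerance.

```python
import math


def features_match(features, reference, abs_tol=1e-10) -> bool:
    """Element-wise comparison that ignores last-bit differences."""
    if len(features) != len(reference):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=abs_tol)
        for a, b in zip(features, reference)
    )
```

A check like this accepts output that differs only in the last bits while still rejecting a real change in the feature values.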

## Verification

- `./verify` (Windows x86_64 / OpenBLAS): VERDICT PASS, hash
  `8c0680d7…51c6` matches expected. RUNTIME ENVIRONMENT block prints
  numpy BLAS = `scipy-openblas`.
- `grep -E '0x10000|scripts/provision\.py' firmware/esp32-csi-node/README.md`:
  no matches.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-14 08:45:33 -04:00
8 changed files with 181 additions and 21 deletions
+35
@@ -0,0 +1,35 @@
# Line-ending policy.
#
# `* text=auto` lets git normalise text files to LF in the repository and convert
# to the platform's native line endings on checkout. That default is fine for
# .md / .rs / .toml / .py — broken for shell scripts and Dockerfiles, where
# CRLF on the shebang line causes Linux exec to look for an interpreter named
# `/bin/sh\r` (or similar) and fail with "no such file or directory".
#
# Force LF for anything that ends up executed inside a Linux container or a
# POSIX shell. CRLF on the entrypoint is what kept the first v0.8.0 image build
# from booting until the file was renormalised.
* text=auto
*.sh text eol=lf
*.bash text eol=lf
verify text eol=lf
Dockerfile* text eol=lf
docker/* text eol=lf
scripts/* text eol=lf
# Binary blobs that should never be touched by text-normalisation.
*.bin binary
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.zip binary
*.tar binary
*.tgz binary
*.gz binary
*.wasm binary
*.rvf binary
*.task binary
*.csi.jsonl binary
*.pcap binary
+10 -2
@@ -57,7 +57,13 @@ jobs:
"
- name: Run pipeline verification
-working-directory: v1
+working-directory: archive/v1
+env:
+  # verify.py transitively imports src.app -> src.config.settings, which
+  # uses pydantic-settings with a required `secret_key` field. The proof
+  # only needs the import chain to resolve; the value is never used for
+  # any auth path in the proof pipeline.
+  SECRET_KEY: ci-proof-replay-only-not-a-real-secret
run: |
echo "=== Running pipeline verification ==="
python data/proof/verify.py
@@ -65,7 +71,9 @@ jobs:
echo "Pipeline verification PASSED."
- name: Run verification twice to confirm determinism
-working-directory: v1
+working-directory: archive/v1
+env:
+  SECRET_KEY: ci-proof-replay-only-not-a-real-secret
run: |
echo "=== Second run for determinism confirmation ==="
python data/proof/verify.py
+2 -1
@@ -14,7 +14,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
regime classification) and `temporal-compare` (DTW pattern matching) as a
**parallel tap** alongside RuView's existing event pipeline — no replacement,
no behaviour change to the existing `/ws/sensing` fan-out or `wifi-densepose-signal`
-DSP. Two new endpoints (off by default, enabled via `--introspection`):
+DSP. Two new endpoints (always mounted — the tap is cheap enough at 0.041 ms p99
+per-frame `update()` to ship hot by default):
- `GET /ws/introspection` — newline-delimited JSON snapshots streamed at the CSI
frame rate. Each snapshot carries `frame_count`, `regime` (Idle / Periodic /
Transient / Chaotic / Unknown), `lyapunov_exponent`, `attractor_dim`,
@@ -0,0 +1,52 @@
# Reference platforms for `expected_features.sha256`
The hash in `expected_features.sha256` was generated on a specific BLAS / FFT
backend. Numpy + scipy delegate FFT/linear-algebra to platform-native
libraries, and those libraries produce **bit-different output on identical
IEEE 754 inputs** depending on the backend. This is not a bug in the proof
pipeline — it is a property of the underlying numerical libraries. (See
issue #560.)
## Platforms where the hash is expected to MATCH
| Platform | BLAS backend | Status |
|---|---|---|
| `linux-x86_64-gnu` (Python 3.11.x, numpy 1.26.4 from PyPI wheels, scipy 1.14.1) | OpenBLAS | ✅ Reference |
| `windows-x86_64-msvc` (Python 3.11.x / 3.13.x, numpy 1.26.4 from PyPI wheels, scipy 1.14.1) | OpenBLAS | ✅ Reference |
## Platforms where the hash is **expected to MISMATCH**
| Platform | BLAS backend | Why |
|---|---|---|
| `darwin-arm64` (macOS arm64, Apple Silicon) | Accelerate.framework | FFT + matrix kernels differ in last-bit positions; the SHA-256 will differ even with pinned `numpy 1.26.4` / `scipy 1.14.1`. |
| Any environment with MKL installed | Intel MKL | Same root cause as Accelerate: different vectorized FFT path. |
## What to do if you get MISMATCH on a non-reference platform
The pipeline is still correct on your platform — the *output* is bit-different
because the *backend* is bit-different, not because the proof code has a bug.
Three workable responses:
1. **Run the proof on a reference platform** (Linux x86_64 or Windows x86_64
with the PyPI OpenBLAS wheels). This is what CI does.
2. **Generate a new local-reference hash** for your platform and check it
against the same hash on a teammate's machine with the *same* backend:
```bash
# Regenerate from your platform
python archive/v1/data/proof/verify.py --generate-hash
# Commit the new hash to a side file (do NOT overwrite expected_features.sha256
# unless you are publishing a new cross-platform reference)
```
3. **Compare numerical output, not the hash.** A relaxed-tolerance comparison
on the feature vectors (e.g. `np.allclose(features, reference, atol=1e-10)`)
will pass across backends. This is on the roadmap (see issue #560).
## The `verify.py` runtime environment block
Every run of `verify.py` now prints a `RUNTIME ENVIRONMENT` block before the
pipeline runs. Include that block in any issue report — it identifies the
platform + numpy version + BLAS backend in one place.
+58 -5
@@ -116,6 +116,48 @@ def print_source_provenance():
print()
+def print_runtime_environment():
+    """Print the platform + numpy/scipy BLAS backend.
+    The proof pipeline's SHA-256 is sensitive to the BLAS / FFT backend
+    behind numpy + scipy.fft. Different platforms ship different backends
+    (OpenBLAS on Linux/Windows wheels, Accelerate.framework on macOS arm64,
+    MKL when installed) and they produce bit-different output on identical
+    IEEE 754 inputs. Surfacing the backend up front turns an unexplained
+    MISMATCH into a one-line diagnosis -- see issue #560.
+    """
+    import platform
+    print(" RUNTIME ENVIRONMENT:")
+    print(f" Platform : {platform.platform()}")
+    print(f" Machine : {platform.machine()}")
+    print(f" Python : {platform.python_version()} ({platform.python_implementation()})")
+    # numpy BLAS / LAPACK backend.
+    try:
+        blas_info = np.__config__.blas_ilp64_opt_info # type: ignore[attr-defined]
+        backend = getattr(blas_info, "get", lambda *_: None)("libraries", None) or "unknown"
+    except Exception:
+        # Newer numpy (>= 1.26) reports via show_config(); fall back to a stringified dump.
+        try:
+            import io
+            buf = io.StringIO()
+            np.show_config(mode="dicts") if hasattr(np, "show_config") else None
+            # `show_config(mode='dicts')` returns a dict in numpy >= 1.26.
+            cfg = np.show_config(mode="dicts") if hasattr(np, "show_config") else {}
+            if isinstance(cfg, dict):
+                blas = cfg.get("Build Dependencies", {}).get("blas", {})
+                backend = blas.get("name", "unknown")
+            else:
+                backend = "unknown"
+        except Exception:
+            backend = "unknown"
+    print(f" numpy BLAS : {backend}")
+    print(" (FFT/BLAS backend affects the hash -- see #560 if MISMATCH on")
+    print(" macOS arm64 / Accelerate. Reference platforms: linux-x86_64,")
+    print(" windows-x86_64 with OpenBLAS; see expected_features.sha256.)")
+    print()
def load_reference_signal(data_path):
"""Load the reference CSI signal from JSON.
@@ -417,6 +459,7 @@ def main():
# ---------------------------------------------------------------
print("[0/4] SOURCE PROVENANCE")
print_source_provenance()
+print_runtime_environment()
# ---------------------------------------------------------------
# Step 1: Load and describe reference signal
@@ -518,13 +561,23 @@ def main():
print()
print(" The pipeline output does NOT match the expected hash.")
print()
-print(" Possible causes:")
-print(" - Numpy/scipy version mismatch (check requirements)")
-print(" - Code change in CSI processor that alters numerical output")
-print(" - Platform floating-point differences (unlikely for IEEE 754)")
+print(" Likely causes, in order of probability:")
+print(" 1. Platform BLAS/FFT backend differs from the reference.")
+print(" The expected hash was generated on linux-x86_64 +")
+print(" windows-x86_64 with OpenBLAS. macOS arm64 ships with")
+print(" Accelerate.framework, which produces bit-different FFT")
+print(" output on identical inputs (issue #560). Inspect the")
+print(" RUNTIME ENVIRONMENT block printed at the top of this run.")
+print(" 2. Numpy/scipy version mismatch.")
+print(" Install pinned versions: pip install -r archive/v1/requirements-lock.txt")
+print(" 3. Real code change in the CSI processor that alters output.")
+print(" Investigate the diff against the reference commit.")
print()
-print(" To update the expected hash after intentional changes:")
+print(" To regenerate the expected hash on a NEW reference platform:")
print(" python verify.py --generate-hash")
+print(" (Only do this if you intend to publish a new reference; the")
+print(" single-platform contract of expected_features.sha256 is")
+print(" documented at the top of that file.)")
print("=" * 72)
sys.exit(1)
+5 -1
@@ -3,7 +3,11 @@
# Multi-stage build for minimal final image
# Stage 1: Build
-FROM rust:1.85-bookworm AS builder
+# Rust 1.87+ is required: `hnsw_rs 0.3.4` (transitive via wifi-densepose-ruvector ->
+# ruvector-attn-mincut) uses `u*::is_multiple_of`, stabilised in 1.87. Pinning to a
+# recent stable (1.90) for reproducibility — bump cautiously since reproducible
+# builds rely on this.
+FROM rust:1.90-bookworm AS builder
WORKDIR /build
+15 -8
@@ -40,15 +40,21 @@ MSYS_NO_PATHCONV=1 docker run --rm \
```bash
python -m esptool --chip esp32s3 --port COM7 --baud 460800 \
write_flash --flash_mode dio --flash_size 8MB \
-0x0 firmware/esp32-csi-node/build/bootloader/bootloader.bin \
-0x8000 firmware/esp32-csi-node/build/partition_table/partition-table.bin \
-0x10000 firmware/esp32-csi-node/build/esp32-csi-node.bin
+0x0 firmware/esp32-csi-node/build/bootloader/bootloader.bin \
+0x8000 firmware/esp32-csi-node/build/partition_table/partition-table.bin \
+0xf000 firmware/esp32-csi-node/build/ota_data_initial.bin \
+0x20000 firmware/esp32-csi-node/build/esp32-csi-node.bin
```
+> The app slot (`ota_0`) starts at `0x20000` per `partitions_display.csv` /
+> `partitions_4mb.csv`. `ota_data_initial.bin` at `0xf000` initialises the OTA
+> slot pointer; without it the bootloader can refuse to boot the app after a
+> factory wipe.
### 3. Provision WiFi credentials (no reflash needed)
```bash
-python scripts/provision.py --port COM7 \
+python firmware/esp32-csi-node/provision.py --port COM7 \
--ssid "YourSSID" --password "YourPass" --target-ip 192.168.1.20
```
@@ -254,9 +260,10 @@ Find your serial port: `COM7` on Windows, `/dev/ttyUSB0` on Linux, `/dev/cu.SLAB
```bash
python -m esptool --chip esp32s3 --port COM7 --baud 460800 \
write_flash --flash_mode dio --flash_size 8MB \
-0x0 firmware/esp32-csi-node/build/bootloader/bootloader.bin \
-0x8000 firmware/esp32-csi-node/build/partition_table/partition-table.bin \
-0x10000 firmware/esp32-csi-node/build/esp32-csi-node.bin
+0x0 firmware/esp32-csi-node/build/bootloader/bootloader.bin \
+0x8000 firmware/esp32-csi-node/build/partition_table/partition-table.bin \
+0xf000 firmware/esp32-csi-node/build/ota_data_initial.bin \
+0x20000 firmware/esp32-csi-node/build/esp32-csi-node.bin
```
### Serial Monitor
@@ -285,7 +292,7 @@ All settings can be changed at runtime via Non-Volatile Storage (NVS) without re
The easiest way to write NVS settings:
```bash
-python scripts/provision.py --port COM7 \
+python firmware/esp32-csi-node/provision.py --port COM7 \
--ssid "MyWiFi" \
--password "MyPassword" \
--target-ip 192.168.1.20
+4 -4
@@ -19,9 +19,9 @@
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
-PROOF_DIR="${SCRIPT_DIR}/v1/data/proof"
+PROOF_DIR="${SCRIPT_DIR}/archive/v1/data/proof"
VERIFY_PY="${PROOF_DIR}/verify.py"
-V1_SRC="${SCRIPT_DIR}/v1/src"
+V1_SRC="${SCRIPT_DIR}/archive/v1/src"
# Colors (disabled if not a terminal)
if [ -t 1 ]; then
@@ -136,7 +136,7 @@ echo ""
echo -e "${CYAN}[PHASE 3] PRODUCTION CODE INTEGRITY SCAN${RESET}"
echo ""
echo " Scanning ${V1_SRC} for np.random.rand / np.random.randn calls..."
-echo " (Excluding v1/src/testing/ -- test helpers are allowed to use random.)"
+echo " (Excluding archive/v1/src/testing/ -- test helpers are allowed to use random.)"
echo ""
MOCK_FINDINGS=0
@@ -204,7 +204,7 @@ elif [ $PIPELINE_EXIT -eq 2 ]; then
echo -e " ${YELLOW}${BOLD}RESULT: SKIP${RESET}"
echo ""
echo " No expected hash file to compare against."
-echo " Run: python v1/data/proof/verify.py --generate-hash"
+echo " Run: python archive/v1/data/proof/verify.py --generate-hash"
echo ""
echo -e "${BOLD}======================================================================${RESET}"
exit 2