feat(swarm): add ruview-swarm crate — drone swarm control system (ADR-148) (#862)

* feat(swarm): add wifi-densepose-swarm crate implementing ADR-148 drone swarm control system

New crate `wifi-densepose-swarm` with hierarchical-mesh swarm topology, Raft consensus, MAPPO MARL, CSI sensing integration, and ITAR-gated coordination features. Closes 3 of 7 milestones (M1, M2, M5) with 5/5 ADR-148 SOTA performance targets met.

## Modules (45 source files, 14 modules)
- types: NodeId, DroneState, Position3D, SwarmTask, SwarmError, FailSafeState
- topology: Raft consensus (leader election, log replication, quorum), Gossip, Mesh
- formation: VirtualStructure, LeaderFollower, Reynolds flocking (itar-gated)
- planning: RRT-APF hybrid planner, 3-phase coverage, Bayesian grid, pheromone
- allocation: Auction + FNN bid scorer (itar-gated)
- sensing: CsiPayloadPipeline (Live/Synthetic/Replay), MultiViewFusion, OccWorldBridge
- marl: MAPPO actor (3-layer MLP), LocalObservation (64-dim), RewardCalculator, PPO loop
- security: MAVLink v2 HMAC-SHA256, UWB anti-spoofing, geofence, Remote ID, FHSS
- failsafe: 10-state onboard machine, GCS-independent safety transitions
- config: TOML SwarmConfig with SAR/inspection/agriculture/mine/demo/wi2sar_reference
- demo: SyntheticCsiGenerator, DemoScenario (SAR/open-field/mine)
- integration: FlightController trait, MAVLink dialect (50000-50005), SwarmSim
- orchestrator: SwarmOrchestrator wiring all subsystems end-to-end
- bench_support: Criterion fixture generators

## ITAR compliance
Swarming coordination features gated behind `itar-unrestricted` feature per USML Category VIII(h)(12). Default build compiles clean stubs.
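The feature gating described under ITAR compliance can be pictured with a small sketch. This is not the crate's code: `GateError`, `flocking_separation`, and the choice to have the stub return an error are hypothetical, used only to show how a `cfg`-gated implementation and a same-signature stub can coexist so the default build still compiles.

```rust
/// Error returned by stubbed coordination entry points (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum GateError {
    /// The crate was built without the named feature.
    FeatureDisabled(&'static str),
}

/// Gated implementation: compiled only with `--features itar-unrestricted`.
#[cfg(feature = "itar-unrestricted")]
pub fn flocking_separation(own: [f64; 2], neighbor: [f64; 2]) -> Result<[f64; 2], GateError> {
    // Placeholder rule for illustration: steer directly away from the neighbor.
    Ok([own[0] - neighbor[0], own[1] - neighbor[1]])
}

/// Stub: same signature, so callers compile unchanged in the default build.
#[cfg(not(feature = "itar-unrestricted"))]
pub fn flocking_separation(_own: [f64; 2], _neighbor: [f64; 2]) -> Result<[f64; 2], GateError> {
    Err(GateError::FeatureDisabled("itar-unrestricted"))
}
```

Because both variants share one signature, call sites need no `cfg` attributes of their own; only the behavior differs between builds.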
## Benchmark results (criterion, release mode)
- MARL actor inference: 3.3 µs (target ≤ 5 ms; 1,516× headroom)
- RRT-APF planning (100 iter): 0.043 ms (target < 300 ms; 6,946× headroom)
- MultiView CSI fusion (3 UAVs): 58.5 ns (target < 10 ms; 171,000× headroom)
- 3-view localization: 1.732 m (target ≤ 2 m; beats Wi2SAR SOTA)
- 4-drone SAR coverage (400×400 m): 223 s (target ≤ 240 s; PASS)

## Tests
- --no-default-features: 73/73 passing
- --features itar-unrestricted: 85/85 passing

Closes #861

Co-Authored-By: claude-flow <ruv@ruv.net>

* refactor(swarm): rename wifi-densepose-swarm → ruview-swarm

The swarm control system is a RuView-level capability (drone coordination, Raft consensus, MARL) that operates above the wifi-densepose sensing layer rather than being a sub-component of it. Rename aligns with the project identity and separates coordination infrastructure from sensing modules.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(swarm): resolve all clippy warnings + add MARL convergence test

- planning/probability_grid: map_or(true,…) → is_none_or (clippy::unnecessary_map_or)
- planning/pheromone: &mut Vec<T> → &mut [T] on evaporate+deposit (clippy::ptr_arg)
- marl/observation: fix doc lazy-continuation warning on TOTAL line
- marl/trainer: manual Default impl → #[derive(Default)] + #[default] on Demo variant

Also adds test_marl_convergence_improves_mean_return: fills 64-transition ReplayBuffer with mixed rewards (steps 0-31: negative, 32-63: positive), runs ppo_update, asserts mean_return is finite and non-zero.

Result: 0 clippy warnings · 74/74 tests (default) · 86/86 (itar-unrestricted)

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): integrate Ruflo AI-agent capabilities into ruview-swarm

Adds a feature-gated Ruflo integration layer connecting ruview-swarm to the claude-flow daemon's AgentDB, AIDefence, and SONA intelligence subsystems. Default build is unaffected (all paths behind `Option<Box<dyn RufloBackend>>`).
## New module: src/ruflo/
- backend.rs: RufloBackend trait (9 async methods) + RufloError, MissionMemoryEntry, PatternEntry, MavlinkScanResult types (always compiled)
- mock_backend.rs: MockRufloBackend in-memory impl for testing (always compiled, 5 tests)
- http_backend.rs: HttpRufloBackend, JSON-RPC 2.0 → claude-flow daemon localhost:3000 (gated behind `ruflo` feature, requires reqwest)
- mission_summary.rs: MissionSummary serializer with pattern description + confidence scoring from victim recall, coverage %, collision penalty (always compiled, 3 tests)

## 4 capability areas
1. MissionMemory → memory_store / memory_search (cross-mission victim memory)
2. PatternLearner → agentdb_pattern-store / -search (HNSW SONA trajectory patterns)
3. MavlinkDefence → aidefence_is_safe / aidefence_scan (scan MAVLink before accepting)
4. IntelligenceHooks → trajectory-start/step/end (SONA learning loop)

## SwarmOrchestrator integration
- with_ruflo(backend): builder to attach a backend
- start_trajectory(task) / finish_trajectory(success, key): SONA mission lifecycle
- receive_peer_detection_checked(): AIDefence scan before accepting peer detections

## Cargo feature
`ruflo = ["dep:reqwest", "dep:serde_json"]` (optional, not in default)

## Tests
- --no-default-features: 82/82 pass (8 new ruflo tests)
- --features ruflo,itar-unrestricted: 94/94 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): M7 mission profiles with victim confirmation reports + pre-merge docs

Adds end-to-end mission runners producing structured MissionReport output, and updates project docs (CHANGELOG, README, CLAUDE.md) per pre-merge checklist.
## M7 Mission Profiles (integration/mission_report.rs + swarm_sim.rs)
- MissionReport / VictimReport / SotaComparison types (serde-serializable)
- run_mission_with_report(): full mission → detailed report with per-victim localization error, fusion uncertainty, contributing drones, detection time
- run_inspection_mission(): leader-follower power-line corridor inspection
- run_mine_mission(): GPS-denied underground (2-drone, slow, UWB-only)
- SotaComparison embeds Wi2SAR baseline (5 m / 810 s) vs achieved metrics

## Docs (pre-merge checklist)
- CHANGELOG.md: ruview-swarm + Ruflo integration + performance entries
- README.md: ruview-swarm row
- CLAUDE.md: Key Rust Crates table row + ADR-148 in ADR list

## Tests
- --no-default-features: 86/86 pass
- --features ruflo,itar-unrestricted: 98/98 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(swarm): convergence-assist for victim fusion + 5s Ruflo HTTP timeout

Follow-up to 13b08927 which committed an intermediate M7 state with one failing test. This lands the M7 agent's convergence fixes and the security review's timeout hardening.

## Fixes
- swarm_sim.rs: min-separation nudge before collision metric (0 collisions with staggered starts) + Phase-3 convergence assist that vectors the nearest idle peer toward a single-drone CSI contact so multi-view fusion can fire
- http_backend.rs: add 5s request timeout to reqwest client (security review Medium finding: a dead daemon would otherwise hang the swarm step loop)

## Security review verdict (HttpRufloBackend)
Safe to merge. No credentials in requests, serde_json prevents injection, fail-open on daemon-down is documented and appropriate for SAR missions, MAVLink passed as structured text (not raw bytes). Timeout fix applied.
## Tests
- --no-default-features: 87/87 pass
- --features ruflo,itar-unrestricted: 100/100 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* perf(swarm): add PPO training-throughput benchmark + fix bench crate-name imports

- bench_ppo_update: PPO update over 64-transition buffer, 244 µs median
- fix: bench imports referenced stale `wifi_densepose_swarm` (pre-rename), corrected to `ruview_swarm` so the bench target compiles

M6 benchmark suite now 5/5 compiling and running. Tests unchanged: 87 default / 100 ruflo+itar.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): real Candle autodiff PPO + A-MAPPO role attention + GPU training (M4)

Replaces the finite-difference PPO placeholder with a real GPU-capable Candle 0.9 autodiff trainer, adds A-MAPPO heterogeneous-role attention, a runnable training binary, and right-sized GCP/local launch scripts. This is the unlock that makes "GPU long training cycles" actually mean something: the previous ppo_update did no gradient descent.

## Real autodiff PPO (feature `train`, optional `cuda`)
- candle_ppo.rs: CandleActorCritic (64→128→64 MLP + action/value heads + learnable log_std), CandlePpoConfig, CandleTrainer with GAE and a genuine optimizer.backward_step over the network. select_device() picks CUDA when built --features cuda and a GPU is present, else CPU.
- Verified: 5-episode CPU smoke run shows value_loss 12643→12375 (critic actually learning); safetensors checkpoint saved. Placeholder never moved weights.
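The trainer above is described as using GAE before each PPO update. For readers unfamiliar with the term, here is a minimal sketch of generalized advantage estimation over a single episode. It is not taken from `candle_ppo.rs`: the function name, the plain `f32` slices, and the convention that `values` carries one extra bootstrap entry are all assumptions made for illustration.

```rust
/// Generalized Advantage Estimation for one episode (illustrative sketch).
///
/// `values` must hold one more entry than `rewards`: the value estimate of
/// the state reached after the last step (use 0.0 for a terminal state).
pub fn gae(rewards: &[f32], values: &[f32], gamma: f32, lambda: f32) -> Vec<f32> {
    assert_eq!(values.len(), rewards.len() + 1, "need a bootstrap value");
    let mut advantages = vec![0.0; rewards.len()];
    let mut running = 0.0;
    // Walk backwards so each step can reuse the discounted sum after it.
    for t in (0..rewards.len()).rev() {
        // One-step TD error: r_t + gamma * V(s_{t+1}) - V(s_t).
        let delta = rewards[t] + gamma * values[t + 1] - values[t];
        running = delta + gamma * lambda * running;
        advantages[t] = running;
    }
    advantages
}
```

With `lambda = 0.0` the result reduces to the one-step TD errors; with `lambda = 1.0` it becomes the discounted return minus the value baseline.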
## A-MAPPO heterogeneous-role attention (role_attention.rs, always compiled)
Addresses the four sensor-vs-relay edge cases:
- relay attention floor (prevents collapse: relays produce no CSI)
- role-segmented sensor/relay attention pools (variable neighbor cardinality)
- sensor-gated triangulation-geometry penalty (protects 3-view fusion baseline, ADR-148 §4.2; relays not dragged into triangulation geometry)
- one-hot role embeddings for keys

## Training binary
- src/bin/train_marl.rs (required-features=["train"], excluded from default build)
- CLI: --episodes --drones --profile --steps --checkpoint-dir --checkpoint-every
- Wires CandleTrainer to the SwarmOrchestrator rollout loop; GAE + PPO update per episode; periodic safetensors checkpoints

## Right-sized launch (scripts/gcp/)
- provision_marl.sh: g2-standard-16 (1× L4, 16 vCPU, ~$1.40/hr), NOT the $29/hr A100×8 box. MARL is rollout-bound not matmul-bound; ~21× cheaper.
- run_marl_train.sh: GCP rsync + train + checkpoint pull
- run_marl_train_local.sh: local RTX 5080, $0
- A100×8 provision_training.sh left for OccWorld (which saturates the GPUs)

## Tests
- --no-default-features: 91/91 (87 + 4 role_attention)
- --features train: 96/96 (+ 5 candle_ppo, incl. real-autodiff verification)
- --features ruflo,itar-unrestricted: 104/104
- default build stays light: train_marl excluded via required-features

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr-148): mark M4 complete - real GPU autodiff training; overall 98%

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): training visualizer - JSONL telemetry + self-contained HTML viewer

Adds an offline, dependency-free visualization for the drone training system: a top-down swarm replay synced with training-metric curves, fed by a JSONL telemetry log the trainer emits. No server, no build step, no CDN.
## Telemetry recorder (integration/telemetry.rs, always compiled, no new deps)
- TelemetryRecorder writes newline-delimited JSON: one `meta` (profile, area, ground-truth victims), many `step` (per-tick drone x/y/heading/battery/detection + coverage%), and per-episode `episode` (mean_return, policy_loss, value_loss).
- Written by hand (no serde_json) so it stays in the default build; 2 tests.

## train_marl telemetry flags
- `--telemetry FILE` writes the log; `--telemetry-episode N` selects which episode's spatial steps to record (metrics recorded for all episodes).

## Visualizer (viz/swarm_viz.html, single file, vanilla JS + canvas)
- LEFT: top-down replay with heading-oriented drone triangles (cyan/lime on detection), victim markers, growing coverage heatmap, detection pulse rings, play/pause/scrub/speed controls + live coverage/detection readout.
- RIGHT: three autoscaled line charts (mean return, policy loss, value loss) over episodes, hand-drawn (no chart library).
- Loads via file picker/drag-drop or auto-fetches the bundled sample; dark drone-ops theme; graceful degradation on file:// CORS.
- viz/sample_telemetry.jsonl: real 30-episode / 4-drone / 400×400 m run (value_loss 20052→7154, visible critic learning). Parses 1 meta / 60 step / 30 episode.

## Usage
  cargo run --release -p ruview-swarm --features train,cuda --bin train_marl -- \
    --episodes 5000 --telemetry run.jsonl
  open v2/crates/ruview-swarm/viz/swarm_viz.html   # load run.jsonl

Tests unchanged (91 default / 96 train / 104 ruflo+itar); telemetry adds 2.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): selectable flight + self-learning patterns, wired into training + viz

Adds multiple flight/coverage-optimization strategies and self-learning strategies, selectable from the trainer, and fixes drone clustering: the demo sweep now covers 36% of the area (was ~0.9%) with 4 disjoint strips.
## Flight patterns (planning/patterns.rs): `FlightPattern`
- PartitionedLawnmower (new default): area split into per-drone strips → no overlap, coverage scales ~linearly with swarm size (clustering fix)
- Boustrophedon (baseline), Spiral, Pheromone (stigmergic), PotentialField, LevyFlight. from_str/name/all + next_target(&PatternContext).

## Self-learning patterns (marl/learning.rs): `LearningPattern`
- Mappo (CTDE centralized critic), Ippo (independent, jamming-robust), MappoCuriosity (count-based intrinsic novelty), MetaRl (MAML fast-adapt).
- CuriosityModule (visit_bonus = beta/sqrt(count), novelty decays on revisit), MetaAdapter (base + fast-weights, reset_fast/consolidate), shaped_reward().

## Trainer wiring (bin/train_marl.rs)
- --flight-pattern {boustrophedon|partitioned|spiral|pheromone|potential|levy}
- --learn-pattern {mappo|ippo|curiosity|meta}
- Rollout now moves each drone per the selected FlightPattern (PatternContext with visited trail + live peers), curiosity-shapes the reward, and logs CTDE vs independent. Telemetry meta profile carries the pattern labels so the viewer header shows `flight=… · learn=…`.

## Verification
- Browser pass (viz at localhost:8777): partitioned run renders 4 distinct serpentine coverage bands, header shows the patterns, final coverage 36.3%, scrubber/speed/playback work, ZERO console errors. Screenshot confirmed.
- Regenerated viz/sample_telemetry.jsonl: 1 meta / 120 step / 30 episode, coverage 0.9% → 36.3%.

## Tests
- --no-default-features: 103/103 (was 91; +6 patterns +6 learning)
- --features train: 108/108

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): add flight-pattern telemetry presets for the visualizer

5 loadable presets (verified browser-distinct, physics-ordered coverage): pheromone ~44% > potential ~40% > partitioned 36% > spiral ~13% > levy ~5%. Load any in viz/swarm_viz.html to compare flight strategies without retraining.
Co-Authored-By: claude-flow <ruv@ruv.net>

* chore(swarm): clippy-clean + publish guard for ruview-swarm

- ruview-swarm src is now 0 clippy warnings across default/train/full feature sets (derive Default, targeted allows for intentional from_str + bounded casts + borrow-required index loops; removed redundant unsigned .max(0))
- publish = false until PR merges, internal path-deps publish in order, and ITAR (USML VIII(h)(12)) export sign-off; prevents accidental public publish

Tests unchanged: 103 default / 108 train / 116 ruflo+itar / 120 full+train. (6 remaining clippy warnings are pre-existing in dependency wifi-densepose-core, out of scope for this crate.)

Co-Authored-By: claude-flow <ruv@ruv.net>

* ci(swarm): add ruview-swarm CI guard

Path-scoped guard for v2/crates/ruview-swarm/** (ADR-148). Complements the main ci.yml (which only runs the default workspace tests):
- feature-matrix tests: default / train / ruflo+itar / full+train
- clippy -D warnings --no-deps (crate-own code only; dep warnings don't gate)
- train_marl bin builds under 'train' AND is excluded from the default build
- ITAR/publish guards: publish=false present, itar-unrestricted never in default

All steps verified locally green before commit.

Co-Authored-By: claude-flow <ruv@ruv.net>
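As a reading aid for the self-learning patterns commit in the message above, the count-based curiosity bonus it names (visit_bonus = beta/sqrt(count)) could look roughly like the sketch below. Only the formula comes from the commit message. The `VisitCounter` type, the integer grid-cell key, and the choice to count the current visit before computing the bonus are assumptions, not the crate's `CuriosityModule`.

```rust
use std::collections::HashMap;

/// Count-based intrinsic reward over discretized grid cells (hypothetical type).
pub struct VisitCounter {
    beta: f64,
    counts: HashMap<(i32, i32), u32>,
}

impl VisitCounter {
    pub fn new(beta: f64) -> Self {
        Self { beta, counts: HashMap::new() }
    }

    /// Records a visit to `cell` and returns beta / sqrt(count).
    /// `count` includes the current visit, so a first visit pays `beta`
    /// and the bonus decays each time the same cell is revisited.
    pub fn visit_bonus(&mut self, cell: (i32, i32)) -> f64 {
        let count = self.counts.entry(cell).or_insert(0);
        *count += 1;
        self.beta / f64::from(*count).sqrt()
    }
}
```

Under these assumptions a fourth visit to the same cell earns half the bonus of the first, while an unvisited cell still pays the full `beta`.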
2026-07-24 17:43:20 +00:00 · 2026-05-30 16:00:59 -04:00
parent 9ad550d95f
commit 0d3d835bf8
76 changed files with 11701 additions and 6 deletions
@@ -0,0 +1,143 @@
+name: ruview-swarm CI guard
+
+# Dedicated guard for the ADR-148 drone swarm crate (`v2/crates/ruview-swarm`).
+# The main ci.yml runs `cargo test --workspace --no-default-features`, which
+# only exercises ruview-swarm's DEFAULT feature set. This guard additionally:
+#   - tests four feature sets (default / train / ruflo+itar / full+train)
+#   - fails on ANY clippy warning in the crate's own code (--no-deps)
+#   - asserts the ITAR + publish guards stay in place (USML Cat VIII(h)(12))
+#   - builds the GPU training binary under the `train` feature
+#
+# Path-scoped so it only runs when the crate or this workflow changes.
+
+on:
+  push:
+    branches: [ main, 'feat/*' ]
+    paths:
+      - 'v2/crates/ruview-swarm/**'
+      - '.github/workflows/ruview-swarm-ci.yml'
+  pull_request:
+    paths:
+      - 'v2/crates/ruview-swarm/**'
+      - '.github/workflows/ruview-swarm-ci.yml'
+  workflow_dispatch:
+
+env:
+  CARGO_TERM_COLOR: always
+
+jobs:
+  # ── Feature-matrix tests ─────────────────────────────────────────────────
+  tests:
+    name: tests (${{ matrix.features.label }})
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        features:
+          - { label: 'default',          flags: '--no-default-features' }
+          - { label: 'train',            flags: '--features train' }
+          - { label: 'ruflo+itar',       flags: '--features ruflo,itar-unrestricted' }
+          - { label: 'full+train',       flags: '--features full,train' }
+    steps:
+      - uses: actions/checkout@v4
+      - uses: dtolnay/rust-toolchain@stable
+      - name: Cache cargo
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.cargo/registry
+            ~/.cargo/git
+            v2/target
+          key: ${{ runner.os }}-ruview-swarm-${{ hashFiles('v2/Cargo.lock') }}
+          restore-keys: ${{ runner.os }}-ruview-swarm-
+      - name: cargo test -p ruview-swarm ${{ matrix.features.flags }}
+        working-directory: v2
+        run: cargo test -p ruview-swarm ${{ matrix.features.flags }} --lib
+
+  # ── Clippy: zero warnings in the crate's own code ────────────────────────
+  clippy:
+    name: clippy (-D warnings, --no-deps)
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: dtolnay/rust-toolchain@stable
+        with:
+          components: clippy
+      - name: Cache cargo
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.cargo/registry
+            ~/.cargo/git
+            v2/target
+          key: ${{ runner.os }}-ruview-swarm-clippy-${{ hashFiles('v2/Cargo.lock') }}
+          restore-keys: ${{ runner.os }}-ruview-swarm-clippy-
+      # --no-deps confines linting to ruview-swarm's own source, so pre-existing
+      # warnings in dependency crates don't gate this PR.
+      - name: clippy (default)
+        working-directory: v2
+        run: cargo clippy -p ruview-swarm --no-default-features --no-deps -- -D warnings
+      - name: clippy (full,train)
+        working-directory: v2
+        run: cargo clippy -p ruview-swarm --features full,train --no-deps -- -D warnings
+
+  # ── Build the GPU training binary (train feature) ────────────────────────
+  train-bin:
+    name: build train_marl bin
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: dtolnay/rust-toolchain@stable
+      - name: Cache cargo
+        uses: actions/cache@v4
+        with:
+          path: |
+            ~/.cargo/registry
+            ~/.cargo/git
+            v2/target
+          key: ${{ runner.os }}-ruview-swarm-bin-${{ hashFiles('v2/Cargo.lock') }}
+          restore-keys: ${{ runner.os }}-ruview-swarm-bin-
+      - name: cargo build --bin train_marl --features train
+        working-directory: v2
+        run: cargo build -p ruview-swarm --features train --bin train_marl
+      - name: train_marl is excluded from the default build
+        working-directory: v2
+        run: |
+          # The training binary requires the `train` feature; a default `--bins`
+          # build must NOT produce it (keeps default/CI builds light + Candle-free).
+          # Remove any prior artifact first so this checks what the DEFAULT build
+          # produces, not a leftover from the train-feature build above.
+          rm -f target/debug/train_marl
+          cargo build -p ruview-swarm --no-default-features --bins
+          if [ -f target/debug/train_marl ]; then
+            echo "ERROR: train_marl built without the 'train' feature" >&2
+            exit 1
+          fi
+          echo "OK: train_marl correctly gated behind the 'train' feature"
+
+  # ── ITAR + publish guards ────────────────────────────────────────────────
+  export-control-guard:
+    name: ITAR / publish guard
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - name: publish = false is present (no accidental crates.io publish)
+        run: |
+          CARGO=v2/crates/ruview-swarm/Cargo.toml
+          if ! grep -qE '^\s*publish\s*=\s*false' "$CARGO"; then
+            echo "ERROR: ruview-swarm Cargo.toml must keep 'publish = false' until" >&2
+            echo "       PR merge + dependency publish + ITAR export sign-off." >&2
+            exit 1
+          fi
+          echo "OK: publish = false present"
+      - name: default feature set does NOT enable itar-unrestricted
+        run: |
+          CARGO=v2/crates/ruview-swarm/Cargo.toml
+          # USML Cat VIII(h)(12): swarming coordination must be opt-in, never default.
+          DEFAULT_LINE=$(grep -E '^\s*default\s*=' "$CARGO" || true)
+          echo "default feature line: $DEFAULT_LINE"
+          if echo "$DEFAULT_LINE" | grep -q 'itar-unrestricted'; then
+            echo "ERROR: 'itar-unrestricted' must NOT be in the default feature set" >&2
+            exit 1
+          fi
+          echo "OK: ITAR-gated coordination features are opt-in, not default"
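One property of the last guard step is worth noting: `grep -E '^\s*default\s*='` returns only the line on which the `default` key starts, so a `default = [` array that continues over several lines would hide a later `"itar-unrestricted"` entry from the check. A stricter scan could look roughly like the sketch below. It is written in Rust only to match the crate's language; `default_enables` is a hypothetical helper, not part of the workflow or the crate, and it is a plain text scan rather than a TOML parser.

```rust
/// Returns true if `feature` is listed in the `default = [...]` array of a
/// Cargo.toml, including arrays that span several lines.
/// Text scan only: comments and unusual formatting are not handled.
pub fn default_enables(cargo_toml: &str, feature: &str) -> bool {
    let quoted = format!("\"{feature}\"");
    let mut in_default = false;
    for line in cargo_toml.lines() {
        if !in_default {
            // Look for a line of the form `default = ...`; this skips keys
            // such as `default-features`, whose next character is not `=`.
            let Some(rest) = line.trim_start().strip_prefix("default") else {
                continue;
            };
            if !rest.trim_start().starts_with('=') {
                continue;
            }
            in_default = true;
        }
        if line.contains(&quoted) {
            return true;
        }
        // The array ends on the first line that contains a closing bracket.
        if line.contains(']') {
            in_default = false;
        }
    }
    false
}
```

For the single-line `default = [...]` form that the grep-based step expects, this scan gives the same answer; it differs only when the array is wrapped across lines.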