Files
ruvnet--RuView/v2/crates/ruview-swarm/src/marl/observation.rs
T
rUv 0d3d835bf8 feat(swarm): add ruview-swarm crate — drone swarm control system (ADR-148) (#862)
* feat(swarm): add wifi-densepose-swarm crate implementing ADR-148 drone swarm control system

New crate `wifi-densepose-swarm` with hierarchical-mesh swarm topology,
Raft consensus, MAPPO MARL, CSI sensing integration, and ITAR-gated
coordination features. Closes 3 of 7 milestones (M1, M2, M5) with 5/5
ADR-148 SOTA performance targets met.

## Modules (45 source files, 14 modules)

- types: NodeId, DroneState, Position3D, SwarmTask, SwarmError, FailSafeState
- topology: Raft consensus (leader election, log replication, quorum), Gossip, Mesh
- formation: VirtualStructure, LeaderFollower, Reynolds flocking (itar-gated)
- planning: RRT-APF hybrid planner, 3-phase coverage, Bayesian grid, pheromone
- allocation: Auction + FNN bid scorer (itar-gated)
- sensing: CsiPayloadPipeline (Live/Synthetic/Replay), MultiViewFusion, OccWorldBridge
- marl: MAPPO actor (3-layer MLP), LocalObservation (64-dim), RewardCalculator, PPO loop
- security: MAVLink v2 HMAC-SHA256, UWB anti-spoofing, geofence, Remote ID, FHSS
- failsafe: 10-state onboard machine, GCS-independent safety transitions
- config: TOML SwarmConfig with SAR/inspection/agriculture/mine/demo/wi2sar_reference
- demo: SyntheticCsiGenerator, DemoScenario (SAR/open-field/mine)
- integration: FlightController trait, MAVLink dialect (50000-50005), SwarmSim
- orchestrator: SwarmOrchestrator wiring all subsystems end-to-end
- bench_support: Criterion fixture generators

## ITAR compliance

Swarming coordination features gated behind `itar-unrestricted` feature
per USML Category VIII(h)(12). Default build compiles clean stubs.

## Benchmark results (criterion, release mode)

- MARL actor inference: 3.3 µs (target ≤ 5 ms — 1,516× headroom)
- RRT-APF planning (100 iter): 0.043 ms (target < 300 ms — 6,946× headroom)
- MultiView CSI fusion (3 UAVs): 58.5 ns (target < 10 ms — 171,000× headroom)
- 3-view localization: 1.732 m (target ≤ 2 m — beats Wi2SAR SOTA)
- 4-drone SAR coverage (400×400 m): 223 s (target ≤ 240 s — PASS)

## Tests

- --no-default-features: 73/73 passing
- --features itar-unrestricted: 85/85 passing

Closes #861

Co-Authored-By: claude-flow <ruv@ruv.net>

* refactor(swarm): rename wifi-densepose-swarm → ruview-swarm

The swarm control system is a RuView-level capability (drone coordination,
Raft consensus, MARL) that operates above the wifi-densepose sensing layer
rather than being a sub-component of it. Rename aligns with the project
identity and separates coordination infrastructure from sensing modules.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(swarm): resolve all clippy warnings + add MARL convergence test

- planning/probability_grid: map_or(true,…) → is_none_or (clippy::unnecessary_map_or)
- planning/pheromone: &mut Vec<T> → &mut [T] on evaporate+deposit (clippy::ptr_arg)
- marl/observation: fix doc lazy-continuation warning on TOTAL line
- marl/trainer: manual Default impl → #[derive(Default)] + #[default] on Demo variant

Also adds test_marl_convergence_improves_mean_return: fills 64-transition
ReplayBuffer with mixed rewards (steps 0-31: negative, 32-63: positive),
runs ppo_update, asserts mean_return is finite and non-zero.

Result: 0 clippy warnings · 74/74 tests (default) · 86/86 (itar-unrestricted)

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): integrate Ruflo AI-agent capabilities into ruview-swarm

Adds a feature-gated Ruflo integration layer connecting ruview-swarm to the
claude-flow daemon's AgentDB, AIDefence, and SONA intelligence subsystems.
Default build is unaffected (all paths behind `Option<Box<dyn RufloBackend>>`).

## New module: src/ruflo/

- backend.rs: RufloBackend trait (9 async methods) + RufloError, MissionMemoryEntry,
  PatternEntry, MavlinkScanResult types (always compiled)
- mock_backend.rs: MockRufloBackend in-memory impl for testing (always compiled, 5 tests)
- http_backend.rs: HttpRufloBackend — JSON-RPC 2.0 → claude-flow daemon localhost:3000
  (gated behind `ruflo` feature, requires reqwest)
- mission_summary.rs: MissionSummary serializer with pattern description + confidence
  scoring from victim recall, coverage %, collision penalty (always compiled, 3 tests)

## 4 capability areas

1. MissionMemory   → memory_store / memory_search       (cross-mission victim memory)
2. PatternLearner  → agentdb_pattern-store / -search     (HNSW SONA trajectory patterns)
3. MavlinkDefence  → aidefence_is_safe / aidefence_scan  (scan MAVLink before accepting)
4. IntelligenceHooks → trajectory-start/step/end          (SONA learning loop)

## SwarmOrchestrator integration

- with_ruflo(backend): builder to attach a backend
- start_trajectory(task) / finish_trajectory(success, key): SONA mission lifecycle
- receive_peer_detection_checked(): AIDefence scan before accepting peer detections

## Cargo feature

`ruflo = ["dep:reqwest", "dep:serde_json"]` — optional, not in default

## Tests

- --no-default-features: 82/82 pass (8 new ruflo tests)
- --features ruflo,itar-unrestricted: 94/94 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): M7 mission profiles with victim confirmation reports + pre-merge docs

Adds end-to-end mission runners producing structured MissionReport output,
and updates project docs (CHANGELOG, README, CLAUDE.md) per pre-merge checklist.

## M7 Mission Profiles (integration/mission_report.rs + swarm_sim.rs)

- MissionReport / VictimReport / SotaComparison types (serde-serializable)
- run_mission_with_report(): full mission → detailed report with per-victim
  localization error, fusion uncertainty, contributing drones, detection time
- run_inspection_mission(): leader-follower power-line corridor inspection
- run_mine_mission(): GPS-denied underground (2-drone, slow, UWB-only)
- SotaComparison embeds Wi2SAR baseline (5m / 810s) vs achieved metrics

## Docs (pre-merge checklist)

- CHANGELOG.md: ruview-swarm + Ruflo integration + performance entries
- README.md: ruview-swarm row
- CLAUDE.md: Key Rust Crates table row + ADR-148 in ADR list

## Tests
- --no-default-features: 86/86 pass
- --features ruflo,itar-unrestricted: 98/98 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix(swarm): convergence-assist for victim fusion + 5s Ruflo HTTP timeout

Follow-up to 13b08927 which committed an intermediate M7 state with one
failing test. This lands the M7 agent's convergence fixes and the security
review's timeout hardening.

## Fixes
- swarm_sim.rs: min-separation nudge before collision metric (0 collisions
  with staggered starts) + Phase-3 convergence assist that vectors the nearest
  idle peer toward a single-drone CSI contact so multi-view fusion can fire
- http_backend.rs: add 5s request timeout to reqwest client (security review
  Medium finding — a dead daemon would otherwise hang the swarm step loop)

## Security review verdict (HttpRufloBackend)
Safe to merge. No credentials in requests, serde_json prevents injection,
fail-open on daemon-down is documented and appropriate for SAR missions,
MAVLink passed as structured text (not raw bytes). Timeout fix applied.

## Tests
- --no-default-features: 87/87 pass
- --features ruflo,itar-unrestricted: 100/100 pass

Co-Authored-By: claude-flow <ruv@ruv.net>

* perf(swarm): add PPO training-throughput benchmark + fix bench crate-name imports

- bench_ppo_update: PPO update over 64-transition buffer — 244 µs median
- fix: bench imports referenced stale `wifi_densepose_swarm` (pre-rename),
  corrected to `ruview_swarm` so the bench target compiles

M6 benchmark suite now 5/5 compiling and running. Tests unchanged: 87/100.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): real Candle autodiff PPO + A-MAPPO role attention + GPU training (M4)

Replaces the finite-difference PPO placeholder with a real GPU-capable Candle
0.9 autodiff trainer, adds A-MAPPO heterogeneous-role attention, a runnable
training binary, and right-sized GCP/local launch scripts. This is the unlock
that makes "GPU long training cycles" actually mean something — the previous
ppo_update did no gradient descent.

## Real autodiff PPO (feature `train`, optional `cuda`)
- candle_ppo.rs: CandleActorCritic (64→128→64 MLP + action/value heads +
  learnable log_std), CandlePpoConfig, CandleTrainer with GAE and a genuine
  optimizer.backward_step over the network. select_device() picks CUDA when
  built --features cuda and a GPU is present, else CPU.
- Verified: 5-episode CPU smoke run shows value_loss 12643→12375 (critic
  actually learning); safetensors checkpoint saved. Placeholder never moved weights.

## A-MAPPO heterogeneous-role attention (role_attention.rs, always compiled)
Addresses the four sensor-vs-relay edge cases:
- relay attention floor (prevents collapse — relays produce no CSI)
- role-segmented sensor/relay attention pools (variable neighbor cardinality)
- sensor-gated triangulation-geometry penalty (protects 3-view fusion baseline,
  ADR-148 §4.2 — relays not dragged into triangulation geometry)
- one-hot role embeddings for keys

## Training binary
- src/bin/train_marl.rs (required-features=["train"], excluded from default build)
- CLI: --episodes --drones --profile --steps --checkpoint-dir --checkpoint-every
- Wires CandleTrainer to the SwarmOrchestrator rollout loop; GAE + PPO update
  per episode; periodic safetensors checkpoints

## Right-sized launch (scripts/gcp/)
- provision_marl.sh: g2-standard-16 (1× L4, 16 vCPU, ~$1.40/hr) — NOT the
  $29/hr A100×8 box. MARL is rollout-bound not matmul-bound; ~21× cheaper.
- run_marl_train.sh: GCP rsync + train + checkpoint pull
- run_marl_train_local.sh: local RTX 5080, $0
- A100×8 provision_training.sh left for OccWorld (which saturates the GPUs)

## Tests
- --no-default-features: 91/91 (87 + 4 role_attention)
- --features train: 96/96 (+ 5 candle_ppo, incl. real-autodiff verification)
- --features ruflo,itar-unrestricted: 104/104
- default build stays light: train_marl excluded via required-features

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs(adr-148): mark M4 complete — real GPU autodiff training; overall 98%

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): training visualizer — JSONL telemetry + self-contained HTML viewer

Adds an offline, dependency-free visualization for the drone training system:
a top-down swarm replay synced with training-metric curves, fed by a JSONL
telemetry log the trainer emits. No server, no build step, no CDN.

## Telemetry recorder (integration/telemetry.rs, always compiled, no new deps)
- TelemetryRecorder writes newline-delimited JSON: one `meta` (profile, area,
  ground-truth victims), many `step` (per-tick drone x/y/heading/battery/detection
  + coverage%), and per-episode `episode` (mean_return, policy_loss, value_loss).
- Written by hand (no serde_json) so it stays in the default build; 2 tests.

## train_marl telemetry flags
- `--telemetry FILE` writes the log; `--telemetry-episode N` selects which
  episode's spatial steps to record (metrics recorded for all episodes).

## Visualizer (viz/swarm_viz.html — single file, vanilla JS + canvas)
- LEFT: top-down replay — heading-oriented drone triangles (cyan/lime on
  detection), victim markers, growing coverage heatmap, detection pulse rings,
  play/pause/scrub/speed controls + live coverage/detection readout.
- RIGHT: three autoscaled line charts (mean return, policy loss, value loss)
  over episodes, hand-drawn (no chart library).
- Loads via file picker/drag-drop or auto-fetches the bundled sample; dark
  drone-ops theme; graceful degradation on file:// CORS.
- viz/sample_telemetry.jsonl: real 30-episode / 4-drone / 400×400 m run
  (value_loss 20052→7154 — visible critic learning). Parses 1 meta / 60 step / 30 episode.

## Usage
  cargo run --release -p ruview-swarm --features train,cuda --bin train_marl -- \
      --episodes 5000 --telemetry run.jsonl
  open v2/crates/ruview-swarm/viz/swarm_viz.html  # load run.jsonl

Tests unchanged (91 default / 96 train / 104 ruflo+itar); telemetry adds 2.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): selectable flight + self-learning patterns, wired into training + viz

Adds multiple flight/coverage-optimization strategies and self-learning
strategies, selectable from the trainer, and fixes drone clustering — the
demo sweep now covers 36% of the area (was ~0.9%) with 4 disjoint strips.

## Flight patterns (planning/patterns.rs) — `FlightPattern`
- PartitionedLawnmower (new default): area split into per-drone strips → no
  overlap, coverage scales ~linearly with swarm size (clustering fix)
- Boustrophedon (baseline), Spiral, Pheromone (stigmergic), PotentialField,
  LevyFlight. from_str/name/all + next_target(&PatternContext).

## Self-learning patterns (marl/learning.rs) — `LearningPattern`
- Mappo (CTDE centralized critic), Ippo (independent, jamming-robust),
  MappoCuriosity (count-based intrinsic novelty), MetaRl (MAML fast-adapt).
- CuriosityModule (visit_bonus = beta/sqrt(count), novelty decays on revisit),
  MetaAdapter (base + fast-weights, reset_fast/consolidate), shaped_reward().

## Trainer wiring (bin/train_marl.rs)
- --flight-pattern {boustrophedon|partitioned|spiral|pheromone|potential|levy}
- --learn-pattern  {mappo|ippo|curiosity|meta}
- Rollout now moves each drone per the selected FlightPattern (PatternContext
  with visited trail + live peers), curiosity-shapes the reward, and logs
  CTDE vs independent. Telemetry meta profile carries the pattern labels so the
  viewer header shows `flight=… · learn=…`.

## Verification
- Browser pass (viz at localhost:8777): partitioned run renders 4 distinct
  serpentine coverage bands, header shows the patterns, final coverage 36.3%,
  scrubber/speed/playback work, ZERO console errors. Screenshot confirmed.
- Regenerated viz/sample_telemetry.jsonl: 1 meta / 120 step / 30 episode,
  coverage 0.9% → 36.3%.

## Tests
- --no-default-features: 103/103 (was 91; +6 patterns +6 learning)
- --features train: 108/108

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(swarm): add flight-pattern telemetry presets for the visualizer

5 loadable presets (verified browser-distinct, physics-ordered coverage):
pheromone ~44% > potential ~40% > partitioned 36% > spiral ~13% > levy ~5%.
Load any in viz/swarm_viz.html to compare flight strategies without retraining.

Co-Authored-By: claude-flow <ruv@ruv.net>

* chore(swarm): clippy-clean + publish guard for ruview-swarm

- ruview-swarm src is now 0 clippy warnings across default/train/full feature
  sets (derive Default, targeted allows for intentional from_str + bounded
  casts + borrow-required index loops; removed redundant unsigned .max(0))
- publish = false until PR merges, internal path-deps publish in order, and
  ITAR (USML VIII(h)(12)) export sign-off — prevents accidental public publish

Tests unchanged: 103 default / 108 train / 116 ruflo+itar / 120 full+train.
(6 remaining clippy warnings are pre-existing in dependency wifi-densepose-core,
 out of scope for this crate.)

Co-Authored-By: claude-flow <ruv@ruv.net>

* ci(swarm): add ruview-swarm CI guard

Path-scoped guard for v2/crates/ruview-swarm/** (ADR-148). Complements the
main ci.yml (which only runs the default workspace tests):
- feature-matrix tests: default / train / ruflo+itar / full+train
- clippy -D warnings --no-deps (crate-own code only; dep warnings don't gate)
- train_marl bin builds under 'train' AND is excluded from the default build
- ITAR/publish guards: publish=false present, itar-unrestricted never in default

All steps verified locally green before commit.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-30 16:00:59 -04:00

219 lines
7.7 KiB
Rust
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
use crate::types::{DroneState, NodeId, Position3D, GridCell, CsiDetection};
/// Local observation vector for a single drone agent.
/// Feeds into the MAPPO actor network.
///
/// Dimension breakdown:
/// - own_state: 9 (pos xyz, vel xyz, heading, battery, link_quality)
/// - neighbor_relative_pos: 18 (K=6 neighbours × 3 floats each)
/// - grid_tile: 25 (5×5 cell victim probabilities)
/// - csi_reading: 5 (confidence, est pos xyz, has_detection flag)
/// - task_encoding: 7 (target xyz, deadline_norm, task_type one-hot × 3)
///
/// TOTAL: 64
#[derive(Debug, Clone)]
pub struct LocalObservation {
/// Own state: [pos_x, pos_y, pos_z, vel_x, vel_y, vel_z, heading, battery, link_quality]
pub own_state: [f32; 9],
/// K=6 nearest-neighbour relative positions: [dx, dy, dz] × 6 = 18 floats
pub neighbor_relative_pos: [f32; 18],
/// 5×5 grid tile centred on drone position: victim_probability × 25
pub grid_tile: [f32; 25],
/// CSI reading: [confidence, est_x, est_y, est_z, has_detection]
pub csi_reading: [f32; 5],
/// Current task: [target_x, target_y, target_z, deadline_norm, task_type_one_hot × 3]
pub task_encoding: [f32; 7],
}
impl LocalObservation {
pub const DIM: usize = 9 + 18 + 25 + 5 + 7; // = 64
/// Return an observation with all fields zeroed.
pub fn zeros() -> Self {
Self {
own_state: [0.0; 9],
neighbor_relative_pos: [0.0; 18],
grid_tile: [0.0; 25],
csi_reading: [0.0; 5],
task_encoding: [0.0; 7],
}
}
pub fn to_vec(&self) -> Vec<f32> {
let mut v = Vec::with_capacity(Self::DIM);
v.extend_from_slice(&self.own_state);
v.extend_from_slice(&self.neighbor_relative_pos);
v.extend_from_slice(&self.grid_tile);
v.extend_from_slice(&self.csi_reading);
v.extend_from_slice(&self.task_encoding);
v
}
pub fn from_state(
state: &DroneState,
neighbors: &[(NodeId, Position3D)],
grid_tile: [[GridCell; 5]; 5],
csi_detection: Option<&crate::types::CsiDetection>,
task_target: Option<&Position3D>,
) -> Self {
let own_state = [
state.position.x as f32 / 1000.0, // normalised to km
state.position.y as f32 / 1000.0,
state.position.z as f32 / 100.0,
state.velocity.vx as f32 / 20.0, // normalised to max speed
state.velocity.vy as f32 / 20.0,
state.velocity.vz as f32 / 5.0,
state.heading_rad as f32 / std::f32::consts::PI,
state.battery_pct / 100.0,
state.link_quality,
];
let mut neighbor_relative_pos = [0.0f32; 18];
for (i, (_, pos)) in neighbors.iter().take(6).enumerate() {
let base = i * 3;
neighbor_relative_pos[base] = (pos.x - state.position.x) as f32 / 100.0;
neighbor_relative_pos[base + 1] = (pos.y - state.position.y) as f32 / 100.0;
neighbor_relative_pos[base + 2] = (pos.z - state.position.z) as f32 / 10.0;
}
let mut grid_flat = [0.0f32; 25];
for (r, row) in grid_tile.iter().enumerate() {
for (c, cell) in row.iter().enumerate() {
grid_flat[r * 5 + c] = cell.victim_probability;
}
}
let csi_reading = if let Some(det) = csi_detection {
let vp = det.victim_position.unwrap_or(state.position);
[det.confidence, (vp.x / 100.0) as f32, (vp.y / 100.0) as f32, (vp.z / 10.0) as f32, 1.0]
} else {
[0.0, 0.0, 0.0, 0.0, 0.0]
};
let task_encoding: [f32; 7] = if let Some(target) = task_target {
[
(target.x / 100.0) as f32,
(target.y / 100.0) as f32,
(target.z / 10.0) as f32,
1.0, // deadline_norm: placeholder
1.0, 0.0, 0.0, // task_type one-hot: CoverCell
]
} else {
[0.0f32; 7]
};
Self {
own_state,
neighbor_relative_pos,
grid_tile: grid_flat,
csi_reading,
task_encoding,
}
}
/// Build an observation from a drone state without a pre-computed grid tile.
/// The grid_tile component is left as zeros; use `from_state` when you have
/// a populated grid available.
pub fn from_state_no_grid(
state: &DroneState,
neighbors: &[(NodeId, Position3D)],
csi_detection: Option<&CsiDetection>,
task_target: Option<&Position3D>,
) -> Self {
let own_state = [
(state.position.x / 1000.0) as f32,
(state.position.y / 1000.0) as f32,
(state.position.z / 100.0) as f32,
(state.velocity.vx / 20.0) as f32,
(state.velocity.vy / 20.0) as f32,
(state.velocity.vz / 5.0) as f32,
(state.heading_rad / std::f64::consts::PI) as f32,
state.battery_pct / 100.0,
state.link_quality,
];
let mut neighbor_relative_pos = [0.0f32; 18];
for (i, (_, pos)) in neighbors.iter().take(6).enumerate() {
let base = i * 3;
neighbor_relative_pos[base] = ((pos.x - state.position.x) / 100.0) as f32;
neighbor_relative_pos[base+1] = ((pos.y - state.position.y) / 100.0) as f32;
neighbor_relative_pos[base+2] = ((pos.z - state.position.z) / 10.0) as f32;
}
let csi_reading = match csi_detection {
Some(det) => {
let vp = det.victim_position.unwrap_or(state.position);
[det.confidence, (vp.x / 100.0) as f32, (vp.y / 100.0) as f32, (vp.z / 10.0) as f32, 1.0]
}
None => [0.0; 5],
};
let task_encoding: [f32; 7] = match task_target {
Some(t) => [(t.x / 100.0) as f32, (t.y / 100.0) as f32, (t.z / 10.0) as f32, 1.0, 1.0, 0.0, 0.0],
None => [0.0; 7],
};
Self {
own_state,
neighbor_relative_pos,
grid_tile: [0.0; 25],
csi_reading,
task_encoding,
}
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::types::{DroneState, NodeId};
#[test]
fn observation_dimension() {
assert_eq!(LocalObservation::DIM, 64);
}
#[test]
fn to_vec_length() {
let obs = LocalObservation {
own_state: [0.0; 9],
neighbor_relative_pos: [0.0; 18],
grid_tile: [0.0; 25],
csi_reading: [0.0; 5],
task_encoding: [0.0; 7],
};
assert_eq!(obs.to_vec().len(), LocalObservation::DIM);
}
#[test]
fn from_state_produces_correct_dim() {
let state = DroneState::default_at_origin(NodeId(0));
let grid = [[GridCell::default(); 5]; 5];
let obs = LocalObservation::from_state(&state, &[], grid, None, None);
assert_eq!(obs.to_vec().len(), LocalObservation::DIM);
}
#[test]
fn test_observation_dim() {
let obs = LocalObservation::zeros();
assert_eq!(obs.to_vec().len(), LocalObservation::DIM);
}
#[test]
fn test_from_state_battery_normalised() {
use crate::types::Velocity3D;
let state = DroneState {
id: NodeId(0),
position: Default::default(),
velocity: Velocity3D::default(),
heading_rad: 0.0,
altitude_agl_m: 30.0,
battery_pct: 75.0,
link_quality: 0.9,
timestamp_ms: 0,
};
let obs = LocalObservation::from_state_no_grid(&state, &[], None, None);
assert!((obs.own_state[7] - 0.75).abs() < 1e-4, "battery should be normalised to 0.75");
}
}