Files
ruvnet--RuView/v2
ruv 9c49ff1a38 feat(adr-110): fleet cardinality gauge wifi_densepose_mesh_node_total
Iter 37 — adds a fleet-summary gauge to the iter-36 Prometheus
exposition. Ops dashboards now answer "how many leaders / followers
/ no-sync nodes are there right now" in one scrape, without having
to scrape every per-node series and aggregate client-side.

  # HELP wifi_densepose_mesh_node_total Per-state node count across the fleet
  # TYPE wifi_densepose_mesh_node_total gauge
  wifi_densepose_mesh_node_total{state="leader"}   1
  wifi_densepose_mesh_node_total{state="follower"} 2
  wifi_densepose_mesh_node_total{state="no_sync"}  0

  - leader / follower split derived from snapshot.is_leader
  - no_sync = total_nodes_in_state - nodes_with_snapshot
    (so a node that has sent CSI frames but never a sync packet
     shows up here, which is what an operator wants to alert on)

Implementation factored as a free function `fleet_role_counts` so the
math is testable without spinning up the axum handler. Same pattern
iter 18 (update_csi_fps_ema) and iter 30 (sync_snapshot) used.

Test added (9/9 sync_snapshot_helper_tests now green):
  fleet_role_counts_classifies_correctly
    Three cases:
      - empty fleet → (0, 0)
      - 1 leader + 2 followers → (1, 2)
      - all-leaders edge case → (2, 0) (election prevents this in
        practice but the gauge math must still be consistent)

Useful Grafana queries this unlocks:
  - sum(wifi_densepose_mesh_node_total{state="follower"})
    → total reachable follower count
  - wifi_densepose_mesh_node_total{state="no_sync"} > 0
    → alert when any node has dropped off the mesh

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 15:08:16 -04:00
..