Files
ruvnet--RuView/ui/pose-fusion/js/video-capture.js
rUv 7c1351fd5d feat(demo): wire all 6 RuVector WASM attention mechanisms into pose fusion
* feat: dual-modal WASM browser pose estimation demo (ADR-058)

Live webcam video + WiFi CSI fusion for real-time pose estimation.
Two parallel CNN pipelines (ruvector-cnn-wasm) with attention-weighted
fusion and dynamic confidence gating. Three modes: Dual, Video-only,
CSI-only. Includes pre-built WASM package (~52KB) for browser deployment.

- ADR-058: Dual-modal architecture design
- ui/pose-fusion.html: Main demo page with dark theme UI
- 7 JS modules: video-capture, csi-simulator, cnn-embedder, fusion-engine,
  pose-decoder, canvas-renderer, main orchestrator
- Pre-built ruvector-cnn-wasm WASM package for browser
- CSI heatmap, embedding space visualization, latency metrics
- WebSocket support for live ESP32 CSI data
- Navigation link added to main dashboard

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: motion-responsive skeleton + through-wall CSI tracking

- Pose decoder now uses per-cell motion grid to track actual arm/head
  positions — raising arms moves the skeleton's arms, head follows
  lateral movement
- Motion grid (10x8 cells) tracks intensity per body zone: head,
  left/right arm upper/mid, legs
- Through-wall mode: when person exits frame, CSI maintains presence
  with slow decay (~10s) and skeleton drifts in exit direction
- CSI simulator persists sensing after video loss, ghost pose renders
  with decreasing confidence
- Reduced temporal smoothing (0.45) for faster response to movement

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: video fills available space + correct WASM path resolution

- Remove fixed aspect-ratio and max-height from video panel so it
  fills the available viewport space without scrolling
- Grid uses 1fr row for content area, overflow:hidden on main grid
- Fix WASM path: resolve relative to JS module file using import.meta.url
  instead of hardcoded ./pkg/ which resolved incorrectly on gh-pages
- Responsive: mobile still gets aspect-ratio constraint

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: live ESP32 CSI pipeline + auto-connect WebSocket

- Add auto-connect to local sensing server WebSocket (ws://localhost:8765)
- Demo shows "Live ESP32" when connected to real CSI data
- Add build_firmware.ps1 for native Windows ESP-IDF builds (no Docker)
- Add read_serial.ps1 for ESP32 serial monitor

Pipeline: ESP32 → UDP:5005 → sensing-server → WS:8765 → browser demo

Co-Authored-By: claude-flow <ruv@ruv.net>

* docs: add ADR-059 live ESP32 CSI pipeline + update README with demo links

- ADR-059: Documents end-to-end ESP32 → sensing server → browser pipeline
- README: Add dual-modal pose fusion demo link, update ADR count to 49
- References issue #245

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: RSSI visualization, RuVector attention WASM, cache-bust fixes

- Add animated RSSI Signal Strength panel with sparkline history
- Fix RuVector WasmMultiHeadAttention retptr calling convention
- Wire up RuVector Multi-Head + Flash Attention in CNN embedder
- Add ambient temporal drift to CSI simulator for visible heatmap animation
- Fix embedding space projection (sparse projection replaces cancelling sum)
- Add auto-scaling to embedding space renderer
- Add cache busters (?v=4) to all ES module imports to prevent stale caches
- Add diagnostic logging for module version verification
- Add RSSI tracking with quality labels and color-coded dBm display
- Includes ruvector-attention-wasm v2.0.5 browser ESM wrapper

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat: 26-keypoint dexterous pose + full RuVector attention pipeline

Pose Decoder (17 → 26 keypoints):
- Add finger approximations: thumb, index, pinky per hand (6 new)
- Add toe tips: left/right foot index (2 new)
- Add neck keypoint (1 new)
- Hand openness driven by arm motion intensity
- Finger positions computed from wrist-elbow axis angles

CNN Embedder (full RuVector WASM pipeline):
- Stage 1: Multi-Head Attention (global spatial reasoning)
- Stage 2: Hyperbolic Attention (hierarchical body-part tree)
- Stage 3: MoE Attention (3 experts: upper/lower/extremities, top-2)
- Blended 40/30/30 weighting → final embedding projection

Canvas Renderer:
- Magenta finger joints with distinct glow
- Cyan toe tips
- White neck keypoint
- Thinner limb lines for hand/foot connections
- Joint count shown in overlay label

CSI Simulator:
- Skip synthetic person state when live ESP32 connected
- Only simulate CSI data in demo mode (was already correct)

Embedding Space:
- Fixed projection: sparse 8-dim projection replaces cancelling sum
- Auto-scaling normalizes point spread to fill canvas

Cache busters bumped to v=5 on all imports.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: centroid-based pose tracking for responsive limb movement

Rewrites pose decoder from intensity-based to position-based tracking:
- Arms now track toward motion centroid in each body zone
- Elbow/wrist positions computed along shoulder→centroid vector
- Legs track toward lower-body zone centroids
- Smoothing reduced from 0.45 to 0.25 for responsiveness
- Zone centroids blend 30% old / 70% new each frame

6 body zones with overlapping coverage:
- Head (top 20%, center cols)
- Left/Right Arm (rows 10-60%, outer cols)
- Torso (rows 15-55%, center cols)
- Left/Right Leg (rows 50-100%, half cols each)

Hand openness now driven by arm spread distance + raise amount.
Cache busters v=6.

Co-Authored-By: claude-flow <ruv@ruv.net>

* fix: remove duplicate lAnkleX/rAnkleX declarations in pose-decoder

Stale code block from old intensity-based tracking was left behind,
re-declaring variables already defined by centroid-based tracking.

Co-Authored-By: claude-flow <ruv@ruv.net>

* feat(demo): wire all 6 RuVector WASM attention mechanisms into pose fusion

- Add WasmLinearAttention and WasmLocalGlobalAttention to browser ESM wrapper
- Add 6 WASM utility functions (batch_normalize, pairwise_distances, etc.)
- Extend CnnEmbedder to 6-stage pipeline: Flash → MHA → Hyperbolic → Linear → MoE → L+G
- Use log-energy softmax blending across all 6 stages
- Wire WASM cosine_similarity and normalize into FusionEngine
- Add RuVector pipeline stats panel to UI (energy, refinement, pose impact)
- Compute embedding-to-joint mapping stats without modifying joint positions
- Center camera prompt with flexbox layout
- Add cache busters v=12

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-03-12 20:59:57 -04:00

236 lines
7.3 KiB
JavaScript

/**
* VideoCapture — getUserMedia webcam capture with frame extraction.
* Provides quality metrics (brightness, motion) for fusion confidence gating.
*/
export class VideoCapture {
constructor(videoElement) {
this.video = videoElement;
this.stream = null;
this.offscreen = document.createElement('canvas');
this.offCtx = this.offscreen.getContext('2d', { willReadFrequently: true });
this.prevFrame = null;
this.motionScore = 0;
this.brightnessScore = 0;
}
async start(constraints = {}) {
const defaultConstraints = {
video: {
width: { ideal: 640 },
height: { ideal: 480 },
facingMode: 'user',
frameRate: { ideal: 30 }
},
audio: false
};
try {
this.stream = await navigator.mediaDevices.getUserMedia(
Object.keys(constraints).length ? constraints : defaultConstraints
);
this.video.srcObject = this.stream;
await this.video.play();
this.offscreen.width = this.video.videoWidth;
this.offscreen.height = this.video.videoHeight;
return true;
} catch (err) {
console.error('[Video] Camera access failed:', err.message);
return false;
}
}
stop() {
if (this.stream) {
this.stream.getTracks().forEach(t => t.stop());
this.stream = null;
}
this.video.srcObject = null;
}
get isActive() {
return this.stream !== null && this.video.readyState >= 2;
}
get width() { return this.video.videoWidth || 640; }
get height() { return this.video.videoHeight || 480; }
/**
* Capture current frame as RGB Uint8Array + compute quality metrics.
* @param {number} targetW - Target width for CNN input
* @param {number} targetH - Target height for CNN input
* @returns {{ rgb: Uint8Array, width: number, height: number, motion: number, brightness: number }}
*/
captureFrame(targetW = 56, targetH = 56) {
if (!this.isActive) return null;
// Draw to offscreen at target resolution
this.offscreen.width = targetW;
this.offscreen.height = targetH;
this.offCtx.drawImage(this.video, 0, 0, targetW, targetH);
const imageData = this.offCtx.getImageData(0, 0, targetW, targetH);
const rgba = imageData.data;
// Convert RGBA → RGB
const pixels = targetW * targetH;
const rgb = new Uint8Array(pixels * 3);
let brightnessSum = 0;
let motionSum = 0;
for (let i = 0; i < pixels; i++) {
const r = rgba[i * 4];
const g = rgba[i * 4 + 1];
const b = rgba[i * 4 + 2];
rgb[i * 3] = r;
rgb[i * 3 + 1] = g;
rgb[i * 3 + 2] = b;
// Luminance for brightness
const lum = 0.299 * r + 0.587 * g + 0.114 * b;
brightnessSum += lum;
// Motion: diff from previous frame
if (this.prevFrame) {
const pr = this.prevFrame[i * 3];
const pg = this.prevFrame[i * 3 + 1];
const pb = this.prevFrame[i * 3 + 2];
motionSum += Math.abs(r - pr) + Math.abs(g - pg) + Math.abs(b - pb);
}
}
this.brightnessScore = brightnessSum / (pixels * 255);
this.motionScore = this.prevFrame ? Math.min(1, motionSum / (pixels * 100)) : 0;
this.prevFrame = new Uint8Array(rgb);
return {
rgb,
width: targetW,
height: targetH,
motion: this.motionScore,
brightness: this.brightnessScore
};
}
/**
* Capture full-resolution RGBA for overlay rendering
* @returns {ImageData|null}
*/
captureFullFrame() {
if (!this.isActive) return null;
this.offscreen.width = this.width;
this.offscreen.height = this.height;
this.offCtx.drawImage(this.video, 0, 0);
return this.offCtx.getImageData(0, 0, this.width, this.height);
}
/**
* Detect motion region + detailed motion grid for body-part tracking.
* Returns bounding box + a grid showing WHERE motion is concentrated.
* @returns {{ x, y, w, h, detected: boolean, motionGrid: number[][], gridCols: number, gridRows: number, exitDirection: string|null }}
*/
detectMotionRegion(targetW = 56, targetH = 56) {
if (!this.isActive || !this.prevFrame) return { detected: false, motionGrid: null };
this.offscreen.width = targetW;
this.offscreen.height = targetH;
this.offCtx.drawImage(this.video, 0, 0, targetW, targetH);
const rgba = this.offCtx.getImageData(0, 0, targetW, targetH).data;
let minX = targetW, minY = targetH, maxX = 0, maxY = 0;
let motionPixels = 0;
const threshold = 25;
// Motion grid: divide frame into cells and track motion intensity per cell
const gridCols = 10;
const gridRows = 8;
const cellW = targetW / gridCols;
const cellH = targetH / gridRows;
const motionGrid = Array.from({ length: gridRows }, () => new Float32Array(gridCols));
const cellPixels = cellW * cellH;
// Also track motion centroid weighted by intensity
let motionCxSum = 0, motionCySum = 0, motionWeightSum = 0;
for (let y = 0; y < targetH; y++) {
for (let x = 0; x < targetW; x++) {
const i = y * targetW + x;
const r = rgba[i * 4], g = rgba[i * 4 + 1], b = rgba[i * 4 + 2];
const pr = this.prevFrame[i * 3], pg = this.prevFrame[i * 3 + 1], pb = this.prevFrame[i * 3 + 2];
const diff = Math.abs(r - pr) + Math.abs(g - pg) + Math.abs(b - pb);
if (diff > threshold * 3) {
motionPixels++;
if (x < minX) minX = x;
if (y < minY) minY = y;
if (x > maxX) maxX = x;
if (y > maxY) maxY = y;
}
// Accumulate per-cell motion intensity
const gc = Math.min(Math.floor(x / cellW), gridCols - 1);
const gr = Math.min(Math.floor(y / cellH), gridRows - 1);
const intensity = diff / (3 * 255); // Normalize 0-1
motionGrid[gr][gc] += intensity / cellPixels;
// Weighted centroid
if (diff > threshold) {
motionCxSum += x * diff;
motionCySum += y * diff;
motionWeightSum += diff;
}
}
}
const detected = motionPixels > (targetW * targetH * 0.02);
// Motion centroid (normalized 0-1)
const motionCx = motionWeightSum > 0 ? motionCxSum / (motionWeightSum * targetW) : 0.5;
const motionCy = motionWeightSum > 0 ? motionCySum / (motionWeightSum * targetH) : 0.5;
// Detect exit direction: if centroid is near edges
let exitDirection = null;
if (detected && motionCx < 0.1) exitDirection = 'left';
else if (detected && motionCx > 0.9) exitDirection = 'right';
else if (detected && motionCy < 0.1) exitDirection = 'up';
else if (detected && motionCy > 0.9) exitDirection = 'down';
// Track last known position for through-wall persistence
if (detected) {
this._lastDetected = {
x: minX / targetW,
y: minY / targetH,
w: (maxX - minX) / targetW,
h: (maxY - minY) / targetH,
cx: motionCx,
cy: motionCy,
exitDirection,
time: performance.now()
};
}
return {
detected,
x: minX / targetW,
y: minY / targetH,
w: (maxX - minX) / targetW,
h: (maxY - minY) / targetH,
coverage: motionPixels / (targetW * targetH),
motionGrid,
gridCols,
gridRows,
motionCx,
motionCy,
exitDirection
};
}
/**
* Get the last known detection info (for through-wall persistence)
*/
get lastDetection() {
return this._lastDetected || null;
}
}