fix(firmware): OTA upload fails closed when no PSK in NVS (RuView#596 audit) (#623)

ota_check_auth() previously returned true when s_ota_psk[0] == '\0'
("permissive for dev"). A freshly-flashed node — or any node where
nobody had provisioned an OTA PSK yet — accepted attacker-controlled
firmware over plain HTTP on port 8032 from any host on the WiFi. No
Secure Boot V2, no signed-image verification, no transport encryption.
Single LAN call could brick or backdoor a node.

This was flagged in the deep security review of PR #596 but was a
PRE-EXISTING bug in main, not new code from that PR — so it stood as
a critical-severity production issue until this commit.

Fix:
- ota_check_auth() now returns false when no PSK is provisioned, with
  ESP_LOGW("OTA rejected: no PSK in NVS …") at the call site so the
  operator can diagnose the rejection from serial logs
- ota_update_init() ESP_LOGW message updated to surface the new posture
  at boot ("upload endpoint will REJECT all requests until provisioned")
- Doc comment on ota_check_auth() rewritten to make the contract
  explicit and reference the audit

The OTA HTTP server itself still starts even when no PSK is set. That
lets the operator run `provision.py --ota-psk <hex>` over USB-CDC to
write the NVS key without reflashing the firmware. The upload endpoint
just refuses every request in the meantime.

Breaking change for any deployment that depended on the unauthenticated
OTA path working out of the box. Documented in CHANGELOG under
[Unreleased] / Security so it's visible at the next release cut.

Fix-marker RuView#596-ota-fail-closed (scripts/fix-markers.json)
requires the new behaviour and forbids the old "permissive for dev"
fallback strings, so a future revert fails CI.
This commit is contained in:
rUv
2026-05-18 08:56:07 -04:00
committed by GitHub
parent b2e2e6d6fd
commit 281c4cb0ce
3 changed files with 34 additions and 6 deletions
+15
View File
@@ -188,6 +188,21 @@
"rationale": "Five endpoints used to embed user-controlled identifiers (session_name, model_id, dataset_id, recording id) into format!() paths with no sanitization, allowing classic '../../etc/passwd' reads, writes, and deletes on the server filesystem. The safe_id helper enforces [A-Za-z0-9._-] only (no leading '.', max 64 chars) and must run before any user input reaches a format!() that builds a path. Removing the helper or skipping it at any of these call sites reintroduces the #615 attack surface.",
"ref": "https://github.com/ruvnet/RuView/issues/615"
},
{
"id": "RuView#596-ota-fail-closed",
"title": "ESP32 OTA upload fails closed when no PSK is provisioned",
"files": ["firmware/esp32-csi-node/main/ota_update.c"],
"require": [
"fail-closed, see RuView#596 audit",
"OTA rejected: no PSK in NVS"
],
"forbid": [
"/auth disabled \\(permissive for dev\\)/",
"/No PSK provisioned \\u2014 auth disabled/"
],
"rationale": "ota_check_auth previously returned true when s_ota_psk[0] == '\\0', so any host on the WiFi could push attacker-controlled firmware to a freshly-flashed node over plain HTTP on port 8032 — no Secure Boot V2, no signed-image verification, single LAN call could brick or backdoor a node. Flagged in the deep-review of PR #596. Fail-closed means the OTA server still starts (so operators can provision a PSK via USB-CDC without reflashing) but the upload endpoint refuses every request until provision.py --ota-psk <hex> writes the NVS key. Reverting this lets the rogue-LAN attack reopen.",
"ref": "https://github.com/ruvnet/RuView/pull/596#pullrequestreview"
},
{
"id": "RuView#560",
"title": "verify.py quantizes features before SHA-256 for cross-platform hash stability",