Compare commits

2 Commits

Author SHA1 Message Date
ruv 1f5b7b48c9 cog-ha-matter (ADR-116 P4): witness file persistence + chain-level verify
Closes the witness audit-bundle surface. The hash-chain primitive
+ JSONL serializer from earlier iters only handled one event at a
time; this lands the file-stream surface that operations actually
need:

  * `WitnessChain::write_jsonl(&mut impl Write) -> io::Result<()>`
    — streams every event as one line + `\n`, empty chain writes
    zero bytes
  * `WitnessChain::read_jsonl(impl BufRead) -> Result<WitnessChain,
    WitnessReadError>` — parses event-by-event AND runs chain-level
    `verify()` on the loaded chain, catching reordered or replayed
    prefixes that per-event hashing alone misses

Critical security property: `read_jsonl` calls `WitnessChain::verify`
on the loaded chain BEFORE returning Ok. A forged bundle assembled
from two valid chains pasted together would slip past the
per-event hash check (each event's `this_hash` is internally
consistent) but the cross-event `prev_hash` linkage detects the
seam. Test `read_jsonl_chain_verify_catches_reordered_events`
locks this — swap two events in a 2-event bundle, see Verify error.
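
The difference between the per-event check and the chain-level check can be sketched in isolation. This is a minimal illustration, not the crate's code: `ToyEvent`, `toy_append`, `event_ok`, `chain_ok`, and `TOY_GENESIS` are hypothetical names, and a 64-bit `DefaultHasher` stands in for SHA-256 purely so the sketch needs no external crate.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy event with the same linkage fields as the witness chain.
/// Assumption: a 64-bit std hash replaces SHA-256 (NOT collision-resistant).
#[derive(Clone, Debug, PartialEq)]
pub struct ToyEvent {
    pub seq: u64,
    pub prev_hash: u64,
    pub payload: Vec<u8>,
    pub this_hash: u64,
}

/// `prev_hash` of the first event (assumed to be 0 in this sketch).
pub const TOY_GENESIS: u64 = 0;

fn toy_hash(prev_hash: u64, seq: u64, payload: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    (prev_hash, seq, payload).hash(&mut h);
    h.finish()
}

/// Append an event linked to the current tip of the chain.
pub fn toy_append(chain: &mut Vec<ToyEvent>, payload: &[u8]) {
    let prev_hash = chain.last().map_or(TOY_GENESIS, |e| e.this_hash);
    let seq = chain.len() as u64;
    let this_hash = toy_hash(prev_hash, seq, payload);
    chain.push(ToyEvent { seq, prev_hash, payload: payload.to_vec(), this_hash });
}

/// Per-event check: the stored hash matches the event's own fields.
pub fn event_ok(e: &ToyEvent) -> bool {
    toy_hash(e.prev_hash, e.seq, &e.payload) == e.this_hash
}

/// Chain-level check: per-event hash, gap-free `seq` from 0, and
/// every `prev_hash` equal to the predecessor's `this_hash`.
pub fn chain_ok(chain: &[ToyEvent]) -> bool {
    let mut expected_prev = TOY_GENESIS;
    for (i, e) in chain.iter().enumerate() {
        if !event_ok(e) || e.seq != i as u64 || e.prev_hash != expected_prev {
            return false;
        }
        expected_prev = e.this_hash;
    }
    true
}
```

Swapping the two events of a 2-event toy chain leaves `event_ok` true for each event while `chain_ok` returns false, which is the same shape of failure the reorder test exercises.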

Error surface (new `WitnessReadError` enum):
  * `Io { line_no, msg }`           — read failure mid-stream
  * `Parse { line_no, source }`     — per-event from_jsonl_line failure
  * `Verify { source }`             — chain-level verify failure

`line_no` is 1-indexed so an auditor sees the same number their
text editor shows. Blank lines tolerated for hand-edited bundles.
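
One consequence worth spelling out: blank lines are skipped but still counted, so `line_no` keeps matching the editor gutter. A small sketch of that numbering rule, assuming a hypothetical helper `numbered_records` that is not part of the crate:

```rust
use std::io::BufRead;

/// Collect `(line_no, text)` for every non-blank line.
/// `line_no` is 1-indexed and counts blank lines too, so it matches
/// what a text editor displays for the same file.
pub fn numbered_records<R: BufRead>(r: R) -> std::io::Result<Vec<(usize, String)>> {
    let mut out = Vec::new();
    for (i, line) in r.lines().enumerate() {
        let line = line?;
        if line.trim().is_empty() {
            continue; // skipped, but `i` has already advanced
        }
        out.push((i + 1, line));
    }
    Ok(out)
}
```

For the input `"\n{\"a\":1}\n\n{\"b\":2}"` the two records come back numbered 2 and 4, not 1 and 2.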

7 new tests:
  * empty chain writes zero bytes
  * write→read round-trips a 3-event chain
  * exactly N newlines for N events; trailing newline present
  * blank lines / leading newline tolerated
  * parse error surfaces with correct line_no
  * reordered events caught by chain-level verify
  * no-trailing-newline still loads the final event

51/51 cog tests green (44 → 51).

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 18:19:05 -04:00
ruv a3478ea3b5 cog-ha-matter (ADR-116 P4): witness JSONL persistence
Third P4 sub-unit: serialize/parse for the witness hash chain so
audit bundles can be written to disk and replayed.

Wire shape (one record per line, alphabetical field order locked):

  {"kind":"...","payload_hex":"...","prev_hash":"...","seq":N,
   "this_hash":"...","timestamp_unix_s":N}

Why alphabetical field order: auditors archive whole bundles and
hash them. A rebuild that reordered fields would silently
invalidate every archival hash — locking the order is what makes
the JSONL stable across compiler / serde-json upgrades.
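
To make the byte-stability argument concrete, here is a rough sketch with a trimmed three-field record. The function names are hypothetical, `kind` is assumed to need no JSON escaping, and `DefaultHasher` stands in for whatever digest an auditor would actually use over an archived bundle.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Render a trimmed record with keys in fixed alphabetical order.
/// Assumption: `kind` contains nothing that needs JSON escaping.
pub fn render_sorted(kind: &str, payload_hex: &str, seq: u64) -> String {
    format!("{{\"kind\":\"{kind}\",\"payload_hex\":\"{payload_hex}\",\"seq\":{seq}}}")
}

/// Same data, different key order: what an unpinned serializer
/// could emit after a refactor or dependency upgrade.
pub fn render_unsorted(kind: &str, payload_hex: &str, seq: u64) -> String {
    format!("{{\"seq\":{seq},\"kind\":\"{kind}\",\"payload_hex\":\"{payload_hex}\"}}")
}

/// Stand-in for an archival digest over the bundle bytes.
pub fn archival_digest(bundle: &str) -> u64 {
    let mut h = DefaultHasher::new();
    bundle.hash(&mut h);
    h.finish()
}
```

Both renderings parse to the same JSON object, yet their bytes differ, so any digest an auditor recorded over the archived bundle stops matching. Pinning the order in a hand-rolled `format!` keeps that from happening silently.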

Why hex everywhere: human-greppable, monospace-friendly, no base64
ambiguity, no Vec<u8> JSON-array ugliness. Same convention as
ADR-101's `binary_sha256`.

Critically, `from_jsonl_line` RE-VERIFIES `this_hash` against
the canonical bytes derived from the parsed fields. A tampered
bundle fires `WitnessParseError::HashMismatch` BEFORE the event
loads — the parser is itself an auditor.

New surfaces:
  * `WitnessHash::from_hex` (with structured length/parse errors)
  * `WitnessEvent::to_jsonl_line`, `from_jsonl_line`
  * `WitnessParseError` enum: Json | MissingField | WrongType |
    HashLength | HashHex | PayloadHex | PayloadLength | HashMismatch
  * private `hex_encode` / `hex_decode` helpers (no `hex` crate dep)

10 new tests:
  * jsonl round-trip preserves all fields
  * jsonl line has no embedded \n / \r (one record per line)
  * jsonl field order is alphabetical (byte-stable archival)
  * parser rejects tampered payload via HashMismatch
  * parser rejects non-hex characters in hash
  * parser rejects missing field
  * hex encode/decode round-trip across empty / single byte / 0xff /
    UTF-8 / arbitrary bytes
  * hex decode rejects odd-length input
  * WitnessHash::from_hex round-trip
  * WitnessHash::from_hex rejects wrong length

44/44 cog tests green (34 → 44).

ADR-116 P4 row enumerates 4 sub-units now: mDNS record-builder,
witness chain primitive, witness JSONL persistence,
responder + embedded broker + Ed25519 signing.

Co-Authored-By: claude-flow <ruv@ruv.net>
2026-05-23 18:12:59 -04:00
2 changed files with 451 additions and 1 deletions
+1 -1
@@ -95,7 +95,7 @@ Ranked by build cost × user impact:
| **P1** | Research dossier ([`docs/research/ADR-116-ha-matter-cog-research.md`](../research/ADR-116-ha-matter-cog-research.md)) | ✅ **done** — 8 sections, 30+ citations, v1 scope ranked |
| **P2** | Cog crate scaffold (`v2/crates/cog-ha-matter/`) — Cargo.toml + `src/{lib,main,manifest}.rs`, workspace member, CLI args, `--print-manifest` flag, 2 manifest unit tests | ✅ **done**: `cargo check` + `cargo test` green |
| **P3** | Wrap existing ADR-115 MQTT publisher as cog entry point | ✅ **wiring done**: `main.rs` boots ADR-115's `publisher::spawn` via `runtime::spawn_publisher` thin wrapper, holds a long-lived `broadcast::Sender<VitalsSnapshot>`, awaits Ctrl-C. Live-handle test green without a broker. Next (P3.5): subscribe to sensing-server `/v1/snapshot` WS and republish into the channel. |
| **P4** | Seed-native enhancements (embedded broker, mDNS, witness) | in progress — (a) mDNS service-record builder shipped (`mdns::build_mdns_service`, 6-key locked TXT surface, PII-leak guard). (b) Witness hash-chain primitive shipped (`witness::WitnessChain` — append-only SHA-256 chain with `verify()` catching tampered payload / broken prev_hash / seq gap). (c) Responder (mdns-sd) + embedded rumqttd + Ed25519 signing layer still pending. |
| **P4** | Seed-native enhancements (embedded broker, mDNS, witness) | in progress — (a) mDNS service-record builder. (b) Witness hash-chain primitive. (c) Witness JSONL line serializer. (d) **Witness file persistence shipped** `WitnessChain::{write_jsonl, read_jsonl}` accept any `Write`/`BufRead`, tolerate blank lines, surface `line_no` on parse error, run chain-level `verify()` on load to catch reordered/replayed events. 7 new tests including reorder-detection. (e) Responder (mdns-sd) + embedded rumqttd + Ed25519 signing layer still pending. |
| **P5** | RuVector-backed threshold learning (SONA adaptation) | pending |
| **P6** | Multi-Seed federation (cross-Seed dedup + witness) | pending |
| **P7** | Matter Bridge mode (depends on matter-rs / esp-matter readiness) | pending |
+450
@@ -30,6 +30,8 @@
//! when the chain spans days and the auditor wants O(log n)
//! inclusion proofs.
use std::io::{self, BufRead, Write};
use sha2::{Digest, Sha256};
/// 32-byte hash output. Lifted into a newtype so a future migration
@@ -53,6 +55,22 @@ impl WitnessHash {
}
s
}
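    /// Parse a 64-char lowercase-hex string back into a `WitnessHash`.
    /// Rejects wrong-length input and non-hex characters — used by
    /// the JSONL parser when reading audit bundles.
    pub fn from_hex(s: &str) -> Result<WitnessHash, WitnessParseError> {
        if s.len() != 64 {
            return Err(WitnessParseError::HashLength { found: s.len() });
        }
        // Decode from bytes rather than `&s[lo..lo + 2]`: slicing
        // untrusted input panics when a multi-byte char straddles the
        // boundary, and `u8::from_str_radix` also accepts a leading `+`.
        let nibble = |b: u8| (b as char).to_digit(16).map(|d| d as u8);
        let mut out = [0u8; 32];
        for (i, pair) in s.as_bytes().chunks_exact(2).enumerate() {
            match (nibble(pair[0]), nibble(pair[1])) {
                (Some(hi), Some(lo)) => out[i] = hi << 4 | lo,
                _ => return Err(WitnessParseError::HashHex { at: i * 2 }),
            }
        }
        Ok(WitnessHash(out))
    }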
}
/// A single witnessed event. Append-only — once committed to a
@@ -182,6 +200,49 @@ impl WitnessChain {
&self.events
}
/// Stream every event to a JSONL sink. Each event becomes one
/// line terminated by `\n`. Empty chains write zero bytes.
///
/// The caller owns the writer — `File`, `BufWriter`, an
/// in-memory `Vec<u8>` for tests — so this method never
/// allocates beyond per-event line buffers.
pub fn write_jsonl<W: Write>(&self, w: &mut W) -> io::Result<()> {
for ev in &self.events {
w.write_all(ev.to_jsonl_line().as_bytes())?;
w.write_all(b"\n")?;
}
Ok(())
}
/// Read a JSONL audit bundle into a fresh `WitnessChain`. Each
/// non-empty line is parsed via `WitnessEvent::from_jsonl_line`
/// (which re-verifies the stored hash), then the loaded chain
/// is end-to-end verified via [`WitnessChain::verify`] to catch
/// out-of-order events or replayed prefixes.
///
/// Bundle errors surface with their `line_no` (1-indexed) so an
/// auditor can point at the bad record.
pub fn read_jsonl<R: BufRead>(r: R) -> Result<WitnessChain, WitnessReadError> {
let mut chain = WitnessChain::new();
for (i, line_res) in r.lines().enumerate() {
let line_no = i + 1;
let line = line_res.map_err(|e| WitnessReadError::Io {
line_no,
msg: e.to_string(),
})?;
if line.trim().is_empty() {
continue; // tolerate blank lines / trailing \n
}
let ev = WitnessEvent::from_jsonl_line(&line)
.map_err(|source| WitnessReadError::Parse { line_no, source })?;
chain.events.push(ev);
}
chain
.verify()
.map_err(|source| WitnessReadError::Verify { source })?;
Ok(chain)
}
/// Verify every event's `this_hash` matches the canonical bytes,
/// every `prev_hash` matches the predecessor's `this_hash`, and
/// `seq` is gap-free starting at 0.
@@ -223,6 +284,161 @@ pub enum WitnessVerifyError {
HashMismatch { at: usize },
}
#[derive(Debug, thiserror::Error)]
pub enum WitnessReadError {
#[error("io error at line {line_no}: {msg}")]
Io { line_no: usize, msg: String },
#[error("parse error at line {line_no}: {source}")]
Parse {
line_no: usize,
#[source]
source: WitnessParseError,
},
#[error("chain-level verify failed: {source}")]
Verify {
#[source]
source: WitnessVerifyError,
},
}
#[derive(Debug, Clone, PartialEq, Eq, thiserror::Error)]
pub enum WitnessParseError {
#[error("invalid JSON: {0}")]
Json(String),
#[error("missing required field `{0}`")]
MissingField(&'static str),
#[error("field `{field}` has wrong type")]
WrongType { field: &'static str },
#[error("hash hex must be 64 chars, got {found}")]
HashLength { found: usize },
#[error("hash hex parse error at byte offset {at}")]
HashHex { at: usize },
#[error("payload hex parse error at byte offset {at}")]
PayloadHex { at: usize },
#[error("payload hex must be even length, got {found}")]
PayloadLength { found: usize },
#[error("recomputed hash does not match this_hash — bundle is forged or corrupted")]
HashMismatch,
}
fn hex_encode(bytes: &[u8]) -> String {
let mut s = String::with_capacity(bytes.len() * 2);
for b in bytes {
s.push_str(&format!("{b:02x}"));
}
s
}
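fn hex_decode(s: &str) -> Result<Vec<u8>, WitnessParseError> {
    if s.len() % 2 != 0 {
        return Err(WitnessParseError::PayloadLength { found: s.len() });
    }
    // Decode from bytes rather than `&s[i..i + 2]`: slicing untrusted
    // input panics when a multi-byte char straddles the boundary, and
    // `u8::from_str_radix` also accepts a leading `+`.
    let nibble = |b: u8| (b as char).to_digit(16).map(|d| d as u8);
    let mut out = Vec::with_capacity(s.len() / 2);
    for (i, pair) in s.as_bytes().chunks_exact(2).enumerate() {
        match (nibble(pair[0]), nibble(pair[1])) {
            (Some(hi), Some(lo)) => out.push(hi << 4 | lo),
            _ => return Err(WitnessParseError::PayloadHex { at: i * 2 }),
        }
    }
    Ok(out)
}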
impl WitnessEvent {
/// Serialize one event to a single JSONL line (no trailing
/// newline). The format is the audit-bundle wire shape; tools
/// downstream parse it line-by-line with [`Self::from_jsonl_line`].
///
/// Field ordering is locked alphabetically for byte-stable
/// output across rebuilds — auditors hash whole bundles, so a
/// rebuild that reordered fields would silently invalidate
/// archival hashes.
///
/// Wire shape:
///
/// ```json
/// {"kind":"...","payload_hex":"...","prev_hash":"...","seq":N,"this_hash":"...","timestamp_unix_s":N}
/// ```
pub fn to_jsonl_line(&self) -> String {
// Hand-rolled instead of serde_derive so the wire-format
// ordering is under direct test control.
format!(
"{{\"kind\":{kind},\"payload_hex\":\"{payload}\",\"prev_hash\":\"{prev}\",\"seq\":{seq},\"this_hash\":\"{this}\",\"timestamp_unix_s\":{ts}}}",
kind = serde_json::to_string(&self.kind).expect("string is always serializable"),
payload = hex_encode(&self.payload),
prev = self.prev_hash.to_hex(),
seq = self.seq,
this = self.this_hash.to_hex(),
ts = self.timestamp_unix_s,
)
}
/// Parse one JSONL line back into a `WitnessEvent`. Re-verifies
/// the stored `this_hash` against the canonical bytes — a
/// tampered bundle fires [`WitnessParseError::HashMismatch`]
/// instead of silently loading forged events.
pub fn from_jsonl_line(line: &str) -> Result<WitnessEvent, WitnessParseError> {
let v: serde_json::Value =
serde_json::from_str(line).map_err(|e| WitnessParseError::Json(e.to_string()))?;
let obj = v
.as_object()
.ok_or(WitnessParseError::WrongType { field: "<root>" })?;
let seq = obj
.get("seq")
.ok_or(WitnessParseError::MissingField("seq"))?
.as_u64()
.ok_or(WitnessParseError::WrongType { field: "seq" })?;
let timestamp_unix_s = obj
.get("timestamp_unix_s")
.ok_or(WitnessParseError::MissingField("timestamp_unix_s"))?
.as_u64()
.ok_or(WitnessParseError::WrongType {
field: "timestamp_unix_s",
})?;
let kind = obj
.get("kind")
.ok_or(WitnessParseError::MissingField("kind"))?
.as_str()
.ok_or(WitnessParseError::WrongType { field: "kind" })?
.to_string();
let prev_hash = WitnessHash::from_hex(
obj.get("prev_hash")
.ok_or(WitnessParseError::MissingField("prev_hash"))?
.as_str()
.ok_or(WitnessParseError::WrongType { field: "prev_hash" })?,
)?;
let this_hash = WitnessHash::from_hex(
obj.get("this_hash")
.ok_or(WitnessParseError::MissingField("this_hash"))?
.as_str()
.ok_or(WitnessParseError::WrongType { field: "this_hash" })?,
)?;
let payload = hex_decode(
obj.get("payload_hex")
.ok_or(WitnessParseError::MissingField("payload_hex"))?
.as_str()
.ok_or(WitnessParseError::WrongType {
field: "payload_hex",
})?,
)?;
// Re-verify the stored hash. The on-disk hash is purely
// declarative; this is what makes the JSONL a witness.
let recomputed = hash_event(prev_hash, seq, timestamp_unix_s, &kind, &payload);
if recomputed != this_hash {
return Err(WitnessParseError::HashMismatch);
}
Ok(WitnessEvent {
seq,
prev_hash,
timestamp_unix_s,
kind,
payload,
this_hash,
})
}
}
#[cfg(test)]
mod tests {
use super::*;
@@ -342,4 +558,238 @@ mod tests {
let h2 = hash_event(WitnessHash::GENESIS, 0, 100, "k", b"b");
assert_ne!(h1, h2);
}
// ---- JSONL persistence ----
fn fresh_event() -> WitnessEvent {
let mut c = WitnessChain::new();
c.append("fall_risk_elevated", br#"{"node":"kitchen"}"#, 1779512400);
c.events()[0].clone()
}
#[test]
fn jsonl_round_trip_preserves_all_fields() {
let original = fresh_event();
let line = original.to_jsonl_line();
let parsed = WitnessEvent::from_jsonl_line(&line).expect("clean line round-trips");
assert_eq!(parsed, original);
}
#[test]
fn jsonl_line_has_no_embedded_newline() {
// JSONL is one record per line; an embedded \n in the
// serialized form would corrupt the file format.
let line = fresh_event().to_jsonl_line();
assert!(!line.contains('\n'));
assert!(!line.contains('\r'));
}
#[test]
fn jsonl_field_order_is_alphabetical_for_byte_stability() {
// Auditors archive whole bundles and hash them — reordered
// fields would silently invalidate archival hashes. Lock
// the order with a substring check on a known event.
let line = fresh_event().to_jsonl_line();
let order = ["kind", "payload_hex", "prev_hash", "seq", "this_hash", "timestamp_unix_s"];
let mut last = 0usize;
for field in order {
let pos = line.find(field).unwrap_or_else(|| panic!("missing field `{field}`"));
assert!(pos > last, "field `{field}` out of alphabetical order");
last = pos;
}
}
#[test]
fn jsonl_parser_rejects_tampered_payload() {
let original = fresh_event();
let line = original.to_jsonl_line();
// Flip one nibble in the payload hex — the stored hash
// won't match the recomputed hash.
let tampered = line.replacen("payload_hex\":\"7b", "payload_hex\":\"6b", 1);
assert_ne!(line, tampered, "test fixture didn't flip a byte");
let err = WitnessEvent::from_jsonl_line(&tampered).unwrap_err();
assert!(
matches!(err, WitnessParseError::HashMismatch),
"expected HashMismatch, got {err:?}"
);
}
#[test]
fn jsonl_parser_rejects_non_hex_hash() {
// Replace the hex hash with non-hex chars — must fire a
// structured error, not a panic.
let original = fresh_event();
let line = original.to_jsonl_line();
let broken = line.replacen(
&original.this_hash.to_hex()[..4],
"ZZZZ",
1,
);
let err = WitnessEvent::from_jsonl_line(&broken).unwrap_err();
assert!(matches!(err, WitnessParseError::HashHex { .. }));
}
    #[test]
    fn jsonl_parser_rejects_missing_field() {
        // Every field except payload_hex is present and well-formed
        // (64-char hashes), so parsing gets past the hash fields and
        // the only possible failure is the missing payload_hex.
        let zeros = "0".repeat(64);
        let bad = format!(
            r#"{{"kind":"k","prev_hash":"{zeros}","seq":0,"this_hash":"{zeros}","timestamp_unix_s":1}}"#
        );
        let err = WitnessEvent::from_jsonl_line(&bad).unwrap_err();
        assert_eq!(err, WitnessParseError::MissingField("payload_hex"));
    }
#[test]
fn hex_encode_decode_round_trip() {
let cases: &[&[u8]] = &[
b"",
b"\x00",
b"\xff",
b"hello world",
&[0x00, 0x01, 0xab, 0xcd, 0xef],
];
for c in cases {
let encoded = hex_encode(c);
let decoded = hex_decode(&encoded).unwrap();
assert_eq!(&decoded[..], *c, "round-trip failed for {c:?}");
}
}
#[test]
fn hex_decode_rejects_odd_length() {
let err = hex_decode("abc").unwrap_err();
assert!(matches!(err, WitnessParseError::PayloadLength { found: 3 }));
}
#[test]
fn witness_hash_from_hex_round_trip() {
let h = WitnessHash([0x12; 32]);
let hex = h.to_hex();
let parsed = WitnessHash::from_hex(&hex).unwrap();
assert_eq!(parsed, h);
}
#[test]
fn witness_hash_from_hex_rejects_wrong_length() {
let err = WitnessHash::from_hex("ab").unwrap_err();
assert!(matches!(err, WitnessParseError::HashLength { found: 2 }));
}
// ---- file persistence (write_jsonl / read_jsonl) ----
#[test]
fn write_jsonl_empty_chain_writes_zero_bytes() {
let c = WitnessChain::new();
let mut buf = Vec::new();
c.write_jsonl(&mut buf).unwrap();
assert_eq!(buf, b"");
}
#[test]
fn write_then_read_round_trips_multi_event_chain() {
let mut written = WitnessChain::new();
written.append("a", b"first", 100);
written.append("b", b"second", 101);
written.append("c", br#"{"x":1}"#, 102);
let mut buf = Vec::new();
written.write_jsonl(&mut buf).unwrap();
let read_back = WitnessChain::read_jsonl(buf.as_slice()).unwrap();
assert_eq!(read_back.len(), 3);
assert_eq!(read_back.events(), written.events());
assert_eq!(read_back.tip(), written.tip());
}
#[test]
fn write_jsonl_separates_events_with_newline() {
let mut c = WitnessChain::new();
c.append("a", b"1", 100);
c.append("b", b"2", 101);
let mut buf = Vec::new();
c.write_jsonl(&mut buf).unwrap();
let s = std::str::from_utf8(&buf).unwrap();
// Exactly N newlines for N events.
assert_eq!(s.matches('\n').count(), 2);
assert!(s.ends_with('\n'));
}
#[test]
fn read_jsonl_tolerates_blank_lines() {
let mut c = WitnessChain::new();
c.append("a", b"1", 100);
c.append("b", b"2", 101);
let mut buf = Vec::new();
c.write_jsonl(&mut buf).unwrap();
// Inject blanks — sometimes happens when files are edited.
let with_blanks = format!(
"\n{}\n\n",
std::str::from_utf8(&buf).unwrap().trim_end()
);
let read = WitnessChain::read_jsonl(with_blanks.as_bytes()).unwrap();
assert_eq!(read.len(), 2);
}
#[test]
fn read_jsonl_surfaces_line_no_on_parse_error() {
// Two good events, then one with a flipped payload byte.
let mut c = WitnessChain::new();
c.append("a", b"1", 100);
c.append("b", b"2", 101);
let mut buf = Vec::new();
c.write_jsonl(&mut buf).unwrap();
let mut text = String::from_utf8(buf).unwrap();
let forged = c.events()[0].to_jsonl_line().replacen(
"payload_hex\":\"31",
"payload_hex\":\"32",
1,
);
text.push_str(&forged);
text.push('\n');
let err = WitnessChain::read_jsonl(text.as_bytes()).unwrap_err();
match err {
WitnessReadError::Parse { line_no, .. } => assert_eq!(line_no, 3),
other => panic!("expected Parse error at line 3, got {other:?}"),
}
}
#[test]
fn read_jsonl_chain_verify_catches_reordered_events() {
// Build a chain, then write it out with the events swapped.
// Each individual event still verifies its own hash (because
// its prev_hash is internally consistent with what *it*
// claimed), but the cross-event chain check fires.
let mut original = WitnessChain::new();
original.append("a", b"1", 100);
original.append("b", b"2", 101);
let mut buf = Vec::new();
original.write_jsonl(&mut buf).unwrap();
let lines: Vec<&[u8]> = buf.split(|&b| b == b'\n').filter(|s| !s.is_empty()).collect();
// Reverse order, send through reader.
let mut reversed: Vec<u8> = Vec::new();
reversed.extend_from_slice(lines[1]);
reversed.push(b'\n');
reversed.extend_from_slice(lines[0]);
reversed.push(b'\n');
let err = WitnessChain::read_jsonl(reversed.as_slice()).unwrap_err();
assert!(matches!(err, WitnessReadError::Verify { .. }));
}
#[test]
fn read_jsonl_no_trailing_newline_still_works() {
// BufRead's lines() handles the no-final-newline case; lock
// the behavior so a future swap to a different reader can't
// silently truncate the last event.
let mut c = WitnessChain::new();
c.append("only", b"x", 100);
let mut buf = Vec::new();
c.write_jsonl(&mut buf).unwrap();
// Strip the trailing \n.
if buf.last() == Some(&b'\n') {
buf.pop();
}
let read = WitnessChain::read_jsonl(buf.as_slice()).unwrap();
assert_eq!(read.len(), 1);
}
}