Audit-Chain

The one-liner

Every privileged write in Ujex appends an entry to an append-only, SHA-256 hash-chained ledger. An hourly verifier recomputes the chain and pages the oncall on any divergence.

Why we bother

Traditional audit logs are "append-only" by convention: no code deletes them, and nobody looks. In a regulated deployment that's not enough. An attacker with write access to Firestore could quietly erase the rows that show what they did, and the customer's auditor would never know.

The hash-chain replaces the convention with math:

Every entry stores entryHash = sha256(prevHash || payloadHash || timestamp || seq).
payloadHash = sha256(canonical_json(payload)) so content tampering is detectable.
prevHash is the previous entry's entryHash, threading the chain together.

You cannot rewrite the middle without recomputing every hash after it, and you cannot append into the past without breaking the chain tip.
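To make the "cannot rewrite the middle" claim concrete, here is a minimal sketch that chains three payloads twice, once with the middle payload edited. It assumes that `||` means plain string concatenation hashed as UTF-8 and that `seq` starts at 0; the helper names and the fixed timestamps are hypothetical and chosen only for illustration.

```python
import hashlib
import json


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def entry_hash(prev_hash: str, payload: dict, timestamp_ms: int, seq: int) -> str:
    # payloadHash over compact, key-sorted JSON; entryHash over the concatenation.
    payload_json = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    payload_hash = sha256_hex(payload_json)
    return sha256_hex(prev_hash + payload_hash + str(timestamp_ms) + str(seq))


def chain_hashes(payloads: list[dict]) -> list[str]:
    """Return the entryHash of each payload when chained in order."""
    hashes, prev = [], ""  # the genesis entry has an empty prevHash
    for seq, payload in enumerate(payloads):
        # Fixed illustrative timestamps so both runs differ only in the payload.
        prev = entry_hash(prev, payload, 1_700_000_000_000 + seq, seq)
        hashes.append(prev)
    return hashes


original = chain_hashes([{"action": "a"}, {"action": "b"}, {"action": "c"}])
tampered = chain_hashes([{"action": "a"}, {"action": "X"}, {"action": "c"}])

# The entry before the edit is untouched; the edited entry and every later one change.
changed = [o != t for o, t in zip(original, tampered)]
print(changed)  # [False, True, True]
```

The third entry's payload is identical in both runs, yet its hash still changes because it inherits a different prevHash.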

Where the code lives

The primitive is the shared package @ujexdev/audit-chain at packages/audit-chain/ (TypeScript) and ujex-audit-chain at packages/audit-chain-py/ (Python). The two implementations share a canonical-JSON serialization so a chain produced in one verifies in the other — useful for auditor-side tooling.

The Ujex-specific usage lives in functions/src/lib/audit.ts, which delegates hashing to the shared package and stores entries at audit/{seq} in Firestore. Every trust primitive (Agent Inbox, Memory provenance, Guardrails, Identity, Approvals, Tools, Evidence export, and admin operations) imports append() and calls it on privileged writes.

Wire format

payloadHash = sha256_hex(canonical_json(payload))
entryHash   = sha256_hex(prevHash || payloadHash || str(timestamp_ms) || str(seq))

Canonical JSON rules:

Keys sorted lexicographically at every level.
No whitespace (sort_keys=True, separators=(",",":")).
Unicode kept verbatim (ensure_ascii=False).
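Numbers use each language's default JSON number formatting; there is no custom formatting, so every producer needs to agree on integer-versus-float representation. Integers such as 0 and 1 serialize identically in JS and Python, but an integral float does not: Python's json.dumps(1.0) emits 1.0 while JavaScript's JSON.stringify(1.0) emits 1, so a payload that must verify in both should carry integers (or strings) rather than floats.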
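The rules above map directly onto Python's standard `json.dumps`. The sketch below is only an illustration of those rules, not the shipped package's code; the function name `canonical_json` is taken from the wire-format notation, and the sample payloads are made up.

```python
import json


def canonical_json(payload: object) -> str:
    # Sorted keys at every level, no whitespace, Unicode kept verbatim.
    return json.dumps(
        payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )


print(canonical_json({"b": 1, "a": {"d": "é", "c": [1, 2]}}))
# {"a":{"c":[1,2],"d":"é"},"b":1}

# An int and an integral float produce different bytes, hence different hashes.
print(canonical_json({"n": 1}), canonical_json({"n": 1.0}))
# {"n":1} {"n":1.0}
```

The second call shows why producers need to agree on number representation: the two outputs differ by two characters, which is enough to change payloadHash.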

Genesis entry: prevHash = "".
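As a rough sketch of the entryHash formula and the genesis rule, the snippet below hashes a genesis entry and the entry that follows it. It assumes `||` is string concatenation with no separator, encoded as UTF-8, and that the first `seq` is 0; neither detail is stated here, so treat both as assumptions. The timestamps and payload are illustrative.

```python
import hashlib


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def entry_hash(prev_hash: str, payload_hash: str, timestamp_ms: int, seq: int) -> str:
    # entryHash = sha256_hex(prevHash || payloadHash || str(timestamp_ms) || str(seq))
    return sha256_hex(prev_hash + payload_hash + str(timestamp_ms) + str(seq))


# payloadHash of an already-canonical JSON string.
payload_hash = sha256_hex('{"action":"postbox.send"}')

# Genesis entry: prevHash is the empty string.
genesis = entry_hash("", payload_hash, 1_700_000_000_000, 0)

# The next entry threads the genesis entryHash in as its prevHash.
second = entry_hash(genesis, payload_hash, 1_700_000_000_001, 1)

print(genesis != second)  # True
```

Both entries carry the same payload, but their entryHash values differ because prevHash, timestamp, and seq all feed the hash.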

Storage shape

audit/{seq}  (Firestore, deny-by-default for client writes)
  seq           number   monotonic per-write counter
  timestamp     number   ms since epoch
  prevHash      string   '' for the genesis entry
  payloadHash   string   sha256 of canonical_json(payload)
  hash          string   entryHash (the threaded one)
  payload       object   the audited event (actorKind, actorId, agentId, action, target, meta)

The hourly verifier

A scheduled Cloud Function runs every hour, streams the entire chain in order, recomputes entryHash for each row, and compares it with the stored hash. On divergence:

Pages the oncall via the alerting channel configured in ops/terraform/monitoring.tf.
Writes a forensic row to audit_breaks/{timestamp} capturing {expectedSeq, firstBadSeq, expectedHash, actualHash}.
Does not stop writes — tampering detection is not mitigation. Deciding what to do is the operator's job.
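The verification pass can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not the Cloud Function itself: rows are plain dicts in the storage shape, `seq` starts at 0, and `||` is UTF-8 string concatenation. `build_chain` and `verify` are hypothetical names; the returned dict mirrors the forensic fields listed above.

```python
import hashlib
import json
from typing import Optional


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical_json(payload: object) -> str:
    return json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def build_chain(payloads: list[dict]) -> list[dict]:
    """Build well-formed rows so the verifier has something to check."""
    rows, prev = [], ""
    for seq, payload in enumerate(payloads):
        ts = 1_700_000_000_000 + seq  # illustrative timestamps
        payload_hash = sha256_hex(canonical_json(payload))
        entry = sha256_hex(prev + payload_hash + str(ts) + str(seq))
        rows.append({"seq": seq, "timestamp": ts, "prevHash": prev,
                     "payloadHash": payload_hash, "hash": entry, "payload": payload})
        prev = entry
    return rows


def verify(rows: list[dict]) -> Optional[dict]:
    """Return None if the chain is intact, else a forensic record for the first bad row."""
    prev_hash = ""
    for expected_seq, row in enumerate(rows):
        payload_hash = sha256_hex(canonical_json(row["payload"]))
        want = sha256_hex(prev_hash + payload_hash + str(row["timestamp"]) + str(row["seq"]))
        intact = (row["seq"] == expected_seq
                  and row["prevHash"] == prev_hash
                  and row["payloadHash"] == payload_hash
                  and row["hash"] == want)
        if not intact:
            return {"expectedSeq": expected_seq, "firstBadSeq": row["seq"],
                    "expectedHash": want, "actualHash": row["hash"]}
        prev_hash = row["hash"]
    return None


rows = build_chain([{"action": "a"}, {"action": "b"}, {"action": "c"}])
clean_result = verify(rows)                # None: nothing to report
rows[1]["payload"]["action"] = "tampered"  # edit a payload without fixing its hashes
break_record = verify(rows)
print(clean_result, break_record["firstBadSeq"])  # None 1
```

Consistent with the last point above, the sketch only reports the break; it takes no action on the ledger.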

Who writes, who reads

Writes: Cloud Functions only. Firestore rules deny client writes on audit/{seq} and audit_breaks/{*}.
Reads (customer-facing): via observeGetUsage / observeGetMessages callables, which enforce agent-owner RBAC.
Reads (auditor-facing): signed evidence export emits the chain records, manifest, optional detached signature, and trust bundle the auditor can re-verify with the CLI.

What every append looks like

import {append} from '../lib/audit.js';

await append({
  actorKind: 'agent',            // human | agent | system
  actorId: 'agent-hello',
  agentId: 'agent-hello',
  action: 'postbox.send',        // dotted namespace
  target: 'msg_01hxyz42',
  meta: {to: ['alice@vendor.com'], subject: 'Re: invoice'},
});

All the standard fields land in the chain. Subsystem-specific data goes in meta.
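For readers working from the Python side, here is a hedged sketch of what an append has to do to produce a row in the storage shape. `InMemoryLedger` is a hypothetical stand-in for the Firestore collection, not part of ujex-audit-chain, and it assumes `seq` starts at 0 and that `||` is UTF-8 string concatenation.

```python
import hashlib
import json
import time


class InMemoryLedger:
    """Hypothetical stand-in for the Firestore-backed audit/{seq} collection."""

    def __init__(self) -> None:
        self.rows: list[dict] = []

    def append(self, payload: dict) -> dict:
        seq = len(self.rows)  # assumption: seq starts at 0
        prev_hash = self.rows[-1]["hash"] if self.rows else ""  # genesis: ""
        timestamp = int(time.time() * 1000)  # ms since epoch
        payload_json = json.dumps(
            payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False
        )
        payload_hash = hashlib.sha256(payload_json.encode("utf-8")).hexdigest()
        entry = prev_hash + payload_hash + str(timestamp) + str(seq)
        row = {
            "seq": seq,
            "timestamp": timestamp,
            "prevHash": prev_hash,
            "payloadHash": payload_hash,
            "hash": hashlib.sha256(entry.encode("utf-8")).hexdigest(),
            "payload": payload,
        }
        self.rows.append(row)
        return row


ledger = InMemoryLedger()
first = ledger.append({
    "actorKind": "agent", "actorId": "agent-hello", "agentId": "agent-hello",
    "action": "postbox.send", "target": "msg_01hxyz42",
    "meta": {"to": ["alice@vendor.com"], "subject": "Re: invoice"},
})
second = ledger.append({
    "actorKind": "system", "actorId": "scheduler", "agentId": "agent-hello",
    "action": "postbox.retry", "target": "msg_01hxyz42", "meta": {},
})
print(second["prevHash"] == first["hash"])  # True
```

The second payload's values are invented for the example. A real store with concurrent writers would presumably need to read the chain tip and write the new row atomically; this sketch ignores that.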

External verification

An auditor can re-verify a signed export without any Ujex code access:

npx @ujexdev/cli audit verify audit-export.json
# or
pip install ujex-audit-chain
python -m ujex_audit_chain verify audit-export.json

Both produce the same verdict — either ok or brokenAt=<seq>.

Design decisions captured in ADRs

Why per-entry timestamp + seq in the hash, rather than just prevHash || payloadHash? So that replaying the chain detects missing rows, not just edits. If row 42 were quietly deleted, rows 43 onward would still link to one another consistently, but the seq gap would show up in the verifier's sequence check. Because seq and timestamp are part of the hashed input, an attacker also cannot renumber or re-date rows to hide the gap without changing every later entryHash.
Why not a Merkle tree instead of a linear chain? A Merkle tree is strictly more powerful (fast range-proof inclusion) but also more complex to implement byte-identically across languages. A linear SHA-256 chain is enough for the auditor use cases we've seen; we can add Merkle inclusion proofs in a later version without breaking existing exports.
Why canonical JSON instead of CBOR / ProtoBuf? Debuggability. Operators and auditors can cat the chain and see what happened. Performance hasn't been a problem at the write rates we target.
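The sequence check mentioned in the first decision can be illustrated on its own. This is a small sketch, assuming the verifier sees `seq` values in ascending order; `find_seq_gaps` is a hypothetical helper name.

```python
def find_seq_gaps(seqs: list[int]) -> list[int]:
    """Return every seq missing between consecutive values of an ascending list."""
    missing = []
    for previous, current in zip(seqs, seqs[1:]):
        # Consecutive rows should differ by exactly 1; anything in between is missing.
        missing.extend(range(previous + 1, current))
    return missing


print(find_seq_gaps([40, 41, 43, 44]))  # [42]
```

This is independent of the hash comparison: the gap is visible from the seq values alone, which is why a deleted row stands out even before any hash is recomputed.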
