Pkcs11ex.Audit.CanonicalEncoding (pkcs11ex_audit v0.1.0)

Copy Markdown View Source

Stable canonical-bytes encoder for audit-chain hash inputs.

Why this exists

Pkcs11ex.Audit binds each entry's content_hash to a canonical byte representation of its inputs. The previous implementation used :erlang.term_to_binary/2 with the :deterministic flag — that flag fixes intra-version map-key ordering, but the External Term Format itself is not stable across Erlang/OTP releases. A routine OTP upgrade can therefore re-encode the same logical term to different bytes, which invalidates every previously stored content_hash and breaks Pkcs11ex.Audit.verify/1 for the entire chain. The audit library exists to be tamper-evident across time; resting on an unstable encoding undermines exactly that claim.

This module is the documented replacement: a small, explicit byte format I control, versioned at the front so any future format change can coexist with old chains by branching on the version byte.

Format v1 — wire layout

<type tag, 1 byte> <type-specific payload>

Where the type tags are:

0x00  nil
0x01  true
0x02  false
0x03  atom        : <len:32-big> <utf8 bytes of `Atom.to_string/1`>
0x04  integer     : <sign:8 (0 = >=0, 1 = <0)> <len:32-big> <big-endian magnitude>
0x05  binary      : <len:32-big> <bytes>
0x06  list        : <len:32-big of encoded body> <encoded element 1> <encoded element 2> ...
0x07  map         : <len:32-big of encoded body> <encoded (k,v) pairs sorted by encoded-key bytes>
0x08  tuple       : <len:32-big of encoded body> <encoded element 1> <encoded element 2> ...
0x09  DateTime    : <len:32-big> <utf8 bytes of `DateTime.to_iso8601/1`>

Maps are encoded with their (encoded_key, encoded_value) pairs sorted by the byte representation of the encoded keys, breaking ties lexicographically. This is the canonicalisation discipline that makes the encoding stable across map-key insertion order.

Floats are deliberately rejected: float representation has too many cross-platform footguns (NaN, denormals, signed zero) to be safely hashed. Callers needing to log a float should convert to a string or a fixed-precision integer first.

Anything that's not one of the supported types raises ArgumentError. The audit library catches it at the entry point (Pkcs11ex.Audit.append/3) and surfaces it as {:error, {:invalid_payload, term}}.

Format versioning

Pkcs11ex.Audit.compute_hash/4 prefixes the canonical bytes with a single-byte format tag (@hash_format_version in audit.ex). Future format revisions bump the tag; verify-time hash recomputation reads the tag from the stored entry's :hash_format field (if added later) or assumes v1 (current default). Old entries hashed under format v1 stay verifiable forever — that's the point of versioning.

Summary

Functions

Encode term to canonical v1 bytes. See moduledoc for the format.

Functions

encode_v1(term)

@spec encode_v1(term()) :: binary()

Encode term to canonical v1 bytes. See moduledoc for the format.

Raises ArgumentError for unsupported types (floats, references, PIDs, ports, functions, structs other than DateTime).