# `UltraLogLog.Encoding`
[🔗](https://github.com/thatsme/ultra_log_log/blob/v0.1.0/lib/ultra_log_log/encoding.ex#L1)

Register encoding and merge operations for UltraLogLog.

## Algorithmic core

Each 8-bit register byte `r` is interpreted as a 64-bit "hash prefix"
via `unpack/1`, where:

  * the top 6 bits of `r` encode the position of the leading 1
    (offset and negated, see below),
  * the bottom 2 bits (the `T = d = 2` "extra" bits) encode the two
    bits immediately below that leading 1.

The implicit leading 1 is reconstructed by ORing the mantissa with `4`
(binary `100`). `pack/1` inverts the transform, keeping only the
highest set bit plus the two bits below it.

The merge of two registers is `pack(unpack(a) ||| unpack(b))`.
Combining hash prefixes by bitwise OR before re-packing is what makes
the merge a CRDT operation: commutative, associative, idempotent.

## Register normalization (the `+4p - 8` shift)

Nonzero register values are stored shifted by `+4p - 8` relative to
the paper's encoding, so the range always saturates at byte `255`
regardless of precision.

This is a Hash4j implementation choice, **explicitly acknowledged in
Ertl 2024 §4 ("Practical Implementation")** in the paragraph following
Algorithm 4:

> ULL insertions can be implemented entirely branch-free, as
> exemplified by the implementation in our Hash4j Java library
> [...] which however uses the transformation `r → r + 4p − 8` for
> non-zero registers, so that the largest possible register state is
> always `(4w + 3) + 4p − 8 = 255`, the maximum value of a byte.

The paper's own algorithms (Algorithm 3 insert, Algorithm 4 pack,
Algorithm 5 unpack) do *not* include the shift; without it, register
values are bounded by `4w + 3 = 263 - 4p`, leaving byte values
`≥ 264 - 4p` unreachable for any given `p` and wasting byte space.

The shift is invariant under the full algorithm:

  * `encode/2` applies it implicitly — it falls out of the `pack`
    formula combined with Hash4j's choice of bit position for the
    newly observed update (bit `nlz + p - 1` instead of the paper's
    bit `k + 1`, see `add/2` line citations below).
  * `merge_registers/2` is invariant: `pack(unpack(a) ||| unpack(b))`
    moves to the prefix space where merging is bitwise OR; OR
    distributes over the precision-dependent left-shift that the
    pack/unpack pair effectively performs, so merging shifted bytes
    gives the shifted merge.
  * The estimator absorbs the shift as a constant offset in its
    register-to-table-index mapping (Hash4j's `off = 4p + 4` in
    `OptimalFGRAEstimator`). When we port the estimator, the
    constants will match the shifted encoding by construction.

## Reference (Hash4j v0.17.0)

This module is a bit-exact port of the `pack`/`unpack` helpers and
the bit-update step in [`UltraLogLog.java`](https://github.com/dynatrace-oss/hash4j/blob/v0.17.0/src/main/java/com/dynatrace/hash4j/distinctcount/UltraLogLog.java)
at tag **v0.17.0** (commit `95834ea9`), which is the version that
`dynatrace-research/ultraloglog-paper` pins as a git submodule and
validates the paper's numerical results against. Specific line
citations in this module refer to that tag, not main HEAD.

## Reachable register values

Not every value in `0..255` is a possible output of `pack/1`. For
precision `p`, reachable nonzero bytes have a top-6-bit field that
encodes a leading-1 position in `{p-1, ..., 63}` (with the shift,
bytes `< 4p - 4` are unreachable for that `p`). Bytes outside the
reachable set are never produced by the algorithm; for them,
`merge_registers/2` remains total — it returns *some* byte, never
crashes — but the algebraic laws only hold on the reachable subset.
Property tests should canonicalize via `pack(unpack(_))` (or, more
directly, generate via uniform 64-bit prefix → `pack/1`) before
asserting commutativity/associativity/idempotence.

# `encode`

```elixir
@spec encode(non_neg_integer(), 3..26) :: 0..255
```

Encode the result of observing a single hash in an otherwise-empty
register.

`tail` is the original 64-bit hash with the top `p` bits (the register
index) already shifted out — i.e. `tail = (hash <<< p) &&& 0xFFFF_FFFF_FFFF_FFFF`.

Equivalent to the bit-update step of Hash4j `UltraLogLog.java`
v0.17.0 lines 245–262 (the body of `add(long, StateChangeObserver)`):

    int nlz = Long.numberOfLeadingZeros(~(~hashValue << -q));
    hashPrefix |= 1L << (nlz + ~q);
    byte newState = pack(hashPrefix);

with `~q = p - 1` (Java `q = 64 - p`). The leading-zero count
saturates at `64 - p` for the all-zero tail.

# `leading_zeros_64`

```elixir
@spec leading_zeros_64(non_neg_integer()) :: 0..64
```

Count leading zeros of a 64-bit non-negative integer. Returns 64 if zero.

Pure-Elixir reference. A NIF using BMI/LZCNT is an obvious v0.2 optimization.

# `merge_binaries`

```elixir
@spec merge_binaries(binary(), binary()) :: binary()
```

Element-wise merge of two register binaries. Both must be the same size.

# `merge_registers`

```elixir
@spec merge_registers(0..255, 0..255) :: 0..255
```

Merge two register bytes under the UltraLogLog partial order.

Defined as `pack(unpack(a) ||| unpack(b))`. Idempotent, commutative,
and associative on reachable register values; `0` is the identity.
Total over `0..255` — never raises — but only well-behaved
algebraically on the reachable subset.

Ports Hash4j `UltraLogLog.java` v0.17.0 line 301:

    if (otherR != 0) {
      state[i] = pack(unpack(state[i]) | unpack(otherR));
    }

---

*Consult [api-reference.md](api-reference.md) for complete listing*
