ExDataSketch.Codec (ExDataSketch v0.7.1)


ExDataSketch-native binary serialization codec (EXSK format).

Provides a stable binary format for serializing and deserializing sketch state. This format is internal to ExDataSketch and is not compatible with Apache DataSketches. For cross-language interop, use the serialize_datasketches/1 and deserialize_datasketches/1 functions on individual sketch modules.

Binary Layout

All multi-byte integers are little-endian.

Offset  Size    Field
------  ------  -----
0       4       Magic bytes: "EXSK" (0x45 0x58 0x53 0x4B)
4       1       Format version (u8, currently 1)
5       1       Sketch ID (u8, see Sketch IDs below)
6       4       Params length (u32 little-endian)
10      N       Params binary (sketch-specific parameters)
10+N    4       State length (u32 little-endian)
14+N    M       State binary (raw sketch state)

Total: 14 + N + M bytes.
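The layout above maps directly onto Elixir binary pattern matching. The following is a minimal sketch, not the library's implementation: the module name ExskLayoutSketch and its build/4 and parse/1 functions are hypothetical, and rejecting trailing bytes is a choice made for this sketch only.

```elixir
defmodule ExskLayoutSketch do
  # Hypothetical helper module, not part of ExDataSketch. It mirrors the
  # layout table using plain binary construction and pattern matching.

  # Build a frame: magic, version (u8), sketch ID (u8), then each payload
  # prefixed by its length as a u32 little-endian integer.
  def build(sketch_id, version, params, state)
      when is_binary(params) and is_binary(state) do
    <<"EXSK", version::8, sketch_id::8,
      byte_size(params)::little-32, params::binary,
      byte_size(state)::little-32, state::binary>>
  end

  # Split a frame back into its fields. A binary that does not match the
  # layout exactly (wrong magic, truncated, or trailing bytes) yields :error.
  def parse(
        <<"EXSK", version::8, sketch_id::8,
          n::little-32, params::binary-size(n),
          m::little-32, state::binary-size(m)>>
      ) do
    {:ok, %{version: version, sketch_id: sketch_id, params: params, state: state}}
  end

  def parse(_other), do: :error
end
```

With a 1-byte params binary and a 3-byte state binary, build/4 produces 14 + 1 + 3 = 18 bytes, which matches the total given above.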

Sketch IDs

  • 1: HLL (HyperLogLog)
  • 2: CMS (Count-Min Sketch)
  • 3: Theta
  • 4: KLL (Quantiles)
  • 5: DDSketch (Quantiles)
  • 6: FrequentItems (SpaceSaving)
  • 7: Bloom
  • 8: Cuckoo
  • 9: Quotient
  • 10: CQF (Counting Quotient Filter)
  • 11: XorFilter
  • 12: IBLT
  • 13: REQ (Relative Error Quantiles)
  • 14: MisraGries
  • 15: ULL (UltraLogLog)
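When inspecting EXSK binaries by hand, it can help to turn the ID byte back into a readable name. The sketch below assumes a hypothetical SketchIdNames module whose table is copied from the list above; the library itself exposes the IDs through its sketch_id_*/0 functions.

```elixir
defmodule SketchIdNames do
  # Hypothetical lookup table, copied from the Sketch IDs list.
  # It is not part of ExDataSketch.
  @names %{
    1 => "HLL",
    2 => "CMS",
    3 => "Theta",
    4 => "KLL",
    5 => "DDSketch",
    6 => "FrequentItems",
    7 => "Bloom",
    8 => "Cuckoo",
    9 => "Quotient",
    10 => "CQF",
    11 => "XorFilter",
    12 => "IBLT",
    13 => "REQ",
    14 => "MisraGries",
    15 => "ULL"
  }

  # Returns {:ok, name} for a known ID and :error for anything else.
  def name(id), do: Map.fetch(@names, id)
end
```

Returning :error for unknown IDs keeps the caller in control of how to report a byte outside the documented 1..15 range.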

Versioning

The format version byte lets the layout change in later releases without ambiguity. Decoders must reject any version they do not support and return a clear error message.
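One way a decoder could apply this rule is to read only the magic bytes and the version byte before touching the rest of the payload. This is a sketch under stated assumptions: the VersionGateSketch module, its check/1 function, the @supported list, and the error strings are all hypothetical and do not reproduce the library's own messages.

```elixir
defmodule VersionGateSketch do
  # Hypothetical module. @supported is an assumption for illustration:
  # this page documents 1 as the current format version.
  @supported [1]

  # Accept the binary only when the version byte is in the supported list.
  def check(<<"EXSK", version::8, _rest::binary>>) when version in @supported do
    {:ok, version}
  end

  # Known magic but unknown version: name the offending version in the error.
  def check(<<"EXSK", version::8, _rest::binary>>) do
    {:error,
     "unsupported EXSK format version #{version}, supported: #{Enum.join(@supported, ", ")}"}
  end

  # Anything else is not an EXSK header at all.
  def check(_other), do: {:error, "not an EXSK binary"}
end
```

Checking the version before parsing the length fields means a future layout change cannot be misread as a valid version 1 frame.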

Summary

Functions

decode(arg1)

Decodes an EXSK binary into its components.

encode(sketch_id, version, params_bin, state_bin)

Encodes sketch data into the EXSK binary format.

magic()

Returns the magic bytes used in the EXSK format header.

sketch_id_bloom()

Returns the sketch ID constant for Bloom.

sketch_id_cms()

Returns the sketch ID constant for CMS.

sketch_id_cqf()

Returns the sketch ID constant for CQF.

sketch_id_cuckoo()

Returns the sketch ID constant for Cuckoo.

sketch_id_ddsketch()

Returns the sketch ID constant for DDSketch.

sketch_id_fi()

Returns the sketch ID constant for FrequentItems.

sketch_id_hll()

Returns the sketch ID constant for HLL.

sketch_id_iblt()

Returns the sketch ID constant for IBLT.

sketch_id_kll()

Returns the sketch ID constant for KLL.

sketch_id_mg()

Returns the sketch ID constant for MisraGries.

sketch_id_quotient()

Returns the sketch ID constant for Quotient.

sketch_id_req()

Returns the sketch ID constant for REQ.

sketch_id_theta()

Returns the sketch ID constant for Theta.

sketch_id_ull()

Returns the sketch ID constant for ULL.

sketch_id_xor()

Returns the sketch ID constant for XorFilter.

version()

Returns the current format version.

Types

decoded()

@type decoded() :: %{
  version: pos_integer(),
  sketch_id: sketch_id(),
  params: binary(),
  state: binary()
}

sketch_id()

@type sketch_id() :: 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15

Functions

decode(arg1)

@spec decode(binary()) :: {:ok, decoded()} | {:error, Exception.t()}

Decodes an EXSK binary into its components.

Returns {:ok, map} on success or {:error, %ExDataSketch.Errors.DeserializationError{}} on failure. On success, the map contains the keys :version, :sketch_id, :params, and :state.

Examples

iex> bin = ExDataSketch.Codec.encode(1, 1, <<14>>, <<0, 0>>)
iex> {:ok, decoded} = ExDataSketch.Codec.decode(bin)
iex> decoded.sketch_id
1
iex> decoded.params
<<14>>
iex> decoded.state
<<0, 0>>

iex> ExDataSketch.Codec.decode(<<"BAAD", 1, 1, 0::32, 0::32>>)
{:error, %ExDataSketch.Errors.DeserializationError{message: "deserialization failed: invalid magic bytes, expected EXSK"}}

iex> ExDataSketch.Codec.decode(<<1, 2>>)
{:error, %ExDataSketch.Errors.DeserializationError{message: "deserialization failed: binary too short for EXSK header"}}
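Callers typically branch on the two return shapes shown above. The sketch below assumes a hypothetical DecodeResultSketch.summarize/1 helper that takes the tuple decode/1 is documented to return, so it runs without ExDataSketch installed; the output strings are illustrative only.

```elixir
defmodule DecodeResultSketch do
  # Hypothetical caller-side helper, not part of ExDataSketch.
  # Success clause: destructure the four documented keys of the decoded map.
  def summarize({:ok, %{version: v, sketch_id: id, params: params, state: state}}) do
    "sketch #{id} (format v#{v}): " <>
      "#{byte_size(params)} params bytes, #{byte_size(state)} state bytes"
  end

  # Failure clause: the error struct in the examples has a :message field,
  # and structs match map patterns, so only that field is read here.
  def summarize({:error, %{message: message}}) do
    "decode failed: #{message}"
  end
end
```

In application code the argument would be the result of ExDataSketch.Codec.decode/1, for example `bin |> ExDataSketch.Codec.decode() |> DecodeResultSketch.summarize()`.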

encode(sketch_id, version, params_bin, state_bin)

@spec encode(sketch_id(), pos_integer(), binary(), binary()) :: binary()

Encodes sketch data into the EXSK binary format.

Parameters

  • sketch_id - sketch type identifier (1=HLL, 2=CMS, 3=Theta, 4=KLL, 5=DDSketch, 6=FrequentItems, 7=Bloom, 8=Cuckoo, 9=Quotient, 10=CQF, 11=XorFilter, 12=IBLT, 13=REQ, 14=MisraGries, 15=ULL)
  • version - format version (use Codec.version/0 for current)
  • params_bin - binary-encoded sketch parameters
  • state_bin - raw sketch state binary

Examples

iex> bin = ExDataSketch.Codec.encode(1, 1, <<14>>, <<0, 0, 0>>)
iex> <<"EXSK", 1, 1, _rest::binary>> = bin
iex> byte_size(bin)
18

magic()

@spec magic() :: binary()

Returns the magic bytes used in the EXSK format header.

Examples

iex> ExDataSketch.Codec.magic()
"EXSK"

sketch_id_bloom()

@spec sketch_id_bloom() :: sketch_id()

Returns the sketch ID constant for Bloom.

Examples

iex> ExDataSketch.Codec.sketch_id_bloom()
7

sketch_id_cms()

@spec sketch_id_cms() :: sketch_id()

Returns the sketch ID constant for CMS.

Examples

iex> ExDataSketch.Codec.sketch_id_cms()
2

sketch_id_cqf()

@spec sketch_id_cqf() :: sketch_id()

Returns the sketch ID constant for CQF.

Examples

iex> ExDataSketch.Codec.sketch_id_cqf()
10

sketch_id_cuckoo()

@spec sketch_id_cuckoo() :: sketch_id()

Returns the sketch ID constant for Cuckoo.

Examples

iex> ExDataSketch.Codec.sketch_id_cuckoo()
8

sketch_id_ddsketch()

@spec sketch_id_ddsketch() :: sketch_id()

Returns the sketch ID constant for DDSketch.

Examples

iex> ExDataSketch.Codec.sketch_id_ddsketch()
5

sketch_id_fi()

@spec sketch_id_fi() :: sketch_id()

Returns the sketch ID constant for FrequentItems.

Examples

iex> ExDataSketch.Codec.sketch_id_fi()
6

sketch_id_hll()

@spec sketch_id_hll() :: sketch_id()

Returns the sketch ID constant for HLL.

Examples

iex> ExDataSketch.Codec.sketch_id_hll()
1

sketch_id_iblt()

@spec sketch_id_iblt() :: sketch_id()

Returns the sketch ID constant for IBLT.

Examples

iex> ExDataSketch.Codec.sketch_id_iblt()
12

sketch_id_kll()

@spec sketch_id_kll() :: sketch_id()

Returns the sketch ID constant for KLL.

Examples

iex> ExDataSketch.Codec.sketch_id_kll()
4

sketch_id_mg()

@spec sketch_id_mg() :: sketch_id()

Returns the sketch ID constant for MisraGries.

Examples

iex> ExDataSketch.Codec.sketch_id_mg()
14

sketch_id_quotient()

@spec sketch_id_quotient() :: sketch_id()

Returns the sketch ID constant for Quotient.

Examples

iex> ExDataSketch.Codec.sketch_id_quotient()
9

sketch_id_req()

@spec sketch_id_req() :: sketch_id()

Returns the sketch ID constant for REQ.

Examples

iex> ExDataSketch.Codec.sketch_id_req()
13

sketch_id_theta()

@spec sketch_id_theta() :: sketch_id()

Returns the sketch ID constant for Theta.

Examples

iex> ExDataSketch.Codec.sketch_id_theta()
3

sketch_id_ull()

@spec sketch_id_ull() :: sketch_id()

Returns the sketch ID constant for ULL.

Examples

iex> ExDataSketch.Codec.sketch_id_ull()
15

sketch_id_xor()

@spec sketch_id_xor() :: sketch_id()

Returns the sketch ID constant for XorFilter.

Examples

iex> ExDataSketch.Codec.sketch_id_xor()
11

version()

@spec version() :: pos_integer()

Returns the current format version.

Examples

iex> ExDataSketch.Codec.version()
1