ExDataSketch.DDSketch (ExDataSketch v0.7.1)

Copy Markdown View Source

DDSketch quantiles sketch for value-relative-accuracy quantile estimation.

DDSketch uses logarithmic bucket mapping to provide quantile estimates with a guaranteed relative error bound on the value (not the rank). This makes it ideal for latency percentiles and telemetry where relative accuracy matters more than rank accuracy.

Accuracy

The alpha parameter (relative_accuracy) controls bucket width. A query for quantile q returns a value v such that the true value v' satisfies v' * (1 - alpha) <= v <= v' * (1 + alpha).

alphaRelative ErrorTypical Use
0.055%Coarse monitoring
0.011%Standard telemetry
0.0050.5%High-precision SLOs
0.0010.1%Scientific measurement

Constraints

  • Only non-negative float64 values are accepted. Negative values, NaN, and Inf are rejected with an error.
  • Zero is tracked in a dedicated zero_count and does not flow through the logarithmic index mapping.

Binary State Layout (DDS1)

All multi-byte fields are little-endian.

HEADER (fixed 96 bytes):
  magic:           4 bytes  "DDS1"
  version:         u8       1
  flags:           u8       bit0=has_negative_support (0 for v0.2.1)
  reserved:        u16      0
  alpha:           f64 LE   relative accuracy parameter
  gamma:           f64 LE   (1 + alpha) / (1 - alpha)
  log_gamma:       f64 LE   :math.log(gamma)
  min_indexable:   f64 LE   smallest positive value mappable through log
  n:               u64 LE   total count (including zeros)
  zero_count:      u64 LE
  min_value:       f64 LE   (NaN sentinel for empty)
  max_value:       f64 LE   (NaN sentinel for empty)
  sparse_count:    u32 LE   number of sparse bin entries
  dense_min_index: i32 LE   (0 when dense_len=0)
  dense_len:       u32 LE   (0 in v0.2.1)
  reserved2:       u32 LE   0

BODY:
  Sparse region:   sparse_count x (index: i32 LE, count: u32 LE)
  Dense region:    dense_len x u32 LE counts (empty in v0.2.1)

Options

  • :alpha - relative accuracy, float in (0.0, 1.0) (default: 0.01). Smaller values give tighter accuracy bounds but use more buckets.
  • :backend - backend module (default: ExDataSketch.Backend.Pure).

Merge Properties

DDSketch merge is associative and commutative. Both sketches must have identical alpha parameters to merge.

Summary

Functions

Returns the total number of items inserted into the sketch.

Deserializes an EXSK binary into a DDSketch.

Creates a new DDSketch from an enumerable of numeric items.

Returns the maximum value seen by the sketch, or nil if empty.

Merges two DDSketch instances.

Merges a non-empty enumerable of DDSketch instances into one.

Returns a 2-arity merge function suitable for combining sketches.

Returns the minimum value seen by the sketch, or nil if empty.

Creates a new DDSketch.

Returns the approximate value at the given normalized rank.

Returns approximate values at multiple normalized ranks.

Returns the approximate normalized rank of a given value.

Returns a 2-arity reducer function suitable for Enum.reduce/3 and similar.

Serializes the sketch to the ExDataSketch-native EXSK binary format.

Returns the size of the sketch state in bytes.

Updates the sketch with a single non-negative numeric value.

Updates the sketch with multiple non-negative numeric values in a single pass.

Types

t()

@type t() :: %ExDataSketch.DDSketch{
  backend: module(),
  opts: keyword(),
  state: binary()
}

Functions

count(dd_sketch)

@spec count(t()) :: non_neg_integer()

Returns the total number of items inserted into the sketch.

Examples

iex> ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.count()
0

deserialize(binary)

@spec deserialize(binary()) :: {:ok, t()} | {:error, Exception.t()}

Deserializes an EXSK binary into a DDSketch.

Returns {:ok, sketch} on success or {:error, reason} on failure.

Examples

iex> ExDataSketch.DDSketch.deserialize(<<"invalid">>)
{:error, %ExDataSketch.Errors.DeserializationError{message: "deserialization failed: invalid magic bytes, expected EXSK"}}

from_enumerable(enumerable, opts \\ [])

@spec from_enumerable(
  Enumerable.t(),
  keyword()
) :: t()

Creates a new DDSketch from an enumerable of numeric items.

Equivalent to new(opts) |> update_many(enumerable).

Options

Same as new/1.

Examples

iex> sketch = ExDataSketch.DDSketch.from_enumerable([1.0, 2.0, 3.0], alpha: 0.01)
iex> ExDataSketch.DDSketch.count(sketch)
3

max_value(dd_sketch)

@spec max_value(t()) :: float() | nil

Returns the maximum value seen by the sketch, or nil if empty.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many([3.0, 1.0, 2.0])
iex> ExDataSketch.DDSketch.max_value(sketch)
3.0

merge(sketch, dd_sketch)

@spec merge(t(), t()) :: t()

Merges two DDSketch instances.

Both sketches must have the same alpha parameter. The result is a sketch whose quantile estimates approximate the union of both input multisets.

Raises ExDataSketch.Errors.IncompatibleSketchesError if the sketches have different alpha parameters.

Examples

iex> a = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many([1.0, 2.0])
iex> b = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many([3.0, 4.0])
iex> merged = ExDataSketch.DDSketch.merge(a, b)
iex> ExDataSketch.DDSketch.count(merged)
4

merge_many(sketches)

@spec merge_many(Enumerable.t()) :: t()

Merges a non-empty enumerable of DDSketch instances into one.

Raises Enum.EmptyError if the enumerable is empty.

Examples

iex> sketches = Enum.map(1..3, fn i ->
...>   ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update(i * 1.0)
...> end)
iex> merged = ExDataSketch.DDSketch.merge_many(sketches)
iex> ExDataSketch.DDSketch.count(merged)
3

merger(opts \\ [])

@spec merger(keyword()) :: (t(), t() -> t())

Returns a 2-arity merge function suitable for combining sketches.

The returned function calls merge/2 on two sketches.

Examples

iex> is_function(ExDataSketch.DDSketch.merger(), 2)
true

min_value(dd_sketch)

@spec min_value(t()) :: float() | nil

Returns the minimum value seen by the sketch, or nil if empty.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many([3.0, 1.0, 2.0])
iex> ExDataSketch.DDSketch.min_value(sketch)
1.0

new(opts \\ [])

@spec new(keyword()) :: t()

Creates a new DDSketch.

Options

  • :alpha - relative accuracy, float in (0.0, 1.0) (default: 0.01).
  • :backend - backend module (default: ExDataSketch.Backend.Pure).

Examples

iex> sketch = ExDataSketch.DDSketch.new(alpha: 0.01)
iex> sketch.opts
[alpha: 0.01]
iex> ExDataSketch.DDSketch.count(sketch)
0

quantile(dd_sketch, rank)

@spec quantile(t(), float()) :: float() | nil

Returns the approximate value at the given normalized rank.

The rank must be in the range [0.0, 1.0], where 0.0 is the minimum and 1.0 is the maximum. For example, quantile(sketch, 0.5) returns the approximate median.

Returns nil if the sketch is empty.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many(1..100)
iex> median = ExDataSketch.DDSketch.quantile(sketch, 0.5)
iex> is_float(median)
true

quantiles(sketch, ranks)

@spec quantiles(t(), [float()]) :: [float() | nil]

Returns approximate values at multiple normalized ranks.

Convenience wrapper around quantile/2 for batch queries.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many(1..100)
iex> [q25, q50, q75] = ExDataSketch.DDSketch.quantiles(sketch, [0.25, 0.5, 0.75])
iex> q25 < q50 and q50 < q75
true

rank(dd_sketch, value)

@spec rank(t(), number()) :: float() | nil

Returns the approximate normalized rank of a given value.

The rank is the fraction of items in the sketch that are less than or equal to the given value. Returns nil if the sketch is empty.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many(1..100)
iex> r = ExDataSketch.DDSketch.rank(sketch, 50.0)
iex> is_float(r)
true

reducer()

@spec reducer() :: (number(), t() -> t())

Returns a 2-arity reducer function suitable for Enum.reduce/3 and similar.

The returned function calls update/2 on each item.

Examples

iex> is_function(ExDataSketch.DDSketch.reducer(), 2)
true

serialize(dd_sketch)

@spec serialize(t()) :: binary()

Serializes the sketch to the ExDataSketch-native EXSK binary format.

The serialized binary includes magic bytes, version, sketch type, parameters, and state. See ExDataSketch.Codec for format details.

Examples

iex> sketch = ExDataSketch.DDSketch.new()
iex> binary = ExDataSketch.DDSketch.serialize(sketch)
iex> <<"EXSK", _rest::binary>> = binary
iex> byte_size(binary) > 0
true

size_bytes(dd_sketch)

@spec size_bytes(t()) :: non_neg_integer()

Returns the size of the sketch state in bytes.

Examples

iex> sketch = ExDataSketch.DDSketch.new()
iex> ExDataSketch.DDSketch.size_bytes(sketch) > 0
true

update(sketch, value)

@spec update(t(), number()) :: t()

Updates the sketch with a single non-negative numeric value.

The value is converted to float64. Negative values, NaN, and Inf are rejected with an ArgumentError.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update(42.0)
iex> ExDataSketch.DDSketch.count(sketch)
1

update_many(sketch, items)

@spec update_many(t(), Enumerable.t()) :: t()

Updates the sketch with multiple non-negative numeric values in a single pass.

More efficient than calling update/2 repeatedly because it decodes and encodes the state binary only once.

Examples

iex> sketch = ExDataSketch.DDSketch.new() |> ExDataSketch.DDSketch.update_many([1.0, 2.0, 3.0])
iex> ExDataSketch.DDSketch.count(sketch)
3