# `ExDataSketch`
[🔗](https://github.com/thanos/ex_data_sketch/blob/main/lib/ex_data_sketch.ex#L1)

Production-grade streaming data sketching algorithms for Elixir.

ExDataSketch provides probabilistic data structures for approximate counting,
frequency and quantile estimation, and membership testing on streaming data.
All sketch state is stored as Elixir-owned binaries, enabling straightforward
serialization, distribution, and persistence.

## Sketch Families

- `ExDataSketch.HLL` -- HyperLogLog for cardinality (distinct count) estimation.
- `ExDataSketch.CMS` -- Count-Min Sketch for frequency estimation.
- `ExDataSketch.Theta` -- Theta Sketch for set operations on cardinalities.
- `ExDataSketch.KLL` -- KLL Sketch for rank and quantile estimation.
- `ExDataSketch.DDSketch` -- DDSketch for value-relative-accuracy quantile estimation.
- `ExDataSketch.FrequentItems` -- SpaceSaving for approximate heavy-hitter detection.
- `ExDataSketch.Bloom` -- Bloom filter for probabilistic membership testing.
- `ExDataSketch.Cuckoo` -- Cuckoo filter for membership testing with deletion support.
- `ExDataSketch.Quotient` -- Quotient filter for membership testing with deletion and merge.
- `ExDataSketch.CQF` -- Counting Quotient Filter for multiset membership with approximate counting.
- `ExDataSketch.XorFilter` -- Xor filter for static, immutable membership testing.
- `ExDataSketch.IBLT` -- Invertible Bloom Lookup Table for set reconciliation.
- `ExDataSketch.FilterChain` -- Capability-aware composition framework for membership filters.
- `ExDataSketch.REQ` -- REQ Sketch for relative error quantiles with tail accuracy.
- `ExDataSketch.MisraGries` -- Misra-Gries for deterministic heavy-hitter detection.
- `ExDataSketch.Quantiles` -- Facade for quantile sketch algorithms.

## Architecture

- **Binary state**: All sketch state is stored as canonical Elixir binaries,
  with no opaque NIF resources.
- **Backend system**: Computation is dispatched through backend modules.
  `ExDataSketch.Backend.Pure` (pure Elixir) is always available.
  `ExDataSketch.Backend.Rust` is optional and adds NIF acceleration;
  precompiled binaries are provided.
- **Serialization**: ExDataSketch-native format (EXSK) for all sketches,
  plus Apache DataSketches interop for Theta CompactSketch.
- **Deterministic hashing**: `ExDataSketch.Hash` provides a stable 64-bit
  hash interface for reproducible results.
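
Because a sketch holds no opaque NIF resources, it should behave like any other Elixir term. The snippet below is a minimal illustration of that property using generic BEAM term serialization; it is not the library's EXSK format, and it assumes the `ex_data_sketch` dependency is available.

```elixir
sketch =
  ExDataSketch.HLL.new(p: 10)
  |> ExDataSketch.update_many(["a", "b", "c"])

# Generic BEAM term encoding, NOT the EXSK format described above.
encoded = :erlang.term_to_binary(sketch)
decoded = :erlang.binary_to_term(encoded)

# Assumption: the struct contains only plain terms, so the round trip
# yields an equal value that can still be queried.
same_struct? = decoded == sketch
same_estimate? = ExDataSketch.HLL.estimate(decoded) == ExDataSketch.HLL.estimate(sketch)
```

For storage or cross-language exchange, the EXSK and Apache DataSketches formats listed above are presumably the better fit; this round trip only shows that a sketch can be passed between processes or nodes like ordinary data.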

## Quick Example

    # Cardinality estimation with HLL
    sketch = ExDataSketch.HLL.new(p: 14)
    sketch = ExDataSketch.update_many(sketch, ["alice", "bob", "alice"])
    ExDataSketch.HLL.estimate(sketch)

    # Frequency estimation with CMS
    sketch = ExDataSketch.CMS.new(width: 2048, depth: 5)
    sketch = ExDataSketch.update_many(sketch, ["page_a", "page_a", "page_b"])
    ExDataSketch.CMS.estimate(sketch, "page_a")

## Integration Patterns

Each sketch module provides convenience functions for ecosystem integration:

- `from_enumerable/2` -- build a sketch from any `Enumerable` in one call.
- `merge_many/1` -- merge a collection of sketches (e.g., from parallel workers).
- `reducer/1` -- returns a 2-arity function for use with `Enum.reduce/3`, Flow, etc.
- `merger/1` -- returns a 2-arity function for merging sketches in reduce operations.
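
As a rough sketch of the parallel-worker case, the example below builds one HLL sketch per chunk with `Task.async_stream/2` and combines them with `merge_many/1`. The input data is made up for illustration, and passing a plain list to `merge_many/1` is an assumption based on the description above; check the module documentation for the exact argument shape.

```elixir
alias ExDataSketch.HLL

# Hypothetical input: chunks of user IDs that overlap between chunks.
chunks = [
  ["alice", "bob"],
  ["bob", "carol"],
  ["carol", "dave"],
  ["dave", "alice"]
]

# Build one partial sketch per chunk in parallel.
# Every partial sketch uses the same precision so the sketches stay mergeable.
partials =
  chunks
  |> Task.async_stream(fn chunk ->
    HLL.new(p: 14) |> ExDataSketch.update_many(chunk)
  end)
  |> Enum.map(fn {:ok, sketch} -> sketch end)

# Assumption: merge_many/1 accepts a list of sketches of the same type.
merged = HLL.merge_many(partials)

# Approximate distinct count across all chunks.
estimate = HLL.estimate(merged)
```

Merging partial sketches avoids sending raw items to a single process; only the compact sketch state moves between workers.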

See the [Integration Guide](integrations.md) for examples with Flow, Broadway,
Explorer, Nx, and other ecosystem libraries.

See the [Quick Start guide](quick_start.md) for more examples.

# `update_many`

```elixir
@spec update_many(
  ExDataSketch.HLL.t()
  | ExDataSketch.CMS.t()
  | ExDataSketch.Theta.t()
  | ExDataSketch.KLL.t()
  | ExDataSketch.DDSketch.t()
  | ExDataSketch.FrequentItems.t()
  | ExDataSketch.Bloom.t()
  | ExDataSketch.Cuckoo.t()
  | ExDataSketch.Quotient.t()
  | ExDataSketch.CQF.t()
  | ExDataSketch.IBLT.t()
  | ExDataSketch.REQ.t()
  | ExDataSketch.MisraGries.t(),
  Enumerable.t()
) ::
  ExDataSketch.HLL.t()
  | ExDataSketch.CMS.t()
  | ExDataSketch.Theta.t()
  | ExDataSketch.KLL.t()
  | ExDataSketch.DDSketch.t()
  | ExDataSketch.FrequentItems.t()
  | ExDataSketch.Bloom.t()
  | ExDataSketch.Cuckoo.t()
  | ExDataSketch.Quotient.t()
  | ExDataSketch.CQF.t()
  | ExDataSketch.IBLT.t()
  | ExDataSketch.REQ.t()
  | ExDataSketch.MisraGries.t()
```

Updates a sketch with multiple items in a single pass.

Delegates to the appropriate sketch module's `update_many/2` based on
the struct type.
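
Delegation on the struct type can be pictured with the self-contained sketch below. The `DispatchDemo` modules are hypothetical stand-ins written for this illustration; they are not part of ExDataSketch and do not reflect its actual implementation.

```elixir
defmodule DispatchDemo do
  # Hypothetical stand-in for a sketch module: counts every item seen.
  defmodule Counter do
    defstruct count: 0

    def update_many(%__MODULE__{count: count} = sketch, items) do
      %{sketch | count: count + Enum.count(items)}
    end
  end

  # Hypothetical stand-in for a sketch module: tracks distinct items exactly.
  defmodule Distinct do
    defstruct seen: MapSet.new()

    def update_many(%__MODULE__{seen: seen} = sketch, items) do
      %{sketch | seen: Enum.into(items, seen)}
    end
  end

  # Bind the struct's module in the pattern, then call that module's
  # own update_many/2 with the same arguments.
  def update_many(%module{} = sketch, items) do
    module.update_many(sketch, items)
  end
end

counter = DispatchDemo.update_many(%DispatchDemo.Counter{}, ["a", "b", "a"])
distinct = DispatchDemo.update_many(%DispatchDemo.Distinct{}, ["a", "b", "a"])

counter.count               # 3
MapSet.size(distinct.seen)  # 2
```

The caller uses one entry point regardless of sketch type, and the returned value keeps the type of the struct that was passed in.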

## Examples

    iex> sketch = ExDataSketch.HLL.new(p: 10)
    iex> sketch = ExDataSketch.update_many(sketch, ["a", "b"])
    iex> ExDataSketch.HLL.estimate(sketch) > 0.0
    true

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
