# `ExDataSketch.MisraGries`
[🔗](https://github.com/thanos/ex_data_sketch/blob/main/lib/ex_data_sketch/misra_gries.ex#L1)

MisraGries sketch for deterministic heavy hitter detection.

The Misra-Gries algorithm maintains at most `k` counters to track frequent
items in a data stream. It provides a deterministic guarantee: any item
whose true frequency exceeds `n/k` (where `n` is the total count) is
guaranteed to be tracked.

## Algorithm

- **Update(x)**: If x is tracked, increment its counter. If there are fewer
  than k entries, insert x with count 1. Otherwise, decrement all counters
  by 1 and remove any that reach zero.

- **Guarantee**: If an item appears more than `n/k` times, it will be in the
  counter set when queried. The estimated count is a lower bound on the
  true count, with error at most `n/k`.

## Comparison with FrequentItems (SpaceSaving)

| Feature | MisraGries | FrequentItems |
|---------|-----------|---------------|
| Algorithm | Decrement-all | SpaceSaving (min-replacement) |
| Guarantee | Deterministic: freq > n/k always tracked | Probabilistic with error bounds |
| Counter count | At most k | Exactly k |
| Estimate | Lower bound | Estimate with overcount error |

## Binary State Layout (MG01)

All multi-byte fields are little-endian.

    HEADER (22 bytes):
      magic:       4 bytes  "MG01"
      version:     u8       1
      reserved:    u8       0
      k:           u32 LE   max counters
      n:           u64 LE   total count
      entry_count: u32 LE   number of entries

    ENTRIES (variable):
      entry_count x:
        key_len:   u32 LE
        key:       key_len bytes
        count:     u64 LE

## Options

- `:k` - maximum number of counters (default: 10, must be >= 1).
- `:key_encoding` - key encoding policy: `:binary` (default), `:int`,
  or `{:term, :external}`.
- `:backend` - backend module (default: `ExDataSketch.Backend.Pure`).

## Merge Properties

MisraGries merge is **commutative**. Both sketches must have the same
`k` parameter. Count (`n`) is always exactly additive.

# `t`

```elixir
@type t() :: %ExDataSketch.MisraGries{
  backend: module(),
  opts: keyword(),
  state: binary()
}
```

# `count`

```elixir
@spec count(t()) :: non_neg_integer()
```

Returns the total number of items inserted into the sketch.

## Examples

    iex> ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.count()
    0

# `deserialize`

```elixir
@spec deserialize(binary()) :: {:ok, t()} | {:error, Exception.t()}
```

Deserializes an EXSK binary into a MisraGries sketch.

Returns `{:ok, sketch}` on success or `{:error, reason}` on failure.

## Examples

    iex> ExDataSketch.MisraGries.deserialize(<<"invalid">>)
    {:error, %ExDataSketch.Errors.DeserializationError{message: "deserialization failed: invalid magic bytes, expected EXSK"}}

# `entry_count`

```elixir
@spec entry_count(t()) :: non_neg_integer()
```

Returns the number of distinct tracked entries.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update_many(["a", "b", "c"])
    iex> ExDataSketch.MisraGries.entry_count(sketch)
    3

# `estimate`

```elixir
@spec estimate(t(), term()) :: non_neg_integer()
```

Returns the estimated frequency of an item.

The estimate is a lower bound on the true count. If the item is not
tracked, returns 0.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update_many(["a", "a", "b"])
    iex> ExDataSketch.MisraGries.estimate(sketch, "a")
    2

# `frequent`

```elixir
@spec frequent(t(), float()) :: [{term(), non_neg_integer()}]
```

Returns entries whose estimated frequency exceeds the given threshold.

The threshold is a fraction in (0.0, 1.0). Returns entries whose count
is greater than `threshold * count(sketch)`.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new(k: 5)
    iex> sketch = Enum.reduce(1..100, sketch, fn _, s -> ExDataSketch.MisraGries.update(s, "heavy") end)
    iex> sketch = Enum.reduce(1..10, sketch, fn i, s -> ExDataSketch.MisraGries.update(s, "light_#{i}") end)
    iex> frequent = ExDataSketch.MisraGries.frequent(sketch, 0.5)
    iex> Enum.any?(frequent, fn {item, _count} -> item == "heavy" end)
    true

# `from_enumerable`

```elixir
@spec from_enumerable(
  Enumerable.t(),
  keyword()
) :: t()
```

Creates a new MisraGries sketch from an enumerable.

## Examples

    iex> sketch = ExDataSketch.MisraGries.from_enumerable(["a", "b", "a"], k: 5)
    iex> ExDataSketch.MisraGries.count(sketch)
    3

# `merge`

```elixir
@spec merge(t(), t()) :: t()
```

Merges two MisraGries instances.

Both sketches must have the same `k` parameter.

## Examples

    iex> a = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update_many(["a", "a"])
    iex> b = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update_many(["a", "b"])
    iex> merged = ExDataSketch.MisraGries.merge(a, b)
    iex> ExDataSketch.MisraGries.count(merged)
    4

# `merge_many`

```elixir
@spec merge_many(Enumerable.t()) :: t()
```

Merges a non-empty enumerable of MisraGries instances into one.

## Examples

    iex> sketches = Enum.map(1..3, fn _ ->
    ...>   ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update("x")
    ...> end)
    iex> merged = ExDataSketch.MisraGries.merge_many(sketches)
    iex> ExDataSketch.MisraGries.count(merged)
    3

# `merger`

```elixir
@spec merger(keyword()) :: (t(), t() -&gt; t())
```

Returns a 2-arity merge function suitable for combining sketches.

## Examples

    iex> is_function(ExDataSketch.MisraGries.merger(), 2)
    true

# `new`

```elixir
@spec new(keyword()) :: t()
```

Creates a new MisraGries sketch.

## Options

- `:k` - maximum number of counters (default: 10, must be >= 1).
- `:key_encoding` - `:binary` (default), `:int`, or `{:term, :external}`.
- `:backend` - backend module (default: `ExDataSketch.Backend.Pure`).

## Examples

    iex> sketch = ExDataSketch.MisraGries.new(k: 10)
    iex> sketch.opts[:k]
    10
    iex> ExDataSketch.MisraGries.count(sketch)
    0

# `reducer`

```elixir
@spec reducer() :: (term(), t() -&gt; t())
```

Returns a 2-arity reducer function suitable for `Enum.reduce/3`.

## Examples

    iex> is_function(ExDataSketch.MisraGries.reducer(), 2)
    true

# `serialize`

```elixir
@spec serialize(t()) :: binary()
```

Serializes the sketch to the ExDataSketch-native EXSK binary format.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new()
    iex> binary = ExDataSketch.MisraGries.serialize(sketch)
    iex> <<"EXSK", _rest::binary>> = binary
    iex> byte_size(binary) > 0
    true

# `size_bytes`

```elixir
@spec size_bytes(t()) :: non_neg_integer()
```

Returns the size of the sketch state in bytes.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new()
    iex> ExDataSketch.MisraGries.size_bytes(sketch) > 0
    true

# `top_k`

```elixir
@spec top_k(t(), non_neg_integer()) :: [{term(), non_neg_integer()}]
```

Returns the top entries sorted by count descending.

Each entry is a `{item, count}` tuple where the item is decoded using
the configured key encoding.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new()
    iex> sketch = ExDataSketch.MisraGries.update_many(sketch, ["a", "a", "a", "b", "b", "c"])
    iex> [{top_item, top_count} | _] = ExDataSketch.MisraGries.top_k(sketch, 2)
    iex> top_item
    "a"
    iex> top_count
    3

# `update`

```elixir
@spec update(t(), term()) :: t()
```

Updates the sketch with a single item.

The item is encoded using the configured key encoding before insertion.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update("hello")
    iex> ExDataSketch.MisraGries.count(sketch)
    1

# `update_many`

```elixir
@spec update_many(t(), Enumerable.t()) :: t()
```

Updates the sketch with multiple items in a single pass.

## Examples

    iex> sketch = ExDataSketch.MisraGries.new() |> ExDataSketch.MisraGries.update_many(["a", "b", "a"])
    iex> ExDataSketch.MisraGries.count(sketch)
    3

---

*Consult [api-reference.md](api-reference.md) for complete listing*
