# `ExDataSketch.Cuckoo`
[🔗](https://github.com/thanos/ex_data_sketch/blob/main/lib/ex_data_sketch/cuckoo.ex#L1)

Cuckoo filter for probabilistic membership testing with deletion support.

A Cuckoo filter is a space-efficient probabilistic data structure that supports
insertion, deletion, and membership queries. It uses partial-key cuckoo hashing
to store compact fingerprints in a bucket-based hash table.

Unlike Bloom filters, Cuckoo filters support deletion. Unlike XOR filters,
Cuckoo filters support incremental insertion. The trade-off is that insertion
can fail when the filter approaches capacity.

## Parameters

- `:capacity` -- expected number of items (default: 10,000). Used to derive
  the number of buckets.
- `:fingerprint_size` -- fingerprint width in bits (default: 8). Supported
  values: 8, 12, 16. Determines the false positive rate:
  FPR ~= 2 * bucket_size / 2^fingerprint_size.
- `:bucket_size` -- slots per bucket (default: 4). Supported values: 2, 4.
  Higher values improve space efficiency at the cost of lookup latency.
- `:max_kicks` -- maximum relocation attempts before declaring full
  (default: 500). Range: 100..2000.
- `:seed` -- hash seed (default: 0).

## False Positive Rates

With default bucket_size=4:

| fingerprint_size | FPR     |
|------------------|---------|
| 8                | ~3.1%   |
| 12               | ~0.2%   |
| 16               | ~0.012% |

## Hash Strategy

Items are hashed via `ExDataSketch.Hash.hash64/1` at the API boundary.
The bucket index is derived from the lower bits and the fingerprint from
the upper bits of the 64-bit hash. Alternate bucket index uses XOR with
a hash of the fingerprint (partial-key cuckoo hashing).

## Deletion Hazard

Only delete items that were definitely inserted. Deleting a false-positive
item removes a legitimate fingerprint, creating a false negative for a
different item. This hazard is inherent to all Cuckoo filters.

## Binary State Layout (CKO1)

See `plans/adr/ADR-102-cuckoo-binary-format.md` for the full specification.
32-byte header followed by a flat bucket array of fingerprint entries.

## Merge Policy

Merge is explicitly not supported. Cuckoo filter merge would require
re-inserting all fingerprints which can fail due to capacity limits.
For mergeable membership filters, use `ExDataSketch.Bloom`.

# `t`

```elixir
@type t() :: %ExDataSketch.Cuckoo{backend: module(), opts: keyword(), state: binary()}
```

# `capabilities`

Returns the set of capabilities supported by the Cuckoo filter.

## Examples

    iex> caps = ExDataSketch.Cuckoo.capabilities()
    iex> :put in caps and :delete in caps and :member? in caps
    true

# `compatible_with?`

```elixir
@spec compatible_with?(t(), t()) :: boolean()
```

Returns `true` if two Cuckoo filters have compatible parameters.

Compatible filters have the same bucket_count, fingerprint_size,
bucket_size, and seed.

## Examples

    iex> a = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> b = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> ExDataSketch.Cuckoo.compatible_with?(a, b)
    true

# `count`

```elixir
@spec count(t()) :: non_neg_integer()
```

Returns the number of items currently stored in the filter.

## Examples

    iex> ExDataSketch.Cuckoo.new(capacity: 100) |> ExDataSketch.Cuckoo.count()
    0

# `delete`

```elixir
@spec delete(t(), term()) :: {:ok, t()} | {:error, :not_found}
```

Deletes a single item from the filter.

Returns `{:ok, cuckoo}` if the item's fingerprint was found and removed,
or `{:error, :not_found}` if no matching fingerprint exists.

WARNING: Only delete items that were definitely inserted. Deleting a
false-positive item removes a legitimate fingerprint belonging to a
different item, creating a false negative.

## Examples

    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.new(capacity: 100) |> ExDataSketch.Cuckoo.put("hello")
    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.delete(cuckoo, "hello")
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "hello")
    false

# `deserialize`

```elixir
@spec deserialize(binary()) :: {:ok, t()} | {:error, Exception.t()}
```

Deserializes an EXSK binary into a Cuckoo filter.

Returns `{:ok, cuckoo}` on success or `{:error, reason}` on failure.

## Examples

    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.new(capacity: 100) |> ExDataSketch.Cuckoo.put("test")
    iex> {:ok, recovered} = ExDataSketch.Cuckoo.deserialize(ExDataSketch.Cuckoo.serialize(cuckoo))
    iex> ExDataSketch.Cuckoo.member?(recovered, "test")
    true

# `from_enumerable`

```elixir
@spec from_enumerable(
  Enumerable.t(),
  keyword()
) :: {:ok, t()} | {:error, :full, t()}
```

Creates a Cuckoo filter from an enumerable of items.

Equivalent to `new(opts) |> put_many(enumerable)`.

Returns `{:ok, cuckoo}` or `{:error, :full, cuckoo}`.

## Examples

    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.from_enumerable(["a", "b", "c"], capacity: 100)
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "a")
    true

# `member?`

```elixir
@spec member?(t(), term()) :: boolean()
```

Tests whether an item may be a member of the set.

Returns `true` if the item is possibly in the set (may be a false positive),
`false` if the item is definitely not in the set.

## Examples

    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.new(capacity: 100) |> ExDataSketch.Cuckoo.put("hello")
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "hello")
    true

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "hello")
    false

# `new`

```elixir
@spec new(keyword()) :: t()
```

Creates a new empty Cuckoo filter.

## Options

- `:capacity` -- expected number of items (default: 10000).
- `:fingerprint_size` -- fingerprint width in bits (default: 8).
  Supported values: 8, 12, 16.
- `:bucket_size` -- slots per bucket (default: 4).
  Supported values: 2, 4.
- `:max_kicks` -- maximum relocation attempts (default: 500).
  Range: 100..2000.
- `:seed` -- hash seed (default: 0).
- `:backend` -- backend module (default: `ExDataSketch.Backend.Pure`).
- `:hash_fn` -- custom hash function `(term -> non_neg_integer)`.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new()
    iex> cuckoo.opts[:capacity]
    10000

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 1000, fingerprint_size: 16)
    iex> cuckoo.opts[:fingerprint_size]
    16

# `put`

```elixir
@spec put(t(), term()) :: {:ok, t()} | {:error, :full}
```

Inserts a single item into the filter.

Returns `{:ok, cuckoo}` on success or `{:error, :full}` if the filter
cannot accommodate the item after max_kicks relocation attempts.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.put(cuckoo, "hello")
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "hello")
    true

# `put!`

```elixir
@spec put!(t(), term()) :: t()
```

Inserts a single item, raising on failure.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> cuckoo = ExDataSketch.Cuckoo.put!(cuckoo, "hello")
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "hello")
    true

# `put_many`

```elixir
@spec put_many(t(), Enumerable.t()) :: {:ok, t()} | {:error, :full, t()}
```

Inserts multiple items in a single pass.

Returns `{:ok, cuckoo}` if all items were inserted, or
`{:error, :full, cuckoo}` with the partially updated filter
if insertion failed partway through.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> {:ok, cuckoo} = ExDataSketch.Cuckoo.put_many(cuckoo, ["a", "b", "c"])
    iex> ExDataSketch.Cuckoo.member?(cuckoo, "a")
    true

# `reducer`

```elixir
@spec reducer() :: (term(), t() -&gt; t())
```

Returns a 2-arity reducer function for use with `Enum.reduce/3`.

The reducer calls `put!/2` and raises if the filter becomes full.

## Examples

    iex> is_function(ExDataSketch.Cuckoo.reducer(), 2)
    true

# `serialize`

```elixir
@spec serialize(t()) :: binary()
```

Serializes the filter to the ExDataSketch-native EXSK binary format.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> binary = ExDataSketch.Cuckoo.serialize(cuckoo)
    iex> <<"EXSK", _rest::binary>> = binary
    iex> byte_size(binary) > 0
    true

# `size_bytes`

```elixir
@spec size_bytes(t()) :: non_neg_integer()
```

Returns the byte size of the state binary.

## Examples

    iex> cuckoo = ExDataSketch.Cuckoo.new(capacity: 100)
    iex> ExDataSketch.Cuckoo.size_bytes(cuckoo) > 0
    true

---

*Consult [api-reference.md](api-reference.md) for complete listing*
