Bloomy.Scalable (Bloomy v0.1.0)

View Source

Scalable bloom filter that dynamically grows to maintain target false positive rate.

A scalable bloom filter automatically adds new bloom filter slices as capacity is reached. Each new slice has a tighter error rate and larger capacity, maintaining the overall target false positive rate even as the data grows.

Growth Strategy

  • Initial capacity: Configurable starting capacity
  • Growth factor: Each new slice is 2x the previous capacity (configurable)
  • Error tightening: Each slice has 0.8x the error rate of the previous (configurable)
  • Combined error rate: Approximately equal to the target rate

Features

  • Automatic expansion without manual intervention
  • Maintains target false positive rate as data grows
  • Optimal for unknown dataset sizes
  • EXLA backend support for all slices

Examples

iex> # Create a scalable bloom filter starting with capacity 1000
iex> filter = Bloomy.Scalable.new(1000, false_positive_rate: 0.01)
iex>
iex> # Add many items - it will automatically grow
iex> filter = Enum.reduce(1..5000, filter, fn i, f ->
iex>   Bloomy.Scalable.add(f, "item_#{i}")
iex> end)
iex>
iex> # Check info - multiple slices created
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count > 1
true

Summary

Functions

Add an item to the scalable bloom filter.

Add multiple items to the scalable bloom filter.

Force creation of a new slice even if current is not at capacity.

Get all slices in the scalable bloom filter.

Clear the scalable bloom filter.

Get the current slice (most recent bloom filter).

Get information and statistics about the scalable bloom filter.

Check if an item might be in the scalable bloom filter.

Create a new scalable bloom filter.

Types

t()

@type t() :: %Bloomy.Scalable{
  backend: atom(),
  error_tightening_ratio: float(),
  growth_factor: pos_integer(),
  initial_capacity: pos_integer(),
  items_count: non_neg_integer(),
  slices: [Bloomy.Standard.t()],
  target_false_positive_rate: float()
}

Functions

add(filter, item)

Add an item to the scalable bloom filter.

Automatically creates a new slice if the current slice is at capacity.

Parameters

  • filter - The scalable bloom filter struct
  • item - Item to add

Returns

Updated scalable bloom filter struct.

Examples

iex> filter = Bloomy.Scalable.new(10)
iex> filter = Enum.reduce(1..50, filter, fn i, f ->
iex>   Bloomy.Scalable.add(f, i)
iex> end)
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count > 1
true

add_all(filter, items)

Add multiple items to the scalable bloom filter.

Parameters

  • filter - The scalable bloom filter struct
  • items - List of items to add

Returns

Updated scalable bloom filter struct.

Examples

iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add_all(filter, ["a", "b", "c"])
iex> filter.items_count
3

add_slice(filter)

Force creation of a new slice even if current is not at capacity.

Useful for manual capacity management or testing.

Parameters

  • filter - The scalable bloom filter struct

Returns

Updated scalable bloom filter struct with a new slice.

all_slices(scalable)

Get all slices in the scalable bloom filter.

Parameters

  • filter - The scalable bloom filter struct

Returns

List of all bloom filter slices (most recent first).

clear(filter)

Clear the scalable bloom filter.

Resets to initial state with a single empty slice.

Parameters

  • filter - The scalable bloom filter struct

Returns

Cleared scalable bloom filter struct.

Examples

iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add_all(filter, Enum.to_list(1..2000))
iex> filter.items_count > 0
true
iex> filter = Bloomy.Scalable.clear(filter)
iex> filter.items_count
0
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count
1

current_slice(scalable)

Get the current slice (most recent bloom filter).

Parameters

  • filter - The scalable bloom filter struct

Returns

The current active slice.

info(filter)

Get information and statistics about the scalable bloom filter.

Parameters

  • filter - The scalable bloom filter struct

Returns

Map containing filter statistics including per-slice information.

Examples

iex> filter = Bloomy.Scalable.new(1000)
iex> info = Bloomy.Scalable.info(filter)
iex> info.type
:scalable
iex> info.slices_count
1

member?(filter, item)

Check if an item might be in the scalable bloom filter.

Checks all slices - returns true if any slice contains the item.

Parameters

  • filter - The scalable bloom filter struct
  • item - Item to check

Returns

Boolean - true if item might be present, false if definitely not present.

Examples

iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add(filter, "hello")
iex> Bloomy.Scalable.member?(filter, "hello")
true
iex> Bloomy.Scalable.member?(filter, "world")
false

new(initial_capacity, opts \\ [])

Create a new scalable bloom filter.

Parameters

  • initial_capacity - Starting capacity for the first slice
  • opts - Keyword list of options:
    • :false_positive_rate - Target false positive rate (default: 0.01)
    • :growth_factor - Capacity multiplier for new slices (default: 2)
    • :error_tightening_ratio - Error rate multiplier for new slices (default: 0.8)
    • :backend - Nx backend to use (default: Nx.default_backend())

Returns

A new Bloomy.Scalable struct.

Examples

iex> filter = Bloomy.Scalable.new(1000)
iex> filter.initial_capacity
1000

iex> filter = Bloomy.Scalable.new(1000, growth_factor: 3, error_tightening_ratio: 0.5)
iex> filter.growth_factor
3