Bloomy.Scalable (Bloomy v0.1.0)
View SourceScalable bloom filter that dynamically grows to maintain target false positive rate.
A scalable bloom filter automatically adds new bloom filter slices as capacity is reached. Each new slice has a tighter error rate and larger capacity, maintaining the overall target false positive rate even as the data grows.
Growth Strategy
- Initial capacity: Configurable starting capacity
- Growth factor: Each new slice is 2x the previous capacity (configurable)
- Error tightening: Each slice has 0.8x the error rate of the previous (configurable)
- Combined error rate: Approximately equal to the target rate
Features
- Automatic expansion without manual intervention
- Maintains target false positive rate as data grows
- Optimal for unknown dataset sizes
- EXLA backend support for all slices
Examples
iex> # Create a scalable bloom filter starting with capacity 1000
iex> filter = Bloomy.Scalable.new(1000, false_positive_rate: 0.01)
iex>
iex> # Add many items - it will automatically grow
iex> filter = Enum.reduce(1..5000, filter, fn i, f ->
iex> Bloomy.Scalable.add(f, "item_#{i}")
iex> end)
iex>
iex> # Check info - multiple slices created
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count > 1
true
Summary
Functions
Add an item to the scalable bloom filter.
Add multiple items to the scalable bloom filter.
Force creation of a new slice even if current is not at capacity.
Get all slices in the scalable bloom filter.
Clear the scalable bloom filter.
Get the current slice (most recent bloom filter).
Get information and statistics about the scalable bloom filter.
Check if an item might be in the scalable bloom filter.
Create a new scalable bloom filter.
Types
@type t() :: %Bloomy.Scalable{ backend: atom(), error_tightening_ratio: float(), growth_factor: pos_integer(), initial_capacity: pos_integer(), items_count: non_neg_integer(), slices: [Bloomy.Standard.t()], target_false_positive_rate: float() }
Functions
Add an item to the scalable bloom filter.
Automatically creates a new slice if the current slice is at capacity.
Parameters
filter- The scalable bloom filter structitem- Item to add
Returns
Updated scalable bloom filter struct.
Examples
iex> filter = Bloomy.Scalable.new(10)
iex> filter = Enum.reduce(1..50, filter, fn i, f ->
iex> Bloomy.Scalable.add(f, i)
iex> end)
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count > 1
true
Add multiple items to the scalable bloom filter.
Parameters
filter- The scalable bloom filter structitems- List of items to add
Returns
Updated scalable bloom filter struct.
Examples
iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add_all(filter, ["a", "b", "c"])
iex> filter.items_count
3
Force creation of a new slice even if current is not at capacity.
Useful for manual capacity management or testing.
Parameters
filter- The scalable bloom filter struct
Returns
Updated scalable bloom filter struct with a new slice.
Get all slices in the scalable bloom filter.
Parameters
filter- The scalable bloom filter struct
Returns
List of all bloom filter slices (most recent first).
Clear the scalable bloom filter.
Resets to initial state with a single empty slice.
Parameters
filter- The scalable bloom filter struct
Returns
Cleared scalable bloom filter struct.
Examples
iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add_all(filter, Enum.to_list(1..2000))
iex> filter.items_count > 0
true
iex> filter = Bloomy.Scalable.clear(filter)
iex> filter.items_count
0
iex> info = Bloomy.Scalable.info(filter)
iex> info.slices_count
1
Get the current slice (most recent bloom filter).
Parameters
filter- The scalable bloom filter struct
Returns
The current active slice.
Get information and statistics about the scalable bloom filter.
Parameters
filter- The scalable bloom filter struct
Returns
Map containing filter statistics including per-slice information.
Examples
iex> filter = Bloomy.Scalable.new(1000)
iex> info = Bloomy.Scalable.info(filter)
iex> info.type
:scalable
iex> info.slices_count
1
Check if an item might be in the scalable bloom filter.
Checks all slices - returns true if any slice contains the item.
Parameters
filter- The scalable bloom filter structitem- Item to check
Returns
Boolean - true if item might be present, false if definitely not present.
Examples
iex> filter = Bloomy.Scalable.new(1000)
iex> filter = Bloomy.Scalable.add(filter, "hello")
iex> Bloomy.Scalable.member?(filter, "hello")
true
iex> Bloomy.Scalable.member?(filter, "world")
false
Create a new scalable bloom filter.
Parameters
initial_capacity- Starting capacity for the first sliceopts- Keyword list of options::false_positive_rate- Target false positive rate (default: 0.01):growth_factor- Capacity multiplier for new slices (default: 2):error_tightening_ratio- Error rate multiplier for new slices (default: 0.8):backend- Nx backend to use (default: Nx.default_backend())
Returns
A new Bloomy.Scalable struct.
Examples
iex> filter = Bloomy.Scalable.new(1000)
iex> filter.initial_capacity
1000
iex> filter = Bloomy.Scalable.new(1000, growth_factor: 3, error_tightening_ratio: 0.5)
iex> filter.growth_factor
3