Bloomy.Standard (Bloomy v0.1.0)

View Source

Standard (classic) bloom filter implementation using Nx tensors.

A bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not.

This implementation uses Nx tensors for high-performance vectorized operations and supports EXLA acceleration.

Features

  • O(k) add and query operations where k is the number of hash functions
  • Vectorized bit operations using Nx
  • EXLA backend support for GPU/CPU acceleration
  • Optimal parameter calculation
  • Statistics and monitoring

Examples

iex> # Create a bloom filter for 1000 items with 1% false positive rate
iex> filter = Bloomy.Standard.new(1000, false_positive_rate: 0.01)
iex>
iex> # Add items
iex> filter = Bloomy.Standard.add(filter, "apple")
iex> filter = Bloomy.Standard.add(filter, "banana")
iex> filter = Bloomy.Standard.add(filter, "orange")
iex>
iex> # Query membership
iex> Bloomy.Standard.member?(filter, "apple")
true
iex> Bloomy.Standard.member?(filter, "grape")
false
iex>
iex> # Get statistics
iex> info = Bloomy.Standard.info(filter)
iex> info.items_count
3

Summary

Functions

Add an item to the bloom filter.

Add multiple items to the bloom filter at once.

Check if the bloom filter is at or over capacity.

Clear the bloom filter (reset to empty state).

Estimate the actual number of items in the filter based on fill ratio.

Get information and statistics about the bloom filter.

Intersection of two bloom filters.

Check if an item might be in the bloom filter.

Create a new standard bloom filter.

Union (merge) two bloom filters.

Types

t()

@type t() :: %Bloomy.Standard{
  backend: atom(),
  bit_array: Bloomy.BitArray.t(),
  items_count: non_neg_integer(),
  params: Bloomy.Params.t()
}

Functions

add(filter, item)

Add an item to the bloom filter.

The item will be hashed using multiple hash functions and the corresponding bits will be set in the underlying bit array.

Parameters

  • filter - The bloom filter struct
  • item - Item to add (any term that can be converted to binary)

Returns

Updated bloom filter struct with the item added.

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add(filter, "hello")
iex> filter.items_count
1

iex> filter = Bloomy.Standard.new(1000)
iex> filter = filter |> Bloomy.Standard.add("a") |> Bloomy.Standard.add("b")
iex> filter.items_count
2

add_all(filter, items)

Add multiple items to the bloom filter at once.

More efficient than calling add/2 multiple times.

Parameters

  • filter - The bloom filter struct
  • items - List of items to add

Returns

Updated bloom filter struct.

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add_all(filter, ["a", "b", "c", "d"])
iex> filter.items_count
4

at_capacity?(filter)

Check if the bloom filter is at or over capacity.

Returns true if the number of items added meets or exceeds the expected capacity.

Parameters

  • filter - The bloom filter struct

Returns

Boolean indicating if filter is at capacity.

Examples

iex> filter = Bloomy.Standard.new(10)
iex> Bloomy.Standard.at_capacity?(filter)
false
iex> filter = Bloomy.Standard.add_all(filter, Enum.to_list(1..10))
iex> Bloomy.Standard.at_capacity?(filter)
true

clear(filter)

Clear the bloom filter (reset to empty state).

Parameters

  • filter - The bloom filter struct

Returns

Cleared bloom filter struct.

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add(filter, "test")
iex> filter.items_count
1
iex> filter = Bloomy.Standard.clear(filter)
iex> filter.items_count
0

estimate_count(filter)

Estimate the actual number of items in the filter based on fill ratio.

Uses the fill ratio to estimate item count, which may be more accurate than the tracked count if items have been added multiple times.

Parameters

  • filter - The bloom filter struct

Returns

Estimated number of unique items.

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add_all(filter, ["a", "b", "c"])
iex> Bloomy.Standard.estimate_count(filter) > 0
true

info(filter)

Get information and statistics about the bloom filter.

Parameters

  • filter - The bloom filter struct

Returns

Map containing:

  • :type - Type of bloom filter (:standard)
  • :capacity - Expected capacity
  • :size - Bit array size
  • :hash_functions - Number of hash functions used
  • :items_count - Number of items added
  • :fill_ratio - Proportion of bits set (0.0 to 1.0)
  • :false_positive_rate - Desired false positive rate
  • :actual_false_positive_rate - Actual false positive rate based on fill
  • :backend - Nx backend in use

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add(filter, "test")
iex> info = Bloomy.Standard.info(filter)
iex> info.type
:standard
iex> info.items_count
1

intersect(filter1, filter2)

Intersection of two bloom filters.

Creates a new bloom filter that only contains items present in both filters. Both filters must have the same parameters.

Note: The intersection may have false positives and the items_count is an estimate.

Parameters

  • filter1 - First bloom filter
  • filter2 - Second bloom filter

Returns

New bloom filter containing intersection of both filters.

Examples

iex> f1 = Bloomy.Standard.new(1000) |> Bloomy.Standard.add_all(["a", "b", "c"])
iex> f2 = Bloomy.Standard.new(1000) |> Bloomy.Standard.add_all(["b", "c", "d"])
iex> f_intersect = Bloomy.Standard.intersect(f1, f2)
iex> Bloomy.Standard.member?(f_intersect, "b")
true

member?(filter, item)

Check if an item might be in the bloom filter.

Returns

  • true - The item might be in the set (or could be a false positive)
  • false - The item is definitely not in the set

Parameters

  • filter - The bloom filter struct
  • item - Item to check

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter = Bloomy.Standard.add(filter, "hello")
iex> Bloomy.Standard.member?(filter, "hello")
true
iex> Bloomy.Standard.member?(filter, "world")
false

new(capacity, opts \\ [])

Create a new standard bloom filter.

Parameters

  • capacity - Expected number of items to store
  • opts - Keyword list of options:
    • :false_positive_rate - Desired false positive rate (default: 0.01 or 1%)
    • :backend - Nx backend to use (default: Nx.default_backend())

Returns

A new Bloomy.Standard struct.

Examples

iex> filter = Bloomy.Standard.new(1000)
iex> filter.params.capacity
1000

iex> filter = Bloomy.Standard.new(10000, false_positive_rate: 0.001)
iex> filter.params.false_positive_rate
0.001

union(filter1, filter2)

Union (merge) two bloom filters.

Creates a new bloom filter containing all items from both filters. Both filters must have the same parameters (size and hash functions).

Parameters

  • filter1 - First bloom filter
  • filter2 - Second bloom filter

Returns

New bloom filter containing union of both filters.

Examples

iex> f1 = Bloomy.Standard.new(1000) |> Bloomy.Standard.add("a")
iex> f2 = Bloomy.Standard.new(1000) |> Bloomy.Standard.add("b")
iex> f_union = Bloomy.Standard.union(f1, f2)
iex> Bloomy.Standard.member?(f_union, "a") and Bloomy.Standard.member?(f_union, "b")
true