Bloomy.Params (Bloomy v0.1.0)

View Source

Parameter calculation utilities for bloom filters.

This module provides functions to calculate optimal bloom filter parameters based on expected number of items and desired false positive rate.

Key Formulas

  • Bit array size: m = -(n * ln(p)) / (ln(2)^2)
  • Number of hash functions: k = (m / n) * ln(2)
  • Actual false positive rate: p = (1 - e^(-kn/m))^k

Where:

  • n = expected number of items
  • p = desired false positive rate
  • m = size of bit array
  • k = number of hash functions

Examples

iex> params = Bloomy.Params.calculate(1000, 0.01)
iex> params.size > 0
true
iex> params.hash_functions > 0
true

Summary

Functions

Calculate optimal bloom filter parameters.

Calculate the capacity for a given size and false positive rate.

Estimate number of items in a bloom filter based on fill ratio.

Calculate fill ratio (proportion of bits set) given number of items.

Calculate optimal number of hash functions.

Calculate optimal bit array size given capacity and false positive rate.

Get a summary of the bloom filter parameters as a readable map.

Validate if parameters are reasonable for a bloom filter.

Types

t()

@type t() :: %Bloomy.Params{
  actual_false_positive_rate: float(),
  capacity: pos_integer(),
  false_positive_rate: float(),
  hash_functions: pos_integer(),
  size: pos_integer()
}

Functions

calculate(capacity, false_positive_rate)

Calculate optimal bloom filter parameters.

Given expected capacity and desired false positive rate, calculates the optimal bit array size and number of hash functions.

Parameters

  • capacity - Expected number of items to store
  • false_positive_rate - Desired false positive rate (e.g., 0.01 for 1%)

Returns

A Bloomy.Params struct with calculated parameters.

Examples

iex> params = Bloomy.Params.calculate(1000, 0.01)
iex> params.capacity
1000
iex> params.false_positive_rate
0.01

iex> params = Bloomy.Params.calculate(10000, 0.001)
iex> params.size > 10000
true

calculate_false_positive_rate(size, hash_functions, items_count)

Calculate expected false positive rate.

Uses the formula: p = (1 - e^(-kn/m))^k

Parameters

  • size - Bit array size
  • hash_functions - Number of hash functions
  • items_count - Number of items stored

Returns

Expected false positive rate as a float.

Examples

iex> rate = Bloomy.Params.calculate_false_positive_rate(9586, 7, 1000)
iex> rate < 0.011
true

iex> rate = Bloomy.Params.calculate_false_positive_rate(9586, 7, 2000)
iex> rate > 0.01
true

capacity_for_size(size, false_positive_rate)

Calculate the capacity for a given size and false positive rate.

Inverse of optimal_size/2.

Parameters

  • size - Bit array size
  • false_positive_rate - Desired false positive rate

Returns

Maximum recommended capacity (positive integer).

Examples

iex> capacity = Bloomy.Params.capacity_for_size(9586, 0.01)
iex> capacity >= 900 and capacity <= 1100
true

estimate_item_count(size, hash_functions, fill_ratio)

Estimate number of items in a bloom filter based on fill ratio.

Uses the formula derived from fill ratio calculation.

Parameters

  • size - Bit array size
  • hash_functions - Number of hash functions
  • fill_ratio - Observed fill ratio (0.0 to 1.0)

Returns

Estimated number of items (integer).

Examples

iex> count = Bloomy.Params.estimate_item_count(1000, 7, 0.5)
iex> count > 0
true

expected_fill_ratio(size, hash_functions, items_count)

Calculate fill ratio (proportion of bits set) given number of items.

Uses the formula: fill_ratio = 1 - e^(-kn/m)

Parameters

  • size - Bit array size
  • hash_functions - Number of hash functions
  • items_count - Number of items stored

Returns

Expected fill ratio as a float between 0 and 1.

Examples

iex> ratio = Bloomy.Params.expected_fill_ratio(1000, 7, 100)
iex> ratio > 0 and ratio < 1
true

optimal_hash_functions(size, capacity)

Calculate optimal number of hash functions.

Uses the formula: k = (m / n) * ln(2)

Parameters

  • size - Bit array size
  • capacity - Expected number of items

Returns

Optimal number of hash functions (positive integer).

Examples

iex> Bloomy.Params.optimal_hash_functions(9586, 1000)
7

iex> Bloomy.Params.optimal_hash_functions(143776, 10000)
10

optimal_size(capacity, false_positive_rate)

Calculate optimal bit array size given capacity and false positive rate.

Uses the formula: m = -(n * ln(p)) / (ln(2)^2)

Parameters

  • capacity - Expected number of items
  • false_positive_rate - Desired false positive rate

Returns

Optimal bit array size (positive integer).

Examples

iex> Bloomy.Params.optimal_size(1000, 0.01)
9586

iex> Bloomy.Params.optimal_size(10000, 0.001)
143776

summary(params)

Get a summary of the bloom filter parameters as a readable map.

Parameters

Returns

A map with human-readable parameter descriptions.

Examples

iex> params = Bloomy.Params.calculate(1000, 0.01)
iex> summary = Bloomy.Params.summary(params)
iex> summary.capacity
1000

validate(params)

Validate if parameters are reasonable for a bloom filter.

Checks if the parameters will result in acceptable performance and memory usage.

Parameters

Returns

{:ok, params} if valid, or {:error, reason} if invalid.

Examples

iex> params = Bloomy.Params.calculate(1000, 0.01)
iex> {:ok, _} = Bloomy.Params.validate(params)

iex> params = %Bloomy.Params{size: 0, hash_functions: 0, capacity: 0, false_positive_rate: 0.5, actual_false_positive_rate: 0.5}
iex> {:error, _} = Bloomy.Params.validate(params)