Bloomex v1.2.0 Bloomex View Source

This module implements a Scalable Bloom Filter.

Examples

iex> bf = Bloomex.scalable(1000, 0.1, 0.1, 2)
%Bloomex.ScalableBloom...

iex> bf = Bloomex.add(bf, 5)
%Bloomex.ScalableBloom...

iex> Bloomex.member?(bf, 5)
true

iex> bf = Bloomex.add(bf, 100)
%Bloomex.ScalableBloom...

iex> Bloomex.member?(bf, 100)
true

iex> Bloomex.member?(bf, 105)
false

iex> Bloomex.member?(bf, 101) # false positive
true

Link to this section Summary

Functions

Returns a bloom filter with the element e added.

Returns the capacity of the bloom filter.

Returns true if the element e exists in the bloom filter, otherwise returns false.

Returns a plain Bloom filter based on the provided arguments

Returns the number of elements currently in the bloom filter.

Link to this section Types

Link to this section Functions

Returns a bloom filter with the element e added.

Link to this function

capacity(arg1)

View Source
capacity(Bloomex.t()) :: pos_integer() | :infinity

Returns the capacity of the bloom filter.

A plain bloom filter will always have a fixed capacity, while a scalable one will always have a theoretically infite capacity.

Link to this function

deserialise(bloom, func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end)

View Source
deserialise(
  binary()
  | maybe_improper_list(
      binary() | maybe_improper_list(any(), binary() | []) | byte(),
      binary() | []
    ),
  any()
) :: Bloomex.t()
Link to this function

deserialize(bloom, func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end)

View Source

See Bloomex.deserialise/2.

Link to this function

member?(bloom, e)

View Source
member?(Bloomex.t(), any()) :: boolean()

Returns true if the element e exists in the bloom filter, otherwise returns false.

Keep in mind that you may get false positives, but never false negatives.

Link to this function

plain(capacity, error, hash_func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end)

View Source
plain(integer(), float(), hash_func()) :: Bloomex.Bloom.t()

Returns a plain Bloom filter based on the provided arguments:

  • capacity, used to calculate the size of each bitvector slice
  • error, the error probability
  • hash_func, a hashing function

If a hash function is not provided then :erlang.phash2/2 will be used with the maximum range possible (2^32).

Restrictions:

  • capacity must be a positive integer
  • error must be a float between 0 and 1
  • hash_func must be a function of type term -> pos_integer

The function follows a rule of thumb due to double hashing where capacity >= 4 / error must hold true.

Link to this function

scalable(capacity, error, error_ratio, growth, hash_func \\ fn x -> :erlang.phash2(x, 1 <<< 32) end)

View Source
scalable(integer(), number(), number(), 1 | 2 | 3, hash_func()) ::
  Bloomex.ScalableBloom.t()

Returns a scalable Bloom filter based on the provided arguments:

  • capacity, the initial capacity before expanding
  • error, the error probability
  • error_ratio, the error probability ratio
  • growth, the growth ratio when full
  • hash_func, a hashing function

If a hash function is not provided then :erlang.phash2/2 will be used with the maximum range possible (2^32).

Restrictions:

  • capacity must be a positive integer
  • error must be a float between 0 and 1
  • error_ratio must be a float between 0 and 1
  • growth must be a positive integer between 1 and 3
  • hash_func must be a function of type term -> pos_integer

The function follows a rule of thumb due to double hashing where capacity >= 4 / (error * (1 - error_ratio)) must hold true.

Link to this function

serialise(bloom)

View Source
serialise(Bloomex.t()) :: binary()

See Bloomex.serialise/1.

Returns the number of elements currently in the bloom filter.