View Source cuckoo_filter (cuckoo_filter v1.0.1)
High-performance, concurrent, and mutable Cuckoo Filter implemented using atomics for Erlang and Elixir.
Summary
Functions
Equivalent to add(Filter, Element, false).
Adds an element to a filter.
Equivalent to add_hash(Filter, Element, false).
Adds an element to a filter by its hash.
Returns the maximum capacity of a filter.
Checks if an element is in a filter.
Checks whether a filter contains a fingerprint at the given index or its alternative index.
Checks if an element is in a filter by its hash.
Deletes an element from a filter.
Deletes an element from a filter by its hash.
Exports a filter as a binary.
Returns the hash value of an element using the hash function of the filter.
Imports filter data from a binary created using export/1.
Equivalent to new(Capacity, []).
Creates a new cuckoo filter with the given capacity and options
Returns number of items in a filter.
Retrieves a cuckoo_filter from persistent_term by its name.
Types
-type cuckoo_filter() :: #cuckoo_filter{buckets :: atomics:atomics_ref(), lock :: spinlock:spinlock(), num_buckets :: pos_integer(), max_hash :: pos_integer(), bucket_size :: pos_integer(), fingerprint_size :: 4 | 8 | 16 | 32 | 64, max_evictions :: non_neg_integer(), hash_function :: fun((any()) -> non_neg_integer())}.
-type filter_name() :: term().
-type fingerprint() :: pos_integer().
-type hash() :: non_neg_integer().
-type index() :: non_neg_integer().
-type option() :: {name, filter_name()} | {fingerprint_size, 4 | 8 | 16 | 32 | 64} | {bucket_size, pos_integer()} | {max_evictions, non_neg_integer()} | {hash_function, fun((term()) -> hash())}.
-type options() :: [option()].
Functions
-spec add(cuckoo_filter() | filter_name(), term()) -> ok | {error, not_enough_space}.
Equivalent to add(Filter, Element, false).
-spec add(cuckoo_filter() | filter_name(), term(), false) -> ok | {error, not_enough_space}; (cuckoo_filter() | filter_name(), term(), true) -> ok | {ok, Evicted :: {index(), fingerprint()}}.
Adds an element to a filter.
Returns ok if the insertion was successful, but could return {error, not_enough_space}, when the filter is nearing its capacity.
Forced argument is a boolean that indicates whether the insertion should be forced or not. A forced insertion will never return a not_enough_space error, instead when the filter is full, another random element is removed, and the removed element is returned as {ok, {Index, Fingerprint}}. In this case, elements are not relocated, and no lock is acquired.
Forced insertion can only be used with max_evictions set to 0.
-spec add_hash(cuckoo_filter() | filter_name(), hash()) -> ok | {error, not_enough_space}.
Equivalent to add_hash(Filter, Element, false).
-spec add_hash(cuckoo_filter() | filter_name(), hash(), false) -> ok | {error, not_enough_space}; (cuckoo_filter() | filter_name(), hash(), true) -> ok | {ok, Evicted :: {index(), fingerprint()}}.
Adds an element to a filter by its hash.
Same as add/3 except that it accepts the hash of the element instead of the element.
-spec capacity(cuckoo_filter() | filter_name()) -> pos_integer().
Returns the maximum capacity of a filter.
-spec contains(cuckoo_filter() | filter_name(), term()) -> boolean().
Checks if an element is in a filter.
-spec contains_fingerprint(cuckoo_filter() | filter_name(), index(), fingerprint()) -> boolean().
Checks whether a filter contains a fingerprint at the given index or its alternative index.
-spec contains_hash(cuckoo_filter() | filter_name(), hash()) -> boolean().
Checks if an element is in a filter by its hash.
-spec delete(cuckoo_filter() | filter_name(), term()) -> ok | {error, not_found}.
Deletes an element from a filter.
Returns ok if the deletion was successful, and returns {error, not_found} if the element could not be found in the filter.
Note: A cuckoo filter can only delete items that are known to be inserted before. Deleting non inserted items might lead to deletion of another element in case of a hash collision.
-spec delete_hash(cuckoo_filter() | filter_name(), hash()) -> ok | {error, not_found}.
Deletes an element from a filter by its hash.
Same as delete/2 except that it uses the hash of the element instead of the element.
-spec export(cuckoo_filter() | filter_name()) -> binary().
Exports a filter as a binary.
Returned binary can be used to reconstruct the filter again, using import/2 function.
-spec hash(cuckoo_filter() | filter_name(), term()) -> hash().
Returns the hash value of an element using the hash function of the filter.
-spec import(cuckoo_filter() | filter_name(), binary()) -> ok | {error, invalid_data_size}.
Imports filter data from a binary created using export/1.
Returns ok if the import was successful, but could return {ok, invalid_data_size} if the size of the given binary does not match the size of the filter.
-spec new(pos_integer()) -> cuckoo_filter().
Equivalent to new(Capacity, []).
-spec new(pos_integer(), options()) -> cuckoo_filter().
Creates a new cuckoo filter with the given capacity and options
Note that the actual capacity might be higher than the given capacity, because internally number of buckets in a cuckoo filter must be a power of 2.
Possible options are:
{name, Name}If you give it a name, created filter instance will be stored in persistent_term, and later you can access the filter by its name.
{fingerprint_size, FingerprintSize}FingerprintSize can be one of 4, 8, 16, 32, and 64 bits. Default fingerprint size is 16 bits.
{bucket_size, BucketSize}BucketSize must be a non negative integer, and the default value is 4. Higher bucket sizes can reduce insert time considerably since it reduces the number of relocations of existing fingerprints in occupied buckets, but it increases the lookup time, and false positive rate.
{max_evictions, MaxEvictions}MaxEvictions indicates the maximum number of relocation attemps before giving up when inserting a new element.
{hash_function, HashFunction}You can specify a custom hash function that accepts any term as argument and returns an integer hash value. When choosing a hash function, ensure the hash size is equal or greater than the sum of the fingerprint and bucket sizes. By default,
erlang:phash2is used. If a 32bit hash is insufficient, xxh3 hash functions are applied, which require addingxxh3to your project dependencies manually.
-spec size(cuckoo_filter() | filter_name()) -> non_neg_integer().
Returns number of items in a filter.
-spec whereis(filter_name()) -> cuckoo_filter().
Retrieves a cuckoo_filter from persistent_term by its name.