gauzy - probabilistic set membership filters for Gleam

Package Version Hex Docs

gauzy is a Gleam library providing flexible implementations of probabilistic set membership filters. These filters offer space-efficient alternatives to traditional collections with controllable accuracy trade-offs.

NOTE: This library is a WIP and currently only provides Bloom filters!

Installation

gleam add gauzy@1

Overview

Probabilistic data structures are specialized data structures that use randomization to achieve compact representation with controlled error rates. They’re particularly useful when:

Available Data Structures

Bloom Filter

A Bloom filter is a space-efficient probabilistic data structure used to test whether an element is a member of a set. It can tell you:

Key properties:

import gauzy/bloom_filter
import murmur3a

pub fn main() {
  // Use your own hash functions here. They must output an integer hash digest.
  let hash_fn_1 = fn(string_item) {
    murmur3a.hash_string(string_item, 0) |> murmur3a.int_digest
  }
  let hash_fn_2 = fn(string_item) {
    murmur3a.hash_string(string_item, 1) |> murmur3a.int_digest
  }
  let assert Ok(hash_fn_pair) =
    bloom_filter.new_hash_fn_pair(hash_fn_1, hash_fn_2)

  // Create a new Bloom filter that can hold ~10,000 items while maintaining a 1% false positive rate
  let assert Ok(filter) = bloom_filter.new(
    capacity: 10_000,
    target_error_rate: 0.01,
    with_hashes: hash_fn_pair,
  )

  // Insert items
  let assert Ok(filter) = bloom_filter.try_insert(filter, "hello")
  let assert Ok(filter) = bloom_filter.try_insert(filter, "world")

  // Check if items might be in the set
  let in_filter = bloom_filter.might_contain(filter, "hello")  // True
  let also_in_filter = bloom_filter.might_contain(filter, "world")  // True
  let not_in_filter = bloom_filter.might_contain(filter, "goodbye")  // False

  // Get filter properties
  let size = bloom_filter.bit_size(filter)
  let error_rate = bloom_filter.error_rate(filter)
  let hash_count = bloom_filter.hash_fn_count(filter)

  // Reset the filter to empty state
  let empty_filter = bloom_filter.reset(filter)
}

Further documentation can be found at https://hexdocs.pm/gauzy.

Development

gleam test  # Run the tests
Search Document