UintSet v0.1.3 UintSet View Source

Functions that work on sets of small integers >= 0.

UintSet is an alternative set type in Elixir emulating the MapSet interface as closely as possible. Many of the UintSet doctests and unit tests were adapted from MapSet. UintSet illustrates the construction of a functional data structure from scratch, implementing the Inspect, Enumerable, and Collectable protocols.

An UintSet can contain only non-negative integers. By definition, sets contain unique elements. Trying to insert a duplicate is a no-op:

  iex> uint_set = UintSet.new()
  #UintSet<[]>
  iex> uint_set = UintSet.put(uint_set, 3)
  #UintSet<[3]>
  iex> uint_set |> UintSet.put(2) |> UintSet.put(2)
  #UintSet<[2, 3]>

UintSet.new/1 accepts an enumerable of elements:

  iex> UintSet.new(1..5)
  #UintSet<[1, 2, 3, 4, 5]>

An UintSet is represented internally using the %UintSet{} struct. This struct can be used whenever there's a need to pattern match on something being an UintSet:

  iex> match?(%UintSet{}, UintSet.new())
  true

The %UintSet{} struct contains a single field—bits— an integer which is used as a bit vector where each bit set to 1 represents a number present in the set.

An empty set is stored as bits = 0:

  iex> empty = UintSet.new()
  iex> empty.bits
  0
  iex> UintSet.member?(empty, 0)
  false

A set containing just a 0 is stored as bits = 1, because the bit at 0 is set, so the element 0 is present:

  iex> set_with_zero = UintSet.new([0])
  iex> set_with_zero.bits
  1
  iex> UintSet.member?(set_with_zero, 0)
  true

A set with a 2 is stored as bits = 4, because the bit at 2 is set, so the element 2 is present:

  iex> set_with_two = UintSet.new([2])
  iex> set_with_two.bits
  4
  iex> UintSet.member?(set_with_two, 2)
  true

A set with the elements 0 and 1 is stored as bits = 3, because 3 is 0b11, so the bits at 0 and 1 are set:

  iex> set_with_zero_and_one = UintSet.new([0, 1])
  #UintSet<[0, 1]>
  iex> set_with_zero_and_one.bits
  3

The UintSet.new/1 function also accepts a keyword argument setting the initial value of the bits field:

  iex> UintSet.new(bits: 13)
  #UintSet<[0, 2, 3]>

This is easier to understand using base 2 notation for the argument:

  iex> UintSet.new(bits: 0b1101)
  #UintSet<[0, 2, 3]>

UintSets can also be constructed starting from other collection-type data structures: for example, see UintSet.new/1 or Enum.into/2.

All the content of an UintSet is represented by a single integer, which in Elixir is limited only by available memory. Using an integer as a bit vector allows set operations like union and intersection to be implemented using fast bitwise operators. See the source code of UintSet.union and UintSet.intersection.

A bit vector is efficient only for storing sets of small integers, or high-density sets where a large percentage of the possible elements are present. The memory usage is proportional only to the largest element stored, not to the number of elements present. If the largest element in a set is 1_000_000, the raw bits will take 125_000 bytes (⅛), regardless of the number of elements in the set.

This package was inspired by the intset example from chapter 6 of The Go Programming Language, by Alan. A. A. Donovan and Brian W. Kernighan.

Link to this section Summary

Functions

Deletes value from uint_set.

Returns a set that is uint_set1 without the members of uint_set2.

Checks if uint_set1 and uint_set2 have no members in common.

Checks if two sets are equal.

Returns a set containing only members that uint_set1 and uint_set2 have in common.

Returns the number of elements in uint_set. This function is named length because it needs to traverse the uint_set, so it runs in O(n) time. The corresponding function in MapSet is size.

Checks if uint_set contains value.

Returns a new empty UintSet.

Creates a set from an enumerable.

Creates a set from an enumerable via the transformation function.

Inserts value into uint_set if uint_set doesn't already contain it.

Returns a stream function yielding the elements of uint_set one by one in ascending order. The stream lazily traverses the bits of the uint_set as needed.

Checks if uint_set1's members are all contained in uint_set2.

Returns a list containing all members of uint_set in ascending order.

Returns a set containing all members of uint_set1 and uint_set2.

Link to this section Functions

Deletes value from uint_set.

Returns a new set which is a copy of uint_set but without value.

Examples

iex> uint_set = UintSet.new([1, 2, 3])
iex> UintSet.delete(uint_set, 4)
#UintSet<[1, 2, 3]>
iex> UintSet.delete(uint_set, 2)
#UintSet<[1, 3]>
Link to this function

difference(uint_set1, uint_set2) View Source

Returns a set that is uint_set1 without the members of uint_set2.

Examples

iex> UintSet.difference(UintSet.new([1, 2]), UintSet.new([2, 3, 4]))
#UintSet<[1]>
Link to this function

disjoint?(uint_set1, uint_set2) View Source

Checks if uint_set1 and uint_set2 have no members in common.

Examples

iex> UintSet.disjoint?(UintSet.new([1, 2]), UintSet.new([3, 4]))
true
iex> UintSet.disjoint?(UintSet.new([1, 2]), UintSet.new([2, 3]))
false
Link to this function

equal?(uint_set1, uint_set2) View Source

Checks if two sets are equal.

Examples

iex> UintSet.equal?(UintSet.new([1, 2]), UintSet.new([2, 1, 1]))
true
iex> UintSet.equal?(UintSet.new([1, 2]), UintSet.new([3, 4]))
false
Link to this function

intersection(uint_set1, uint_set2) View Source

Returns a set containing only members that uint_set1 and uint_set2 have in common.

Examples

iex> UintSet.intersection(UintSet.new([1, 2]), UintSet.new([2, 3, 4]))
#UintSet<[2]>

iex> UintSet.intersection(UintSet.new([1, 2]), UintSet.new([3, 4]))
#UintSet<[]>

Returns the number of elements in uint_set. This function is named length because it needs to traverse the uint_set, so it runs in O(n) time. The corresponding function in MapSet is size.

Example

iex> UintSet.length(UintSet.new([10, 20, 30]))
3
Link to this function

member?(uint_set, value) View Source

Checks if uint_set contains value.

Examples

iex> UintSet.member?(UintSet.new([1, 2, 3]), 2)
true
iex> UintSet.member?(UintSet.new([1, 2, 3]), 4)
false

Returns a new empty UintSet.

Example

iex> UintSet.new()
#UintSet<[]>

Creates a set from an enumerable.

Examples

iex> UintSet.new([10, 5, 7])
#UintSet<[5, 7, 10]>
iex> UintSet.new(3..7)
#UintSet<[3, 4, 5, 6, 7]>
iex> UintSet.new([3, 3, 3, 2, 2, 1])
#UintSet<[1, 2, 3]>
Link to this function

new(enumerable, transform) View Source

Creates a set from an enumerable via the transformation function.

Examples

iex> UintSet.new([1, 3, 1], fn x -> 2 * x end)
#UintSet<[2, 6]>

Inserts value into uint_set if uint_set doesn't already contain it.

Examples

iex> UintSet.put(UintSet.new([1, 2, 3]), 3)
#UintSet<[1, 2, 3]>
iex> UintSet.put(UintSet.new([1, 2, 3]), 4)
#UintSet<[1, 2, 3, 4]>

Returns a stream function yielding the elements of uint_set one by one in ascending order. The stream lazily traverses the bits of the uint_set as needed.

Examples

iex> my_stream = UintSet.new([10, 5, 7]) |> UintSet.stream
iex> my_stream |> is_function
true
iex> my_stream |> Stream.map(&(&1 * 10)) |> Enum.to_list
[50, 70, 100]
Link to this function

subset?(uint_set1, uint_set2) View Source

Checks if uint_set1's members are all contained in uint_set2.

This function checks if uint_set1 is a subset of uint_set2.

Examples

iex> UintSet.subset?(UintSet.new([1, 2]), UintSet.new([1, 2, 3]))
true
iex> UintSet.subset?(UintSet.new([1, 2, 3]), UintSet.new([1, 2]))
false

Returns a list containing all members of uint_set in ascending order.

Examples

iex> UintSet.to_list(UintSet.new([2, 3, 1]))
[1, 2, 3]
Link to this function

union(uint_set1, uint_set2) View Source

Returns a set containing all members of uint_set1 and uint_set2.

Examples

iex> UintSet.union(UintSet.new([1, 2]), UintSet.new([2, 3, 4]))
#UintSet<[1, 2, 3, 4]>