Xb5.Bag (xb5_elixir v1.0.0)

Copy Markdown View Source

An ordered multiset (bag) backed by a B-tree of order 5.

Unlike a set, a bag allows duplicate values — the same value may appear multiple times. Elements are kept in ascending Erlang term order. Comparisons use == rather than === — so 1 and 1.0 are treated as the same element.

Pushing vs putting

Two insert operations are provided:

  • push/2 — always inserts a new copy, even if the value is already present.
  • put/2 — inserts only if the value is not already present (idempotent, like MapSet.put/2).

Order-statistic operations

In addition to standard collection operations, Xb5.Bag provides:

Conversion to a list via to_list/1 always yields elements in ascending order, with duplicates preserved.

Erlang interop

Xb5.Bag is compatible with the Erlang :xb5_bag module. Build one from an :xb5_bag term via new/1. To go the other way, call unwrap!/1 to extract the size and root node, then pass the result to :xb5_bag.wrap/1.

See also

  • Xb5.Set — ordered set, for unique elements and set-algebraic operations.
  • Xb5.Tree — ordered key-value store.

Examples

iex> bag = Xb5.Bag.new([1, 1, 2, 3])
Xb5.Bag.new([1, 1, 2, 3])
iex> Xb5.Bag.member?(bag, 2)
true
iex> Xb5.Bag.count(bag, 1)
2

Summary

Functions

Finds the element at the given index (0-based). Returns default if index is out of bounds. Runs in O(log n) time.

Returns the number of times value appears in bag. Values are matched using ==.

Removes one occurrence of value from the bag. Returns the bag unchanged if value is not present.

Removes all occurrences of value from the bag. Returns the bag unchanged if value is not present.

Returns a new bag containing only elements for which fun returns a truthy value.

Returns the first (smallest) element in the bag, or default if the bag is empty.

Returns the first (smallest) element in the bag. Raises Xb5.EmptyError if the bag is empty.

Returns the smallest element strictly greater (larger) than value, or :error if none exists.

Returns the 0-based index of value in the bag, or nil if not present. Runs in O(log n) time.

Returns the 0-based index of value in the bag. Raises KeyError if not present. Runs in O(log n) time.

Returns the last (largest) element in the bag, or default if the bag is empty.

Returns the last (largest) element in the bag. Raises Xb5.EmptyError if the bag is empty.

Returns the largest element strictly less (smaller) than value, or :error if none exists.

Checks if bag contains value. Membership is tested using ==, not ===.

Merges two bags into a new bag containing all elements from both, preserving duplicates.

Returns a new empty bag.

Creates a bag from an Erlang :xb5_bag term or an enumerable.

Creates a bag from an Erlang :xb5_bag term or an enumerable via the transformation function.

Returns the percentile value for the given percentile (0.0–1.0) using the given method options. Returns nil if the bag is empty or the percentile is out of range for the chosen method. Runs in O(log n) time.

Returns the percentile bracket for the given percentile, or nil if the bag is empty or the percentile is out of range for the chosen method. Runs in O(log n) time.

Returns the percentile rank of value in the bag as a float in 0.0–1.0. Runs in O(log n) time. Raises Xb5.EmptyError if the bag is empty.

Removes and returns the first (smallest) element. Raises Xb5.EmptyError if the bag is empty.

Removes and returns the last (largest) element. Raises Xb5.EmptyError if the bag is empty.

Adds value to the bag, always inserting a new copy even if already present.

Adds value to the bag only if it is not already present. Returns the bag unchanged if value is present.

Returns a new bag containing only elements for which fun returns a falsy value.

Returns the number of elements in the bag, counting duplicates.

Returns a lazy stream over all elements of bag.

Returns a lazy stream over elements of bag starting from value.

Returns a lazy stream over elements of bag starting from index (0-based), always in ascending order.

Returns structural statistics about the underlying B-tree.

Returns all elements as a sorted list, with duplicates.

Returns the size and root node of bag as %{size: n, root: node}. Pass the result to :xb5_bag.wrap/1 to obtain a proper :xb5_bag term.

Types

order()

@type order() :: :asc | :desc

t()

@type t() :: t(value())

t(value)

@type t(value) :: %Xb5.Bag{root: :xb5_bag_node.t(value), size: non_neg_integer()}

value()

@type value() :: term()

Functions

at(bag, value, default \\ nil)

@spec at(t(value()), index, default) :: value() | default
when index: integer(), default: term()

Finds the element at the given index (0-based). Returns default if index is out of bounds. Runs in O(log n) time.

A negative index counts from the end: -1 is the last element, -2 the second-to-last, etc.

This function is an optimized version of Enum.at/2.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.at(bag, 0)
1
iex> Xb5.Bag.at(bag, 2)
3
iex> Xb5.Bag.at(bag, -1)
3
iex> Xb5.Bag.at(bag, 5)
nil
iex> Xb5.Bag.at(bag, 5, :missing)
:missing

count(bag, value)

@spec count(t(value()), value()) :: non_neg_integer()

Returns the number of times value appears in bag. Values are matched using ==.

Examples

iex> bag = Xb5.Bag.new([1, 1, 1, 2, 3])
iex> Xb5.Bag.count(bag, 1)
3
iex> Xb5.Bag.count(bag, 2)
1
iex> Xb5.Bag.count(bag, 4)
0

delete(bag, value)

@spec delete(t(val1), val2) :: t(val1) when val1: value(), val2: value()

Removes one occurrence of value from the bag. Returns the bag unchanged if value is not present.

Examples

iex> bag = Xb5.Bag.new([1, 1, 2, 3])
iex> Xb5.Bag.delete(bag, 1)
Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.delete(bag, 4)
Xb5.Bag.new([1, 1, 2, 3])

delete_all(bag, value)

@spec delete_all(t(val1), val2) :: t(val1) when val1: value(), val2: value()

Removes all occurrences of value from the bag. Returns the bag unchanged if value is not present.

Examples

iex> bag = Xb5.Bag.new([1, 1, 2, 3])
iex> Xb5.Bag.delete_all(bag, 1)
Xb5.Bag.new([2, 3])
iex> Xb5.Bag.delete_all(bag, 4)
Xb5.Bag.new([1, 1, 2, 3])

filter(bag, fun)

@spec filter(t(a), (a -> as_boolean(term()))) :: t(a) when a: value()

Returns a new bag containing only elements for which fun returns a truthy value.

Examples

iex> Xb5.Bag.filter(Xb5.Bag.new([1, 2, 3, 4, 5]), fn x -> x > 3 end)
Xb5.Bag.new([4, 5])

iex> Xb5.Bag.filter(Xb5.Bag.new([1, 1, 2, 3]), fn x -> rem(x, 2) != 0 end)
Xb5.Bag.new([1, 1, 3])

first(bag, default \\ nil)

@spec first(t(value()), default) :: value() | default when default: term()

Returns the first (smallest) element in the bag, or default if the bag is empty.

Examples

iex> Xb5.Bag.first(Xb5.Bag.new([1, 2, 3]))
1
iex> Xb5.Bag.first(Xb5.Bag.new())
nil
iex> Xb5.Bag.first(Xb5.Bag.new(), :empty)
:empty

first!(bag)

@spec first!(t(value())) :: value()

Returns the first (smallest) element in the bag. Raises Xb5.EmptyError if the bag is empty.

Examples

iex> Xb5.Bag.first!(Xb5.Bag.new([1, 2, 3]))
1
iex> Xb5.Bag.first!(Xb5.Bag.new())
** (Xb5.EmptyError) empty error

higher(bag, value)

@spec higher(t(value()), value()) :: {:ok, value()} | :error

Returns the smallest element strictly greater (larger) than value, or :error if none exists.

value does not need to be a member of the bag.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.higher(bag, 1)
{:ok, 2}
iex> Xb5.Bag.higher(bag, 1.5)
{:ok, 2}
iex> Xb5.Bag.higher(bag, 3)
:error

index_of(bag, value)

@spec index_of(t(value()), value()) :: non_neg_integer() | nil

Returns the 0-based index of value in the bag, or nil if not present. Runs in O(log n) time.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.index_of(bag, 1)
0
iex> Xb5.Bag.index_of(bag, 3)
2
iex> Xb5.Bag.index_of(bag, 4)
nil

index_of!(bag, value)

@spec index_of!(t(value()), value()) :: non_neg_integer()

Returns the 0-based index of value in the bag. Raises KeyError if not present. Runs in O(log n) time.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.index_of!(bag, 1)
0
iex> Xb5.Bag.index_of!(bag, 3)
2
iex> assert_raise KeyError, fn -> Xb5.Bag.index_of!(bag, 4) end

last(bag, default \\ nil)

@spec last(t(value()), default) :: value() | default when default: term()

Returns the last (largest) element in the bag, or default if the bag is empty.

Examples

iex> Xb5.Bag.last(Xb5.Bag.new([1, 2, 3]))
3
iex> Xb5.Bag.last(Xb5.Bag.new())
nil
iex> Xb5.Bag.last(Xb5.Bag.new(), :empty)
:empty

last!(bag)

@spec last!(t(value())) :: value()

Returns the last (largest) element in the bag. Raises Xb5.EmptyError if the bag is empty.

Examples

iex> Xb5.Bag.last!(Xb5.Bag.new([1, 2, 3]))
3
iex> Xb5.Bag.last!(Xb5.Bag.new())
** (Xb5.EmptyError) empty error

lower(bag, value)

@spec lower(t(value()), value()) :: {:ok, value()} | :error

Returns the largest element strictly less (smaller) than value, or :error if none exists.

value does not need to be a member of the bag.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.lower(bag, 3)
{:ok, 2}
iex> Xb5.Bag.lower(bag, 1.5)
{:ok, 1}
iex> Xb5.Bag.lower(bag, 1)
:error

member?(bag, value)

@spec member?(t(), value()) :: boolean()

Checks if bag contains value. Membership is tested using ==, not ===.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.member?(bag, 2)
true
iex> Xb5.Bag.member?(bag, 2.0)
true
iex> Xb5.Bag.member?(bag, 4)
false

merge(bag1, bag2)

@spec merge(t(val1), t(val2)) :: t(val1 | val2) when val1: value(), val2: value()

Merges two bags into a new bag containing all elements from both, preserving duplicates.

Examples

iex> Xb5.Bag.merge(Xb5.Bag.new([1, 2, 3]), Xb5.Bag.new([2, 3, 4]))
Xb5.Bag.new([1, 2, 2, 3, 3, 4])

iex> Xb5.Bag.merge(Xb5.Bag.new([1, 2]), Xb5.Bag.new())
Xb5.Bag.new([1, 2])

new()

@spec new() :: t()

Returns a new empty bag.

Examples

iex> Xb5.Bag.new()
Xb5.Bag.new([])

new(input)

@spec new(:xb5_bag.bag(value()) | Enumerable.t()) :: t(value())

Creates a bag from an Erlang :xb5_bag term or an enumerable.

When given an enumerable, elements are stored in ascending order with duplicates preserved. When given an Erlang :xb5_bag term, the underlying structure is reused directly.

Examples

iex> Xb5.Bag.new([1, 1, 2, 3])
Xb5.Bag.new([1, 1, 2, 3])

iex> Xb5.Bag.new([3, :a, :b, :b])
Xb5.Bag.new([3, :a, :b, :b])

new(input, transform)

@spec new(:xb5_bag.bag() | Enumerable.t(), (term() -> value())) :: t(value())

Creates a bag from an Erlang :xb5_bag term or an enumerable via the transformation function.

Examples

iex> Xb5.Bag.new([1, 1, 2], fn x -> x * 2 end)
Xb5.Bag.new([2, 2, 4])

percentile(bag, percentile, opts \\ [])

@spec percentile(t(value()), percentile, opts) ::
  (value() | interpolation_result) | nil
when percentile: :xb5_bag_utils.percentile(),
     opts: [:xb5_bag_utils.percentile_bracket_opt()],
     interpolation_result: number()

Returns the percentile value for the given percentile (0.0–1.0) using the given method options. Returns nil if the bag is empty or the percentile is out of range for the chosen method. Runs in O(log n) time.

Raises Xb5.Bag.NonNumericInterpolationError if the percentile falls between two elements and those elements are not numbers (interpolation is impossible for non-numeric values).

Examples

iex> bag = Xb5.Bag.new([1, 2, 3, 4])
iex> Xb5.Bag.percentile(bag, 0.0)
1
iex> Xb5.Bag.percentile(bag, 0.5)
2.5
iex> Xb5.Bag.percentile(bag, 1.0)
4
iex> Xb5.Bag.percentile(Xb5.Bag.new(), 0.5)
nil

percentile_bracket(bag, percentile, opts \\ [])

@spec percentile_bracket(t(value()), percentile, opts) ::
  {:exact, value()} | {:between, value(), value(), float()} | nil
when percentile: :xb5_bag_utils.percentile(),
     opts: [:xb5_bag_utils.percentile_bracket_opt()]

Returns the percentile bracket for the given percentile, or nil if the bag is empty or the percentile is out of range for the chosen method. Runs in O(log n) time.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3, 4])
iex> Xb5.Bag.percentile_bracket(bag, 0.0)
{:exact, 1}
iex> Xb5.Bag.percentile_bracket(bag, 0.5)
{:between, 2, 3, 0.5000000000000001}
iex> Xb5.Bag.percentile_bracket(bag, 1.0)
{:exact, 4}
iex> Xb5.Bag.percentile_bracket(Xb5.Bag.new(), 0.5)
nil

percentile_rank(bag, value)

@spec percentile_rank(t(value()), value()) :: float()

Returns the percentile rank of value in the bag as a float in 0.0–1.0. Runs in O(log n) time. Raises Xb5.EmptyError if the bag is empty.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3, 4, 5])
iex> Xb5.Bag.percentile_rank(bag, 3)
0.5
iex> Xb5.Bag.percentile_rank(bag, 1)
0.1
iex> Xb5.Bag.percentile_rank(Xb5.Bag.new(), 1)
** (Xb5.EmptyError) empty error

pop_first!(bag)

@spec pop_first!(t(value())) :: {value(), t(value())}

Removes and returns the first (smallest) element. Raises Xb5.EmptyError if the bag is empty.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.pop_first!(bag)
{1, Xb5.Bag.new([2, 3])}
iex> Xb5.Bag.pop_first!(Xb5.Bag.new())
** (Xb5.EmptyError) empty error

pop_last!(bag)

@spec pop_last!(t(value())) :: {value(), t(value())}

Removes and returns the last (largest) element. Raises Xb5.EmptyError if the bag is empty.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.pop_last!(bag)
{3, Xb5.Bag.new([1, 2])}
iex> Xb5.Bag.pop_last!(Xb5.Bag.new())
** (Xb5.EmptyError) empty error

push(bag, value)

@spec push(t(value()), new_value) :: t(value() | new_value) when new_value: value()

Adds value to the bag, always inserting a new copy even if already present.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.push(bag, 2)
Xb5.Bag.new([1, 2, 2, 3])
iex> Xb5.Bag.push(bag, 4)
Xb5.Bag.new([1, 2, 3, 4])

put(bag, value)

@spec put(t(value()), new_value) :: t(value() | new_value) when new_value: value()

Adds value to the bag only if it is not already present. Returns the bag unchanged if value is present.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.put(bag, 2)
Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.put(bag, 4)
Xb5.Bag.new([1, 2, 3, 4])

reject(bag, fun)

@spec reject(t(a), (a -> as_boolean(term()))) :: t(a) when a: value()

Returns a new bag containing only elements for which fun returns a falsy value.

Examples

iex> Xb5.Bag.reject(Xb5.Bag.new([1, 2, 3, 4, 5]), fn x -> x > 3 end)
Xb5.Bag.new([1, 2, 3])

iex> Xb5.Bag.reject(Xb5.Bag.new([1, 1, 2, 3]), fn x -> rem(x, 2) != 0 end)
Xb5.Bag.new([2])

size(bag)

@spec size(t()) :: non_neg_integer()

Returns the number of elements in the bag, counting duplicates.

Examples

iex> Xb5.Bag.size(Xb5.Bag.new([1, 2, 3]))
3
iex> Xb5.Bag.size(Xb5.Bag.new([1, 1, 2]))
3
iex> Xb5.Bag.size(Xb5.Bag.new())
0

stream(bag, order \\ :asc)

@spec stream(t(value()), order()) :: Enumerable.t()

Returns a lazy stream over all elements of bag.

order controls traversal direction: :asc (ascending, the default) or :desc (descending).

Examples

iex> bag = Xb5.Bag.new([1, 2, 3])
iex> Xb5.Bag.stream(bag) |> Enum.to_list()
[1, 2, 3]
iex> Xb5.Bag.stream(bag, :desc) |> Enum.to_list()
[3, 2, 1]
iex> Xb5.Bag.stream(Xb5.Bag.new()) |> Enum.to_list()
[]

stream_from(bag, value, order \\ :asc)

@spec stream_from(t(value()), value(), order()) :: Enumerable.t()

Returns a lazy stream over elements of bag starting from value.

For :asc (the default), starts at the first element greater than or equal to value. For :desc, starts at the first element less than or equal to value. Returns an empty stream if no such element exists.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3, 4, 5])
iex> Xb5.Bag.stream_from(bag, 3) |> Enum.to_list()
[3, 4, 5]
iex> Xb5.Bag.stream_from(bag, 3, :desc) |> Enum.to_list()
[3, 2, 1]
iex> Xb5.Bag.stream_from(bag, 6) |> Enum.to_list()
[]

stream_from_index(bag, index)

@spec stream_from_index(t(value()), integer()) :: Enumerable.t()

Returns a lazy stream over elements of bag starting from index (0-based), always in ascending order.

A negative index counts from the end: -1 starts at the last element. Returns an empty stream if index is out of bounds.

Examples

iex> bag = Xb5.Bag.new([1, 2, 3, 4, 5])
iex> Xb5.Bag.stream_from_index(bag, 2) |> Enum.to_list()
[3, 4, 5]
iex> Xb5.Bag.stream_from_index(bag, -2) |> Enum.to_list()
[4, 5]
iex> Xb5.Bag.stream_from_index(bag, 10) |> Enum.to_list()
[]

structural_stats(bag)

@spec structural_stats(t()) :: :xb5_structural_stats.t()

Returns structural statistics about the underlying B-tree.

Useful for inspecting tree balance and node utilization.

Examples

iex> Xb5.Bag.structural_stats(Xb5.Bag.new(1..100))
[
  height: 4,
  node_counts: [
    internal4: 2,
    internal3: 3,
    internal2: 3,
    internal1: 1,
    leaf4: 6,
    leaf3: 14,
    leaf2: 5,
    leaf1: 0
  ],
  node_percentages: [
    internal4: 5.9,
    internal3: 8.8,
    internal2: 8.8,
    internal1: 2.9,
    leaf4: 17.6,
    leaf3: 41.2,
    leaf2: 14.7,
    leaf1: 0.0
  ],
  total_keys: 100,
  key_percentages: [
    internal4: 8.0,
    internal3: 9.0,
    internal2: 6.0,
    internal1: 1.0,
    leaf4: 24.0,
    leaf3: 42.0,
    leaf2: 10.0,
    leaf1: 0.0
  ],
  avg_keys_per_node: 2.9411764705882355,
  avg_keys_per_internal_node: 2.6666666666666665,
  avg_keys_per_leaf_node: 3.04
]

to_list(bag)

@spec to_list(t(value())) :: [value()]

Returns all elements as a sorted list, with duplicates.

Examples

iex> Xb5.Bag.to_list(Xb5.Bag.new([1, 2, 3]))
[1, 2, 3]
iex> Xb5.Bag.to_list(Xb5.Bag.new([1, 1, 2]))
[1, 1, 2]

unwrap!(bag)

@spec unwrap!(t(value())) :: :xb5_bag.unwrapped_bag(value())

Returns the size and root node of bag as %{size: n, root: node}. Pass the result to :xb5_bag.wrap/1 to obtain a proper :xb5_bag term.

Examples

iex> bag = Xb5.Bag.new([1, 1, 2, 3])
iex> %{size: size} = Xb5.Bag.unwrap!(bag)
iex> size
4