ddskerl_bound (ddskerl v0.2.1)
DDSketch implementation in Erlang.
This implementation bounds the number of buckets: in a degenerate case memory consumption won't grow, but lower quantiles will lose accuracy as the summary saturates.
When the number of buckets exceeds the limit, the two smallest buckets are collapsed into one and the limit is kept.
The implementation uses a local, non-shared gb_trees, and the buckets are not preallocated but rather created on demand, without bounds on their indexes.
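The collapsing step described above can be sketched with gb_trees alone. This is a minimal illustration and not the library's actual code: the module name collapse_sketch, the function collapse/2, and the choice to fold the lowest-indexed bucket's count into the next-lowest bucket are assumptions made for the example.

```erlang
-module(collapse_sketch).
-export([collapse/2, demo/0]).

%% Hypothetical helper: buckets are a gb_trees mapping index -> count.
%% While more than Bound buckets exist, merge the two lowest-indexed
%% buckets into one, keeping the higher of the two indexes.
-spec collapse(gb_trees:tree(integer(), pos_integer()), pos_integer()) ->
    gb_trees:tree(integer(), pos_integer()).
collapse(Tree, Bound) when Bound >= 1 ->
    case gb_trees:size(Tree) > Bound of
        false ->
            Tree;
        true ->
            {_K1, C1, T1} = gb_trees:take_smallest(Tree),
            {K2, C2, T2} = gb_trees:take_smallest(T1),
            collapse(gb_trees:insert(K2, C1 + C2, T2), Bound)
    end.

%% Four buckets with a bound of three: buckets -3 and 0 are merged.
demo() ->
    T0 = gb_trees:from_orddict([{-3, 1}, {0, 4}, {2, 2}, {7, 5}]),
    gb_trees:to_list(collapse(T0, 3)).
```

The total count is preserved by the merge, which is why only the accuracy of the lowest quantiles degrades while the upper buckets stay untouched.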
Because the data structure cannot be accessed concurrently (share-nothing semantics), a good strategy is to keep many sketches running in parallel and to run queries over their merge.
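Querying over a merge could look roughly like the following. It uses only the merge/2 and quantile/2 functions documented on this page. The module sketch_merge and its two helpers are hypothetical names, and the example assumes that all sketches were created with the same options and that the order of merging does not matter.

```erlang
-module(sketch_merge).
-export([merge_all/1, quantile_over/2]).

%% Fold a non-empty list of sketches into one with ddskerl_bound:merge/2.
-spec merge_all([ddskerl_bound:ddskerl_bound(), ...]) ->
    ddskerl_bound:ddskerl_bound().
merge_all([First | Rest]) ->
    lists:foldl(fun ddskerl_bound:merge/2, First, Rest).

%% Answer a quantile query over the merged view of all sketches.
-spec quantile_over([ddskerl_bound:ddskerl_bound(), ...], float()) ->
    float() | undefined.
quantile_over(Sketches, Q) ->
    ddskerl_bound:quantile(merge_all(Sketches), Q).
```

Each worker keeps inserting into its own sketch, and the merged copy is built only when a query arrives, so the write path never contends on shared state.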
When to use
This is a good choice when the summary is owned by a single process that updates its state through its mailbox.
For example, the following could be used as a template:
...
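-behaviour(gen_server).

%% Keep the mailbox off-heap so a backlog of inserts does not bloat the process heap.
-spec start_link(ddskerl_bound:opts()) -> gen_server:start_ret().
start_link(Opts) ->
    gen_server:start_link(?MODULE, Opts, [{spawn_opt, [{message_queue_data, off_heap}]}]).

%% The sketch itself is the server state.
-spec init(ddskerl_bound:opts()) -> {ok, ddskerl_bound:ddskerl_bound()}.
init(Opts) ->
    {ok, ddskerl_bound:new(Opts)}.

%% Queries are synchronous and leave the sketch unchanged.
handle_call({get_quantile, Q}, _From, Sketch) ->
    {reply, ddskerl_bound:quantile(Sketch, Q), Sketch}.

%% Inserts are asynchronous and replace the state with the updated sketch.
handle_cast({new_value, Value}, Sketch) ->
    {noreply, ddskerl_bound:insert(Sketch, Value)}.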
...
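Callers of such a server would typically go through thin wrappers rather than building the messages by hand. A minimal sketch follows; the module name sketch_client and both function names are hypothetical, while the message shapes {new_value, Value} and {get_quantile, Q} are the ones handled by the template above.

```erlang
-module(sketch_client).
-export([new_value/2, get_quantile/2]).

%% Asynchronous insert: returns ok immediately, the server updates its sketch later.
-spec new_value(gen_server:server_ref(), number()) -> ok.
new_value(Server, Value) ->
    gen_server:cast(Server, {new_value, Value}).

%% Synchronous query: blocks until the server replies with the quantile.
-spec get_quantile(gen_server:server_ref(), float()) -> float() | undefined.
get_quantile(Server, Q) ->
    gen_server:call(Server, {get_quantile, Q}).
```

Using a cast for inserts keeps producers from blocking on the sketch owner, at the cost of no back-pressure if the mailbox grows.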
Summary
Types
-opaque ddskerl_bound()
DDSketch instance.
-type opts() :: #{error := float(), bound := non_neg_integer()}.
Options for the DDSketch.
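To give a feel for what the error option controls, here is the logarithmic bucket mapping from the general DDSketch algorithm. Whether ddskerl_bound computes its indexes in exactly this way is an assumption; the module ddsketch_index and its functions are hypothetical and exist only for illustration.

```erlang
-module(ddsketch_index).
-export([gamma/1, index/2, representative/2]).

%% Growth factor between consecutive bucket boundaries for relative error Err.
-spec gamma(float()) -> float().
gamma(Err) when Err > 0, Err < 1 ->
    (1 + Err) / (1 - Err).

%% Index of the bucket holding Value: bucket I covers (gamma^(I-1), gamma^I].
-spec index(number(), float()) -> integer().
index(Value, Err) when Value > 0 ->
    ceil(math:log(Value) / math:log(gamma(Err))).

%% Value reported for bucket Index; it lies within a relative error of Err
%% of every value that maps to that bucket.
-spec representative(integer(), float()) -> float().
representative(Index, Err) ->
    G = gamma(Err),
    2 * math:pow(G, Index) / (G + 1).
```

A smaller error makes the buckets narrower, so more of them are needed to cover the same range of values, which is where the bound option comes into play.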
Functions
-spec insert(ddskerl_bound(), number()) -> ddskerl_bound().
Insert a value into the DDSketch.
-spec merge(ddskerl_bound(), ddskerl_bound()) -> ddskerl_bound().
Merge two DDSketch instances.
-spec new(opts()) -> ddskerl_bound().
Create a new DDSketch instance.
-spec quantile(ddskerl_bound(), float()) -> float() | undefined.
Calculate the quantile of a DDSketch.
-spec sum(ddskerl_bound()) -> number().
Get the sum of elements in the DDSketch.
-spec total(ddskerl_bound()) -> non_neg_integer().
Get the total number of elements in the DDSketch.