telemetry_metrics v0.1.0 Telemetry.Metrics
Data model and specifications for aggregating Telemetry events.
Metrics are responsible for aggregating Telemetry events with the same name in order to gain any useful knowledge about the events.
Please note that Telemetry.Metrics package itself doesn't provide any functionality for aggregating metrics. This library only defines the data model and specifications for aggregations which should be implemented by reporters - libraries exporting metrics to external systems. You can read more about reporters in the "Reporters" section below.
Data model
Telemetry.Metrics imposes a multi-dimensional data model - a single metric may generate multiple aggregations, each aggregation being bound to a unique set of tag values. Tags are key-value pairs derived from event metadata (in the simplest case, tags are a subset of the metadata). The tag values of an event determine which of the aggregations its value contributes to.
For example, imagine that you want to count how many requests are being made against your web application. On each request, you might emit an event with the name of the controller and action handling that request, e.g.:
:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "index"})
:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "index"})
:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "create"})
:telemetry.execute([:http, :request], 1, %{controller: "product_controller", action: "get"})
With a multi-dimensional data model, the result of aggregating those events by the :controller and :action tags would look like this:
controller | action | count |
---|---|---|
user_controller | index | 2 |
user_controller | create | 1 |
product_controller | get | 1 |
You can see that the request count is broken down by each unique set of tag values.
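The breakdown in the table above can be reproduced with a few lines of plain Elixir. This is only a sketch of the data model, not how any reporter is implemented: the module TagCountSketch and its function count_by_tags/2 are hypothetical names, and the events are modelled as {value, metadata} tuples rather than real Telemetry events.

```elixir
defmodule TagCountSketch do
  # Hypothetical helper, not part of Telemetry.Metrics.
  # `events` is a list of {value, metadata} tuples, `tags` a list of metadata keys.
  def count_by_tags(events, tags) do
    Enum.reduce(events, %{}, fn {_value, metadata}, acc ->
      # The tag values of the event select the aggregation to update.
      tag_values = Map.take(metadata, tags)
      Map.update(acc, tag_values, 1, &(&1 + 1))
    end)
  end
end

events = [
  {1, %{controller: "user_controller", action: "index"}},
  {1, %{controller: "user_controller", action: "index"}},
  {1, %{controller: "user_controller", action: "create"}},
  {1, %{controller: "product_controller", action: "get"}}
]

counts = TagCountSketch.count_by_tags(events, [:controller, :action])
```

Each key of the resulting map is one unique set of tag values, and each value is the count for that set, which corresponds to one row of the table.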
Metric types
A metric type specifies how the event values are aggregated. Telemetry.Metrics aims to define a set of metric types covering the most common instrumentation patterns.
Metric types below are heavily inspired by OpenCensus.
Counter
Value of the counter metric is the number of emitted events, regardless of event value. It's monotonically increasing and its value is never reset.
Sum
Value of the sum metric is the sum of event values.
LastValue
Value of this metric is the value of the most recent event.
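To make the three definitions above concrete, here is a minimal sketch that applies them to a list of event values in emission order. The module AggregationSketch is a hypothetical name used for illustration; real reporters update these values incrementally as events arrive and may store them in an external system.

```elixir
defmodule AggregationSketch do
  # Hypothetical helpers, not part of Telemetry.Metrics.

  # Counter: the number of emitted events; event values are ignored.
  def counter(values), do: length(values)

  # Sum: the sum of event values.
  def sum(values), do: Enum.sum(values)

  # LastValue: the value of the most recent event (nil when there are no events).
  def last_value(values), do: List.last(values)
end

# Event values in the order the events were emitted.
values = [3, 5, 2]

counter = AggregationSketch.counter(values)
sum = AggregationSketch.sum(values)
last_value = AggregationSketch.last_value(values)
```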
Distribution
The value of this metric is a histogram distribution of event values, i.e. how many events were emitted with values falling into defined buckets. Histogram values can be used to compute approximations of useful statistics about the data, like quantiles, minimum or maximum.
For example, given boundaries [0, 100, 200], the distribution metric produces four values:
- number of event values less than or equal to 0
- number of event values greater than 0 and less than or equal to 100
- number of event values greater than 100 and less than or equal to 200
- number of event values greater than 200
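The bucket rules listed above can be sketched as follows. BucketSketch and bucket_counts/2 are hypothetical names, and the sketch assumes the boundaries are sorted in ascending order; it illustrates the expected result, not the implementation of any particular reporter.

```elixir
defmodule BucketSketch do
  # Hypothetical helper, not part of Telemetry.Metrics.
  # Returns a list of length(boundaries) + 1 counts, one per bucket.
  # Assumes `boundaries` is sorted in ascending order.
  def bucket_counts(values, boundaries) do
    initial = List.duplicate(0, length(boundaries) + 1)

    Enum.reduce(values, initial, fn value, counts ->
      # Pick the first boundary the value is less than or equal to;
      # values above every boundary fall into the last bucket.
      index = Enum.find_index(boundaries, fn b -> value <= b end) || length(boundaries)
      List.update_at(counts, index, &(&1 + 1))
    end)
  end
end

counts = BucketSketch.bucket_counts([-5, 0, 50, 100, 150, 250, 300], [0, 100, 200])
```

Here -5 and 0 land in the first bucket, 50 and 100 in the second, 150 in the third, and 250 and 300 in the last one.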
Metric specifications
Metric specification is a data structure describing the metric - its name, type, name of the events aggregated by the metric, etc. The structure of metric specification is relevant only to authors of reporters.
Metric specifications are created using one of the four functions: counter/2, sum/2, last_value/2 and distribution/2. Each of those functions returns a specification of metric of the corresponding type. The first argument to all these functions is the name of events which are aggregated by the metric. Event name might be represented as in Telemetry, i.e. as a list of atoms ([:http, :request]), or as a string of words joined by dots ("http.request").
Note: do not use data from external sources as metric or event names! Since they are converted to atoms, your application becomes vulnerable to atom leakage and might run out of memory.
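The relationship between the two name representations, and the reason for the warning above, can be sketched like this. NameSketch.normalize/1 is a hypothetical function written for illustration; it is an assumption about what the conversion looks like, not the library's actual code.

```elixir
defmodule NameSketch do
  # Hypothetical normalization, not the library's actual implementation.

  # A dotted string is split into words and each word becomes an atom.
  # Atoms are never garbage collected, which is why names built from
  # external input can exhaust the atom table.
  def normalize(name) when is_binary(name) do
    name
    |> String.split(".")
    |> Enum.map(&String.to_atom/1)
  end

  # A list of atoms is already in the normalized form.
  def normalize(name) when is_list(name), do: name
end

from_string = NameSketch.normalize("http.request")
from_list = NameSketch.normalize([:http, :request])
```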
The second argument is a list of options. Below is the description of the options common to all metric types:
- :name - the metric name. Metric name can be represented in the same way as event name. Defaults to event name given as first argument;
- :tags - tags by which aggregations will be broken down. Defaults to an empty list;
- :metadata - determines what part of event metadata is used as the source of tag values. Default value is the value of :tags or empty list if :tags are not set. There are three possible values of this option:
  - :all - all event metadata is used;
  - list of terms, e.g. [:table, :kind] - only these keys from the event metadata are used;
  - one argument function taking the event metadata and returning the metadata which should be used to generate tag values
- :description - human-readable description of the metric. Might be used by reporters for documentation purposes. Defaults to nil;
- :unit - an atom describing the unit of event values. Might be used by reporters for documentation purposes. Defaults to :unit.
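The three forms of the :metadata option all describe a way of turning event metadata into the metadata used for tag values. A minimal sketch of that idea, assuming each form is reduced to a one-argument function, could look like this. MetadataSketch.metadata_fun/1 is a hypothetical name and not part of the library's API.

```elixir
defmodule MetadataSketch do
  # Hypothetical helper, not part of Telemetry.Metrics.
  # Turns each accepted form of the :metadata option into a one-argument function.

  # :all - the event metadata is used unchanged.
  def metadata_fun(:all), do: fn metadata -> metadata end

  # A list of keys - only these keys are taken from the event metadata.
  def metadata_fun(keys) when is_list(keys) do
    fn metadata -> Map.take(metadata, keys) end
  end

  # A one-argument function - used as given.
  def metadata_fun(fun) when is_function(fun, 1), do: fun
end

event_metadata = %{table: "users", kind: :select, duration: 12}

all = MetadataSketch.metadata_fun(:all).(event_metadata)
subset = MetadataSketch.metadata_fun([:table, :kind]).(event_metadata)
custom = MetadataSketch.metadata_fun(fn m -> %{table: String.upcase(m.table)} end).(event_metadata)
```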
Reporters
Reporters take metric definitions as an input, subscribe to relevant events and update the metrics when the events are emitted. Updating the metric might involve publishing the metrics periodically, or on demand, to external systems. Telemetry.Metrics defines only the specifications for metric types, and reporters should provide the actual implementation of these aggregations.
Rationale
The design proposed by Telemetry.Metrics might look controversial - unlike most of the libraries available on the BEAM, it doesn't aggregate metrics itself, it merely defines what users should expect when using the reporters. There are two arguments for this solution.
- If Telemetry.Metrics aggregated metrics itself, the way those aggregations work would be imposed on the system the metrics are published to. For example, counters in StatsD are reset on every flush and can be decremented, whereas counters in Prometheus are monotonically increasing. Telemetry.Metrics doesn't focus on those details - instead, it describes what the end user, the operator, expects to see when using a metric of a particular type. This implies that in most cases aggregated metrics won't be visible inside the BEAM, but in exchange aggregations can be implemented in the way that makes the most sense for a particular system. Finally, one could also implement an in-VM "reporter" which would aggregate the metrics and expose them inside the BEAM.
- When there is a need to swap the reporters, and if both reporters follow the metric types specification, then the end result of aggregation is the same, regardless of the backend system in use.
Requirements for reporters
Reporters should accept metric specifications and subscribe to relevant events. When those events are emitted, the metric should be updated (either in memory or by contacting an external system) in such a way that the user is able to view metric values as described in the "Metric types" section.
If the reporter does not support the metric given to it, it should log a warning.
Reporters should also document how Telemetry.Metrics metric types, names, and tags are translated to metric types and identifiers in the system they publish metrics to.
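As an illustration of the in-memory case, the following is a rough sketch of an in-VM counter "reporter" that keeps its state in an Agent. Everything here is hypothetical: the module InMemoryCounterSketch, the shape of handle_event/4, and the choice of an Agent are assumptions made for the example, and the sketch is deliberately not wired to Telemetry, so events are passed to it by hand.

```elixir
defmodule InMemoryCounterSketch do
  # Hypothetical in-VM "reporter" for a single counter metric.
  # Not part of Telemetry.Metrics and not attached to Telemetry.

  # State is a map from tag values to the number of events seen.
  def start_link, do: Agent.start_link(fn -> %{} end)

  # Called once per event. The event value is ignored because a counter
  # only counts events.
  def handle_event(agent, _value, metadata, tags) do
    tag_values = Map.take(metadata, tags)

    Agent.update(agent, fn counts ->
      Map.update(counts, tag_values, 1, &(&1 + 1))
    end)
  end

  # Lets the user view the current metric values.
  def counts(agent), do: Agent.get(agent, fn counts -> counts end)
end

{:ok, agent} = InMemoryCounterSketch.start_link()
InMemoryCounterSketch.handle_event(agent, 1, %{role: :admin}, [:role])
InMemoryCounterSketch.handle_event(agent, 7, %{role: :admin}, [:role])
InMemoryCounterSketch.handle_event(agent, 1, %{role: :guest}, [:role])

counts = InMemoryCounterSketch.counts(agent)
```

A real reporter would additionally subscribe to the events named in the metric specifications and document how the tag values map to identifiers in its backend.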
Types
counter_options()
counter_options() :: [metric_option()]
description()
description() :: nil | String.t()
distribution_options()
distribution_options() :: [
  metric_option() | {:buckets, Telemetry.Metrics.Distribution.buckets()}
]
event_name()
event_name() :: String.t() | :telemetry.event_name()
last_value_options()
last_value_options() :: [metric_option()]
metadata()
metadata() ::
  :all
  | [key :: term()]
  | (:telemetry.event_metadata() -> :telemetry.event_metadata())
metric_name()
metric_name() :: String.t() | normalized_metric_name()
metric_option()
metric_option() ::
  {:name, metric_name()}
  | {:metadata, metadata()}
  | {:tags, tags()}
  | {:description, description()}
  | {:unit, unit()}
metric_type()
metric_type() :: :counter | :sum | :last_value | :distribution
normalized_metric_name()
normalized_metric_name() :: [atom(), ...]
sum_options()
sum_options() :: [metric_option()]
t()
t() :: %module(){
  name: normalized_metric_name(),
  event_name: :telemetry.event_name(),
  metadata: (:telemetry.event_metadata() -> :telemetry.event_metadata()),
  tags: tags(),
  description: description(),
  unit: unit()
}
Common fields for metric specifications
Reporters should assume that these fields are present in all metric specifications.
tag()
tag() :: term()
tags()
tags() :: [tag()]
unit()
unit() :: atom()
Functions
counter(event_name, options \\ [])
counter(event_name(), counter_options()) :: Telemetry.Metrics.Counter.t()
Returns a specification of counter metric.
See "Metric specifications" section in the top-level documentation of this module for more information.
Example
counter(
  "http.request",
  metadata: [:controller, :action],
  tags: [:controller, :action]
)
distribution(event_name, options)
distribution(event_name(), distribution_options()) ::
  Telemetry.Metrics.Distribution.t()
Returns a specification of distribution metric.
A distribution metric requires a :buckets option in the options keyword list.
See "Metric specifications" section in the top-level documentation of this module for more information.
Example
distribution(
  "http.request",
  buckets: [100, 200, 300],
  tags: [:controller, :action]
)
last_value(event_name, options \\ [])
last_value(event_name(), last_value_options()) ::
  Telemetry.Metrics.LastValue.t()
Returns a specification of last value metric.
See "Metric specifications" section in the top-level documentation of this module for more information.
Example
last_value(
  "vm.memory.total",
  description: "Total amount of memory allocated by the Erlang VM",
  unit: :byte
)
sum(event_name, options \\ [])
sum(event_name(), sum_options()) :: Telemetry.Metrics.Sum.t()
Returns a specification of sum metric.
See "Metric specifications" section in the top-level documentation of this module for more information.
Example
sum("user.session_count.change", name: "user.session_count", metadata: [:role], tags: [:role])