telemetry_metrics v0.1.0 Telemetry.Metrics

Data model and specifications for aggregating Telemetry events.

Metrics are responsible for aggregating Telemetry events with the same name in order to gain any useful knowledge about the events.

Please note that the Telemetry.Metrics package itself doesn't provide any functionality for aggregating metrics. This library only defines the data model and specifications for aggregations, which should be implemented by reporters - libraries exporting metrics to external systems. You can read more about reporters in the "Reporters" section below.

Data model

Telemetry.Metrics imposes a multi-dimensional data model - a single metric may generate multiple aggregations, each aggregation being bound to a unique set of tag values. Tags are key-value pairs derived from event metadata (in the simplest case, tags are a subset of the metadata). The tag values of an event determine which of the aggregations the event's value contributes to.

For example, imagine that you want to count how many requests are being made against your web application. On each request, you might emit an event with the name of the controller and action handling that request, e.g.:

:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "index"})
:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "index"})
:telemetry.execute([:http, :request], 1, %{controller: "user_controller", action: "create"})
:telemetry.execute([:http, :request], 1, %{controller: "product_controller", action: "get"})

With a multi-dimensional data model, the result of aggregating those events by the :controller and :action tags would look like this:

controller          action  count
user_controller     index   2
user_controller     create  1
product_controller  get     1

You can see that the request count is broken down by each unique set of tag values.
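The breakdown above can be reproduced with a few lines of plain Elixir. This is only a sketch of the idea, not how any particular reporter works; the TagCountSketch module and the {name, value, metadata} tuple shape are assumptions made for illustration.

```elixir
defmodule TagCountSketch do
  # Hypothetical helper: counts events per unique combination of tag values.
  # Each event is assumed to be a {event_name, value, metadata} tuple.
  def count_by_tags(events, tags) do
    Enum.reduce(events, %{}, fn {_name, _value, metadata}, acc ->
      # Keep only the metadata keys selected as tags.
      tag_values = Map.take(metadata, tags)
      # Increment the aggregation bound to this set of tag values.
      Map.update(acc, tag_values, 1, &(&1 + 1))
    end)
  end
end

events = [
  {[:http, :request], 1, %{controller: "user_controller", action: "index"}},
  {[:http, :request], 1, %{controller: "user_controller", action: "index"}},
  {[:http, :request], 1, %{controller: "user_controller", action: "create"}},
  {[:http, :request], 1, %{controller: "product_controller", action: "get"}}
]

counts = TagCountSketch.count_by_tags(events, [:controller, :action])
```

Each key of the resulting map is one unique set of tag values, and each value is the count for that row of the table.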

Metric types

A metric type specifies how the event values are aggregated. Telemetry.Metrics aims to define a set of metric types covering the most common instrumentation patterns.

Metric types below are heavily inspired by OpenCensus.

Counter

Value of the counter metric is the number of emitted events, regardless of event value. It's monotonically increasing and its value is never reset.

Sum

Value of the sum metric is the sum of event values.

LastValue

Value of this metric is the value of the most recent event.
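To make the three definitions above concrete, here is a minimal sketch that computes each of them over a list of event values given in emission order. The MetricTypesSketch module is hypothetical; real reporters update these values incrementally or delegate to an external system.

```elixir
defmodule MetricTypesSketch do
  # Counter: the number of emitted events, regardless of their values.
  def counter(values), do: length(values)

  # Sum: the sum of all event values.
  def sum(values), do: Enum.sum(values)

  # LastValue: the value of the most recent event (nil if there were none).
  def last_value(values), do: List.last(values)
end

# Event values in the order they were emitted.
values = [5, 3, 10]

MetricTypesSketch.counter(values)
MetricTypesSketch.sum(values)
MetricTypesSketch.last_value(values)
```

For the values [5, 3, 10] the counter is 3, the sum is 18 and the last value is 10.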

Distribution

The value of this metric is a histogram distribution of event values, i.e. how many events were emitted with values falling into defined buckets. Histogram values can be used to compute approximations of useful statistics about the data, like quantiles, minimum or maximum.

For example, given boundaries [0, 100, 200], the distribution metric produces four values:

number of event values less than or equal to 0
number of event values greater than 0 and less than or equal to 100
number of event values greater than 100 and less than or equal to 200
number of event values greater than 200
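The bucketing rule listed above can be sketched as follows. DistributionSketch is a hypothetical module, and the sketch assumes the boundaries are sorted in ascending order; it is not taken from any reporter.

```elixir
defmodule DistributionSketch do
  # Returns one count per bucket: length(boundaries) + 1 counts in total.
  # Assumes `boundaries` is sorted in ascending order.
  def bucket_counts(values, boundaries) do
    initial = List.duplicate(0, length(boundaries) + 1)

    Enum.reduce(values, initial, fn value, counts ->
      # A value belongs to the first bucket whose upper boundary is >= value;
      # values above every boundary fall into the final overflow bucket.
      index =
        Enum.find_index(boundaries, fn boundary -> value <= boundary end) ||
          length(boundaries)

      List.update_at(counts, index, &(&1 + 1))
    end)
  end
end

DistributionSketch.bucket_counts([-5, 0, 50, 100, 150, 250, 300], [0, 100, 200])
```

With the boundaries [0, 100, 200] and the values shown, the four counts are [2, 2, 1, 2].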

Metric specifications

A metric specification is a data structure describing the metric - its name, type, name of the events aggregated by the metric, etc. The structure of a metric specification is relevant only to authors of reporters.

Metric specifications are created using one of the four functions: counter/2, sum/2, last_value/2 and distribution/2. Each of those functions returns a specification of metric of the corresponding type. The first argument to all these functions is the name of events which are aggregated by the metric. Event name might be represented as in Telemetry, i.e. as a list of atoms ([:http, :request]), or as a string of words joined by dots ("http.request").

Note: do not use data from external sources as metric or event names! Since they are converted to atoms, your application becomes vulnerable to atom leakage and might run out of memory.
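The two event name representations, and the reason for the warning above, can be illustrated with a small sketch. EventNameSketch is a hypothetical module and is not the library's actual implementation; it only shows what converting a dotted string to a list of atoms involves.

```elixir
defmodule EventNameSketch do
  # A list of atoms is already in Telemetry's own representation.
  def normalize(name) when is_list(name), do: name

  # A dotted string is split into segments and each segment becomes an atom.
  # String.to_atom/1 creates new atoms, and atoms are never garbage collected,
  # which is why names must not come from external, untrusted sources.
  def normalize(name) when is_binary(name) do
    name
    |> String.split(".")
    |> Enum.map(&String.to_atom/1)
  end
end

EventNameSketch.normalize("http.request")
EventNameSketch.normalize([:http, :request])
```

Both calls produce [:http, :request].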

The second argument is a list of options. Below is the description of the options common to all metric types:

:name - the metric name. Metric name can be represented in the same way as event name. Defaults to event name given as first argument;
:tags - tags by which aggregations will be broken down. Defaults to an empty list;
:metadata - determines what part of event metadata is used as the source of tag values. Defaults to the value of :tags, or to an empty list if :tags is not set. There are three possible values of this option:
- :all - all event metadata is used;
- list of terms, e.g. [:table, :kind] - only these keys from the event metadata are used;
- one-argument function taking the event metadata and returning the metadata which should be used to generate tag values;
:description - human-readable description of the metric. Might be used by reporters for documentation purposes. Defaults to nil;
:unit - an atom describing the unit of event values. Might be used by reporters for documentation purposes. Defaults to :unit.
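The three forms of the :metadata option can be read as three ways of selecting the source of tag values from event metadata. The sketch below assumes event metadata is a map; MetadataOptionSketch is a hypothetical module written for illustration, not code from the library.

```elixir
defmodule MetadataOptionSketch do
  # :all - the whole event metadata is used.
  def tag_source(metadata, :all), do: metadata

  # A list of keys - only those keys are taken from the event metadata.
  def tag_source(metadata, keys) when is_list(keys), do: Map.take(metadata, keys)

  # A one-argument function - it receives the event metadata and returns
  # the metadata which should be used to generate tag values.
  def tag_source(metadata, fun) when is_function(fun, 1), do: fun.(metadata)
end

metadata = %{table: "users", kind: :select, query: "SELECT 1"}

MetadataOptionSketch.tag_source(metadata, :all)
MetadataOptionSketch.tag_source(metadata, [:table, :kind])
MetadataOptionSketch.tag_source(metadata, fn m -> %{table: String.upcase(m.table)} end)
```

The function form is useful when tag values need to be derived from the metadata rather than copied from it.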

Reporters

Reporters take metric definitions as an input, subscribe to relevant events and update the metrics when the events are emitted. Updating the metric might involve publishing the metrics periodically, or on demand, to external systems. Telemetry.Metrics defines only the specification of metric types, and reporters should provide the actual implementation of these aggregations.

Rationale

The design proposed by Telemetry.Metrics might look controversial - unlike most of the libraries available on the BEAM, it doesn't aggregate metrics itself; it merely defines what users should expect when using the reporters. There are two arguments for this solution.

First, if Telemetry.Metrics aggregated metrics itself, the way those aggregations work would be imposed on the system the metrics are published to. For example, counters in StatsD are reset on every flush and can be decremented, whereas counters in Prometheus are monotonically increasing. Telemetry.Metrics doesn't focus on those details - instead, it describes what the end user (the operator) expects to see when using a metric of a particular type. This implies that in most cases aggregated metrics won't be visible inside the BEAM, but in exchange aggregations can be implemented in the way that makes most sense for the particular system.

Finally, one could also implement an in-VM "reporter" which would aggregate the metrics and expose them inside the BEAM. When there is a need to swap reporters, and both reporters follow the metric types specification, the end result of aggregation is the same, regardless of the backend system in use.

Requirements for reporters

Reporters should accept metric specifications and subscribe to relevant events. When those events are emitted, the metric should be updated (either in memory or by contacting an external system) in such a way that the user is able to view metric values as described in the "Metric types" section.

If the reporter does not support the metric given to it, it should log a warning.

Reporters should also document how Telemetry.Metrics metric types, names and tags are translated to metric types and identifiers in the system they publish metrics to.
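As a rough illustration of these requirements, the sketch below keeps counter values in memory using an Agent. Everything in it is an assumption made for the example: the InMemoryCounterSketch module, the shape of the config map, and the four-argument handler (event name, value, metadata, config). A real reporter would attach such a handler to the relevant events with Telemetry; that step is left out so the sketch needs nothing beyond the standard library.

```elixir
defmodule InMemoryCounterSketch do
  # Starts an Agent holding a map of {metric_name, tag_values} => count.
  def start_link, do: Agent.start_link(fn -> %{} end)

  # Handler-shaped function. The event value is ignored because a counter
  # only counts emitted events. The config map (hypothetical) carries the
  # Agent pid, the metric name and the tags to break the count down by.
  def handle_event(_event_name, _value, metadata, %{agent: agent, name: name, tags: tags}) do
    tag_values = Map.take(metadata, tags)

    Agent.update(agent, fn counts ->
      Map.update(counts, {name, tag_values}, 1, &(&1 + 1))
    end)
  end

  # Lets the user view the current metric values.
  def counts(agent), do: Agent.get(agent, & &1)
end

{:ok, agent} = InMemoryCounterSketch.start_link()
config = %{agent: agent, name: [:http, :request], tags: [:controller]}

# Calling the handler directly stands in for events emitted through Telemetry.
InMemoryCounterSketch.handle_event([:http, :request], 1, %{controller: "user_controller", action: "index"}, config)
InMemoryCounterSketch.handle_event([:http, :request], 1, %{controller: "user_controller", action: "create"}, config)

InMemoryCounterSketch.counts(agent)
```

Because only :controller is used as a tag here, both events land in the same aggregation, whose count is 2.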

Summary

Types

counter_options()

description()

distribution_options()

event_name()

last_value_options()

metadata()

metric_name()

metric_option()

metric_type()

normalized_metric_name()

sum_options()

t()

Common fields for metric specifications

tag()

tags()

unit()

Functions

counter(event_name, options \\ [])

Returns a specification of counter metric

distribution(event_name, options)

Returns a specification of distribution metric

last_value(event_name, options \\ [])

Returns a specification of last value metric

sum(event_name, options \\ [])

Returns a specification of sum metric

Types

counter_options()

counter_options() :: [metric_option()]

description()

description() :: nil | String.t()

distribution_options()

distribution_options() :: [
  metric_option() | {:buckets, Telemetry.Metrics.Distribution.buckets()}
]

event_name()

event_name() :: String.t() | :telemetry.event_name()

last_value_options()

last_value_options() :: [metric_option()]

metadata()

metadata() ::
  :all
  | [key :: term()]
  | (:telemetry.event_metadata() -> :telemetry.event_metadata())

metric_name()

metric_name() :: String.t() | normalized_metric_name()

metric_option()

metric_option() ::
  {:name, metric_name()}
  | {:metadata, metadata()}
  | {:tags, tags()}
  | {:description, description()}
  | {:unit, unit()}

metric_type()

metric_type() :: :counter | :sum | :last_value | :distribution

normalized_metric_name()

normalized_metric_name() :: [atom(), ...]

sum_options()

sum_options() :: [metric_option()]

t()

t() :: %module(){
  name: normalized_metric_name(),
  event_name: :telemetry.event_name(),
  metadata: (:telemetry.event_metadata() -> :telemetry.event_metadata()),
  tags: tags(),
  description: description(),
  unit: unit()
}

Common fields for metric specifications

Reporters should assume that these fields are present in all metric specifications.

tag()

tag() :: term()

tags()

tags() :: [tag()]

unit()

unit() :: atom()

Functions

counter(event_name, options \\ [])

counter(event_name(), counter_options()) :: Telemetry.Metrics.Counter.t()

Returns a specification of counter metric.

See "Metric specifications" section in the top-level documentation of this module for more information.

Example

counter(
  "http.request",
  metadata: [:controller, :action],
  tags: [:controller, :action]
)

distribution(event_name, options)

distribution(event_name(), distribution_options()) ::
  Telemetry.Metrics.Distribution.t()

Returns a specification of distribution metric.

For a distribution metric, it is required that you include a :buckets field in the options keyword list.

See "Metric specifications" section in the top-level documentation of this module for more information.

Example

distribution(
  "http.request",
  buckets: [100, 200, 300],
  tags: [:controller, :action]
)

last_value(event_name, options \\ [])

last_value(event_name(), last_value_options()) ::
  Telemetry.Metrics.LastValue.t()

Returns a specification of last value metric.

See "Metric specifications" section in the top-level documentation of this module for more information.

Example

last_value(
  "vm.memory.total",
  description: "Total amount of memory allocated by the Erlang VM",
  unit: :byte
)

sum(event_name, options \\ [])

sum(event_name(), sum_options()) :: Telemetry.Metrics.Sum.t()

Returns a specification of sum metric.

See "Metric specifications" section in the top-level documentation of this module for more information.

Example

sum("user.session_count.change", name: "user.session_count", metadata: [:role], tags: [:role])
