telemetry_metrics v0.2.1 Telemetry.Metrics

Common interface for defining metrics based on :telemetry events.

Metrics are aggregations of Telemetry events with a specific name, providing a view of the system's behaviour over time.

For example, to build a sum of the HTTP request payload sizes received by your system, you could define the following metric:

sum("http.request.payload_size")

This definition means that the metric is based on [:http, :request] events, and that it should sum up the values found under the :payload_size key in those events' measurements.

Telemetry.Metrics also supports breaking down the metric values by tags - this means that there will be a distinct metric for each unique set of selected tags found in event metadata:

sum("http.request.payload_size", tags: [:host, :method])

The above definition means that we want to keep track of the sum for each unique pair of request host and method (assuming that the :host and :method keys are present in the event's metadata).
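To make the per-tag breakdown concrete, here is a minimal sketch of the bookkeeping a reporter might do for such a metric. The module TagBreakdownSketch and its function sum_by_tags/3 are hypothetical names introduced for illustration and are not part of Telemetry.Metrics; events are modelled as plain {measurements, metadata} tuples.

```elixir
defmodule TagBreakdownSketch do
  # Hypothetical helper, not part of Telemetry.Metrics: folds a list of
  # {measurements, metadata} pairs into one running sum per tag combination.
  def sum_by_tags(events, measurement, tags) do
    Enum.reduce(events, %{}, fn {measurements, metadata}, acc ->
      # Only the selected tags identify the series; other metadata is dropped.
      tag_values = Map.take(metadata, tags)
      value = Map.fetch!(measurements, measurement)
      Map.update(acc, tag_values, value, &(&1 + value))
    end)
  end
end

events = [
  {%{payload_size: 120}, %{host: "a.example", method: "GET"}},
  {%{payload_size: 80}, %{host: "a.example", method: "GET"}},
  {%{payload_size: 300}, %{host: "a.example", method: "POST"}}
]

# One sum for the GET pair (200) and one for the POST pair (300).
IO.inspect(TagBreakdownSketch.sum_by_tags(events, :payload_size, [:host, :method]))
```

Each distinct combination of tag values becomes its own key, which is why a tag with many possible values leads to many separate series.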

There are four metric types provided by Telemetry.Metrics:

  • counter/2 which counts the total number of emitted events
  • sum/2 which keeps track of the sum of the selected measurement
  • last_value/2 which holds the value of the selected measurement from the most recent event
  • distribution/2 which builds a histogram of the selected measurement

Note that these metric definitions by themselves are not enough, as they only specify the expected end result. Subscribing to events and building the actual metrics is the responsibility of reporters (described in the "Reporters" section).

Metric definitions

A metric definition is a data structure describing the metric: its name, type, the name of the events aggregated by the metric, and so on. The structure of a metric definition is relevant only to authors of reporters.

Metric definitions are created using one of the four functions: counter/2, sum/2, last_value/2 and distribution/2. Each of those functions returns a metric definition of the corresponding type.

The first argument to all these functions is the metric name. It can be provided as a string (e.g. "http.request.latency") or a list of atoms ([:http, :request, :latency]). If not overridden in the metric options, the metric name also determines the name of the Telemetry event and the measurement used to produce metric values. In the "http.request.latency" example, the source event name is [:http, :request] and metric values are drawn from the :latency measurement.

Note: do not use data from external sources as metric or event names! Since they are converted to atoms, your application becomes vulnerable to atom leakage and might run out of memory.
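The default derivation of event name and measurement from a metric name can be sketched as follows. MetricNameSketch is a hypothetical module written for illustration; the library's own implementation may differ. It also shows where the atom conversion mentioned in the note happens.

```elixir
defmodule MetricNameSketch do
  # Hypothetical helper illustrating the default rule: the event name is all
  # but the last segment of the metric name, the measurement is the last one.
  def split(metric_name) when is_binary(metric_name) do
    metric_name
    |> String.split(".")
    # Every segment becomes an atom, so names must never come from
    # untrusted input: atoms are not garbage collected.
    |> Enum.map(&String.to_atom/1)
    |> split()
  end

  def split(metric_name) when is_list(metric_name) do
    {event_name, [measurement]} = Enum.split(metric_name, -1)
    {event_name, measurement}
  end
end

# Both forms give {[:http, :request], :latency}.
IO.inspect(MetricNameSketch.split("http.request.latency"))
IO.inspect(MetricNameSketch.split([:http, :request, :latency]))
```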

The second argument is a list of options. Below is the description of the options common to all metric types:

  • :event_name - the source event name. Can be represented either as a string (e.g. "http.request") or a list of atoms ([:http, :request]). By default the event name is all but the last segment of the metric name.
  • :measurement - the event measurement used as the source of the metric's values. By default it is the last segment of the metric name. It can be either an arbitrary term used as a key in the event's measurements map, or a function accepting the whole measurements map and returning the actual value to be used.
  • :tags - a subset of metadata keys by which aggregations will be broken down. Defaults to an empty list.
  • :tag_values - a function that receives the metadata and returns a map with the tags as keys and their respective values. Defaults to returning the metadata itself.
  • :description - human-readable description of the metric. Might be used by reporters for documentation purposes. Defaults to nil.
  • :unit - an atom describing the unit of selected measurement or a tuple indicating that a measurement should be converted from one unit to another before a metric is updated. Currently, only time unit conversions are supported. For example, setting this option to {:native, :millisecond} means that the measurements are provided in the :native time unit (you can read more about it in the documentation for System.convert_time_unit/3), but a metric should have its values in milliseconds. Both elements of the conversion tuple need to be of type time_unit/0.
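The :measurement and :unit options can be illustrated with a small sketch of how a reporter might resolve them for a single event. OptionSketch, extract/2 and convert/2 are hypothetical names, and the measurements map is made up for the example; only System.convert_time_unit/3 is a real standard library function.

```elixir
defmodule OptionSketch do
  # Hypothetical helper: resolves a :measurement option against the event's
  # measurements map. A one-argument function receives the whole map; any
  # other term is used as a key.
  def extract(measurements, fun) when is_function(fun, 1), do: fun.(measurements)
  def extract(measurements, key), do: Map.get(measurements, key)

  # Hypothetical helper: applies a {from, to} time unit conversion tuple.
  # A plain unit atom means the value is used as it is.
  def convert(value, {from, to}), do: System.convert_time_unit(value, from, to)
  def convert(value, _unit), do: value
end

# Assumed example event, with durations reported in microseconds.
measurements = %{duration: 2_500_000, queue_time: 500_000}

by_key = OptionSketch.extract(measurements, :duration)
by_fun = OptionSketch.extract(measurements, fn m -> m.duration + m.queue_time end)

IO.inspect(by_key)
IO.inspect(OptionSketch.convert(by_fun, {:microsecond, :millisecond}))
```

The same convert/2 call with {:native, :millisecond} would cover measurements taken with System.monotonic_time/0, whose unit depends on the runtime.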

Reporters

Reporters take metric definitions as input, subscribe to the relevant events and update the metrics when the events are emitted. Updating the metric might involve publishing the metrics periodically, or on demand, to external systems. Telemetry.Metrics defines only how metrics of a particular type should behave; reporters should provide the actual implementation of these aggregations.
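As a rough illustration of that division of labour, the sketch below keeps metric values in an Agent and updates them from a function that has the same four arguments as a :telemetry handler. Everything here is an assumption made for the example: ReporterSketch is a hypothetical module, and metric definitions are modelled as plain maps rather than the structs returned by this library. The handler is called directly so the example runs without any dependencies.

```elixir
defmodule ReporterSketch do
  # Takes event name, measurements, metadata and handler config, in the same
  # order as a :telemetry handler function.
  def handle_event(_event_name, measurements, _metadata, %{agent: agent, metrics: metrics}) do
    Enum.each(metrics, fn metric ->
      Agent.update(agent, &update(&1, metric, measurements))
    end)
  end

  # A counter ignores the measurement and only counts events.
  defp update(state, %{type: :counter, name: name}, _measurements),
    do: Map.update(state, name, 1, &(&1 + 1))

  defp update(state, %{type: :sum, name: name, measurement: key}, measurements),
    do: Map.update(state, name, measurements[key], &(&1 + measurements[key]))

  defp update(state, %{type: :last_value, name: name, measurement: key}, measurements),
    do: Map.put(state, name, measurements[key])
end

{:ok, agent} = Agent.start_link(fn -> %{} end)

config = %{
  agent: agent,
  metrics: [
    %{type: :counter, name: [:http, :request, :count]},
    %{type: :sum, name: [:http, :request, :payload_size], measurement: :payload_size}
  ]
}

# With the :telemetry package available, the handler would be attached roughly as
#   :telemetry.attach("sketch", [:http, :request], &ReporterSketch.handle_event/4, config)
# Here two events are simulated by calling it directly.
ReporterSketch.handle_event([:http, :request], %{payload_size: 120}, %{}, config)
ReporterSketch.handle_event([:http, :request], %{payload_size: 80}, %{}, config)

IO.inspect(Agent.get(agent, & &1))
```

A real reporter would also need to handle tags, unit conversion and publishing to an external system, all of which are omitted here.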

The Telemetry.Metrics package itself does not include any reporters.

Summary

Types

metric_name()

The name of the metric, either as a string or a list of atoms.

normalized_metric_name()

The name of the metric represented as a list of atoms.

t()

Common fields for metric specifications

Functions

counter/2

Returns a definition of counter metric.

distribution/2

Returns a definition of distribution metric.

last_value/2

Returns a definition of last value metric.

sum/2

Returns a definition of sum metric.

Types

counter_options()
counter_options() :: [metric_option()]

description()
description() :: nil | String.t()

distribution_options()
distribution_options() :: [
  metric_option() | {:buckets, Telemetry.Metrics.Distribution.buckets()}
]

last_value_options()
last_value_options() :: [metric_option()]

metric_name()

The name of the metric, either as a string or a list of atoms.

metric_option()
metric_option() ::
  {:event_name, :telemetry.event_name()}
  | {:measurement, measurement()}
  | {:tags, tags()}
  | {:tag_values, tag_values()}
  | {:description, description()}
  | {:unit, unit() | unit_conversion()}

normalized_metric_name()
normalized_metric_name() :: [atom(), ...]

The name of the metric represented as a list of atoms.

sum_options()
sum_options() :: [metric_option()]

t()
t() :: %module(){
  name: normalized_metric_name(),
  measurement: measurement(),
  event_name: :telemetry.event_name(),
  tags: tags(),
  tag_values: (:telemetry.event_metadata() -> :telemetry.event_metadata()),
  description: description(),
  unit: unit()
}

Common fields for metric specifications

Reporters should assume that these fields are present in all metric specifications.

time_unit()
time_unit() :: :second | :millisecond | :microsecond | :nanosecond | :native

unit_conversion()
unit_conversion() :: {time_unit(), time_unit()}

Functions

counter/2

Returns a definition of counter metric.

Counter metric keeps track of the total number of specific events emitted.

Note that for the counter metric it doesn't matter what measurement is selected, as it is ignored by reporters anyway.

See the "Metric definitions" section in the top-level documentation of this module for more information.

Example

counter(
  "http.request.count",
  tags: [:controller, :action]
)

distribution/2

Returns a definition of distribution metric.

A distribution metric builds a histogram of the selected measurement's values. Because of that, you are required to specify the histogram's buckets via the :buckets option.

For example, given buckets: [0, 100, 200], the distribution metric produces four values:

  • number of measurements less than or equal to 0
  • number of measurements greater than 0 and less than or equal to 100
  • number of measurements greater than 100 and less than or equal to 200
  • number of measurements greater than 200
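The bucket semantics listed above can be sketched in a few lines. BucketSketch and histogram/2 are hypothetical names used only for this illustration; an actual reporter is free to implement the aggregation differently.

```elixir
defmodule BucketSketch do
  # Hypothetical helper: returns one count per bucket. Bucket i holds values
  # greater than the previous boundary and less than or equal to boundary i;
  # the final, extra bucket holds everything above the last boundary.
  def histogram(values, buckets) do
    initial = List.duplicate(0, length(buckets) + 1)

    Enum.reduce(values, initial, fn value, counts ->
      # No matching boundary means the value belongs in the overflow bucket.
      index = Enum.find_index(buckets, &(value <= &1)) || length(buckets)
      List.update_at(counts, index, &(&1 + 1))
    end)
  end
end

# Prints [2, 2, 1, 2]: two values <= 0, two in (0, 100], one in (100, 200], two > 200.
IO.inspect(BucketSketch.histogram([-5, 0, 50, 100, 150, 250, 999], [0, 100, 200]))
```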

See the "Metric definitions" section in the top-level documentation of this module for more information.

Example

distribution(
  "http.request.latency",
  buckets: [100, 200, 300],
  tags: [:controller, :action]
)

last_value(metric_name, options \\ [])

Returns a definition of last value metric.

Last value keeps track of the selected measurement found in the most recent event.

See the "Metric definitions" section in the top-level documentation of this module for more information.

Example

last_value(
  "vm.memory.total",
  description: "Total amount of memory allocated by the Erlang VM",
  unit: :byte
)

sum/2

Returns a definition of sum metric.

Sum metric keeps track of the sum of selected measurement's values carried by specific events.

See the "Metric definitions" section in the top-level documentation of this module for more information.

Example

sum(
  "user.session_count",
  event_name: "user.session_count",
  measurement: :delta,
  tags: [:role]
)