telemetry_metrics_statsd v0.3.0 TelemetryMetricsStatsd

Telemetry.Metrics reporter for StatsD-compatible metric servers.

To use it, start the reporter with the start_link/1 function, providing it a list of Telemetry.Metrics metric definitions:

import Telemetry.Metrics

TelemetryMetricsStatsd.start_link(
  metrics: [
    counter("http.request.count"),
    sum("http.request.payload_size"),
    last_value("vm.memory.total")
  ]
)

Note that in a real project the reporter should be started under a supervisor, e.g. the main supervisor of your application.

By default the reporter sends metrics to localhost:8125 - both hostname and port number can be configured using the :host and :port options.
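
As a rough sketch of both notes above, the reporter could be placed in an application's supervision tree with a custom host and port. The module name MyApp.Application, the host "statsd.example.com" and the port 9125 are placeholders chosen for illustration, and the snippet assumes that telemetry_metrics and telemetry_metrics_statsd are listed as dependencies of the project:

```elixir
defmodule MyApp.Application do
  # Hypothetical application module; the module name, host and port are
  # placeholders, not values taken from the library.
  use Application
  import Telemetry.Metrics

  @impl true
  def start(_type, _args) do
    children = [
      # Relies on the reporter's child spec; options are passed as a keyword list.
      {TelemetryMetricsStatsd,
       metrics: metrics(), host: "statsd.example.com", port: 9125}
    ]

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end

  # The same metric definitions as in the example above.
  defp metrics do
    [
      counter("http.request.count"),
      sum("http.request.payload_size"),
      last_value("vm.memory.total")
    ]
  end
end
```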

Note that the reporter doesn't aggregate metrics in-process - it sends metric updates to StatsD whenever a relevant Telemetry event is emitted.

Translation between Telemetry.Metrics and StatsD

In this section we walk through how the Telemetry.Metrics metric definitions are mapped to StatsD metrics and their types at runtime.

Telemetry.Metrics metric names are translated as follows:

  • if the metric name was provided as a string, e.g. "http.request.count", it is sent to the StatsD server as-is
  • if the metric name was provided as a list of atoms, e.g. [:http, :request, :count], it is first converted to a string by joining the segments with dots. In this example, the StatsD metric name would be "http.request.count" as well
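
The two rules can be illustrated with a few lines of plain Elixir. NameSketch and to_statsd_name/1 are hypothetical names used only for this illustration; this is not the reporter's actual code:

```elixir
defmodule NameSketch do
  # Hypothetical helper, not part of TelemetryMetricsStatsd.

  # A string name is used as-is.
  def to_statsd_name(name) when is_binary(name), do: name

  # A list of atoms is joined with dots.
  def to_statsd_name(name) when is_list(name), do: Enum.join(name, ".")
end

# Both calls print http.request.count
IO.puts(NameSketch.to_statsd_name("http.request.count"))
IO.puts(NameSketch.to_statsd_name([:http, :request, :count]))
```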

Since there are multiple implementations of StatsD and each of them provides a slightly different set of features, other aspects of metric translation are controlled by the formatters. The formatter can be selected using the :formatter option. Currently only two formats are supported - :standard and :datadog.

The following table shows how Telemetry.Metrics metrics map to StatsD metrics:

Telemetry.Metrics  StatsD
-----------------  ----------------------------------------------------
last_value         gauge, always set to an absolute value
counter            counter, always increased by 1
sum                gauge, increased and decreased by the provided value
summary            timer recording individual measurement
histogram          Reported as histogram if DataDog formatter is used

The standard StatsD formatter

The :standard formatter is compatible with the Etsy implementation of StatsD. Since this particular implementation doesn't support explicit tags, tag values are appended as consecutive segments of the metric name. For example, given the definition

counter("db.query.count", tags: [:table, :operation])

and the event

:telemetry.execute([:db, :query], %{}, %{table: "users", operation: "select"})

the StatsD metric name would be "db.query.count.users.select". Note that the tag values are appended to the base metric name in the order they were declared in the metric definition.
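
A minimal sketch of that naming rule, using only the standard library. StandardTagsSketch and tagged_name/3 are made-up names for illustration and do not come from the library:

```elixir
defmodule StandardTagsSketch do
  # Hypothetical helper: appends tag values to the base metric name in the
  # order the tag keys were declared, not in the order of the metadata map.
  def tagged_name(name, tag_keys, metadata) do
    values = Enum.map(tag_keys, fn key -> to_string(Map.fetch!(metadata, key)) end)
    Enum.join([name | values], ".")
  end
end

# Prints db.query.count.users.select
IO.puts(
  StandardTagsSketch.tagged_name(
    "db.query.count",
    [:table, :operation],
    %{operation: "select", table: "users"}
  )
)
```

Swapping the declared tag order to [:operation, :table] would yield "db.query.count.select.users" in this sketch, which is why the declaration order matters.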

Another important aspect of the standard formatter is that all measurements are converted to integers, i.e. no floats are ever sent to the StatsD daemon.

Now to the metric types!

Counter

A Telemetry.Metrics counter is simply represented as a StatsD counter. Each event the metric is based on increments the counter by 1. To be more concrete, given the metric definition

counter("http.request.count")

and the event

:telemetry.execute([:http, :request], %{duration: 120})

the following line would be sent to StatsD

"http.request.count:1|c"

Note that the counter was bumped by 1, regardless of the measurements included in the event (a careful reader will notice that the :count measurement we chose for the metric wasn't present in the map of measurements at all!). This behaviour conforms to the specification of a counter as defined by the Telemetry.Metrics package: a counter should be incremented by 1 every time a given event is dispatched.

Last value

The last value metric is represented as a StatsD gauge, whose value is always set to the value of the measurement from the most recent event. With the following metric definition

last_value("vm.memory.total")

and the event

:telemetry.execute([:vm, :memory], %{total: 1024})

the following metric update would be sent to StatsD

"vm.memory.total:1024|g"

Sum

The sum metric is also represented as a gauge - the difference is that it always changes relatively and is never set to an absolute value. Given the metric definition below

sum("http.request.payload_size")

and the event

:telemetry.execute([:http, :request], %{payload_size: 1076})

the following line would be sent to StatsD

"http.request.payload_size:+1076|g"

When the measurement is negative, the StatsD gauge is decreased accordingly.

Summary

The summary is simply represented as a StatsD timer, since it should generate statistics about gathered measurements. Given the metric definition below

summary("http.request.duration")

and the event

:telemetry.execute([:http, :request], %{duration: 120})

the following line would be sent to StatsD

"http.request.duration:120|ms"

Distribution

There is no metric in the original StatsD implementation equivalent to the Telemetry.Metrics distribution. However, histograms can be enabled for selected timer metrics in the StatsD daemon configuration. Because of that, the distribution is also reported as a timer. For example, given the following metric definition

distribution("http.request.duration", buckets: [0])

and the event

:telemetry.execute([:http, :request], %{duration: 120})

the following line would be sent to StatsD

"http.request.duration:120|ms"

Since histograms are configured on the StatsD server side, the :buckets option has no effect when used with this reporter.
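
The five mappings described for the standard formatter can be summarised in one small function. This is only a sketch: StandardLineSketch and line/3 are hypothetical names, and rounding floats with round/1 is an assumption, since the text above only states that measurements are converted to integers:

```elixir
defmodule StandardLineSketch do
  # Hypothetical helper, not the library's code. The type atoms mirror the
  # Telemetry.Metrics function names used in this section.

  # A counter ignores the measurement and always increments by 1.
  def line(:counter, name, _value), do: "#{name}:1|c"

  # Last value sets the gauge to an absolute value.
  def line(:last_value, name, value), do: "#{name}:#{to_int(value)}|g"

  # Sum changes the gauge relatively, so the value carries an explicit sign.
  def line(:sum, name, value) do
    int = to_int(value)
    sign = if int >= 0, do: "+", else: "-"
    "#{name}:#{sign}#{abs(int)}|g"
  end

  # Summary and distribution are both reported as timers.
  def line(type, name, value) when type in [:summary, :distribution],
    do: "#{name}:#{to_int(value)}|ms"

  # Assumption: floats are rounded to the nearest integer.
  defp to_int(value) when is_float(value), do: round(value)
  defp to_int(value) when is_integer(value), do: value
end

# Prints http.request.payload_size:+1076|g
IO.puts(StandardLineSketch.line(:sum, "http.request.payload_size", 1076))
```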

The DataDog formatter

The DataDog formatter is compatible with DogStatsD, the DataDog StatsD service bundled with its agent.

Tags

The main difference from the standard formatter is that DataDog supports explicit tagging in its protocol. Using the same example as with the standard formatter, given the following definition

counter("db.query.count", tags: [:table, :operation])

and the event

:telemetry.execute([:db, :query], %{}, %{table: "users", operation: "select"})

the metric update packet sent to StatsD would be db.query.count:1|c|#table:users,operation:select.
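
For comparison with the standard formatter, the tagged counter line can be sketched as follows. DatadogTagsSketch and counter_line/3 are hypothetical names for illustration only; the output format follows the packet shown above:

```elixir
defmodule DatadogTagsSketch do
  # Hypothetical helper: builds a counter line with key:value tags appended
  # after "|#", in the order the tag keys were declared.
  def counter_line(name, tag_keys, metadata) do
    tags =
      Enum.map_join(tag_keys, ",", fn key ->
        "#{key}:#{Map.fetch!(metadata, key)}"
      end)

    # Without tags, the tag section is left out entirely.
    if tags == "", do: "#{name}:1|c", else: "#{name}:1|c|#" <> tags
  end
end

# Prints db.query.count:1|c|#table:users,operation:select
IO.puts(
  DatadogTagsSketch.counter_line(
    "db.query.count",
    [:table, :operation],
    %{table: "users", operation: "select"}
  )
)
```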

Metric types

The only difference between DataDog and standard StatsD metric types is that DataDog provides a dedicated histogram metric. That's why a Telemetry.Metrics distribution is translated to a DataDog histogram.

Also note that DataDog allows measurements to be floats, so no rounding is performed when formatting the metric.

Global tags

The library provides an option to specify a set of global tag values, which are available to all metrics running under the reporter.

For example, if you're running your application in multiple deployment environments (staging, production, etc.), you might set the environment as a global tag:

TelemetryMetricsStatsd.start_link(
  metrics: [
    counter("http.request.count", tags: [:env])
  ],
  global_tags: [env: "prod"]
)

Note that if the global tag is to be sent with the metric, the metric needs to have it listed under the :tags option, just like any other tag.

Also, if the same key is configured as a global tag and emitted as a part of event metadata or returned by the :tag_values function, the metadata/:tag_values take precedence and override the global tag value.
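
The two rules above (a tag must be listed under :tags, and event values override global ones) can be modelled with a pair of map operations. GlobalTagsSketch and tag_values/3 are hypothetical names; this sketch leaves out the :tag_values function and is not the reporter's implementation:

```elixir
defmodule GlobalTagsSketch do
  # Hypothetical helper: resolves the tag values sent for a single event.
  def tag_values(tag_keys, global_tags, metadata) do
    global_tags
    |> Map.new()
    # Event metadata wins when the same key is present in both.
    |> Map.merge(metadata)
    # Only tags listed in the metric's :tags option are kept.
    |> Map.take(tag_keys)
  end
end

# Prints %{env: "prod"}
IO.inspect(GlobalTagsSketch.tag_values([:env], [env: "prod"], %{}))

# Prints %{env: "staging"}
IO.inspect(GlobalTagsSketch.tag_values([:env], [env: "prod"], %{env: "staging"}))
```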

Prefixing metric names

Sometimes it's convenient to prefix all metric names with a particular value, to group them by the name of the service, the host, or something else. You can use the :prefix option to provide a prefix which will be prepended to all metrics published by the reporter (regardless of the formatter used).

Maximum datagram size

Metrics are sent to StatsD over UDP, so it's important that the size of the datagram does not exceed the Maximum Transmission Unit, or MTU, of the link, so that no data is lost on the way. By default the reporter will break up the datagrams at 512 bytes, but this is configurable via the :mtu option.
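
To make the size limit concrete, here is a sketch of how metric lines might be grouped into datagrams no larger than the configured MTU. It assumes that several metric lines share one datagram separated by newlines, as multi-metric StatsD packets commonly are; PacketSketch and pack/2 are hypothetical names, and the reporter's real batching logic may differ:

```elixir
defmodule PacketSketch do
  # Hypothetical helper: groups metric lines into newline-separated packets
  # whose byte size does not exceed `mtu`. A single line longer than `mtu`
  # still ends up in a packet of its own.
  def pack(lines, mtu) do
    lines
    |> Enum.reduce([], fn
      # The line still fits into the most recent packet (plus one newline).
      line, [current | rest] when byte_size(current) + 1 + byte_size(line) <= mtu ->
        [current <> "\n" <> line | rest]

      # Otherwise start a new packet.
      line, packets ->
        [line | packets]
    end)
    |> Enum.reverse()
  end
end

# Three 5-byte lines with an 11-byte limit yield two packets.
IO.inspect(PacketSketch.pack(["a:1|c", "b:1|c", "c:1|c"], 11))
```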

Types

option()
option() ::
  {:port, :inet.port_number()}
  | {:host, String.t()}
  | {:metrics, [Telemetry.Metrics.t()]}
  | {:mtu, non_neg_integer()}
  | {:prefix, prefix()}
  | {:formatter, :standard | :datadog}
  | {:global_tags, Keyword.t()}

options()
options() :: [option()]

prefix()
prefix() :: String.t() | nil

Functions

child_spec/1

Reporter's child spec.

This function allows you to start the reporter under a supervisor like this:

children = [
  {TelemetryMetricsStatsd, options}
]

See start_link/1 for a list of available options.

start_link/1

Starts a reporter and links it to the calling process.

The available options are:

  • :metrics - a list of Telemetry.Metrics metric definitions which will be published by the reporter
  • :host - hostname of the StatsD server. Defaults to "localhost".
  • :port - port number of the StatsD server. Defaults to 8125.
  • :formatter - determines the format of the metrics sent to the target server. Can be either :standard or :datadog. Defaults to :standard.
  • :prefix - a prefix prepended to the name of each metric published by the reporter. Defaults to nil.
  • :mtu - Maximum Transmission Unit of the link between your application and the StatsD server in bytes. This value should not be greater than the actual MTU since this could lead to data loss when the metrics are published. Defaults to 512.
  • :global_tags - Additional default tag values to be sent along with every published metric. These can be overridden by tags sent via the :telemetry.execute call.

You can read more about all the options in the TelemetryMetricsStatsd module documentation.

Example

import Telemetry.Metrics

TelemetryMetricsStatsd.start_link(
  metrics: [
    counter("http.request.count"),
    sum("http.request.payload_size"),
    last_value("vm.memory.total")
  ],
  prefix: "my-service"
)