telemetry_poller (telemetry_poller v1.0.0)

A time-based poller to periodically dispatch Telemetry events.

A poller is a process started in your supervision tree that periodically performs a list of measurements. On start it expects the period in milliseconds and the list of measurements to perform:

  telemetry_poller:start_link([
    {measurements, Measurements},
    {period, Period}
  ])

The following measurements are supported:

* memory (default)

* total_run_queue_lengths (default)

* system_counts (default)

* {process_info, Proplist}

* {Module, Function, Args}

We will discuss each measurement in detail. Also note that the telemetry_poller application ships with a built-in poller that measures memory, total_run_queue_lengths and system_counts. This takes the VM measurement out of the way so your application can focus on what is specific to its behaviour.

Memory

An event emitted as [vm, memory]. The measurement includes all the key-value pairs returned by the erlang:memory/0 function, e.g. total for total memory, processes_used for memory used by all processes, etc.
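To make the shape of this event concrete, here is a minimal sketch of a handler that logs two of the memory measurements. The module name vm_memory_logger and the handler id are made up for illustration, and the sketch assumes the measurements arrive as a map keyed like the result of erlang:memory/0.

```erlang
-module(vm_memory_logger).
-export([attach/0, handle_event/4]).

%% Attach a handler to the [vm, memory] event emitted by the poller.
%% The handler id <<"vm-memory-logger">> is an arbitrary, hypothetical name.
attach() ->
    telemetry:attach(
        <<"vm-memory-logger">>,
        [vm, memory],
        fun ?MODULE:handle_event/4,
        #{}
    ).

%% Assumption: Measurements is a map with the same keys as erlang:memory/0.
handle_event([vm, memory], Measurements, _Metadata, _Config) ->
    Total = maps:get(total, Measurements),
    ProcessesUsed = maps:get(processes_used, Measurements),
    io:format("total=~p bytes, processes_used=~p bytes~n",
              [Total, ProcessesUsed]).
```

Calling vm_memory_logger:attach() once, for example during application start, should be enough for the handler to run on every poll.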

Total run queue lengths

On startup, the Erlang VM starts many schedulers to do both IO and CPU work. If a process needs to do some work or wait on IO, it is allocated to the appropriate scheduler. The run queue is a queue of tasks to be scheduled. The length of a run queue corresponds to the amount of work accumulated in the system. If the run queue length keeps growing, the BEAM is not keeping up with executing all the tasks.

There are several run queue types in the Erlang VM. Each CPU scheduler (usually one per core) has its own run queue, and since Erlang 20.0 there is one dirty CPU run queue, and one dirty IO run queue.

The run queue length event is emitted as [vm, total_run_queue_lengths]. The event contains no metadata and three measurements:

  • total - a sum of all run queue lengths
  • cpu - a sum of CPU schedulers' run queue lengths, including dirty CPU run queue length on Erlang version 20 and greater
  • io - length of dirty IO run queue. It's always 0 if running on Erlang versions prior to 20.

Note that the method of making this measurement varies between different Erlang versions: the implementation on versions earlier than Erlang/OTP 20 is less efficient.

The length of all queues is not gathered atomically, so the event value does not represent a consistent snapshot of the run queues' state. However, the value is accurate enough to help to identify issues in a running system.
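As an illustration of where these three values can come from, the sketch below derives them from erlang:statistics/1 on Erlang/OTP 20 or later. It is not necessarily how the poller itself is implemented, and the module name run_queue_sketch is hypothetical.

```erlang
-module(run_queue_sketch).
-export([measure/0]).

%% Requires Erlang/OTP 20+ for total_run_queue_lengths_all.
measure() ->
    %% Normal, dirty CPU and dirty IO run queues combined.
    Total = erlang:statistics(total_run_queue_lengths_all),
    %% Normal and dirty CPU run queues only.
    Cpu = erlang:statistics(total_run_queue_lengths),
    %% The two reads are separate calls, so the difference is only
    %% an approximation of the dirty IO run queue length.
    #{total => Total, cpu => Cpu, io => Total - Cpu}.
```

Because the two statistics are read one after another, the result shows the same lack of atomicity described above.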

System counts

An event emitted as [vm, system_counts]. The event contains no metadata and three measurements:

  • process_count - number of processes currently existing at the local node
  • atom_count - number of atoms currently existing at the local node
  • port_count - number of ports currently existing at the local node

All three measurements are from erlang:system_info/1.
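For reference, the same three values can be read by hand with erlang:system_info/1. The sketch below collects them into a map; the module name system_counts_sketch is made up for this example.

```erlang
-module(system_counts_sketch).
-export([measure/0]).

%% Collects the three counters named above into a single map.
measure() ->
    #{
        process_count => erlang:system_info(process_count),
        atom_count => erlang:system_info(atom_count),
        port_count => erlang:system_info(port_count)
    }.
```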

Process info

A measurement with information about a given process. It must be specified alongside a proplist with the process name, the event name, and a list of keys to be included:

  {process_info, [
   {name, my_app_worker},
   {event, [my_app, worker]},
   {keys, [message_queue_len, memory]}
  ]}

keys is a list of atoms accepted by erlang:process_info/2.
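To show what such a measurement boils down to, here is a minimal sketch that looks up a registered process and reads the requested keys with erlang:process_info/2. The module process_info_sketch is hypothetical, and the sketch makes no claim about how the poller handles a missing process.

```erlang
-module(process_info_sketch).
-export([measure/2]).

%% Returns the requested keys for the process registered as Name
%% as a map, or undefined if there is no such process.
measure(Name, Keys) ->
    case erlang:whereis(Name) of
        undefined ->
            undefined;
        Pid when is_pid(Pid) ->
            case erlang:process_info(Pid, Keys) of
                %% The process exited between whereis/1 and process_info/2.
                undefined -> undefined;
                Info -> maps:from_list(Info)
            end
    end.
```

For the proplist shown above, process_info_sketch:measure(my_app_worker, [message_queue_len, memory]) would return a map with the keys message_queue_len and memory while the worker is alive.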

Custom measurements

Telemetry poller also allows you to perform custom measurements by passing a module-function-args tuple:

  {my_app_example, measure, []}

The given function is invoked periodically and must explicitly call telemetry:execute/3. If the invocation of the MFA fails, the measurement is removed from the poller.

For all options, see start_link/1. The options listed there can be given to the default poller as well as to custom pollers.

Default poller

A default poller is started with the telemetry_poller application. It is responsible for emitting the memory, total_run_queue_lengths and system_counts measurements. You can customize the behaviour of the default poller by setting the default key under the telemetry_poller application environment. Setting it to false disables the poller.
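As a sketch, the application environment could be set in a sys.config file like the one below. The period and the reduced measurement list are arbitrary example values.

```erlang
%% sys.config (example values only)
[
  {telemetry_poller, [
    %% Poll every 10 seconds and measure memory only.
    {default, [
      {period, 10000},
      {measurements, [memory]}
    ]}
    %% To disable the default poller instead, use:
    %% {default, false}
  ]}
].
```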

Example - tracking the number of active sessions in a web application

Let's imagine that you have a web application and you would like to periodically measure the number of active user sessions.

  -module(example_app).
  -export([session_count/0]).

  session_count() ->
     % Logic for calculating the session count goes here;
     % 0 is a placeholder so that the module compiles.
     0.

To achieve that, we need a measurement dispatching the value we're interested in:

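  -module(example_app_measurements).
  -export([dispatch_session_count/0]).

  dispatch_session_count() ->
     % Measurements are passed as a map; this event carries no metadata.
     telemetry:execute([example_app, session_count], #{count => example_app:session_count()}, #{}).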

and tell the Poller to invoke it periodically:

  telemetry_poller:start_link([{measurements, [{example_app_measurements, dispatch_session_count, []}]}]).

If you find that you need to label the event values, e.g. to differentiate between the number of sessions of regular and admin users, you could use event metadata:

  -module(example_app_measurements).
  -export([dispatch_session_count/0]).

  dispatch_session_count() ->
     Regulars = example_app:regular_users_session_count(),
     Admins = example_app:admin_users_session_count(),
     telemetry:execute([example_app, session_count], #{count => Admins}, #{role => admin}),
     telemetry:execute([example_app, session_count], #{count => Regulars}, #{role => regular}).

Note: the other solution would be to dispatch two different events by hooking up the example_app:regular_users_session_count/0 and example_app:admin_users_session_count/0 functions directly. However, if you add more and more user roles to your app, you'll find yourself creating a new event for each one of them, which will force you to modify existing event handlers. If you can break down the event value by some feature, like user role in this example, it's usually better to use event metadata than to add new events.

This is a perfect use case for the poller, because you don't need to write a dedicated process which would call these functions periodically. Additionally, if you find that you need to collect more statistics like this in the future, you can easily hook them up to the same poller process and avoid creating lots of processes which would stay idle most of the time.
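To illustrate why metadata keeps handlers stable, here is a minimal sketch of a single handler that serves every role by reading it from the metadata. The module session_count_handler and its handler id are hypothetical names.

```erlang
-module(session_count_handler).
-export([attach/0, handle_event/4]).

%% Attach one handler for the session count event, whatever the role.
%% The handler id is an arbitrary name chosen for this sketch.
attach() ->
    telemetry:attach(
        <<"session-count-logger">>,
        [example_app, session_count],
        fun ?MODULE:handle_event/4,
        #{}
    ).

%% The role comes from the event metadata, so adding a new role
%% to the application does not require changing this function.
handle_event([example_app, session_count], #{count := Count}, #{role := Role}, _Config) ->
    io:format("~p sessions: ~p~n", [Role, Count]).
```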

Types

Specs

measurement() ::
    memory | total_run_queue_lengths | system_counts |
    {process_info, [{name, atom()} | {event, [atom()]} | {keys, [atom()]}]} |
    {module(), atom(), list()}.

Specs

option() ::
    {name, gen_server:name() | gen_server:server_name()} |
    {period, period()} |
    {measurements, [measurement()]}.

Specs

options() :: [option()].

Specs

period() :: pos_integer().

Specs

state() :: #{measurements => [measurement()], period => period()}.

Specs

t() :: gen_server:server().

Functions

Returns a child spec for the poller for running under a supervisor.

code_change(OldVsn, State, Extra)

handle_call(Request, From, State)

Specs

init(map()) -> {ok, state()}.

list_measurements(Poller)

Specs

list_measurements(t()) -> [measurement()].
Returns a list of measurements used by the poller.

make_measurement(Measurement)

Specs

make_measurement(measurement()) -> measurement() | no_return().

make_measurements_and_filter_misbehaving(Measurements)

Specs

make_measurements_and_filter_misbehaving([measurement()]) -> [measurement()].

Specs

parse_measurement(measurement()) -> {module(), atom(), list()}.

parse_measurements(Measurements)

Specs

parse_measurements([measurement()]) -> [{module(), atom(), list()}].

schedule_measurement(CollectInMillis)

Specs

schedule_measurement(non_neg_integer()) -> ok.

Specs

start_link(options()) -> gen_server:on_start().

Starts a poller linked to the calling process.

Useful for starting Pollers as a part of a supervision tree.

Default options: [{name, telemetry_poller}, {period, timer:seconds(5)}]
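
As a sketch of the supervision tree use case, the supervisor below starts a poller through start_link/1. The supervisor module, the poller name, the period and the measurement list are all hypothetical example values.

```erlang
-module(example_app_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Hypothetical options: a named poller that runs every 10 seconds.
    PollerOpts = [
        {name, example_app_poller},
        {period, 10000},
        {measurements, [
            memory,
            {example_app_measurements, dispatch_session_count, []}
        ]}
    ],
    %% Child spec that calls telemetry_poller:start_link/1 with the options.
    Poller = #{
        id => example_app_poller,
        start => {telemetry_poller, start_link, [PollerOpts]}
    },
    SupFlags = #{strategy => one_for_one, intensity => 1, period => 5},
    {ok, {SupFlags, [Poller]}}.
```

Giving the poller its own name keeps it distinct from the default poller, which uses the name telemetry_poller.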

terminate(Reason, State)

Specs

validate_period(term()) -> ok | no_return().