telemetry_poller (telemetry_poller v1.1.0)
A time-based poller to periodically dispatch Telemetry events.
A poller is a process started in your supervision tree with a list of measurements to perform periodically. On start it expects the period in milliseconds and the list of measurements. The initial delay is an optional parameter that sets how long, in milliseconds, the poller waits before taking the first measurement:
telemetry_poller:start_link([
    {measurements, Measurements},
    {period, Period},
    {init_delay, InitDelay}
])
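As a rough illustration of running a poller in a supervision tree, the sketch below defines a supervisor with a single poller child. The module name my_app_sup, the poller name my_app_poller, and the chosen period and delay are hypothetical; only the option names and telemetry_poller:start_link/1 come from this page.

```erlang
-module(my_app_sup).
-behaviour(supervisor).

-export([start_link/0, init/1]).

start_link() ->
    supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init([]) ->
    %% Hypothetical values: poll every 10 seconds, first poll after 1 second.
    PollerOpts = [
        {name, my_app_poller},
        {measurements, [memory, total_run_queue_lengths]},
        {period, 10000},
        {init_delay, 1000}
    ],
    %% The child is started by calling telemetry_poller:start_link(PollerOpts).
    Poller = #{
        id => my_app_poller,
        start => {telemetry_poller, start_link, [PollerOpts]}
    },
    {ok, {#{strategy => one_for_one}, [Poller]}}.
```

An explicit name is passed here so that this poller does not reuse the default name listed under start_link/1.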
The following measurements are supported:
* memory (default)
* total_run_queue_lengths (default)
* system_counts (default)
* {process_info, Proplist}
* {Module, Function, Args}
We will discuss each measurement in detail. Also note that the telemetry_poller application ships with a built-in poller that measures memory, total_run_queue_lengths and system_counts. This takes the VM measurements out of the way so your application can focus on what is specific to its behaviour.
Memory
An event emitted as [vm, memory]. The measurement includes all the key-value pairs returned by the erlang:memory/0 function, e.g. total for total memory, processes_used for memory used by all processes, etc.
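To show how such an event might be consumed, here is a minimal sketch of a handler attached with telemetry:attach/4 from the telemetry library. The module name and handler id are hypothetical, and the sketch assumes the measurements arrive as a map keyed by the erlang:memory/0 names.

```erlang
-module(my_app_memory_handler).

-export([attach/0, handle_event/4]).

%% Attaches the handler to the [vm, memory] event.
%% The handler id is an arbitrary, hypothetical identifier.
attach() ->
    telemetry:attach(
        <<"my-app-memory-logger">>,
        [vm, memory],
        fun ?MODULE:handle_event/4,
        #{}
    ).

%% Prints two of the keys that erlang:memory/0 reports.
handle_event([vm, memory], Measurements, _Metadata, _Config) ->
    Total = maps:get(total, Measurements),
    ProcessesUsed = maps:get(processes_used, Measurements),
    io:format("total=~p bytes, processes_used=~p bytes~n", [Total, ProcessesUsed]).
```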
Total run queue lengths
On startup, the Erlang VM starts many schedulers to do both IO and CPU work. If a process needs to do some work or wait on IO, it is allocated to the appropriate scheduler. The run queue is a queue of tasks to be scheduled. The length of a run queue corresponds to the amount of work accumulated in the system. If a run queue length is constantly growing, the BEAM is not keeping up with executing all the tasks.
There are several run queue types in the Erlang VM. Each CPU scheduler (usually one per core) has its own run queue, and since Erlang/OTP 20.0 there is also one dirty CPU run queue and one dirty IO run queue.
The run queue length event is emitted as [vm, total_run_queue_lengths]. The event contains no metadata and three measurements:
* total - a sum of all run queue lengths
* cpu - a sum of CPU schedulers' run queue lengths, including dirty CPU run queue length on Erlang version 20 and greater
* io - length of dirty IO run queue. It's always 0 if running on Erlang versions prior to 20.
Note that the method of making this measurement varies between different Erlang versions: the implementation on versions earlier than Erlang/OTP 20 is less efficient.
The lengths of the queues are not gathered atomically, so the event value does not represent a consistent snapshot of the run queues' state. However, the value is accurate enough to help identify issues in a running system.
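The three measurements can be approximated with erlang:statistics/1, as in the sketch below. This is one plausible way to derive them on Erlang/OTP 20 or later and is not necessarily how the library computes them; the module name is hypothetical.

```erlang
-module(run_queue_sketch).

-export([lengths/0]).

%% Assumes Erlang/OTP 20 or later, where total_run_queue_lengths_all exists.
lengths() ->
    %% Normal, dirty CPU and dirty IO run queues together.
    Total = erlang:statistics(total_run_queue_lengths_all),
    %% Normal and dirty CPU run queues only.
    Cpu = erlang:statistics(total_run_queue_lengths),
    %% The two reads are separate calls, so this difference is an estimate,
    %% not a consistent snapshot.
    #{total => Total, cpu => Cpu, io => Total - Cpu}.
```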
System counts
An event emitted as [vm, system_counts]. The event contains no metadata and three measurements:
* process_count - number of processes currently existing at the local node
* atom_count - number of atoms currently existing at the local node
* port_count - number of ports currently existing at the local node
All three measurements are from erlang:system_info/1.
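For reference, the same three values can be read directly in a shell or module. The sketch below only illustrates the underlying erlang:system_info/1 calls; the module name is hypothetical, and atom_count assumes Erlang/OTP 20 or later.

```erlang
-module(system_counts_sketch).

-export([counts/0]).

%% Reads the three counters named in the [vm, system_counts] event.
counts() ->
    #{
        process_count => erlang:system_info(process_count),
        atom_count => erlang:system_info(atom_count),
        port_count => erlang:system_info(port_count)
    }.
```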
Process info
A measurement with information about a given process. It must be specified alongside a proplist with the process name, the event name, and a list of keys to be included:
{process_info, [
    {name, my_app_worker},
    {event, [my_app, worker]},
    {keys, [message_queue_len, memory]}
]}
The keys option is a list of atoms accepted by erlang:process_info/2.
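To get a feel for the values such a measurement carries, the sketch below looks up a registered process and reads the requested keys with erlang:process_info/2. The module name is hypothetical, and returning undefined for an unregistered or dead process is a choice made for this sketch, not documented poller behaviour.

```erlang
-module(process_info_sketch).

-export([measure/2]).

%% Returns a map of the requested keys for the process registered as Name,
%% or undefined if no such process is alive.
measure(Name, Keys) ->
    case whereis(Name) of
        undefined ->
            undefined;
        Pid ->
            case erlang:process_info(Pid, Keys) of
                undefined -> undefined;
                Info -> maps:from_list(Info)
            end
    end.
```

For example, process_info_sketch:measure(my_app_worker, [message_queue_len, memory]) returns a map with the message_queue_len and memory keys when my_app_worker is registered.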
Custom measurements
Telemetry poller also allows you to perform custom measurements by passing a module-function-args tuple:
{my_app_example, measure, []}
The given function will be invoked periodically and it must explicitly invoke the telemetry:execute/3 function. If the invocation of the MFA fails, the measurement is removed from the poller.
For all options, see start_link/1. The options listed there can be given to the default poller as well as to custom pollers.
Default poller
A default poller is started with the telemetry_poller application and is responsible for emitting measurements for memory, total_run_queue_lengths and system_counts. You can customize the behaviour of the default poller by setting the default key under the telemetry_poller application environment. Setting it to false disables the poller.
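As an illustration, the application environment could be set in a sys.config file along the following lines. The 10-second period is a hypothetical value; the default key and the use of start_link/1 options for the default poller are as described above.

```erlang
%% sys.config (sketch)
[
    {telemetry_poller, [
        %% Customize the default poller with start_link/1 options.
        {default, [{period, 10000}]}
        %% To disable the default poller instead, use:
        %% {default, false}
    ]}
].
```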
Example - tracking number of active sessions in web application
Let's imagine that you have a web application and you would like to periodically measure the number of active user sessions.
-module(example_app).
-export([session_count/0]).

session_count() ->
    % Placeholder return value; the logic for calculating the session count goes here.
    0.
To achieve that, we need a measurement dispatching the value we're interested in:
-module(example_app_measurements).
-export([dispatch_session_count/0]).

dispatch_session_count() ->
    telemetry:execute([example_app, session_count], #{count => example_app:session_count()}, #{}).
and tell the Poller to invoke it periodically:
telemetry_poller:start_link([{measurements, [{example_app_measurements, dispatch_session_count, []}]}]).
If you find that you need to label the event values, e.g. to differentiate between the number of sessions of regular and admin users, you could use event metadata:
-module(example_app_measurements).
-export([dispatch_session_count/0]).

dispatch_session_count() ->
    Regulars = example_app:regular_users_session_count(),
    Admins = example_app:admin_users_session_count(),
    telemetry:execute([example_app, session_count], #{count => Admins}, #{role => admin}),
    telemetry:execute([example_app, session_count], #{count => Regulars}, #{role => regular}).
Note: the other solution would be to dispatch two different events by hooking up the example_app:regular_users_session_count/0 and example_app:admin_users_session_count/0 functions directly. However, if you add more and more user roles to your app, you'll find yourself creating a new event for each one of them, which will force you to modify existing event handlers. If you can break down the event value by some feature, like user role in this example, it's usually better to use event metadata than to add new events.
This is a perfect use case for a poller, because you don't need to write a dedicated process which would call these functions periodically. Additionally, if you find that you need to collect more statistics like this in the future, you can easily hook them up to the same poller process and avoid creating lots of processes which would stay idle most of the time.
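The benefit of metadata shows up on the handler side: one handler clause covers every role, present or future. The sketch below assumes the event shape from the example above (a count measurement and a role metadata key) and uses telemetry:attach/4 from the telemetry library; the module name and handler id are hypothetical.

```erlang
-module(example_app_session_handler).

-export([attach/0, handle_event/4]).

%% Attaches a single handler for all roles.
attach() ->
    telemetry:attach(
        <<"example-app-session-logger">>,
        [example_app, session_count],
        fun ?MODULE:handle_event/4,
        #{}
    ).

%% The role arrives as metadata, so adding a new role needs no new clause.
handle_event([example_app, session_count], #{count := Count}, #{role := Role}, _Config) ->
    io:format("active ~p sessions: ~p~n", [Role, Count]).
```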
Summary
Functions
Starts a poller linked to the calling process.
Types
-type init_delay() :: non_neg_integer().
-type measurement() ::
    memory | total_run_queue_lengths | system_counts |
    {process_info, [{name, atom()} | {event, [atom()]} | {keys, [atom()]}]} |
    {module(), atom(), list()}.
-type option() :: {name, atom() | gen_server:server_name()} | {period, period()} | {init_delay, init_delay()} | {measurements, [measurement()]}.
-type options() :: [option()].
-type period() :: pos_integer().
-type state() :: #{measurements => [measurement()], period => period()}.
-type t() :: gen_server:server_ref().
Functions
-spec init(map()) -> {ok, state()}.
-spec list_measurements(t()) -> [measurement()].
-spec start_link(options()) -> gen_server:start_ret().
Starts a poller linked to the calling process.
Useful for starting Pollers as a part of a supervision tree.
Default options: [{name, telemetry_poller}, {period, timer:seconds(5)}, {init_delay, 0}]
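A short usage sketch combining start_link/1 and list_measurements/1 follows. It assumes the telemetry and telemetry_poller applications are available and started; the module name, poller name and period are hypothetical, and the exact terms returned by list_measurements/1 are not shown because this page only specifies the return type.

```erlang
-module(poller_usage_sketch).

-export([run/0]).

%% Starts a poller linked to the caller and returns its measurements.
run() ->
    {ok, Pid} = telemetry_poller:start_link([
        {name, my_app_poller},
        {measurements, [memory]},
        {period, timer:seconds(10)}
    ]),
    %% Returns a list of measurement() terms for the running poller.
    telemetry_poller:list_measurements(Pid).
```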