Open Gleametry

Basic usage:

import opengleametry/span

{
  use ctx <- span.with("example span", [])
  todo
}

This results in an Open Telemetry span covering the duration of the code in the use block. A span can have multiple properties, which you can set on construction (an empty list here) or during code execution; see the documentation. One important property is ‘error’.

With extra infrastructure, these spans can then be stored and queried for monitoring, debugging, performance analysis and more.

It is quite common to have spans within spans: say, an actor creates a span when receiving a message, then calls a DB layer whose function creates its own span. These spans are automatically bound together; Open Telemetry calls this a trace.
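A minimal sketch of such nesting, using only span.with as shown in the basic usage. The function names handle_message and load_user are hypothetical, and the sketch assumes span.with hands back the value of its callback.

```gleam
import opengleametry/span

// Hypothetical message handler: opens the outer span.
pub fn handle_message(user_id: String) {
  use _ctx <- span.with("handle message", [])
  // Called inside the outer span, so the span opened in
  // load_user becomes its child within the same trace.
  load_user(user_id)
}

// Hypothetical DB-layer function: opens its own span.
fn load_user(user_id: String) {
  use _ctx <- span.with("db load user", [])
  "user " <> user_id
}
```

No context is passed explicitly here: both calls run in the same process, which is what lets the spans be bound together automatically.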

Serious work

The example above makes a noop trace, since nothing more than the opentelemetry_api is present.

See example/ for the simplest example app that will emit a trace; it contains all of the instructions of this section. Be sure to run it with the instructions of the next section.

Exporter and Collector

The ‘noop’ bit is serious. The intent here is that the telemetry declarations (called ‘instrumentation’) impact your application as little as possible. That means ‘noop’ when there is no need to store the metrics, and offloading as much as possible when there is.

In the latter case, Open Telemetry needs an SDK to pick up the telemetry and an exporter to throw it over the fence to a system that stores it, a Collector. For this, your Gleam application needs to depend on opengleametry, and also on opentelemetry (the SDK) and opentelemetry_exporter or another exporter.

For the mentioned exporter, it is best to wait 1 second before launching your application, because early spans are lost. Your app should also run for at least 5 seconds (demo effect); otherwise that exporter will not even initialise completely. Use the Gleam logging package with its default logging.configure() for the “INFO” to show up.
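The timing advice above could look roughly like this in a main function. This is a sketch, not the contents of example/: it assumes the gleam_erlang and logging packages are dependencies, and it reads "wait 1 second" as sleeping inside the application before the first span is created. The helper traced_work is hypothetical.

```gleam
import gleam/erlang/process
import logging
import opengleametry/span

pub fn main() {
  // Default logger configuration, so the exporter's "INFO" output shows up.
  logging.configure()

  // Assumption: give the exporter about a second to start,
  // since spans created earlier are lost.
  process.sleep(1000)

  traced_work()

  // Keep the program alive for at least 5 seconds so the
  // exporter can initialise completely and send the span.
  process.sleep(5000)
}

// Hypothetical unit of work wrapped in a span.
fn traced_work() {
  use _ctx <- span.with("example span", [])
  Nil
}
```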

It is a good idea to set up inets and ssl as extra applications (also to get rid of an error when booting), in your gleam.toml, like this:

[erlang]
extra_applications = ["inets", "ssl"]

Gleam will pick up the opentelemetry_exporter and opentelemetry applications automatically.

Collecting

With all that set, the exporter still does not do anything: you need to point it to a collector and set another value or two when running your Gleam application, e.g.:

For bash and zsh:

export OTEL_SERVICE_NAME="your_application"
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
export OTEL_EXPORTER_OTLP_HEADERS="x-service-api-key=12345"
gleam run

For fish:

set -x OTEL_SERVICE_NAME "your_application"
set -x OTEL_EXPORTER_OTLP_ENDPOINT "http://localhost:4318"
set -x OTEL_EXPORTER_OTLP_HEADERS "x-service-api-key=12345"
gleam run

You also need to run a collector, e.g. the Jaeger all-in-one Docker image:

docker run --rm --name jaeger --network host \
  cr.jaegertracing.io/jaegertracing/jaeger:2.10.0

Jaeger all-in-one serves a UI for your browser at http://localhost:16686.

Alternatively, Grafana:

docker run --rm --network host -ti grafana/otel-lgtm

Asynchronous work

A span in another process or actor can be connected to its source by passing it a link, for example in this simple way:

// a link needs a current context; we create one here
use ctx <- span.with("Main", [])
let link = link.current()
let _pid = process.spawn(fn() {
  use ctx <- span.with_links("nested", [link], [])
  echo ctx
})

Crash Reports

By adding

opengleametry.remove_default_handler()
opengleametry.set_sasl_handler()

at the start of your program, you first stop the default crash reports (and more) from being shown, and second, redirect them to opengleametry, which creates a span named crash_report with some useful information from the report. A report includes the process dictionary of the crashed process, which in turn includes the otel/opentelemetry trace and span id if, and only if, the process crashed within a span.with block. That is, the crash report contains a link (aka reference) to the span in which the crash happened.

This assists with debugging unexpected problems in production.

To Do

A lot

Other

There are no tests, since we are only manipulating typed ffi data.

Check out Erlang's “trace” concept: https://www.erlang.org/doc/apps/kernel/trace
