Telemetry

ExGram emits :telemetry events at key points in its lifecycle, so you can integrate with any observability tool that consumes :telemetry events, including Prometheus (via Telemetry.Metrics), OpenTelemetry, or plain structured logging.

OpenTelemetry

To have OpenTelemetry traces and spans in your bots, use the OpentelemetryExGram library. It attaches to the events below and creates OTel spans automatically, with full context propagation across GenServer and spawn boundaries.

See the OpenTelemetry guide for setup instructions and available spans.

Events overview

Event	Description
`[:ex_gram, :bot, :init, :start | :stop]`	Bot dispatcher initialization span
`[:ex_gram, :bot, :shutdown]`	Bot dispatcher shutting down
`[:ex_gram, :updates, :init, :start | :stop]`	Updates worker initialization span (polling, webhook, noup, test)
`[:ex_gram, :updates, :shutdown]`	Updates worker shutting down
`[:ex_gram, :request, :start | :stop | :exception]`	Outbound Telegram API call
`[:ex_gram, :update, :start | :stop | :exception]`	Incoming update dispatched to the bot
`[:ex_gram, :handler, :start | :stop | :exception]`	Your `handle/2` callback invocation
`[:ex_gram, :middleware, :start | :stop | :exception]`	Each middleware in the pipeline
`[:ex_gram, :polling, :start | :stop | :exception]`	One polling cycle (fetch + dispatch)

Every event except the :shutdown events belongs to a span, and each span's :stop event carries a duration measurement in :native time units. The :shutdown events are point-in-time and carry system_time instead. Convert a duration to milliseconds with:

System.convert_time_unit(duration, :native, :millisecond)
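
Converting straight to :millisecond yields an integer, so any span shorter than 1 ms is reported as 0. A minimal sketch of a helper that keeps fractional milliseconds; the module and function names are illustrative and not part of ExGram:

```elixir
defmodule MyApp.Telemetry.Duration do
  @moduledoc "Hypothetical helper, not part of ExGram."

  # Convert to microseconds first, then divide, so sub-millisecond
  # durations survive as a float instead of truncating to 0.
  def to_ms(native_duration) do
    System.convert_time_unit(native_duration, :native, :microsecond) / 1000
  end
end
```

This is mostly useful for fast spans such as individual middlewares, where integer milliseconds would hide most of the signal.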

See ExGram.Telemetry for the full metadata reference for each event.

Attaching handlers

Use :telemetry.attach/4 (one event) or :telemetry.attach_many/4 (multiple events) to subscribe. A handler is a 4-arity function: (event_name, measurements, metadata, config) -> any.

defmodule MyApp.Telemetry do
  require Logger

  def setup do
    :telemetry.attach_many(
      "my-app-ex-gram",
      [
        [:ex_gram, :bot, :init, :start],
        [:ex_gram, :bot, :init, :stop],
        [:ex_gram, :bot, :shutdown],
        [:ex_gram, :updates, :init, :start],
        [:ex_gram, :updates, :init, :stop],
        [:ex_gram, :updates, :shutdown],
        [:ex_gram, :request, :start],
        [:ex_gram, :request, :stop],
        [:ex_gram, :request, :exception],
        [:ex_gram, :update, :start],
        [:ex_gram, :update, :stop],
        [:ex_gram, :update, :exception],
        [:ex_gram, :handler, :start],
        [:ex_gram, :handler, :stop],
        [:ex_gram, :handler, :exception],
        [:ex_gram, :middleware, :start],
        [:ex_gram, :middleware, :stop],
        [:ex_gram, :middleware, :exception],
        [:ex_gram, :polling, :start],
        [:ex_gram, :polling, :stop],
        [:ex_gram, :polling, :exception]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  def handle_event(event, measurements, metadata, _config) do
    Logger.debug("Telemetry: #{inspect(event)} #{inspect(measurements)} #{inspect(metadata)}")
  end
end

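Call MyApp.Telemetry.setup/0 from your application's start/2 callback, before starting the supervision tree, so the handlers are attached by the time the bot and updates workers emit their :init events.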
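
A minimal sketch of that wiring, assuming a standard OTP application module. MyApp.Application, MyApp.Supervisor, and the empty children list are placeholders for your own:

```elixir
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # Attach telemetry handlers before any child starts,
    # so no lifecycle event is emitted without a listener.
    MyApp.Telemetry.setup()

    # Placeholder: list your ExGram and bot children here.
    children = []

    Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```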

Tracking bot and updates lifecycle

The :init spans and :shutdown events let you log when bots and their updates workers come up or go down, which is useful for auditing restarts in production. Each :init :stop event carries a duration measurement, so you can also alert on slow startups:

def handle_event([:ex_gram, :bot, :init, :start], _measurements, metadata, _) do
  Logger.info("[ExGram] bot initializing bot=#{metadata.bot}")
end

def handle_event([:ex_gram, :bot, :init, :stop], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
  Logger.info("[ExGram] bot ready bot=#{metadata.bot} init_ms=#{duration_ms}")
end

def handle_event([:ex_gram, :bot, :shutdown], _measurements, metadata, _) do
  Logger.warning("[ExGram] bot stopping bot=#{metadata.bot}")
end

def handle_event([:ex_gram, :updates, :init, :start], _measurements, metadata, _) do
  Logger.info("[ExGram] updates worker initializing bot=#{metadata.bot} method=#{metadata.method}")
end

def handle_event([:ex_gram, :updates, :init, :stop], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
  Logger.info("[ExGram] updates worker ready bot=#{metadata.bot} method=#{metadata.method} init_ms=#{duration_ms}")
end

def handle_event([:ex_gram, :updates, :shutdown], _measurements, metadata, _) do
  Logger.warning("[ExGram] updates worker stopping bot=#{metadata.bot} method=#{metadata.method}")
end

Logging API requests

Log every outbound Telegram API call with its duration and result:

def handle_event([:ex_gram, :request, :stop], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

  case metadata.result do
    {:ok, _} ->
      Logger.info("[ExGram] #{metadata.method} OK in #{duration_ms}ms bot=#{metadata.bot}")

    {:error, error} ->
      Logger.warning(
        "[ExGram] #{metadata.method} ERROR in #{duration_ms}ms " <>
          "code=#{error.code} message=#{error.message} bot=#{metadata.bot}"
      )
  end
end

def handle_event([:ex_gram, :request, :exception], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

  Logger.error(
    "[ExGram] #{metadata.method} EXCEPTION in #{duration_ms}ms " <>
      "kind=#{metadata.kind} reason=#{inspect(metadata.reason)} bot=#{metadata.bot}"
  )
end

Tracking update processing latency

Measure the full pipeline (middleware + routing) for each update:

def handle_event([:ex_gram, :update, :stop], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

  Logger.info(
    "[ExGram] update processed in #{duration_ms}ms " <>
      "bot=#{metadata.bot} halted=#{metadata.halted}"
  )
end

Alerting on handler exceptions

The [:ex_gram, :handler, :exception] event fires when your handle/2 callback raises. Use it to alert or capture errors:

def handle_event([:ex_gram, :handler, :exception], _measurements, metadata, _) do
  Logger.error(
    "[ExGram] handler raised in bot=#{metadata.bot} " <>
      "handler=#{metadata.handler} kind=#{metadata.kind} " <>
      "reason=#{inspect(metadata.reason)}"
  )

  # Forward to your error tracker (e.g. Sentry, Honeybadger)
  # Sentry.capture_exception(metadata.reason, stacktrace: metadata.stacktrace)
end
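
For readable log output, the kind/reason/stacktrace triple can be rendered with Exception.format/3 from the Elixir standard library. The sketch below assumes the exception metadata carries :kind and :reason as used above, plus an optional :stacktrace; the module name is hypothetical:

```elixir
defmodule MyApp.Telemetry.ErrorFormat do
  @moduledoc "Hypothetical helper, not part of ExGram."

  # Renders the failure the way Elixir prints crashes, e.g.
  # "** (RuntimeError) boom". Handles :error, :throw and :exit kinds.
  # Falls back to an empty stacktrace if the key is missing (assumption).
  def format(%{kind: kind, reason: reason} = metadata) do
    Exception.format(kind, reason, Map.get(metadata, :stacktrace, []))
  end
end
```

Inside the handler you could then log MyApp.Telemetry.ErrorFormat.format(metadata) instead of inspecting the reason by hand.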

Prometheus metrics with `telemetry_metrics`

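Add {:telemetry_metrics, "~> 1.0"} and {:telemetry_poller, "~> 1.0"} to your app's deps, then define the metrics. Telemetry.Metrics only describes them; exporting to Prometheus additionally requires a Prometheus reporter that consumes this list: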

defmodule MyApp.Telemetry do
  import Telemetry.Metrics

  def metrics do
    [
      # Count bot starts and stops
      counter("ex_gram.bot.init.stop.count", tags: [:bot]),
      counter("ex_gram.bot.shutdown.count", tags: [:bot]),

      # Distribution of bot initialization duration
      distribution("ex_gram.bot.init.stop.duration",
        unit: {:native, :millisecond},
        tags: [:bot]
      ),

      # Count updates worker starts and stops
      counter("ex_gram.updates.init.stop.count", tags: [:bot, :method]),
      counter("ex_gram.updates.shutdown.count", tags: [:bot, :method]),

      # Count every API call
      counter("ex_gram.request.stop.count",
        tags: [:method, :bot]
      ),

      # Distribution of API call durations
      distribution("ex_gram.request.stop.duration",
        unit: {:native, :millisecond},
        tags: [:method, :bot],
        reporter_options: [buckets: [10, 50, 100, 500, 1000, 5000]]
      ),

      # Count API errors
      counter("ex_gram.request.stop.error_count",
        keep: &match?({:error, _}, &1.result),
        tags: [:method, :bot]
      ),

      # Count incoming updates
      counter("ex_gram.update.stop.count",
        tags: [:bot]
      ),

      # Distribution of update processing time
      distribution("ex_gram.update.stop.duration",
        unit: {:native, :millisecond},
        tags: [:bot]
      ),

      # Count handler exceptions
      counter("ex_gram.handler.exception.count",
        tags: [:bot, :handler]
      ),

      # Distribution of polling cycle durations
      distribution("ex_gram.polling.stop.duration",
        unit: {:native, :millisecond},
        tags: [:bot]
      )
    ]
  end
end
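
These definitions have no effect until a reporter consumes them. A minimal sketch using Telemetry.Metrics.ConsoleReporter, which ships with telemetry_metrics and prints each measurement to the console; for Prometheus you would swap in a Prometheus reporter package of your choice. MyApp.TelemetrySupervisor is a hypothetical name:

```elixir
defmodule MyApp.TelemetrySupervisor do
  use Supervisor

  def start_link(arg) do
    Supervisor.start_link(__MODULE__, arg, name: __MODULE__)
  end

  @impl true
  def init(_arg) do
    children = [
      # The reporter attaches to every event referenced by the metrics list.
      {Telemetry.Metrics.ConsoleReporter, metrics: MyApp.Telemetry.metrics()}
    ]

    Supervisor.init(children, strategy: :one_for_one)
  end
end
```

Add MyApp.TelemetrySupervisor to your application's children ahead of the bot, so the reporter is listening when the first events arrive.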

Middleware timing

If your bot has multiple middlewares, you can measure each one individually:

def handle_event([:ex_gram, :middleware, :stop], measurements, metadata, _) do
  duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)

  Logger.debug(
    "[ExGram] middleware #{inspect(metadata.middleware)} in #{duration_ms}ms " <>
      "bot=#{metadata.bot} halted=#{metadata.halted}"
  )
end
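
Logging every middleware run can be noisy. As a sketch, the config argument of :telemetry.attach/4 can carry a threshold so that only slow middlewares are reported. The module name, handler id, and the 50 ms default are assumptions for illustration:

```elixir
defmodule MyApp.SlowMiddleware do
  require Logger

  # The last argument to :telemetry.attach/4 is handed back to the
  # handler as its fourth parameter on every event.
  def attach(threshold_ms \\ 50) do
    :telemetry.attach(
      "my-app-slow-middleware",
      [:ex_gram, :middleware, :stop],
      &__MODULE__.handle_event/4,
      %{threshold_ms: threshold_ms}
    )
  end

  def handle_event(_event, %{duration: duration}, metadata, %{threshold_ms: threshold_ms}) do
    duration_ms = System.convert_time_unit(duration, :native, :millisecond)

    if duration_ms >= threshold_ms do
      Logger.warning(
        "[ExGram] slow middleware #{inspect(metadata[:middleware])} took #{duration_ms}ms"
      )
    end

    :ok
  end
end
```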

Handler vs. update events

It is worth understanding the distinction between update and handler events:

[:ex_gram, :update, ...] - spans middleware execution and update routing. When using async dispatch (the default), the stop event fires after the handler process is spawned, not after the handler finishes. The duration reflects middleware + routing overhead only.
[:ex_gram, :handler, ...] - spans the actual handle/2 callback. This event always fires in the process executing the handler (the spawned process for async dispatch, or the GenServer process for sync dispatch).

To measure how long your handle/2 callback actually takes, subscribe to [:ex_gram, :handler, :stop]; under async dispatch the :update span does not include that time.
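
To compare the two spans side by side, here is a small sketch that logs the duration of each stage. MyApp.LatencyProbe and the handler id are hypothetical, and only the :bot metadata key used elsewhere on this page is read:

```elixir
defmodule MyApp.LatencyProbe do
  require Logger

  @events [
    [:ex_gram, :update, :stop],
    [:ex_gram, :handler, :stop]
  ]

  def attach do
    :telemetry.attach_many("my-app-latency-probe", @events, &__MODULE__.handle_event/4, nil)
  end

  # `stage` is :update (middleware + routing) or :handler (your handle/2).
  def handle_event([:ex_gram, stage, :stop], %{duration: duration}, metadata, _config) do
    ms = System.convert_time_unit(duration, :native, :millisecond)
    Logger.info("[ExGram] #{stage} took #{ms}ms bot=#{inspect(metadata[:bot])}")
  end
end
```

You can exercise the handler without a running bot by emitting a synthetic event, for example :telemetry.execute([:ex_gram, :handler, :stop], %{duration: 1_000}, %{bot: :my_bot}).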
