Alarmist.RemedyWorker (alarmist v0.4.1)

Remedy callback runner

This module handles the common concerns of running the code that fixes alarms. Users don't call this module directly, but understanding how it works can be useful. Callbacks should be registered either in a module that has use Alarmist.Alarm or by calling Alarmist.add_remedy/2.

You can think of this module as a supervised GenServer that watches a single Alarm ID and runs a callback function when that alarm is set. It has a few more features, though:

  1. If the alarm toggles back and forth while a callback is running, the events are not queued. The running callback is allowed to finish.
  2. A timer can be set on the callback to kill the process if it hangs.
  3. If the alarm persists, the callback can be called again after a configurable retry timeout.

Ideally, none of these features would be needed. Alarms rarely occur under normal operation, though, so some additional bulletproofing can be nice.

The following options control the handling:

  • :retry_timeout — time to wait for the alarm to be cleared before calling the callback again (default: :infinity)
  • :callback_timeout — time to wait for the callback to run (default: 60 seconds)

Since :retry_timeout defaults to :infinity, by default the callback is called only when the alarm gets set or when the RemedyWorker is restarted.
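The effect of :callback_timeout can be illustrated with plain Elixir. This is a minimal sketch and not RemedyWorker's actual implementation: the module name TimeoutSketch and the function run_with_timeout/3 are hypothetical, and the code uses only the standard Task and Task.Supervisor APIs.

```elixir
defmodule TimeoutSketch do
  # Hypothetical helper, for illustration only. Runs `callback` in a task
  # under `sup` and kills the task if it has not finished in `timeout_ms`.
  def run_with_timeout(sup, callback, timeout_ms) do
    # async_nolink keeps a crashing callback from taking down the caller
    task = Task.Supervisor.async_nolink(sup, callback)

    # Task.yield/2 returns nil on timeout; the task is then killed.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, result} -> {:ok, result}
      {:exit, reason} -> {:error, reason}
      nil -> {:error, :timeout}
    end
  end
end

{:ok, sup} = Task.Supervisor.start_link()

# A callback that finishes in time returns its result
TimeoutSketch.run_with_timeout(sup, fn -> :fixed end, 1_000)

# A callback that hangs is killed once the timeout elapses
TimeoutSketch.run_with_timeout(sup, fn -> Process.sleep(:infinity) end, 50)
```

Running the callback in a separate, unlinked task is what makes it possible to kill a hung callback without disturbing the process that tracks the alarm.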

State Machine Diagram

stateDiagram-v2
  [*] --> clear : initial state

  clear --> running : alarm set

  running --> finishing_run : alarm cleared
  running --> waiting_to_retry : callback completes or times out

  waiting_to_retry --> running : retry delay timer expires
  waiting_to_retry --> clear : alarm cleared

  finishing_run --> clear : callback completes or times out
  finishing_run --> running : alarm set
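
The transitions in the diagram can be written as a pure function, which makes it easy to trace how a sequence of events plays out. This is only a sketch: the module name RemedyStateSketch and the event atoms (:alarm_set, :alarm_cleared, :callback_done, :retry_timer_expired) are hypothetical names chosen for illustration, and ignoring events that have no arrow in the diagram is an assumption.

```elixir
defmodule RemedyStateSketch do
  # State names come from the diagram; event names are illustrative.
  # :callback_done stands for "callback completes or times out".
  def next(:clear, :alarm_set), do: :running
  def next(:running, :alarm_cleared), do: :finishing_run
  def next(:running, :callback_done), do: :waiting_to_retry
  def next(:waiting_to_retry, :retry_timer_expired), do: :running
  def next(:waiting_to_retry, :alarm_cleared), do: :clear
  def next(:finishing_run, :callback_done), do: :clear
  def next(:finishing_run, :alarm_set), do: :running
  # Assumption: events without an arrow in the diagram leave the state as-is
  def next(state, _event), do: state

  # Fold a list of events over the state machine, starting from `state`
  def run(events, state \\ :clear) do
    Enum.reduce(events, state, fn event, acc -> next(acc, event) end)
  end
end

# The alarm toggles while the callback is running, then the callback finishes
RemedyStateSketch.run([:alarm_set, :alarm_cleared, :alarm_set, :callback_done])
# => :waiting_to_retry
```

In this trace the alarm clears and sets again before the callback finishes, yet the machine never leaves the running callback: it passes through finishing_run, returns to running, and only then moves to waiting_to_retry.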

Summary

Functions

start_link(opts): Start the remedy worker

stop(alarm_id): Stop a worker

Functions

start_link(opts)

@spec start_link(Keyword.t()) :: GenServer.on_start()

Start the remedy worker

Options:

  • :alarm_id — Alarm ID that the callback remedies (required)
  • :task_supervisor — name or pid of Task.Supervisor
  • :remedy — see Alarmist.Alarm.__using__/1

stop(alarm_id)

@spec stop(Alarmist.alarm_id()) :: :ok | {:error, :not_found}

Stop a worker

If the worker is running a callback when it is stopped, the callback process is killed too.