View Source Shoehorn.Handler behaviour (shoehorn v0.9.2)

A behaviour module for implementing handling of failing applications

A Shoehorn.Handler is a module that knows how to respond to specific applications going down. There are two types of failing applications. The first is an application that fails to initialize and the second is an application that stops while it is running.

Example

The Shoehorn.Handler behaviour requires developers to implement two callbacks.

The init callback sets up any state the handler needs.

The application_started callback is called when an application starts up.

The application_exited callback processes the incoming failure and replies with the reaction that Shoehorn should take in case of application failure.

    defmodule Example.ShoehornHandler do
      @behaviour Shoehorn.Handler

      @impl Shoehorn.Handler
      def init(_opts) do
        {:ok, %{restart_counts: 0}}
      end

      @impl Shoehorn.Handler
      def application_started(app, state) do
        {:continue, state}
      end

      @impl Shoehorn.Handler
      def application_exited(:non_essential_app, _reason,  state) do
        {:continue, state}
      end

      def application_exited(:essential_app, _reason, %{restart_counts: restart_counts} = state) when restart_counts < 2 do
        # repair actions to make before restarting
        Application.ensure_all_started(:essential_app)
        {:continue, %{state | restart_counts: restart_counts + 1}}
      end

      def application_exited(_, state) do
        {:halt, state}
      end
    end

We initialize our Shoehorn.Handler with a restart count for state by calling init with the configuration options from our shoehorn config.

When we have an application startup, we will put a message on the console to notify the developer.

When we have a non-essential application fail we return :continue to inform the system to keep going like nothing happened.

We restart the essential application of our system two times, and then we tell the system to halt if starting over wasn't fixing the system.

Summary

Types

The cause that is firing the handler

The reaction letting Shoehorn know what to do

t()

Callbacks

Callback for handling application crashes

Callback for handling application starts

Callback to initialize the handle

Types

@type cause() :: any()

The cause that is firing the handler

@type opts() :: [{:handler, atom()}]
@type reaction() :: :continue | :halt

The reaction letting Shoehorn know what to do

  • :continue - keep the system going like nothing happened
  • :halt - stop the application and bring the system down
@type t() :: %Shoehorn.Handler{module: atom(), state: any()}

Callbacks

Link to this callback

application_exited(app, cause, state)

View Source
@callback application_exited(app :: atom(), cause(), state :: any()) ::
  {reaction(), state :: any()}

Callback for handling application crashes

Called with the application name, cause, and the handler's state. It must return a tuple containing the reaction that the Shoehorn should take, and the new state of the handler.

The code that you execute here can be used to notify or capture some information before halting the system. This information can later be used to recreate the issue or debug the problem causing the application to exit.

Use application_exited as a place for a last-ditch effort to fix the issue and restart the application. Ideally, capture some information on the system state, and solve it upstream. Shoehorn restarts should be used as a splint to keep a critical system running.

The default implementation returns the previous state, and a :halt reaction.

Callback Timeout

This callback is limited in the amount of time it has to execute by a shutdown timer. If the callback does not return within the defined shutdown time, the node is instructed to halt. The default shutdown time is 30 seconds but this value can be changed in the application config.

Link to this callback

application_started(app, state)

View Source
@callback application_started(app :: atom(), state :: any()) ::
  {reaction(), state :: any()}

Callback for handling application starts

Called with the application name, and the handler's state. It must return a tuple containing the reaction that the Shoehorn should take, and the new state of the handler.

      def application_exited(:essential_app, _reason, state) do
        # repair actions to make before restarting
        # notify someone of the crash and the details
        # log debug data
        Application.ensure_all_started(:essential_app)
        {:continue, %{state | restart_counts: restart_counts + 1}}
      end

The default implementation returns unchanged state, and a :continue reaction.

@callback init(opts :: keyword()) :: {:ok, state :: any()}

Callback to initialize the handle

The callback will receive a keyword list of all the configuration for shoehorn.

The callback must return a tuple of {:ok, state}. Where state is the initial state of the handler. The system will halt if the return is anything other than :ok.

Functions

@spec init(opts()) :: t() | no_return()
Link to this function

invoke(event, app, handler)

View Source
@spec invoke(:application_started, app :: atom(), t()) :: {reaction(), t()}
Link to this function

invoke(event, app, cause, handler)

View Source
@spec invoke(:application_exited, app :: atom(), cause(), t()) :: {reaction(), t()}