Hemdal.Check (Hemdal v1.0.3)
Every check performed by Hemdal is driven by a state machine. The state machine runs a command, inspects what it returns and, based on that value and on whether the command executed successfully, determines its next state.
The state machine has the following states:
disabled: no checks are performed; the machine waits until it is activated.
normal: the command runs correctly and keeps returning a success state. The state timeout is set from check_in_sec in Hemdal.Config.Alert.
failing: entered from normal when a check returns a failed response. The state timeout is set from recheck_in_sec and, if the check does not recover after the configured number of retries, the machine moves to broken (see Hemdal.Config.Alert).
broken: the check has been failing for some time. The subject under check is considered broken and is rechecked every broken_recheck_in_sec seconds. The machine goes back to normal only once the check recovers.
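The transitions described above can be sketched as a small pure function. This is a simplified model, not Hemdal's implementation: the module name CheckFsmSketch, the transition/4 function, the default values and the exact way retries are counted are all assumptions made for illustration. Only the state names and the option names (check_in_sec, recheck_in_sec, broken_recheck_in_sec, retries) come from the description. The disabled state is left out because no checks run while the machine is in it.

```elixir
defmodule CheckFsmSketch do
  # Hypothetical model of the documented transitions; not part of Hemdal.

  # Option names mirror the fields mentioned for Hemdal.Config.Alert.
  # The values are made up for the example.
  @defaults %{check_in_sec: 60, recheck_in_sec: 5, broken_recheck_in_sec: 30, retries: 3}

  # Returns {next_state, failed_attempts, seconds_until_next_check}.
  def transition(state, result, attempts, opts \\ @defaults)

  # A successful check always leads (back) to :normal and resets the counter.
  def transition(state, :ok, _attempts, opts) when state in [:normal, :failing, :broken] do
    {:normal, 0, opts.check_in_sec}
  end

  # The first failure while :normal moves the machine to :failing.
  def transition(:normal, :error, _attempts, opts) do
    {:failing, 1, opts.recheck_in_sec}
  end

  # Still failing: keep retrying while the retry budget is not exhausted.
  # Assumption: the first failure counts as attempt 1.
  def transition(:failing, :error, attempts, %{retries: max} = opts) when attempts < max do
    {:failing, attempts + 1, opts.recheck_in_sec}
  end

  # Retry budget exhausted: the subject is considered broken.
  def transition(:failing, :error, attempts, opts) do
    {:broken, attempts, opts.broken_recheck_in_sec}
  end

  # A broken subject stays :broken until a check succeeds.
  def transition(:broken, :error, attempts, opts) do
    {:broken, attempts, opts.broken_recheck_in_sec}
  end
end
```

With the default options of this sketch, CheckFsmSketch.transition(:normal, :error, 0) returns {:failing, 1, 5}, and CheckFsmSketch.transition(:failing, :error, 3) returns {:broken, 3, 30}.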
Types
@type alert_id() :: String.t()
The alert ID used to identify the state machine that runs the checks for the alert.
@type returned_status() :: map()
The status returned by the process is a map with string keys; the type of each value depends on the key. The keys are:
status: an atom, one of :ok, :disabled, :warn or :error.
alert: a map with information about the alert itself, such as id, name, host and command.
last_update: a naive datetime generated at the moment of the update.
result: a map with information about the executed command.
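As an illustration of the shape described above, here is a minimal sketch of such a map and of how its string keys can be pattern matched. The module StatusSketch and its functions are hypothetical and not part of Hemdal. The top-level keys come from the description, while every value, and the assumption that the nested alert map also uses string keys, is made up for the example.

```elixir
defmodule StatusSketch do
  # Hypothetical helpers for illustration; not part of Hemdal.

  # Builds an example map with the four documented top-level keys.
  # All values (and the nested key style) are assumptions.
  def sample do
    %{
      "status" => :ok,
      "alert" => %{
        "id" => "example-id",
        "name" => "Example check",
        "host" => "localhost",
        "command" => "true"
      },
      "last_update" => ~N[2024-01-01 12:00:00],
      "result" => %{}
    }
  end

  # Keys are strings, so they are matched with the "key" => value syntax
  # rather than accessed with dot syntax.
  def label(%{"status" => status, "last_update" => %NaiveDateTime{} = at}) do
    "#{status} at #{NaiveDateTime.to_iso8601(at)}"
  end
end
```

Calling StatusSketch.label(StatusSketch.sample()) returns "ok at 2024-01-01T12:00:00".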
@type status() :: :ok | :warn | :error | :disabled
The status available inside the events. It applies to both the current and the previous state.
@type t() :: %Hemdal.Check{
  alert: Hemdal.Config.Alert.t() | nil,
  fail_started: NaiveDateTime.t() | nil,
  last_update: NaiveDateTime.t(),
  retries: non_neg_integer(),
  status: returned_status() | nil
}
Functions
Returns a specification to start this module under a supervisor.
See Supervisor.
Check if the alert is running.
@spec get_all() :: [returned_status()]
Get all of the running alerts. It asks the supervisor for the list of
all alerts and gathers the status of each one using the get_status/1
function.
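Because the result is a plain list of status maps, it can be processed with the standard Enum functions. The sketch below picks out the entries whose status is :warn or :error. StatusFilterSketch and needing_attention/1 are hypothetical names, not part of Hemdal, and the helper only relies on the documented "status" key.

```elixir
defmodule StatusFilterSketch do
  # Hypothetical helper for illustration; not part of Hemdal.
  # Expects a list shaped like the result of Hemdal.Check.get_all/0.
  def needing_attention(statuses) do
    Enum.filter(statuses, fn status -> status["status"] in [:warn, :error] end)
  end
end

# In an application where Hemdal is running, the input would come from:
#   Hemdal.Check.get_all() |> StatusFilterSketch.needing_attention()
```

Keeping the filter separate from the call to get_all/0 means it can be exercised with hand-built maps, without any running alerts.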
Returns the PID of the alert process if it's running.
@spec get_status(pid() | alert_id()) :: [returned_status()]
Get the status of an alert. The status is requested directly from the alert process.
@spec reload_all() :: :ok
Reload all of the alerts from the configuration backend. See
Hemdal.Config for further information. If an alert isn't running,
it is started.
@spec start_all() :: :ok
Ensure all of the alerts are started.
@spec update_alert(Hemdal.Config.Alert.t()) :: {:ok, pid()}
Update the alert by passing the new configuration to its process. This is useful for changing the command, the host or any other setting of the alert/check.