Handles Kubernetes liveness and readiness probes for Elixir/Phoenix applications using OTP's native shutdown sequence.

Replaces the deprecated traffic_drain_plug library.

How it works

The library has two components that work together:

KubernetesProbes.Plug is added as the first plug in your Phoenix endpoint. It intercepts requests to the liveness and readiness probe paths and responds immediately, before any other plugs run. The liveness probe returns 200 as long as the BEAM is up. The readiness probe returns 200 when the app is ready to serve traffic, and 503 during startup or while draining.

KubernetesProbes.Drainer is a GenServer added as the last child of your application supervisor. On SIGTERM, the BEAM begins a graceful shutdown and supervisors terminate their children in reverse start order, so the Drainer terminates first. Its terminate/2 callback immediately flips the readiness probe to 503 via :persistent_term and then sleeps for the configured drain window. This gives Kubernetes time to stop routing new traffic before the Endpoint, Repo, and other resources are torn down.
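The shutdown mechanism described above can be sketched with a plain GenServer. This is an illustration under stated assumptions, not the library's source: the module name DrainerSketch, the :persistent_term key, and the extra 1 s of shutdown headroom are all hypothetical. It shows two details the pattern relies on: the process has to trap exits for terminate/2 to run on supervisor shutdown, and the child's shutdown timeout has to exceed the drain window, because a supervisor kills a worker after 5 s by default.

```elixir
defmodule DrainerSketch do
  # Illustrative only: not the library's implementation.
  use GenServer

  # Hypothetical :persistent_term key holding the probe state.
  @key {__MODULE__, :state}

  # Override the default child spec so the supervisor waits longer than the
  # drain window before killing the process (the worker default is 5_000 ms).
  def child_spec(opts) do
    wait = Keyword.get(opts, :wait, 20_000)

    %{
      id: __MODULE__,
      start: {__MODULE__, :start_link, [opts]},
      shutdown: wait + 1_000
    }
  end

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts, name: __MODULE__)

  # What a readiness check would read on each probe request.
  def state, do: :persistent_term.get(@key, :starting)

  @impl true
  def init(opts) do
    # terminate/2 is only invoked on supervisor shutdown if exits are trapped.
    Process.flag(:trap_exit, true)
    :persistent_term.put(@key, :running)
    {:ok, Keyword.get(opts, :wait, 20_000)}
  end

  @impl true
  def terminate(_reason, wait) do
    # Flip readiness first, then hold up the rest of the shutdown sequence.
    :persistent_term.put(@key, :draining)
    Process.sleep(wait)
  end
end
```

Because the sketch sits last in the child list, the sleep in terminate/2 blocks the supervisor from stopping the children started before it until the drain window has elapsed.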

Installation

# mix.exs
{:kubernetes_probes, "~> 0.1"}

Usage

1. Add the Drainer as the last child in your application supervisor:

# lib/my_app/application.ex
children = [
  MyApp.Repo,
  MyAppWeb.Endpoint,
  # Must be last — terminates first on shutdown
  {KubernetesProbes.Drainer, wait: 20_000}
]

2. Add the Plug as the first plug in your endpoint:

# lib/my_app_web/endpoint.ex
plug KubernetesProbes.Plug

# With a custom readiness check (e.g. database connectivity):
plug KubernetesProbes.Plug, ready?: &MyApp.repos_ready?/0

# With custom probe paths:
plug KubernetesProbes.Plug, liveness_path: "/healthz", readiness_path: "/readyz"
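Since the readiness function runs on every readiness request, a custom check is best kept fast and unable to crash the request. The following is a minimal sketch of one way to bound such checks, assuming a hypothetical ReadinessSketch helper that is not part of the library; the 500 ms default timeout is an arbitrary choice.

```elixir
defmodule ReadinessSketch do
  # Hypothetical helper, not part of the library: runs each zero-arity check
  # in a task and treats a crash, a non-true result, or a timeout as "not ready".
  @timeout_ms 500

  def ready?(checks, timeout_ms \\ @timeout_ms) do
    # Checks run one after another and stop at the first failure.
    Enum.all?(checks, fn check -> run(check, timeout_ms) end)
  end

  defp run(check, timeout_ms) do
    task = Task.async(fn -> safe(check) end)

    # Wait up to timeout_ms; if the task is still running, kill it.
    case Task.yield(task, timeout_ms) || Task.shutdown(task, :brutal_kill) do
      {:ok, true} -> true
      _ -> false
    end
  end

  # Convert raises, throws, and exits inside the check into false.
  defp safe(check) do
    check.() == true
  catch
    _kind, _value -> false
  end
end
```

A function such as the MyApp.repos_ready?/0 referenced above could call ReadinessSketch.ready?/1 with a list of its own checks. The checks run sequentially, so the worst-case latency is the timeout multiplied by the number of checks; keep that below the probe's timeoutSeconds.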

Probe endpoints

The default paths are /probe/liveness and /probe/readiness. Both can be overridden via the :liveness_path and :readiness_path plug options.

| Path | Method | Description |
|---|---|---|
| /probe/liveness | GET | Returns 200 while the BEAM is running |
| /probe/readiness | GET | Returns 200 when the drainer is :running and ready? returns true; 503 while draining or not ready |
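The response rules in the table above can be written as a small pure function. This is a sketch of the decision logic only, assuming a hypothetical ProbeStatusSketch module and an atom for the drainer state; it is not the plug's actual code.

```elixir
defmodule ProbeStatusSketch do
  # Hypothetical pure function mirroring the probe table; not the library's API.

  # Liveness does not depend on drainer state or on ready?.
  def status(:liveness, _drainer_state, _ready?), do: 200

  # Readiness consults ready? only while the drainer is :running.
  def status(:readiness, :running, ready?) when is_function(ready?, 0) do
    if ready?.(), do: 200, else: 503
  end

  # Any other state (startup, draining) returns 503 without calling ready?.
  def status(:readiness, _other_state, _ready?), do: 503
end
```

For example, ProbeStatusSketch.status(:readiness, :draining, fn -> true end) returns 503 even though the readiness function would report true.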

Configuration

Drainer options

| Option | Default | Description |
|---|---|---|
| :wait | 20_000 | Drain window in milliseconds |

Plug options

| Option | Default | Description |
|---|---|---|
| :ready? | fn -> true end | Zero-arity function returning a boolean. Called on each readiness request while the drainer is :running |
| :liveness_path | "/probe/liveness" | Path for the liveness probe |
| :readiness_path | "/probe/readiness" | Path for the readiness probe |

Shorten the drain window in dev and test

# config/dev.exs — avoid 20s hang on Ctrl-C
config :my_app, KubernetesProbes.Drainer, wait: 100

# config/test.exs — avoid slow suite teardown
config :my_app, KubernetesProbes.Drainer, wait: 10

Pass the configured value when adding the child:

# Read at runtime: Application.compile_env/3 is only allowed in a module body, not inside start/2
{KubernetesProbes.Drainer, Application.get_env(:my_app, KubernetesProbes.Drainer, wait: 20_000)}

Kubernetes deployment

Set terminationGracePeriodSeconds to at least the drain window plus a few seconds for the rest of the shutdown sequence. With the default 20 s drain window, 30 s is a safe value.
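The arithmetic behind that recommendation, as a minimal sketch. The GracePeriodSketch module is hypothetical, and the 10 s default margin is an assumption inferred from the 20 s and 30 s figures above; adjust it to how long the rest of your shutdown actually takes.

```elixir
defmodule GracePeriodSketch do
  # Hypothetical helper: drain window in milliseconds, rounded up to whole
  # seconds, plus a margin in seconds for the rest of the shutdown sequence.
  def seconds(wait_ms, margin_s \\ 10) do
    ceil(wait_ms / 1000) + margin_s
  end
end
```

GracePeriodSketch.seconds(20_000) returns 30, matching the value used in the manifests in this section.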

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  # selector, template labels, and container image omitted for brevity
  template:
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: my-app
          ports:
            - containerPort: 4000
          livenessProbe:
            httpGet:
              path: /probe/liveness
              port: 4000
            initialDelaySeconds: 30
            periodSeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /probe/readiness
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 2
            successThreshold: 1

StatefulSet

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: my-app
spec:
  # serviceName, selector, template labels, and container image omitted for brevity
  template:
    spec:
      terminationGracePeriodSeconds: 30
      containers:
        - name: my-app
          ports:
            - containerPort: 4000
          livenessProbe:
            httpGet:
              path: /probe/liveness
              port: 4000
            initialDelaySeconds: 30
            periodSeconds: 30
            timeoutSeconds: 5
          readinessProbe:
            httpGet:
              path: /probe/readiness
              port: 4000
            initialDelaySeconds: 10
            periodSeconds: 2
            successThreshold: 1