ObanDoctor.Plugins.TableHealth (oban_doctor v0.2.2)

Monitors the health of the oban_jobs table including size, bloat, and vacuum status.

As an Oban Plugin

Add to your Oban config for scheduled telemetry emission:

config :my_app, Oban,
  plugins: [
    {ObanDoctor.Plugins.TableHealth, interval: :timer.hours(6)}
  ]

The plugin automatically uses your Oban instance's configuration (repo, prefix). Only the leader node (as elected via Oban.Peer) executes the scheduled check, so metrics are emitted once per interval for the whole cluster rather than once per node.

Plugin Options

:interval - How often to emit telemetry, in milliseconds (default: 6 hours, i.e. :timer.hours(6))
:thresholds - Alert thresholds (see below)

Default Thresholds

:table_size_mb - Alert when table exceeds this size (default: 1000 MB)
:dead_tuple_ratio - Alert when dead tuple ratio exceeds this (default: 0.1 = 10%)
:min_tuples_for_ratio_alert - Minimum total tuples before dead tuple ratio alert fires (default: 1000)
:hours_since_vacuum - Alert when hours since last vacuum exceeds this (default: 24)

Override any threshold while keeping defaults for the rest:

config :my_app, Oban,
  plugins: [
    {ObanDoctor.Plugins.TableHealth,
     interval: :timer.hours(6),
     thresholds: [
       table_size_mb: 2000,       # Override to 2GB
       hours_since_vacuum: 48     # Override to 48 hours
       # dead_tuple_ratio and min_tuples_for_ratio_alert use defaults
     ]}
  ]
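The override behaviour described above can be pictured as a keyword merge over the documented defaults. This is only a sketch of the idea: `ThresholdSketch` and `resolve/1` are hypothetical names, and the plugin's internal merging logic may differ.

```elixir
defmodule ThresholdSketch do
  # Defaults copied from the "Default Thresholds" list above.
  @defaults [
    table_size_mb: 1000,
    dead_tuple_ratio: 0.1,
    min_tuples_for_ratio_alert: 1000,
    hours_since_vacuum: 24
  ]

  # Hypothetical helper: overrides win, unspecified keys keep their defaults.
  def resolve(overrides \\ []) do
    Keyword.merge(@defaults, overrides)
  end
end

# Same overrides as the config example above.
resolved = ThresholdSketch.resolve(table_size_mb: 2000, hours_since_vacuum: 48)
IO.inspect(Keyword.fetch!(resolved, :table_size_mb), label: "table_size_mb")
IO.inspect(Keyword.fetch!(resolved, :dead_tuple_ratio), label: "dead_tuple_ratio")
```

Using `Keyword.merge/2` means a partial `:thresholds` list never has to restate values it does not change.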

Telemetry

Emits [:oban_doctor, :table, :health] with health metrics:

:telemetry.attach("my-handler", [:oban_doctor, :table, :health], fn _event, measurements, metadata, _config ->
  # measurements = %{table_size_mb: 150, dead_tuple_ratio: 0.05, alert_count: 0}
  # metadata = %{
  #   oban_name: Oban,
  #   metrics: %{table_size_bytes: ..., dead_tuples: ..., ...},
  #   alerts: []
  # }
end, nil)
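In an application, the handler is usually a named module function rather than an anonymous one. The following is a minimal sketch, assuming the `:telemetry` package is available and that alerts arrive as `{alert_type, message}` tuples as described in the Metrics table; the module name `MyApp.TableHealthHandler`, the handler id, and the log format are illustrative assumptions, not part of ObanDoctor.

```elixir
defmodule MyApp.TableHealthHandler do
  require Logger

  @event [:oban_doctor, :table, :health]

  # Attach once, for example from Application.start/2.
  def attach do
    :telemetry.attach("my-app-table-health", @event, &__MODULE__.handle_event/4, nil)
  end

  # Logs one warning per alert carried in the event metadata.
  def handle_event(@event, measurements, metadata, _config) do
    for line <- alert_lines(measurements, metadata), do: Logger.warning(line)
    :ok
  end

  # Pure formatting step, kept separate so it can be exercised without telemetry.
  def alert_lines(measurements, %{alerts: alerts} = metadata) do
    for {type, message} <- alerts do
      "[#{inspect(metadata[:oban_name])}] #{type}: #{message} " <>
        "(table_size_mb=#{measurements[:table_size_mb]})"
    end
  end
end
```

Keeping the formatting in `alert_lines/2` lets the handler stay small, and an event with an empty `alerts` list produces no log output.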

Standalone Usage

Query table health metrics directly without configuring the plugin:

ObanDoctor.Plugins.TableHealth.metrics(MyApp.Repo)
#=> %{
#     table_size_bytes: 157286400,
#     table_size_mb: 150,
#     index_size_bytes: 52428800,
#     total_size_bytes: 209715200,
#     live_tuples: 500000,
#     dead_tuples: 25000,
#     dead_tuple_ratio: 0.05,
#     last_vacuum: ~U[2024-01-15 10:30:00Z],
#     last_autovacuum: ~U[2024-01-15 12:45:00Z],
#     hours_since_vacuum: 2.5,
#     hours_since_autovacuum: 0.25,
#     jobs_by_state: %{available: 100, completed: 400000, ...},
#     total_jobs: 500100,
#     alerts: []
#   }

Metrics

Metric	Description
`table_size_bytes`	Size of the oban_jobs table in bytes
`table_size_mb`	Size of the oban_jobs table in megabytes
`index_size_bytes`	Size of all indexes on oban_jobs in bytes
`total_size_bytes`	Total size including table and indexes
`live_tuples`	Number of live (visible) rows
`dead_tuples`	Number of dead (deleted but not vacuumed) rows
`dead_tuple_ratio`	Ratio of dead to total tuples (dead / (live + dead))
`last_vacuum`	Timestamp of last manual VACUUM
`last_autovacuum`	Timestamp of last autovacuum
`hours_since_vacuum`	Hours since last manual VACUUM (nil if never)
`hours_since_autovacuum`	Hours since last autovacuum (nil if never)
`jobs_by_state`	Count of jobs grouped by state
`total_jobs`	Total number of jobs in the table
`alerts`	List of `{alert_type, message}` tuples for threshold violations
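To make the `dead_tuple_ratio` definition concrete, the sketch below computes the ratio from `live_tuples` and `dead_tuples` and applies the two related thresholds. The module `DeadTupleSketch` is hypothetical, and the way it combines `:min_tuples_for_ratio_alert` with `:dead_tuple_ratio` (and returns 0.0 for an empty table) is an assumption based on the option descriptions, not the plugin's confirmed logic.

```elixir
defmodule DeadTupleSketch do
  # Ratio as documented: dead / (live + dead).
  # Assumption: an empty table is reported as 0.0 to avoid dividing by zero.
  def ratio(live, dead) when live + dead == 0, do: 0.0
  def ratio(live, dead), do: dead / (live + dead)

  # Assumed rule: alert only when the table has enough tuples
  # AND the ratio exceeds the configured threshold.
  def ratio_alert?(live, dead, thresholds) do
    live + dead >= thresholds[:min_tuples_for_ratio_alert] and
      ratio(live, dead) > thresholds[:dead_tuple_ratio]
  end
end

thresholds = [dead_tuple_ratio: 0.1, min_tuples_for_ratio_alert: 1000]

# Values from the Standalone Usage example: 25_000 dead out of 525_000 total.
IO.inspect(Float.round(DeadTupleSketch.ratio(500_000, 25_000), 2))
IO.inspect(DeadTupleSketch.ratio_alert?(500_000, 25_000, thresholds))
```

The minimum-tuple guard keeps small tables, where a handful of dead rows produces a large ratio, from raising alerts.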

Bloat Estimation

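The dead_tuple_ratio serves as a proxy for table bloat. A plain VACUUM (including autovacuum) removes dead tuples and marks their space as reusable within the table, but it generally does not return that space to the operating system; shrinking the table on disk requires VACUUM FULL or a similar table rewrite. A dead tuple ratio above the default threshold of 10% indicates the table may benefit from maintenance.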

For accurate bloat measurement, consider using the pgstattuple extension directly.
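A direct pgstattuple query might look like the sketch below. It assumes an Ecto repo backed by Postgres, that the extension has been installed with `CREATE EXTENSION pgstattuple`, and that the connecting role is allowed to call it; `BloatSketch` and its function names are hypothetical and not part of ObanDoctor. If your Oban tables live in a non-default prefix, pass a schema-qualified table name.

```elixir
defmodule BloatSketch do
  # Runs pgstattuple against the given table and returns one map of statistics,
  # keyed by column name (for example "dead_tuple_percent" and "free_percent").
  # Note: pgstattuple scans the whole table, so it can be slow on large tables.
  def pgstattuple(repo, table \\ "oban_jobs") do
    sql = "SELECT * FROM pgstattuple($1::text::regclass)"

    repo
    |> Ecto.Adapters.SQL.query!(sql, [table])
    |> rows_to_maps()
    |> List.first()
  end

  # Turns a result with :columns and :rows into a list of maps.
  def rows_to_maps(%{columns: columns, rows: rows}) do
    Enum.map(rows, fn row -> columns |> Enum.zip(row) |> Map.new() end)
  end
end
```

Calling `BloatSketch.pgstattuple(MyApp.Repo)` would then return the measured dead tuple and free space percentages, which can be compared against the estimated `dead_tuple_ratio`.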

Types

option()

@type option() ::
  Oban.Plugin.option() | {:interval, pos_integer()} | {:thresholds, keyword()}

Functions

metrics(repo, opts \\ [])

@spec metrics(
  module(),
  keyword()
) :: map()

Query current table health metrics from the database.

Returns a map with table statistics including size, vacuum status, and job counts.

Options

:oban - Oban instance name to get config from (prefix, queues)
:conf - Oban config struct (when called from a plugin)
:thresholds - Threshold values for generating alerts

The prefix is automatically extracted from the Oban configuration via :oban or :conf.

Return Value

Returns a map with table health metrics. See the Metrics section of the module documentation for the full list of keys.

Examples

# Use running Oban instance config
iex> TableHealth.metrics(MyApp.Repo, oban: Oban)
%{
  table_size_bytes: 157286400,
  table_size_mb: 150,
  ...
}
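One way to consume the return value is to reduce it to a simple status, for instance in a health-check endpoint or a scheduled task. The sketch below assumes only the documented `metrics/2` signature and the `alerts` key; `MyApp.ObanTableCheck` and its return shape are hypothetical choices for illustration.

```elixir
defmodule MyApp.ObanTableCheck do
  # Queries metrics using the running Oban instance's config, then summarises.
  def run(repo, opts \\ [oban: Oban]) do
    repo
    |> ObanDoctor.Plugins.TableHealth.metrics(opts)
    |> status()
  end

  # :ok when no thresholds were violated, otherwise the list of alert types.
  def status(%{alerts: []}), do: :ok

  def status(%{alerts: alerts}) when is_list(alerts) do
    {:error, Enum.map(alerts, fn {type, _message} -> type end)}
  end
end
```

`MyApp.ObanTableCheck.run(MyApp.Repo)` would return `:ok` for a healthy table; custom limits can be supplied through the documented `:thresholds` option in `opts`.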