Vaultx.Sys.Health (Vaultx v0.7.0)

Comprehensive HashiCorp Vault system health monitoring.

This module provides health checks for Vault servers and clusters. It covers load balancer integration, HA cluster monitoring, and Vault Enterprise features such as replication and namespaces.

Health Monitoring Features

Core Health Checks

  • Server Status: Initialization and seal status
  • HA Leadership: Active/standby node detection
  • Performance Standby: Enterprise performance standby monitoring
  • Cluster Health: Multi-node cluster status

Load Balancer Integration

  • Customizable Status Codes: Configure response codes for different states
  • Standby Handling: Flexible standby node status reporting
  • Health Check Endpoints: Optimized for load balancer health checks

Enterprise Features

  • Namespace Support: Multi-tenant health monitoring
  • DR Replication: Disaster recovery cluster status
  • Performance Replication: Performance replication monitoring

HTTP Status Code Reference

Standard Vault health status codes:

  • 200 - Initialized, unsealed, and active
  • 429 - Unsealed and standby
  • 472 - Disaster recovery secondary (active and standby)
  • 473 - Performance standby
  • 474 - Standby node unable to connect to active node
  • 501 - Not initialized
  • 503 - Sealed
  • 530 - Removed from cluster
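
On the monitoring side, the codes above can be folded into a small lookup table. The following is a minimal sketch: the `HealthCodes` module and its atom names are hypothetical and not part of Vaultx, and the table assumes the default codes listed above have not been overridden.

```elixir
defmodule HealthCodes do
  # Hypothetical helper, not part of Vaultx: maps the default Vault
  # health status codes listed above to descriptive atoms.
  @codes %{
    200 => :active,
    429 => :standby,
    472 => :dr_secondary,
    473 => :performance_standby,
    474 => :ha_unhealthy,
    501 => :uninitialized,
    503 => :sealed,
    530 => :removed
  }

  # Returns the atom for a known code, or :unknown for anything else.
  def classify(code) when is_integer(code), do: Map.get(@codes, code, :unknown)

  # Under the default codes, only 200 means an active, unsealed node.
  def active?(code), do: classify(code) == :active
end
```

If custom codes are configured (for example `standbycode: 200`), a table like this no longer distinguishes the states, so it only applies to default settings.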

API Compliance

Fully implements the HashiCorp Vault Health API.

Usage Examples

# Basic health check
{:ok, health} = Vaultx.Sys.Health.check()
health.initialized #=> true
health.sealed #=> false

# Health check that treats standby and performance standby nodes as healthy
{:ok, health} = Vaultx.Sys.Health.check([
  standbyok: true,
  perfstandbyok: true
])

Load Balancer Integration

For load balancers that only understand 200-level responses:

{:ok, health} = Vaultx.Sys.Health.check([
  standbyok: true,        # Return 200 for standby nodes
  perfstandbyok: true,    # Return 200 for performance standby
  activecode: 200,        # Status code for active nodes
  standbycode: 200        # Override standby status code
])
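
At the HTTP level, Vault's `sys/health` endpoint accepts these settings as query parameters. The sketch below shows how such options could be turned into a request path. `HealthQuery` is a hypothetical helper, and how Vaultx builds the request internally is not shown in this documentation. It assumes that `:timeout`, `:retry_attempts`, and `:namespace` are client-side settings that do not belong in the query string.

```elixir
defmodule HealthQuery do
  # Hypothetical helper, not part of Vaultx.
  # Option keys that correspond to sys/health query parameters.
  @query_keys [
    :standbyok,
    :perfstandbyok,
    :activecode,
    :standbycode,
    :drsecondarycode,
    :haunhealthycode,
    :performancestandbycode,
    :removedcode,
    :sealedcode,
    :uninitcode
  ]

  # Builds the request path, keeping the caller's option order.
  def path(opts \\ []) do
    query =
      opts
      |> Keyword.take(@query_keys)
      # Omit flags set to false instead of sending "flag=false".
      |> Enum.reject(fn {_key, value} -> value == false end)
      |> URI.encode_query()

    if query == "", do: "/v1/sys/health", else: "/v1/sys/health?" <> query
  end
end
```

Dropping `false` flags rather than sending them is a conservative choice, since a server may treat the mere presence of a flag as enabling it.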

Security Considerations

  • Health endpoints are typically unauthenticated for monitoring purposes
  • Seal status information is safe to expose publicly
  • Leader status may reveal cluster topology information
  • Use appropriate network controls to limit access to health endpoints

Summary

Types

health_opts(): Health check options.

health_status(): Health status response structure.

Functions

check(opts \\ []): Check Vault server health status.

Types

health_opts()

@type health_opts() :: [
  standbyok: boolean(),
  perfstandbyok: boolean(),
  activecode: pos_integer(),
  standbycode: pos_integer(),
  drsecondarycode: pos_integer(),
  haunhealthycode: pos_integer(),
  performancestandbycode: pos_integer(),
  removedcode: pos_integer(),
  sealedcode: pos_integer(),
  uninitcode: pos_integer(),
  timeout: pos_integer(),
  retry_attempts: non_neg_integer(),
  namespace: String.t()
]

Health check options.

health_status()

@type health_status() :: %{
  initialized: boolean(),
  sealed: boolean(),
  standby: boolean(),
  performance_standby: boolean(),
  server_time_utc: integer(),
  version: String.t(),
  cluster_name: String.t(),
  cluster_id: String.t(),
  replication_performance_mode: String.t(),
  replication_dr_mode: String.t(),
  replication_primary_canary_age_ms: integer(),
  ha_connection_healthy: boolean(),
  last_request_forwarding_heartbeat_ms: integer(),
  removed_from_cluster: boolean(),
  clock_skew_ms: integer(),
  echo_duration_ms: integer(),
  enterprise: boolean(),
  license: map() | nil,
  last_wal: integer() | nil
}

Health status response structure.
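
A map of this shape can be reduced to a single role with pattern matching. This is a sketch only: `HealthRole` is a hypothetical module, and the precedence of the clauses (uninitialized, then sealed, then standby states) is an assumption rather than documented Vaultx behavior.

```elixir
defmodule HealthRole do
  # Hypothetical helper, not part of Vaultx. Clauses are tried top to
  # bottom, so the order below defines which state wins.
  def role(%{initialized: false}), do: :uninitialized
  def role(%{sealed: true}), do: :sealed
  def role(%{performance_standby: true}), do: :performance_standby
  def role(%{standby: true}), do: :standby
  def role(%{initialized: true, sealed: false}), do: :active

  # Anything that lacks the expected keys is reported as unknown.
  def role(_other), do: :unknown
end
```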

Functions

check(opts \\ [])

Check Vault server health status.

This endpoint returns the health status of Vault. It matches the semantics of a Consul HTTP health check and provides a simple way to monitor the health of a Vault instance.

Options

  • :standbyok - Return active status code for standby nodes (default: false)
  • :perfstandbyok - Return active status code for performance standby (default: false)
  • :activecode - Status code for active nodes (default: 200)
  • :standbycode - Status code for standby nodes (default: 429)

Examples

# Basic health check
{:ok, health} = Vaultx.Sys.Health.check()

# Load balancer friendly check
{:ok, health} = Vaultx.Sys.Health.check([standbyok: true])
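
The examples above match only on `{:ok, health}`. A caller that should not crash on failure can branch on the result instead. The sketch below assumes failures are returned as `{:error, reason}`, which this documentation does not show; `HealthReport` is a hypothetical module.

```elixir
defmodule HealthReport do
  # Hypothetical helper, not part of Vaultx. The {:error, reason}
  # shape is an assumption about check/1's failure return value.
  def summarize({:ok, %{initialized: false}}), do: "not initialized"
  def summarize({:ok, %{sealed: true}}), do: "sealed"

  def summarize({:ok, %{initialized: true, sealed: false} = health}) do
    "ok (version #{Map.get(health, :version, "unknown")})"
  end

  def summarize({:error, reason}), do: "health check failed: #{inspect(reason)}"
end
```

In practice the argument would be the return value of `Vaultx.Sys.Health.check/1`, for example `Vaultx.Sys.Health.check() |> HealthReport.summarize()`.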