ProcessHub.Strategy.Distribution.CentralizedLoadBalancer (ProcessHub v0.4.0-beta)

Provides implementation for distribution behavior using centralized load balancing.

This strategy implements a centralized approach to process distribution where a single leader node collects performance metrics from all nodes in the cluster and makes distribution decisions based on real-time load data.

Unlike the ProcessHub.Strategy.Distribution.ConsistentHashing strategy that uses deterministic hash-based distribution, the centralized load balancer actively monitors cluster resources and dynamically assigns processes to the least loaded nodes.

How It Works

The centralized load balancer operates through a leader election mechanism:

  1. Leader Election: Uses the :elector library to elect a single leader node. The leader is determined by highest uptime: the node that has been running the longest becomes the leader. This selection criterion is currently not configurable.
  2. Metrics Collection: Each node periodically sends performance metrics to the leader.
  3. Load Scoring: The leader calculates load scores based on multiple system metrics.
  4. Distribution: New processes are assigned to the nodes with the lowest load scores.
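
The uptime rule in step 1 can be illustrated with a minimal sketch. It does not use the :elector API; the `LeaderSketch` module, the `elect/1` function, and the uptime map are hypothetical names introduced only for illustration, and uptimes are assumed to be in milliseconds.

```elixir
defmodule LeaderSketch do
  # Hypothetical helper, not part of ProcessHub or :elector.
  # `uptimes` maps node names to their uptime in milliseconds (assumed unit).
  @spec elect(%{required(node()) => non_neg_integer()}) :: node() | nil
  def elect(uptimes) when map_size(uptimes) == 0, do: nil

  def elect(uptimes) do
    # The node that has been running the longest wins the election.
    {leader, _uptime} = Enum.max_by(uptimes, fn {_node, uptime} -> uptime end)
    leader
  end
end

uptimes = %{:"a@host" => 120_000, :"b@host" => 450_000, :"c@host" => 30_000}
leader = LeaderSketch.elect(uptimes)
```

Here `leader` is the node with the largest uptime value. How ties are broken in the real election is not covered by this sketch.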

The load scoring algorithm considers the following BEAM VM metrics:

  • Scheduler utilization (40% weight) - CPU usage across schedulers
  • Run queue length (30% weight) - Number of processes waiting to run
  • Process count (20% weight) - Total number of processes on the node
  • Memory usage (10% weight) - Total memory consumption
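
To make the weighting concrete, the following is a minimal sketch of a weighted load score. Only the four weights come from the list above. The normalization step, the ceiling values, and the names `LoadScoreSketch`, `score/2`, and `@default_caps` are assumptions made for illustration; the actual formula used by the strategy may differ.

```elixir
defmodule LoadScoreSketch do
  # Weights from the documented list (40/30/20/10).
  @weights %{
    scheduler_utilization: 0.4,
    run_queue_total: 0.3,
    process_count: 0.2,
    memory_usage: 0.1
  }

  # Hypothetical ceilings used to scale each raw metric into 0.0..1.0.
  # These values are assumptions, not ProcessHub defaults.
  @default_caps %{
    run_queue_total: 100,
    process_count: 100_000,
    memory_usage: 8 * 1024 * 1024 * 1024
  }

  # Returns a score between 0.0 (idle) and roughly 1.0 (saturated).
  def score(metrics, caps \\ @default_caps) do
    normalized = %{
      # Scheduler utilization is assumed to already be a 0.0..1.0 ratio.
      scheduler_utilization: clamp(metrics.scheduler_utilization),
      run_queue_total: clamp(metrics.run_queue_total / caps.run_queue_total),
      process_count: clamp(metrics.process_count / caps.process_count),
      memory_usage: clamp(metrics.memory_usage / caps.memory_usage)
    }

    Enum.reduce(@weights, 0.0, fn {key, weight}, acc ->
      acc + weight * Map.fetch!(normalized, key)
    end)
  end

  # Keeps a normalized value inside the 0.0..1.0 range.
  defp clamp(value), do: value |> max(0.0) |> min(1.0)
end

metrics = %{
  scheduler_utilization: 0.5,
  run_queue_total: 10,
  process_count: 20_000,
  memory_usage: 2 * 1024 * 1024 * 1024
}

score = LoadScoreSketch.score(metrics)
```

With these assumed ceilings, the example node scores about 0.295 (0.4 × 0.5 + 0.3 × 0.1 + 0.2 × 0.2 + 0.1 × 0.25). A lower score means a less loaded node.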

Key Characteristics

No Process Replication

Important Limitation

This strategy does not support process replication. Only a single instance of each process can exist at any time across the cluster. This makes it unsuitable for use cases requiring high availability through process redundancy.

Experimental Status

Experimental Feature

This distribution strategy is currently experimental and should not be used in production environments without thorough testing. The implementation may change in future versions.

Single Hub Limitation

Configuration Constraint

When ProcessHub runs multiple hubs with different configurations, only one hub in the cluster can use the centralized load balancer strategy at any given time.

No Process Shuffling

Unlike some distribution strategies, the centralized load balancer does not shuffle existing processes when new nodes join the cluster. New processes will be distributed to optimal nodes based on current load, but existing processes remain where they are. However, when a node leaves the cluster, its processes will be redistributed to other available nodes.

Configuration Options

The strategy can be configured using the following struct fields:

  • :max_history_size (default: 30) - Maximum number of historical load scores to maintain for each node. Used for calculating weighted averages and trend analysis.

  • :weight_decay_factor (default: 0.9) - Exponential decay factor applied to historical scores. Values closer to 1.0 give more weight to historical data, while values closer to 0.0 prioritize recent measurements.

  • :push_interval (default: 10_000) - Interval in milliseconds between metric collection and transmission from each node to the leader.
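
The interaction between :max_history_size and :weight_decay_factor can be sketched as a bounded history with an exponentially weighted average. This is an illustrative sketch, not the library's implementation: the module `ScoreHistorySketch`, its functions, and the newest-first ordering of the history list are all assumptions.

```elixir
defmodule ScoreHistorySketch do
  # Prepends the newest score and drops entries beyond `max_history_size`.
  # The history is assumed to be ordered newest first.
  def push(history, score, max_history_size) do
    Enum.take([score | history], max_history_size)
  end

  # Weighted average in which a score that is `age` steps old
  # receives the weight decay^age.
  def weighted_average([], _decay), do: 0.0

  def weighted_average(history, decay) do
    {weighted_sum, weight_total} =
      history
      |> Enum.with_index()
      |> Enum.reduce({0.0, 0.0}, fn {score, age}, {sum, total} ->
        weight = :math.pow(decay, age)
        {sum + weight * score, total + weight}
      end)

    weighted_sum / weight_total
  end
end

history =
  []
  |> ScoreHistorySketch.push(0.2, 3)
  |> ScoreHistorySketch.push(0.4, 3)
  |> ScoreHistorySketch.push(0.9, 3)

smoothed = ScoreHistorySketch.weighted_average(history, 0.9)
recent_only = ScoreHistorySketch.weighted_average(history, 0.0)
```

With a decay factor of 0.0 only the newest score counts, so `recent_only` equals the latest measurement. With a factor near 1.0 older scores contribute almost as much as new ones, which smooths out short spikes.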

Usage Example

iex> distribution_strategy = %ProcessHub.Strategy.Distribution.CentralizedLoadBalancer{
...>   max_history_size: 50,
...>   weight_decay_factor: 0.8,
...>   push_interval: 5_000
...> }
iex> hub_config = %ProcessHub{
...>   hub_id: :my_hub,
...>   distribution_strategy: distribution_strategy
...> }

Comparison with ConsistentHashing

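| Feature             | CentralizedLoadBalancer | ConsistentHashing     |
|---------------------|-------------------------|-----------------------|
| Distribution Basis  | Real-time load metrics  | Deterministic hashing |
| Process Shuffling   | No (on node join)       | Yes (minimal)         |
| Replication Support | No                      | Yes                   |
| Leader Dependency   | Yes                     | No                    |
| Production Ready    | No (experimental)       | Yes                   |
| Multiple Hubs       | No (single hub only)    | Yes                   |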

Types

@type node_metrics() :: %{
  scheduler_utilization: float(),
  run_queue_total: non_neg_integer(),
  process_count: non_neg_integer(),
  memory_usage: non_neg_integer(),
  timestamp: integer()
}
@type node_score() :: %{
  current_score: float(),
  historical_scores: [float()],
  last_updated: integer()
}
@type t() :: %ProcessHub.Strategy.Distribution.CentralizedLoadBalancer{
  calculator_pid: pid() | nil,
  max_history_size: pos_integer(),
  nodeup_redistribution: boolean(),
  push_interval: pos_integer(),
  scoreboard: %{required(node()) => node_score()},
  weight_decay_factor: float()
}
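
As an illustration of the node_metrics() shape, the following sketch builds such a map from standard BEAM introspection calls. The module `MetricsSketch` and its `collect/1` function are hypothetical. Scheduler utilization is passed in as an argument because measuring it requires sampling over a time window, and the millisecond timestamp unit is an assumption.

```elixir
defmodule MetricsSketch do
  # Hypothetical collector, not part of ProcessHub.
  # `scheduler_utilization` is supplied by the caller as a 0.0..1.0 ratio,
  # since computing it needs two samples taken some time apart.
  def collect(scheduler_utilization \\ 0.0) do
    %{
      scheduler_utilization: scheduler_utilization,
      # Total number of processes and ports waiting in all run queues.
      run_queue_total: :erlang.statistics(:total_run_queue_lengths),
      # Number of processes currently alive on this node.
      process_count: :erlang.system_info(:process_count),
      # Total bytes currently allocated by the VM.
      memory_usage: :erlang.memory(:total),
      # Assumed unit: milliseconds since the Unix epoch.
      timestamp: System.system_time(:millisecond)
    }
  end
end

metrics = MetricsSketch.collect(0.25)
```

A map like this is what each node would send to the leader at every push interval.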

Functions

Returns a specification to start this module under a supervisor.

See Supervisor.

get_scoreboard(strategy)

Gets the scoreboard for debugging/monitoring purposes.
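
For monitoring, a scoreboard can be turned into a ranking of nodes from least to most loaded. The sketch below works on a hand-written map that follows the node_score() type and does not call get_scoreboard/1 itself; it assumes the returned value has the same shape as the scoreboard field of the strategy struct. `ScoreboardSketch` and `rank/1` are hypothetical names.

```elixir
defmodule ScoreboardSketch do
  # Hypothetical helper: returns {node, current_score} pairs,
  # ordered from the least loaded node to the most loaded one.
  def rank(scoreboard) do
    scoreboard
    |> Enum.map(fn {node, %{current_score: score}} -> {node, score} end)
    |> Enum.sort_by(fn {_node, score} -> score end)
  end
end

# Example data shaped like the node_score() type; the values are made up.
scoreboard = %{
  :"a@host" => %{current_score: 0.62, historical_scores: [0.60, 0.58], last_updated: 1_700_000_000_000},
  :"b@host" => %{current_score: 0.18, historical_scores: [0.20, 0.25], last_updated: 1_700_000_000_000},
  :"c@host" => %{current_score: 0.41, historical_scores: [0.39, 0.45], last_updated: 1_700_000_000_000}
}

ranked = ScoreboardSketch.rank(scoreboard)
{least_loaded, _score} = hd(ranked)
```

The first entry in `ranked` is the node that would be the preferred target for a new process under a lowest-score-first policy.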