View Source Manual distribution
Introduction
By default ProcessHub
uses the ProcessHub.Strategy.Distribution.ConsistentHashing
strategy to distribute processes.
This is automatic and the ProcessHub
will decide which node to start the process on.
This strategy is useful for most use cases, but sometimes you may want to distribute processes manually. This guide will show you how to do that.
By manually distributing processes, you can control which processes are started on which nodes. This can be useful when you want to distribute processes based on specific criteria, such as the node's hardware capabilities or the process's workload.
Let's dive in a see some examples.
Example GenServer
As always our processes should have some template to follow. Here is a simple GenServer
process that we will use in our examples:
defmodule MyProcess do
use GenServer
def start_link(state) do
GenServer.start_link(__MODULE__, state)
end
def init(state) do
{:ok, state}
end
end
The configuration
First, let's configure the ProcessHub
to use the ProcessHub.Strategy.Distribution.Guided
strategy.
defmodule MyApp.Application do
use Application
def start(_type, _args) do
children = [process_hub()]
opts = [strategy: :one_for_one, name: MyApp.Supervisor]
Supervisor.start_link(children, opts)
end
defp process_hub() do
{ProcessHub, %ProcessHub{
hub_id: :my_hub,
# We have to swap the default distribution strategy with the guided one.
distribution_strategy: %ProcessHub.Strategy.Distribution.Guided{}
}}
end
end
Start 2 nodes
Now let's start 2 nodes and connect them to each other.
iex --name node1@127.0.0.1 --cookie mycookie -S mix
iex --name node2@127.0.0.1 --cookie mycookie -S mix
From node 1, connect to node 2(or vice versa):
Node.connect(:"node2@127.0.0.1")
Start distributed processes.
Now let's start a process on node 1 and guide it to node 2.
iex> node1_child = %{id: :node1_child, start: {MyProcess, :start_link, [nil]}}
iex> node2_child = %{id: :node2_child, start: {MyProcess, :start_link, [nil]}}
iex> ProcessHub.start_children(:my_hub, [node1_child, node2_child], [
...> child_mapping: %{
...> node1_child: [:"node1@127.0.0.1"],
...> node2_child: [:"node2@127.0.0.1"],
...> }
...> ])
In the example above, we started 2 processes, node1_child
and node2_child
. We also specified that node1_child
should be started on node 1 and node2_child
should be started on node 2.
We do this by adding the child_mapping
option to the ProcessHub.start_children/3
function. The child_mapping
option is a map that specifies which node to start the process on.
Each key in the map is the id
of the process and the value is a list of nodes to start the process on.
Why use lists to specify the node?
The reason we use lists for the
child_mapping
is that we can specify multiple nodes for a process if we want to replicate the process on multiple nodes.This requires the
ProcessHub.Strategy.Redundancy.Replication
strategy to be used. Which we will cover in another guide. For example:node1_child: [:"node1@127.0.0.1", :"node2@127.0.0.1"]
Checking the registered processes
Now let's check the registered processes on each node.
iex> ProcessHub.process_list(:my_hub, :global)
[
node1_child: ["node1@127.0.0.1": #PID<0.280.0>],
node2_child: ["node2@127.0.0.1": #PID<22474.276.0>]
]
We can see that the node1_child
process is registered on node 1 and the node2_child
process is registered on node 2.
The
child_mapping
is mandatoryIf we decide to use the
ProcessHub.Strategy.Distribution.Guided
strategy, we have to specify thechild_mapping
option when starting the processes. Otherwise{:error, :missing_child_mapping}
will be returned when starting the processes.