Cluster-aware startup and shutdown for SuperCache.
Start options
All options accepted by `SuperCache.Bootstrap.start!/1` are valid here,
plus the following cluster-specific keys:
| Option | Type | Default | Description |
|---|---|---|---|
| `:cluster` | atom | `:distributed` | `:local` or `:distributed` |
| `:replication_factor` | integer | 2 | Total copies (primary + replicas) |
| `:replication_mode` | atom | `:async` | `:async`, `:sync`, or `:strong` (3PC) |
| `:num_partition` | integer | scheduler count | Number of ETS partitions |
| `:table_type` | atom | `:set` | ETS table type |
Node-source options (forwarded to NodeMonitor)
| Option | Type | Default | Description |
|---|---|---|---|
| `:nodes` | [node()] | — | Static peer list evaluated once at start-up. |
| `:nodes_mfa` | {module, atom, [term]} | — | Called at init and on every `:refresh_ms` tick. |
| `:refresh_ms` | pos_integer | 5_000 | MFA re-evaluation interval (ignored when `:nodes_mfa` is not set). |
When neither `:nodes` nor `:nodes_mfa` is supplied, `NodeMonitor` falls
back to watching all Erlang-connected nodes (legacy behaviour).
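As a hedged sketch of combining cluster and node-source options (the option names come from the tables above; `MyApp.Discovery.peers/0` is a hypothetical discovery function, not part of SuperCache):

```elixir
# Start in distributed mode with a dynamic peer list.
# MyApp.Discovery.peers/0 is a hypothetical function returning [node()];
# NodeMonitor re-evaluates it on every :refresh_ms tick.
SuperCache.Cluster.Bootstrap.start!(
  replication_factor: 2,
  replication_mode: :sync,
  num_partition: 8,
  nodes_mfa: {MyApp.Discovery, :peers, []},
  refresh_ms: 10_000
)
```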
Replication modes
- `:async` — fire-and-forget. Lowest latency; eventual consistency.
- `:sync` — synchronous delivery to all replicas before returning. One extra RTT per write.
- `:strong` — three-phase commit via `SuperCache.Cluster.ThreePhaseCommit`. Guarantees that either all replicas apply a write or none do. Three extra RTTs per write.
Start sequence
- Validate options.
- Write all options to `SuperCache.Config`.
- Reconfigure `NodeMonitor` with the node-source opts (`:nodes`, `:nodes_mfa`, `:refresh_ms`). This is done early so `Manager` sees the correct managed set when it health-checks peers in later steps.
- If other nodes are already live, verify that every structural config key on this node matches the cluster. Raises `ArgumentError` on any mismatch. No ETS tables have been created at this point, so a rejection leaves the node in a completely clean state and `start!/1` can be retried with corrected opts without hitting "table already exists".
- Start the `Partition` and `Storage` subsystems.
- Start the `Buffer` write-buffer pool.
- If `:replication_mode` is `:strong`, run crash recovery via `ThreePhaseCommit.recover/0` to resolve in-doubt transactions left over from a previous crash.
- If other nodes are already live, request a full sync so this node receives a consistent snapshot of each partition.
- Mark `:started` in config.
Stop sequence
- Stop the `Buffer` (flushes pending lazy writes).
- Stop `Storage` (deletes ETS tables).
- Stop `Partition` (clears partition registry).
- Mark `:started` as `false`.
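The two sequences above can be sketched as a minimal lifecycle, assuming a two-copy async deployment:

```elixir
# Minimal lifecycle sketch: start in cluster mode, use the cache, stop.
:ok = SuperCache.Cluster.Bootstrap.start!(
  replication_factor: 2,
  replication_mode: :async
)

# ... reads and writes through the regular SuperCache API ...

# Stopping flushes the Buffer, deletes the ETS tables, clears the
# partition registry, and marks :started as false.
:ok = SuperCache.Cluster.Bootstrap.stop()
```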
Config verification
When a node joins a running cluster, `start!/1` calls `verify_cluster_config!/1`,
which performs a pairwise comparison of every structural config key against all
live peers via `:erpc`. The keys checked are:

`[:key_pos, :partition_pos, :num_partition, :table_type, :replication_factor, :replication_mode]`

`:started`, `:cluster`, and `:table_prefix` are intentionally excluded:
`:started` is a liveness flag that will differ during bootstrap; `:cluster`
is always `:distributed` in this module; `:table_prefix` must already match
for ETS tables to be addressable, so a mismatch would cause an earlier crash.
Any mismatch raises `ArgumentError` listing every divergent key with both
the local and remote values, so the operator can identify the problem
immediately rather than observing silent data inconsistency later.
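A simplified sketch of that pairwise check, for illustration only (the actual implementation lives in `verify_cluster_config!/1`; this version diffs the exported config of each live peer against the local one):

```elixir
# Illustrative sketch, not the real implementation: for each live peer,
# fetch its structural config via :erpc and collect divergent keys.
local = SuperCache.Cluster.Bootstrap.export_config()

for peer <- Node.list() do
  remote = :erpc.call(peer, SuperCache.Cluster.Bootstrap, :export_config, [])

  diverged =
    for {key, value} <- local, Map.get(remote, key) != value do
      {key, local: value, remote: Map.get(remote, key)}
    end

  unless diverged == [] do
    raise ArgumentError, "config mismatch with #{peer}: #{inspect(diverged)}"
  end
end
```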
Summary
Functions
Return the structural config of this node as a map.
Return the full partition map for this node as a list of
{partition_idx, {primary, replicas}} pairs.
Returns true when this node is running in distributed mode and has
completed start-up.
Start SuperCache in cluster mode with the given options.
Stop SuperCache and release all ETS resources.
Functions
@spec export_config() :: map()
Return the structural config of this node as a map.
Called via :erpc from a joining node during config verification.
Returns only the keys in @config_keys — never liveness flags.
Example
SuperCache.Cluster.Bootstrap.export_config()
# => %{key_pos: 0, partition_pos: 0, num_partition: 8,
# table_type: :set, replication_factor: 2, replication_mode: :async}
@spec fetch_partition_map(pos_integer()) :: [{non_neg_integer(), {node(), [node()]}}]
Return the full partition map for this node as a list of
{partition_idx, {primary, replicas}} pairs.
Called via :erpc from test helpers on the test node to read the
partition assignment of a remote peer without crossing the no-lambda
boundary. num must match SuperCache.Config.get_config(:num_partition)
on the calling node; callers should read that value locally and pass it
as an argument so the comparison is always against the same reference.
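The calling pattern described above can be sketched as follows (`:"b@host"` stands in for a real peer node name):

```elixir
# Hypothetical test-helper pattern: read :num_partition locally, then
# pass the same value to both the local and the remote call so the two
# partition maps are built against the same reference.
num = SuperCache.Config.get_config(:num_partition)

local_map  = SuperCache.Cluster.Bootstrap.fetch_partition_map(num)
remote_map = :erpc.call(:"b@host", SuperCache.Cluster.Bootstrap, :fetch_partition_map, [num])

# Compare the assignments as seen from each side.
local_map == remote_map
```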
Example
SuperCache.Cluster.Bootstrap.fetch_partition_map(8)
# => [{0, {:"a@host", [:"b@host"]}}, ...]
@spec running?() :: boolean()
Returns true when this node is running in distributed mode and has
completed start-up.
Called remotely by Manager.node_running?/1 via :erpc.
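A remote liveness check in the style described above (the peer name is illustrative):

```elixir
# Ask a peer whether it has completed distributed start-up,
# the same way Manager.node_running?/1 does.
:erpc.call(:"b@host", SuperCache.Cluster.Bootstrap, :running?, [])
```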
@spec start!(keyword()) :: :ok
Start SuperCache in cluster mode with the given options.
Raises ArgumentError for invalid options or when the node's structural
config does not match an already-running cluster.
@spec stop() :: :ok
Stop SuperCache and release all ETS resources.