reckon_db_consistency_checker (reckon_db v1.2.6)


Cluster consistency checker for reckon-db

Provides active split-brain detection and cluster health verification. Implements multi-layer consistency checking:

1. Membership Consensus - All nodes agree on cluster membership
2. Raft Log Consistency - Log terms and indices match across followers
3. Leader Consensus - All nodes agree on who the leader is
4. Quorum Verification - Sufficient nodes available for operations

Split-Brain Detection:

Split-brain occurs when network partitions cause nodes to form independent clusters. This module detects such scenarios by:

- Collecting membership views from all nodes via RPC
- Comparing views to find inconsistencies
- Detecting when nodes report different leaders
- Identifying when quorum is at risk

Academic References:

- Ongaro, D. and Ousterhout, J. (2014). In Search of an Understandable Consensus Algorithm (Raft). USENIX ATC 2014.
- Brewer, E. (2012). CAP Twelve Years Later: How the "Rules" Have Changed. IEEE Computer, 45(2), 23-29.

See also: reckon_db_health_prober.

Summary

Functions

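check_now(StoreId)

Force an immediate consistency check

get_quorum_status(StoreId)

Get current quorum status

get_status(StoreId)

Get current consistency status

on_status_change(StoreId, Callback)

Register a callback for status changes

remove_callback(StoreId, Ref)

Remove a previously registered callback

start_link(Store_config)

Start the consistency checker

verify_leader_consensus(StoreId)

Verify all nodes agree on the current leader

verify_membership_consensus(StoreId)

Verify membership consensus across all cluster nodes

verify_raft_consistency(StoreId)

Verify Raft log consistency across cluster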

Types

check_detail/0

-type check_detail() :: #{status := ok | warning | error, message := binary(), data := term()}.

check_result/0

-type check_result() ::
          #{status := consistency_status(),
            checks := #{atom() => check_detail()},
            timestamp := integer(),
            duration_us := non_neg_integer()}.

consistency_status/0

-type consistency_status() :: healthy | degraded | split_brain | no_quorum.

store_config/0

-type store_config() ::
          #store_config{store_id :: atom(),
                        data_dir :: string(),
                        mode :: single | cluster,
                        timeout :: pos_integer(),
                        writer_pool_size :: pos_integer(),
                        reader_pool_size :: pos_integer(),
                        gateway_pool_size :: pos_integer(),
                        options :: map()}.

Functions

check_now(StoreId)

-spec check_now(atom()) -> check_result().

Force an immediate consistency check
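
Since check_now/1 returns a check_result() map, a caller will usually want to pull out the checks that did not pass. The following is a minimal sketch of one way to do that. The module name check_result_sketch, the function failing_checks/1, and the check names used in the usage comment are hypothetical; only the map shape comes from the check_result() and check_detail() types above.

```erlang
-module(check_result_sketch).
-export([failing_checks/1]).

%% Return the sorted names of all checks whose status is not ok,
%% given a map shaped like check_result().
%%
%% Hypothetical usage (store id and check names are assumptions):
%%   Result = reckon_db_consistency_checker:check_now(my_store),
%%   check_result_sketch:failing_checks(Result).
-spec failing_checks(map()) -> [atom()].
failing_checks(#{checks := Checks}) ->
    lists:sort([Name || {Name, #{status := Status}} <- maps:to_list(Checks),
                        Status =/= ok]).
```

Matching on the checks key only keeps the helper independent of the overall status value, so it works the same for degraded, split_brain, and no_quorum results.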

get_quorum_status(StoreId)

-spec get_quorum_status(atom()) -> {ok, map()} | {error, term()}.

Get current quorum status

Returns quorum availability and margin information.
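
To make "availability and margin" concrete, here is a sketch of the usual majority-quorum arithmetic. It is an illustration under stated assumptions, not this module's implementation: the module name quorum_sketch, the function quorum_info/2, and the map keys has_quorum, required, and margin are all hypothetical, and the keys of the map actually returned by get_quorum_status/1 are not documented here.

```erlang
-module(quorum_sketch).
-export([quorum_info/2]).

%% Majority quorum for a cluster of Total voting members, of which
%% Available are currently reachable. Margin is how many more nodes
%% can be lost before quorum is gone (negative once it is lost).
-spec quorum_info(pos_integer(), non_neg_integer()) -> map().
quorum_info(Total, Available)
  when is_integer(Total), Total > 0,
       is_integer(Available), Available >= 0 ->
    Required = Total div 2 + 1,
    #{has_quorum => Available >= Required,
      required => Required,
      margin => Available - Required}.
```

For a three-node cluster with two reachable nodes, quorum still holds but the margin is zero, which is the situation the module description calls quorum "at risk".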

get_status(StoreId)

-spec get_status(atom()) -> {ok, consistency_status()} | {error, not_running}.

Get current consistency status

handle_call(Request, From, State)

handle_cast(Msg, State)

handle_info(Info, State)

init(Store_config)

on_status_change(StoreId, Callback)

-spec on_status_change(atom(), fun((consistency_status()) -> any())) -> reference().

Register a callback for status changes
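
A callback only needs to accept a consistency_status() value. One cautious pattern, sketched below, is to keep the callback trivial and forward the status to another process as a message. This assumes, without confirmation from the source, that callbacks may run inside the checker process and should therefore return quickly. The module name status_callback_sketch, the function forwarding_callback/1, and the message shape {consistency_status, Status} are hypothetical.

```erlang
-module(status_callback_sketch).
-export([forwarding_callback/1]).

%% Build a fun((consistency_status()) -> any()) that forwards each
%% status to Pid as a {consistency_status, Status} message.
%%
%% Hypothetical usage (store id is an assumption):
%%   Cb = status_callback_sketch:forwarding_callback(self()),
%%   Ref = reckon_db_consistency_checker:on_status_change(my_store, Cb),
%%   ...
%%   ok = reckon_db_consistency_checker:remove_callback(my_store, Ref).
-spec forwarding_callback(pid()) -> fun((atom()) -> any()).
forwarding_callback(Pid) when is_pid(Pid) ->
    fun(Status) -> Pid ! {consistency_status, Status} end.
```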

remove_callback(StoreId, Ref)

-spec remove_callback(atom(), reference()) -> ok.

Remove a previously registered callback

start_link(Store_config)

-spec start_link(store_config()) -> {ok, pid()} | {error, term()}.

Start the consistency checker

terminate(Reason, State)

verify_leader_consensus(StoreId)

-spec verify_leader_consensus(atom()) -> {ok, map()} | {error, term()}.

Verify all nodes agree on the current leader
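
The underlying comparison can be sketched as follows, assuming each node's view of the leader has already been collected. This is not the module's implementation: the module name leader_consensus_sketch, the function leader_agreement/1, the input shape, and the returned map keys are hypothetical and chosen only to mirror the {ok, map()} | {error, term()} shape of the spec.

```erlang
-module(leader_consensus_sketch).
-export([leader_agreement/1]).

%% Reports is a list of {Node, Leader} pairs, where Leader is the node
%% that Node believes is leader, or undefined if it knows of none.
-spec leader_agreement([{node(), node() | undefined}]) ->
          {ok, map()} | {error, term()}.
leader_agreement([]) ->
    {error, no_reports};
leader_agreement(Reports) ->
    %% usort collapses identical answers, so one element means agreement.
    case lists:usort([Leader || {_Node, Leader} <- Reports]) of
        [undefined] -> {error, no_leader};
        [Leader]    -> {ok, #{consensus => true, leader => Leader}};
        Leaders     -> {ok, #{consensus => false, leaders => Leaders}}
    end.
```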

verify_membership_consensus(StoreId)

-spec verify_membership_consensus(atom()) -> {ok, map()} | {error, term()}.

Verify membership consensus across all cluster nodes

Collects membership views from each node and compares them. Returns consensus if all nodes agree, split_brain if they disagree.
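
The comparison step described above can be sketched as a pure function over views that have already been collected. The module name membership_sketch, the function compare_views/1, and the tagged-tuple results are hypothetical; the real function wraps its result in {ok, map()} per the spec, and the keys of that map are not documented here.

```erlang
-module(membership_sketch).
-export([compare_views/1]).

%% Views is a list of {Node, Members} pairs: the membership list each
%% node reported. Member order and duplicates are ignored.
-spec compare_views([{node(), [node()]}]) ->
          {consensus, [node()]} | {split_brain, [{node(), [node()]}]} |
          {error, no_views}.
compare_views([]) ->
    {error, no_views};
compare_views(Views) ->
    Normalized = [{Node, lists:usort(Seen)} || {Node, Seen} <- Views],
    case lists:usort([Seen || {_Node, Seen} <- Normalized]) of
        [Agreed]  -> {consensus, Agreed};
        _Distinct -> {split_brain, Normalized}
    end.
```

Returning the normalized per-node views in the split_brain case keeps enough detail to see which nodes ended up on which side of a partition.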

verify_raft_consistency(StoreId)

-spec verify_raft_consistency(atom()) -> {ok, map()} | {error, term()}.

Verify Raft log consistency across cluster

Checks that follower nodes have consistent log terms and indices. Significant divergence may indicate replication issues.
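
As an illustration of what "significant divergence" could mean, the sketch below flags followers whose term differs from the leader's or whose last log index trails by more than a threshold. Everything here is an assumption made for the example: the module name raft_lag_sketch, the function divergent_followers/3, the input shapes, the MaxLag threshold, and the reason tuples. The source does not state how the module measures or reports divergence.

```erlang
-module(raft_lag_sketch).
-export([divergent_followers/3]).

%% Leader is {Term, LastIndex} for the leader's log.
%% Followers is a list of {Node, Term, LastIndex} triples.
%% MaxLag is a hypothetical tolerance, in log entries.
-spec divergent_followers({non_neg_integer(), non_neg_integer()},
                          [{node(), non_neg_integer(), non_neg_integer()}],
                          non_neg_integer()) -> [{node(), tuple()}].
divergent_followers({LeaderTerm, LeaderIndex}, Followers, MaxLag) ->
    [{Node, Reason}
     || {Node, Term, Index} <- Followers,
        Reason <- classify(LeaderTerm, LeaderIndex, Term, Index, MaxLag)].

%% Return a one-element list with the reason, or [] if the follower
%% is within tolerance. A term mismatch takes priority over index lag.
classify(LTerm, _LIndex, Term, _Index, _MaxLag) when Term =/= LTerm ->
    [{term_mismatch, Term}];
classify(_LTerm, LIndex, _Term, Index, MaxLag) when LIndex - Index > MaxLag ->
    [{index_lag, LIndex - Index}];
classify(_LTerm, _LIndex, _Term, _Index, _MaxLag) ->
    [].
```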