SuperCache.Cluster.WAL (SuperCache v1.3.0)


Write-Ahead Log for fast strong consistency.

Replaces the heavy Three-Phase Commit (3PC) protocol with a lighter-weight WAL-based approach that:

  1. Writes to local ETS immediately
  2. Appends the operation to the WAL (in-memory ETS for speed)
  3. Asynchronously replicates WAL entries to replicas
  4. Tracks acknowledgments from replicas
  5. Returns success once a majority has acked

This reduces strong-mode latency from ~1500µs (3PC) to ~200µs (WAL).

Design

  • WAL entries are stored in an ETS table for fast access
  • Each entry has a monotonically increasing sequence number
  • Replicas ack entries asynchronously via :erpc.cast
  • Majority acknowledgment determines commit success
  • Periodic cleanup of committed entries
  • Recovery on node restart replays uncommitted entries
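
The first two design points can be illustrated with a minimal sketch. It is not SuperCache's actual implementation: the module name `WalAppendSketch`, the `{seq, partition_idx, ops, status}` tuple layout, and the use of `:atomics` for the sequence counter are all assumptions made for illustration.

```elixir
defmodule WalAppendSketch do
  # Hypothetical sketch, not part of SuperCache.
  # Entries are stored as {seq, partition_idx, ops, status} (assumed layout).

  # Creates an ordered ETS table plus an atomic counter for sequence numbers.
  def new do
    %{table: :ets.new(:wal_sketch, [:ordered_set, :public]), seq: :atomics.new(1, [])}
  end

  # Appends an entry and returns its monotonically increasing sequence number.
  def append(%{table: table, seq: seq}, partition_idx, ops) do
    n = :atomics.add_get(seq, 1, 1)
    true = :ets.insert(table, {n, partition_idx, ops, :pending})
    n
  end

  # Flips the status field (tuple position 4) once the entry is committed.
  def mark_committed(%{table: table}, n) do
    :ets.update_element(table, n, {4, :committed})
  end

  # Returns all entries that are still pending, in sequence order.
  def pending(%{table: table}) do
    :ets.match_object(table, {:_, :_, :_, :pending})
  end
end
```

An `:ordered_set` table keeps entries sorted by sequence number, which makes in-order replay and range-based cleanup straightforward.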

Usage

This module is called internally by the Replicator when replication_mode is set to :strong. You should not call it directly.

Configuration

The WAL uses sensible defaults but can be tuned via application config:

config :super_cache, :wal,
  majority_timeout: 2_000,  # ms to wait for majority ack
  cleanup_interval: 5_000,  # ms between cleanup cycles
  max_pending: 10_000       # max uncommitted entries before backpressure
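
As a rough illustration of how such settings could be read at runtime, the sketch below merges application config over fallback values. The module name `WalConfigSketch` is hypothetical, and the fallback values simply mirror the example above; whether they match the library's real defaults is an assumption.

```elixir
defmodule WalConfigSketch do
  # Hypothetical reader, not part of SuperCache.
  # Fallbacks mirror the config example; the real defaults may differ.
  @defaults [majority_timeout: 2_000, cleanup_interval: 5_000, max_pending: 10_000]

  # Returns the effective settings: configured values override the fallbacks.
  def load do
    overrides = Application.get_env(:super_cache, :wal, [])
    Keyword.merge(@defaults, overrides)
  end
end
```

With only `max_pending` configured, `WalConfigSketch.load()[:majority_timeout]` still resolves to the fallback value.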

Example

# Called internally by Replicator.replicate/3
SuperCache.Cluster.WAL.commit(2, [{:put, {:user, 1, "Alice"}}])
# => :ok

Functions

ack(seq, replica_node)

@spec ack(non_neg_integer(), node()) :: :ok

Handle replication acknowledgment from a replica.

Called via :erpc.cast on the primary node when a replica has applied the WAL entry.

child_spec(init_arg)

Returns a specification to start this module under a supervisor.

See Supervisor.

commit(partition_idx, ops)

@spec commit(non_neg_integer(), [{atom(), any()}]) :: :ok | {:error, term()}

Commit operations via WAL.

  1. Writes to local ETS immediately
  2. Appends to WAL
  3. Asynchronously replicates to replicas
  4. Waits for majority acknowledgment
  5. Returns :ok on success, {:error, reason} on failure

This is the fast path for strong consistency: typically ~200µs, versus ~1500µs for 3PC.

Example

alias SuperCache.Cluster.WAL

WAL.commit(2, [{:put, {:user, 1, "Alice"}}])
# => :ok
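
The "wait for majority acknowledgment" step might look roughly like the sketch below, which blocks until enough distinct nodes have acked or a deadline passes. The `{:wal_ack, seq, node}` message shape, the `{:error, :majority_timeout}` reason, and the module name are assumptions; the real WAL is a GenServer and may track acks quite differently.

```elixir
defmodule MajorityWaitSketch do
  # Hypothetical sketch, not part of SuperCache.
  # Waits for `needed` acks from distinct nodes for sequence number `seq`.
  def await(seq, needed, timeout_ms) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    loop(seq, needed, MapSet.new(), deadline)
  end

  defp loop(seq, needed, acked, deadline) do
    if MapSet.size(acked) >= needed do
      :ok
    else
      # Only wait for the time left until the overall deadline.
      remaining = max(deadline - System.monotonic_time(:millisecond), 0)

      receive do
        # A MapSet ensures a duplicate ack from the same node counts once.
        {:wal_ack, ^seq, node} -> loop(seq, needed, MapSet.put(acked, node), deadline)
      after
        remaining -> {:error, :majority_timeout}
      end
    end
  end
end
```

Using a single deadline rather than a per-message timeout keeps the total wait bounded, which is the behavior a setting like `majority_timeout` suggests.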

recover()

@spec recover() :: :ok

Recover uncommitted WAL entries after restart.

Replays any entries that haven't been fully committed to ensure consistency.
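
A replay pass could be sketched as follows, assuming uncommitted entries are available as `{seq, partition_idx, ops, status}` tuples and that re-applying an entry is delegated to a caller-supplied function. Both the tuple layout and the `WalRecoverySketch` module are hypothetical.

```elixir
defmodule WalRecoverySketch do
  # Hypothetical sketch, not part of SuperCache.
  # Replays every :pending entry in ascending sequence order and returns :ok.
  def replay(entries, apply_fun) when is_function(apply_fun, 3) do
    entries
    |> Enum.filter(fn {_seq, _partition_idx, _ops, status} -> status == :pending end)
    |> Enum.sort_by(fn {seq, _partition_idx, _ops, _status} -> seq end)
    |> Enum.each(fn {seq, partition_idx, ops, _status} ->
      apply_fun.(seq, partition_idx, ops)
    end)
  end
end
```

Replaying in sequence order matters when later operations overwrite keys touched by earlier ones.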

replicate_and_ack(seq, partition_idx, ops)

@spec replicate_and_ack(non_neg_integer(), non_neg_integer(), [{atom(), any()}]) ::
  :ok

Apply WAL operations on a replica and acknowledge.

Called via :erpc.cast from the primary. Applies the operations locally, then sends an ack back to the primary.

start_link(opts \\ [])

@spec start_link(keyword()) :: :ignore | {:error, any()} | {:ok, pid()}

Starts the WAL GenServer.

stats()

@spec stats() :: %{pending: non_neg_integer(), acks_pending: non_neg_integer()}

Return WAL statistics.

Example

alias SuperCache.Cluster.WAL

WAL.stats()
# => %{pending: 42, acks_pending: 7}
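
One possible use of these statistics is a backpressure check against the `max_pending` setting. The sketch below is an assumption about how such a check could be written, not how SuperCache applies backpressure internally; the module name and the `>=` threshold are hypothetical.

```elixir
defmodule WalBackpressureSketch do
  # Hypothetical sketch, not part of SuperCache.
  # Reports whether the pending count has reached the configured limit.
  def overloaded?(%{pending: pending}, max_pending \\ 10_000) do
    pending >= max_pending
  end
end
```

A caller could pass the map returned by `stats()` directly, since the function only pattern-matches on the `pending` key.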