Nebulex.Adapters.Coherent (Nebulex.Distributed v3.0.0)

Adapter module for the coherent cache topology.

Features

  • Local cache with distributed invalidation across cluster nodes.
  • Automatic cache invalidation via Nebulex.Streams.
  • Configurable primary storage adapter.
  • Maximum read performance (pure local lookups).

Coherent Cache Topology

The coherent adapter provides a "local cache with distributed invalidation" pattern. Each node maintains its own independent local cache, but writes trigger invalidation events across the cluster via Nebulex.Streams.

Key characteristics:

  • Local Storage: Each node runs its own independent local cache instance. All read operations are served directly from the local cache with no network overhead.

  • Distributed Invalidation: When a cache entry is modified (inserted, updated, or deleted), an event is broadcast to all nodes in the cluster. Other nodes invalidate (delete) that entry from their local caches.

  • Eventual Consistency: After invalidation, the next read on other nodes results in a cache miss, forcing a fresh fetch from the System-of-Record (SoR).

  • Write-Invalidate Protocol: Only invalidation events are broadcast, not the actual values. This minimizes network overhead.

How It Works

  Node A                      Node B                      Node C
+-------------+             +-------------+             +-------------+
| Local Cache |             | Local Cache |             | Local Cache |
+------+------+             +------+------+             +------+------+
       |                           ^                           ^
       | cache event               | invalidate                | invalidate
       v                           |                           |
+-----------------------------------------------------------------------+
|          Nebulex.Streams (PubSub)  -->  Invalidator (Workers)          |
+-----------------------------------------------------------------------+

The process:

  1. Node A modifies a cache entry (e.g., Cache.put("key", value)).
  2. The local cache stores the value and emits a cache event.
  3. Nebulex.Streams broadcasts the event via Phoenix.PubSub.
  4. The Nebulex.Streams.Invalidator on Nodes B and C receives the event.
  5. The Invalidator deletes "key" from the local caches on B and C.
  6. Next read on B or C: cache miss → fetch fresh from SoR.
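
For illustration, here is a minimal read-through sketch of this flow in application code. It uses the MyApp.CoherentCache module defined in the Usage section below; MyApp.Repo and MyApp.Accounts.User are hypothetical stand-ins for the SoR, and the get!/put calls follow the standard Nebulex.Cache API.

defmodule MyApp.Accounts do
  alias MyApp.CoherentCache, as: Cache

  # Read path: always served from the node-local cache. On a miss (for example,
  # right after another node invalidated this key), fall back to the SoR and
  # re-cache the fresh value.
  def get_user(id) do
    case Cache.get!({:user, id}) do
      nil ->
        user = MyApp.Repo.get!(MyApp.Accounts.User, id)
        :ok = Cache.put({:user, id}, user)
        user

      user ->
        user
    end
  end

  # Write path: update the SoR first, then the cache. The local put stores the
  # value on this node and emits a cache event, which Nebulex.Streams broadcasts
  # so other nodes drop their stale copies (steps 2-6 above).
  def update_user(user, attrs) do
    with {:ok, user} <- MyApp.Repo.update(MyApp.Accounts.User.changeset(user, attrs)) do
      :ok = Cache.put({:user, user.id}, user)
      {:ok, user}
    end
  end
end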

When to Use

The coherent adapter is ideal for:

  • Read-Heavy Workloads: Maximum read performance since all reads are local.
  • Configuration/Reference Data: Data that rarely changes but must be consistent when it does.
  • Session Caches: When each node primarily serves its own sessions but needs consistency for shared data.
  • Simple Distributed Caching: When you want the simplicity of local caching with basic distributed consistency.

Comparison with Other Adapters

Aspect           | Coherent                       | Partitioned            | Multilevel
---------------- | ------------------------------ | ---------------------- | ---------------------
Data Location    | Independent per node           | Sharded across nodes   | L1 local + L2 shared
Read Performance | Fastest (local)                | Network hop required   | L1 fast, L2 slower
Write Behavior   | Local + invalidation broadcast | Remote write to owner  | Write through levels
Consistency      | Eventual (after invalidation)  | Strong (single owner)  | Varies by config
Network Overhead | Low (only invalidations)       | Medium (data transfer) | Medium to High

Primary Storage Adapter

This adapter depends on a local cache adapter (primary storage), adding a distributed invalidation layer on top of it. You don't need to manually define the primary storage cache; the adapter initializes it automatically as part of the supervision tree.

The :primary_storage_adapter option (defaults to Nebulex.Adapters.Local) configures which adapter to use for the local storage. Options for the primary adapter can be specified via the :primary configuration option.

Usage

The cache expects the :otp_app and :adapter options when used. The :otp_app should point to an OTP application with the cache configuration. Optionally, you can configure the desired primary storage adapter with the :primary_storage_adapter option (defaults to Nebulex.Adapters.Local). See the compile-time options for more information:

  • :primary_storage_adapter (atom/0) - The adapter module used for the primary (local) storage on each cluster node. The coherent adapter wraps this local adapter and adds distributed invalidation on top of it using Nebulex.Streams. This option allows you to choose which adapter to use for the local storage. The configuration for the primary adapter is specified via the :primary start option. The default value is Nebulex.Adapters.Local.

For example:

defmodule MyApp.CoherentCache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Coherent
end

Providing a custom :primary_storage_adapter:

defmodule MyApp.CoherentCache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Coherent,
    adapter_opts: [primary_storage_adapter: Nebulex.Adapters.Local]
end

Configuration in config/config.exs:

config :my_app, MyApp.CoherentCache,
  primary: [
    gc_interval: :timer.hours(12),
    max_size: 1_000_000
  ],
  stream_opts: [
    partitions: System.schedulers_online()
  ]

Add the cache to your supervision tree:

def start(_type, _args) do
  children = [
    {MyApp.CoherentCache, []},
    ...
  ]

  opts = [strategy: :one_for_one, name: MyApp.Supervisor]
  Supervisor.start_link(children, opts)
end

See Nebulex.Cache for more information.

Configuration Options

This adapter supports the following configuration options:

  • :primary (keyword/0) - Configuration options passed to the primary storage adapter specified via :primary_storage_adapter. The available options depend on which adapter you choose. Refer to the documentation of your chosen primary storage adapter for the complete list of supported options. The default value is [].

  • :stream_opts (keyword/0) - Configuration options for the event stream used for distributed invalidation. The stream broadcasts cache events (inserts, updates, deletes) to all nodes in the cluster, enabling automatic invalidation of stale entries. A configuration sketch using these options is shown after this list.

    Note: The following options are set automatically by the adapter and cannot be overridden:

    • :cache - Derived from the cache module.
    • :name - Derived from the cache instance name.
    • :broadcast_fun - Set to :broadcast_from to avoid self-invalidation.

    The default value is [].

    • :pubsub (atom/0) - The Phoenix.PubSub instance to use for event broadcasting.

      Defaults to Nebulex.Streams.PubSub. You can provide a custom PubSub instance if you want to use your application's existing PubSub. The specified PubSub must be started in your supervision tree.

      The default value is Nebulex.Streams.PubSub.

    • :backoff_initial (non_neg_integer/0) - Initial backoff time in milliseconds for listener re-registration.

      When the stream server fails to register the event listener, it will wait this amount of time before retrying. The backoff time increases exponentially up to :backoff_max.

      The default value is 1000.

    • :backoff_max (timeout/0) - Maximum backoff time in milliseconds for listener re-registration.

      When retrying failed listener registration, the backoff time will not exceed this value.

      The default value is 30000.

    • :partitions (pos_integer/0) - Number of partitions for parallel event processing.

      When provided, events are divided into this many independent sub-streams, allowing multiple invalidator workers to process events in parallel. Each partition has its own topic and worker.

      Typical values:

      • Omit or 1: Low event volume (single worker handles all events).
      • System.schedulers_online(): CPU-bound event processing.
      • System.schedulers_online() * 2: I/O-bound event processing.
    • :hash (Nebulex.Streams.hash/0) - Custom hash function for routing events to partitions.

      This function receives a Nebulex.Event.CacheEntryEvent and returns either:

      • A partition number (0 to partitions-1): routes the event to that partition.
      • :none: discards the event entirely.

      Defaults to Nebulex.Streams.default_hash/1 which uses phash2 for even distribution.

      The hash function is only used when :partitions is configured.

      The default value is &Nebulex.Streams.default_hash/1.
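
For example, here is a sketch of a configuration that reuses an application-owned PubSub and enables partitioned invalidation. The MyApp.PubSub name is an assumption; whichever PubSub instance :pubsub points at must already be running in your supervision tree, started before the cache.

# config/config.exs
config :my_app, MyApp.CoherentCache,
  primary: [
    gc_interval: :timer.hours(12)
  ],
  stream_opts: [
    # Reuse the application's PubSub instead of Nebulex.Streams.PubSub
    # (hypothetical name; it must be started before the cache).
    pubsub: MyApp.PubSub,
    # One invalidator worker per scheduler for parallel event processing.
    partitions: System.schedulers_online(),
    # Retry failed listener registration starting at 500 ms, capped at 10 s.
    backoff_initial: 500,
    backoff_max: :timer.seconds(10)
  ]

# Application supervision tree: start the PubSub before the cache.
children = [
  {Phoenix.PubSub, name: MyApp.PubSub},
  MyApp.CoherentCache
]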

Telemetry Events

Because the coherent adapter wraps the configured primary storage cache (backed by a local cache adapter), it emits Telemetry events for both the coherent cache and its primary storage. Additionally, Nebulex.Streams and Nebulex.Streams.Invalidator emit their own telemetry events for monitoring the invalidation process.

For example, the MyApp.CoherentCache cache defined above will emit:

  • [:my_app, :coherent_cache, :command, :start]
  • [:my_app, :coherent_cache, :primary, :command, :start]
  • [:my_app, :coherent_cache, :command, :stop]
  • [:my_app, :coherent_cache, :primary, :command, :stop]

Additionally, stream and invalidator events:

  • [:nebulex, :streams, :listener_registered]
  • [:nebulex, :streams, :broadcast]
  • [:nebulex, :streams, :invalidator, :started]
  • [:nebulex, :streams, :invalidator, :invalidate, :start]
  • [:nebulex, :streams, :invalidator, :invalidate, :stop]

See the Telemetry guide and Nebulex.Streams documentation for more information.
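
As a sketch, you could attach a handler to the invalidator events above to track how quickly invalidations are processed (see Best Practices below). This assumes the standard Telemetry span convention, where the :stop event carries a :duration measurement in native time units; the handler id and log messages are arbitrary.

defmodule MyApp.CacheTelemetry do
  require Logger

  def attach do
    :telemetry.attach_many(
      "coherent-cache-invalidation",
      [
        [:nebulex, :streams, :broadcast],
        [:nebulex, :streams, :invalidator, :invalidate, :stop]
      ],
      &__MODULE__.handle_event/4,
      nil
    )
  end

  # Logs how long each invalidation took on this node.
  def handle_event([:nebulex, :streams, :invalidator, :invalidate, :stop], measurements, metadata, _config) do
    duration_ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    Logger.debug("invalidation processed in #{duration_ms}ms: #{inspect(metadata)}")
  end

  # Logs each outgoing broadcast of a cache event.
  def handle_event([:nebulex, :streams, :broadcast], _measurements, metadata, _config) do
    Logger.debug("cache event broadcast: #{inspect(metadata)}")
  end
end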

Extended API

This adapter provides additional convenience functions on top of the Nebulex.Cache API.

Retrieving the primary storage or local cache module:

MyCache.__primary__()

Best Practices

  1. Use for read-heavy workloads: The coherent adapter excels when reads far outnumber writes.

  2. Configure partitions for high write volumes: If you have many concurrent writes, use :partitions in :stream_opts to parallelize invalidation processing.

  3. Ensure PubSub connectivity: The adapter relies on Phoenix.PubSub for event distribution. Ensure your cluster nodes can communicate via PubSub.

  4. Handle cache misses gracefully: After invalidation, reads result in cache misses. Ensure your application can fetch fresh data from the SoR.

  5. Monitor invalidation latency: Use telemetry events to monitor how quickly invalidations propagate across the cluster.

Caveats

  • Eventual Consistency Window: There is a latency between when a write occurs on one node and when the invalidation is processed on other nodes. During this window, other nodes may serve stale data. The duration depends on network latency and PubSub message delivery. For most use cases this is negligible, but time-sensitive applications should account for this.

  • Memory Usage: Each node maintains its own independent local cache. This is not full data replication; nodes only cache data they have locally accessed or written. Memory usage depends on each node's access patterns and the primary adapter's configuration (e.g., :max_size, :gc_interval).