Nebulex.Adapters.Replicated (Nebulex v2.6.4)
Built-in adapter for replicated cache topology.
Overall features
- Replicated cache topology.
- Configurable primary storage adapter.
- Cache-level locking when deleting all entries or adding new nodes.
- Key-level (or entry-level) locking for key-based write-like operations.
- Support for transactions via Erlang global name registration facility.
- Stats support relies on the primary storage adapter.
Replicated Cache Topology
A replicated cache is a clustered, fault-tolerant cache where data is fully replicated to every member in the cluster. This cache offers the fastest read performance with linear scalability for reads but poor scalability for writes (as writes must be processed by every member in the cluster). Because data is replicated to all servers, adding servers does not increase aggregate cache capacity.
There are several challenges to building a reliably replicated cache. The first is how to get it to scale and perform well. Updates to the cache have to be sent to all cluster nodes, and all cluster nodes have to end up with the same data, even if multiple updates to the same piece of data occur at the same time. Also, if a cluster node requests a lock, ideally it should not have to get all cluster nodes to agree on the lock, or at least it should do so in a very efficient way (:global is used here); otherwise it will scale extremely poorly. Yet in the case of a cluster node failure, all of the data and lock information must be kept safely.
The best part of a replicated cache is its access speed. Since the data is replicated to each cluster node, it is available for use without any waiting. This is referred to as "zero latency access," and is perfect for situations in which an application requires the highest possible speed in its data access.
However, there are some limitations:
- Cost Per Update - Updating a replicated cache requires pushing the new version of the data to all other cluster members, which limits scalability if there is a high frequency of updates per member.
- Cost Per Entry - The data is replicated to every cluster member, so heap memory is used on each member, which impacts performance for large caches.
Based on "Distributed Caching Essential Lessons" by Cameron Purdy.
Usage
When used, the Cache expects the :otp_app and :adapter as options. The :otp_app should point to an OTP application that has the cache configuration. For example:
defmodule MyApp.ReplicatedCache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Replicated
end
Optionally, you can configure the desired primary storage adapter with the option :primary_storage_adapter; it defaults to Nebulex.Adapters.Local.
defmodule MyApp.ReplicatedCache do
  use Nebulex.Cache,
    otp_app: :my_app,
    adapter: Nebulex.Adapters.Replicated,
    primary_storage_adapter: Nebulex.Adapters.Local
end
The configuration for the cache must be in your application environment, usually defined in your config/config.exs:
config :my_app, MyApp.ReplicatedCache,
  primary: [
    gc_interval: 3_600_000,
    backend: :shards
  ]
If your application was generated with a supervisor (by passing --sup to mix new), you will have a lib/my_app/application.ex file containing the application start callback that defines and starts your supervisor. You just need to edit the start/2 function to start the cache under your application's supervision tree:
def start(_type, _args) do
  children = [
    {MyApp.ReplicatedCache, []},
    ...
  ]
See Nebulex.Cache for more information.
Options
This adapter supports the following options, all of which can be given via the cache configuration:

- :primary - The options that will be passed to the adapter associated with the local primary storage. These options depend on the local adapter in use.
- :task_supervisor_opts - Start-time options passed to Task.Supervisor.start_link/1 when the adapter is initialized.
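For instance, a configuration setting both options might look like the following sketch; the values shown here are illustrative only, not recommendations:

config :my_app, MyApp.ReplicatedCache,
  # Options forwarded to the primary storage adapter (here, the local adapter)
  primary: [
    gc_interval: 3_600_000,
    backend: :shards
  ],
  # Options forwarded to Task.Supervisor.start_link/1
  task_supervisor_opts: [
    max_restarts: 5,
    max_seconds: 10
  ]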
Shared options
Almost all of the cache functions outlined in the Nebulex.Cache module accept the following options:
- :timeout - The time-out value in milliseconds for the command that will be executed. If the timeout is exceeded, then the current process will exit. For executing a command on remote nodes, this adapter uses Task.await/2 internally for receiving the result, so this option tells how much time the adapter should wait for it. If the timeout is exceeded, the task is shut down but the current process doesn't exit; only the result associated with that task is skipped in the reduce phase.
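As a sketch of how this option is passed per call, using the cache defined above (the key, value, and the 10-second figure are arbitrary examples):

# Wait up to 10 seconds for the write to be applied on the cluster members
:ok = MyApp.ReplicatedCache.put("session:123", %{user_id: 1}, timeout: 10_000)

# The same option can be given to other commands as well
MyApp.ReplicatedCache.get("session:123", timeout: 10_000)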
Telemetry events
This adapter emits all recommended Telemetry events, which are documented in the Nebulex.Cache module (see the "Adapter-specific events" section).
Since the replicated adapter depends on the configured primary storage adapter (a local cache adapter), the latter may also emit Telemetry events. Therefore, there will be events emitted by the replicated adapter as well as by the primary storage adapter. For example, for the cache MyApp.ReplicatedCache defined before, these would be the emitted events:
[:my_app, :replicated_cache, :command, :start]
[:my_app, :replicated_cache, :primary, :command, :start]
[:my_app, :replicated_cache, :command, :stop]
[:my_app, :replicated_cache, :primary, :command, :stop]
[:my_app, :replicated_cache, :command, :exception]
[:my_app, :replicated_cache, :primary, :command, :exception]
As you may notice, the default telemetry prefix for the replicated cache is [:my_app, :replicated_cache], and the prefix for its primary storage is [:my_app, :replicated_cache, :primary].
See also the Telemetry guide for more information and examples.
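As a rough sketch (not part of the adapter itself), a handler could be attached to the :stop events listed above with :telemetry.attach_many/4; the handler id and the logged output below are arbitrary choices:

:telemetry.attach_many(
  "my-app-replicated-cache-logger",
  [
    [:my_app, :replicated_cache, :command, :stop],
    [:my_app, :replicated_cache, :primary, :command, :stop]
  ],
  fn event, measurements, _metadata, _config ->
    # :duration is reported in native time units
    ms = System.convert_time_unit(measurements.duration, :native, :millisecond)
    IO.puts("#{inspect(event)} took #{ms}ms")
  end,
  nil
)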
Stats
This adapter depends on the primary storage adapter for stats support. Therefore, it is important to ensure the underlying primary storage adapter does support stats; otherwise, you may get unexpected errors.
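For reference, a sketch of enabling and reading stats, assuming the primary storage adapter supports them (as Nebulex.Adapters.Local does):

# In the configuration, enable stats for the cache
config :my_app, MyApp.ReplicatedCache,
  stats: true,
  primary: [
    gc_interval: 3_600_000
  ]

# Later, at runtime, retrieve the counters gathered by the primary adapter
MyApp.ReplicatedCache.stats()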
Extended API
This adapter provides some additional convenience functions to the Nebulex.Cache API.
Retrieving the primary storage or local cache module:
MyCache.__primary__()
Retrieving the cluster nodes associated with the given cache name:
MyCache.nodes()
Joining the cache to the cluster:
MyCache.join_cluster()
Leaving the cluster (removes the cache from the cluster):
MyCache.leave_cluster()
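As an illustrative IEx session for the cache defined earlier (the node names and returned values below are hypothetical):

iex> MyApp.ReplicatedCache.__primary__()
MyApp.ReplicatedCache.Primary

iex> MyApp.ReplicatedCache.nodes()
[:"node1@127.0.0.1", :"node2@127.0.0.1"]

iex> MyApp.ReplicatedCache.leave_cluster()
:ok

iex> MyApp.ReplicatedCache.join_cluster()
:ok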
Adapter-specific telemetry events
This adapter exposes the following Telemetry events:
- telemetry_prefix ++ [:replication] - Dispatched by the adapter when a replication error occurs due to a write-like operation under the hood.

  Measurements: %{rpc_errors: non_neg_integer}

  Metadata: %{adapter_meta: %{optional(atom) => term}, rpc_errors: [{node, error :: term}]}
- telemetry_prefix ++ [:bootstrap] - Dispatched by the adapter at start time when there are errors while syncing up with the cluster nodes.

  Measurements: %{failed_nodes: non_neg_integer, remote_errors: non_neg_integer}

  Metadata: %{adapter_meta: %{optional(atom) => term}, failed_nodes: [node], remote_errors: [term]}
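A minimal sketch of observing the replication errors (the handler id and log format are arbitrary; the event shape follows the description above):

:telemetry.attach(
  "my-app-replication-error-logger",
  [:my_app, :replicated_cache, :replication],
  fn _event, %{rpc_errors: count}, %{rpc_errors: errors}, _config ->
    # Report which nodes failed to apply the write and why
    IO.warn("replication failed on #{count} node(s): #{inspect(errors)}")
  end,
  nil
)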
Caveats of replicated adapter
As explained at the beginning, a replicated topology not only brings advantages (mostly for reads) but also some limitations and challenges.
This adapter uses global locks (via :global) for all operations that modify or alter the cache in some way, to ensure as much consistency as possible across all members of the cluster. These locks may be per key or for the entire cache, depending on the operation taking place. For that reason, it is very important to be aware of the operations that can potentially lead to performance and scalability issues, so that you can make better use of the replicated adapter. The following are the operations and aspects you should pay attention to:
- Starting and joining a new replicated node to the cluster is the most expensive action, because all write-like operations across all members of the cluster are blocked until the new node completes the synchronization process, which involves copying cached data from any of the existing cluster nodes into the new node; this could be very expensive depending on the number of cache entries. For that reason, adding new nodes is considered an expensive operation that should happen only from time to time.
- Deleting all entries. When the Nebulex.Cache.delete_all/2 action is executed, like in the previous case, all write-like operations in all members of the cluster are blocked until the deletion action is completed (this implies deleting all cached data from all cluster nodes). Therefore, deleting all entries from the cache is also considered an expensive operation that should happen only from time to time.
- Write-like operations based on a key only block operations related to that key across all members of the cluster. This is not as critical as the previous two cases, but it is something to keep in mind anyway, because a key in high demand in terms of writes could also become a potential bottleneck. Transactions can take advantage of this key-level locking, as shown in the sketch after this list.
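For illustration, a minimal sketch of a key-scoped transaction, assuming the :keys option of Nebulex.Cache.transaction is used to restrict locking to the involved keys (the key and the update logic are arbitrary examples):

MyApp.ReplicatedCache.transaction(
  [keys: [:alice]],
  fn ->
    # Only :alice is locked cluster-wide while this function runs
    balance = MyApp.ReplicatedCache.get(:alice) || 0
    MyApp.ReplicatedCache.put(:alice, balance + 100)
  end
)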
Summing up, the replicated cache topology along with this adapter should be used mainly when reads clearly dominate over writes (e.g., 80% reads and 20% writes or less). Besides, operations like deleting all entries from the cache or adding new nodes must be executed only once in a while to avoid performance issues, since they are very expensive.
Summary
Functions
Helper function to use dynamic cache for internal primary cache storage when needed.