Nebulex.Adapters.Local (Nebulex v2.6.4)
Adapter module for Local Generational Cache; inspired by epocxy.
Generational caching using an ETS table (or multiple ones when used with
:shards) for each generation of cached data. Accesses hit the newer
generation first, and migrate from the older generation to the newer
generation when retrieved from the stale table. When a new generation
is started, the oldest one is deleted. This is a form of mass garbage
collection which avoids using timers and expiration of individual
cached elements.
This implementation of the generational cache uses only two generations
(which is more than enough), also referred to as the newer and the older.
Overall features
- Configurable backend (:ets or :shards).
- Expiration – A status based on the TTL (Time To Live) option. To maintain cache performance, expired entries may not be immediately removed or evicted; they are expired or evicted on-demand, when the key is read.
- Eviction – Generational Garbage Collection.
- Sharding – For intensive workloads, the cache may also be partitioned (by using the :shards backend and specifying the :partitions option).
- Support for transactions via the Erlang global name registration facility.
- Support for stats.
Options
This adapter supports the following options, and all of them can be given via the cache configuration:
- :backend - Defines the backend or storage to be used for the adapter. Supported backends are :ets and :shards. Defaults to :ets.
- :read_concurrency - (boolean) Since this adapter uses ETS tables internally, this option is used when a new table is created; see :ets.new/2. Defaults to true.
- :write_concurrency - (boolean) Since this adapter uses ETS tables internally, this option is used when a new table is created; see :ets.new/2. Defaults to true.
- :compressed - (boolean) This option is used when a new ETS table is created and it defines whether or not the :compressed option is included; see :ets.new/2. Defaults to false.
- :backend_type - This option defines the type of ETS table to be used (defaults to :set). However, it is highly recommended to keep the default value, since some commands are not supported (an unexpected exception may be raised) for types like :bag or :duplicate_bag. Please see the ETS docs for more information.
- :partitions - If it is set, an integer > 0 is expected; otherwise, it defaults to System.schedulers_online(). This option is only available for the :shards backend.
- :gc_interval - If it is set, an integer > 0 is expected defining the interval time in milliseconds for garbage collection to run, delete the oldest generation, and create a new one. If this option is not set, garbage collection is never executed, so new generations must be created explicitly, e.g.: MyCache.new_generation(opts).
- :max_size - If it is set, an integer > 0 is expected defining the max number of cached entries (cache limit). If it is not set (nil), the check to release memory is not performed (the default).
- :allocated_memory - If it is set, an integer > 0 is expected defining the max size in bytes allocated for a cache generation. When this option is set and the configured value is reached, a new cache generation is created and the oldest one is deleted, forcing memory space to be released. If it is not set (nil), the cleanup check to release memory is not performed (the default).
- :gc_cleanup_min_timeout - An integer > 0 defining the min timeout in milliseconds for triggering the next cleanup and memory check. This will be the timeout used when either the max size or the max allocated memory is reached. Defaults to 10_000 (10 seconds).
- :gc_cleanup_max_timeout - An integer > 0 defining the max timeout in milliseconds for triggering the next cleanup and memory check. This is the timeout used when the cache starts and there are few entries or the consumed memory is near 0. Defaults to 600_000 (10 minutes).
- :gc_flush_delay - If it is set, an integer > 0 is expected defining the delay in milliseconds before objects from the oldest generation are flushed. Defaults to 10_000 (10 seconds).
Usage
Nebulex.Cache
is the wrapper around the cache. We can define a
local cache as follows:
defmodule MyApp.LocalCache do
use Nebulex.Cache,
otp_app: :my_app,
adapter: Nebulex.Adapters.Local
end
Where the configuration for the cache must be in your application
environment, usually defined in your config/config.exs:
config :my_app, MyApp.LocalCache,
gc_interval: :timer.hours(12),
max_size: 1_000_000,
allocated_memory: 2_000_000_000,
gc_cleanup_min_timeout: :timer.seconds(10),
gc_cleanup_max_timeout: :timer.minutes(10)
For intensive workloads, the cache may also be partitioned using :shards
as the cache backend (backend: :shards) and configuring the desired number of
partitions via the :partitions option, which defaults to
System.schedulers_online().
config :my_app, MyApp.LocalCache,
gc_interval: :timer.hours(12),
max_size: 1_000_000,
allocated_memory: 2_000_000_000,
gc_cleanup_min_timeout: :timer.seconds(10),
gc_cleanup_max_timeout: :timer.minutes(10),
backend: :shards,
partitions: System.schedulers_online() * 2
If your application was generated with a supervisor (by passing --sup
to mix new
) you will have a lib/my_app/application.ex
file containing
the application start callback that defines and starts your supervisor.
You just need to edit the start/2
function to start the cache as a
supervisor on your application's supervisor:
def start(_type, _args) do
  children = [
    {MyApp.LocalCache, []},
    ...
  ]

  Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
end
See Nebulex.Cache
for more information.
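Once the cache is started, the regular Nebulex.Cache API is available on it. A minimal sketch (the keys and values below are only illustrative):
MyApp.LocalCache.put("foo", "bar", ttl: :timer.hours(1))
MyApp.LocalCache.get("foo")
#=> "bar"
MyApp.LocalCache.delete("foo")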
Eviction configuration
This section explains in more detail how the different configuration options work and gives an idea of what values to set, especially if it is your first time using Nebulex.
:ttl option
The :ttl option is used to set the expiration time for a key; it doesn't
work as an eviction mechanism, since the local adapter implements a
generational cache, and the options that control the eviction process are:
:gc_interval, :gc_cleanup_min_timeout, :gc_cleanup_max_timeout,
:max_size and :allocated_memory. The :ttl is evaluated on-demand:
when a key is retrieved and found to be expired at that moment, it is
removed from the cache. Hence, it cannot be used as an eviction method;
it is rather meant to keep the integrity and consistency of the cache.
For this reason, it is highly recommended to always configure the
eviction options mentioned above.
Caveats when using the :ttl option:
- When using the :ttl option, ensure it is less than :gc_interval; otherwise, there may be a situation where the key is evicted before its :ttl has elapsed (for example, because the garbage collector ran before the key had been fetched).
- Assume you have :gc_interval set to 2 hrs, you put a new key with :ttl set to 1 hr, and 1 minute later the GC runs: that key will be moved to the older generation, so it can still be retrieved. On the other hand, if the key is never fetched before the next GC cycle (fetching is what would move it back to the newer generation), then, since the key is already in the oldest generation, it will be evicted from the cache and won't be retrievable anymore. See the sketch after this list.
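As a rough illustration of the on-demand expiration described above (assuming the MyApp.LocalCache module from the Usage section, configured with gc_interval: :timer.hours(2)):
# Entry is written with a 1-hour TTL
MyApp.LocalCache.put("session", "abc123", ttl: :timer.hours(1))

# Reading the key before it expires also moves it to the newer generation
MyApp.LocalCache.get("session")
#=> "abc123"

# After the TTL has elapsed, the expired entry is removed lazily on read
MyApp.LocalCache.get("session")
#=> nil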
Garbage collection or eviction options
This adapter implements a generational cache, which means its main eviction mechanism is pushing a new cache generation and removing the oldest one. In this way, we ensure only the most frequently used keys are always available in the newer generation and the least frequently used ones are evicted when the garbage collector runs. The garbage collector is triggered under these conditions:
- When the time interval defined by :gc_interval is completed. This makes the garbage-collector process run, creating a new generation and forcing the oldest one to be deleted.
- When the "cleanup" timeout expires, the limits :max_size and :allocated_memory are checked; if one of those is reached, then the garbage collector runs (a new generation is created and the oldest one is deleted). The cleanup timeout is controlled by :gc_cleanup_min_timeout and :gc_cleanup_max_timeout; it works with an inverse linear backoff, which means the timeout is inversely proportional to the memory growth: the bigger the cache size is, the shorter the cleanup timeout will be. See the sketch below.
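The following is only a minimal sketch of the inverse linear backoff idea, not the adapter's actual implementation; the CleanupBackoff module name and the size-only formula are assumptions for illustration (the adapter also takes allocated memory into account):
defmodule CleanupBackoff do
  # The cleanup timeout shrinks linearly from the max timeout towards the
  # min timeout as the cache approaches its configured size limit.
  def timeout(size, max_size, min_timeout \\ 10_000, max_timeout \\ 600_000) do
    ratio = min(size / max_size, 1.0)
    round(max_timeout - ratio * (max_timeout - min_timeout))
  end
end

CleanupBackoff.timeout(0, 1_000_000)
#=> 600_000 (cache empty: next check in ~10 minutes)
CleanupBackoff.timeout(900_000, 1_000_000)
#=> 69_000 (near the limit: next check in ~1 minute)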
First-time configuration
To configure the cache with accurate and/or good values, it is important to know several things in advance, such as the average size of an entry (so we can calculate a good value for the max size and/or allocated memory), how intensive the load will be in terms of reads and writes, etc. The problem is that most of these aspects are unknown when the app is new or we are using the cache for the first time. Therefore, the following recommendations will help you configure the cache for the first time:
- When configuring :gc_interval, think about how often the least frequently used entries should be evicted, or what the desired retention period for the cached entries is. For example, if :gc_interval is set to 1 hr, it means you will keep in cache only those entries that are retrieved periodically within a 2 hr period (gc_interval * 2, 2 being the number of generations). Anything older than that is guaranteed to be evicted by the GC (the oldest generation is always deleted). If it is your first time using Nebulex, perhaps you can start with gc_interval: :timer.hours(12) (12 hrs), so the max retention period for the keys will be 1 day; but ensure you also set either :max_size or :allocated_memory.
- It is highly recommended to set either :max_size or :allocated_memory to ensure the oldest generation is deleted (least frequently used keys are evicted) when one of these limits is reached, and also to avoid running out of memory. For example, for :allocated_memory we can set 25% of the total memory, and for :max_size something between 100_000 and 1_000_000. See the example configuration after this list.
- For :gc_cleanup_min_timeout we can set 10_000, which means that when the cache is reaching the size or memory limit, the polling period for the cleanup process will be 10 seconds. For :gc_cleanup_max_timeout we can set 600_000, which means that when the cache is almost empty, the polling period will be close to 10 minutes.
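Putting these recommendations together, a possible starting configuration could look like the one below; the allocated memory value is only an assumption for a host with 8 GiB of RAM (25% of the total), so adjust it to your machine:
config :my_app, MyApp.LocalCache,
  gc_interval: :timer.hours(12),
  max_size: 1_000_000,
  # 25% of 8 GiB, expressed in bytes
  allocated_memory: div(8 * 1_024 * 1_024 * 1_024, 4),
  gc_cleanup_min_timeout: :timer.seconds(10),
  gc_cleanup_max_timeout: :timer.minutes(10)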
Stats
This adapter does support stats by using the default implementation
provided by Nebulex.Adapter.Stats. The adapter also uses the
Nebulex.Telemetry.StatsHandler to aggregate the stats and keep
them updated. Therefore, it requires the Telemetry events to be emitted
by the adapter (the :telemetry option should not be set to false,
so the Telemetry events can be dispatched); otherwise, stats won't
work properly.
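For example, assuming stats are enabled for the cache (stats: true in its configuration), they can be retrieved at runtime; the measurement values shown in the comment are only illustrative:
MyApp.LocalCache.stats()
#=> %Nebulex.Stats{measurements: %{hits: 120, misses: 10, ...}, metadata: %{...}}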
Queryable API
Since this adapter is implemented on top of ETS tables, the query must be
a valid match spec given by :ets.match_spec(). However, there are some
predefined and/or shorthand queries you can use. See the section
"Predefined queries" below for more information.
Internally, an entry is represented by the tuple
{:entry, key, value, touched, ttl}, which means the match pattern within
the :ets.match_spec() must be something like:
{:entry, :"$1", :"$2", :"$3", :"$4"}.
In order to make query building easier, you can use the Ex2ms library.
Predefined queries
- nil - All keys are returned.
- :unexpired - All unexpired keys/entries.
- :expired - All expired keys/entries.
- {:in, [term]} - Only the keys in the given key list ([term]) are returned. This predefined query is only supported for Nebulex.Cache.delete_all/2. This is the recommended way of doing a bulk delete of keys.
Examples
# built-in queries
MyCache.all()
MyCache.all(:unexpired)
MyCache.all(:expired)
MyCache.all({:in, ["foo", "bar"]})
# using a custom match spec (all values > 10)
spec = [{{:_, :"$1", :"$2", :_, :_}, [{:>, :"$2", 10}], [{{:"$1", :"$2"}}]}]
MyCache.all(spec)
# using Ex2ms
import Ex2ms
spec =
fun do
{_, key, value, _, _} when value > 10 -> {key, value}
end
MyCache.all(spec)
The :return option applies only to built-in queries, such as
nil | :unexpired | :expired; if you are using a custom :ets.match_spec(),
the return value depends on it.
The same applies to the stream function.
Extended API (convenience functions)
This adapter provides some additional convenience functions to the
Nebulex.Cache
API.
Creating new generations:
MyCache.new_generation()
MyCache.new_generation(reset_timer: false)
Retrieving the current generations:
MyCache.generations()
Retrieving the newer generation:
MyCache.newer_generation()
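If :gc_interval is not configured, new generations must be created explicitly, for example from your own periodic process. A minimal sketch, assuming a hypothetical MyApp.CacheJanitor GenServer (not part of Nebulex) that rolls a new generation every 12 hours:
defmodule MyApp.CacheJanitor do
  use GenServer

  @interval :timer.hours(12)

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(_opts) do
    schedule()
    {:ok, %{}}
  end

  @impl true
  def handle_info(:new_generation, state) do
    # Push a new generation; the adapter deletes the oldest one
    MyApp.LocalCache.new_generation()
    schedule()
    {:noreply, state}
  end

  defp schedule, do: Process.send_after(self(), :new_generation, @interval)
end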