http_cache (http_cache v0.3.0)

An HTTP caching library

http_cache is a stateless Erlang HTTP caching library that implements the various HTTP RFCs related to caching.

Modules

http_cache exposes functions to cache backend responses, get cached responses whenever they can be served, and invalidate previously stored responses.

http_cache_store is the behaviour to be implemented by stores.

http_cache_store_process is an example store that stores cached responses in the current process and is mainly used for testing purposes.

Telemetry events

All time measurements are in microseconds.

The following events are emitted by http_cache:
  • [http_cache, lookup] when http_cache:get/2 is called.

    Measurements:
    • total_time: the total time of the lookup
    • store_lookup_time: time taken to query the store for suitable responses
    • response_selection_time: time to select the best response among suitable responses. A high value can indicate the presence of too many variants
    • candidate_count: the number of candidate responses that are returned by the store. A high value can indicate the presence of too many variants
    • decompress_time: time spent decompressing the response
    • range_time: time spent constructing a range response
    Metadata:
    • freshness: one of fresh, stale, must_revalidate or miss
  • [http_cache, cache] when http_cache:cache/3 or http_cache:cache/4 is called.

    Measurements:
    • total_time: the total time of the caching operation
    • store_save_time: time taken to save the response into the store
    • compress_time: time spent compressing the response. This happens when the auto_compress option is used
    • decompress_time: time spent decompressing the response. This happens when the auto_compress option is used but the client does not support compression, so the result, stored compressed, has to be returned uncompressed
    • range_time: time spent constructing a range response
    Metadata:
    • cacheable: true if the response was cacheable (and cached), false otherwise
  • [http_cache, invalidation] when http_cache:invalidate_url/2 or http_cache:invalidate_by_alternate_key/2 is called.

    Measurements:
    • duration: the time it took to invalidate entries
    • count: the number of entries invalidated if the store supports returning this value
    Metadata:
    • type: invalidate_by_url or invalidate_by_alternate_key
  • [http_cache, store, error] informs about errors of the store.

    Measurements: none

    Metadata:
    • type: cache, invalidate_by_url or invalidate_by_alternate_key
    • reason: an erlang term that gives the error reason
  • [http_cache, compress_operation] whenever a compress operation is performed on an HTTP response.

    Measurements: none

    Metadata:
    • alg: gzip (which is the only supported algorithm at the moment)
  • [http_cache, decompress_operation] whenever a decompress operation is performed on an HTTP response.

    Measurements: none

    Metadata:
    • alg: gzip (which is the only supported algorithm at the moment)
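
As a minimal sketch of consuming these events, the module below attaches one handler to the lookup, cache and invalidation events using telemetry:attach_many/4 from the telemetry library. The module name, the handler id and the log format are assumptions made for illustration; they are not part of http_cache.

```erlang
-module(http_cache_telemetry_example).
-export([attach/0, handle_event/4, format_event/3]).

%% Attaches a single handler to three of the events listed above.
%% Assumes the telemetry application is already started.
attach() ->
    telemetry:attach_many(
      <<"http-cache-example-handler">>,          % hypothetical handler id
      [[http_cache, lookup],
       [http_cache, cache],
       [http_cache, invalidation]],
      fun ?MODULE:handle_event/4,
      #{}).

%% Telemetry handler callback: prints one line per event.
handle_event(Event, Measurements, Metadata, _Config) ->
    io:format("~s~n", [format_event(Event, Measurements, Metadata)]).

%% Pure helper: renders an event as a binary. Time values are in microseconds.
format_event([http_cache, lookup], Measurements, #{freshness := Freshness}) ->
    TotalTime = maps:get(total_time, Measurements, undefined),
    to_binary("lookup freshness=~p total_time_us=~p", [Freshness, TotalTime]);
format_event([http_cache, cache], Measurements, #{cacheable := Cacheable}) ->
    TotalTime = maps:get(total_time, Measurements, undefined),
    to_binary("cache cacheable=~p total_time_us=~p", [Cacheable, TotalTime]);
format_event([http_cache, invalidation], Measurements, #{type := Type}) ->
    %% count may be missing when the store cannot report it.
    Count = maps:get(count, Measurements, undefined),
    Duration = maps:get(duration, Measurements, undefined),
    to_binary("invalidation type=~p count=~p duration_us=~p", [Type, Count, Duration]);
format_event(Event, _Measurements, _Metadata) ->
    to_binary("unhandled event ~p", [Event]).

to_binary(Format, Args) ->
    iolist_to_binary(io_lib:format(Format, Args)).
```

Calling http_cache_telemetry_example:attach() once at application start is enough; format_event/3 is kept separate from the callback so the rendering can be checked without telemetry running.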

Summary

Types

Alternate key attached to a stored response

Request or response body

Request or response headers

An HTTP method, for example "PATCH"

Options passed to the functions of this module

An HTTP request
An HTTP response
HTTP status
UNIX timestamp in seconds
A URL, with the scheme, domain and optionally path

Functions

Caches a response

Caches a response when revalidating

Gets a response from the cache for the given request

Invalidates all responses stored with the alternate key

Invalidates all responses for a URL

Notifies that a response is currently being downloaded

Notifies the backend that a response was used

Types

-type alternate_key() :: term().

Alternate key attached to a stored response

Used to invalidate by alternate key (e.g. to invalidate all the images if an image alternate key is set on all images).
-type body() :: iodata().
Request or response body
-type headers() :: [{binary(), binary()}].

Request or response headers

A header can appear more than once; HTTP allows this.
-type invalidation_result() ::
    {ok, NbInvalidatedResponses :: non_neg_integer() | undefined} | {error, term()}.
-type method() :: binary().
An HTTP method, for example "PATCH"
-type opts() ::
    #{store := module(),
      alternate_keys => [alternate_key()],
      allow_stale_while_revalidate => boolean(),
      allow_stale_if_error => boolean(),
      auto_accept_encoding => boolean(),
      auto_compress => boolean(),
      auto_decompress => boolean(),
      bucket => term(),
      compression_threshold => non_neg_integer(),
      origin_unreachable => boolean(),
      default_ttl => non_neg_integer(),
      default_grace => non_neg_integer(),
      ignore_query_params_order => boolean(),
      max_ranges => non_neg_integer(),
      request_time => non_neg_integer(),
      store_opts => http_cache_store_behaviour:opts(),
      type => type()}.

Options passed to the functions of this module

  • alternate_keys: alternate keys associated with the stored response. Responses can then be invalidated by alternate key with invalidate_by_alternate_key/2. Used by cache/3 and cache/4.
  • allow_stale_while_revalidate: allows returning a valid stale response while revalidating. Used by get/2. Defaults to false.
  • allow_stale_if_error: allows returning a valid stale response when an error occurs. See https://datatracker.ietf.org/doc/html/rfc5861#section-3. Used by get/2. Defaults to false.
  • auto_accept_encoding: automatically selects an acceptable response based on accept-encoding and content-encoding headers.

    Compressed responses vary on the exact value of the accept-encoding header. For example, "gzip, brotli", "brotli, gzip", "brotli,gzip" and "gzip;q=1.0, brotli;q=1.0" are equivalent but considered different because their string representations do not match. So, if the response to a request with accept-encoding: gzip is cached, none of the above-mentioned variations would result in returning the cached response. When set to true, this option allows automatically returning acceptable content when available, even when headers don't exactly match.

    Priorities (q values) are not taken into account, except for a priority of 0, which causes the encoding to be discarded.

    Used by get/2. Defaults to false.
  • auto_compress: automatically compresses uncompressed text responses with gzip. This can help reduce the size of stored content. Moreover, most browsers support gzip encoding.

    When this option is used, auto_decompress is automatically set to true as well.

    Does not compress responses with strong etags (see https://bz.apache.org/bugzilla/show_bug.cgi?id=63932).

    Used by cache/3 and cache/4. Defaults to false.
  • auto_compress_mime_types: the list of mime-types that are compressed when auto_compress is used.

    Used by cache/3 and cache/4. Defaults to [<<"text/html">>, <<"text/css">>, <<"text/plain">>, <<"text/xml">>, <<"text/javascript">>, <<"application/javascript">>, <<"application/json">>, <<"application/ld+json">>, <<"application/xml">>, <<"application/xhtml+xml">>, <<"application/rss+xml">>, <<"application/atom+xml">>, <<"image/svg+xml">>, <<"font/ttf">>, <<"font/eot">>, <<"font/otf">>, <<"font/opentype">> ]
  • auto_decompress: automatically decompresses stored gzip responses when the client does not support compression.

    Does not decompress responses with strong etags (see https://bz.apache.org/bugzilla/show_bug.cgi?id=63932).

    Used by get/2, cache/3 and cache/4. Defaults to false.
  • bucket: an Erlang term to differentiate between different caches. For instance, when several private caches are needed, this option can be used to keep their cached responses apart and prevent them from being mixed up, which could leak private data. Used by get/2, cache/3 and cache/4. Defaults to the atom default.
  • compression_threshold: compression threshold in bytes. Compressing a very small response can actually produce a bigger response (in addition to the performance cost of compressing it).

    Although there is no additional cost when this library serves a compressed file, the client that has to decompress it pays a cost.

    This is why the default value is so high: we want to make sure that it's worth performing compression and decompression.

    See further discussion: https://webmasters.stackexchange.com/questions/31750/what-is-recommended-minimum-object-size-for-gzip-performance-benefits.

    Used by cache/3 and cache/4. Defaults to 1000.
  • origin_unreachable: indicates that the current cache using this library is unable to reach the origin server. In this case, a stale response can be returned even if the HTTP cache headers do not explicitly allow it. Used by get/2. Defaults to false.
  • default_ttl: the default TTL, in seconds. This value is used when no TTL information is found in the response, but the response is cacheable by default (see https://datatracker.ietf.org/doc/html/rfc7231#section-6.1). Used by cache/3 and cache/4. Defaults to 120.
  • default_grace: the amount of time an expired response is kept in the cache. Such a response is called a stale response, and can be returned in some circumstances, for instance when the origin server returns a 5xx error and the stale-if-error directive is used. Used by cache/3 and cache/4. Defaults to 120.
  • ignore_query_params_order: when a response is cached, a request key is computed based on the method, URL and body. This option keeps the same request key for URLs whose parameters are identical but in a different order. This helps increase the cache hit rate when URL parameter order doesn't matter. Used by get/2, cache/3 and cache/4. Defaults to false.
  • max_ranges: maximum number of range sets accepted when responding to a range request. This is limited to avoid DoS attacks by a client. See https://datatracker.ietf.org/doc/html/rfc7233#section-6.1. Used by get/2. Defaults to 100.
  • store: required, the store backend's module name. Used by all functions, no defaults.
  • store_opts: the store backend's options. Used by all functions, defaults to [].
  • type: cache type, either shared or private. A CDN is an example of a shared cache. A browser cache is an example of a private cache. Used by get/2, cache/3 and cache/4. Defaults to shared.
  • request_time: the time the request was initiated, as a UNIX timestamp in seconds. Setting this timestamp helps correct the age of the response by accounting for the time between when the request was made and when the response was received and cached, which can be several seconds. Used by cache/3 and cache/4.
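
To make the options above concrete, here is a small sketch of two option maps built only from the keys documented in this list. The module name, the chosen values and the use of a {user, UserId} tuple as bucket are illustrative assumptions; http_cache_store_process is the example store mentioned in the Modules section.

```erlang
-module(http_cache_opts_example).
-export([shared_opts/0, private_opts/1]).

%% Options for a shared cache, for example in front of an origin server.
shared_opts() ->
    #{store => http_cache_store_process,   % required; example store, mainly for tests
      type => shared,
      auto_compress => true,               % documented to also enable auto_decompress
      auto_accept_encoding => true,
      compression_threshold => 1000,       % bytes (documented default)
      default_ttl => 120,                  % seconds (documented default)
      ignore_query_params_order => true,
      max_ranges => 10}.

%% Options for a per-user private cache. The bucket keeps each user's
%% responses apart; deriving it from a user id is an assumption of this sketch.
private_opts(UserId) ->
    Base = shared_opts(),
    Base#{type => private, bucket => {user, UserId}}.
```

The same map is passed to get/2, cache/3, cache/4 and the invalidation functions; each function only reads the options that apply to it.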
-type request() :: {method(), url(), headers(), body()}.
An HTTP request
-type response() :: {status(), headers(), body() | sendfile()}.
An HTTP response
-type sendfile() ::
    {sendfile, Offset :: non_neg_integer(), Length :: non_neg_integer() | all, Path :: binary()}.
-type status() :: pos_integer().
HTTP status
-type timestamp() :: non_neg_integer().
UNIX timestamp in seconds
-type type() :: shared | private.
-type url() :: binary().
A URL, with the scheme, domain and optionally path

Functions

cache(Request, Response, Opts)

-spec cache(request(), response(), opts()) -> {ok, response()} | not_cacheable.

Caches a response

This function never returns an error, even when the backend store returns one. Instead it returns {ok, response()} when the response is cacheable (even if an error occurs while saving it) or not_cacheable when the response cannot be cached.

When {ok, response()} is returned, that response should be returned to the client instead of the initial response that was passed as a parameter, because it is transformed according to the options passed: it can be compressed or uncompressed, and it is returned as a range response if the request is a range request and the backend, not supporting range requests, returned a full response.

This function should be called with every response, even those known not to be cacheable, such as responses to DELETE requests, because such non-cacheable responses can still have side effects on other cached objects (see https://www.rfc-editor.org/rfc/rfc9111.html#name-invalidating-stored-respons). For example, a successful DELETE request triggers the invalidation of cached responses for the URL of the deleted object.
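
The calling pattern described above can be sketched as follows. The module and function names (store_and_pick/3, pick_response/2) are hypothetical helpers for this illustration; only http_cache:cache/3 and its two return values come from this documentation.

```erlang
-module(http_cache_cache_example).
-export([store_and_pick/3, pick_response/2]).

%% Passes every origin response to http_cache:cache/3, cacheable or not,
%% and returns the response that should be sent to the client.
%%
%% Example arguments (shapes follow the request() and response() types):
%%   Request  = {<<"GET">>, <<"https://example.com/index.html">>, [], <<>>}
%%   Response = {200, [{<<"cache-control">>, <<"max-age=60">>}], <<"hello">>}
store_and_pick(Request, OriginResponse, Opts) ->
    pick_response(http_cache:cache(Request, OriginResponse, Opts), OriginResponse).

%% Pure helper: prefers the response returned by cache/3, because it may have
%% been compressed, decompressed or turned into a range response.
pick_response({ok, TransformedResponse}, _OriginResponse) ->
    TransformedResponse;
pick_response(not_cacheable, OriginResponse) ->
    OriginResponse.
```

Because cache/3 never returns an error, the two clauses of pick_response/2 cover every possible result.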

cache(Request, Response, RevalidatedResponse, Opts)

-spec cache(Request :: request(), Response :: response(), RevalidatedResponse :: response(), opts()) ->
         {ok, response()} | not_cacheable.

Caches a response when revalidating

Similar to cache/3, but to be used when revalidating a response, that is, when get/2 returns a must_revalidate response. The Response parameter is the response received from the origin server, and the RevalidatedResponse parameter is the stored response, previously returned as must_revalidate, that is being revalidated.

When the response received from the origin server is a 304 (Not Modified) response, stored responses are updated and a response built from the two responses passed as parameters is returned. It's recommended to use the response returned by this function, because the 304 response is used to update the headers of the revalidated response.

Otherwise, cache/3 is called.

get(Request, Opts)

Gets a response from the cache for the given request

The function returns one of:
  • {fresh, {http_cache_store_behaviour:response_ref(), response()}}: the response is fresh and can be returned directly to the client.
  • {stale, {http_cache_store_behaviour:response_ref(), response()}}: the response is stale but can be directly returned to the client.

    Stale responses that are cached but cannot be returned due to unfulfilled conditions are not returned.

    By default, a stale response is returned only when there's a max-stale header in the request. See the following options to enable returning stale responses in other cases:
    • allow_stale_while_revalidate
    • allow_stale_if_error
    • origin_unreachable
  • {must_revalidate, {http_cache_store_behaviour:response_ref(), response()}}: the response must be revalidated.
  • miss: no suitable response was found.
Using this function does not automatically notify the store that the response was returned. Therefore, use notify_response_used/2 with the returned response reference when a cached response is used.
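
Putting get/2, notify_response_used/2, cache/3 and cache/4 together, a lookup flow might look like the sketch below. The module name, the helper functions and FetchFun (a caller-supplied fun that queries the origin server) are assumptions of this example, and a real cache would typically send a conditional request when revalidating.

```erlang
-module(http_cache_get_example).
-export([serve/3, action/1]).

%% Serves Request from the cache when possible. FetchFun is a hypothetical
%% fun(Request) -> response() that performs the request against the origin.
serve(Request, Opts, FetchFun) ->
    case action(http_cache:get(Request, Opts)) of
        {serve_cached, RespRef, Response} ->
            %% get/2 does not notify the store, so do it explicitly.
            _ = http_cache:notify_response_used(RespRef, Opts),
            Response;
        {revalidate, StoredResponse} ->
            OriginResponse = FetchFun(Request),
            pick(http_cache:cache(Request, OriginResponse, StoredResponse, Opts),
                 OriginResponse);
        fetch ->
            OriginResponse = FetchFun(Request),
            pick(http_cache:cache(Request, OriginResponse, Opts), OriginResponse)
    end.

%% Pure helper: maps each result of get/2 to what the caller should do next.
action({fresh, {RespRef, Response}}) -> {serve_cached, RespRef, Response};
action({stale, {RespRef, Response}}) -> {serve_cached, RespRef, Response};
action({must_revalidate, {_RespRef, Response}}) -> {revalidate, Response};
action(miss) -> fetch.

%% Prefers the response returned by cache/3 or cache/4 when there is one.
pick({ok, Response}, _OriginResponse) -> Response;
pick(not_cacheable, OriginResponse) -> OriginResponse.
```

Keeping the decision in action/1 separates the four documented return values from the side effects, which makes the branching easy to check in isolation.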

invalidate_by_alternate_key(AltKeys, Opts)

-spec invalidate_by_alternate_key(alternate_key() | [alternate_key()], opts()) -> invalidation_result().
Invalidates all responses stored with the alternate key

invalidate_url(Url, Opts)

-spec invalidate_url(url(), opts()) -> invalidation_result().

Invalidates all responses for a URL

This includes all variants and all responses for all HTTP methods.
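
Both invalidation functions return an invalidation_result(), whose count is undefined when the store cannot report it. The sketch below shows one way to handle the three possible shapes; the module name, the helper names and the returned atoms are made up for this illustration.

```erlang
-module(http_cache_invalidation_example).
-export([purge_url/2, purge_keys/2, describe/1]).

%% Invalidates every stored response for Url (all variants, all methods).
purge_url(Url, Opts) ->
    describe(http_cache:invalidate_url(Url, Opts)).

%% Invalidates every response stored with one alternate key or a list of them.
purge_keys(AltKeys, Opts) ->
    describe(http_cache:invalidate_by_alternate_key(AltKeys, Opts)).

%% Pure helper covering the three shapes of invalidation_result().
describe({ok, undefined}) ->
    %% The store does not support returning the number of entries.
    invalidated_unknown_count;
describe({ok, Count}) when is_integer(Count), Count >= 0 ->
    {invalidated, Count};
describe({error, Reason}) ->
    {failed, Reason}.
```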

notify_downloading(Request, Pid, Opts)

-spec notify_downloading(request(), pid(), opts()) -> ok.

Notifies that a response is currently being downloaded

For future use, does not do anything at the moment.

notify_response_used(RespRef, Opts)

-spec notify_response_used(http_cache_store_behaviour:response_ref(), opts()) -> ok | {error, term()}.

Notifies the backend that a response was used

Some backends, such as LRU backends, need to update metadata (in that case: last used time) when a response is used.