Getting Started
Table of Contents
- Prelude
- Installation
- Creating partitioned tables
- Inserting entries
- Retrieving entries
- Making partitioned tables part of an application's supervision tree
- Advanced topics
Prelude
In this guide, we're going to learn some basics about shards: how to create partitioned tables and how to use them. Using them is the easiest part, since it is the same ETS API.
Installation
Erlang
In your rebar.config:
{deps, [
  {shards, "1.1.0"}
]}.
Elixir
In your mix.exs:
def deps do
  [{:shards, "~> 1.1"}]
end
Creating partitioned tables
Exactly as with ETS, the shards:new/2 function receives two arguments: the name of the table and the options. In addition to the options accepted by ets:new/2, shards provides the following ones:
- {partitions, pos_integer()} - Sets the desired number of partitions. By default, the number of partitions is the total number of online schedulers (erlang:system_info(schedulers_online)).
- {keyslot_fun, shards_meta:keyslot_fun()} - Function used to compute the partition where the action will be evaluated, based on the key. Defaults to erlang:phash2/2.
- {parallel, boolean()} - Specifies whether shards should work in parallel mode or not for the applicable functions, e.g. select, match, etc. Defaults to false.
- {parallel_timeout, timeout()} - When parallel is set to true, it specifies the max timeout for a parallel execution. Defaults to infinity.
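To make the role of the keyslot function more concrete, here is a minimal sketch of how a key can be mapped to a partition index with erlang:phash2/2, the default named above. The module keyslot_sketch and its functions are hypothetical names used only for illustration. This is not a copy of how shards works internally, just the same hashing idea expressed with the standard library.

```erlang
-module(keyslot_sketch).
-export([slot/2, distribution/2]).

%% Maps Key to a partition index in the range 0..Partitions-1
%% using erlang:phash2/2.
slot(Key, Partitions) when is_integer(Partitions), Partitions > 0 ->
    erlang:phash2(Key, Partitions).

%% Counts how many of the given keys land on each partition index.
%% Returns a map of the form #{Index => Count}.
distribution(Keys, Partitions) ->
    lists:foldl(
        fun(Key, Acc) ->
            Index = slot(Key, Partitions),
            maps:update_with(Index, fun(Count) -> Count + 1 end, 1, Acc)
        end,
        #{},
        Keys
    ).
```

Calling keyslot_sketch:distribution(lists:seq(1, 1000), 4) gives a rough idea of how evenly a set of keys would spread over 4 partitions. The same key always maps to the same index, which is what allows a lookup to go straight to a single partition.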
When a new table is created, the metadata is created for that table as well. The purpose of the metadata is to store information related to that table, such as the number of partitions, the keyslot function, etc. To learn more about it, check out shards_meta.
Erlang:
% create a named table with 4 shards
> shards:new(tab1, [named_table, {partitions, 4}]).
tab1
% create another one with default options
> shards:new(tab2, []).
#Ref<0.893853601.1266286595.53859>
Elixir:
# create a named table with 4 shards
iex> :shards.new(:tab1, [:named_table, partitions: 4])
:tab1
# create another one with default options
iex> :shards.new(:tab2, [])
#Reference<0.1257361217.1909325839.17166>
NOTE: You can also start the observer by calling observer:start() to see what a partitioned table looks like.
Inserting entries
Erlang:
> shards:insert(tab1, {k1, 1}).
true
> shards:insert(tab1, [{k1, 1}, {k2, 2}]).
true
> shards:insert_new(tab1, {k3, 3}).
true
> shards:insert_new(tab1, {k3, 3}).
false
> shards:insert_new(tab1, [{k3, 3}, {k4, 4}]).
false
Elixir:
iex> :shards.insert(:tab1, k1: 1)
true
iex> :shards.insert(:tab1, k1: 1, k2: 2)
true
iex> :shards.insert_new(:tab1, k3: 3)
true
iex> :shards.insert_new(:tab1, k3: 3)
false
iex> :shards.insert_new(:tab1, k3: 3, k4: 4)
false
Retrieving entries
Erlang:
% inserting some objects
> shards:insert(tab1, [{k1, 1}, {k2, 2}, {k3, 3}]).
true
% let's check those objects
> shards:lookup(tab1, k1).
[{k1,1}]
> shards:lookup(tab1, k2).
[{k2,2}]
> shards:lookup(tab1, k3).
[{k3,3}]
> shards:lookup(tab1, k4).
[]
> shards:lookup_element(tab1, k3, 2).
3
% delete an object and then check
> shards:delete(tab1, k3).
true
> shards:lookup(tab1, k3).
[]
% now let's find all stored objects using select
> MatchSpec = ets:fun2ms(fun({K, V}) -> {K, V} end).
[{{'$1','$2'},[],[{{'$1','$2'}}]}]
> shards:select(tab1, MatchSpec).
[{k2,2},{k1,1}]
Elixir:
iex> :shards.insert(:tab1, k1: 1, k2: 2, k3: 3)
true
iex> :shards.lookup(:tab1, :k1)
[k1: 1]
iex> :shards.lookup(:tab1, :k2)
[k2: 2]
iex> :shards.lookup(:tab1, :k3)
[k3: 3]
iex> :shards.lookup(:tab1, :k4)
[]
iex> :shards.lookup_element(:tab1, :k3, 2)
3
iex> :shards.delete(:tab1, :k3)
true
iex> :shards.lookup(:tab1, :k3)
[]
iex> ms = :ets.fun2ms(& &1)
[{:"$1", [], [:"$1"]}]
iex> :shards.select(:tab1, ms)
[k2: 2, k1: 1]
As you may have noticed, using shards is extremely easy: it is only a matter of using the same ETS API but with the shards module.
You can try the rest of the ETS API using shards as well.
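For comparison, the Erlang session in this section could be reproduced against plain ETS roughly as follows. This is only a sketch that uses the standard ets module; the module name ets_mirror is made up for illustration. The match specification is written by hand (it is the same term that ets:fun2ms printed in the session), so the module compiles without the ms_transform parse transform.

```erlang
-module(ets_mirror).
-export([run/0]).

%% Runs the insert/lookup/delete/select steps on a plain ETS table
%% and returns the remaining objects, sorted.
run() ->
    T = ets:new(tab1, [set]),
    true = ets:insert(T, [{k1, 1}, {k2, 2}, {k3, 3}]),
    [{k1, 1}] = ets:lookup(T, k1),
    [] = ets:lookup(T, k4),
    3 = ets:lookup_element(T, k3, 2),
    true = ets:delete(T, k3),
    [] = ets:lookup(T, k3),
    %% Same term as ets:fun2ms(fun({K, V}) -> {K, V} end).
    MatchSpec = [{{'$1', '$2'}, [], [{{'$1', '$2'}}]}],
    %% select/2 returns objects in no particular order, so sort them.
    Selected = lists:sort(ets:select(T, MatchSpec)),
    true = ets:delete(T),
    Selected.
```

Each ets call here has a shards counterpart with the same name and arguments in the session shown earlier in this section, which is the point of the guide: only the module name changes.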
Making partitioned tables part of an application's supervision tree
There might be cases where we want to start the tables as part of an existing supervision tree. To do so, we can create a dynamic supervisor that takes care of creating and deleting the tables, and add it to our app's supervision tree.
Shards provides the module shards_group, which is a dynamic supervisor that can be added to an existing application and/or supervision tree. In addition, shards_group comes with a function to create tables as children of the dynamic supervisor, and a function to delete them and remove them from it. Let's see how it works!
Erlang example
Supposing you have an application myapp with the main supervisor myapp_sup, the only piece of configuration needed is to set up shards_group as a supervisor within the application's supervision tree, like so:
-module(myapp_sup).
-behaviour(supervisor).
-export([start_link/0, init/1]).

start_link() ->
  % register the supervisor locally under the module name
  supervisor:start_link({local, ?MODULE}, ?MODULE, []).

init(_) ->
  Children = [
    % Dynamic supervisor with default name shards_group
    % See shards_group:child_spec/1
    shards_group:child_spec()
  ],
  {ok, {{one_for_one, 10, 10}, Children}}.
The call shards_group:child_spec() returns the spec using the default name for the dynamic supervisor, shards_group, but you can pass the desired name by calling shards_group:child_spec(desired_name).
Once the app is started, you can use shards_group to create/delete tables and shards for the rest of the API functions, like so:
% create a table as part of the app supervision tree
% (this assumes the group was started with shards_group:child_spec(myapp_dynamic_sup))
> shards_group:new_table(myapp_dynamic_sup, tab1, [named_table]).
{ok,<0.194.0>,#Ref<0.3052443831.2753691659.260835>}
> shards:insert(tab1, [{a, 1}, {b, 2}]).
true
> shards:lookup_element(tab1, a, 2).
1
% let's create another table
> {ok, _Pid, Tab} = shards_group:new_table(myapp_dynamic_sup, tab2, []).
{ok,<0.195.0>,#Ref<0.3052443831.2753691659.260836>}
> shards:insert(Tab, [{c, 3}, {d, 4}]).
true
> shards:lookup_element(Tab, c, 2).
3
% you can open the observer to see what the tables look like
> observer:start().
ok
% deleting a table
> shards_group:del_table(myapp_dynamic_sup, tab1).
true
Elixir example
In Elixir this is much easier, since it provides the DynamicSupervisor module out of the box. So we have two options: either use :shards_group like before in the Erlang example, or use DynamicSupervisor and create a simple module to encapsulate the logic of creating/deleting tables. Since we already know how :shards_group works from the previous example, let's take the second option.
First of all, let's create the module to encapsulate the logic of creating and deleting the tables:
defmodule MyApp.DynamicShards do
  @moduledoc false

  # creates a table with shards as part of the DynamicSupervisor which at
  # the same time is part of the app supervision tree
  def new(name, opts) do
    {:ok, _pid, tab} =
      DynamicSupervisor.start_child(__MODULE__, table_spec(name, opts))

    tab
  end

  # deletes the table and also removes the table supervisor from the
  # DynamicSupervisor
  def delete(tab) do
    DynamicSupervisor.terminate_child(__MODULE__, :shards_meta.tab_pid(tab))
  end

  # this function encapsulates the logic of creating the sharded table
  # as a child of the DynamicSupervisor
  def start_table(name, opts) do
    tab = :shards.new(name, opts)
    pid = :shards_meta.tab_pid(tab)
    {:ok, pid, tab}
  end

  # DynamicSupervisor child spec
  defp table_spec(name, opts) do
    %{
      id: name,
      start: {__MODULE__, :start_table, [name, opts]},
      type: :supervisor
    }
  end
end
The final piece of configuration is to set up a DynamicSupervisor as a supervisor within the application's supervision tree, which we can do in lib/my_app/application.ex, inside the start/2 function:
def start(_type, _args) do
  children = [
    {DynamicSupervisor, strategy: :one_for_one, name: MyApp.DynamicShards}
  ]
  ...
Now we can use :shards:
iex> MyApp.DynamicShards.new(:t1, [:named_table])
:t1
iex> t2 = MyApp.DynamicShards.new(:t2, [])
#Reference<0.2506606486.3592028173.65798>
iex> :shards.insert(:t1, a: 1, b: 2)
true
iex> :shards.insert(t2, c: 3, d: 4)
true
iex> :shards.lookup_element(:t1, :a, 2)
1
iex> :shards.lookup_element(t2, :c, 2)
3
# open the observer
iex> :observer.start()
ok
# delete a table
iex> MyApp.DynamicShards.delete(:t1)
:ok
Advanced topics
Storing and retrieving custom metadata entries
Since shards uses an internal ETS table for the metadata, it also provides helper functions for storing and retrieving custom metadata entries.
Erlang:
> Tab = shards:new(mytab, [named_table, {partitions, 4}]).
mytab
> shards:put_meta(Tab, foo, bar).
ok
> shards:get_meta(Tab, foo).
bar
% Non-existing key
> shards:get_meta(Tab, foo_foo).
undefined
% With default
> shards:get_meta(Tab, foo_foo, bar_bar).
bar_bar
Elixir:
iex> tab = :shards.new(:mytab, [:named_table, partitions: 4])
:mytab
iex> :shards.put_meta(tab, "foo", "bar")
:ok
iex> :shards.get_meta(tab, "foo")
"bar"
# Non-existing key
iex> :shards.get_meta(tab, "foo foo")
:undefined
# With default
iex> :shards.get_meta(tab, "foo foo", "bar bar")
"bar bar"
Caching the metadata
As explained in the shards module documentation, when a partitioned table is created there is metadata associated with it, used to resolve the number of partitions and other needed attributes. Hence, every time a function is executed, shards has to resolve the metadata first; this is done internally by shards. Nevertheless, shards allows passing the metadata as the last argument for most of the functions (check the docs to see which functions allow it), in case you are able to cache it and use it later for further operations. This is possible because the metadata does not change, so it can be easily cached, sparing shards the extra step of retrieving it from the meta table. But be aware that this is just an ETS lookup, so the improvement in terms of performance may be insignificant. For that reason, it is advisable to evaluate each scenario carefully and see if it is really worth it.
The overall recommendation is to let shards take care of retrieving the metadata and everything else; just use the base shards API (ETS API).
In case you want to cache the metadata:
Erlang:
> Tab = shards:new(mytab, [named_table, {partitions, 4}]).
mytab
> Meta = shards:table_meta(Tab).
{meta,<0.174.0>,1,4,fun erlang:phash2/2,false,[named_table]}
Elixir:
iex> tab = :shards.new(:mytab, [:named_table, partitions: 4])
:mytab
iex> meta = :shards.table_meta(tab)
{:meta, #PID<0.163.0>, 1, 4, &:erlang.phash2/2, false, [:named_table]}
Then, you can use it for most of the functions (check the docs):
Erlang:
> shards:insert(Tab, [{a, 1}, {b, 2}], Meta).
true
> shards:lookup(Tab, a, Meta).
[{a,1}]
> shards:lookup_element(Tab, a, 2, Meta).
1
Elixir:
iex> :shards.insert(tab, [a: 1, b: 2], meta)
true
iex> :shards.lookup(tab, :a, meta)
[a: 1]
iex> :shards.lookup_element(tab, :a, 2, meta)
1
NOTE: See the shards docs for more information about the functions that support receiving the metadata; most of them accept the metadata parameter, but there are a few exceptions you should know about.
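If you do decide to cache the metadata, one possible way to keep the table and its metadata together is a small wrapper like the sketch below. The module cached_tab and its functions are hypothetical and not part of shards. It only uses calls shown in this guide (shards:new/2, shards:table_meta/1, shards:insert/3, shards:lookup/3), and it assumes the shards dependency is available and that the table uses the default set type.

```erlang
-module(cached_tab).
-export([open/2, put/3, get/2]).

%% Creates the table and fetches its metadata once.
%% The returned handle carries both, so callers can reuse the metadata.
open(Name, Opts) ->
    Tab = shards:new(Name, Opts),
    Meta = shards:table_meta(Tab),
    {Tab, Meta}.

%% Inserts a key/value pair, passing the cached metadata as last argument.
put({Tab, Meta}, Key, Value) ->
    shards:insert(Tab, {Key, Value}, Meta).

%% Looks a key up with the cached metadata.
%% Assumes a set table, so at most one object per key.
get({Tab, Meta}, Key) ->
    case shards:lookup(Tab, Key, Meta) of
        [{Key, Value}] -> {ok, Value};
        [] -> error
    end.
```

A handle returned by cached_tab:open(mytab, [named_table, {partitions, 4}]) can then be threaded through a process state or passed between functions. As noted above, whether this saves anything measurable depends on the workload, so it is worth benchmarking before adopting it.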