cets_discovery behaviour (cets v0.2.0)

Node discovery logic.

Joins tables together when a new node appears.

Things that make discovery logic harder:

- The table list is dynamic (but eventually all the tables are added to it).

- Creating an Erlang distribution connection is asynchronous, but net_adm:ping/1 is blocking.

- net_adm:ping/1 could block for an unknown number of seconds (the default net_kernel connection setup timeout, net_setuptime, is 7 seconds).

- Resolving a node name could take a long time (5 seconds in tests), and this blocking is unpredictable.

- Tables should be joined one by one to avoid running out of memory (OOM).

- Backend:get_nodes/1 could take a long time.

- cets_discovery:get_tables/1, cets_discovery:add_table/2 should be fast.

- The most important net_kernel flags for us to consider are:

* dist_auto_connect=never

* connect_all

* prevent_overlapping_partitions

These flags change the way the discovery logic behaves. Also, the module does not try to connect to hidden nodes.
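One way to keep a server responsive despite a blocking ping, as described in the list above, is to run the ping in a separate monitored process. The following is a minimal sketch under that assumption, not the way cets implements it. The module name async_ping_sketch and both function names are hypothetical; net_adm:ping/1 and spawn_monitor/1 are standard OTP calls.

```erlang
-module(async_ping_sketch).
-export([start_ping/1, await_ping/2]).

%% Runs net_adm:ping/1 in a separate process, so the caller is not blocked.
%% The result is delivered to the caller as the exit reason of that process.
-spec start_ping(node()) -> reference().
start_ping(Node) ->
    {_Pid, MonRef} = spawn_monitor(fun() ->
        exit({ping_result, net_adm:ping(Node)})
    end),
    MonRef.

%% Waits for the ping result for at most Timeout milliseconds.
%% A crash of the helper process is reported as pang.
-spec await_ping(reference(), timeout()) -> pong | pang | timeout.
await_ping(MonRef, Timeout) ->
    receive
        {'DOWN', MonRef, process, _Pid, {ping_result, Result}} ->
            Result;
        {'DOWN', MonRef, process, _Pid, _Crash} ->
            pang
    after Timeout ->
        %% Stop monitoring and drop a late 'DOWN' message, if any.
        erlang:demonitor(MonRef, [flush]),
        timeout
    end.
```

A gen_server would normally handle the 'DOWN' message in handle_info/2 instead of calling await_ping/2, so that it can keep serving fast calls such as get_tables/1 while the ping is in flight.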

Retry logic considerations:

- Backend:get_nodes/1 could return an error during startup, so we have to retry fast.

- There are two periods of operation for this module:

* startup phase, usually the first 5 minutes.

* regular operation phase, after the startup phase.

- We don't need to call get_nodes for an updated node list too often in the regular operation phase.
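The retry considerations above can be sketched as a pure function that picks a delay from the current phase and the retry type. This is only an illustration: the module name retry_interval_sketch and every interval value are assumptions, not the values cets uses. Only the 5-minute startup phase and the phase and retry type names come from this page.

```erlang
-module(retry_interval_sketch).
-export([phase/2, interval/2]).

%% Length of the startup phase: 5 minutes, as described above.
-define(STARTUP_PHASE_MS, (5 * 60 * 1000)).

%% Decides the phase from the start time and the current time (milliseconds).
-spec phase(integer(), integer()) -> initial | regular.
phase(StartTime, Now) when Now - StartTime < ?STARTUP_PHASE_MS -> initial;
phase(_StartTime, _Now) -> regular.

%% Picks a delay before the next get_nodes call.
%% All numbers below are hypothetical.
-spec interval(initial | regular,
               initial | after_error | regular | after_nodedown) ->
    non_neg_integer().
interval(_Phase, initial) -> 0;            % first call: no delay
interval(initial, after_error) -> 1000;    % backend may not be ready yet: retry fast
interval(regular, after_error) -> 10000;
interval(_Phase, after_nodedown) -> 1000;  % a node left: recheck soon
interval(initial, regular) -> 5000;
interval(regular, regular) -> 300000.      % regular phase: check rarely
```

Keeping the decision in a pure function makes the policy easy to unit test without starting a discovery process.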

Summary

Types

backend_state/0

Backend state.

from/0

gen_server's caller.

get_nodes_result/0

Result of a get_nodes/1 callback call.

join_result/0

Join result information.

milliseconds/0

Number of milliseconds.

opts/0

Start options. The name key is required; a backend can define its own additional options.

retry_type/0

Retry logic type.

server/0

Discovery server process.

start_result/0

Result of start_link/1.

state/0

system_info/0

Discovery status.

Functions

add_table(Server, Table)

Adds a table to be tracked and joined.

delete_table(Server, Table)

Deletes a table from being tracked or joined.

get_tables(Server)

Gets a list of the tracked tables.

info(Server)

Gets information for each tracked table.

start(Opts)

Starts a discovery process.

start_link(Opts)

Starts a discovery process with a link.

system_info(Server)

Gets discovery process status.

wait_for_get_nodes(Server, Timeout)

Waits for the current get_nodes call to return.

wait_for_ready(Server, Timeout)

Blocks until the initial discovery is done.

Types

backend_state/0

-type backend_state() :: term().

Backend state.

from/0

-type from() :: {pid(), reference()}.

gen_server's caller.

get_nodes_result/0

-type get_nodes_result() :: {ok, [node()]} | {error, term()}.

Result of a get_nodes/1 callback call.

join_result/0

-type join_result() ::
    #{node := node(),
      table := atom(),
      what := join_result | pid_not_found,
      result => ok | {error, _},
      reason => term()}.

Join result information.
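As an illustration of how a list of join_result() maps could be consumed, the sketch below keeps only the entries that did not finish with result => ok. The module name join_result_sketch and the function failed/1 are hypothetical, and the sample maps in the usage note are made up; the type above does not say which optional keys accompany each what value.

```erlang
-module(join_result_sketch).
-export([failed/1]).

%% Returns the entries that are not a successful join:
%% either what is pid_not_found, or result is not ok.
-spec failed([map()]) -> [map()].
failed(Results) ->
    [R || R <- Results, not is_ok(R)].

%% A join is successful only when it reports result => ok.
is_ok(#{what := join_result, result := ok}) -> true;
is_ok(_) -> false.
```

For example, given one successful entry, one entry with result => {error, timeout}, and one pid_not_found entry, failed/1 returns the last two in their original order.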

milliseconds/0

-type milliseconds() :: integer().

Number of milliseconds.

opts/0

-type opts() :: #{name := atom(), _ := _}.

Start options. The name key is required; a backend can define its own additional options.

retry_type/0

-type retry_type() :: initial | after_error | regular | after_nodedown.

Retry logic type.

server/0

-type server() :: pid() | atom().

Discovery server process.

start_result/0

-type start_result() :: {ok, pid()} | {error, term()}.

Result of start_link/1.

state/0

-type state() ::
    #{phase := initial | regular,
      results := [join_result()],
      nodes := ordsets:ordset(node()),
      unavailable_nodes := ordsets:ordset(node()),
      tables := [atom()],
      backend_module := module(),
      backend_state := backend_state(),
      get_nodes_status := not_running | running,
      should_retry_get_nodes := boolean(),
      last_get_nodes_result := not_called_yet | get_nodes_result(),
      last_get_nodes_retry_type := retry_type(),
      join_status := not_running | running,
      should_retry_join := boolean(),
      timer_ref := reference() | undefined,
      pending_wait_for_ready := [gen_server:from()],
      pending_wait_for_get_nodes := [gen_server:from()],
      nodeup_timestamps := #{node() => milliseconds()},
      nodedown_timestamps := #{node() => milliseconds()},
      node_start_timestamps := #{node() => milliseconds()},
      start_time := milliseconds()}.

system_info/0

-type system_info() :: map().

Discovery status.

Callbacks

get_nodes/1

-callback get_nodes(backend_state()) -> {get_nodes_result(), backend_state()}.

init/1

-callback init(map()) -> backend_state().
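A backend only needs the two callbacks above. The following minimal sketch returns a fixed node list taken from the options map. The module name static_backend_sketch and the nodes option key are hypothetical and not part of cets; how the backend module is passed to the discovery process is not shown on this page, so it is left out.

```erlang
-module(static_backend_sketch).
%% Uncomment when cets is available as a dependency:
%% -behaviour(cets_discovery).
-export([init/1, get_nodes/1]).

%% Builds the backend state from the options map.
%% The nodes key is an assumption made for this example.
-spec init(map()) -> map().
init(Opts) ->
    #{nodes => maps:get(nodes, Opts, [])}.

%% Returns the configured nodes, sorted and without duplicates,
%% together with the unchanged backend state.
-spec get_nodes(map()) -> {{ok, [node()]}, map()}.
get_nodes(State = #{nodes := Nodes}) ->
    {{ok, lists:usort(Nodes)}, State}.
```

A real backend would typically read the node list from an external source and return {error, Reason} as the result when that source is unavailable, which is the case the retry logic is designed for.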

Functions

add_table(Server, Table)

-spec add_table(server(), cets:table_name()) -> ok.

Adds a table to be tracked and joined.

delete_table(Server, Table)

-spec delete_table(server(), cets:table_name()) -> ok.

Deletes a table from being tracked or joined.

get_tables(Server)

-spec get_tables(server()) -> {ok, [cets:table_name()]}.

Gets a list of the tracked tables.

info(Server)

-spec info(server()) -> [cets:info()].

Gets information for each tracked table.

start(Opts)

-spec start(opts()) -> start_result().

Starts a discovery process.

start_link(Opts)

-spec start_link(opts()) -> start_result().

Starts a discovery process with a link.

system_info(Server)

-spec system_info(server()) -> system_info().

Gets discovery process status.

wait_for_get_nodes(Server, Timeout)

-spec wait_for_get_nodes(server(), timeout()) -> ok.

Waits for the current get_nodes call to return.

Returns immediately if no get_nodes call is running. Waits for another get_nodes call if the should_retry_get_nodes flag is set. Unlike wait_for_ready, it does not wait for unavailable nodes to return pang.

wait_for_ready(Server, Timeout)

-spec wait_for_ready(server(), timeout()) -> ok.

Blocks until the initial discovery is done.

This call also waits until the data is loaded from the remote nodes.
