Rein.Agent behaviour (rein v0.1.0)

The behaviour that should be implemented by a Rein agent module.

Summary

Types

rl_state()

The full state of the current Reinforcement Learning process, as stored in the Rein struct.

t()

An arbitrary Nx.Container that holds metadata for the agent.

Callbacks

init(random_key, opts)

Initializes the agent state with the given agent-specific options.

optimize_model(rl_state)

record_observation(rl_state, action, reward, is_terminal, next_rl_state)

Can be used to record the observation in an experience replay buffer.

reset(random_key, rl_state)

Resets the values in the agent state that vary between sessions (episodes, for episodic tasks).

select_action(rl_state, iteration)

Selects the action to be taken.

Types

rl_state()

@type rl_state() :: Rein.t()

The full state of the current Reinforcement Learning process, as stored in the Rein struct.

t()

@type t() :: Nx.Container.t()

An arbitrary Nx.Container that holds metadata for the agent.

Callbacks

init(random_key, opts)

@callback init(random_key :: Nx.t(), opts :: keyword()) :: {t(), random_key :: Nx.t()}

Initializes the agent state with the given agent-specific options.

Implement this so that the result is semantically the same as if reset/2 had been called at the end of the function.

As a suggestion, initialize only fixed values here, that is, values that do not change between sessions (epochs for non-episodic tasks, episodes for episodic tasks). Then call reset/2 internally to initialize the remaining, variable values.
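
The suggested split between init/2 and reset/2 could be sketched as follows. This is a minimal illustration under stated assumptions, not Rein's own implementation: the module name MyAgent, the :learning_rate option, and the state keys are hypothetical; the agent state is a plain map (maps implement Nx.Container); and reset/2 is assumed to receive the agent state, as the typespec on this page states. Nx is assumed to be available as a dependency.

```elixir
defmodule MyAgent do
  # Hypothetical agent. A real module would also declare `@behaviour Rein.Agent`
  # and implement the remaining callbacks.

  def init(random_key, opts) do
    # Fixed values: these do not change between sessions.
    fixed = %{
      learning_rate: Nx.tensor(Keyword.get(opts, :learning_rate, 0.01))
    }

    # Delegate the per-session values to reset/2, so that init/2 ends in the
    # same state that a call to reset/2 would produce.
    reset(random_key, fixed)
  end

  def reset(random_key, agent_state) do
    # Variable values: reinitialized at the start of every session.
    per_session = %{
      total_reward: Nx.tensor(0.0),
      steps: Nx.tensor(0)
    }

    {Map.merge(agent_state, per_session), random_key}
  end
end
```

Because init/2 finishes by calling reset/2, both entry points share one definition of the per-session values, so the two cannot drift apart.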

optimize_model(rl_state)

@callback optimize_model(rl_state()) :: rl_state()

record_observation(rl_state, action, reward, is_terminal, next_rl_state)

@callback record_observation(
  rl_state(),
  action :: Nx.t(),
  reward :: Nx.t(),
  is_terminal :: Nx.t(),
  next_rl_state :: rl_state()
) :: rl_state()

Can be used to record the observation in an experience replay buffer.

If the agent does not need a replay buffer, return the first argument unchanged.
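
A pass-through implementation might look like the sketch below. The module name NoReplayAgent is hypothetical, and a real agent would also declare `@behaviour Rein.Agent` and implement the other callbacks.

```elixir
defmodule NoReplayAgent do
  # Hypothetical agent that keeps no experience replay buffer.
  # The observation arguments are ignored and the state is returned as received.
  def record_observation(rl_state, _action, _reward, _is_terminal, _next_rl_state) do
    rl_state
  end
end
```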

reset(random_key, rl_state)

@callback reset(random_key :: Nx.t(), rl_state :: t()) :: {t(), random_key :: Nx.t()}

Resets the values in the agent state that vary between sessions (episodes, for episodic tasks).

select_action(rl_state, iteration)

@callback select_action(rl_state(), iteration :: Nx.t()) :: {action :: Nx.t(), rl_state()}

Selects the action to be taken.
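
As a rough illustration of the return shape, a uniformly random policy could be sketched as below. Several things here are assumptions rather than facts from this page: the module name RandomAgent, the fixed count of four discrete actions, and the presence of an Nx.Random key under a `random_key` field of the state. The iteration argument is ignored. Nx is assumed to be available as a dependency.

```elixir
defmodule RandomAgent do
  # Hypothetical agent that picks uniformly among a fixed number of discrete actions.
  @num_actions 4

  def select_action(rl_state, _iteration) do
    # Assumption: the state exposes an Nx.Random key under `:random_key`.
    # Nx.Random.randint/3 samples from [0, @num_actions) and returns the advanced key.
    {action, new_key} = Nx.Random.randint(rl_state.random_key, 0, @num_actions)

    # Return the action together with the state carrying the advanced key,
    # so the next call does not reuse the same randomness.
    {action, %{rl_state | random_key: new_key}}
  end
end
```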