Rein.Environment behaviour (rein v0.1.0)

Defines an environment to be passed to Rein.

Summary

Types

rl_state()

The full state of the current Reinforcement Learning process, as stored in the Rein struct

t()

An arbitrary Nx.Container that holds metadata for the environment

Callbacks

apply_action(rl_state, action)

Applies the selected action to the environment.

init(random_key, opts)

Initializes the environment state with the given environment-specific options.

reset(random_key, environment_state)

Resets any values in the environment state that vary between sessions (episodes for episodic tasks, epochs for non-episodic tasks).

Types

@type rl_state() :: Rein.t()

The full state of the current Reinforcement Learning process, as stored in the Rein struct

@type t() :: Nx.Container.t()

An arbitrary Nx.Container that holds metadata for the environment
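
For example, a minimal sketch (the MyEnv module and all of its fields are hypothetical) could hold this metadata in a struct that derives Nx.Container:

defmodule MyEnv do
  @behaviour Rein.Environment

  # Hypothetical fields: a position, a goal, the latest reward, and a
  # terminal flag. Fields listed under :containers are traversed as
  # tensors by Nx.
  @derive {Nx.Container, containers: [:x, :goal, :reward, :is_terminal]}
  defstruct [:x, :goal, :reward, :is_terminal]

  # The callback implementations are sketched in the sections below.
end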

Callbacks

apply_action(rl_state, action)

@callback apply_action(rl_state(), action :: Nx.t()) :: rl_state()

Applies the selected action to the environment.

Returns the updated Rein state, in which the environment has also been updated with the reward and a flag indicating whether the new state is terminal.
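
Continuing the hypothetical MyEnv sketch above, and assuming the Rein struct stores the environment under an environment_state key, a one-dimensional environment could implement this roughly as follows:

import Nx.Defn

# Rein.Environment callback (the environment_state field on the Rein
# struct is an assumption of this sketch).
defn apply_action(rl_state, action) do
  env = rl_state.environment_state

  # Hypothetical semantics: action 0 moves left, any other action moves right.
  x = env.x + Nx.select(action == 0, -1, 1)

  # The session ends once the goal position is reached; reward 1.0 only then.
  is_terminal = x >= env.goal
  reward = Nx.select(is_terminal, 1.0, 0.0)

  env = %{env | x: x, reward: reward, is_terminal: is_terminal}
  %{rl_state | environment_state: env}
end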

init(random_key, opts)

@callback init(random_key :: Nx.t(), opts :: keyword()) :: {t(), random_key :: Nx.t()}

Initializes the environment state with the given environment-specific options.

Should be implemented so that the result is semantically the same as if reset/2 had been called at the end of the function.

As a suggestion, initialize only fixed values here, that is, values that don't change between sessions (epochs for non-episodic tasks, episodes for episodic tasks), and then call reset/2 internally to initialize the remaining, session-dependent values.
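
Continuing the hypothetical MyEnv sketch, and assuming a hypothetical :goal option, init/2 could set up the fixed values and delegate the rest to reset/2:

@impl true
def init(random_key, opts) do
  # Initialize only the fixed, session-independent values here
  # (the :goal option is hypothetical)...
  env = %MyEnv{goal: Nx.tensor(opts[:goal] || 10)}

  # ...and delegate the session-dependent values to reset/2, so that the
  # result is the same as if reset/2 had been called at the end.
  reset(random_key, env)
end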

reset(random_key, environment_state)

@callback reset(random_key :: Nx.t(), environment_state :: t()) ::
  {t(), random_key :: Nx.t()}

Resets any values in the environment state that vary between sessions (episodes for episodic tasks, epochs for non-episodic tasks).
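
Continuing the hypothetical MyEnv sketch, reset/2 could reinitialize only the session-dependent values, threading the PRNG key through Nx.Random:

@impl true
def reset(random_key, %MyEnv{} = env) do
  # Randomize the starting position; Nx.Random.randint returns the sample
  # together with the next PRNG key.
  {x, random_key} = Nx.Random.randint(random_key, 0, 5)

  # The reward and the terminal flag always restart at zero.
  env = %{env | x: x, reward: Nx.tensor(0.0), is_terminal: Nx.tensor(0, type: :u8)}
  {env, random_key}
end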