Rein.Agents.DDPG (rein v0.1.0)
Deep Deterministic Policy Gradient implementation.
This assumes that the Actor network outputs a tensor of shape {nil, num_actions}, with one row of actions per batch entry,
and that the Critic network accepts an "actions" input with the same shape.
Actions are assumed to lie in a continuous space and to have type :f32.
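
The shape contract described above can be illustrated with a pair of small Axon models. This is only a sketch under stated assumptions: the "state" input name, the layer sizes, the tanh output activation, and the values of `state_size` and `num_actions` are hypothetical and do not come from Rein itself. Only the {nil, num_actions} actor output and the "actions" critic input are taken from the description.

```elixir
# Fetch Axon for a standalone script (not needed inside a Mix project).
Mix.install([:axon])

# Hypothetical sizes, chosen only for illustration.
state_size = 4
num_actions = 2

# Actor: maps a batch of states to an output of shape {nil, num_actions}.
# The "state" input name is an assumption, not something Rein prescribes here.
actor =
  Axon.input("state", shape: {nil, state_size})
  |> Axon.dense(64, activation: :relu)
  |> Axon.dense(num_actions, activation: :tanh)

# Critic: receives the state plus an "actions" input whose shape matches
# the actor output, and produces one Q-value per batch entry.
state_input = Axon.input("state", shape: {nil, state_size})
actions_input = Axon.input("actions", shape: {nil, num_actions})

critic =
  [state_input, actions_input]
  |> Axon.concatenate()
  |> Axon.dense(64, activation: :relu)
  |> Axon.dense(1)
```

The leading `nil` dimension leaves the batch size open, so the same models can serve both single-step action selection and batched training updates.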