brain_learner (macula_tweann v0.18.1)


Brain learner process for weight adaptation via plasticity.

This GenServer manages the learning aspects of a brain system:

- Applies plasticity rules to update weights based on neural activity
- Maintains an experience buffer for batch learning
- Handles reward signals for reinforcement-style learning

Online Learning

When online learning is enabled, the learner receives activation data after each inference and applies plasticity rules:

Inference → Activations → Learner → Weight Updates → Back to Inference

Batch Learning

For delayed rewards (e.g., end of game), the learner buffers experiences and applies learning when a reward is received:

1. Record experiences during the episode
2. Receive the final reward
3. Apply learning with reward propagation (eligibility traces)
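A typical episode can be sketched with this module's API (the `Learner` PID, input/activation values, and the reward of `1.0` are illustrative):

```
%% During the episode: buffer each step's inputs and per-layer activations.
ok = brain_learner:record_experience(Learner, Inputs, Activations),

%% At the end of the episode: apply the final reward to all buffered steps,
%% then clear the buffer before the next episode.
{ok, NumExperiences} = brain_learner:learn_from_experience(Learner, 1.0),
ok = brain_learner:clear_experience(Learner).
```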

Theory

This module implements reward-modulated Hebbian learning, where weight changes depend on:

- Pre-synaptic activity (input to the connection)
- Post-synaptic activity (output from the connection)
- Global reward signal (from the environment)

The basic rule: delta_w = learning_rate * pre * post * reward

For delayed rewards, eligibility traces track which synapses were recently active, allowing credit assignment across time.
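The rule and an eligibility trace can be sketched in plain Erlang. These helper functions are illustrative, not this module's internals; the trace decay constant in particular is an assumption:

```
%% Basic reward-modulated Hebbian update for one connection:
%% delta_w = learning_rate * pre * post * reward.
delta_w(LearningRate, Pre, Post, Reward) ->
    LearningRate * Pre * Post * Reward.

%% Eligibility trace: decay the previous trace and accumulate the
%% current pre * post coincidence. A reward arriving later scales
%% the trace, crediting synapses that were recently active.
update_trace(Trace, Pre, Post, Decay) when Decay >= 0.0, Decay =< 1.0 ->
    Decay * Trace + Pre * Post.
```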

See also: plasticity, plasticity_modulated.

Summary

Functions

Clear the experience buffer.

Disable learning.

Enable learning.

Check if automatic experience recording is enabled.

Get the number of buffered experiences.

Get the current learning rate.

Get the current plasticity rule.

Get accumulated weight deltas from last learning step.

Check if learning is enabled.

Learn from buffered experiences using current reward.

Learn from buffered experiences with a specific reward.

Record an experience for batch learning.

Provide a reward signal.

Enable or disable automatic experience recording.

Set the baseline reward for comparison.

Set the learning rate.

Set the plasticity rule.

Start a brain learner process.

Stop the learner process.

Types

experience/0

-type experience() ::
          #{inputs := [float()],
            activations := [[float()]],
            outputs := [float()],
            timestamp := integer()}.

Functions

clear_experience(Pid)

-spec clear_experience(pid()) -> ok.

Clear the experience buffer.

disable(Pid)

-spec disable(pid()) -> ok.

Disable learning.

enable(Pid)

-spec enable(pid()) -> ok.

Enable learning.

get_auto_record(Pid)

-spec get_auto_record(pid()) -> boolean().

Check if automatic experience recording is enabled.

get_experience_count(Pid)

-spec get_experience_count(pid()) -> non_neg_integer().

Get the number of buffered experiences.

get_learning_rate(Pid)

-spec get_learning_rate(pid()) -> float().

Get the current learning rate.

get_plasticity_rule(Pid)

-spec get_plasticity_rule(pid()) -> atom().

Get the current plasticity rule.

get_weight_deltas(Pid)

-spec get_weight_deltas(pid()) -> [float()].

Get accumulated weight deltas from last learning step.

Useful for debugging and visualization.

handle_call(Request, From, State)

handle_cast(Msg, State)

handle_info(Info, State)

init(Opts)

is_enabled(Pid)

-spec is_enabled(pid()) -> boolean().

Check if learning is enabled.

learn_from_experience(Pid)

-spec learn_from_experience(pid()) -> {ok, non_neg_integer()}.

Learn from buffered experiences using current reward.

learn_from_experience(Pid, FinalReward)

-spec learn_from_experience(pid(), float()) -> {ok, non_neg_integer()}.

Learn from buffered experiences with a specific reward.

record_experience(Pid, Inputs, Activations)

-spec record_experience(pid(), [float()], [[float()]]) -> ok.

Record an experience for batch learning.

reward(Pid, Reward)

-spec reward(pid(), float()) -> ok.

Provide a reward signal.

For online learning, this affects the next weight update. Positive rewards strengthen active connections; negative rewards weaken them.
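For example, signaling a moderately positive outcome after an inference step (the reward value is illustrative):

```
ok = brain_learner:reward(Learner, 0.5).
```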

set_auto_record(Pid, Enabled)

-spec set_auto_record(pid(), boolean()) -> ok.

Enable or disable automatic experience recording.

When enabled, the learner automatically records experiences from 'evaluated' events published by the brain via pubsub.

set_baseline_reward(Pid, Baseline)

-spec set_baseline_reward(pid(), float()) -> ok.

Set the baseline reward for comparison.

Effective reward = actual_reward - baseline_reward. This helps with reward normalization.

set_learning_rate(Pid, Rate)

-spec set_learning_rate(pid(), float()) -> ok.

Set the learning rate.

set_plasticity_rule(Pid, Rule)

-spec set_plasticity_rule(pid(), atom()) -> ok.

Set the plasticity rule.

Available rules: none, hebbian, modulated

start_link(Opts)

-spec start_link(map()) -> {ok, pid()} | {error, term()}.

Start a brain learner process.

Options:

- inference_pid - PID of the brain inference process (required for weight updates)
- enabled - Whether learning is enabled (default: true)
- plasticity_rule - Atom identifying the rule (default: modulated)
- learning_rate - Learning rate (default: 0.01)
- baseline_reward - Baseline to subtract from rewards (default: 0.0)
- max_buffer_size - Max experiences to buffer (default: 1000)
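A minimal start-up sketch, assuming an already-running inference process bound to `InferencePid` (the option values simply restate the defaults above):

```
{ok, Learner} = brain_learner:start_link(#{
    inference_pid   => InferencePid,   %% required for weight updates
    plasticity_rule => modulated,
    learning_rate   => 0.01,
    max_buffer_size => 1000
}).
```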

stop(Pid)

-spec stop(pid()) -> ok.

Stop the learner process.

terminate(Reason, State)