brain_learner (macula_tweann v0.18.1)
Brain learner process for weight adaptation via plasticity.
This GenServer manages the learning aspects of a brain system:

- Applies plasticity rules to update weights based on neural activity
- Maintains an experience buffer for batch learning
- Handles reward signals for reinforcement-style learning
Online Learning
When online learning is enabled, the learner receives activation data after each inference and applies plasticity rules:
Inference → Activations → Learner → Weight Updates → Back to Inference
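The online loop above can be sketched in miniature. This is an illustrative Python simulation of a single linear "neuron", not the library's Erlang implementation; `online_step` and its parameters are hypothetical names.

```python
def online_step(weights, inputs, reward, learning_rate=0.01):
    # Inference: compute the post-synaptic activation from current weights.
    post = sum(w * x for w, x in zip(weights, inputs))
    # Learner: apply the plasticity update using the recorded activations.
    new_weights = [w + learning_rate * x * post * reward
                   for w, x in zip(weights, inputs)]
    # The updated weights feed the next inference step.
    return post, new_weights
```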
Batch Learning
For delayed rewards (e.g., end of game), the learner buffers experiences and applies learning when a reward is received:
1. Record experiences during episode
2. Receive final reward
3. Apply learning with reward propagation (eligibility traces)
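The three steps above can be sketched as follows. This is a hypothetical Python illustration of delayed-reward credit assignment, assuming an exponentially decaying eligibility trace; the decay constant and function names are assumptions, not the library's API.

```python
def apply_delayed_reward(experiences, reward, learning_rate=0.01, decay=0.9):
    """experiences: oldest-first list of (pre, post) activity pairs.

    More recent experiences receive more credit: the trace of the most
    recent step is 1.0 and decays geometrically going back in time.
    """
    n = len(experiences)
    deltas = []
    for i, (pre, post) in enumerate(experiences):
        trace = decay ** (n - 1 - i)
        deltas.append(learning_rate * pre * post * reward * trace)
    return deltas
```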
Theory
This module implements reward-modulated Hebbian learning, where weight changes depend on:

- Pre-synaptic activity (input to connection)
- Post-synaptic activity (output from connection)
- Global reward signal (from environment)
The basic rule: delta_w = learning_rate * pre * post * reward
For delayed rewards, eligibility traces track which synapses were recently active, allowing credit assignment across time.
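The basic rule is simple enough to work through directly. A minimal Python sketch of the update formula above (the library itself is Erlang; this function name is illustrative):

```python
def hebbian_update(w, pre, post, reward, learning_rate=0.01):
    """delta_w = learning_rate * pre * post * reward"""
    return w + learning_rate * pre * post * reward

# A rewarded, co-active connection is strengthened:
hebbian_update(0.5, pre=1.0, post=1.0, reward=1.0)   # ~0.51
# A punished, co-active connection is weakened:
hebbian_update(0.5, pre=1.0, post=1.0, reward=-1.0)  # ~0.49
```

Note that if either `pre` or `post` is zero, the delta is zero: only connections that actually participated in the activity are changed.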
See also: plasticity, plasticity_modulated.
Summary
Functions
Clear the experience buffer.
Disable learning.
Enable learning.
Check if automatic experience recording is enabled.
Get the number of buffered experiences.
Get the current learning rate.
Get the current plasticity rule.
Get accumulated weight deltas from last learning step.
Check if learning is enabled.
Learn from buffered experiences using current reward.
Learn from buffered experiences with a specific reward.
Record an experience for batch learning.
Provide a reward signal.
Enable or disable automatic experience recording.
Set the baseline reward for comparison.
Set the learning rate.
Set the plasticity rule.
Start a brain learner process.
Stop the learner process.
Types
Functions
-spec clear_experience(pid()) -> ok.
Clear the experience buffer.
-spec disable(pid()) -> ok.
Disable learning.
-spec enable(pid()) -> ok.
Enable learning.
Check if automatic experience recording is enabled.
-spec get_experience_count(pid()) -> non_neg_integer().
Get the number of buffered experiences.
Get the current learning rate.
Get the current plasticity rule.
Get accumulated weight deltas from last learning step.
Useful for debugging and visualization.
Check if learning is enabled.
-spec learn_from_experience(pid()) -> {ok, non_neg_integer()}.
Learn from buffered experiences using current reward.
-spec learn_from_experience(pid(), float()) -> {ok, non_neg_integer()}.
Learn from buffered experiences with a specific reward.
Record an experience for batch learning.
Provide a reward signal.
For online learning, this affects the next weight update: positive rewards strengthen recently active connections, while negative rewards weaken them.
Enable or disable automatic experience recording.
When enabled, the learner automatically records experiences from 'evaluated' events published by the brain via pubsub.
Set the baseline reward for comparison.
Effective reward = actual_reward - baseline_reward. This helps with reward normalization.
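Worked through in a short Python sketch (illustrative only, not the library's Erlang code):

```python
def effective_reward(actual, baseline=0.0):
    """Rewards above the baseline strengthen; rewards below it weaken."""
    return actual - baseline

# With baseline 0.5, an average-looking reward of 0.5 produces no update,
# while a below-baseline reward acts as a weakening (negative) signal.
```

This is the same idea as a reward baseline in reinforcement learning: subtracting a running average reduces variance and lets merely-average outcomes leave the weights alone.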
Set the learning rate.
Set the plasticity rule.
Available rules: none, hebbian, modulated
Start a brain learner process.
Options:

- inference_pid - PID of the brain inference process (required for weight updates)
- enabled - Whether learning is enabled (default: true)
- plasticity_rule - Atom identifying the rule (default: modulated)
- learning_rate - Learning rate (default: 0.01)
- baseline_reward - Baseline to subtract from rewards (default: 0.0)
- max_buffer_size - Max experiences to buffer (default: 1000)
-spec stop(pid()) -> ok.
Stop the learner process.