plasticity_modulated (macula_tweann v0.18.1)
Reward-modulated Hebbian plasticity rule.
This module implements a variant of Hebbian learning where the weight update is modulated by a global reward or punishment signal. This bridges unsupervised Hebbian learning with reinforcement learning.
Theory
Standard Hebbian learning strengthens any co-active connection, which can lead to "runaway" learning of irrelevant correlations. Modulated Hebbian learning addresses this by gating updates with a reward signal:
Δw = η × pre × post × reward
Where reward ∈ [-1, 1]:
- Positive reward: strengthens co-active connections (LTP)
- Negative reward: weakens co-active connections (LTD)
- Zero reward: no weight change (neutral)
This is biologically inspired by dopamine modulation of synaptic plasticity in the basal ganglia and prefrontal cortex.
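The core update is a single product of the three factors. As a minimal illustrative sketch (the function name `modulated_delta` is hypothetical, not part of this module's API; `Eta` stands for the learning rate stored in the weight spec):

```erlang
%% Illustrative sketch of the three-factor update, not the module's API.
%% Eta: learning rate; Pre/Post: activities; Reward in [-1.0, 1.0].
modulated_delta(Eta, Pre, Post, Reward) ->
    Eta * Pre * Post * Reward.

%% modulated_delta(0.01, 0.8, 0.6, 1.0) ≈ 0.0048, matching the
%% weight change in the Usage example below.
```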
Eligibility Traces
For delayed rewards, we maintain an eligibility trace that records which synapses were recently active:
e(t) = γ × e(t-1) + pre × post
Δw = η × e(t) × reward
Where γ is the trace decay rate (typically 0.9-0.99).
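One trace-gated step can be sketched as follows (hypothetical helper, not this module's API; the trace first decays by γ and accumulates the current co-activity, then the reward gates the weight change):

```erlang
%% Illustrative sketch of one eligibility-trace step (not the module's API).
%% Gamma: trace decay rate (e.g. 0.95); E0: previous trace value.
trace_step(Gamma, E0, Pre, Post, Eta, Reward) ->
    E1 = Gamma * E0 + Pre * Post,  %% decay old trace, add current co-activity
    Dw = Eta * E1 * Reward,        %% weight change gated by (possibly delayed) reward
    {E1, Dw}.
```

Because the trace persists across steps, a synapse that was co-active shortly before a reward arrives still receives credit, which is what solves the delayed-reward problem.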
Usage
Weight = {0.5, 0.0, 0.01, []},
PreActivity = 0.8,
PostActivity = 0.6,
Reward = 1.0, % Positive reward
NewWeight = plasticity_modulated:apply_rule(Weight, PreActivity, PostActivity, Reward).
%% => {0.5048, 0.0048, 0.01, []}
Configuration
Parameters in the weight_spec param list:
- trace_decay: Eligibility trace decay (default: no trace)
- trace: Current eligibility trace value
- baseline_reward: Baseline to subtract from reward
- reward_scale: Scale factor for reward
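Assuming the param list is a proplist in the fourth element of the weight spec (it is `[]` in the Usage example above; the exact shape is an assumption here), a fully configured weight might look like:

```erlang
%% Hypothetical configured weight spec; param-list shape is assumed.
Weight = {0.5, 0.0, 0.01,
          [{trace_decay, 0.95},     %% gamma for the eligibility trace
           {trace, 0.0},            %% current trace value
           {baseline_reward, 0.1},  %% subtracted from the incoming reward
           {reward_scale, 2.0}]}.   %% multiplies the (baselined) reward
```

Subtracting a baseline turns a raw reward into a prediction-error-like signal, so only better-than-expected outcomes produce LTP.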
References
[1] Schultz, W. (1998). Predictive Reward Signal of Dopamine Neurons. Journal of Neurophysiology.
[2] Izhikevich, E.M. (2007). Solving the Distal Reward Problem through Linkage of STDP and Dopamine Signaling. Cerebral Cortex.
[3] Fremaux, N., Gerstner, W. (2016). Neuromodulated Spike-Timing-Dependent Plasticity, and Theory of Three-Factor Learning Rules. Frontiers in Neural Circuits.
Types
-type weight_spec() :: plasticity:weight_spec().
Functions
-spec apply_rule(weight_spec(), float(), float(), float()) -> weight_spec().
Apply the modulated Hebbian learning rule.
The weight change is modulated by the reward signal.
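For contrast with the positive-reward Usage example above, a negative reward with the same co-activity weakens the weight (LTD). The expected result follows from the formula Δw = η × pre × post × reward:

```erlang
%% Same weight spec shape as in the Usage section; reward is now negative.
Weight = {0.5, 0.0, 0.01, []},
NewWeight = plasticity_modulated:apply_rule(Weight, 0.8, 0.6, -1.0).
%% By the formula, Δw = 0.01 × 0.8 × 0.6 × -1.0 = -0.0048,
%% so the weight value should move from 0.5 to 0.4952.
```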
-spec apply_with_trace(weight_spec(), float(), float(), float(), float()) -> {weight_spec(), float()}.
Apply rule with explicit eligibility trace handling.
This variant allows external management of the eligibility trace, useful when the trace needs to be shared across weights.
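A sketch of external trace management, threading one trace through successive calls (the argument order `Pre, Post, Reward, Trace` is assumed from the spec, and `WeightA`/`WeightB` stand for previously built weight specs):

```erlang
%% Hypothetical sketch: one externally held trace shared across weights.
Trace0 = 0.0,
{NewA, Trace1} = plasticity_modulated:apply_with_trace(WeightA, 0.8, 0.6, 1.0, Trace0),
{NewB, Trace2} = plasticity_modulated:apply_with_trace(WeightB, 0.4, 0.9, 1.0, Trace1).
```

Each call returns the updated weight spec together with the updated trace, which is then fed into the next call.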
-spec description() -> binary().
Return a description of this rule.
-spec get_trace(weight_spec()) -> float().
Get the eligibility trace from a weight spec.
Initialize state for this rule.
Initializes the eligibility trace if configured.
-spec name() -> atom().
Return the rule name.
Reset the rule state.
-spec set_trace(weight_spec(), float()) -> weight_spec().
Set the eligibility trace in a weight spec.
Update an eligibility trace without applying plasticity.
Useful for maintaining traces during periods without reward.