plasticity_modulated (macula_tweann v0.18.1)


Reward-modulated Hebbian plasticity rule.

This module implements a variant of Hebbian learning where the weight update is modulated by a global reward or punishment signal. This bridges unsupervised Hebbian learning with reinforcement learning.

Theory

Standard Hebbian learning strengthens any co-active connections, which can lead to "runaway" learning of irrelevant correlations. Modulated Hebbian learning solves this by gating updates with a reward signal:

Δw = η × pre × post × reward

Where reward ∈ [-1, 1]:
- Positive reward: strengthens co-active connections (LTP)
- Negative reward: weakens co-active connections (LTD)
- Zero reward: no weight change (neutral)
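The core update is small enough to sketch in a few lines of standalone Erlang. This is an illustration with hypothetical names (`mod_hebb_sketch`, `delta_w/4`), not the module's internal implementation:

```erlang
%% Standalone sketch of the modulated Hebbian update:
%%   Dw = Eta * Pre * Post * Reward
%% Hypothetical module, not part of macula_tweann.
-module(mod_hebb_sketch).
-export([delta_w/4]).

-spec delta_w(float(), float(), float(), float()) -> float().
delta_w(Eta, Pre, Post, Reward) ->
    Eta * Pre * Post * Reward.
```

With Eta = 0.01, Pre = 0.8, Post = 0.6 and Reward = 1.0 this yields roughly 0.0048, matching the worked example in the Usage section below; flipping the sign of the reward flips the sign of the update, and a zero reward leaves the weight untouched.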

This is biologically inspired by dopamine modulation of synaptic plasticity in the basal ganglia and prefrontal cortex.

Eligibility Traces

For delayed rewards, we maintain an eligibility trace that records which synapses were recently active:

e(t) = γ × e(t-1) + pre × post
Δw = η × e(t) × reward

Where γ is the trace decay rate (typically 0.9-0.99).
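Under the same caveat (standalone sketch, hypothetical names), trace maintenance and the delayed update can be written as:

```erlang
%% Standalone sketch of eligibility-trace bookkeeping:
%%   e(t) = Gamma * e(t-1) + Pre * Post
%%   Dw   = Eta * e(t) * Reward
%% Hypothetical module, not part of macula_tweann.
-module(trace_sketch).
-export([step_trace/4, delta_w/3]).

-spec step_trace(float(), float(), float(), float()) -> float().
step_trace(Gamma, OldTrace, Pre, Post) ->
    Gamma * OldTrace + Pre * Post.

-spec delta_w(float(), float(), float()) -> float().
delta_w(Eta, Trace, Reward) ->
    Eta * Trace * Reward.
```

With γ = 0.9, a co-activation event still retains about 0.81 of its trace after two quiet steps, which is what lets a reward arriving later credit the earlier activity.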

Usage

```erlang
Weight = {0.5, 0.0, 0.01, []},
PreActivity = 0.8,
PostActivity = 0.6,
Reward = 1.0,  % Positive reward

NewWeight = plasticity_modulated:apply_rule(Weight, PreActivity, PostActivity, Reward).
%% => {0.5048, 0.0048, 0.01, []}
```

Configuration

Parameters in the weight_spec param list:
- trace_decay: Eligibility trace decay rate (default: no trace)
- trace: Current eligibility trace value
- baseline_reward: Baseline to subtract from the reward
- reward_scale: Scale factor for the reward
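For illustration only, assuming the param list is the fourth element of the weight tuple as in the Usage example above (the meanings of the other tuple fields are assumptions; consult the plasticity module for the authoritative layout), a trace-enabled weight might be built like this:

```erlang
%% Hypothetical trace-enabled weight_spec; tuple layout assumed from the
%% Usage example ({Value, _, LearningRate, Params}).
Weight = {0.5, 0.0, 0.01,
          [{trace_decay, 0.95},     %% enable the eligibility trace
           {trace, 0.0},            %% start with an empty trace
           {baseline_reward, 0.1},  %% subtracted from the incoming reward
           {reward_scale, 1.0}]}.   %% multiplies the (baselined) reward
```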

References

[1] Schultz, W. (1998). Predictive Reward Signal of Dopamine Neurons. Journal of Neurophysiology.
[2] Izhikevich, E.M. (2007). Solving the Distal Reward Problem through Linkage of STDP and Dopamine Signaling. Cerebral Cortex.
[3] Frémaux, N., Gerstner, W. (2016). Neuromodulated Spike-Timing-Dependent Plasticity, and Theory of Three-Factor Learning Rules. Frontiers in Neural Circuits.

Summary

Functions

apply_rule/4: Apply the modulated Hebbian learning rule.

apply_with_trace/5: Apply rule with explicit eligibility trace handling.

description/0: Return a description of this rule.

get_trace/1: Get the eligibility trace from a weight spec.

init/1: Initialize state for this rule.

name/0: Return the rule name.

reset/1: Reset the rule state.

set_trace/2: Set the eligibility trace in a weight spec.

update_trace/4: Update an eligibility trace without applying plasticity.

Types

weight_spec/0

-type weight_spec() :: plasticity:weight_spec().

Functions

apply_rule(_, PreActivity, PostActivity, Reward)

-spec apply_rule(weight_spec(), float(), float(), float()) -> weight_spec().

Apply the modulated Hebbian learning rule.

The weight change is modulated by the reward signal.

apply_with_trace(_, PreActivity, PostActivity, Reward, TraceDecay)

-spec apply_with_trace(weight_spec(), float(), float(), float(), float()) -> {weight_spec(), float()}.

Apply rule with explicit eligibility trace handling.

This variant allows external management of the eligibility trace, useful when the trace needs to be shared across weights.
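Based on the spec above, a caller that owns the trace might wrap the call like this. The wrapper name and surrounding logic are hypothetical; only apply_with_trace/5 is from this module:

```erlang
%% Hypothetical wrapper: the caller persists the returned trace itself, so
%% several weights can be modulated by one shared trace.
train_step(Weight0, Pre, Post, Reward, TraceDecay) ->
    {Weight1, NewTrace} =
        plasticity_modulated:apply_with_trace(Weight0, Pre, Post, Reward,
                                              TraceDecay),
    %% Store NewTrace for the next step (e.g. in the caller's state record).
    {Weight1, NewTrace}.
```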

description()

-spec description() -> binary().

Return a description of this rule.

get_trace(_)

-spec get_trace(weight_spec()) -> float().

Get the eligibility trace from a weight spec.

init(Params)

-spec init(map()) -> map().

Initialize state for this rule.

Initializes the eligibility trace if configured.

name()

-spec name() -> atom().

Return the rule name.

reset(State)

-spec reset(map()) -> map().

Reset the rule state.

set_trace(_, Trace)

-spec set_trace(weight_spec(), float()) -> weight_spec().

Set the eligibility trace in a weight spec.

update_trace(PreActivity, PostActivity, OldTrace, TraceDecay)

-spec update_trace(float(), float(), float(), float()) -> float().

Update an eligibility trace without applying plasticity.

Useful for maintaining traces during periods without reward.
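Combined with the trace formula from the Theory section, a reward-free interval can be bridged like this (values are illustrative; the expected magnitudes follow from e(t) = γ × e(t-1) + pre × post):

```erlang
%% Decay a trace across a step with no reward, then let a late reward act.
T0 = 0.0,
T1 = plasticity_modulated:update_trace(0.8, 0.6, T0, 0.9),  %% ~0.48 after co-activity
T2 = plasticity_modulated:update_trace(0.0, 0.0, T1, 0.9),  %% ~0.432 one quiet step later
%% A reward arriving now still credits the earlier co-activation via T2.
```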