viva_math/entropy
Entropy and information theory functions.
Based on Shannon (1948) and Kullback-Leibler (1951). Used for memory consolidation scoring and uncertainty quantification.
References:
- Shannon (1948) “A Mathematical Theory of Communication”
- Cover & Thomas (2006) “Elements of Information Theory”
Types
KL divergence sensitivity types.
Controls how sensitive the divergence is to differences.
pub type KlSensitivity {
  Standard
  ArousalWeighted(arousal: Float)
  CustomGamma(gamma: Float)
}
Constructors
- Standard
  Standard KL divergence
- ArousalWeighted(arousal: Float)
  Arousal-weighted: γ increases with arousal (sharper for high arousal)
- CustomGamma(gamma: Float)
  Custom gamma parameter
Values
pub fn binary_cross_entropy(
  p: Float,
  q: Float,
) -> Result(Float, Nil)
Binary cross-entropy for a single probability: p is the true probability, q the prediction.
H(p, q) = -[p log(q) + (1-p) log(1-q)]
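Examples
A hedged sketch (the log base and exact error conditions are not stated here; in-range probabilities are assumed to return Ok):
binary_cross_entropy(0.9, 0.9) // -> Ok(...): loss is smallest when the prediction q matches p
binary_cross_entropy(0.9, 0.1) // -> Ok(...): much larger loss for a confident, wrong prediction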
pub fn conditional_entropy(
  px: List(Float),
  pxy: List(List(Float)),
) -> Float
Conditional entropy: H(X|Y) = H(X, Y) - H(Y)
Uncertainty in X given knowledge of Y.
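Examples
A hedged sketch with a uniform, independent joint distribution (assuming the log₂ base that shannon uses; under independence the orientation of pxy does not matter):
conditional_entropy([0.5, 0.5], [[0.25, 0.25], [0.25, 0.25]]) // -> 1.0: knowing Y tells us nothing about X, so H(X|Y) = H(X)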
pub fn cross_entropy(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)
Cross-entropy: H(P, Q) = -Σ p(x) log q(x)
Used in machine learning loss functions. H(P, Q) = H(P) + D_KL(P || Q)
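Examples
A hedged sketch (the 1.0 assumes the log₂ base that shannon uses):
cross_entropy([0.5, 0.5], [0.5, 0.5]) // -> Ok(1.0): D_KL(P || P) = 0, so this is just H(P)
cross_entropy([0.5, 0.5], [0.9, 0.1]) // -> Ok(...): exceeds H(P), since D_KL(P || Q) > 0 when Q ≠ P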
pub fn hybrid_shannon(
  probs1: List(Float),
  probs2: List(Float),
  alpha: Float,
) -> Float
Hybrid entropy for mixed emotional states.
H_hybrid(X) = α × H(X₁) + (1 - α) × H(X₂)
Proposed by DeepSeek R1 for modeling hybrid emotions. α ∈ [0, 1] controls the blend between two emotional distributions.
Examples
hybrid_shannon([0.5, 0.5], [0.7, 0.3], 0.5) // Blend of two emotions
pub fn jensen_shannon(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)
Jensen-Shannon Divergence: JS(P, Q) = (D_KL(P || M) + D_KL(Q || M)) / 2 where M = (P + Q) / 2
It is symmetric and bounded to [0, 1] when using log₂.
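Examples
A hedged sketch of the qualitative behaviour:
jensen_shannon([0.5, 0.5], [0.5, 0.5]) // -> Ok(0.0): identical distributions do not diverge
jensen_shannon([0.9, 0.1], [0.1, 0.9]) // -> Ok(...): approaches 1.0 as the distributions separate (with log₂)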
pub fn kl_divergence(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)
KL Divergence: D_KL(P || Q) = Σ p(x) log(p(x) / q(x))
Measures how much P diverges from Q (not symmetric!). P is the “true” distribution, Q is the approximation.
Returns Error if distributions have different lengths or Q has zeros where P is non-zero.
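Examples
These follow directly from the definition and the documented error conditions:
kl_divergence([0.5, 0.5], [0.5, 0.5]) // -> Ok(0.0): P and Q are identical
kl_divergence([0.5, 0.5], [1.0, 0.0]) // -> Error(Nil): Q has a zero where P is non-zero
kl_divergence([0.5, 0.5], [0.5]) // -> Error(Nil): lengths differ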
pub fn kl_divergence_with_sensitivity(
  p: List(Float),
  q: List(Float),
  sensitivity: KlSensitivity,
) -> Result(Float, Nil)
KL divergence with sensitivity parameter.
D_KL^γ(P || Q) = γ × (μ₁ - μ₂)² + D_KL(P || Q)
Proposed by DeepSeek R1 for arousal-modulated divergence. Higher γ = more sensitive to mean differences.
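Examples
A hedged sketch: with identical distributions both the KL term and the mean-difference term vanish, whatever the sensitivity:
kl_divergence_with_sensitivity([0.5, 0.5], [0.5, 0.5], Standard) // -> Ok(0.0)
kl_divergence_with_sensitivity([0.5, 0.5], [0.5, 0.5], ArousalWeighted(0.9)) // -> Ok(0.0)
When the distributions differ, a larger γ (or higher arousal) amplifies the (μ₁ - μ₂)² penalty.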
pub fn mutual_information(
  px: List(Float),
  py: List(Float),
  pxy: List(List(Float)),
) -> Float
Mutual Information: I(X; Y) = H(X) + H(Y) - H(X, Y)
Measures shared information between two variables. Takes marginal distributions and joint distribution as input.
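Examples
A hedged sketch (the 1.0 assumes the log₂ base that shannon uses and that 0 log 0 is treated as 0):
mutual_information([0.5, 0.5], [0.5, 0.5], [[0.25, 0.25], [0.25, 0.25]]) // -> 0.0: independent variables share no information
mutual_information([0.5, 0.5], [0.5, 0.5], [[0.5, 0.0], [0.0, 0.5]]) // -> 1.0: X fully determines Y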
pub fn relative_entropy_rate(
  observed: List(Float),
  expected: List(Float),
) -> Result(Float, Nil)
Relative entropy rate for sequences.
Used for measuring “surprise” in temporal data.
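Examples
A hedged sketch, assuming (as for kl_divergence) that a matching distribution yields zero:
relative_entropy_rate([0.5, 0.5], [0.5, 0.5]) // -> Ok(0.0): no surprise when observed matches expected
relative_entropy_rate([0.9, 0.1], [0.5, 0.5]) // -> Ok(...): positive when the observation deviates from expectation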
pub fn renyi(
  probabilities: List(Float),
  alpha: Float,
) -> Result(Float, Nil)
Rényi entropy of order α.
H_α(X) = (1/(1-α)) × log(Σ p(x)^α)
Generalizes Shannon entropy (α → 1 recovers Shannon). Special cases:
- α = 0: Hartley entropy (log of support size)
- α = 2: collision entropy
- α → ∞: min-entropy
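Examples
A hedged sketch (the 2.0 assumes the log₂ base that shannon uses; for a uniform distribution every order gives log₂(n)):
renyi([0.25, 0.25, 0.25, 0.25], 0.0) // -> Ok(2.0): Hartley entropy, log₂ of the support size
renyi([0.25, 0.25, 0.25, 0.25], 2.0) // -> Ok(2.0): collision entropy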
pub fn shannon(probabilities: List(Float)) -> Float
Shannon entropy: H(X) = -Σ p(x) log₂ p(x)
Measures uncertainty/information content of a distribution. Higher entropy = more uncertainty.
Examples
shannon([0.5, 0.5]) // -> 1.0 (maximum for 2 outcomes)
shannon([1.0, 0.0]) // -> 0.0 (no uncertainty)
shannon([0.25, 0.25, 0.25, 0.25]) // -> 2.0
pub fn shannon_normalized(probabilities: List(Float)) -> Float
Normalized Shannon entropy (0 to 1 range).
Divides by log₂(n), where n is the number of outcomes.
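Examples
These follow from the shannon examples above, divided by log₂(n):
shannon_normalized([0.25, 0.25, 0.25, 0.25]) // -> 1.0 (uniform = maximum entropy, 2.0 / log₂(4))
shannon_normalized([1.0, 0.0]) // -> 0.0 (no uncertainty)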
pub fn symmetric_kl(
  p: List(Float),
  q: List(Float),
) -> Result(Float, Nil)
Symmetric KL divergence (also known as the Jeffreys divergence): the KL divergence summed in both directions. Unlike Jensen-Shannon, it does not pass through the midpoint distribution M.
D_sym(P, Q) = D_KL(P || Q) + D_KL(Q || P)
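Examples
A sketch (both directions are summed, so the result does not depend on argument order):
symmetric_kl([0.5, 0.5], [0.5, 0.5]) // -> Ok(0.0): identical distributions
symmetric_kl([0.9, 0.1], [0.1, 0.9]) // -> Ok(...): same value with the arguments swapped, unlike kl_divergence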