Evidential Deep Learning with Dirichlet Priors
Evidential Neural Networks place a Dirichlet distribution over class probabilities, enabling principled uncertainty estimation in a single forward pass (no ensembles or MC sampling needed). The network outputs per-class evidence that is mapped to Dirichlet concentration parameters (alpha), from which both aleatoric and epistemic uncertainty can be derived.
How It Works
Instead of outputting softmax probabilities, the network outputs non-negative evidence e_k >= 0 for each class, which parameterizes a Dirichlet distribution over the class probabilities:

p ~ Dir(alpha), with alpha_k = e_k + 1

The Dirichlet concentration parameters encode:
- Belief mass: b_k = (alpha_k - 1) / S where S = sum(alpha)
- Uncertainty mass: u = K / S (K = num_classes)
- Expected probability: p_k = alpha_k / S
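These quantities can be checked numerically. A minimal NumPy sketch of the math above (illustrative only, not this library's API):

```python
import numpy as np

# Toy 3-class example: strong evidence for class 0, none for the others.
alpha = np.array([10.0, 1.0, 1.0])
S = alpha.sum()             # Dirichlet strength, S = sum(alpha)
K = alpha.size              # number of classes

belief = (alpha - 1.0) / S  # b_k: belief mass per class
u = K / S                   # uncertainty mass
p = alpha / S               # expected class probabilities

# Belief masses and uncertainty mass partition the unit: sum(b_k) + u == 1
assert np.isclose(belief.sum() + u, 1.0)
```

Here S = 12, so u = 3/12 = 0.25: a quarter of the mass is unassigned uncertainty despite the strong class-0 evidence.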
Uncertainty Types
| Type | Formula | Meaning |
|---|---|---|
| Epistemic | u = K / S | Lack of evidence (knowledge uncertainty) |
| Aleatoric | E[H[Cat(p)]] | Inherent class overlap |
| Total | H[E[Cat(p)]] | Combined uncertainty |
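The three measures can be computed directly from alpha. A SciPy sketch of the Dirichlet math (formulas follow the standard evidential-learning decomposition; not taken from this module's source):

```python
import numpy as np
from scipy.special import digamma

alpha = np.array([10.0, 1.0, 1.0])
S = alpha.sum()
p = alpha / S

# Epistemic: u = K / S (shrinks as total evidence grows).
epistemic = alpha.size / S

# Aleatoric: expected entropy of Cat(p) under the Dirichlet,
# E_Dir[H[Cat(p)]] = -sum_k p_k * (digamma(alpha_k + 1) - digamma(S + 1)).
aleatoric = -np.sum(p * (digamma(alpha + 1.0) - digamma(S + 1.0)))

# Total: entropy of the expected distribution, H[E[Cat(p)]].
total = -np.sum(p * np.log(p))
```

By Jensen's inequality the total uncertainty is always at least the aleatoric part; their gap reflects the epistemic contribution.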
Architecture
Input [batch, input_size]
|
v
+--------------------------------------+
| Backbone MLP |
| Dense -> Act -> Dense -> Act |
+--------------------------------------+
|
v
+--------------------------------------+
| Evidence Head: |
| Dense -> Softplus (ensures > 0) |
| alpha = evidence + 1 |
+--------------------------------------+
|
v
Dirichlet Parameters [batch, num_classes]

Usage
model = EvidentialNN.build(
input_size: 256,
hidden_sizes: [128, 64],
num_classes: 10
)
# Get predictions + uncertainty
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(Nx.template({1, 256}, :f32), %{})
alpha = predict_fn.(params, input)
{epistemic, aleatoric} = EvidentialNN.uncertainty(alpha)

References
- Sensoy et al., "Evidential Deep Learning to Quantify Classification Uncertainty" (NeurIPS 2018)
- https://arxiv.org/abs/1806.01768
Summary
Functions
Compute aleatoric (data) uncertainty.
Build an Evidential Neural Network.
Compute epistemic (knowledge) uncertainty.
Evidential Deep Learning loss (Type II Maximum Likelihood).
Compute the expected class probabilities from Dirichlet parameters.
Get the output size of an Evidential NN.
Compute epistemic and aleatoric uncertainty from Dirichlet parameters.
Types
@type build_opt() ::
        {:activation, atom()}
        | {:dropout, float()}
        | {:hidden_sizes, [pos_integer()]}
        | {:input_size, pos_integer()}
        | {:num_classes, pos_integer() | nil}
Options for build/1.
Functions
@spec aleatoric_uncertainty(Nx.Tensor.t()) :: Nx.Tensor.t()
Compute aleatoric (data) uncertainty.
Aleatoric uncertainty is the expected entropy of the categorical distribution under the Dirichlet: E_Dir[H[Cat(p)]].
Parameters
alpha - Dirichlet parameters [batch, num_classes]
Returns
Aleatoric uncertainty [batch].
Build an Evidential Neural Network.
Options
:input_size - Input feature dimension (required)
:hidden_sizes - List of hidden layer sizes (default: [256, 128])
:num_classes - Number of output classes (required)
:activation - Activation function for hidden layers (default: :relu)
:dropout - Dropout rate (default: 0.0)
Returns
An Axon model outputting Dirichlet alpha parameters:
[batch, input_size] -> [batch, num_classes]
The output alpha_k values are always > 1 (evidence + 1).
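The evidence head's arithmetic can be sketched in a few lines (hypothetical logits; this mirrors the math, not the Axon layers). Softplus maps raw dense outputs to strictly positive evidence, so alpha = evidence + 1 is always > 1:

```python
import numpy as np

def softplus(x):
    # log(1 + exp(x)), written as logaddexp(0, x) for numerical stability
    return np.logaddexp(0.0, x)

logits = np.array([[2.0, -1.0, 0.5]])  # raw dense-layer outputs
evidence = softplus(logits)            # strictly positive evidence per class
alpha = evidence + 1.0                 # Dirichlet parameters, all > 1

assert np.all(alpha > 1.0)
```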
@spec epistemic_uncertainty(Nx.Tensor.t()) :: Nx.Tensor.t()
Compute epistemic (knowledge) uncertainty.
Epistemic uncertainty = K / S where K is the number of classes and S = sum(alpha) is the Dirichlet strength. High uncertainty when evidence is low (alpha values close to 1).
Parameters
alpha - Dirichlet parameters [batch, num_classes]
Returns
Epistemic uncertainty [batch] in [0, 1].
@spec evidential_loss(Nx.Tensor.t(), Nx.Tensor.t(), keyword()) :: Nx.Tensor.t()
Evidential Deep Learning loss (Type II Maximum Likelihood).
Combines the negative log-likelihood of the Dirichlet-Categorical model with a KL divergence regularizer that penalizes evidence on incorrect classes.
Parameters
alpha - Predicted Dirichlet parameters [batch, num_classes]
targets - One-hot encoded targets [batch, num_classes]
Options
:kl_weight- Weight for KL regularization term (default: 0.01)
Returns
Scalar loss tensor.
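A numeric sketch of this loss following the Sensoy et al. (2018) Type II ML formulation; it is assumed to mirror evidential_loss/3, and the helper names here are hypothetical, not the module's API:

```python
import numpy as np
from scipy.special import gammaln, digamma

def kl_divergence_to_uniform(alpha):
    # KL( Dir(alpha) || Dir(1, ..., 1) ), zero when alpha is all ones.
    K = alpha.shape[-1]
    S = alpha.sum(axis=-1)
    return (gammaln(S) - gammaln(alpha).sum(axis=-1) - gammaln(K)
            + np.sum((alpha - 1.0) * (digamma(alpha) - digamma(S[..., None])),
                     axis=-1))

def evidential_loss(alpha, targets, kl_weight=0.01):
    S = alpha.sum(axis=-1, keepdims=True)
    # Negative log-likelihood of the Dirichlet-Categorical model.
    nll = np.sum(targets * (np.log(S) - np.log(alpha)), axis=-1)
    # Remove evidence for the true class, then penalize whatever remains
    # on the incorrect classes via KL to the uniform Dirichlet.
    alpha_tilde = targets + (1.0 - targets) * alpha
    kl = kl_divergence_to_uniform(alpha_tilde)
    return np.mean(nll + kl_weight * kl)

alpha = np.array([[5.0, 1.2, 1.1]])    # confident, mostly class 0
targets = np.array([[1.0, 0.0, 0.0]])  # one-hot label for class 0
loss = evidential_loss(alpha, targets)
```

Evidence concentrated on the wrong class raises both the NLL term and the KL penalty, which is what discourages confident mistakes.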
@spec expected_probability(Nx.Tensor.t()) :: Nx.Tensor.t()
Compute the expected class probabilities from Dirichlet parameters.
Parameters
alpha - Dirichlet parameters [batch, num_classes]
Returns
Expected probabilities [batch, num_classes] that sum to 1.
@spec output_size(keyword()) :: pos_integer()
Get the output size of an Evidential NN.
@spec uncertainty(Nx.Tensor.t()) :: {Nx.Tensor.t(), Nx.Tensor.t()}
Compute epistemic and aleatoric uncertainty from Dirichlet parameters.
Parameters
alpha - Dirichlet concentration parameters [batch, num_classes]
Returns
Tuple {epistemic, aleatoric}:
epistemic - Uncertainty due to lack of evidence [batch]
aleatoric - Uncertainty due to inherent data noise [batch]