Normalizing Flows with RealNVP-style affine coupling layers.
Normalizing flows learn invertible transformations between a simple base distribution (standard normal) and a complex target distribution. Because each layer is invertible with a tractable Jacobian, we get exact log-likelihood computation -- unlike VAEs which optimize a bound.
Architecture (RealNVP Affine Coupling)
Each coupling layer:
- Splits input into two halves: (x1, x2)
- Computes scale and translation from x1: s, t = NN(x1)
- Transforms x2: y2 = x2 * exp(s) + t
- Passes x1 unchanged: y1 = x1
- Output: (y1, y2)
This is trivially invertible:
x2 = (y2 - t) * exp(-s)
x1 = y1
The log-determinant of the Jacobian is simply sum(s), making density evaluation efficient.
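As a language-agnostic check of the math (plain Python here, not the library's Elixir/Nx API), the forward and inverse maps compose back to the identity, and the log-determinant is just the sum of the scales:

```python
import math

def coupling_forward(x1, x2, s, t):
    """Affine coupling: y2 = x2 * exp(s) + t, y1 = x1 (identity)."""
    y2 = [x2_i * math.exp(s_i) + t_i for x2_i, s_i, t_i in zip(x2, s, t)]
    return x1, y2

def coupling_inverse(y1, y2, s, t):
    """Exact inverse: x2 = (y2 - t) * exp(-s), x1 = y1."""
    x2 = [(y2_i - t_i) * math.exp(-s_i) for y2_i, s_i, t_i in zip(y2, s, t)]
    return y1, x2

# Toy values: in a real flow, s and t come from NN(x1).
x1, x2 = [1.0, 2.0], [3.0, 4.0]
s, t = [0.5, -0.5], [1.0, 0.0]
y1, y2 = coupling_forward(x1, x2, s, t)
r1, r2 = coupling_inverse(y1, y2, s, t)
log_det = sum(s)  # log|det J| = sum(s): the Jacobian is triangular
```

No matrix inversion or determinant computation is needed anywhere, which is the whole point of the coupling design.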
z ~ N(0, I)
|
v
+------------------+
| Coupling Layer 1 | split -> NN -> affine transform -> concat
+------------------+
|
v
+------------------+
| Coupling Layer 2 | (alternating split pattern)
+------------------+
|
v
...
|
v
+------------------+
| Coupling Layer K |
+------------------+
|
v
x ~ p(x)
Successive layers alternate which half is transformed to ensure all dimensions are eventually modified.
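A toy sketch of why the alternation matters (a hypothetical helper in plain Python, not part of the library): a single coupling layer leaves half the dimensions untouched, but two or more alternating layers transform every dimension:

```python
def transformed_dims(input_size, num_flows):
    """Track which dimensions have been transformed after alternating layers."""
    half = input_size // 2
    touched = [False] * input_size
    for k in range(num_flows):
        if k % 2 == 0:
            # even layer: first half conditions, second half is transformed
            for i in range(half, input_size):
                touched[i] = True
        else:
            # odd layer: the roles swap
            for i in range(half):
                touched[i] = True
    return touched
```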
Usage
# Build a normalizing flow
model = NormalizingFlow.build(input_size: 16, num_flows: 4, hidden_sizes: [128])
# Forward pass (encoding: data -> latent)
{z, log_det} = NormalizingFlow.forward(x, params, num_flows: 4, input_size: 16)
# Inverse pass (generation: latent -> data)
x = NormalizingFlow.inverse(z, params, num_flows: 4, input_size: 16)
# Log-likelihood
log_prob = NormalizingFlow.log_probability(x, params, num_flows: 4, input_size: 16)
Summary
Functions
Build a single RealNVP affine coupling layer as part of an Axon graph.
Build a normalizing flow model.
Inverse of a single affine coupling layer (for generation).
Compute log-determinant of the Jacobian for an affine coupling layer.
Compute the log-probability of data under the flow model.
Negative log-likelihood loss for normalizing flow training.
Compute the log-determinant for a full forward pass through all coupling layers.
Types
@type build_opt() :: {:activation, atom()} | {:hidden_sizes, [pos_integer()]} | {:input_size, pos_integer()} | {:num_flows, pos_integer()}
Options for build/1.
Functions
@spec affine_coupling_layer(Axon.t(), non_neg_integer(), keyword()) :: Axon.t()
Build a single RealNVP affine coupling layer as part of an Axon graph.
On even-indexed layers, x1 (first half) conditions the transform of x2. On odd-indexed layers, x2 (second half) conditions the transform of x1. This alternation ensures all dimensions are transformed across layers.
The coupling network outputs scale (s) and translation (t) parameters. Scale is passed through tanh and scaled to prevent extreme values.
Parameters
- input - Input Axon node [batch, input_size]
- flow_idx - Layer index (determines split direction)
- opts - Options including :input_size, :half_size, :hidden_sizes, :activation
Returns
An Axon node [batch, input_size] with the coupling transform applied.
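The tanh bounding of the scale can be sketched as follows (plain Python; the bound SCALE_LIMIT = 2.0 is an assumed value for illustration, the actual constant is not stated here):

```python
import math

SCALE_LIMIT = 2.0  # hypothetical bound; the library's actual constant may differ

def bounded_scale(raw_s):
    """Squash the raw network output into (-SCALE_LIMIT, SCALE_LIMIT)."""
    return [SCALE_LIMIT * math.tanh(v) for v in raw_s]

# Even an extreme raw output keeps exp(s) <= exp(SCALE_LIMIT), so the
# affine transform and its inverse stay numerically stable.
bounded = bounded_scale([100.0, -100.0, 0.0])
```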
Build a normalizing flow model.
Constructs num_flows affine coupling layers, each containing a
small neural network that computes scale and translation parameters.
The model is an Axon graph where the input flows through all coupling
layers sequentially.
Options
- :input_size - Input dimension, must be even (required)
- :num_flows - Number of coupling layers (default: 4)
- :hidden_sizes - Hidden layer sizes for each coupling network (default: [256])
- :activation - Activation function for coupling networks (default: :relu)
Returns
An Axon model: [batch, input_size] -> [batch, input_size].
The model transforms inputs through the flow. For density evaluation
and generation, use the forward/3, inverse/3, and log_probability/3
Nx functions directly.
@spec inverse_coupling_layer(Nx.Tensor.t(), [{Nx.Tensor.t(), Nx.Tensor.t()}], Nx.Tensor.t(), Nx.Tensor.t(), pos_integer(), boolean()) :: Nx.Tensor.t()
Inverse of a single affine coupling layer (for generation).
Given the output y of a coupling layer, recovers the input x:
x2 = (y2 - t) * exp(-s) where s, t = NN(y1)
x1 = y1
Parameters
- y - Coupling layer output [batch, input_size]
- s_params - Scale network parameters (list of {weight, bias} tuples)
- t_params - Translation network parameters (list of {weight, bias} tuples)
- half_size - Size of each split half
- even - Whether this is an even-indexed layer (determines split order)
Returns
Recovered input [batch, input_size].
@spec log_det_jacobian(Nx.Tensor.t()) :: Nx.Tensor.t()
Compute log-determinant of the Jacobian for an affine coupling layer.
For the affine coupling transform y2 = x2 * exp(s) + t, the Jacobian is triangular and its log-determinant is simply sum(s).
Parameters
- scale - Scale parameters s from the coupling network [batch, half_size]
Returns
Log-determinant [batch] (summed over dimensions).
@spec log_probability(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t()
Compute the log-probability of data under the flow model.
Uses the change-of-variables formula:
log p(x) = log p(z) + log|det(dz/dx)|
where z = f(x) is the forward transformation and p(z) = N(0, I).
Parameters
- z - Transformed data in latent space [batch, input_size]
- total_log_det - Sum of log-determinants from forward pass [batch]
Returns
Log-probability [batch].
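Numerically, the change-of-variables formula can be sketched like this (plain Python for a single example rather than the library's batched Nx API; assumes a standard-normal base distribution as stated above):

```python
import math

def std_normal_log_pdf(z):
    """log N(z; 0, I), summed over dimensions."""
    return sum(-0.5 * (v * v + math.log(2 * math.pi)) for v in z)

def log_probability(z, total_log_det):
    """Change of variables: log p(x) = log p(z) + log|det(dz/dx)|."""
    return std_normal_log_pdf(z) + total_log_det

# At the mode of the base distribution with an identity flow (log-det 0),
# the log-probability is just the standard-normal log-density at zero.
lp = log_probability([0.0, 0.0], 0.0)
```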
@spec nll_loss(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t()
Negative log-likelihood loss for normalizing flow training.
Minimizing NLL is equivalent to maximizing the log-probability of the training data under the flow model.
Parameters
- z - Encoded latent vectors [batch, input_size]
- total_log_det - Sum of log-determinants [batch]
Returns
NLL loss scalar (mean over batch).
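The loss itself is then just the mean negative log-probability over the batch, sketched here in plain Python (not the library API):

```python
import math

def log_prob(z, log_det):
    # log p(x) = log N(z; 0, I) + total log-determinant
    return sum(-0.5 * (v * v + math.log(2 * math.pi)) for v in z) + log_det

def nll_loss(z_batch, log_dets):
    """Mean negative log-likelihood over the batch (the training objective)."""
    logps = [log_prob(z, ld) for z, ld in zip(z_batch, log_dets)]
    return -sum(logps) / len(logps)

loss = nll_loss([[0.0, 0.0], [1.0, -1.0]], [0.0, 0.0])
```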
@spec total_log_det_jacobian([Nx.Tensor.t()]) :: Nx.Tensor.t()
Compute the log-determinant for a full forward pass through all coupling layers.
This is the sum of log-determinants from each individual coupling layer, needed for exact log-likelihood computation.
Parameters
- scales - List of scale tensors from each coupling layer, each [batch, half_size]
Returns
Total log-determinant [batch].