# `Edifice.Generative.NormalizingFlow`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/generative/normalizing_flow.ex#L1)

Normalizing Flows with RealNVP-style affine coupling layers.

Normalizing flows learn invertible transformations between a simple
base distribution (a standard normal) and a complex target distribution.
Because each layer is invertible with a tractable Jacobian, the model
admits exact log-likelihood computation -- unlike VAEs, which only
optimize a lower bound (the ELBO).

## Architecture (RealNVP Affine Coupling)

Each coupling layer:
1. Splits input into two halves: (x1, x2)
2. Computes scale and translation from x1: s, t = NN(x1)
3. Transforms x2: y2 = x2 * exp(s) + t
4. Passes x1 unchanged: y1 = x1
5. Output: (y1, y2)

This is trivially invertible:
    x2 = (y2 - t) * exp(-s)
    x1 = y1

The log-determinant of the Jacobian is simply sum(s), making
density evaluation efficient.
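The invertibility and triangular Jacobian can be checked numerically. A minimal NumPy sketch (the tiny linear-plus-tanh `nn` here is a stand-in for the coupling network, not this module's implementation; invertibility never depends on the network itself being invertible, since `x1` passes through unchanged):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the coupling network NN(x1) -> (s, t).
W_s, W_t = rng.normal(size=(2, 2)), rng.normal(size=(2, 2))
nn = lambda x1: (np.tanh(x1 @ W_s), x1 @ W_t)  # (scale, translation)

def coupling_forward(x):
    x1, x2 = x[:, :2], x[:, 2:]
    s, t = nn(x1)
    y2 = x2 * np.exp(s) + t          # affine transform of the second half
    log_det = s.sum(axis=1)          # log|det J| = sum(s); J is triangular
    return np.concatenate([x1, y2], axis=1), log_det

def coupling_inverse(y):
    y1, y2 = y[:, :2], y[:, 2:]
    s, t = nn(y1)                    # y1 == x1, so s and t are recomputed exactly
    x2 = (y2 - t) * np.exp(-s)
    return np.concatenate([y1, x2], axis=1)

x = rng.normal(size=(4, 4))
y, log_det = coupling_forward(x)
assert np.allclose(coupling_inverse(y), x)   # exact round-trip
```

The inverse works because the half that conditions the network is copied through untouched, so `s` and `t` can be recomputed from the output alone.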

```
z ~ N(0, I)
     |
     v
+------------------+
| Coupling Layer 1 |  split -> NN -> affine transform -> concat
+------------------+
     |
     v
+------------------+
| Coupling Layer 2 |  (alternating split pattern)
+------------------+
     |
     v
    ...
     |
     v
+------------------+
| Coupling Layer K |
+------------------+
     |
     v
x ~ p(x)
```

Successive layers alternate which half is transformed to ensure
all dimensions are eventually modified.
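The alternation can be sketched as a small stacked flow in NumPy (illustrative linear coupling nets, not this module's code): even layers condition on the first half, odd layers on the second, and inverting the layers in reverse order recovers the input exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
half, K = 2, 3
# One toy coupling net per layer (illustrative, not the module's networks).
Ws = rng.normal(size=(K, half, half))
Wt = rng.normal(size=(K, half, half))

def layer(x, k, inverse=False):
    # Even layers condition on the first half, odd layers on the second.
    a, b = (x[:, :half], x[:, half:]) if k % 2 == 0 else (x[:, half:], x[:, :half])
    s, t = np.tanh(a @ Ws[k]), a @ Wt[k]
    b = (b - t) * np.exp(-s) if inverse else b * np.exp(s) + t
    out = (a, b) if k % 2 == 0 else (b, a)
    return np.concatenate(out, axis=1)

x = rng.normal(size=(5, 2 * half))
z = x
for k in range(K):                   # forward: data -> latent
    z = layer(z, k)
x_rec = z
for k in reversed(range(K)):         # inverse: latent -> data
    x_rec = layer(x_rec, k, inverse=True)
```

Note the inverse must apply the layers in reverse order, since each layer's conditioning half is only available after the later layers have been undone.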

## Usage

    # Build a normalizing flow
    model = NormalizingFlow.build(input_size: 16, num_flows: 4, hidden_sizes: [128])

    # Forward pass (encoding: data -> latent)
    {z, log_det} = NormalizingFlow.forward(x, params, num_flows: 4, input_size: 16)

    # Inverse pass (generation: latent -> data)
    x = NormalizingFlow.inverse(z, params, num_flows: 4, input_size: 16)

    # Log-likelihood
    log_prob = NormalizingFlow.log_probability(x, params, num_flows: 4, input_size: 16)

# `build_opt`

```elixir
@type build_opt() ::
  {:activation, atom()}
  | {:hidden_sizes, [pos_integer()]}
  | {:input_size, pos_integer()}
  | {:num_flows, pos_integer()}
```

Options for `build/1`.

# `affine_coupling_layer`

```elixir
@spec affine_coupling_layer(Axon.t(), non_neg_integer(), keyword()) :: Axon.t()
```

Build a single RealNVP affine coupling layer as part of an Axon graph.

On even-indexed layers, x1 (first half) conditions the transform of x2.
On odd-indexed layers, x2 (second half) conditions the transform of x1.
This alternation ensures all dimensions are transformed across layers.

The coupling network outputs scale (s) and translation (t) parameters.
The scale is passed through tanh and rescaled so that `exp(s)` stays in
a numerically safe range.

## Parameters
  - `input` - Input Axon node `[batch, input_size]`
  - `flow_idx` - Layer index (determines split direction)
  - `opts` - Options including `:input_size`, `:half_size`, `:hidden_sizes`, `:activation`

## Returns
  An Axon node `[batch, input_size]` with the coupling transform applied.
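The tanh clamping of the scale can be sketched as follows (the `scale_cap` value is illustrative, not this module's constant): bounding `s` bounds the multiplier `exp(s)`, which keeps both the forward transform and the gradients stable.

```python
import numpy as np

scale_cap = 2.0                       # illustrative bound, not this module's constant
raw = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
s = scale_cap * np.tanh(raw)          # s is squashed into (-scale_cap, scale_cap)
factors = np.exp(s)                   # so the multiplier exp(s) cannot blow up
```

Without clamping, an unlucky initialization can make `exp(s)` overflow and the NLL loss diverge on the first few batches.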

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a normalizing flow model.

Constructs `num_flows` affine coupling layers, each containing a
small neural network that computes scale and translation parameters.
The model is an Axon graph where the input flows through all coupling
layers sequentially.

## Options
  - `:input_size` - Input dimension, must be even (required)
  - `:num_flows` - Number of coupling layers (default: 4)
  - `:hidden_sizes` - Hidden layer sizes for each coupling network (default: [256])
  - `:activation` - Activation function for coupling networks (default: :relu)

## Returns
  An Axon model: `[batch, input_size]` -> `[batch, input_size]`.

  The model transforms inputs through the flow. For density evaluation
  and generation, use the `forward/3`, `inverse/3`, and `log_probability/3`
  Nx functions directly.

# `inverse_coupling_layer`

```elixir
@spec inverse_coupling_layer(
  Nx.Tensor.t(),
  [{Nx.Tensor.t(), Nx.Tensor.t()}],
  Nx.Tensor.t(),
  Nx.Tensor.t(),
  pos_integer(),
  boolean()
) :: Nx.Tensor.t()
```

Inverse of a single affine coupling layer (for generation).

Given the output y of a coupling layer, recovers the input x:
    x2 = (y2 - t) * exp(-s)    where s, t = NN(y1)
    x1 = y1

## Parameters
  - `y` - Coupling layer output `[batch, input_size]`
  - `s_params` - Scale network parameters (list of `{weight, bias}` tuples)
  - `t_params` - Translation network parameters (list of `{weight, bias}` tuples)
  - `half_size` - Size of each split half
  - `even` - Whether this is an even-indexed layer (determines split order)

## Returns
  Recovered input `[batch, input_size]`.
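The inversion math with `{weight, bias}`-style parameter lists can be checked in NumPy (the two-layer `mlp` and its shapes are hypothetical, chosen only to mirror the list-of-pairs parameter format described above):

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(x, params):
    # params: list of (weight, bias) pairs; relu between layers, linear output.
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

half = 2
s_params = [(rng.normal(size=(half, 8)), np.zeros(8)),
            (rng.normal(size=(8, half)), np.zeros(half))]
t_params = [(rng.normal(size=(half, 8)), np.zeros(8)),
            (rng.normal(size=(8, half)), np.zeros(half))]

y = rng.normal(size=(3, 2 * half))
y1, y2 = y[:, :half], y[:, half:]         # even layer: y1 conditions the inverse
s, t = np.tanh(mlp(y1, s_params)), mlp(y1, t_params)
x2 = (y2 - t) * np.exp(-s)
x = np.concatenate([y1, x2], axis=1)
```

Re-applying the forward transform `x2 * exp(s) + t` to the recovered `x2` reproduces `y2` exactly, because `s` and `t` depend only on the untouched half.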

# `log_det_jacobian`

```elixir
@spec log_det_jacobian(Nx.Tensor.t()) :: Nx.Tensor.t()
```

Compute log-determinant of the Jacobian for an affine coupling layer.

For the affine coupling transform y2 = x2 * exp(s) + t, the Jacobian
is triangular and its log-determinant is simply sum(s).

## Parameters
  - `scale` - Scale parameters `s` from the coupling network `[batch, half_size]`

## Returns
  Log-determinant `[batch]` (summed over dimensions).
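A quick NumPy check of the identity: the Jacobian of the affine coupling is triangular with diagonal `exp(s)`, so its determinant is the product of `exp(s)` terms and the log-determinant reduces to a plain sum.

```python
import numpy as np

s = np.array([[0.1, -0.2],
              [0.3,  0.0]])                    # [batch=2, half_size=2]
log_det = s.sum(axis=1)                        # per-example log|det J|
# Same value computed the long way: log of the product of diagonal entries.
log_det_explicit = np.log(np.exp(s).prod(axis=1))
```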

# `log_probability`

```elixir
@spec log_probability(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t()
```

Compute the log-probability of data under the flow model, given the
latent codes and log-determinants produced by the forward pass.

Uses the change-of-variables formula:
    log p(x) = log p(z) + log|det(dz/dx)|

where z = f(x) is the forward transformation and p(z) = N(0, I).

## Parameters
  - `z` - Transformed data in latent space `[batch, input_size]`
  - `total_log_det` - Sum of log-determinants from forward pass `[batch]`

## Returns
  Log-probability `[batch]`.
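The change-of-variables formula can be sketched in NumPy, using the closed-form standard-normal log-density for the base distribution (a numeric illustration of the math, not this module's Nx code):

```python
import numpy as np

def log_probability(z, total_log_det):
    d = z.shape[1]
    # log N(z; 0, I) = -0.5 * (d * log(2*pi) + ||z||^2)
    log_pz = -0.5 * (d * np.log(2 * np.pi) + (z ** 2).sum(axis=1))
    return log_pz + total_log_det

# Sanity check: for the identity flow (z = x, log_det = 0) this is
# just the N(0, I) density; at the origin with d = 2 it equals -log(2*pi).
z = np.zeros((1, 2))
lp = log_probability(z, np.zeros(1))
```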

# `nll_loss`

```elixir
@spec nll_loss(Nx.Tensor.t(), Nx.Tensor.t()) :: Nx.Tensor.t()
```

Negative log-likelihood loss for normalizing flow training.

Minimizing NLL is equivalent to maximizing the log-probability
of the training data under the flow model.

## Parameters
  - `z` - Encoded latent vectors `[batch, input_size]`
  - `total_log_det` - Sum of log-determinants `[batch]`

## Returns
  NLL loss scalar (mean over batch).
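Combining the change-of-variables density with a batch mean gives the loss; a small NumPy sketch (again illustrating the math, not the module's Nx implementation):

```python
import numpy as np

z = np.array([[0.0, 0.0],
              [1.0, -1.0]])              # [batch, input_size]
total_log_det = np.array([0.5, -0.5])    # [batch]
d = z.shape[1]
log_prob = -0.5 * (d * np.log(2 * np.pi) + (z ** 2).sum(axis=1)) + total_log_det
nll = -log_prob.mean()                   # scalar loss, minimized during training
```

Gradient descent on this scalar simultaneously pulls the latents toward the base distribution (the `||z||^2` term) and penalizes volume collapse (the log-determinant term).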

# `total_log_det_jacobian`

```elixir
@spec total_log_det_jacobian([Nx.Tensor.t()]) :: Nx.Tensor.t()
```

Compute the log-determinant for a full forward pass through all coupling layers.

This is the sum of log-determinants from each individual coupling layer,
needed for exact log-likelihood computation.

## Parameters
  - `scales` - List of scale tensors from each coupling layer,
    each `[batch, half_size]`

## Returns
  Total log-determinant `[batch]`.
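Because the Jacobian of a composition is the product of per-layer Jacobians, the log-determinants simply add. In NumPy terms:

```python
import numpy as np

# One scale tensor per coupling layer, each [batch, half_size].
scales = [np.array([[0.1, 0.2]]),
          np.array([[-0.3, 0.4]])]
total = sum(s.sum(axis=1) for s in scales)   # [batch]
```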

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
