# `Edifice.Recurrent.XLSTMv2`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/recurrent/xlstm_v2.ex#L1)

xLSTM v2: Improved Extended Long Short-Term Memory.

Implements improvements from the xLSTM 7B scaling paper, building on the
original xLSTM architecture with enhanced matrix memory and normalization.

## Key Improvements over xLSTM v1

1. **Block-diagonal matrix memory**: Reduces mLSTM parameters by partitioning
   the memory matrix into independent blocks. Each block operates on a subset
   of dimensions, reducing per-head memory from O(d^2) to O(d^2/B) where B
   is the number of blocks.

2. **Improved normalizer with learnable bias**: The normalizer
   `n_t = f_t * n_{t-1} + i_t` gains a learnable bias term for better
   gradient flow: `h_t = o_t * (c_t / max(|n_t + bias|, 1))`

3. **Pre-norm + post-norm hybrid**: Combines pre-LayerNorm for stable training
   with post-LayerNorm for better representation quality.
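Improvements 1 and 2 can be illustrated with a scalar sketch (illustration only; the real implementation operates on per-head tensors inside the mLSTM cell, and `NormalizerSketch` is a hypothetical name):

```elixir
defmodule NormalizerSketch do
  @doc """
  One step of the v2 hidden-state readout:
  h_t = o_t * (c_t / max(|n_t + bias|, 1))
  Bounding the denominator below by 1 keeps gradients stable
  when the normalizer state is near zero.
  """
  def readout(o_t, c_t, n_t, bias) do
    denom = max(abs(n_t + bias), 1.0)
    o_t * (c_t / denom)
  end

  @doc "Normalizer state update: n_t = f_t * n_{t-1} + i_t"
  def update_normalizer(f_t, n_prev, i_t), do: f_t * n_prev + i_t

  @doc """
  Per-head memory entries under block-diagonal partitioning:
  B blocks of size (d/B)^2 give d^2 / B entries total.
  """
  def block_memory(d, b), do: div(d * d, b)
end
```

For example, with `d = 64` and `B = 2` blocks, per-head memory drops from 4096 entries to 2048.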

## Architecture

```
Input [batch, seq_len, embed_dim]
      |
      v
+-----------------------------------+
|          xLSTM v2 Block           |
|  PreNorm -> mLSTM v2 -> PostNorm  |
|            + Residual             |
|  PreNorm -> FFN -> PostNorm       |
|            + Residual             |
+-----------------------------------+
      | (repeat for num_layers)
      v
Output [batch, hidden_size]
```
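The block structure above can be sketched with Axon combinators (a sketch only; `mlstm_v2/2` and `ffn/2` are hypothetical stand-ins for the module's internal layer builders):

```elixir
# One xLSTM v2 block: two pre-norm -> sublayer -> post-norm
# stages, each wrapped in a residual connection.
def block(input, opts) do
  x =
    input
    |> Axon.layer_norm()   # pre-norm
    |> mlstm_v2(opts)      # mLSTM v2 sublayer (hypothetical helper)
    |> Axon.layer_norm()   # post-norm

  x = Axon.add(input, x)   # residual around the mLSTM stage

  y =
    x
    |> Axon.layer_norm()   # pre-norm
    |> ffn(opts)           # feedforward sublayer (hypothetical helper)
    |> Axon.layer_norm()   # post-norm

  Axon.add(x, y)           # residual around the FFN stage
end
```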

## Usage

    model = XLSTMv2.build(
      embed_dim: 287,
      hidden_size: 256,
      num_layers: 4,
      num_heads: 4,
      num_blocks: 2
    )
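Once built, the model can be initialized and run with Axon's standard build/predict workflow (a sketch; the input shape follows the architecture diagram above, and `Nx.template/2` avoids allocating a real tensor for initialization):

```elixir
model =
  XLSTMv2.build(
    embed_dim: 287,
    hidden_size: 256,
    num_layers: 4
  )

# Compile the model into init and predict functions.
{init_fn, predict_fn} = Axon.build(model)

# Initialize parameters from a template: [batch, seq_len, embed_dim].
params = init_fn.(Nx.template({1, 60, 287}, :f32), %{})

# Forward pass; the output shape is [batch, hidden_size].
input = Nx.broadcast(0.0, {1, 60, 287})
output = predict_fn.(params, input)
```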

## References

- Beck et al., "xLSTM: Extended Long Short-Term Memory" (NeurIPS 2024)
- Beck et al., "xLSTM 7B: A Recurrent LLM for Fast and Efficient Inference" (2025)

# `build_opt`

```elixir
@type build_opt() ::
  {:embed_dim, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:num_heads, pos_integer()}
  | {:head_dim, pos_integer()}
  | {:num_blocks, pos_integer()}
  | {:expand_factor, pos_integer()}
  | {:dropout, float()}
  | {:window_size, pos_integer()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build an xLSTM v2 model for sequence processing.

## Options
  - `:embed_dim` - Size of input embedding per frame (required)
  - `:hidden_size` - Internal hidden dimension (default: 256)
  - `:num_layers` - Number of blocks (default: 4)
  - `:num_heads` - Number of heads for mLSTM (default: 4)
  - `:head_dim` - Dimension per head (default: 64)
  - `:num_blocks` - Number of block-diagonal memory blocks (default: 2)
  - `:expand_factor` - FFN expansion factor (default: 2)
  - `:dropout` - Dropout rate (default: 0.0)
  - `:window_size` - Expected sequence length (default: 60)

## Returns
  An Axon model that processes sequences and outputs the last hidden state.

# `default_dropout`

```elixir
@spec default_dropout() :: float()
```

Default dropout rate.

# `default_expand_factor`

```elixir
@spec default_expand_factor() :: pos_integer()
```

Default feedforward expansion factor.

# `default_head_dim`

```elixir
@spec default_head_dim() :: pos_integer()
```

Default head dimension for mLSTM.

# `default_hidden_size`

```elixir
@spec default_hidden_size() :: pos_integer()
```

Default hidden dimension.

# `default_num_blocks`

```elixir
@spec default_num_blocks() :: pos_integer()
```

Default number of block-diagonal memory blocks.

# `default_num_heads`

```elixir
@spec default_num_heads() :: pos_integer()
```

Default number of heads.

# `default_num_layers`

```elixir
@spec default_num_layers() :: pos_integer()
```

Default number of layers.

# `output_size`

```elixir
@spec output_size(keyword()) :: non_neg_integer()
```

Get the output size of an xLSTM v2 model.
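A usage sketch, assuming `output_size/1` reports the final hidden dimension (the model output is `[batch, hidden_size]`, so this should follow the configured `:hidden_size`):

```elixir
# With an explicit override, the output size should track
# :hidden_size (an assumption; not confirmed by the source).
XLSTMv2.output_size(hidden_size: 512)
```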

# `recommended_defaults`

```elixir
@spec recommended_defaults() :: keyword()
```

Get recommended defaults.
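These defaults can be merged with task-specific overrides before calling `build/1` (a sketch; assumes `recommended_defaults/0` returns a keyword list compatible with `build_opt/0`):

```elixir
# Start from the recommended defaults and override as needed.
# :embed_dim is still required, since it depends on the input features.
opts = Keyword.merge(XLSTMv2.recommended_defaults(), embed_dim: 287)
model = XLSTMv2.build(opts)
```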

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
