# `Edifice.Recurrent.SLSTM`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/recurrent/slstm.ex#L1)

sLSTM: Scalar LSTM with Exponential Gating.

Standalone extraction of the sLSTM variant from xLSTM. The sLSTM extends
traditional LSTM with exponential gating and log-domain stabilization,
enabling stable training with very large gate values.

## Key Innovation: Exponential Gating with Log-Domain Stabilization

Standard LSTM gates are bounded to [0, 1] by the sigmoid. sLSTM uses
exponential gates that can take any positive value, combined with a
stabilization trick to prevent floating-point overflow:

```
Standard LSTM:  i_t = sigmoid(...)        ∈ [0, 1]
sLSTM:          i_t = exp(log_i_t - m_t)  ∈ [0, ∞)
```

The stabilizer `m_t = max(log_f_t + m_{t-1}, log_i_t)` keeps values
numerically tractable while preserving the relative magnitudes.
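The effect of the stabilizer can be seen with a minimal numerical sketch (plain Python scalars, not the Elixir implementation; the function name `stabilized_gates` is illustrative). Because `m_t` is the max of the two log-domain terms, both exponents are shifted to be at most zero, so the gate values stay in `(0, 1]` even when the raw pre-activations are huge:

```python
import math

def stabilized_gates(log_i, log_f, m_prev):
    """One step of log-domain stabilization, following the equations above."""
    m = max(log_f + m_prev, log_i)    # m_t = max(log_f_t + m_{t-1}, log_i_t)
    i = math.exp(log_i - m)           # i_t' = exp(log_i_t - m_t), in (0, 1]
    f = math.exp(log_f + m_prev - m)  # f_t' = exp(log_f_t + m_{t-1} - m_t), in (0, 1]
    return i, f, m

# A naive exp(800.0) overflows float64; the stabilized gates remain finite.
i, f, m = stabilized_gates(log_i=800.0, log_f=750.0, m_prev=0.0)
# m_t = 800.0 (input branch wins the max), i_t' = exp(0) = 1.0, f_t' = exp(-50)
```

The relative magnitude of the two gates is preserved: only their common scale `exp(m_t)` is factored out, and it cancels in the normalized hidden state `h_t`.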

## Equations

Gate pre-activations (with recurrent connections):
- `log_i_t = W_i x_t + R_i h_{t-1} + b_i`
- `log_f_t = W_f x_t + R_f h_{t-1} + b_f`
- `z_t = tanh(W_z x_t + R_z h_{t-1} + b_z)`
- `o_t = sigmoid(W_o x_t + R_o h_{t-1} + b_o)`

Log-domain stabilization:
- `m_t = max(log_f_t + m_{t-1}, log_i_t)`
- `i_t' = exp(log_i_t - m_t)`
- `f_t' = exp(log_f_t + m_{t-1} - m_t)`

State updates:
- `c_t = f_t' * c_{t-1} + i_t' * z_t`
- `n_t = f_t' * n_{t-1} + i_t'`
- `h_t = o_t * (c_t / max(|n_t|, 1))`
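Putting the three groups of equations together, one full recurrence step can be sketched as follows (plain Python with scalar states and weights for readability; the real layer uses matrices `W_*`, `R_*` over the hidden dimension, and the name `slstm_step` is illustrative, not the module's API):

```python
import math

def slstm_step(x, h_prev, c_prev, n_prev, m_prev, w, r, b):
    """One sLSTM step; w, r, b map gate names "i", "f", "z", "o" to scalars."""
    # Gate pre-activations with recurrent connections
    log_i = w["i"] * x + r["i"] * h_prev + b["i"]
    log_f = w["f"] * x + r["f"] * h_prev + b["f"]
    z = math.tanh(w["z"] * x + r["z"] * h_prev + b["z"])
    o = 1.0 / (1.0 + math.exp(-(w["o"] * x + r["o"] * h_prev + b["o"])))

    # Log-domain stabilization
    m = max(log_f + m_prev, log_i)
    i = math.exp(log_i - m)
    f = math.exp(log_f + m_prev - m)

    # State updates: cell, normalizer, and normalized hidden output
    c = f * c_prev + i * z
    n = f * n_prev + i
    h = o * (c / max(abs(n), 1.0))
    return h, c, n, m
```

Note the normalizer state `n_t` accumulates the same gate weights as `c_t`, so dividing by `max(|n_t|, 1)` turns the cell state into a weighted average of past inputs before the output gate is applied.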

## Architecture

```
Input [batch, seq_len, embed_dim]
      |
      v
+--------------------------------+
|          sLSTM Block           |
|  LayerNorm -> sLSTM recurrence |
|  LayerNorm -> Feedforward      |
|  Residual connections          |
+--------------------------------+
      | (repeat for num_layers)
      v
Output [batch, hidden_size]
```

## Usage

    model = SLSTM.build(
      embed_dim: 287,
      hidden_size: 256,
      num_layers: 4
    )

## References

- Beck et al., "xLSTM: Extended Long Short-Term Memory" (NeurIPS 2024)
- https://arxiv.org/abs/2405.04517

# `build_opt`

```elixir
@type build_opt() ::
  {:embed_dim, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:expand_factor, pos_integer()}
  | {:dropout, float()}
  | {:window_size, pos_integer()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a standalone sLSTM model for sequence processing.

## Options
  - `:embed_dim` - Size of input embedding per frame (required)
  - `:hidden_size` - Internal hidden dimension (default: 256)
  - `:num_layers` - Number of sLSTM blocks (default: 4)
  - `:expand_factor` - Feedforward expansion factor (default: 2)
  - `:dropout` - Dropout rate (default: 0.0)
  - `:window_size` - Expected sequence length (default: 60)

## Returns
  An Axon model that processes sequences and outputs the last hidden state.

# `build_slstm_layer`

```elixir
@spec build_slstm_layer(Axon.t(), pos_integer(), String.t()) :: Axon.t()
```

Build a standalone sLSTM layer for use in custom architectures.

Returns an Axon node that applies sLSTM recurrence to the input sequence.

# `default_dropout`

```elixir
@spec default_dropout() :: float()
```

Default dropout rate.

# `default_expand_factor`

```elixir
@spec default_expand_factor() :: pos_integer()
```

Default feedforward expansion factor.

# `default_hidden_size`

```elixir
@spec default_hidden_size() :: pos_integer()
```

Default hidden dimension.

# `default_num_layers`

```elixir
@spec default_num_layers() :: pos_integer()
```

Default number of layers.

# `output_size`

```elixir
@spec output_size(keyword()) :: non_neg_integer()
```

Get the output size of an sLSTM model.

# `recommended_defaults`

```elixir
@spec recommended_defaults() :: keyword()
```

Get the recommended default options as a keyword list.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
