# `Edifice.SSM.Hyena`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/ssm/hyena.ex#L1)

Hyena: Sub-quadratic attention alternative via long convolutions and gating.

Implements the Hyena Hierarchy from "Hyena Hierarchy: Towards Larger
Convolutional Language Models" (Poli et al., ICML 2023). Hyena replaces
attention with a hierarchy of long convolutions and element-wise gating,
achieving sub-quadratic complexity in sequence length.

## Key Innovation: Implicit Long Convolution + Gating

Instead of attention's O(L^2) pairwise interactions, Hyena uses:
1. A learned implicit filter (small MLP) that generates long convolution kernels
2. Element-wise gating for non-linearity
3. Multiple "orders" of this operation for expressivity

```
Order 2 Hyena:
  v, x1, x2 = linear_projections(input)  # 3 projections
  y = v
  y = long_conv(y, filter_1) * x1        # First order
  y = long_conv(y, filter_2) * x2        # Second order
  output = linear(y)
```
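The recurrence above can be checked numerically. Below is a minimal, framework-free sketch of the order-2 operator on a single 1-D sequence; the names (`causal_conv`, `hyena_order2`) and the explicit filters are illustrative, not part of the Edifice API — in the real model the filters are produced by the implicit filter MLP.

```python
def causal_conv(u, h):
    """Causal long convolution: y[t] = sum_{s<=t} h[s] * u[t-s]."""
    return [sum(h[s] * u[t - s] for s in range(min(t + 1, len(h))))
            for t in range(len(u))]

def hyena_order2(v, x1, x2, h1, h2):
    """Two rounds of long convolution, each followed by element-wise gating."""
    y = causal_conv(v, h1)
    y = [a * b for a, b in zip(y, x1)]   # first-order gate
    y = causal_conv(y, h2)
    y = [a * b for a, b in zip(y, x2)]   # second-order gate
    return y

v  = [1.0, 2.0, 3.0, 4.0]
x1 = [1.0, 1.0, 1.0, 1.0]   # identity gates keep the example easy to check
x2 = [1.0, 1.0, 1.0, 1.0]
h1 = [1.0]                   # identity filter
h2 = [0.0, 1.0]              # one-step delay filter
print(hyena_order2(v, x1, x2, h1, h2))  # → [0.0, 1.0, 2.0, 3.0]
```

With identity gates and an identity first filter, the output is just the input delayed by one step, which makes the data flow of the two conv-then-gate stages easy to trace.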

## Architecture

```
Input [batch, seq_len, embed_dim]
      |
      v
+-----------------------+
| Input Projection      |
+-----------------------+
      |
      v
+-----------------------+
| Hyena Block x N       |
|  ShortConv(input)     |
|  Split: v, x1, x2     |
|  y = v                |
|  y = LongConv(y)*x1   |  <- Implicit filter via MLP
|  y = LongConv(y)*x2   |
|  OutProj + Residual   |
|  FFN                  |
+-----------------------+
      |
      v
[batch, hidden_size]    (last timestep)
```

## Complexity

| Operation | Attention | Hyena |
|-----------|-----------|-------|
| Training  | O(L^2)    | O(L log L) via FFT |
| Inference | O(L^2)    | O(L) with recurrence |
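The O(L log L) training cost comes from evaluating the long convolution in the frequency domain rather than directly. A self-contained sketch of this trick, using a textbook radix-2 FFT (names are illustrative; real implementations call an optimized FFT library):

```python
import cmath

def fft(a):
    # Radix-2 Cooley-Tukey FFT; len(a) must be a power of two.
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out

def ifft(a):
    n = len(a)
    return [x.conjugate() / n for x in fft([x.conjugate() for x in a])]

def fft_causal_conv(u, h):
    # Zero-pad past 2L to avoid circular wrap-around, multiply pointwise,
    # then keep the first L outputs (the causal part).
    L = len(u)
    n = 1
    while n < 2 * L:
        n *= 2
    U = fft(u + [0.0] * (n - L))
    H = fft(h + [0.0] * (n - len(h)))
    y = ifft([a * b for a, b in zip(U, H)])
    return [y[t].real for t in range(L)]

u = [1.0, 2.0, 3.0, 4.0]
h = [0.5, 0.25, 0.125, 0.0625]
direct = [sum(h[s] * u[t - s] for s in range(t + 1)) for t in range(len(u))]
fast = fft_causal_conv(u, h)
assert all(abs(a - b) < 1e-9 for a, b in zip(direct, fast))
```

The direct sum costs O(L^2) for a length-L filter, while the padded FFT path costs O(L log L); both produce the same causal output.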

## Usage

    model = Hyena.build(
      embed_dim: 287,
      hidden_size: 256,
      order: 2,
      filter_size: 64,
      num_layers: 4
    )

## Reference

- Paper: "Hyena Hierarchy: Towards Larger Convolutional Language Models"
- arXiv: https://arxiv.org/abs/2302.10866

# `build_opt`

```elixir
@type build_opt() ::
  {:dropout, float()}
  | {:embed_dim, pos_integer()}
  | {:filter_size, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:order, pos_integer()}
  | {:seq_len, pos_integer()}
  | {:window_size, pos_integer()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a Hyena model for sequence processing.

## Options

  - `:embed_dim` - Size of input embedding per frame (required)
  - `:hidden_size` - Internal hidden dimension (default: 256)
  - `:order` - Number of gating levels (default: 2)
  - `:filter_size` - Implicit filter MLP hidden size (default: 64)
  - `:num_layers` - Number of Hyena blocks (default: 4)
  - `:dropout` - Dropout rate (default: 0.1)
  - `:window_size` - Expected sequence length (default: 60)

## Returns

  An Axon model that outputs [batch, hidden_size] from the last position.

# `build_hyena_block`

```elixir
@spec build_hyena_block(
  Axon.t(),
  keyword()
) :: Axon.t()
```

Build a single Hyena block with implicit long convolution and gating.

# `output_size`

```elixir
@spec output_size(keyword()) :: non_neg_integer()
```

Get the output size of a Hyena model.

# `param_count`

```elixir
@spec param_count(keyword()) :: non_neg_integer()
```

Calculate approximate parameter count for a Hyena model.
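For a rough sense of where the parameters live, here is a back-of-the-envelope estimate in Python. The assumed block layout — one (order+1)-way input projection, one small filter MLP per order (scalar position → `filter_size` → `hidden_size`), an output projection, and a 4x FFN — is purely illustrative; the module's `param_count/1` is authoritative and may count differently.

```python
def approx_hyena_params(embed_dim, hidden_size, order=2,
                        filter_size=64, num_layers=4):
    # Illustrative estimate only; layer shapes are assumptions, not the
    # Edifice implementation.
    d = hidden_size
    embed = embed_dim * d + d                          # input projection
    in_proj = (order + 1) * (d * d + d)                # v, x1..x_order
    filt = order * (filter_size + filter_size          # pos -> filter_size
                    + filter_size * d + d)             # filter_size -> d
    out_proj = d * d + d
    ffn = d * 4 * d + 4 * d + 4 * d * d + d            # d -> 4d -> d
    return embed + num_layers * (in_proj + filt + out_proj + ffn)

print(approx_hyena_params(287, 256))
```

Even this crude estimate shows the FFN and projections dominate: the implicit filter MLPs add only O(filter_size * hidden_size) parameters per order, independent of sequence length — the point of generating kernels implicitly rather than storing length-L filters.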

# `recommended_defaults`

```elixir
@spec recommended_defaults() :: keyword()
```

Get recommended defaults.

---

*Consult [api-reference.md](api-reference.md) for complete listing*
