Edifice.SSM.StripedHyena (Edifice v0.2.0)

Striped Hyena: interleaved Hyena long convolution and gated convolution layers.

Implements the Striped Hyena architecture from "StripedHyena: Moving Beyond Transformers with Hybrid Signal Processing Models" (Together AI, 2023). Striped Hyena alternates between Hyena long convolution blocks (for global context) and gated depthwise convolution blocks (for local patterns).

Key Innovation: Striped Block Pattern

Instead of using only Hyena blocks, Striped Hyena interleaves two block types:

  • Even layers (counting from 0, so Layer 1 in the diagram below): Hyena long convolution blocks (sub-quadratic global mixing)
  • Odd layers (Layer 2, Layer 4, ...): Gated depthwise convolution blocks (efficient local mixing)

Gated convolution blocks are cheaper than Hyena blocks, so the striped pattern aims to reduce compute relative to a pure Hyena stack of the same depth while keeping Hyena layers for global context.
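The alternation can be sketched in a few lines of plain Elixir. This is only an illustration of the pattern, not the library's code: the module StripeSketch and its functions are hypothetical, and it assumes layers are indexed from 0 so that the first layer is a Hyena block, as in the diagram below.

```elixir
defmodule StripeSketch do
  # Hypothetical helper: block type for a zero-based layer index.
  # Even indices map to Hyena blocks, odd indices to gated conv blocks.
  def block_type(idx) when is_integer(idx) and idx >= 0 do
    if rem(idx, 2) == 0, do: :hyena, else: :gated_conv
  end

  # Block types for a stack of `num_layers` layers, in order.
  def schedule(num_layers) when is_integer(num_layers) and num_layers > 0 do
    Enum.map(0..(num_layers - 1), &block_type/1)
  end
end

IO.inspect(StripeSketch.schedule(4))
# prints [:hyena, :gated_conv, :hyena, :gated_conv]
```

With the default num_layers: 4, this gives two blocks of each type.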

Architecture

Input [batch, seq_len, embed_dim]
      |
      v
+-----------------------+
| Input Projection      |
+-----------------------+
      |
      v
+-----------------------+
| Layer 1 (Hyena)       |
|  LongConv + Gating    |
+-----------------------+
      |
+-----------------------+
| Layer 2 (GatedConv)   |
|  DepthwiseConv + Gate |
+-----------------------+
      |
+-----------------------+
| Layer 3 (Hyena)       |
|  ...repeating pattern |
+-----------------------+
      |
      v
[batch, hidden_size]    (last timestep)
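To make the shapes in the diagram concrete, the sketch below traces them stage by stage. It is a hedged illustration rather than the library's behaviour: ShapeSketch is a hypothetical module, and it assumes that the input projection maps embed_dim to hidden_size and that every layer preserves [batch, seq_len, hidden_size] (which the residual connections suggest).

```elixir
defmodule ShapeSketch do
  # Hypothetical shape trace for the architecture diagram.
  # Assumption: each layer keeps the [batch, seq_len, hidden_size] shape.
  def trace(batch, seq_len, embed_dim, hidden_size, num_layers) do
    input = [{"input", {batch, seq_len, embed_dim}}]
    proj = [{"input_projection", {batch, seq_len, hidden_size}}]

    layers =
      for i <- 1..num_layers do
        {"layer_#{i}", {batch, seq_len, hidden_size}}
      end

    # Only the last timestep is kept, which drops the sequence axis.
    output = [{"last_timestep", {batch, hidden_size}}]

    input ++ proj ++ layers ++ output
  end
end

# Batch of 8, sequence length 60, embed_dim 287, hidden_size 256, 4 layers.
ShapeSketch.trace(8, 60, 287, 256, 4)
|> Enum.each(fn {stage, shape} -> IO.puts("#{stage}: #{inspect(shape)}") end)
```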

Gated Conv Block

norm(x) -> dense(2*H) -> split(x_val, x_gate)
  -> DepthwiseConv(x_val) * sigmoid(x_gate)
  -> dense(H) -> residual
  -> FFN -> residual
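The core step of this block, DepthwiseConv(x_val) * sigmoid(x_gate), can be sketched for a single channel in plain Elixir. This is a minimal illustration under stated assumptions, not the library's implementation: GatedConvSketch is a hypothetical module, the convolution is assumed to be causal with left zero padding (the source does not say which padding is used), and the norm, dense, residual and FFN steps are left out.

```elixir
defmodule GatedConvSketch do
  # Logistic sigmoid.
  def sigmoid(x), do: 1.0 / (1.0 + :math.exp(-x))

  # Causal 1-D convolution over one channel (assumed padding scheme):
  # y[t] = sum over j of kernel[j] * x[t - j], with x[t] = 0.0 for t < 0.
  def causal_conv(xs, kernel) do
    k = length(kernel)
    padded = List.duplicate(0.0, k - 1) ++ xs

    padded
    |> Enum.chunk_every(k, 1, :discard)
    |> Enum.map(fn window ->
      window
      |> Enum.reverse()
      |> Enum.zip(kernel)
      |> Enum.map(fn {x, w} -> x * w end)
      |> Enum.sum()
    end)
  end

  # Convolve the value branch, then scale each position by sigmoid(gate).
  def gated_conv(vals, gates, kernel) do
    vals
    |> causal_conv(kernel)
    |> Enum.zip(gates)
    |> Enum.map(fn {v, g} -> v * sigmoid(g) end)
  end
end

# A gate of 0.0 gives sigmoid = 0.5, so each convolved value is halved.
IO.inspect(GatedConvSketch.gated_conv([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [1.0, 1.0]))
# prints [0.5, 1.5, 2.5]
```

In a depthwise convolution each channel has its own kernel and channels are not mixed, so the full block would apply this per channel and leave cross-channel mixing to the dense layers.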

Usage

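alias Edifice.SSM.StripedHyena

# :embed_dim is required; options left out fall back to their defaults.
model = StripedHyena.build(
  embed_dim: 287,
  hidden_size: 256,
  order: 2,
  num_layers: 4
)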

Summary

Types

Options for build/1.

Functions

Build a Striped Hyena model for sequence processing.

Build a gated depthwise convolution block.

Get the output size of a Striped Hyena model.

Get recommended defaults.

Types

build_opt()

@type build_opt() ::
  {:conv_kernel_size, pos_integer()}
  | {:dropout, float()}
  | {:embed_dim, pos_integer()}
  | {:filter_size, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:order, pos_integer()}
  | {:seq_len, pos_integer()}
  | {:window_size, pos_integer()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a Striped Hyena model for sequence processing.

Options

  • :embed_dim - Size of input embedding per frame (required)
  • :hidden_size - Internal hidden dimension (default: 256)
  • :order - Hyena gating order (default: 2)
  • :filter_size - Implicit filter MLP hidden size (default: 64)
  • :conv_kernel_size - Kernel size for gated conv blocks (default: 7)
  • :num_layers - Total number of layers (default: 4)
  • :dropout - Dropout rate (default: 0.1)
  • :window_size - Expected sequence length (default: 60)

Returns

An Axon model that outputs [batch, hidden_size] from the last position.
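One way to picture how the documented defaults combine with caller options is the sketch below. It is an assumption about the pattern, not the actual body of build/1: OptsSketch is a hypothetical module, and the default values are copied from the option list above.

```elixir
defmodule OptsSketch do
  # Defaults copied from the documented option list.
  @defaults [
    hidden_size: 256,
    order: 2,
    filter_size: 64,
    conv_kernel_size: 7,
    num_layers: 4,
    dropout: 0.1,
    window_size: 60
  ]

  # Hypothetical resolver: caller options override the defaults.
  def resolve(opts) do
    # :embed_dim is documented as required, so fail early if it is missing.
    _ = Keyword.fetch!(opts, :embed_dim)
    Keyword.merge(@defaults, opts)
  end
end

opts = OptsSketch.resolve(embed_dim: 287, num_layers: 6)
IO.inspect(Keyword.get(opts, :num_layers))
# prints 6
IO.inspect(Keyword.get(opts, :hidden_size))
# prints 256
```

Calling OptsSketch.resolve([]) raises a KeyError because :embed_dim is absent.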

build_gated_conv_block(input, opts)

@spec build_gated_conv_block(
  Axon.t(),
  keyword()
) :: Axon.t()

Build a gated depthwise convolution block.

Architecture: norm -> dense(2*H) -> split -> DWConv(val) * sigmoid(gate) -> dense(H) -> residual -> FFN -> residual

output_size(opts \\ [])

@spec output_size(keyword()) :: non_neg_integer()

Get the output size of a Striped Hyena model.