Striped Hyena: interleaved Hyena long convolution and gated convolution layers.
Implements the Striped Hyena architecture from "StripedHyena: Moving Beyond Transformers with Hybrid Signal Processing Models" (Together AI, 2023). Striped Hyena alternates between Hyena long convolution blocks (for global context) and gated depthwise convolution blocks (for local patterns).
Key Innovation: Striped Block Pattern
Instead of using only Hyena blocks, Striped Hyena interleaves two block types:
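- Even layers (counting from 0; Layers 1, 3, ... in the diagram below): Hyena long convolution blocks (sub-quadratic global mixing)
- Odd layers (counting from 0; Layers 2, 4, ... in the diagram below): Gated depthwise convolution blocks (efficient local mixing)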
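The striped pattern is meant to cost less than a stack of Hyena blocks alone, because every other layer uses a short, fixed-size convolution kernel, while the remaining Hyena layers keep the global mixing of a pure Hyena model.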
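The alternation described above can be sketched in plain Elixir. This is a minimal illustration, not the module's actual code: `StripedPatternSketch` and `layer_types/1` are hypothetical names, and the sketch assumes layers are indexed from 0 so that the first layer is a Hyena block, matching the diagram below.

```elixir
defmodule StripedPatternSketch do
  # Hypothetical helper, not part of the documented API.
  # Assumes 0-based layer indices: even index -> Hyena block,
  # odd index -> gated depthwise convolution block.
  def layer_types(num_layers) when is_integer(num_layers) and num_layers > 0 do
    for idx <- 0..(num_layers - 1) do
      if rem(idx, 2) == 0, do: :hyena, else: :gated_conv
    end
  end
end

IO.inspect(StripedPatternSketch.layer_types(4))
# → [:hyena, :gated_conv, :hyena, :gated_conv]
```

With the default of four layers, the stack under this assumption holds two blocks of each type.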
Architecture
Input [batch, seq_len, embed_dim]
|
v
+-----------------------+
| Input Projection |
+-----------------------+
|
v
+-----------------------+
| Layer 1 (Hyena) |
| LongConv + Gating |
+-----------------------+
|
+-----------------------+
| Layer 2 (GatedConv) |
| DepthwiseConv + Gate |
+-----------------------+
|
+-----------------------+
| Layer 3 (Hyena) |
| ...repeating pattern |
+-----------------------+
|
v
[batch, hidden_size] (last timestep)
Gated Conv Block
norm(x) -> dense(2*H) -> split(x_val, x_gate)
-> DepthwiseConv(x_val) * sigmoid(x_gate)
-> dense(H) -> residual
-> FFN -> residual
Usage
model = StripedHyena.build(
  embed_dim: 287,
  hidden_size: 256,
  order: 2,
  num_layers: 4
)
Reference
- Paper: "StripedHyena: Moving Beyond Transformers with Hybrid Signal Processing Models"
- Blog: https://www.together.ai/blog/stripedhyena-7b
Types
@type build_opt() :: {:conv_kernel_size, pos_integer()} | {:dropout, float()} | {:embed_dim, pos_integer()} | {:filter_size, pos_integer()} | {:hidden_size, pos_integer()} | {:num_layers, pos_integer()} | {:order, pos_integer()} | {:seq_len, pos_integer()} | {:window_size, pos_integer()}
Options for build/1.
Functions
Build a Striped Hyena model for sequence processing.
Options
- :embed_dim - Size of input embedding per frame (required)
- :hidden_size - Internal hidden dimension (default: 256)
- :order - Hyena gating order (default: 2)
- :filter_size - Implicit filter MLP hidden size (default: 64)
- :conv_kernel_size - Kernel size for gated conv blocks (default: 7)
- :num_layers - Total number of layers (default: 4)
- :dropout - Dropout rate (default: 0.1)
- :window_size - Expected sequence length (default: 60)
Returns
An Axon model that outputs [batch, hidden_size] from the last position.
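To illustrate how the documented defaults and the required :embed_dim option fit together, here is a small sketch in plain Elixir. `BuildOptsSketch` and `resolve/1` are hypothetical names used only for illustration; how build/1 handles its options internally is not shown in this documentation, so treat the merging strategy as an assumption.

```elixir
defmodule BuildOptsSketch do
  # Defaults copied from the Options list; :embed_dim has no default.
  @defaults [
    hidden_size: 256,
    order: 2,
    filter_size: 64,
    conv_kernel_size: 7,
    num_layers: 4,
    dropout: 0.1,
    window_size: 60
  ]

  # Hypothetical helper: checks the required key, then lets
  # caller-supplied options override the defaults.
  def resolve(opts) do
    # Raises KeyError when the required :embed_dim is missing.
    _ = Keyword.fetch!(opts, :embed_dim)
    Keyword.merge(@defaults, opts)
  end
end

opts = BuildOptsSketch.resolve(embed_dim: 287, num_layers: 6)
IO.inspect(Keyword.get(opts, :num_layers))  # → 6
IO.inspect(Keyword.get(opts, :hidden_size)) # → 256
```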
Build a gated depthwise convolution block.
Architecture: norm -> dense(2H) -> split -> DWConv(val) * sigmoid(gate) -> dense -> residual -> FFN -> residual
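The gating step `DWConv(val) * sigmoid(gate)` can be sketched for a single channel using plain lists. This is an illustration of the arithmetic only, under stated assumptions: `GatedConvSketch` and its functions are hypothetical, the convolution is taken to be causal (left zero-padded), and the kernel weights are made up. The norm, dense projections, residual connections, and FFN are omitted.

```elixir
defmodule GatedConvSketch do
  # Logistic sigmoid on a plain float.
  def sigmoid(x), do: 1.0 / (1.0 + :math.exp(-x))

  # 1-D convolution of one channel with one kernel (the depthwise case:
  # each channel has its own kernel and channels do not mix).
  # Assumption: causal padding, so the input is left-padded with zeros
  # and the last kernel tap multiplies the current timestep.
  def causal_conv(values, kernel) do
    k = length(kernel)
    padded = List.duplicate(0.0, k - 1) ++ values

    padded
    |> Enum.chunk_every(k, 1, :discard)
    |> Enum.map(fn window ->
      window
      |> Enum.zip(kernel)
      |> Enum.map(fn {v, w} -> v * w end)
      |> Enum.sum()
    end)
  end

  # Elementwise product of the convolved values and the squashed gate.
  def gate(values, gates, kernel) do
    values
    |> causal_conv(kernel)
    |> Enum.zip(gates)
    |> Enum.map(fn {c, g} -> c * sigmoid(g) end)
  end
end

IO.inspect(GatedConvSketch.gate([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [0.5, 0.5]))
# → [0.25, 0.75, 1.25]
```

A gate input of 0.0 yields sigmoid 0.5, so each convolved value is halved; large positive gate inputs pass the convolved value through almost unchanged, and large negative ones suppress it.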
@spec output_size(keyword()) :: non_neg_integer()
Get the output size of a Striped Hyena model.
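Since the model is documented as returning [batch, hidden_size], the output size presumably equals the :hidden_size option. The sketch below shows that reading. It is an assumption about the behaviour, not the module's source, and `OutputSizeSketch` is a hypothetical name.

```elixir
defmodule OutputSizeSketch do
  # Assumption: output size is the :hidden_size option,
  # falling back to the documented default of 256.
  def output_size(opts \\ []), do: Keyword.get(opts, :hidden_size, 256)
end

IO.inspect(OutputSizeSketch.output_size(hidden_size: 128)) # → 128
IO.inspect(OutputSizeSketch.output_size())                 # → 256
```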
@spec recommended_defaults() :: keyword()
Get recommended defaults.