Edifice.SSM.S5 (Edifice v0.2.0)

S5: Simplified State Space Sequence model.

S5 uses a single multi-input, multi-output (MIMO) state space model instead of the many independent single-input, single-output (SISO) systems used in Mamba. The result is a simpler architecture that maintains strong performance.

Key Innovation: MIMO SSM

Instead of having many parallel single-input single-output SSMs (like Mamba), S5 uses one large MIMO SSM:

Mamba: D separate SSMs, each with state size N
S5: one large SSM with a combined state of size D*N

Architecture

Input [batch, seq_len, embed_dim]
      │
      ▼
┌─────────────────────────────────────┐
│  S5 Block                            │
│                                      │
│  Linear projection → Encoder         │
│                                      │
│  ┌─ MIMO SSM ────────────────────┐   │
│  │                               │   │
│  │  x'(t) = Ax(t) + Bu(t)        │   │
│  │  y(t) = Cx(t) + Du(t)         │   │
│  │                               │   │
│  │  (Diagonal A for efficiency)  │   │
│  │                               │   │
│  └───────────────────────────────┘   │
│                                      │
│  Decoder → Linear projection         │
│                                      │
└─────────────────────────────────────┘
      │ (repeat for num_layers)
      ▼
[batch, hidden_size]
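
The recurrence inside the MIMO SSM box can be sketched in plain Elixir. This is a minimal illustration under stated assumptions, not the Edifice implementation: the module name `DiagonalSSMSketch` and all of its functions are hypothetical, the diagonal entries of A are taken to be real and nonzero (the S5 paper uses complex eigenvalues), the continuous system is discretized with a zero-order hold, and tensors are replaced by nested lists so the example needs no dependencies.

```elixir
defmodule DiagonalSSMSketch do
  @moduledoc """
  Illustrative only: a diagonal MIMO SSM on plain lists, not Edifice code.
  """

  # Zero-order-hold discretization for diagonal A.
  #   lambdas: P nonzero real diagonal entries of A (negative for stability)
  #   b:       P rows, each holding D input weights
  #   dt:      step size
  # Returns {a_bar, b_bar} with a_bar_i = exp(lambda_i * dt) and
  # b_bar_i = (a_bar_i - 1) / lambda_i * b_i.
  def discretize(lambdas, b, dt) do
    a_bar = Enum.map(lambdas, fn lam -> :math.exp(lam * dt) end)

    b_bar =
      [lambdas, a_bar, b]
      |> Enum.zip()
      |> Enum.map(fn {lam, a, row} ->
        scale = (a - 1.0) / lam
        Enum.map(row, fn w -> w * scale end)
      end)

    {a_bar, b_bar}
  end

  # One step: x_k = a_bar .* x_{k-1} + b_bar * u_k, y_k = C x_k + D u_k.
  # The cost depends on the state and feature sizes, not on how many
  # steps came before, which is what makes incremental inference O(1).
  def step(x, u, {a_bar, b_bar}, c, d) do
    x_next =
      [a_bar, x, b_bar]
      |> Enum.zip()
      |> Enum.map(fn {a, xi, row} -> a * xi + dot(row, u) end)

    y =
      c
      |> Enum.zip(d)
      |> Enum.map(fn {c_row, d_row} -> dot(c_row, x_next) + dot(d_row, u) end)

    {y, x_next}
  end

  # Run the recurrence over a whole sequence, threading the state through.
  # Returns {outputs, final_state}.
  def run(inputs, x0, discrete, c, d) do
    Enum.map_reduce(inputs, x0, fn u, x -> step(x, u, discrete, c, d) end)
  end

  defp dot(v, w) do
    v
    |> Enum.zip(w)
    |> Enum.reduce(0.0, fn {a, b}, acc -> acc + a * b end)
  end
end
```

Because A is diagonal, the state update is an elementwise multiply rather than a full matrix product, which is the efficiency the diagram refers to. With a single state of eigenvalue -1.0, step size 0.1, and a constant input of 1.0, two steps from a zero state give 1 - exp(-0.2) up to floating-point rounding, since a zero-order hold is exact for piecewise-constant inputs.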

Complexity

Aspect	Value
Training	O(L log L) via FFT convolution, or O(L) work via parallel scan
Inference	O(1) per step
Parameters	Fewer than Mamba
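
The scan entry in the table above rests on the fact that a linear recurrence x_k = a * x_{k-1} + b_k can be written as a chain of affine maps whose composition is associative. The sketch below illustrates this for a single state dimension; it is an assumption-laden example, not Edifice code. The module `ScanSketch` and its functions are hypothetical, and the code runs sequentially on lists: it only shows that a pairwise, tree-shaped evaluation produces the same prefixes as the left-to-right loop.

```elixir
defmodule ScanSketch do
  @moduledoc """
  Illustrative only: scan over affine maps x -> a * x + b on plain lists.
  """

  # Compose two steps: apply {a1, b1} first, then {a2, b2}.
  # a2 * (a1 * x + b1) + b2 = (a2 * a1) * x + (a2 * b1 + b2)
  def combine({a1, b1}, {a2, b2}), do: {a2 * a1, a2 * b1 + b2}

  # Left-to-right inclusive scan: O(L) work, O(L) sequential depth.
  def sequential_scan(elems) do
    Enum.scan(elems, fn e, acc -> combine(acc, e) end)
  end

  # Pairwise recursive scan: combine neighbours, scan the half-length
  # list, then fill in the remaining prefixes. Total work stays O(L),
  # and the combines within one level do not depend on each other.
  def tree_scan([]), do: []
  def tree_scan([x]), do: [x]

  def tree_scan(elems) do
    pairs = Enum.chunk_every(elems, 2)

    # A trailing unpaired element passes through unchanged.
    reduced =
      Enum.map(pairs, fn
        [left, right] -> combine(left, right)
        [left] -> left
      end)

    # scanned holds the prefix up to the end of each pair.
    scanned = tree_scan(reduced)

    # Pair each chunk with the prefix that ends just before it (nil for the first).
    [pairs, [nil | scanned], scanned]
    |> Enum.zip()
    |> Enum.flat_map(fn
      {[left, _right], nil, total} -> [left, total]
      {[left, _right], prev, total} -> [combine(prev, left), total]
      {[_left], _prev, total} -> [total]
    end)
  end
end
```

Starting from a zero state, the second component of each scanned element is the state x_k at that step. On hardware that can evaluate each level's combines concurrently, the tree form needs only O(log L) sequential levels, which is why training over a full sequence can be parallelized while inference still uses the one-step recurrence.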

Key Difference from Mamba

Aspect	S5	Mamba
SSM structure	MIMO	Many SISOs
Input-dependence	Fixed (time-invariant) A, B, C	Selective (input-dependent)
Complexity	Simpler	More complex
Gating	Optional	SiLU gating

Usage

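alias Edifice.SSM.S5

model = S5.build(
  embed_dim: 287,    # feature size of the input [batch, seq_len, embed_dim]
  hidden_size: 256,  # size of the output [batch, hidden_size]
  state_size: 64,    # SSM state size
  num_layers: 4      # number of stacked S5 blocks
)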

Use Case

S5 is useful for ablation studies to understand what Mamba's added complexity (selective mechanism, gating) contributes.

Reference

Paper: "Simplified State Space Layers for Sequence Modeling" (ICLR 2023)
arXiv: 2208.04933

Summary

Types

build_opt()

Options for build/1.

Functions

build(opts \\ [])

Build an S5 model for sequence processing.

build_ffn(input, opts)

Build the Feed-Forward Network layer.

build_mimo_ssm(input, opts)

Build the MIMO SSM layer.

build_s5_block(input, opts)

Build a single S5 block.

init_cache(opts \\ [])

Initialize hidden state for O(1) incremental inference.

output_size(opts \\ [])

Get the output size of an S5 model.

param_count(opts)

Calculate approximate parameter count for an S5 model.

recommended_defaults()

Get recommended defaults for real-time sequence processing (60fps).