Edifice.SSM.S4D (Edifice v0.2.0)


S4D: S4 with Diagonal State Matrix.

A simplified variant of S4 in which the state matrix A is purely diagonal, which removes the need for the DPLR (Diagonal Plus Low-Rank) decomposition. S4D bridges the original S4, with its more involved HiPPO matrices, and later SSMs such as S5 and Mamba.

Key Simplification

Original S4 uses DPLR decomposition of HiPPO:

A = V * diag(Lambda) * V^{-1} + P * Q^T

S4D directly uses diagonal A:

A = diag(a_1, a_2, ..., a_N)    (real or complex)

This greatly simplifies the implementation, since A no longer has to be decomposed, while maintaining strong performance on most benchmarks.
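To make the simplification concrete, here is a minimal sketch of a diagonal SSM in plain Elixir. It is not Edifice's implementation: the module name `DiagonalSSMSketch` and its functions are hypothetical, the state is real-valued for brevity (S4D is often used with complex entries), and zero-order-hold discretization is an assumption, since this page does not say which discretization Edifice uses. The point it illustrates is that with diagonal A every state coordinate updates independently, so the recurrence and its equivalent convolution kernel need only elementwise operations.

```elixir
defmodule DiagonalSSMSketch do
  @moduledoc false
  # Hypothetical illustration module; not part of Edifice.

  # Zero-order-hold discretization of one real diagonal entry (assumed scheme):
  #   a_bar = exp(dt * a),  b_bar = (a_bar - 1) / a * b   (requires a != 0)
  def discretize(a, b, dt) do
    a_bar = :math.exp(dt * a)
    {a_bar, (a_bar - 1.0) / a * b}
  end

  # Runs x_k = a_bar .* x_{k-1} + b_bar * u_k and y_k = sum(c .* x_k)
  # over a list of scalar inputs. Because A is diagonal, each state
  # coordinate is updated on its own, with no matrix multiply.
  def scan(a_bars, b_bars, cs, inputs) do
    init = List.duplicate(0.0, length(a_bars))

    {ys, _final_state} =
      Enum.map_reduce(inputs, init, fn u, state ->
        new_state =
          [a_bars, b_bars, state]
          |> Enum.zip()
          |> Enum.map(fn {a, b, x} -> a * x + b * u end)

        y =
          Enum.zip(cs, new_state)
          |> Enum.map(fn {c, x} -> c * x end)
          |> Enum.sum()

        {y, new_state}
      end)

    ys
  end

  # Equivalent convolution kernel: K_l = sum_n c_n * b_bar_n * a_bar_n^l
  def kernel(a_bars, b_bars, cs, len) do
    for l <- 0..(len - 1)//1 do
      [a_bars, b_bars, cs]
      |> Enum.zip()
      |> Enum.map(fn {a, b, c} -> c * b * :math.pow(a, l) end)
      |> Enum.sum()
    end
  end
end
```

Feeding a unit impulse such as `[1.0, 0.0, 0.0, 0.0]` through `scan/4` should reproduce the values returned by `kernel/4` up to floating-point error, which is the recurrent/convolutional equivalence that S4-style models rely on.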

Architecture

Identical to S4, but with a simpler, diagonal-only parameterization of A. Each block applies: LayerNorm -> Diagonal SSM -> Dropout -> Residual -> FFN.

Comparison

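| Aspect         | S4           | S4D              |
|----------------|--------------|------------------|
| A matrix       | DPLR (HiPPO) | Pure diagonal    |
| Implementation | Complex      | Simple           |
| Performance    | Strong       | Nearly identical |
| Parameters     | More         | Fewer            |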

Usage

# Alias the module so it can be referred to as S4D
alias Edifice.SSM.S4D

model = S4D.build(
  embed_dim: 287,
  hidden_size: 256,
  state_size: 64,
  num_layers: 4
)

Reference

Summary

Types

Options for build/1.

Functions

Build an S4D model for sequence processing.

Build a single S4D block.

Get the output size of an S4D model.

Calculate approximate parameter count for an S4D model.

Get recommended defaults.

Types

build_opt()

@type build_opt() ::
  {:dropout, float()}
  | {:embed_dim, pos_integer()}
  | {:hidden_size, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:seq_len, pos_integer()}
  | {:state_size, pos_integer()}
  | {:window_size, pos_integer()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build an S4D model for sequence processing.

Options

  • :embed_dim - Size of input embedding per frame (required)
  • :hidden_size - Internal hidden dimension (default: 256)
  • :state_size - SSM state dimension N (default: 64)
  • :num_layers - Number of S4D blocks (default: 4)
  • :dropout - Dropout rate (default: 0.1)
  • :window_size - Expected sequence length (default: 60)
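As a usage sketch, the defaults listed above can be combined with caller overrides before the options are handed to `build/1`. The variable names here are illustrative only, and this is not necessarily how Edifice resolves defaults internally; the commented-out last line assumes Edifice is available as a dependency.

```elixir
# Defaults copied from the option list above; :embed_dim has no default.
defaults = [hidden_size: 256, state_size: 64, num_layers: 4, dropout: 0.1, window_size: 60]

# Caller-supplied values take precedence over the defaults.
opts = Keyword.merge(defaults, embed_dim: 287, num_layers: 6)

# Fail early with a KeyError if the required option is missing.
embed_dim = Keyword.fetch!(opts, :embed_dim)

# With Edifice installed, the merged options would then be passed on:
# model = Edifice.SSM.S4D.build(opts)
```

Using `Keyword.fetch!/2` for `:embed_dim` surfaces a missing required option at the call site rather than deeper inside model construction.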

Returns

An Axon model whose output has shape [batch, hidden_size], taken from the last sequence position.
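The "last position" reduction can be pictured with nested lists standing in for tensors. This is only a shape illustration, assuming the per-position activations have shape [batch, seq_len, hidden_size] before the reduction; the data is made up and no Edifice or Nx calls are involved.

```elixir
# Hypothetical activations with shape [batch = 2, seq_len = 3, hidden_size = 2].
activations = [
  [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
  [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]]
]

# Keep only the final time step of each sequence:
# [batch, seq_len, hidden_size] -> [batch, hidden_size]
last_position = Enum.map(activations, &List.last/1)
```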

build_s4d_block(input, opts)

@spec build_s4d_block(
  Axon.t(),
  keyword()
) :: Axon.t()

Build a single S4D block.

output_size(opts \\ [])

@spec output_size(keyword()) :: non_neg_integer()

Get the output size of an S4D model.

param_count(opts)

@spec param_count(keyword()) :: non_neg_integer()

Calculate approximate parameter count for an S4D model.