Mamba: True Selective State Space Model with optimized parallel scan.
Implements the Mamba architecture from "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" (Gu & Dao, 2023).
Key Innovation: Parallel Associative Scan
The SSM recurrence h[t] = A h[t-1] + B x[t] seems sequential, but can be parallelized using associativity:
Define: (a, b) ⊗ (c, d) = (a*c, a*d + b), where (a, b) is the later step and (c, d) the earlier one
Then the scan:
h[0] = B[0] * x[0]
h[1] = A[1] * h[0] + B[1] * x[1]
h[2] = A[2] * h[1] + B[2] * x[2]
...
Can be computed in O(log L) parallel time using a prefix scan.
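To make this concrete, here is a small Python sketch (illustrative only, not the library's Elixir implementation) showing that combining (A[t], B[t]*x[t]) pairs with this operator in a log-depth scan reproduces the sequential recurrence; `combine`, `sequential_scan`, and `associative_scan` are names invented for the example.

```python
# Each timestep of h[t] = A[t]*h[t-1] + B[t]*x[t] becomes the pair
# (A[t], B[t]*x[t]); the operator combines a later pair (a, b) with an
# earlier pair (c, d) into (a*c, a*d + b).

def combine(later, earlier):
    a, b = later
    c, d = earlier
    return (a * c, a * d + b)

def sequential_scan(A, Bx):
    # Reference O(L) recurrence.
    h, out = 0.0, []
    for a, b in zip(A, Bx):
        h = a * h + b
        out.append(h)
    return out

def associative_scan(A, Bx):
    # Hillis-Steele inclusive scan: O(log L) parallel steps, each step
    # combining every element with the partial result 2^k positions back.
    elems = list(zip(A, Bx))
    step = 1
    while step < len(elems):
        elems = [
            combine(e, elems[t - step]) if t >= step else e
            for t, e in enumerate(elems)
        ]
        step *= 2
    return [b for _, b in elems]  # second component of each pair is h[t]

A = [0.9, 0.8, 0.7, 0.95]
Bx = [1.0, 0.5, -0.2, 0.3]
```

Both functions produce the same h[t] values; only the dependency depth differs.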
Selective Mechanism
Unlike linear time-invariant SSMs, Mamba makes Δ, B, and C input-dependent (A itself stays fixed, but the effective A_bar = exp(Δ A) varies per step through Δ):
- Δ (discretization step) controls how much to update state
- B (input matrix) projects input to state space
- C (output matrix) projects state to output
- These are computed from the input, enabling selective focus
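A minimal sketch of this selective parameterization in Python (the weight names W_dt, W_B, W_C and the shapes are assumptions for illustration, not the library's internals): Δ, B, and C are linear functions of the input, so each timestep chooses how strongly to write to and read from the state.

```python
import numpy as np

rng = np.random.default_rng(0)
L, D, N = 6, 4, 3                  # sequence length, channels, state size

x = rng.normal(size=(L, D))        # input sequence
W_dt = rng.normal(size=(D, D))     # illustrative projection for Δ
W_B = rng.normal(size=(D, N))      # illustrative projection for B
W_C = rng.normal(size=(D, N))      # illustrative projection for C

delta = np.log1p(np.exp(x @ W_dt))  # softplus keeps the step size positive
B = x @ W_B                         # (L, N): input-dependent input matrix
C = x @ W_C                         # (L, N): input-dependent output matrix
```

When Δ is near zero the state is mostly preserved (the token is skipped); a large Δ overwrites the state with the current input.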
Architecture
Input [batch, seq_len, embed_dim]
│
▼
┌─────────────────────────────────────┐
│ Mamba Block │
│ │
│ ┌──── Linear (expand) ────┐ │
│ │ │ │ │
│ │ DepthwiseConv + SiLU │ │
│ │ │ │ │
│ │ Parallel Scan SSM Linear+SiLU │
│ │ │ │ │
│ └───────── multiply ───────┘ │
│ │ │
│ Linear (project) │
└─────────────────────────────────────┘
│
▼ (repeat for num_layers)

Usage
# Build Mamba backbone
model = Mamba.build(
embed_dim: 287,
hidden_size: 256,
state_size: 16,
num_layers: 2,
expand_factor: 2
)

References
- Paper: https://arxiv.org/abs/2312.00752
- Original code: https://github.com/state-spaces/mamba
Summary
Functions
Build a Mamba model for sequence processing.
Build depthwise separable 1D convolution layer.
Build a single Mamba block with parallel scan SSM.
Build the Selective SSM with parallel associative scan.
Get the output size of a Mamba model.
Calculate approximate parameter count for a Mamba model.
Get recommended defaults for real-time sequence processing (60fps).
Types
@type build_opt() :: {:embed_dim, pos_integer()} | {:hidden_size, pos_integer()} | {:state_size, pos_integer()} | {:expand_factor, pos_integer()} | {:conv_size, pos_integer()} | {:num_layers, pos_integer()} | {:dropout, float()} | {:window_size, pos_integer()}
Options for build/1.
Functions
Build a Mamba model for sequence processing.
Options
- :embed_dim - Size of input embedding per frame (required)
- :hidden_size - Internal hidden dimension D (default: 256)
- :state_size - SSM state dimension N (default: 16)
- :expand_factor - Expansion factor E for inner dim (default: 2)
- :conv_size - 1D convolution kernel size (default: 4)
- :num_layers - Number of Mamba blocks (default: 2)
- :dropout - Dropout rate (default: 0.0)
- :window_size - Expected sequence length for JIT optimization (default: 60)
Returns
An Axon model that processes sequences and outputs the last hidden state.
@spec build_depthwise_conv1d(Axon.t(), pos_integer(), pos_integer(), String.t()) :: Axon.t()
Build depthwise separable 1D convolution layer.
Build a single Mamba block with parallel scan SSM.
Options
- :hidden_size - Internal dimension D
- :state_size - SSM state dimension N
- :expand_factor - Expansion factor E
- :conv_size - Convolution kernel size
- :name - Layer name prefix
Build the Selective SSM with parallel associative scan.
This is the core of Mamba: an SSM where A, B, C, Δ are input-dependent, computed efficiently using parallel scan.
The discretized SSM equations:
- A_bar = exp(Δ * A)
- B_bar = Δ * B (first-order approximation of the zero-order hold)
- h[t] = A_bar h[t-1] + B_bar x[t]
- y[t] = C * h[t]
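A minimal single-channel Python sketch of these equations (an illustration of the math, not the library's scan kernel; `selective_ssm` and the random test shapes are assumptions):

```python
import numpy as np

def selective_ssm(x, A, delta, B, C):
    # Sequential form of the discretized selective SSM, with diagonal A
    # of shape (N,) and per-timestep delta[t], B[t], C[t].
    h = np.zeros(A.shape[0])
    y = np.empty_like(x)
    for t in range(len(x)):
        A_bar = np.exp(delta[t] * A)     # A_bar = exp(Δ A)       (ZOH)
        B_bar = delta[t] * B[t]          # B_bar = Δ B            (first order)
        h = A_bar * h + B_bar * x[t]     # h[t] = A_bar h[t-1] + B_bar x[t]
        y[t] = C[t] @ h                  # y[t] = C h[t]
    return y

rng = np.random.default_rng(0)
L, N = 8, 4
A = -np.exp(rng.normal(size=N))          # negative real parts keep h stable
x = rng.normal(size=L)
delta = np.exp(rng.normal(size=L)) * 0.1  # positive step sizes
B = rng.normal(size=(L, N))
C = rng.normal(size=(L, N))
y = selective_ssm(x, A, delta, B, C)
```

Since Δ > 0 and A < 0, every entry of A_bar lies in (0, 1), so the recurrence is a leaky, input-gated accumulator; this loop is exactly what the parallel scan computes in O(log L) depth.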
@spec output_size(keyword()) :: non_neg_integer()
Get the output size of a Mamba model.
@spec param_count(keyword()) :: non_neg_integer()
Calculate approximate parameter count for a Mamba model.
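For intuition, a rough back-of-envelope count for a single block under the reference Mamba layout (input/output projections, depthwise conv, Δ/B/C projections, A, and the skip parameter D). This is an assumption about the layout: the library's param_count/1 may break things down differently, and dt_rank = ceil(d_model / 16) is the reference default, not necessarily this module's.

```python
def mamba_block_params(d_model, state_size=16, expand=2, conv_size=4):
    # Approximate parameters in one reference-style Mamba block.
    d_inner = expand * d_model
    dt_rank = -(-d_model // 16)                    # ceil(d_model / 16)
    total = d_model * 2 * d_inner                  # in_proj (x and gate branches)
    total += d_inner * conv_size + d_inner         # depthwise conv1d + bias
    total += d_inner * (dt_rank + 2 * state_size)  # x_proj -> (Δ, B, C)
    total += dt_rank * d_inner + d_inner           # dt_proj + bias
    total += d_inner * state_size                  # A (stored as a log)
    total += d_inner                               # skip parameter D
    total += d_inner * d_model                     # out_proj
    return total
```

With the defaults above, hidden_size: 256 gives roughly 0.44M parameters per block, so the projections, not the SSM state, dominate the count.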
@spec recommended_defaults() :: keyword()
Get recommended defaults for real-time sequence processing (60fps).