High-level model building utilities for sequence and vision architectures.
Provides standardized model skeletons that handle input creation, projection, block stacking, final normalization, and output extraction. Architecture-specific logic is provided via block builder callbacks.
Sequence Model
Input [batch, seq_len, embed_dim]
-> Optional projection to hidden_size
-> Stack N blocks (via block_builder callback)
-> Final LayerNorm
-> Output extraction (last_timestep / all / mean_pool)
Vision Model
Input [batch, channels, height, width]
-> Patch embedding
-> Stack N blocks (via block_builder callback)
-> Final LayerNorm
-> Pooling (cls_token / mean_pool)
-> Optional classifier head
Usage
# Build a sequence model with custom blocks
model = ModelBuilder.build_sequence_model(
  embed_dim: 287,
  hidden_size: 256,
  num_layers: 4,
  block_builder: fn input, opts -> MyBlock.layer(input, opts) end
)
Design
Generalizes the pattern from Edifice.SSM.Common.build_model/2 to work
with any block type (SSM, attention, MLP mixer, etc.).
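The callback pattern described above can be sketched in plain Elixir without Axon. This is only an illustration of how one builder can stack any block type: the StackSketch module, the list-of-names "model", and the :layer_idx entry added to opts are all assumptions made for the example, not part of the documented API.

```elixir
defmodule StackSketch do
  # Hypothetical stand-in for the stacking step; not the library's implementation.
  # The "model" is just a list of layer names, with the newest layer last.
  def stack(input, num_layers, block_builder, opts \\ []) do
    # The explicit step (//1) makes the range empty when num_layers is 0.
    Enum.reduce(1..num_layers//1, input, fn layer_idx, acc ->
      # Assumption: each block receives the shared opts plus its own index.
      block_builder.(acc, Keyword.put(opts, :layer_idx, layer_idx))
    end)
  end
end

# Two interchangeable block builders sharing the (input, opts) signature.
attention_block = fn input, opts -> input ++ ["attention_#{opts[:layer_idx]}"] end
mlp_block = fn input, opts -> input ++ ["mlp_#{opts[:layer_idx]}"] end

IO.inspect(StackSketch.stack(["input"], 3, attention_block))
IO.inspect(StackSketch.stack(["input"], 2, mlp_block, hidden_size: 256))
```

Because the stacking code only ever calls the callback, swapping the block type changes nothing else in the skeleton.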
Functions
Build a sequence processing model.
Options
:embed_dim - Input embedding dimension (required)
:hidden_size - Internal hidden dimension (default: embed_dim)
:num_layers - Number of blocks to stack (required)
:block_builder - Function (input, opts) -> Axon.t() that builds one block (required)
:seq_len - Expected sequence length for JIT optimization (default: 60)
:output_mode - Output extraction: :last_timestep, :all, :mean_pool (default: :last_timestep)
:final_norm - Whether to apply final layer norm (default: true)
:dropout - Dropout rate between blocks (default: 0.0)
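As a rough sketch of how the required and defaulted options above might be resolved, the following uses only the standard Keyword module. SequenceOptsSketch and resolve/1 are hypothetical names for illustration; the library's actual option handling may differ.

```elixir
defmodule SequenceOptsSketch do
  # Hypothetical helper mirroring the documented options and defaults.
  def resolve(opts) do
    # Keyword.fetch!/2 raises KeyError when a required option is missing.
    embed_dim = Keyword.fetch!(opts, :embed_dim)

    %{
      embed_dim: embed_dim,
      # :hidden_size falls back to the value of :embed_dim.
      hidden_size: Keyword.get(opts, :hidden_size, embed_dim),
      num_layers: Keyword.fetch!(opts, :num_layers),
      block_builder: Keyword.fetch!(opts, :block_builder),
      seq_len: Keyword.get(opts, :seq_len, 60),
      output_mode: Keyword.get(opts, :output_mode, :last_timestep),
      final_norm: Keyword.get(opts, :final_norm, true),
      dropout: Keyword.get(opts, :dropout, 0.0)
    }
  end
end

resolved =
  SequenceOptsSketch.resolve(
    embed_dim: 287,
    num_layers: 4,
    block_builder: fn input, _opts -> input end
  )

IO.inspect(Map.delete(resolved, :block_builder))
```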
Returns
An Axon model. Output shape depends on :output_mode:
:last_timestep -> [batch, hidden_size]
:all -> [batch, seq_len, hidden_size]
:mean_pool -> [batch, hidden_size]
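To make the three output modes concrete, here is a minimal sketch on nested lists instead of tensors. OutputModeSketch is a hypothetical module; the real model operates on Axon/Nx tensors, and this only mirrors the shapes listed above.

```elixir
defmodule OutputModeSketch do
  # `batch` is a list of sequences; each sequence is a list of timestep
  # vectors, so the nesting corresponds to [batch, seq_len, hidden_size].
  def extract(batch, :all), do: batch

  # Keep only the final timestep of each sequence: [batch, hidden_size].
  def extract(batch, :last_timestep), do: Enum.map(batch, &List.last/1)

  # Average every hidden feature over the timesteps: [batch, hidden_size].
  def extract(batch, :mean_pool) do
    Enum.map(batch, fn sequence ->
      steps = length(sequence)

      sequence
      |> Enum.zip_with(&Enum.sum/1)
      |> Enum.map(&(&1 / steps))
    end)
  end
end

# One sequence with seq_len = 2 and hidden_size = 2.
batch = [[[1.0, 2.0], [3.0, 4.0]]]

IO.inspect(OutputModeSketch.extract(batch, :last_timestep))
IO.inspect(OutputModeSketch.extract(batch, :mean_pool))
```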
Build a vision model with patch embedding.
Options
:image_size - Input image size (square, default: 224)
:patch_size - Patch size (default: 16)
:in_channels - Number of input channels (default: 3)
:hidden_size - Hidden dimension (required)
:num_layers - Number of blocks to stack (required)
:block_builder - Function (input, opts) -> Axon.t() that builds one block (required)
:num_classes - If provided, adds a classifier head
:output_mode - Pooling mode: :mean_pool, :cls_token (default: :mean_pool)
:final_norm - Whether to apply final layer norm (default: true)
Returns
An Axon model outputting [batch, hidden_size], or [batch, num_classes] when :num_classes is provided.
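The shape arithmetic for the vision path can be sketched as follows. This assumes non-overlapping square patches in the usual ViT style, which the text above does not state explicitly; PatchSketch and its functions are hypothetical names used only for illustration.

```elixir
defmodule PatchSketch do
  # Assumption: the image is cut into non-overlapping patch_size x patch_size
  # tiles, so image_size must be divisible by patch_size.
  def num_patches(image_size, patch_size) when rem(image_size, patch_size) == 0 do
    per_side = div(image_size, patch_size)
    per_side * per_side
  end

  # Output shape after pooling, with or without the optional classifier head.
  def output_shape(hidden_size, num_classes \\ nil)
  def output_shape(hidden_size, nil), do: {:batch, hidden_size}
  def output_shape(_hidden_size, num_classes), do: {:batch, num_classes}
end

# Documented defaults: image_size 224, patch_size 16.
IO.inspect(PatchSketch.num_patches(224, 16))
IO.inspect(PatchSketch.output_shape(384))
IO.inspect(PatchSketch.output_shape(384, 10))
```

Under that assumption, the default settings produce 14 patches per side, which the blocks then process as a sequence before pooling.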