Perceiver IO: a general-purpose architecture built around a learned latent array.
Perceiver IO uses cross-attention to map arbitrary inputs onto a fixed-size latent array, processes the latents with self-attention, then optionally cross-attends from output queries back to the latents to produce structured outputs. This decouples compute from input size.
Key Innovation: Latent Bottleneck
Instead of self-attending over the full input (O(N^2) in input length N), Perceiver has a small learned latent array (M << N) cross-attend to the input, then self-attends over the latents (O(M^2)). Total: O(N*M + M^2).
Input [batch, N, input_dim]     Latents [1, M, latent_dim] (learned)
            |                             |
            +-- Cross-Attention(L, Input) --+
                          |
          Latents' [batch, M, latent_dim]
                          |
          Self-Attention x num_layers
                          |
          Latents'' [batch, M, latent_dim]
                          |
          Pool -> [batch, latent_dim]

Architecture
Input [batch, seq_len, input_dim]
        |
        v
+-------------------------------------+
| Cross-Attention                     |
| Q = Latent Array (learned, M x D)   |
| K, V = Input                        |
| -> Latents absorb input info        |
+-------------------------------------+
        |
        v   (repeat num_cross_layers)
+-------------------------------------+
| Self-Attention Block                |
| LayerNorm -> Self-Attn -> Residual  |
| LayerNorm -> FFN -> Residual        |
+-------------------------------------+
        |   (repeat num_layers)
        v
Mean pool over latents -> [batch, latent_dim]

Complexity
| Component | Standard Transformer | Perceiver |
|---|---|---|
| Self-Attn | O(N^2) | O(M^2) |
| Cross-Attn | - | O(N*M) |
| Total | O(N^2) | O(N*M + M^2) |
Where M = num_latents << N = input length.
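Plugging in the defaults makes the savings concrete; a quick back-of-envelope in Elixir (illustrative arithmetic only, not part of the library's API):

```elixir
n = 10_000   # input length
m = 64       # num_latents (default)

standard = n * n                # O(N^2) attention-score entries
perceiver = n * m + 4 * m * m   # one cross-attention pass + 4 latent self-attention layers

# standard:  100_000_000
# perceiver: 640_000 + 16_384 = 656_384, over 150x fewer score entries
```

The cross-attention term N*M dominates, so for long inputs the cost grows linearly in N rather than quadratically.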
Usage
model = Perceiver.build(
  input_dim: 287,
  latent_dim: 256,
  num_latents: 64,
  num_layers: 4,
  num_cross_layers: 1,
  num_heads: 4
)

References
- Paper: "Perceiver IO: A General Architecture for Structured Inputs & Outputs" (Jaegle et al., DeepMind 2021)
- Original: "Perceiver: General Perception with Iterative Attention" (Jaegle et al., 2021)
Summary
Functions
Build a Perceiver IO model for sequence processing.
Build a cross-attention block where latents attend to input.
Build a self-attention block over latents.
Get the output size of a Perceiver model.
Calculate approximate parameter count for a Perceiver model.
Recommended default configuration for sequence processing.
Types
@type build_opt() ::
        {:dropout, float()}
        | {:input_dim, pos_integer()}
        | {:latent_dim, pos_integer()}
        | {:num_cross_layers, pos_integer()}
        | {:num_heads, pos_integer()}
        | {:num_latents, pos_integer()}
        | {:num_layers, pos_integer()}
Options for build/1.
Functions
Build a Perceiver IO model for sequence processing.
Options
- :input_dim - Size of input embedding per timestep (required)
- :latent_dim - Latent array dimension (default: 256)
- :num_latents - Number of latent vectors M (default: 64)
- :num_layers - Number of self-attention layers over latents (default: 4)
- :num_cross_layers - Number of input cross-attention passes (default: 1)
- :num_heads - Number of attention heads (default: 4)
- :dropout - Dropout rate (default: 0.1)
Returns
An Axon model that outputs [batch, latent_dim].
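The returned model compiles and runs like any other Axon model; a minimal sketch (the input length 100 and batch size 2 are arbitrary, and a recent Axon/Nx is assumed):

```elixir
model = Perceiver.build(input_dim: 287, num_latents: 64, latent_dim: 256)

# Compile to init/predict functions and initialize parameters from a template.
{init_fn, predict_fn} = Axon.build(model)
params = init_fn.(Nx.template({1, 100, 287}, :f32), %{})

input = Nx.broadcast(0.0, {2, 100, 287})
output = predict_fn.(params, input)
# output shape: {2, 256}, i.e. [batch, latent_dim], independent of input length
```

Because the output depends only on latent_dim, the same parameters can process sequences of any length at inference time.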
Build a cross-attention block where latents attend to input.
Structure: LayerNorm(latents) -> CrossAttn(Q=latents, KV=input) -> Residual -> LayerNorm -> FFN -> Residual
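The attention core of this block reduces to a few Nx ops; a single-head, unbatched sketch of latents attending to the input (projections and normalization omitted, toy sizes chosen for illustration):

```elixir
{m, n, d} = {4, 16, 8}   # num_latents, input length, head dim

q = Nx.iota({m, d}, type: :f32) |> Nx.divide(100)   # latent queries
k = Nx.iota({n, d}, type: :f32) |> Nx.divide(100)   # input keys
v = Nx.iota({n, d}, type: :f32) |> Nx.divide(100)   # input values

# Scaled dot-product scores: shape {m, n}, the O(N*M) term.
scores = Nx.dot(q, Nx.transpose(k)) |> Nx.divide(Nx.sqrt(d))

# Softmax over the input axis.
weights = Nx.exp(scores)
weights = Nx.divide(weights, Nx.sum(weights, axes: [1], keep_axes: true))

out = Nx.dot(weights, v)   # {m, d}: input information compressed into M latents
```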
Build a self-attention block over latents.
Structure: LayerNorm -> Self-Attention -> Residual -> LayerNorm -> FFN -> Residual
@spec output_size(keyword()) :: non_neg_integer()
Get the output size of a Perceiver model.
@spec param_count(keyword()) :: non_neg_integer()
Calculate approximate parameter count for a Perceiver model.
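As a sanity check on param_count/1, the dominant terms can be estimated by hand. A rough sketch under standard transformer accounting (4*D^2 for Q/K/V/output projections, 8*D^2 for a 4x-expansion FFN); the library's exact count will differ, e.g. in how the input_dim projection and biases are handled:

```elixir
d = 256        # latent_dim
m = 64         # num_latents
num_layers = 4

latent_array = m * d                      # learned latent array itself
per_self_layer = 4 * d * d + 8 * d * d    # attention projections + FFN
cross = 4 * d * d + 8 * d * d             # one cross pass (assumes input projected to d)

total = latent_array + cross + num_layers * per_self_layer
# roughly 3.9M parameters at the defaults
```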
@spec recommended_defaults() :: keyword()
Recommended default configuration for sequence processing.