Edifice.Graph.GraphTransformer (Edifice v0.2.0)

Graph Transformer with structural encoding.

Applies transformer-style multi-head attention to graph-structured data, using the adjacency matrix as an attention bias/mask so that attention incorporates graph structure. Includes graph positional encoding via random-walk structural encoding (RWSE) or Laplacian eigenvectors approximated via powers of the adjacency matrix.
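As a concrete illustration of RWSE: each node's k-th encoding channel is the return probability of a k-step random walk, i.e. the node's diagonal entry of (D^-1 A)^k. Below is a minimal Nx sketch of that computation; the module and function names are hypothetical and this is not Edifice's internal implementation:

defmodule RWSESketch do
  # adjacency: {num_nodes, num_nodes} 0/1 tensor.
  # Returns {num_nodes, k}: column j holds each node's probability of a
  # (j+1)-step random walk returning to its start node.
  def rwse(adjacency, k \\ 4) do
    degrees = Nx.sum(adjacency, axes: [1], keep_axes: true)
    # Random-walk matrix D^-1 A (guarding against isolated nodes).
    rw = Nx.divide(adjacency, Nx.max(degrees, 1.0e-6))

    {diagonals, _final_power} =
      Enum.map_reduce(1..k, rw, fn _step, power ->
        {Nx.take_diagonal(power), Nx.dot(power, rw)}
      end)

    Nx.stack(diagonals, axis: 1)
  end
end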

Architecture

Node Features [batch, num_nodes, input_dim]
Adjacency     [batch, num_nodes, num_nodes]
      |
      v
+--------------------------------------+
| Input Projection + Positional Enc    |
+--------------------------------------+
      |
      v
+--------------------------------------+
| Graph Transformer Layer 1:           |
|   Pre-Norm -> Multi-Head Attention   |
|   (adjacency as attention bias)      |
|   + Residual                         |
|   Pre-Norm -> FFN + Residual         |
+--------------------------------------+
      |
      v
+--------------------------------------+
| Graph Transformer Layer N            |
+--------------------------------------+
      |
      v
Node Embeddings [batch, num_nodes, hidden_size]
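The "adjacency as attention bias" step can be sketched in plain Nx. This is illustrative only: the names are hypothetical, the exact bias scheme in Edifice may differ, and the real layers are batched and multi-headed:

defmodule AdjBiasSketch do
  import Nx.Defn

  # Single-head attention over one graph: q, k, v are {num_nodes, dim};
  # adjacency is a {num_nodes, num_nodes} 0/1 tensor.
  defn masked_attention(q, k, v, adjacency) do
    dim = Nx.axis_size(q, -1)
    scores = Nx.dot(q, Nx.transpose(k)) / Nx.sqrt(dim)
    # Additive bias: 0 for connected pairs, a large negative value
    # otherwise, so softmax gives (near-)zero weight to non-neighbors.
    biased = scores + (1 - adjacency) * -1.0e9
    weights = softmax_rows(biased)
    Nx.dot(weights, v)
  end

  defnp softmax_rows(x) do
    exp = Nx.exp(x - Nx.reduce_max(x, axes: [-1], keep_axes: true))
    exp / Nx.sum(exp, axes: [-1], keep_axes: true)
  end
end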

Usage

alias Edifice.Graph.GraphTransformer

model = GraphTransformer.build(
  input_dim: 16,
  hidden_size: 64,
  num_heads: 4,
  num_layers: 4,
  num_classes: 7
)
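Assuming standard Axon semantics (the module returns an Axon.t/0, per the spec below), the model initializes and runs like any other Axon model. Input shapes follow the architecture diagram; the identity adjacency here is just a placeholder:

{init_fn, predict_fn} = Axon.build(model)

nodes = Nx.broadcast(0.0, {1, 10, 16})
adjacency = Nx.new_axis(Nx.eye(10), 0)
inputs = %{"nodes" => nodes, "adjacency" => adjacency}

params = init_fn.(inputs, %{})
# Output shape depends on :num_classes and :pool; without them it holds
# per-node embeddings [batch, num_nodes, hidden_size].
output = predict_fn.(params, inputs)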

References

  • Dwivedi & Bresson, "A Generalization of Transformer Networks to Graphs" (AAAI 2021)
  • Ying et al., "Do Transformers Really Perform Bad for Graph Representation?" (NeurIPS 2021)

Summary

Types

build_opt()
  Options for build/1.

Functions

build(opts \\ [])
  Build a Graph Transformer.

graph_transformer_layer(nodes, adjacency, hidden_size, opts \\ [])
  Single Graph Transformer layer with pre-norm attention + FFN.

output_size(opts \\ [])
  Get the output size of a Graph Transformer.

Types

build_opt()

@type build_opt() ::
  {:dropout, float()}
  | {:hidden_size, pos_integer()}
  | {:input_dim, pos_integer()}
  | {:num_classes, pos_integer() | nil}
  | {:num_heads, pos_integer()}
  | {:num_layers, pos_integer()}
  | {:pool, atom()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a Graph Transformer.

Options

  • :input_dim - Input feature dimension per node (required)
  • :hidden_size - Hidden dimension (default: 64)
  • :num_heads - Number of attention heads (default: 4)
  • :num_layers - Number of transformer layers (default: 4)
  • :num_classes - If provided, adds a classification head (default: nil)
  • :dropout - Dropout rate (default: 0.0)
  • :pool - Global pooling for graph classification (default: nil)

Returns

An Axon model with two inputs ("nodes" and "adjacency").
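For graph-level classification, :num_classes and :pool are combined. The particular pool atom used below (:mean) is an assumption to check against the source; only the option names above are documented:

model =
  Edifice.Graph.GraphTransformer.build(
    input_dim: 16,
    hidden_size: 64,
    num_heads: 4,
    num_layers: 2,
    num_classes: 7,
    pool: :mean,
    dropout: 0.1
  )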

graph_transformer_layer(nodes, adjacency, hidden_size, opts \\ [])

@spec graph_transformer_layer(Axon.t(), Axon.t(), pos_integer(), keyword()) ::
  Axon.t()

Single Graph Transformer layer with pre-norm attention + FFN.

Options

  • :num_heads - Number of attention heads (default: 4)
  • :dropout - Dropout rate (default: 0.0)
  • :name - Layer name prefix
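A sketch of composing this layer manually, per the spec above (the surrounding input projection is assumed; shapes follow the architecture diagram):

alias Edifice.Graph.GraphTransformer

nodes = Axon.input("nodes", shape: {nil, nil, 16})
adjacency = Axon.input("adjacency", shape: {nil, nil, nil})

# Project raw features to the hidden size before the first layer.
hidden = Axon.dense(nodes, 64)

layer =
  GraphTransformer.graph_transformer_layer(hidden, adjacency, 64,
    num_heads: 4,
    name: "gt_layer_1"
  )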

output_size(opts \\ [])

@spec output_size(keyword()) :: pos_integer()

Get the output size of a Graph Transformer.
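A usage sketch. That the result mirrors build/1's final feature dimension (hidden_size, or num_classes when a classification head is configured) is an assumption to verify against the source:

GraphTransformer.output_size(hidden_size: 64)
#=> 64 (assumed)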