Graph Transformer with structural encoding.
Applies transformer-style multi-head attention to graph-structured data, using the adjacency matrix as an attention bias/mask to incorporate graph structure. Includes graph positional encoding via random-walk structural encoding (RWSE) or Laplacian-eigenvector features approximated via powers of the adjacency matrix.
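For intuition, here is a minimal Nx sketch of the two structural ingredients: an RWSE-style positional encoding built from powers of the random-walk matrix, and an additive attention bias derived from the adjacency matrix. The module and function names are hypothetical and not part of this module's API; the actual implementation may differ.

defmodule GraphStructureSketch do
  # Random-walk structural encoding for one dense {n, n} adjacency matrix.
  # Column i holds the diagonal of (D^-1 A)^(i+1), i.e. each node's
  # probability of returning to itself after i + 1 random-walk steps.
  def rwse(adjacency, k \\ 8) do
    degrees = Nx.sum(adjacency, axes: [1], keep_axes: true)
    rw = Nx.divide(adjacency, Nx.max(degrees, 1.0e-6))

    {diagonals, _final_power} =
      Enum.map_reduce(1..k, rw, fn _step, power ->
        {Nx.take_diagonal(power), Nx.dot(power, rw)}
      end)

    # {num_nodes, k} positional-encoding tensor.
    Nx.stack(diagonals, axis: 1)
  end

  # Additive attention bias: 0 where an edge exists, a large negative value
  # elsewhere, so softmax attention is effectively restricted to neighbors.
  def attention_bias(adjacency) do
    Nx.select(Nx.greater(adjacency, 0), 0.0, -1.0e9)
  end
end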
Architecture
Node Features [batch, num_nodes, input_dim]
Adjacency     [batch, num_nodes, num_nodes]
                   |
                   v
+--------------------------------------+
|  Input Projection + Positional Enc   |
+--------------------------------------+
                   |
                   v
+--------------------------------------+
|  Graph Transformer Layer 1:          |
|    Pre-Norm -> Multi-Head Attention  |
|      (adjacency as attention bias)   |
|      + Residual                      |
|    Pre-Norm -> FFN + Residual        |
+--------------------------------------+
                   |
                   v
+--------------------------------------+
|  Graph Transformer Layer N           |
+--------------------------------------+
                   |
                   v
Node Embeddings [batch, num_nodes, hidden_size]

Usage
model = GraphTransformer.build(
  input_dim: 16,
  hidden_size: 64,
  num_heads: 4,
  num_layers: 4,
  num_classes: 7
)
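A sketch of initializing and running a model (here built without :num_classes so the output matches the node-embedding shape shown in the diagram). This assumes Axon ~> 0.6 conventions; the shapes and inputs are made up for illustration.

model =
  GraphTransformer.build(
    input_dim: 16,
    hidden_size: 64,
    num_heads: 4,
    num_layers: 4
  )

{init_fn, predict_fn} = Axon.build(model)

# One graph with 10 nodes and 16 input features; self-loop-only adjacency.
inputs = %{
  "nodes" => Nx.broadcast(0.0, {1, 10, 16}),
  "adjacency" => Nx.new_axis(Nx.eye(10, type: :f32), 0)
}

params = init_fn.(inputs, %{})
embeddings = predict_fn.(params, inputs)
# Without :num_classes the output is node embeddings of shape {1, 10, 64}.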
References
- Dwivedi & Bresson, "A Generalization of Transformer Networks to Graphs" (AAAI 2021)
- Ying et al., "Do Transformers Really Perform Bad for Graph Representation?" (NeurIPS 2021)
Summary
Functions
build/1
Build a Graph Transformer.
graph_transformer_layer/4
Single Graph Transformer layer with pre-norm attention + FFN.
output_size/1
Get the output size of a Graph Transformer.
Types
@type build_opt() ::
        {:dropout, float()}
        | {:hidden_size, pos_integer()}
        | {:input_dim, pos_integer()}
        | {:num_classes, pos_integer() | nil}
        | {:num_heads, pos_integer()}
        | {:num_layers, pos_integer()}
        | {:pool, atom()}
Options for build/1.
Functions
Build a Graph Transformer.
Options
:input_dim - Input feature dimension per node (required)
:hidden_size - Hidden dimension (default: 64)
:num_heads - Number of attention heads (default: 4)
:num_layers - Number of transformer layers (default: 4)
:num_classes - If provided, adds a classification head (default: nil)
:dropout - Dropout rate (default: 0.0)
:pool - Global pooling for graph classification (default: nil)
Returns
An Axon model with two inputs ("nodes" and "adjacency").
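For graph-level classification you would presumably combine :num_classes with :pool; the :mean value below is an assumption, so check the implementation for the supported pooling atoms.

model =
  GraphTransformer.build(
    input_dim: 16,
    hidden_size: 64,
    num_heads: 4,
    num_layers: 2,
    num_classes: 7,
    pool: :mean,
    dropout: 0.1
  )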
@spec graph_transformer_layer(Axon.t(), Axon.t(), pos_integer(), keyword()) :: Axon.t()
Single Graph Transformer layer with pre-norm attention + FFN.
Options
:num_heads - Number of attention heads (default: 4)
:dropout - Dropout rate (default: 0.0)
:name - Layer name prefix
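A sketch of composing layers by hand (the argument order is inferred from the @spec above: node graph, adjacency graph, hidden size, then options; verify against the implementation):

nodes = Axon.input("nodes", shape: {nil, nil, 16})
adjacency = Axon.input("adjacency", shape: {nil, nil, nil})

hidden = Axon.dense(nodes, 64, name: "input_projection")

output =
  Enum.reduce(1..4, hidden, fn i, acc ->
    GraphTransformer.graph_transformer_layer(acc, adjacency, 64,
      num_heads: 4,
      dropout: 0.1,
      name: "layer_#{i}"
    )
  end)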
@spec output_size(keyword()) :: pos_integer()
Get the output size of a Graph Transformer.
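Presumably this mirrors build/1's options and returns the trailing dimension of the model's output (for example :hidden_size, or :num_classes when a classification head is present); a hypothetical call:

GraphTransformer.output_size(hidden_size: 64)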