# `Edifice.Graph.GAT`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/graph/gat.ex#L1)

Graph Attention Network (Velickovic et al., 2018).

Implements attention-based message passing where each node attends to its
neighbors with learned attention weights. Unlike GCN, which applies a fixed
degree-based normalization, GAT learns to weight neighbor contributions
adaptively.

## Architecture

```
Node Features [batch, num_nodes, input_dim]
Adjacency     [batch, num_nodes, num_nodes]
      |
      v
+--------------------------------------+
| GAT Layer (K heads):                 |
|                                      |
|   For each head k:                   |
|     1. Project: z_i = W_k h_i        |
|     2. Attention: e_ij =             |
|        LeakyReLU(a^T [z_i || z_j])   |
|     3. Mask: e_ij = -inf             |
|        where A_ij = 0                |
|     4. Normalize: alpha_ij =         |
|        softmax_j(e_ij)               |
|     5. Aggregate: h_i' =             |
|        sigma(SUM_j alpha_ij z_j)     |
|                                      |
|   Concatenate heads: [h1 || ... hK]  |
+--------------------------------------+
      |
      v
Node Embeddings [batch, num_nodes, num_heads * hidden_size]
```

Multi-head attention lets the model jointly attend to information from
different representation subspaces; the GAT paper found multiple heads also
stabilize the learning process.
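
The per-head computation is compact enough to sketch directly in Nx. Below is
a minimal, unbatched single-head sketch (the module, names, and shapes are
illustrative, not this module's internals); it uses the standard decomposition
`a^T [z_i || z_j] = a_src . z_i + a_dst . z_j` so all pairwise scores come
from one broadcast add:

    defmodule GATSketch do
      import Nx.Defn

      # Unbatched sketch: h {n, d}, w {d, d'}, a_src/a_dst {d'},
      # adj {n, n} with 1.0 where an edge exists. Self-loops are
      # assumed so every softmax row has a finite entry. A final
      # sigma nonlinearity would follow in a full layer.
      defn gat_head(h, w, a_src, a_dst, adj) do
        z = Nx.dot(h, w)                              # project: {n, d'}
        src = Nx.dot(z, a_src)                        # a_src . z_i: {n}
        dst = Nx.dot(z, a_dst)                        # a_dst . z_j: {n}
        e = Nx.new_axis(src, 1) + Nx.new_axis(dst, 0) # e_ij: {n, n}
        e = leaky_relu(e)
        # mask non-edges before the softmax so alpha_ij = 0 off-graph
        e = Nx.select(Nx.greater(adj, 0), e, Nx.Constants.neg_infinity())
        alpha = row_softmax(e)                        # normalize over j
        Nx.dot(alpha, z)                              # aggregate: {n, d'}
      end

      defnp leaky_relu(x), do: Nx.select(Nx.greater(x, 0), x, 0.2 * x)

      defnp row_softmax(e) do
        m = Nx.reduce_max(e, axes: [1], keep_axes: true)
        exp = Nx.exp(e - m)
        exp / Nx.sum(exp, axes: [1], keep_axes: true)
      end
    end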

## Usage

    # Build a GAT for node classification
    model = GAT.build(
      input_dim: 16,
      hidden_size: 8,
      num_heads: 8,
      num_classes: 7,
      dropout: 0.6
    )
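
The result is an ordinary Axon graph with the two documented inputs, so it
initializes and runs like any other model. A minimal sketch (the placeholder
tensors and fully dense adjacency are purely illustrative):

    # 4 graphs of 32 nodes each, matching input_dim: 16 above
    nodes = Nx.iota({4, 32, 16}, type: :f32)
    adjacency = Nx.broadcast(Nx.tensor(1.0), {4, 32, 32})
    inputs = %{"nodes" => nodes, "adjacency" => adjacency}

    {init_fn, predict_fn} = Axon.build(model)
    params = init_fn.(inputs, %{})
    logits = predict_fn.(params, inputs)  # {4, 32, 7} per-node scores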

## References

- "Graph Attention Networks" (Velickovic et al., ICLR 2018)

# `build_opt`

```elixir
@type build_opt() ::
  {:activation, atom()}
  | {:dropout, float()}
  | {:hidden_size, pos_integer()}
  | {:input_dim, pos_integer()}
  | {:num_classes, pos_integer() | nil}
  | {:num_heads, pos_integer()}
  | {:num_layers, pos_integer()}
```

Options for `build/1`.

# `attention_coefficients`

```elixir
@spec attention_coefficients(Axon.t(), Axon.t(), pos_integer(), keyword()) :: Axon.t()
```

Compute attention coefficients between connected nodes.

Returns the raw (pre-softmax) attention scores for visualization or analysis.
The attention mechanism is:

    e_ij = LeakyReLU(a^T [W h_i || W h_j])

## Parameters

- `nodes` - Node features Axon node `{batch, num_nodes, feature_dim}`
- `adjacency` - Adjacency matrix Axon node `{batch, num_nodes, num_nodes}`
- `hidden_size` - Projection dimension
- `opts` - Options

## Options

- `:name` - Layer name prefix (default: "gat_attn")
- `:negative_slope` - LeakyReLU slope (default: 0.2)

## Returns

Axon node with attention coefficients `{batch, num_nodes, num_nodes}`.
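
A sketch of wiring the coefficients up for inspection (the input names and
shapes here are illustrative):

    nodes = Axon.input("nodes", shape: {nil, 32, 16})
    adjacency = Axon.input("adjacency", shape: {nil, 32, 32})

    # raw e_ij scores, e.g. to see which neighbors dominate a node
    attn = GAT.attention_coefficients(nodes, adjacency, 8, name: "attn_viz")
    {init_fn, predict_fn} = Axon.build(attn)
    # init and run as with any Axon model; output is {batch, 32, 32}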

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a Graph Attention Network.

Constructs a two-layer GAT with multi-head attention in the first layer
(heads concatenated) and single-head attention in the output layer
(heads averaged), following the original paper's design.

## Options

- `:input_dim` - Input feature dimension per node (required)
- `:hidden_size` - Hidden dimension per attention head (default: 8)
- `:num_heads` - Number of attention heads (default: 8)
- `:num_classes` - Number of output classes (required)
- `:activation` - Activation function (default: :elu)
- `:dropout` - Dropout rate for features and attention (default: 0.0)
- `:num_layers` - Number of GAT layers (default: 2)

## Returns

An Axon model with two inputs ("nodes" and "adjacency"). Output shape is
`{batch, num_nodes, num_classes}` for node classification.
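
A node-classification training sketch with Axon.Loop; the loss atom, the
Polaris optimizer, and the shape of `data` are assumptions for illustration,
not part of this module:

    model = GAT.build(input_dim: 16, hidden_size: 8, num_heads: 8, num_classes: 7)

    # `data` is assumed to yield {%{"nodes" => ..., "adjacency" => ...}, one_hot_targets}
    loop =
      Axon.Loop.trainer(model, :categorical_cross_entropy,
        Polaris.Optimizers.adam(learning_rate: 5.0e-3))

    trained_params = Axon.Loop.run(loop, data, %{}, epochs: 100)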

# `gat_layer`

```elixir
@spec gat_layer(Axon.t(), Axon.t(), pos_integer(), keyword()) :: Axon.t()
```

Single Graph Attention layer with multi-head attention.

Each attention head independently computes attention coefficients over
neighbors and produces an output. Heads are either concatenated (hidden
layers) or averaged (output layer).

## Parameters

- `nodes` - Node features Axon node `{batch, num_nodes, in_dim}`
- `adjacency` - Adjacency matrix Axon node `{batch, num_nodes, num_nodes}`
- `output_dim` - Output dimension per head
- `opts` - Options

## Options

- `:num_heads` - Number of attention heads (default: 8)
- `:name` - Layer name prefix (default: "gat")
- `:activation` - Activation function, nil for none (default: :elu)
- `:dropout` - Dropout rate (default: 0.0)
- `:concat_heads` - Concatenate heads (true) or average (false) (default: true)
- `:negative_slope` - LeakyReLU negative slope (default: 0.2)

## Returns

- If `:concat_heads` is true: `{batch, num_nodes, num_heads * output_dim}`
- If `:concat_heads` is false: `{batch, num_nodes, output_dim}`
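
A sketch of stacking two layers by hand, mirroring the hidden-concatenate /
output-average design that `build/1` follows (all dims illustrative):

    nodes = Axon.input("nodes", shape: {nil, 32, 16})
    adjacency = Axon.input("adjacency", shape: {nil, 32, 32})

    hidden = GAT.gat_layer(nodes, adjacency, 8, num_heads: 8, name: "gat1")
    # hidden: {batch, 32, 64} = 8 heads * 8 dims, concatenated

    output =
      GAT.gat_layer(hidden, adjacency, 7,
        num_heads: 1, concat_heads: false, activation: nil, name: "gat2")
    # output: {batch, 32, 7} per-node class scores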

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
