# `Edifice.Blocks.ALiBi`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/blocks/alibi.ex#L1)

Attention with Linear Biases (ALiBi).

Replaces positional embeddings with a simple linear bias added to attention
scores. Each attention head gets a different slope, creating head-specific
position sensitivity. ALiBi provides strong length extrapolation without
any learned position parameters.

## Formula

    attention(Q, K) = softmax(QK^T / sqrt(d) + m * distance_matrix)

where `m` is a head-specific slope and `distance_matrix[i, j] = -|i - j|`; with `:causal` set, only the lower triangle (positions `j <= i`) is used.
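
For concreteness, a minimal sketch of the distance matrix in Nx (a hedged illustration; the variable names are not part of this module's API):

```elixir
seq_len = 4
positions = Nx.iota({seq_len})

distances =
  positions
  |> Nx.new_axis(1)                          # query positions as a column
  |> Nx.subtract(Nx.new_axis(positions, 0))  # i - j
  |> Nx.abs()
  |> Nx.negate()

# distances[i][j] == -(|i - j|); row 0 is [0, -1, -2, -3]
```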

## Slope Schedule

Slopes are geometric: m_i = 2^(-8i/n_heads) for i = 1..n_heads.
Lower-indexed heads get steeper slopes (more local attention); higher-indexed heads
get gentler slopes (more global attention).
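
A hedged sketch of that schedule in Nx (assuming a recent Nx with `Nx.pow/2`; this is illustrative, not necessarily how `compute_slopes/1` is implemented):

```elixir
num_heads = 8

slopes =
  Nx.iota({num_heads})
  |> Nx.add(1)                    # i = 1..num_heads
  |> Nx.multiply(-8 / num_heads)  # exponent -8i/num_heads
  |> then(&Nx.pow(2, &1))         # m_i = 2^(-8i/num_heads)

# for 8 heads: [0.5, 0.25, 0.125, ..., 0.00390625]
```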

## Usage

    # Get ALiBi bias matrix for attention
    bias = ALiBi.compute_bias(seq_len: 128, num_heads: 8)

    # Add to attention scores before softmax
    scores = Nx.add(scores, bias)
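
A fuller, hedged sketch of wiring the bias into scaled dot-product attention with `Nx.Defn` (the module, function, and argument names are illustrative, not part of this library):

```elixir
defmodule Demo.ALiBiAttention do
  import Nx.Defn

  # q, k: [batch, num_heads, seq_len, head_dim]
  # bias: [num_heads, seq_len, seq_len] from ALiBi.compute_bias/1
  defn attention_weights(q, k, bias) do
    d = Nx.axis_size(q, -1)

    scores =
      q
      |> Nx.dot([3], [0, 1], k, [3], [0, 1])  # QK^T -> [batch, heads, seq, seq]
      |> Nx.divide(Nx.sqrt(d))
      |> Nx.add(bias)                         # ALiBi bias broadcasts over the batch axis

    # numerically stable softmax over the key dimension
    max = Nx.reduce_max(scores, axes: [-1], keep_axes: true)
    exp = Nx.exp(scores - max)
    exp / Nx.sum(exp, axes: [-1], keep_axes: true)
  end
end
```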

## References
- "Train Short, Test Long" (Press et al., 2022)
- https://arxiv.org/abs/2108.12409

# `compute_bias`

```elixir
@spec compute_bias(keyword()) :: Nx.Tensor.t()
```

Compute ALiBi bias matrix for a given sequence length and number of heads.

Returns a bias tensor of shape `[num_heads, seq_len, seq_len]` to add to attention scores.

## Options
  - `:seq_len` - Sequence length (required)
  - `:num_heads` - Number of attention heads (required)
  - `:causal` - Use causal (lower-triangular) distances (default: true)
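
A usage sketch (the shape follows the description above; exact values depend on the implementation):

```elixir
bias = Edifice.Blocks.ALiBi.compute_bias(seq_len: 128, num_heads: 8)
Nx.shape(bias)
#=> {8, 128, 128}
```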

# `compute_slopes`

```elixir
@spec compute_slopes(pos_integer()) :: Nx.Tensor.t()
```

Compute ALiBi slopes for each attention head.

Returns a tensor of shape `[num_heads]` containing the geometric slopes.
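
As a worked example of the schedule `m_i = 2^(-8i/n_heads)`, 4 heads give `2^(-2i)`, i.e. 1/4, 1/16, 1/64, 1/256 (a hedged sketch; the exact dtype depends on the implementation):

```elixir
slopes = Edifice.Blocks.ALiBi.compute_slopes(4)
Nx.to_flat_list(slopes)
#=> [0.25, 0.0625, 0.015625, 0.00390625]
```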

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
