# `Edifice.Meta.DoRA`
[🔗](https://github.com/blasphemetheus/edifice/blob/main/lib/edifice/meta/dora.ex#L1)

DoRA: Weight-Decomposed Low-Rank Adaptation.

Implements DoRA from "DoRA: Weight-Decomposed Low-Rank Adaptation of Large
Language Models" (Liu et al., 2024). DoRA decomposes pretrained weights into
magnitude and direction components, then applies LoRA only to the direction.

## Key Innovation: Magnitude-Direction Decomposition

Standard LoRA modifies the full weight: `W' = W + BA`

DoRA decomposes W into magnitude m and direction V:
```
W = m * (V / ||V||)
```

Then applies LoRA only to the direction component:
```
W' = m * ((V + BA) / ||V + BA||)
```

Where:
- `m` is a learnable magnitude vector [output_size]
- `V` is the original weight direction
- `BA` is the standard LoRA low-rank update
- `||.||` is the column-wise L2 norm, so dividing by it gives each column of the direction unit length
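
A minimal Nx sketch of this identity, with toy values (`w`, `v`, `m` are illustrative names, not the module's parameter names):

```elixir
# Column-wise decomposition: W = m * (V / ||V||), with V initialized to W.
w = Nx.tensor([[3.0, 0.0], [4.0, 1.0]])

# Column-wise L2 norm -> one magnitude per output feature, shape {1, 2}.
m = Nx.sqrt(Nx.sum(Nx.multiply(w, w), axes: [0], keep_axes: true))
# => [[5.0, 1.0]]

# V starts at W, so ||V|| equals m and W is recovered exactly.
v = w
w_reconstructed = Nx.multiply(m, Nx.divide(v, m))   # equals w
```

After this split, the low-rank update touches only `v`, while `m` is trained directly, so an update's magnitude is decoupled from its direction.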

## Why This Works

Separating magnitude from direction gives two benefits:
1. **Direction** captures "what" features are important (adapted by LoRA)
2. **Magnitude** captures "how much" each feature matters (learned separately)

This split mirrors weight normalization, which is known to improve optimization.

## Architecture

```
Input x [batch, input_size]
      |
      +---> V * x                          (frozen base direction)
      |       |
      +---> A * x -> B * (A * x)           (LoRA delta)
              |
      V*x + (alpha/rank) * B(A(x))         (direction update)
              |
      normalize(...)                       (unit direction)
              |
      m * normalized                       (apply magnitude)
              |
              v
Output [batch, output_size]
```
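
A plain-Nx sketch of this pass in the weight-space form above (a hypothetical helper, not the module's layer; assumed shapes: `v` `[input, output]`, `a` `[input, rank]`, `b` `[rank, output]`, `m` `[1, output]`):

```elixir
defmodule DoRASketch do
  # W' = m * ((V + scale * BA) / ||V + scale * BA||), applied to x.
  def forward(x, v, a, b, m, alpha, rank) do
    # Low-rank direction update in weight space (the doc's "BA" term).
    dir = Nx.add(v, Nx.multiply(alpha / rank, Nx.dot(a, b)))

    # Column-wise unit normalization, then rescale by the magnitude vector.
    norm = Nx.sqrt(Nx.sum(Nx.multiply(dir, dir), axes: [0], keep_axes: true))
    w = Nx.multiply(m, Nx.divide(dir, norm))

    Nx.dot(x, w)
  end
end
```

Only `m`, `a`, and `b` would be trainable here; `v` stays frozen at the pretrained weight.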

## LoRA+ Note

LoRA+ (Hayou et al., 2024) proposes different learning rates for the A and B
matrices. This is a training configuration choice rather than an architectural
one: use a higher learning rate for B (e.g., 5-10x) than for A. We document
this recommendation but don't enforce it in the graph structure.
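
As a toy illustration, a hand-rolled SGD step where B's learning rate is a fixed multiple of A's (a hypothetical helper, not part of this module or any optimizer API):

```elixir
# Toy SGD step sketching LoRA+ (illustrative only): B steps faster than A.
lora_plus_step = fn params, grads, lr_a ->
  lr_b = 8 * lr_a   # LoRA+ suggests roughly 5-10x the A learning rate
  %{
    a: Nx.subtract(params.a, Nx.multiply(lr_a, grads.a)),
    b: Nx.subtract(params.b, Nx.multiply(lr_b, grads.b))
  }
end
```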

## Usage

    # Standalone DoRA layer
    dora = DoRA.build(input_size: 768, output_size: 768, rank: 8)

    # Wrap an existing layer with DoRA
    adapted = DoRA.wrap(input, original, output_size: 768, rank: 8, name: "dora_attn")
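
To run the standalone adapter, the usual Axon build/init/predict flow should
apply (a sketch; the zero input is only for shape-checking):

    {init_fn, predict_fn} = Axon.build(dora)
    params = init_fn.(Nx.template({1, 768}, :f32), %{})

    x = Nx.broadcast(0.0, {1, 768})
    y = predict_fn.(params, x)   # shape {1, 768}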

## References

- Liu et al., "DoRA: Weight-Decomposed Low-Rank Adaptation of Large Language Models" (2024). https://arxiv.org/abs/2402.09353
- Hayou et al., "LoRA+: Efficient Low Rank Adaptation of Large Models" (2024)

# `build_opt`

```elixir
@type build_opt() ::
  {:alpha, float()}
  | {:input_size, pos_integer()}
  | {:output_size, pos_integer()}
  | {:rank, pos_integer()}
```

Options for `build/1`.

# `build`

```elixir
@spec build([build_opt()]) :: Axon.t()
```

Build a standalone DoRA adapter layer.

Computes weight-decomposed adaptation: `m * normalize(V*x + (alpha/rank)*B(A(x)))`.

## Options

- `:input_size` - Input dimension (required)
- `:output_size` - Output dimension (required)
- `:rank` - Low-rank dimension (default: 8)
- `:alpha` - LoRA scaling factor (default: 16.0)
- `:name` - Layer name prefix (default: "dora")

## Returns

An Axon model: `[batch, input_size]` -> `[batch, output_size]`

# `dora_layer`

```elixir
@spec dora_layer(Axon.t(), pos_integer(), pos_integer(), keyword()) :: Axon.t()
```

Build a DoRA layer inline (for use in custom architectures).

## Parameters

- `input` - Axon input node
- `input_size` - Input dimension
- `output_size` - Output dimension

## Options

- `:rank` - Low-rank dimension (default: 8)
- `:alpha` - LoRA scaling factor (default: 16.0)
- `:name` - Layer name prefix (default: "dora")
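
For instance, a hedged sketch of dropping it into a custom graph (names and sizes are illustrative):

```elixir
input = Axon.input("features", shape: {nil, 256})

model =
  input
  |> DoRA.dora_layer(256, 128, rank: 4, name: "dora_proj")
  |> Axon.activation(:gelu)
```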

# `output_size`

```elixir
@spec output_size(keyword()) :: pos_integer()
```

Get the output size of a DoRA layer.
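
Presumably this reads the size straight from the option list (an assumption, shown only for orientation):

```elixir
DoRA.output_size(output_size: 512)
# => 512 (assumed behavior)
```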

# `recommended_defaults`

```elixir
@spec recommended_defaults() :: keyword()
```

Get recommended defaults.
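
One hedged way to use it: merge the defaults with the task-specific sizes before building (this assumes the returned keywords are valid `build/1` options):

```elixir
opts = Keyword.merge(DoRA.recommended_defaults(), input_size: 768, output_size: 768)
dora = DoRA.build(opts)
```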

# `wrap`

```elixir
@spec wrap(Axon.t(), Axon.t(), keyword()) :: Axon.t()
```

Wrap an existing dense layer with DoRA adaptation.

## Parameters

- `input` - The Axon input node
- `original` - The original Axon dense layer output

## Options

- `:output_size` - Output dimension (required)
- `:rank` - Low-rank dimension (default: 8)
- `:alpha` - Scaling factor (default: 16.0)
- `:name` - Layer name prefix (default: "dora")
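
A hedged end-to-end sketch (the dense layer and its name are illustrative):

```elixir
input = Axon.input("hidden", shape: {nil, 768})
original = Axon.dense(input, 768, name: "attn_proj")

adapted = DoRA.wrap(input, original, output_size: 768, rank: 8, name: "dora_attn")
```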

---

*Consult [api-reference.md](api-reference.md) for complete listing*
