Low-Rank Adaptation (LoRA) for parameter-efficient finetuning.
LoRA freezes the original model weights and injects trainable low-rank decomposition matrices into each layer. Instead of updating the full weight matrix W, LoRA learns a low-rank update:
output = Wx + (alpha/rank) * B(Ax)
where A is [input_size, rank] and B is [rank, output_size]. This reduces the number of trainable parameters by orders of magnitude while maintaining model quality.
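For example, a 768 x 768 dense layer has 589,824 weights, while a rank-8 adapter trains only 768 * 8 + 8 * 768 = 12,288 parameters (about 2%). The update itself is plain tensor algebra; the sketch below spells it out with Nx. The shapes and values are illustrative only and are not part of this module's API.
# Illustrative Nx sketch of the LoRA forward pass (not this module's API)
x = Nx.iota({1, 4}, type: :f32)       # input        [batch, input_size]
w = Nx.broadcast(0.5, {4, 4})         # frozen dense weight W  [input_size, output_size]
a = Nx.broadcast(0.01, {4, 2})        # trainable down-projection A  [input_size, rank]
b = Nx.broadcast(0.0, {2, 4})         # trainable up-projection B    [rank, output_size]
alpha = 16.0
rank = 2
delta = x |> Nx.dot(a) |> Nx.dot(b) |> Nx.multiply(alpha / rank)
output = Nx.add(Nx.dot(x, w), delta)  # Wx + (alpha/rank) * B(Ax)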
Architecture
Input x [batch, input_size]
   |
   +---> W * x (frozen)  [batch, output_size]
   |                                   |
   +---> A * x  [batch, rank]          |
   |                                   |
   v                                   |
 B * (A * x)  [batch, output_size]     |
   |                                   |
   v                                   v
 (alpha/rank) * B(Ax)    +           W * x
                         |
                         v
            Output [batch, output_size]
Usage
# Standalone LoRA layer
lora = LoRA.build(input_size: 768, output_size: 768, rank: 8, alpha: 16.0)
# Wrap an existing dense layer with LoRA
original = Axon.dense(input, 768, name: "layer")
adapted = LoRA.wrap(input, original, rank: 8, alpha: 16.0, name: "lora_layer")
References
- Hu et al., "LoRA: Low-Rank Adaptation of Large Language Models" (ICLR 2022)
- https://arxiv.org/abs/2106.09685
Summary
Functions
build(opts) - Build a standalone LoRA adapter layer.
lora_delta(input, output_size, opts) - Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).
output_size(opts) - Get the output size of a LoRA layer.
wrap(input, original, opts) - Wrap an existing dense layer with a LoRA adapter.
Types
@type build_opt() :: {:alpha, float()} | {:input_size, pos_integer()} | {:output_size, pos_integer()} | {:rank, pos_integer()}
Options for build/1.
Functions
build(opts)
Build a standalone LoRA adapter layer.
Computes (alpha/rank) * B(A(x)) where A down-projects to rank and
B up-projects back to output_size.
Options
- :input_size - Input dimension (required)
- :output_size - Output dimension (required)
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix (default: "lora")
Returns
An Axon model: [batch, input_size] -> [batch, output_size]
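As a sketch of how the returned model is materialized, the standard Axon build/init flow applies. This assumes the adapter exposes a single input so a bare template can be passed; the parameter shapes in the comment follow from the options above, but their exact names depend on the :name prefix.
lora = LoRA.build(input_size: 768, output_size: 768, rank: 8, alpha: 16.0)
{init_fn, _predict_fn} = Axon.build(lora)
params = init_fn.(Nx.template({1, 768}, :f32), %{})
# params holds the two low-rank matrices: A of shape {768, 8} and B of shape {8, 768}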
lora_delta(input, output_size, opts)
@spec lora_delta(Axon.t(), pos_integer(), keyword()) :: Axon.t()
Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).
Parameters
- input - Axon input node
- output_size - Target output dimension
Options
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix
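A minimal sketch of composing the delta with a frozen layer by hand, using the standard Axon graph API. The input name, layer names, and dimensions here are illustrative assumptions, not requirements of this module.
input = Axon.input("features", shape: {nil, 768})
frozen = Axon.dense(input, 768, name: "frozen_dense")
delta = LoRA.lora_delta(input, 768, rank: 8, alpha: 16.0, name: "adapter")
adapted = Axon.add(frozen, delta)  # frozen output + (alpha/rank) * B(A(x))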
output_size(opts)
@spec output_size(keyword()) :: pos_integer()
Get the output size of a LoRA layer.
wrap(input, original, opts)
Wrap an existing dense layer with a LoRA adapter.
The output is the sum of the original (frozen) layer output and the
low-rank adaptation: original_output + (alpha/rank) * B(A(x)).
Parameters
- input - The Axon input node that feeds the original layer
- original - The original Axon dense layer output
Options
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix (default: "lora")
Returns
An Axon node with the adapted output.
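A hedged end-to-end sketch of wrapping and running a layer, assuming a single named input and the standard Axon build/predict workflow; the input name and shapes are illustrative.
input = Axon.input("features", shape: {nil, 768})
original = Axon.dense(input, 768, name: "layer")
adapted = LoRA.wrap(input, original, rank: 8, alpha: 16.0, name: "lora_layer")
{init_fn, predict_fn} = Axon.build(adapted)
params = init_fn.(Nx.template({1, 768}, :f32), %{})
output = predict_fn.(params, Nx.iota({1, 768}, type: :f32))
# output has shape {1, 768}; during finetuning only the LoRA parameters are trained,
# while the original "layer" weights stay frozen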