Edifice.Meta.LoRA (Edifice v0.2.0)

Low-Rank Adaptation (LoRA) for parameter-efficient finetuning.

LoRA freezes the original model weights and injects trainable low-rank decomposition matrices into each layer. Instead of updating the full weight matrix W, LoRA learns a low-rank update:

output = Wx + (alpha/rank) * B(Ax)

where A is [input_size, rank] and B is [rank, output_size]. This reduces the number of trainable parameters by orders of magnitude while maintaining model quality.
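
For a sense of scale, take a hypothetical 768-by-768 projection with rank 8 (plain arithmetic, not output from the library):

full_update = 768 * 768          #=> 589_824 trainable parameters
lora_update = 768 * 8 + 8 * 768  #=> 12_288 trainable parameters (about 48x fewer for this single layer)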

Architecture

Input x [batch, input_size]
      |
      +---> W * x (frozen)                          [batch, output_size]
      |
      +---> A * x                                   [batch, rank]
              |
              v
            B * (A * x)                             [batch, output_size]
              |
              v
Output = W*x + (alpha/rank) * B(A(x))               [batch, output_size]
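
For concreteness, the same computation expressed directly in Nx. This is a numerical sketch, not the module's implementation: w is the frozen weight, a and b are the trainable low-rank factors.

defmodule LoRAForwardSketch do
  import Nx.Defn

  # x: [batch, input_size], w: [input_size, output_size],
  # a: [input_size, rank],  b: [rank, output_size]
  defn forward(x, w, a, b, opts \\ []) do
    opts = keyword!(opts, alpha: 16.0, rank: 8)
    frozen = Nx.dot(x, w)                # frozen path: [batch, output_size]
    delta = x |> Nx.dot(a) |> Nx.dot(b)  # low-rank path: [batch, rank] -> [batch, output_size]
    frozen + delta * (opts[:alpha] / opts[:rank])
  end
end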

Usage

alias Edifice.Meta.LoRA

# Standalone LoRA layer
lora = LoRA.build(input_size: 768, output_size: 768, rank: 8, alpha: 16.0)

# Wrap an existing dense layer with LoRA
input = Axon.input("input", shape: {nil, 768})
original = Axon.dense(input, 768, name: "layer")
adapted = LoRA.wrap(input, original, rank: 8, alpha: 16.0, name: "lora_layer")
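
As a quick check, the adapted node can be built and run like any other Axon model. The sketch below uses the standard Axon build/init/predict flow; the input name "input" comes from the snippet above, and the exact init signature may vary slightly across Axon versions.

{init_fn, predict_fn} = Axon.build(adapted)
template = %{"input" => Nx.template({1, 768}, :f32)}
params = init_fn.(template, %{})
y = predict_fn.(params, %{"input" => Nx.broadcast(0.5, {1, 768})})
Nx.shape(y)  #=> {1, 768}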

References
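
  • Hu et al. (2021), "LoRA: Low-Rank Adaptation of Large Language Models", https://arxiv.org/abs/2106.09685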

Summary

Types

build_opt()
Options for build/1.

Functions

build(opts \\ [])
Build a standalone LoRA adapter layer.

lora_delta(input, output_size, opts \\ [])
Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).

output_size(opts \\ [])
Get the output size of a LoRA layer.

wrap(input, original, opts \\ [])
Wrap an existing dense layer with a LoRA adapter.

Types

build_opt()

@type build_opt() ::
  {:alpha, float()}
  | {:input_size, pos_integer()}
  | {:output_size, pos_integer()}
  | {:rank, pos_integer()}

Options for build/1.

Functions

build(opts \\ [])

@spec build([build_opt()]) :: Axon.t()

Build a standalone LoRA adapter layer.

Computes (alpha/rank) * B(A(x)) where A down-projects to rank and B up-projects back to output_size.

Options

  • :input_size - Input dimension (required)
  • :output_size - Output dimension (required)
  • :rank - Low-rank dimension (default: 8)
  • :alpha - Scaling factor (default: 16.0)
  • :name - Layer name prefix (default: "lora")

Returns

An Axon model: [batch, input_size] -> [batch, output_size]

lora_delta(input, output_size, opts \\ [])

@spec lora_delta(Axon.t(), pos_integer(), keyword()) :: Axon.t()

Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).

Parameters

  • input - Axon input node
  • output_size - Target output dimension

Options

  • :rank - Low-rank dimension (default: 8)
  • :alpha - Scaling factor (default: 16.0)
  • :name - Layer name prefix
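
For illustration, lora_delta/3 is the building block behind wrap/3: the delta can be added onto any frozen projection manually. A minimal sketch, with illustrative layer names:

input = Axon.input("input", shape: {nil, 768})
frozen = Axon.dense(input, 768, name: "frozen_proj")
delta = Edifice.Meta.LoRA.lora_delta(input, 768, rank: 8, alpha: 16.0, name: "adapter")
adapted = Axon.add(frozen, delta)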

output_size(opts \\ [])

@spec output_size(keyword()) :: pos_integer()

Get the output size of a LoRA layer.

wrap(input, original, opts \\ [])

@spec wrap(Axon.t(), Axon.t(), keyword()) :: Axon.t()

Wrap an existing dense layer with a LoRA adapter.

The output is the sum of the original (frozen) layer output and the low-rank adaptation: original_output + (alpha/rank) * B(A(x)).

Parameters

  • input - The Axon input node that feeds the original layer
  • original - The original dense layer (the Axon node returned by, e.g., Axon.dense/3)

Options

  • :rank - Low-rank dimension (default: 8)
  • :alpha - Scaling factor (default: 16.0)
  • :name - Layer name prefix (default: "lora")

Returns

An Axon node with the adapted output.
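
A hedged sketch of training the adapted node with a standard Axon training loop (assumes Axon with Polaris optimizers; train_data, the loss, and the optimizer settings are placeholders, and keeping the original weights frozen during training is the caller's responsibility unless the module handles it):

input = Axon.input("input", shape: {nil, 768})
original = Axon.dense(input, 768, name: "layer")
model = Edifice.Meta.LoRA.wrap(input, original, rank: 8, alpha: 16.0)

# train_data is a placeholder Enumerable of {input, target} batches
loop = Axon.Loop.trainer(model, :mean_squared_error, Polaris.Optimizers.adam(learning_rate: 1.0e-3))
trained_state = Axon.Loop.run(loop, train_data, %{}, epochs: 3)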