Low-Rank Adaptation (LoRA) for parameter-efficient finetuning.
LoRA freezes the original model weights and injects trainable low-rank decomposition matrices into each layer. Instead of updating the full weight matrix W, LoRA learns a low-rank update:
output = Wx + (alpha/rank) * B(Ax)
where A is [input_size, rank] and B is [rank, output_size]. This reduces the number of trainable parameters by orders of magnitude while maintaining model quality.
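For example, a 768 x 768 dense layer has 589,824 weights, while a rank-8 adapter trains only 768 * 8 + 8 * 768 = 12,288 parameters (about 2%). The update itself is plain tensor algebra; the sketch below spells it out with Nx. The shapes and values are illustrative only and are not part of this module's API.
# Illustrative Nx sketch of the LoRA forward pass (not this module's API)
x = Nx.iota({1, 4}, type: :f32)       # input        [batch, input_size]
w = Nx.broadcast(0.5, {4, 4})         # frozen dense weight W  [input_size, output_size]
a = Nx.broadcast(0.01, {4, 2})        # trainable down-projection A  [input_size, rank]
b = Nx.broadcast(0.0, {2, 4})         # trainable up-projection B    [rank, output_size]
alpha = 16.0
rank = 2
delta = x |> Nx.dot(a) |> Nx.dot(b) |> Nx.multiply(alpha / rank)
output = Nx.add(Nx.dot(x, w), delta)  # Wx + (alpha/rank) * B(Ax)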
Architecture
Input x [batch, input_size]
   |
   +---> W * x (frozen)  [batch, output_size]
   |                                   |
   +---> A * x  [batch, rank]          |
   |                                   |
   v                                   |
 B * (A * x)  [batch, output_size]     |
   |                                   |
   v                                   v
 (alpha/rank) * B(Ax)    +           W * x
                         |
                         v
            Output [batch, output_size]
Usage
# Standalone LoRA layer
lora = LoRA.build(input_size: 768, output_size: 768, rank: 8, alpha: 16.0)
# Wrap an existing dense layer with LoRA
original = Axon.dense(input, 768, name: "layer")
adapted = LoRA.wrap(input, original, rank: 8, alpha: 16.0, name: "lora_layer")
References
- Hu et al., "LoRA: Low-Rank Adaptation of Large Language Models" (ICLR 2022)
- https://arxiv.org/abs/2106.09685
Summary
Functions
build(opts) - Build a standalone LoRA adapter layer.
lora_delta(input, output_size, opts) - Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).
output_size(opts) - Get the output size of a LoRA layer.
wrap(input, original, opts) - Wrap an existing dense layer with a LoRA adapter.
Types
@type build_opt() :: {:alpha, float()} | {:input_size, pos_integer()} | {:output_size, pos_integer()} | {:rank, pos_integer()}
Options for build/1.
Functions
build(opts)
Build a standalone LoRA adapter layer.
Computes (alpha/rank) * B(A(x)) where A down-projects to rank and
B up-projects back to output_size.
Options
- :input_size - Input dimension (required)
- :output_size - Output dimension (required)
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix (default: "lora")
Returns
An Axon model: [batch, input_size] -> [batch, output_size]
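As a sketch of how the returned model is materialized, the standard Axon build/init flow applies. This assumes the adapter exposes a single input so a bare template can be passed; the parameter shapes in the comment follow from the options above, but their exact names depend on the :name prefix.
lora = LoRA.build(input_size: 768, output_size: 768, rank: 8, alpha: 16.0)
{init_fn, _predict_fn} = Axon.build(lora)
params = init_fn.(Nx.template({1, 768}, :f32), %{})
# params holds the two low-rank matrices: A of shape {768, 8} and B of shape {8, 768}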
lora_delta(input, output_size, opts)
@spec lora_delta(Axon.t(), pos_integer(), keyword()) :: Axon.t()
Build a LoRA delta: the low-rank component (alpha/rank) * B(A(x)).
Parameters
- input - Axon input node
- output_size - Target output dimension
Options
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix
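A minimal sketch of composing the delta with a frozen layer by hand, using the standard Axon graph API. The input name, layer names, and dimensions here are illustrative assumptions, not requirements of this module.
input = Axon.input("features", shape: {nil, 768})
frozen = Axon.dense(input, 768, name: "frozen_dense")
delta = LoRA.lora_delta(input, 768, rank: 8, alpha: 16.0, name: "adapter")
adapted = Axon.add(frozen, delta)  # frozen output + (alpha/rank) * B(A(x))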
output_size(opts)
@spec output_size(keyword()) :: pos_integer()
Get the output size of a LoRA layer.
wrap(input, original, opts)
Wrap an existing dense layer with a LoRA adapter.
The output is the sum of the original (frozen) layer output and the
low-rank adaptation: original_output + (alpha/rank) * B(A(x)).
Parameters
- input - The Axon input node that feeds the original layer
- original - The original Axon dense layer output
Options
- :rank - Low-rank dimension (default: 8)
- :alpha - Scaling factor (default: 16.0)
- :name - Layer name prefix (default: "lora")
Returns
An Axon node with the adapted output.
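A hedged end-to-end sketch of wrapping and running a layer, assuming a single named input and the standard Axon build/predict workflow; the input name and shapes are illustrative.
input = Axon.input("features", shape: {nil, 768})
original = Axon.dense(input, 768, name: "layer")
adapted = LoRA.wrap(input, original, rank: 8, alpha: 16.0, name: "lora_layer")
{init_fn, predict_fn} = Axon.build(adapted)
params = init_fn.(Nx.template({1, 768}, :f32), %{})
output = predict_fn.(params, Nx.iota({1, 768}, type: :f32))
# output has shape {1, 768}; during finetuning only the LoRA parameters are trained,
# while the original "layer" weights stay frozen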