EMLXAxon.Qwen3.Loader (emlx_axon v0.3.0)


Loads a lmstudio-community/Qwen3-*-MLX-4bit checkpoint from disk into an %EMLXAxon.Qwen3.Model.State{} struct.

Each linear weight is stored as a triplet in the safetensors file:

  • <name>.weight: :u32, packed int4 data
  • <name>.scales: :bf16 or :f16
  • <name>.biases: same dtype as scales
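To make the triplet concrete, here is a minimal sketch of how one packed word could be unpacked and dequantized. It assumes the usual MLX-style affine scheme, w = scale * q + bias, with eight 4-bit values per 32-bit word stored lowest nibble first, and one scale/bias pair shared by a group of consecutive weights. The module Int4Sketch and its functions are hypothetical and are not part of emlx_axon; the loader itself keeps the data packed.

```elixir
defmodule Int4Sketch do
  import Bitwise

  # Split one 32-bit word into eight 4-bit values, lowest nibble first.
  # The nibble order is an assumption of this sketch.
  def unpack_u32(word) when is_integer(word) and word >= 0 and word <= 0xFFFFFFFF do
    for i <- 0..7 do
      (word >>> (4 * i)) &&& 0xF
    end
  end

  # Affine dequantization of one word: w = scale * q + bias for each nibble q.
  # In a real checkpoint the scale and bias come from the .scales and .biases
  # tensors for the group this word belongs to.
  def dequantize(word, scale, bias) do
    word
    |> unpack_u32()
    |> Enum.map(fn q -> scale * q + bias end)
  end
end

Int4Sketch.unpack_u32(0x76543210)
# => [0, 1, 2, 3, 4, 5, 6, 7]
```

Each quantized value falls in 0..15, so the scale sets the step size and the bias sets the offset of the group.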

The loader constructs an annotated Nx.Tensor from each triplet via EMLX.Quantization.quantized_tensor/5, so a plain Nx.dot(act, weight, …) call is dispatched transparently by the backend to EMLX.quantized_matmul.

Usage

alias EMLXAxon.Qwen3.Loader
{:ok, state} = Loader.load(Path.expand("~/models/Qwen3-0.6B-MLX-4bit"))

Summary

Functions

load(path)

Loads a Qwen3 MLX-4bit checkpoint directory.

Types

config()

@type config() :: %{
  hidden_size: pos_integer(),
  intermediate_size: pos_integer(),
  num_attention_heads: pos_integer(),
  num_key_value_heads: pos_integer(),
  head_dim: pos_integer(),
  num_hidden_layers: pos_integer(),
  vocab_size: pos_integer(),
  rms_norm_eps: float(),
  rope_theta: float(),
  tie_word_embeddings: boolean()
}
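
As an illustration of how a map of this shape might be produced, the sketch below converts an already-decoded config.json (string keys) into the atom-keyed form. It assumes the JSON keys carry the same names as the fields above, which is the usual Hugging Face convention. ConfigSketch is a hypothetical module, not the loader's actual code, and the sample values are for illustration only and were not read from a real checkpoint.

```elixir
defmodule ConfigSketch do
  @int_keys ~w(hidden_size intermediate_size num_attention_heads
               num_key_value_heads head_dim num_hidden_layers vocab_size)a
  @float_keys ~w(rms_norm_eps rope_theta)a

  # Builds the atom-keyed config map from a decoded config.json.
  # Raises KeyError if a required key is missing.
  def from_decoded(raw) when is_map(raw) do
    ints = Map.new(@int_keys, fn key -> {key, fetch!(raw, key)} end)

    # JSON may encode a value such as rope_theta as an integer;
    # multiplying by 1.0 yields the float the type expects.
    floats = Map.new(@float_keys, fn key -> {key, fetch!(raw, key) * 1.0} end)

    ints
    |> Map.merge(floats)
    |> Map.put(:tie_word_embeddings, fetch!(raw, :tie_word_embeddings))
  end

  defp fetch!(raw, key), do: Map.fetch!(raw, Atom.to_string(key))
end

# Illustrative values only.
raw = %{
  "hidden_size" => 1024,
  "intermediate_size" => 3072,
  "num_attention_heads" => 16,
  "num_key_value_heads" => 8,
  "head_dim" => 128,
  "num_hidden_layers" => 28,
  "vocab_size" => 151_936,
  "rms_norm_eps" => 1.0e-6,
  "rope_theta" => 1_000_000,
  "tie_word_embeddings" => true
}

config = ConfigSketch.from_decoded(raw)
is_float(config.rope_theta)
# => true
```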

Functions

load(path)

@spec load(Path.t()) :: {:ok, EMLXAxon.Qwen3.Model.State.t()} | {:error, term()}

Loads a Qwen3 MLX-4bit checkpoint directory.

Reads config.json to extract model dimensions, then loads all .safetensors shards found in the directory.
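
The directory scan described above could look roughly like the following sketch, which uses only the Elixir standard library. ShardSketch and its error atoms are hypothetical; the real loader may order shards, validate files, or report errors differently.

```elixir
defmodule ShardSketch do
  # Returns {:ok, sorted_shard_paths} for a checkpoint directory, or an
  # error tuple when config.json or the .safetensors files are missing.
  def discover(dir) do
    dir = Path.expand(dir)

    shards =
      dir
      |> Path.join("*.safetensors")
      |> Path.wildcard()
      |> Enum.sort()

    cond do
      not File.regular?(Path.join(dir, "config.json")) -> {:error, :missing_config}
      shards == [] -> {:error, :no_safetensors_found}
      true -> {:ok, shards}
    end
  end
end
```

Sorting the paths keeps the load order deterministic when a checkpoint is split across several shard files.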

Returns {:ok, %State{}} or {:error, reason}.
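
A caller would typically match on both return shapes. The sketch below is one way to do that; LoadSketch is a hypothetical wrapper, and the loader argument exists only so the example can be exercised without a checkpoint on disk. By default it calls the documented load/1.

```elixir
defmodule LoadSketch do
  # Calls the loader and turns an error reason into a readable message.
  # The shape of `reason` is not specified here, so it is only inspected.
  def load_or_explain(path, loader \\ &EMLXAxon.Qwen3.Loader.load/1) do
    case loader.(Path.expand(path)) do
      {:ok, state} ->
        {:ok, state}

      {:error, reason} ->
        {:error, "could not load #{path}: #{inspect(reason)}"}
    end
  end
end
```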