Loads a lmstudio-community/Qwen3-*-MLX-4bit checkpoint from disk into
an %EMLXAxon.Qwen3.Model.State{} struct.
Each linear weight is stored as a triplet in the safetensors file:
  <name>.weight - :u32, packed int4 data
  <name>.scales - :bf16 or :f16
  <name>.biases - same dtype as scales
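To make the triplet concrete, here is a minimal pure-Elixir sketch of how one group of packed values could be turned back into floats. It assumes MLX-style affine quantization: eight 4-bit values per u32 word with the lowest nibble first, one scale and one bias per group, and a group size of 64. None of these details are stated on this page, and `Int4Sketch` is a hypothetical module that is not part of the loader.

```elixir
defmodule Int4Sketch do
  import Bitwise

  # Split one u32 word into eight 4-bit values.
  # Assumption: the lowest nibble holds the first element.
  def unpack_word(word) when is_integer(word) and word >= 0 do
    for i <- 0..7, do: (word >>> (4 * i)) &&& 0xF
  end

  # Dequantize a flat list of packed words.
  # Assumption: each group of `group_size` values shares one scale and one
  # bias, and the original value is approximated by q * scale + bias.
  def dequantize(words, scales, biases, group_size \\ 64) do
    words
    |> Enum.flat_map(&unpack_word/1)
    |> Enum.chunk_every(group_size)
    |> Enum.zip(Enum.zip(scales, biases))
    |> Enum.flat_map(fn {group, {scale, bias}} ->
      Enum.map(group, fn q -> q * scale + bias end)
    end)
  end
end
```

In the loader itself this arithmetic is not done in Elixir; the packed tensor is handed to the backend as-is, and the sketch only illustrates what the three stored arrays represent.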
The loader constructs an annotated Nx.Tensor via
EMLX.Quantization.quantized_tensor/5 so plain Nx.dot(act, weight, …)
dispatches transparently to EMLX.quantized_matmul via the backend.
Usage
{:ok, state} = Loader.load("~/models/Qwen3-0.6B-MLX-4bit")
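A slightly more defensive call might look like the sketch below. It expands the path first, because Elixir's `File` functions do not expand `~` themselves and this page does not say whether `load/1` does. The fully qualified name `EMLXAxon.Qwen3.Loader` is an assumption, and `LoadSketch` is a hypothetical helper.

```elixir
defmodule LoadSketch do
  # `loader` defaults to the documented load/1.
  # Assumption: the module's full name is EMLXAxon.Qwen3.Loader.
  def load_checkpoint(dir, loader \\ &EMLXAxon.Qwen3.Loader.load/1) do
    dir
    # Turns "~/models/..." into an absolute path.
    |> Path.expand()
    |> loader.()
    |> case do
      {:ok, state} -> {:ok, state}
      # Tag the failure so callers can tell it came from checkpoint loading.
      {:error, reason} -> {:error, {:load_failed, reason}}
    end
  end
end
```

Taking the loader as an argument keeps the helper testable without a model on disk.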
Types
@type config() :: %{
  hidden_size: pos_integer(),
  intermediate_size: pos_integer(),
  num_attention_heads: pos_integer(),
  num_key_value_heads: pos_integer(),
  head_dim: pos_integer(),
  num_hidden_layers: pos_integer(),
  vocab_size: pos_integer(),
  rms_norm_eps: float(),
  rope_theta: float(),
  tie_word_embeddings: boolean()
}
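As an illustration of this shape, the sketch below builds a `config()` map from an already decoded, string-keyed `config.json` map. It assumes the JSON keys have the same names as the fields above, and it coerces the two float fields because JSON files often store values such as `rope_theta` as integers. `ConfigSketch` is hypothetical, and the loader's real parsing code may differ.

```elixir
defmodule ConfigSketch do
  @int_keys ~w(hidden_size intermediate_size num_attention_heads
               num_key_value_heads head_dim num_hidden_layers vocab_size)
  @float_keys ~w(rms_norm_eps rope_theta)

  # `raw` is a decoded JSON object with string keys.
  # Map.fetch!/2 raises on a missing key instead of guessing a default.
  def from_json(%{} = raw) do
    ints = Map.new(@int_keys, fn k -> {String.to_atom(k), Map.fetch!(raw, k)} end)

    # Multiplying by 1.0 turns an integer such as 1000000 into 1.0e6.
    floats = Map.new(@float_keys, fn k -> {String.to_atom(k), Map.fetch!(raw, k) * 1.0} end)

    ints
    |> Map.merge(floats)
    |> Map.put(:tie_word_embeddings, Map.fetch!(raw, "tie_word_embeddings"))
  end
end
```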
Functions
@spec load(Path.t()) :: {:ok, EMLXAxon.Qwen3.Model.State.t()} | {:error, term()}
Loads a Qwen3 MLX-4bit checkpoint directory.
Reads config.json to extract model dimensions, then loads all
.safetensors shards found in the directory.
Returns {:ok, %State{}} or {:error, reason}.
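The directory scan described above can be sketched with the standard `Path` and `File` modules. This is only an approximation of what the loader might do: the error atoms, the sorted shard order, and the `ShardSketch` module are all assumptions made for illustration.

```elixir
defmodule ShardSketch do
  # Checks that `dir` contains config.json and at least one .safetensors
  # file, and returns their paths.
  def inspect_dir(dir) do
    dir = Path.expand(dir)
    config = Path.join(dir, "config.json")

    # Sort so that shards are visited in a deterministic order.
    shards =
      dir
      |> Path.join("*.safetensors")
      |> Path.wildcard()
      |> Enum.sort()

    cond do
      not File.regular?(config) -> {:error, :missing_config}
      shards == [] -> {:error, :no_shards}
      true -> {:ok, %{config: config, shards: shards}}
    end
  end
end
```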