# emlx_axon v0.3.0 - Table of Contents

> Axon model rewrites to swap supported nodes for EMLX.Fast Metal shaders

## Pages

- [EMLXAxon](readme.md)

## Modules

- [EMLXAxon](EMLXAxon.md): Axon model rewrites that swap supported nodes to `EMLX.Fast` Metal shaders.
- [EMLXAxon.MLX4BitParams](EMLXAxon.MLX4BitParams.md): Loads Qwen3 weights from an MLX-4bit safetensors checkpoint into Bumblebee Axon params format (BF16, Bumblebee `{in, out}` key convention).
- [EMLXAxon.QuantizeParams](EMLXAxon.QuantizeParams.md): Post-load param quantization for Bumblebee models.
- [EMLXAxon.Qwen3.Attention](EMLXAxon.Qwen3.Attention.md): Grouped-query attention (GQA) for Qwen3, with a preallocated KV cache.
- [EMLXAxon.Qwen3.Generate](EMLXAxon.Qwen3.Generate.md): Autoregressive token generation loop.
- [EMLXAxon.Qwen3.Layers](EMLXAxon.Qwen3.Layers.md): Stateless layer primitives: RMSNorm, SwiGLU.
- [EMLXAxon.Qwen3.Loader](EMLXAxon.Qwen3.Loader.md): Loads a `lmstudio-community/Qwen3-*-MLX-4bit` checkpoint from disk into an `%EMLXAxon.Qwen3.Model.State{}` struct.
- [EMLXAxon.Qwen3.Model](EMLXAxon.Qwen3.Model.md): Qwen3 quantized model state struct and forward pass.
- [EMLXAxon.Qwen3.Model.State](EMLXAxon.Qwen3.Model.State.md): Loaded model weights and config.
- [EMLXAxon.Qwen3.Sampler](EMLXAxon.Qwen3.Sampler.md): Three sampling strategies for autoregressive token generation.
- [EMLXAxon.TextGeneration](EMLXAxon.TextGeneration.md): A `Nx.Serving`-compatible wrapper around the native Qwen3 quantized model.