# `HuggingfaceClient.Hub.Compute.Training`
[🔗](https://github.com/huggingface/huggingface_client/blob/v0.1.0/lib/huggingface_client/hub/compute/training.ex#L1)

HuggingFace Training Stack — configuration helpers for fine-tuning and training.

Provides structured configuration builders for:
- **AutoTrain** — no-code training via the AutoTrain API
- **Trainer API** — Transformers Trainer configuration (TrainingArguments)
- **Fine-tuning techniques** — LoRA, PEFT, full fine-tuning, DPO, ORPO, SFT

These helpers generate configuration maps that can be:
1. Passed to `HuggingfaceClient.autotrain_create/1` to launch training on HF infra
2. Passed to `HuggingfaceClient.run_job/1` to run a custom training script
3. Serialized to JSON/YAML for use in local training scripts

## Example

    # Launch LoRA fine-tuning via AutoTrain
    config = HuggingfaceClient.Training.lora_config(
      base_model: "meta-llama/Llama-3.1-8B",
      dataset: "my-org/my-dataset",
      rank: 16, alpha: 32, epochs: 3
    )

    {:ok, project} = HuggingfaceClient.autotrain_create(
      Map.merge(config, %{project_name: "my-lora-model", access_token: token})
    )

    # Or run a training job on GPU infra
    {:ok, job} = HuggingfaceClient.run_job(
      image: "huggingface/transformers-pytorch-gpu:latest",
      command: ["python", "train.py"] ++ HuggingfaceClient.Training.to_args(config),
      flavor: "a10g-small",
      access_token: token
    )

# `accelerate_config`

```elixir
@spec accelerate_config(keyword()) :: map()
```

Builds an Accelerate launch configuration for distributed training.

This mirrors `accelerate config` / `accelerate launch` parameters.

## Options

- `:num_processes` — total number of processes (GPUs) (default: 1)
- `:num_machines` — number of machines/nodes (default: 1)
- `:machine_rank` — this machine's rank (default: 0)
- `:mixed_precision` — `"no"`, `"fp16"`, `"bf16"`, `"fp8"` (default: `"no"`)
- `:distributed_type` — `"NO"`, `"MULTI_GPU"`, `"DEEPSPEED"`, `"FSDP"`, `"TPU"` (default: `"NO"`)
- `:deepspeed_config` — path to DeepSpeed config JSON
- `:fsdp_config` — FSDP config map
- `:gradient_accumulation_steps` — steps between optimizer updates (default: 1)

## Example

    config = HuggingfaceClient.Training.accelerate_config(
      num_processes: 4,
      mixed_precision: "bf16",
      distributed_type: "MULTI_GPU"
    )

    # Run on 4× A10G GPU job
    {:ok, job} = HuggingfaceClient.run_job(
      image: "huggingface/transformers-pytorch-gpu:latest",
      command: ["accelerate", "launch"] ++
               HuggingfaceClient.Training.to_accelerate_args(config) ++
               ["train.py"],
      flavor: "a10g-largex4",
      access_token: token
    )

# `dpo_config`

```elixir
@spec dpo_config(keyword()) :: map()
```

Builds a DPO (Direct Preference Optimization) configuration.

DPO aligns LLMs with human preferences directly from chosen/rejected pairs, without training a separate reward model.

## Options

- `:base_model` — SFT-trained model to align (required)
- `:dataset` — preference dataset with chosen/rejected pairs (required)
- `:beta` — KL divergence coefficient (default: 0.1)
- `:max_length` — max total sequence length (default: 1024)
- `:max_prompt_length` — max prompt length (default: 512)
- `:epochs` — training epochs (default: 1)
- `:batch_size` — per-device batch size (default: 2)
- `:learning_rate` — learning rate (default: 5.0e-7)
- `:use_peft` — use LoRA for DPO (default: `true`)
- `:lora_r` — LoRA rank if using PEFT (default: 16)
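
## Example

The model and dataset IDs below are placeholders; substitute your own SFT checkpoint and preference dataset.

    # Align an SFT checkpoint on chosen/rejected preference pairs
    config = HuggingfaceClient.Training.dpo_config(
      base_model: "my-org/llama-3.1-8b-sft",
      dataset: "my-org/preference-pairs",
      beta: 0.1,
      max_length: 1024,
      use_peft: true,
      lora_r: 16
    )

    # Run as a GPU job with the generated CLI args
    {:ok, job} = HuggingfaceClient.run_job(
      image: "huggingface/transformers-pytorch-gpu:latest",
      command: ["python", "train_dpo.py"] ++ HuggingfaceClient.Training.to_args(config),
      flavor: "a10g-large",
      access_token: token
    )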

# `full_finetune_config`

```elixir
@spec full_finetune_config(keyword()) :: map()
```

Builds a full fine-tuning configuration (no PEFT, all parameters trained).

Use this when you have sufficient GPU memory and want maximum model capacity.

## Options

- `:base_model` — HF model to fine-tune (required)
- `:dataset` — training dataset (required)
- `:task` — task type (default: `"llm-sft"`)
- `:epochs` — training epochs (default: 1)
- `:batch_size` — per-device batch size (default: 1)
- `:learning_rate` — learning rate (default: 1.0e-5)
- `:max_seq_length` — max token length (default: 2048)
- `:fp16` / `:bf16` — mixed precision training
- `:gradient_checkpointing` — save memory at compute cost (default: `true`)
- `:gradient_accumulation_steps` — accumulate gradients (default: 4)
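
## Example

A sketch with placeholder IDs. Full fine-tuning updates every parameter, so a 7B-class model typically needs A100-class memory even with `bf16` and gradient checkpointing enabled.

    config = HuggingfaceClient.Training.full_finetune_config(
      base_model: "mistralai/Mistral-7B-v0.3",
      dataset: "my-org/my-dataset",
      epochs: 1,
      batch_size: 1,
      bf16: true,
      gradient_checkpointing: true,
      gradient_accumulation_steps: 8
    )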

# `lora_config`

```elixir
@spec lora_config(keyword()) :: map()
```

Builds a LoRA (Low-Rank Adaptation) configuration map.

LoRA is the most popular parameter-efficient fine-tuning technique.
It freezes the base model and adds small trainable rank-decomposition matrices.

## Options

- `:base_model` — HF model ID to fine-tune (required for AutoTrain)
- `:dataset` — HF dataset ID (required for AutoTrain)
- `:rank` / `:r` — LoRA rank (default: 16). Higher = more params, more capacity.
- `:alpha` / `:lora_alpha` — LoRA scaling factor (default: 32). Usually 2× rank.
- `:dropout` / `:lora_dropout` — dropout probability (default: 0.05)
- `:target_modules` — list of module names to apply LoRA to
  (default: `["q_proj", "v_proj"]` for most LLMs)
- `:bias` — `"none"`, `"all"`, `"lora_only"` (default: `"none"`)
- `:task_type` — `"CAUSAL_LM"`, `"SEQ_CLS"`, `"SEQ_2_SEQ_LM"` (default: `"CAUSAL_LM"`)
- `:use_rslora` — use rank-stabilized LoRA (default: `false`)
- `:use_dora` — use weight-decomposed LoRA (default: `false`)
- `:epochs` — training epochs (default: 3)
- `:batch_size` — per-device batch size (default: 2)
- `:learning_rate` — learning rate (default: 2.0e-4)
- `:max_seq_length` — max token length (default: 2048)
- `:use_4bit` — use 4-bit quantization base (default: `false`)
- `:use_8bit` — use 8-bit quantization base (default: `false`)

## Example

    config = HuggingfaceClient.Training.lora_config(
      base_model: "meta-llama/Llama-3.1-8B-Instruct",
      rank: 16, alpha: 32,
      target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"],
      epochs: 3, batch_size: 4, learning_rate: 2.0e-4,
      max_seq_length: 2048, use_4bit: true
    )

    # With AutoTrain
    {:ok, project} = HuggingfaceClient.autotrain_create(
      Map.merge(config, %{
        project_name: "llama-lora-ft",
        task: "llm-sft",
        dataset: "my-org/my-dataset",
        access_token: token
      })
    )

# `orpo_config`

```elixir
@spec orpo_config(keyword()) :: map()
```

Builds an ORPO (Odds Ratio Preference Optimization) configuration.

ORPO combines SFT and preference alignment in a single training pass, avoiding the separate SFT stage that DPO requires.

## Options

- `:base_model` — pretrained base model to start from; no prior SFT stage is needed (required)
- `:dataset` — preference dataset (required)
- `:orpo_alpha` — ORPO alpha parameter (default: 0.1)
- `:max_length` — max sequence length (default: 1024)
- `:epochs` — training epochs (default: 3)
- `:batch_size` — batch size (default: 2)
- `:learning_rate` — learning rate (default: 8.0e-6)
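
## Example

Because ORPO handles SFT and alignment together, it can start from the pretrained base model. IDs below are placeholders.

    config = HuggingfaceClient.Training.orpo_config(
      base_model: "meta-llama/Llama-3.1-8B",
      dataset: "my-org/preference-pairs",
      orpo_alpha: 0.1,
      epochs: 3,
      learning_rate: 8.0e-6
    )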

# `recipe`

```elixir
@spec recipe(
  atom(),
  keyword()
) :: map()
```

Returns a pre-configured recipe for common fine-tuning scenarios.

## Recipes

- `:llama3_lora` — Llama 3 LoRA SFT (A10G-class GPU, 4-bit base)
- `:mistral_lora` — Mistral LoRA SFT (A10G-class GPU, 4-bit base)
- `:bert_text_classification` — BERT-style text classification fine-tuning
- `:t5_summarization` — T5/FLAN summarization fine-tuning
- `:vit_image_classification` — ViT image classification fine-tuning
- `:whisper_asr` — Whisper ASR fine-tuning

## Example

    config = HuggingfaceClient.Training.recipe(:llama3_lora,
      base_model: "meta-llama/Llama-3.1-8B-Instruct",
      dataset: "my-org/my-chat-data"
    )

# `reward_model_config`

```elixir
@spec reward_model_config(keyword()) :: map()
```

Builds a Reward Model training configuration.

Used in RLHF pipelines to train a reward model from preference data.

## Options

- `:base_model` — backbone model (required)
- `:dataset` — preference dataset with chosen/rejected (required)
- `:max_length` — max sequence length (default: 512)
- `:epochs` — training epochs (default: 1)
- `:batch_size` — batch size (default: 4)
- `:learning_rate` — learning rate (default: 1.0e-5)
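
## Example

The resulting reward model scores completions for a downstream RLHF (e.g. PPO) stage. IDs are placeholders.

    config = HuggingfaceClient.Training.reward_model_config(
      base_model: "my-org/llama-3.1-8b-sft",
      dataset: "my-org/preference-pairs",
      max_length: 512,
      batch_size: 4
    )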

# `to_accelerate_args`

```elixir
@spec to_accelerate_args(map()) :: [String.t()]
```

Converts an accelerate config to `accelerate launch` CLI arguments.
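
## Example

Pairs with `accelerate_config/1`; the exact flag order in the returned list may differ from the sketch below.

    config = HuggingfaceClient.Training.accelerate_config(
      num_processes: 4,
      mixed_precision: "bf16"
    )

    args = HuggingfaceClient.Training.to_accelerate_args(config)
    # e.g. ["--num_processes", "4", "--mixed_precision", "bf16", ...]

    command = ["accelerate", "launch"] ++ args ++ ["train.py"]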

# `to_args`

```elixir
@spec to_args(map()) :: [String.t()]
```

Converts a training config map into CLI argument list form.

## Example

    args = HuggingfaceClient.Training.to_args(config)
    # ["--learning_rate", "2e-4", "--num_train_epochs", "3", ...]

    {:ok, job} = HuggingfaceClient.run_job(
      image: "pytorch/pytorch:latest",
      command: ["python", "train.py"] ++ args,
      flavor: "a10g-small"
    )

# `training_args`

```elixir
@spec training_args(keyword()) :: map()
```

Builds a TrainingArguments-compatible configuration map.

This mirrors the HuggingFace `transformers.TrainingArguments` parameters.

## Options

- `:output_dir` — where to save the model (required)
- `:num_train_epochs` — number of training epochs (default: 3)
- `:per_device_train_batch_size` — batch size per device (default: 8)
- `:per_device_eval_batch_size` — eval batch size (default: 8)
- `:learning_rate` — initial learning rate (default: 5.0e-5)
- `:weight_decay` — weight decay coefficient (default: 0.0)
- `:warmup_steps` — number of warmup steps (default: 0)
- `:warmup_ratio` — fraction of steps for warmup
- `:lr_scheduler_type` — `"linear"`, `"cosine"`, `"cosine_with_restarts"`, `"polynomial"`, `"constant"`, `"constant_with_warmup"` (default: `"linear"`)
- `:evaluation_strategy` — `"no"`, `"steps"`, `"epoch"` (default: `"epoch"`)
- `:save_strategy` — `"no"`, `"steps"`, `"epoch"` (default: `"epoch"`)
- `:save_total_limit` — max checkpoints to keep
- `:load_best_model_at_end` — load best checkpoint after training (default: `false`)
- `:metric_for_best_model` — metric name for best model selection
- `:fp16` — use FP16 mixed precision (default: `false`)
- `:bf16` — use BF16 mixed precision (default: `false`)
- `:gradient_accumulation_steps` — steps before optimizer update (default: 1)
- `:gradient_checkpointing` — trade compute for memory (default: `false`)
- `:dataloader_num_workers` — number of data loader workers (default: 0)
- `:seed` — random seed (default: 42)
- `:hub_model_id` — push checkpoints to this HF model ID
- `:push_to_hub` — auto-push checkpoints to Hub (default: `false`)
- `:report_to` — `"tensorboard"`, `"wandb"`, `"none"` (default: `"none"`)
- `:logging_steps` — log every N steps (default: 500)
- `:eval_steps` — evaluate every N steps (if strategy is "steps")
- `:save_steps` — save every N steps (if strategy is "steps")
- `:max_steps` — override epochs with max steps (-1 = use epochs)
- `:optim` — optimizer: `"adamw_torch"`, `"adamw_hf"`, `"sgd"`, `"adafactor"` (default: `"adamw_torch"`)

## Example

    config = HuggingfaceClient.Training.training_args(
      output_dir: "./my-model",
      num_train_epochs: 5,
      per_device_train_batch_size: 16,
      learning_rate: 2.0e-5,
      fp16: true,
      hub_model_id: "my-org/my-finetuned-model",
      push_to_hub: true
    )

---

*Consult [api-reference.md](api-reference.md) for the complete listing.*
