HuggingfaceClient.Hub.Compute.Training (huggingface_client v0.1.0)


HuggingFace Training Stack — configuration helpers for fine-tuning and training.

Provides structured configuration builders for:

  • AutoTrain — no-code training via the AutoTrain API
  • Trainer API — Transformers Trainer configuration (TrainingArguments)
  • Fine-tuning techniques — LoRA, PEFT, full fine-tuning, DPO, ORPO, SFT

These helpers generate configuration maps that can be:

  1. Passed to HuggingfaceClient.autotrain_create/1 to launch training on HF infra
  2. Passed to HuggingfaceClient.run_job/1 to run a custom training script
  3. Serialized to JSON/YAML for use in local training scripts

Example

# Launch LoRA fine-tuning via AutoTrain
config = HuggingfaceClient.Training.lora_config(
  base_model: "meta-llama/Llama-3.1-8B",
  dataset: "my-org/my-dataset",
  rank: 16, alpha: 32, epochs: 3
)

{:ok, project} = HuggingfaceClient.autotrain_create(
  Map.merge(config, %{project_name: "my-lora-model", access_token: token})
)

# Or run a training job on GPU infra
{:ok, job} = HuggingfaceClient.run_job(
  image: "huggingface/transformers-pytorch-gpu:latest",
  command: ["python", "train.py"] ++ HuggingfaceClient.Training.to_args(config),
  flavor: "a10g-small",
  access_token: token
)
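
For the third use above, the config map can be serialized for a local training script. A minimal sketch, assuming the Jason library is available as a JSON encoder (it is not a documented dependency of this client):

# Write the config to disk for consumption by a local script
# (Jason is an assumption; any JSON encoder works)
File.write!("train_config.json", Jason.encode!(config))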

Summary

Functions

accelerate_config(opts \\ [])
Builds an Accelerate launch configuration for distributed training.

dpo_config(opts \\ [])
Builds a DPO (Direct Preference Optimization) configuration.

full_finetune_config(opts \\ [])
Builds a full fine-tuning configuration (no PEFT, all parameters trained).

lora_config(opts \\ [])
Builds a LoRA (Low-Rank Adaptation) configuration map.

orpo_config(opts \\ [])
Builds an ORPO (Odds Ratio Preference Optimization) configuration.

recipe(name, opts \\ [])
Returns a pre-configured recipe for common fine-tuning scenarios.

reward_model_config(opts \\ [])
Builds a Reward Model training configuration.

to_accelerate_args(config)
Converts an accelerate config to accelerate launch CLI arguments.

to_args(config)
Converts a training config map into CLI argument list form.

training_args(opts \\ [])
Builds a TrainingArguments-compatible configuration map.

Functions

accelerate_config(opts \\ [])

@spec accelerate_config(keyword()) :: map()

Builds an Accelerate launch configuration for distributed training.

This mirrors accelerate config / accelerate launch parameters.

Options

  • :num_processes — total number of processes (GPUs) (default: 1)
  • :num_machines — number of machines/nodes (default: 1)
  • :machine_rank — this machine's rank (default: 0)
  • :mixed_precision — "no", "fp16", "bf16", "fp8" (default: "no")
  • :distributed_type — "NO", "MULTI_GPU", "DEEPSPEED", "FSDP", "TPU" (default: "NO")
  • :deepspeed_config — path to DeepSpeed config JSON
  • :fsdp_config — FSDP config map
  • :gradient_accumulation_steps — steps between optimizer updates (default: 1)

Example

config = HuggingfaceClient.Training.accelerate_config(
  num_processes: 4,
  mixed_precision: "bf16",
  distributed_type: "MULTI_GPU"
)

# Run as a job on 4× A10G GPUs
{:ok, job} = HuggingfaceClient.run_job(
  image: "huggingface/transformers-pytorch-gpu:latest",
  command: ["accelerate", "launch"] ++
           HuggingfaceClient.Training.to_accelerate_args(config) ++
           ["train.py"],
  flavor: "a10g-largex4",
  access_token: token
)

dpo_config(opts \\ [])

@spec dpo_config(keyword()) :: map()

Builds a DPO (Direct Preference Optimization) configuration.

DPO aligns LLMs with human preferences directly from preference pairs, without training a separate reward model.

Options

  • :base_model — SFT-trained model to align (required)
  • :dataset — preference dataset with chosen/rejected pairs (required)
  • :beta — KL divergence coefficient (default: 0.1)
  • :max_length — max total sequence length (default: 1024)
  • :max_prompt_length — max prompt length (default: 512)
  • :epochs — training epochs (default: 1)
  • :batch_size — per-device batch size (default: 2)
  • :learning_rate — learning rate (default: 5.0e-7)
  • :use_peft — use LoRA for DPO (default: true)
  • :lora_r — LoRA rank if using PEFT (default: 16)
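
Example

A sketch using only the documented options; the model and dataset IDs are placeholders:

config = HuggingfaceClient.Training.dpo_config(
  base_model: "my-org/llama-3.1-8b-sft",
  dataset: "my-org/preference-pairs",
  beta: 0.1,
  use_peft: true,
  lora_r: 16
)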

full_finetune_config(opts \\ [])

@spec full_finetune_config(keyword()) :: map()

Builds a full fine-tuning configuration (no PEFT, all parameters trained).

Use this when you have sufficient GPU memory and want maximum model capacity.

Options

  • :base_model — HF model to fine-tune (required)
  • :dataset — training dataset (required)
  • :task — task type (default: "llm-sft")
  • :epochs — training epochs (default: 1)
  • :batch_size — per-device batch size (default: 1)
  • :learning_rate — learning rate (default: 1.0e-5)
  • :max_seq_length — max token length (default: 2048)
  • :fp16 / :bf16 — mixed precision training
  • :gradient_checkpointing — save memory at compute cost (default: true)
  • :gradient_accumulation_steps — accumulate gradients (default: 4)
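
Example

A sketch with placeholder model and dataset IDs, using only the documented options:

config = HuggingfaceClient.Training.full_finetune_config(
  base_model: "mistralai/Mistral-7B-v0.3",
  dataset: "my-org/my-dataset",
  epochs: 1,
  bf16: true,
  gradient_checkpointing: true,
  gradient_accumulation_steps: 4
)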

lora_config(opts \\ [])

@spec lora_config(keyword()) :: map()

Builds a LoRA (Low-Rank Adaptation) configuration map.

LoRA is the most popular parameter-efficient fine-tuning technique. It freezes the base model and adds small trainable rank-decomposition matrices.

Options

  • :base_model — HF model ID to fine-tune (required for AutoTrain)
  • :dataset — HF dataset ID (required for AutoTrain)
  • :rank / :r — LoRA rank (default: 16). Higher = more params, more capacity.
  • :alpha / :lora_alpha — LoRA scaling factor (default: 32). Usually 2× rank.
  • :dropout / :lora_dropout — dropout probability (default: 0.05)
  • :target_modules — list of module names to apply LoRA to (default: ["q_proj", "v_proj"] for most LLMs)
  • :bias — "none", "all", "lora_only" (default: "none")
  • :task_type — "CAUSAL_LM", "SEQ_CLS", "SEQ_2_SEQ_LM" (default: "CAUSAL_LM")
  • :use_rslora — use rank-stabilized LoRA (default: false)
  • :use_dora — use weight-decomposed LoRA (default: false)
  • :epochs — training epochs (default: 3)
  • :batch_size — per-device batch size (default: 2)
  • :learning_rate — learning rate (default: 2.0e-4)
  • :max_seq_length — max token length (default: 2048)
  • :use_4bit — use 4-bit quantization base (default: false)
  • :use_8bit — use 8-bit quantization base (default: false)

Example

config = HuggingfaceClient.Training.lora_config(
  base_model: "meta-llama/Llama-3.1-8B-Instruct",
  rank: 16, alpha: 32,
  target_modules: ["q_proj", "k_proj", "v_proj", "o_proj"],
  epochs: 3, batch_size: 4, learning_rate: 2.0e-4,
  max_seq_length: 2048, use_4bit: true
)

# With AutoTrain
{:ok, project} = HuggingfaceClient.autotrain_create(
  Map.merge(config, %{
    project_name: "llama-lora-ft",
    task: "llm-sft",
    dataset: "my-org/my-dataset",
    access_token: token
  })
)

orpo_config(opts \\ [])

@spec orpo_config(keyword()) :: map()

Builds an ORPO (Odds Ratio Preference Optimization) configuration.

ORPO combines SFT and preference alignment in a single training pass, avoiding the separate SFT stage that DPO requires.

Options

  • :base_model — base pre-trained model to train; no prior SFT stage is needed (required)
  • :dataset — preference dataset (required)
  • :orpo_alpha — ORPO alpha parameter (default: 0.1)
  • :max_length — max sequence length (default: 1024)
  • :epochs — training epochs (default: 3)
  • :batch_size — batch size (default: 2)
  • :learning_rate — learning rate (default: 8.0e-6)
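
Example

A sketch with placeholder IDs, using only the documented options:

config = HuggingfaceClient.Training.orpo_config(
  base_model: "meta-llama/Llama-3.1-8B",
  dataset: "my-org/preference-pairs",
  orpo_alpha: 0.1,
  epochs: 3
)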

recipe(name, opts \\ [])

@spec recipe(atom(), keyword()) :: map()

Returns a pre-configured recipe for common fine-tuning scenarios.

Recipes

  • :llama3_lora — Llama 3 LoRA SFT (A10G-class GPU, 4-bit base)
  • :mistral_lora — Mistral LoRA SFT (A10G-class GPU, 4-bit base)
  • :bert_text_classification — BERT-style text classification fine-tuning
  • :t5_summarization — T5/FLAN summarization fine-tuning
  • :vit_image_classification — ViT image classification fine-tuning
  • :whisper_asr — Whisper ASR fine-tuning

Example

config = HuggingfaceClient.Training.recipe(:llama3_lora,
  base_model: "meta-llama/Llama-3.1-8B-Instruct",
  dataset: "my-org/my-chat-data"
)
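
A recipe returns an ordinary config map, so it can be launched like any hand-built config (the project name here is a placeholder):

{:ok, project} = HuggingfaceClient.autotrain_create(
  Map.merge(config, %{project_name: "llama3-lora-run", access_token: token})
)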

reward_model_config(opts \\ [])

@spec reward_model_config(keyword()) :: map()

Builds a Reward Model training configuration.

Used in RLHF pipelines to train a reward model from preference data.

Options

  • :base_model — backbone model (required)
  • :dataset — preference dataset with chosen/rejected (required)
  • :max_length — max sequence length (default: 512)
  • :epochs — training epochs (default: 1)
  • :batch_size — batch size (default: 4)
  • :learning_rate — learning rate (default: 1.0e-5)
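
Example

A sketch with placeholder IDs, using only the documented options:

config = HuggingfaceClient.Training.reward_model_config(
  base_model: "meta-llama/Llama-3.1-8B",
  dataset: "my-org/preference-pairs",
  max_length: 512,
  epochs: 1
)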

to_accelerate_args(config)

@spec to_accelerate_args(map()) :: [String.t()]

Converts an accelerate config to accelerate launch CLI arguments.
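
Example

The output shape shown in the comment is an assumption, mirroring the flag-per-key form of to_args/1; the exact flags depend on the config keys:

config = HuggingfaceClient.Training.accelerate_config(num_processes: 4, mixed_precision: "bf16")
args = HuggingfaceClient.Training.to_accelerate_args(config)
# e.g. ["--num_processes", "4", "--mixed_precision", "bf16", ...] (assumed shape)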

to_args(config)

@spec to_args(map()) :: [String.t()]

Converts a training config map into CLI argument list form.

Example

args = HuggingfaceClient.Training.to_args(config)
# ["--learning_rate", "2e-4", "--num_train_epochs", "3", ...]

{:ok, job} = HuggingfaceClient.run_job(
  image: "pytorch/pytorch:latest",
  command: ["python", "train.py"] ++ args,
  flavor: "a10g-small"
)

training_args(opts \\ [])

@spec training_args(keyword()) :: map()

Builds a TrainingArguments-compatible configuration map.

This mirrors the HuggingFace transformers.TrainingArguments parameters.

Options

  • :output_dir — where to save the model (required)
  • :num_train_epochs — number of training epochs (default: 3)
  • :per_device_train_batch_size — batch size per device (default: 8)
  • :per_device_eval_batch_size — eval batch size (default: 8)
  • :learning_rate — initial learning rate (default: 5.0e-5)
  • :weight_decay — weight decay coefficient (default: 0.0)
  • :warmup_steps — number of warmup steps (default: 0)
  • :warmup_ratio — fraction of steps for warmup
  • :lr_scheduler_type — "linear", "cosine", "cosine_with_restarts", "polynomial", "constant", "constant_with_warmup" (default: "linear")
  • :evaluation_strategy — "no", "steps", "epoch" (default: "epoch")
  • :save_strategy — "no", "steps", "epoch" (default: "epoch")
  • :save_total_limit — max checkpoints to keep
  • :load_best_model_at_end — load best checkpoint after training (default: false)
  • :metric_for_best_model — metric name for best model selection
  • :fp16 — use FP16 mixed precision (default: false)
  • :bf16 — use BF16 mixed precision (default: false)
  • :gradient_accumulation_steps — steps before optimizer update (default: 1)
  • :gradient_checkpointing — trade compute for memory (default: false)
  • :dataloader_num_workers — number of data loader workers (default: 0)
  • :seed — random seed (default: 42)
  • :hub_model_id — push checkpoints to this HF model ID
  • :push_to_hub — auto-push checkpoints to Hub (default: false)
  • :report_to — "tensorboard", "wandb", "none" (default: "none")
  • :logging_steps — log every N steps (default: 500)
  • :eval_steps — evaluate every N steps (if strategy is "steps")
  • :save_steps — save every N steps (if strategy is "steps")
  • :max_steps — override epochs with max steps (-1 = use epochs)
  • :optim — optimizer: "adamw_torch", "adamw_hf", "sgd", "adafactor" (default: "adamw_torch")

Example

config = HuggingfaceClient.Training.training_args(
  output_dir: "./my-model",
  num_train_epochs: 5,
  per_device_train_batch_size: 16,
  learning_rate: 2.0e-5,
  fp16: true,
  hub_model_id: "my-org/my-finetuned-model",
  push_to_hub: true
)
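
Like the other builders, the resulting map can be converted to CLI arguments and attached to a GPU job, mirroring the to_args/1 example above:

args = HuggingfaceClient.Training.to_args(config)

{:ok, job} = HuggingfaceClient.run_job(
  image: "huggingface/transformers-pytorch-gpu:latest",
  command: ["python", "train.py"] ++ args,
  flavor: "a10g-small",
  access_token: token
)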