ExOpenAI.Components.FineTuneDPOHyperparameters (ex_openai.ex v2.0.0-beta2)

The hyperparameters used for the DPO fine-tuning job.

Fields

:batch_size - optional - :auto | integer()
Number of examples in each batch. A larger batch size means that model parameters are updated less frequently, but with lower variance.
Default: "auto"
:beta - optional - :auto | number()
The beta value for the DPO method. A higher beta value increases the weight of the penalty on divergence between the policy and the reference model.
Default: "auto"
:learning_rate_multiplier - optional - :auto | number()
Scaling factor for the learning rate. A smaller learning rate may be useful to avoid overfitting.
Default: "auto"
:n_epochs - optional - :auto | integer()
The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
Default: "auto"
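The trade-off described for :batch_size can be made concrete with a little arithmetic. The sketch below assumes ordinary mini-batch training, where one parameter update happens per batch; the training internals of the fine-tuning service are not documented here, so treat the numbers as illustrative. The module UpdateCountSketch and its updates/3 function are hypothetical and not part of ex_openai.

```elixir
defmodule UpdateCountSketch do
  # Hypothetical helper, not part of ex_openai.
  # Assumes one parameter update per batch (plain mini-batch training).
  def updates(n_examples, batch_size, n_epochs)
      when is_integer(n_examples) and n_examples > 0 and
             is_integer(batch_size) and batch_size > 0 and
             is_integer(n_epochs) and n_epochs > 0 do
    # Ceiling division: a final partial batch still counts as one update.
    steps_per_epoch = div(n_examples + batch_size - 1, batch_size)
    steps_per_epoch * n_epochs
  end
end

# 1,000 training examples, 3 epochs, two different batch sizes.
IO.inspect(UpdateCountSketch.updates(1_000, 8, 3))
IO.inspect(UpdateCountSketch.updates(1_000, 32, 3))
```

Under these assumptions, quadrupling the batch size cuts the number of updates to roughly a quarter, which is the "updated less frequently" effect the field description refers to.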

Types

t()

@type t() :: %ExOpenAI.Components.FineTuneDPOHyperparameters{
  batch_size: (:auto | integer()) | nil,
  beta: (:auto | number()) | nil,
  learning_rate_multiplier: (:auto | number()) | nil,
  n_epochs: (:auto | integer()) | nil
}
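
Because typespecs are not enforced at runtime, it can be useful to check values against the shape of t() before submitting a job. The following is a minimal sketch of such a check. DPOHyperparametersCheck and validate/1 are hypothetical names, not part of ex_openai; the function accepts any map or struct exposing the four documented keys, so it runs without the library installed.

```elixir
defmodule DPOHyperparametersCheck do
  # Hypothetical helper, not part of ex_openai.
  # Mirrors the documented type: each field is nil, :auto, or a number
  # (an integer for :batch_size and :n_epochs).
  @integer_fields [:batch_size, :n_epochs]
  @number_fields [:beta, :learning_rate_multiplier]

  def validate(params) when is_map(params) do
    invalid =
      Enum.reject(@integer_fields ++ @number_fields, fn field ->
        valid?(field, Map.get(params, field))
      end)

    case invalid do
      [] -> :ok
      fields -> {:error, fields}
    end
  end

  # nil (field omitted) and :auto are accepted for every field.
  defp valid?(_field, nil), do: true
  defp valid?(_field, :auto), do: true
  defp valid?(field, value) when field in @integer_fields, do: is_integer(value)
  defp valid?(field, value) when field in @number_fields, do: is_number(value)
end

IO.inspect(DPOHyperparametersCheck.validate(%{batch_size: :auto, beta: 0.1, n_epochs: 3}))
IO.inspect(DPOHyperparametersCheck.validate(%{batch_size: 2.5, beta: "high"}))
```

The check only covers what the typespec states. It does not enforce ranges such as positive values, since the type above does not specify any.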