EXGBoost.Booster (EXGBoost v0.5.0)

A Booster is the main object used for training and prediction. It is a wrapper around the underlying XGBoost C API. A Booster has three main concepts for tracking associated data: parameters, attributes, and features. Parameters configure the Booster and come from a set of valid options (such as max_depth and eta; refer to EXGBoost.Parameters for the full list). Attributes are user-provided key-value pairs assigned to a Booster (such as best_iteration and best_score). Features track the metadata associated with the features used in training (such as feature_names and feature_types).

Training

When using EXGBoost.train/2, a Booster is created and trained automatically with the given parameters. If you need more control over the training process, please refer to EXGBoost.Training.Callback for guidance on how to inject custom logic into the training process.
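As a minimal sketch of the automatic path, the snippet below trains a Booster on a tiny made-up dataset and predicts on the same rows. It assumes the exgboost Hex package at a 0.5.x version, that EXGBoost.train accepts a feature tensor and a label tensor followed by options, that num_boost_rounds and max_depth are valid options, and that EXGBoost.predict takes the Booster and a feature tensor. Check these against the EXGBoost and EXGBoost.Parameters documentation before relying on them.

```elixir
# Assumption: run as a script (e.g. `elixir train.exs`) with network access,
# so Mix.install can fetch and compile EXGBoost and its dependencies (incl. Nx).
Mix.install([{:exgboost, "~> 0.5"}])

# Toy data (made up): 4 rows, 2 features, one numeric label per row.
x = Nx.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
y = Nx.tensor([1.0, 2.0, 3.0, 4.0])

# The Booster is created and trained in a single call; the options configure it.
booster = EXGBoost.train(x, y, num_boost_rounds: 5, max_depth: 2)

# Predict on the training rows; the result is expected to be an Nx tensor.
preds = EXGBoost.predict(booster, x)
IO.inspect(preds, label: "predictions")
```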

Creation

A Booster can be created using EXGBoost.Booster.booster from a list of DMatrices, a single DMatrix, or another Booster. If a list of DMatrices is provided, the first DMatrix is used as the training data and the rest are used for evaluation. If a single DMatrix is provided, it is used as the training data. If another Booster is provided, it is copied and returned as a new Booster with the same configuration -- if params are provided, they will override the configuration of the copied Booster.

Serialization

A Booster can be serialized using EXGBoost.Booster.save and restored using EXGBoost.Booster.load. The serialization format is set with the :format option, which can be either :json or :ubj. The default is :json. The :overwrite option of save defaults to false, so an existing file is not overwritten unless overwrite: true is passed.

Boosters can be serialized either to a file or to a binary string, and in three different ways: configuration only, model weights only, or both together. The underlying functions follow a naming convention. Functions with buffer in the name read or write a binary string, and functions with file in the name read or write a file. Functions named with weights serialize the model weights only, functions named with config serialize the configuration only, and functions that specify model serialize both the model weights and the configuration.

Output Formats

  • file - Save to a file.
  • buffer - Save to a binary string.

Output Contents

  • config - Save the configuration only.
  • weights - Save the model weights only.
  • model - Save both the model weights and the configuration.
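The output format and output contents listed above can be combined in one call. The following is a hedged sketch of a round trip through a binary string, using only the options documented for save and load on this page. It assumes the exgboost Hex package at a 0.5.x version, that EXGBoost.train accepts tensors plus options, and that save returns the serialized binary when to: :buffer is given; none of these are confirmed by the text above.

```elixir
# Assumption: run as a script with network access so Mix.install can
# fetch and compile EXGBoost and its dependencies.
Mix.install([{:exgboost, "~> 0.5"}])

alias EXGBoost.Booster

# Toy data (made up) so there is a trained Booster to serialize.
x = Nx.tensor([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]])
y = Nx.tensor([1.0, 2.0, 3.0, 4.0])
booster = EXGBoost.train(x, y, num_boost_rounds: 3)

# Output format :buffer, output contents :model (weights and configuration).
# Assumption: with to: :buffer the return value is the binary itself.
buffer = Booster.save(booster, to: :buffer, serialize: :model, format: :json)

# Load into a new Booster, mirroring the choices made when saving.
restored = Booster.load(buffer, from: :buffer, deserialize: :model)

IO.inspect(Booster.get_boosted_rounds(restored), label: "boosted rounds")
```

Swapping serialize: :model for :config or :weights (with the matching :deserialize value) selects the other two output contents.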

Summary

Functions

Boost the booster for one iteration, with customized gradient statistics.

Create a new Booster.

Evaluate the model on the given data.

Get the attribute value for the given key.

Get the attribute names for the booster.

Get the best iteration for the booster.

Get the number of boosted rounds for the booster.

Get a formatted representation of the Booster's model.

Get the names of the features for the booster.

Get the type for each feature in the booster.

Get the number of features for the booster.

Load a Booster from the specified source. If a Booster is provided, the model will be loaded into that Booster. Otherwise, a new Booster will be created. If a Booster is provided, model parameters will be merged with the existing Booster's parameters using Map.merge/2, where the parameters of the provided Booster take precedence.

Save a Booster to the specified source.

Set attributes for booster.

Set parameters for booster. The parameters are passed as a keyword list. Please refer to EXGBoost.Parameters for a full list of parameters. Parameters can be set multiple times by passing a list of values for a given parameter. For example, set_params(booster, eval_metric: [:rmse, :auc]). Accepts both atoms and strings as keys. Nested keyword lists will simply be treated as more key-value pairs to be set. Returns the booster.

Slice a model using boosting index. The slice m:n indicates taking all trees that were fit during the boosting rounds m, (m+1), (m+2), …, (n-1).

Update for one iteration, with objective function calculated internally.

Types

@type t() :: %EXGBoost.Booster{
  best_iteration: integer(),
  best_score: float(),
  ref: reference()
}

Functions

boost(booster, dmatrix, grad, hess)

Boost the booster for one iteration, with customized gradient statistics.

booster(dmats, opts \\ [])

Create a new Booster.

A Booster can be created from a list of DMatrices, a single DMatrix, or another Booster. If a list of DMatrices is provided, the first DMatrix is used as the training data and the rest are used for evaluation. If a single DMatrix is provided, it is used as the training data. If another Booster is provided, it is copied and returned as a new Booster with the same configuration -- if params are provided, they will override the configuration of the copied Booster.

Options

Refer to EXGBoost.Parameters for a list of valid options.

eval(booster, data, opts \\ [])

Evaluate the model on the given data.

Options

  • :name - The name of the dataset.

  • :iteration - The current iteration number.

Returns the evaluation result string.

eval_set(booster, evals, iteration, opts \\ [])

Evaluate a set of data.

Options

  • iteration - Current iteration.
  • feval - Custom evaluation function.

Returns the resulting metrics as a list of 2-tuples in the form of {eval_metric, value}.

Get the attribute value for the given key.

Get the attribute names for the booster.

get_best_iteration(booster)

Get the best iteration for the booster.

get_boosted_rounds(booster)

Get the number of boosted rounds for the booster.

get_dump(booster, opts \\ [])

Get a formatted representation of the Booster's model.

Options

  • :fmap (String.t/0) - The path to the file containing the feature map. The default value is "".

  • :with_stats (boolean/0) - Whether or not to include the statistics in the dump. The default value is false.

  • :format - The format to dump to. Can be either :json or :text. The default value is :text.

get_feature_names(booster)

Get the names of the features for the booster.

get_feature_types(booster)

Get the type for each feature in the booster.

get_num_features(booster)

Get the number of features for the booster.

load(source, opts \\ [])

Load a Booster from the specified source. If a Booster is provided, the model will be loaded into that Booster. Otherwise, a new Booster will be created. If a Booster is provided, model parameters will be merged with the existing Booster's parameters using Map.merge/2, where the parameters of the provided Booster take precedence.

Options

  • :from - The input format. Can be either :file or :buffer. The default value is :file.

  • :deserialize - The contents to deserialize. Can be either :config, :weights, or :model. The default value is :model.

  • :booster (struct of type EXGBoost.Booster) - The Booster to load the model into. If not provided, a new Booster will be created.
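The precedence rule in the merge described above follows from how Map.merge/2 resolves duplicate keys: the value from the second map wins. The sketch below shows only that rule using plain maps. The parameter maps and their contents are hypothetical, and this is not EXGBoost's internal code.

```elixir
# Hypothetical parameter maps, for illustration only.
loaded_params = %{max_depth: 6, eta: 0.3, objective: "reg:squarederror"}
booster_params = %{eta: 0.1}

# Map.merge/2 keeps the second map's value when a key appears in both maps,
# so the provided Booster's parameters go second to take precedence.
merged = Map.merge(loaded_params, booster_params)

IO.inspect(merged)
```

Here eta comes from the provided Booster's map, while max_depth and objective are carried over from the loaded parameters.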

predict(booster, data, opts \\ [])

save(booster, opts \\ [])

Save a Booster to the specified source.

Options

  • :to - The output format. Can be either :file or :buffer. The default value is :file.

  • :path (String.t/0) - The path to the file to save to. Required if :to is :file.

  • :serialize - The contents to serialize. Can be either :config, :weights, or :model. The default value is :model.

  • :format - The format to serialize to. Can be either :json or :ubj. The default value is :json.

  • :overwrite (boolean/0) - Whether or not to overwrite the file if it already exists. The default value is false.

set_attr(booster, attrs \\ [])

Set attributes for booster.

Key value pairs are passed as options. You can set an existing key to :nil to delete the attribute. Returns the booster.

set_params(booster, params \\ [])

Set parameters for booster. The parameters are passed as a keyword list. Please refer to EXGBoost.Parameters for a full list of parameters. Parameters can be set multiple times by passing a list of values for a given parameter. For example, set_params(booster, eval_metric: [:rmse, :auc]). Accepts both atoms and strings as keys. Nested keyword lists will simply be treated as more key-value pairs to be set. Returns the booster.

slice(boostr, begin_layer, end_layer, step)

Slice a model using boosting index. The slice m:n indicates taking all trees that were fit during the boosting rounds m, (m+1), (m+2), …, (n-1).
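To make the half-open m:n convention concrete, the sketch below lists which boosting rounds a slice would select. SliceSketch.rounds is a hypothetical helper, not part of EXGBoost, and it operates on plain integers rather than on a Booster. The handling of step (taking every step-th round starting at begin_layer) is an assumption, since the description above only covers a step of 1.

```elixir
defmodule SliceSketch do
  # Hypothetical helper (not part of EXGBoost): returns the boosting rounds
  # selected by the slice begin_layer:end_layer. The end is exclusive,
  # matching the m, (m+1), ..., (n-1) description.
  # Assumption: step selects every step-th round starting at begin_layer.
  def rounds(begin_layer, end_layer, step \\ 1) when step > 0 do
    Enum.to_list(begin_layer..(end_layer - 1)//step)
  end
end

IO.inspect(SliceSketch.rounds(2, 5))      # [2, 3, 4]
IO.inspect(SliceSketch.rounds(0, 10, 3))  # [0, 3, 6, 9]
IO.inspect(SliceSketch.rounds(5, 5))      # []
```

An empty result for equal bounds reflects the exclusive end: no round satisfies m <= round < n when m equals n.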

update(booster, dmatrix, iteration, objective)

Update for one iteration, with objective function calculated internally.

If an objective function is provided rather than a number of iterations, this updates for one iteration, with objective function defined by the user.

See Custom Objective for details.