EXGBoost.Booster (EXGBoost v0.5.1)
A Booster is the main object used for training and prediction. It is a wrapper around the underlying XGBoost C API. Boosters have three main concepts for tracking associated data: parameters, attributes, and features. Parameters are used to configure the Booster and are drawn from a set of valid options (such as max_depth and eta; refer to EXGBoost.Parameters for the full list).
Attributes are user-provided key-value pairs that are assigned to a Booster (such as best_iteration and best_score).
Features are used to track the metadata associated with the features used in training (such as feature_names and feature_types).
Training
When using EXGBoost.train/2, a Booster is created and trained automatically with the given parameters. If you need more control over the training process, please refer to EXGBoost.Training.Callback for guidance on how to inject custom logic into the training process.
Creation
A Booster can be created using EXGBoost.Booster.booster from a list of DMatrices, a single DMatrix, or another Booster. If a list of DMatrices is provided, the first DMatrix is used as the training data and the rest are used for evaluation. If a single DMatrix is provided, it is used as the training data. If another Booster is provided, it is copied and returned as a new Booster with the same configuration; if params are provided, they override the configuration of the copied Booster.
Serialization
A Booster can be serialized using EXGBoost.Booster.save and loaded using EXGBoost.Booster.load, either to and from a file or to and from a binary string. The file format can be specified using the :format option, which can be either :json or :ubj. The default is :json. If the file already exists, it is not overwritten unless the :overwrite option is set to true (the default is false).
Boosters can be serialized in three different ways: configuration only, model weights only, or both model weights and configuration. The :to option of save and the :from option of load choose between a file (:file) and a binary string (:buffer). The :serialize option of save and the :deserialize option of load choose the contents: :weights covers the model weights only, :config covers the configuration only, and :model covers both the model weights and the configuration.
Output Formats
file - Save to a file.
buffer - Save to a binary string.
Output Contents
config - Save the configuration only.
weights - Save the model weights only.
model - Save both the model weights and the configuration.
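As a rough illustration of how the two axes combine, the keyword lists below use the option names documented for save (:to, :path, :serialize, :format). The file path is a placeholder, and path_ok? is a hypothetical helper written for this sketch, not an EXGBoost function; it only mirrors the documented rule that :path is required when :to is :file.

```elixir
# Option sets for saving a Booster, built from the documented option names.
# The path is a placeholder chosen for this sketch.
full_model_to_file = [to: :file, path: "model.json", serialize: :model, format: :json]
weights_to_buffer = [to: :buffer, serialize: :weights]
config_to_buffer = [to: :buffer, serialize: :config]

# Relies on the documented default of :file for :to, but supplies no :path.
missing_path = [serialize: :model]

# Hypothetical helper (not part of EXGBoost): :path must be present
# whenever the output format is :file.
path_ok? = fn opts ->
  case Keyword.get(opts, :to, :file) do
    :file -> Keyword.has_key?(opts, :path)
    :buffer -> true
  end
end

results =
  Enum.map(
    [full_model_to_file, weights_to_buffer, config_to_buffer, missing_path],
    path_ok?
  )

IO.inspect(results)
```

The last option set fails the check because :to falls back to :file while no :path is given.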
Summary
Functions
Boost the booster for one iteration, with customized gradient statistics.
Create a new Booster.
Evaluate the model on the given data.
Evaluate a set of data.
Get the attribute value for the given key.
Get the attribute names for the booster.
Get the best iteration for the booster.
Get the number of boosted rounds for the booster.
Get a formatted representation of the Booster's model.
Get the names of the features for the booster.
Get the type for each feature in the booster.
Get the number of features for the booster.
Load a Booster from the specified source. If a Booster is provided, the model will be loaded into that Booster. Otherwise, a new Booster will be created. If a Booster is provided, model parameters will be merged with the existing Booster's parameters using Map.merge/2, where the parameters of the provided Booster take precedence.
Save a Booster to the specified source.
Set attributes for booster.
Set parameters for booster. The parameters are passed as a keyword list. Please refer to EXGBoost.Parameters for a full list of parameters. A parameter can be set multiple times by passing a list of values for it. For example, set_params(booster, eval_metric: [:rmse, :auc]). Accepts both atoms and strings as keys. Nested keyword lists will simply be treated as more key-value pairs to be set. Returns the booster.
Slice a model using boosting index. The slice m:n indicates taking all trees that were fit during the boosting rounds m, (m+1), (m+2), …, (n-1).
Update for one iteration, with objective function calculated internally.
Types
Functions
Boost the booster for one iteration, with customized gradient statistics.
Create a new Booster.
A Booster can be created from a list of DMatrices, a single DMatrix, or another Booster. If a list of DMatrices is provided, the first DMatrix is used as the training data and the rest are used for evaluation. If a single DMatrix is provided, it is used as the training data. If another Booster is provided, it is copied and returned as a new Booster with the same configuration -- if params are provided, they will override the configuration of the copied Booster.
Options
Refer to EXGBoost.Parameters for a list of valid options.
Evaluate the model on the given data.
Options
:name - The name of the dataset.
:iteration - The current iteration number.
Returns the evaluation result string.
Evaluate a set of data.
Options
:iteration - Current iteration.
:feval - Custom evaluation function.
Returns the resulting metrics as a list of 2-tuples in the form of {eval_metric, value}.
Get the attribute value for the given key.
Get the attribute names for the booster.
Get the best iteration for the booster.
Get the number of boosted rounds for the booster.
Get a formatted representation of the Booster's model.
Options
:fmap (String.t/0) - The path to the file containing the feature map. The default value is "".
:with_stats (boolean/0) - Whether or not to include the statistics in the dump. The default value is false.
:format - The format to dump to. Can be either :json or :text. The default value is :text.
Get the names of the features for the booster.
Get the type for each feature in the booster.
Get the number of features for the booster.
Load a Booster from the specified source. If a Booster is provided, the model will be loaded into that Booster. Otherwise, a new Booster will be created. If a Booster is provided, model parameters will be merged with the existing Booster's parameters using Map.merge/2, where the parameters of the provided Booster take precedence.
Options
:from - The input format. Can be either :file or :buffer. The default value is :file.
:deserialize - The contents to deserialize. Can be :config, :weights, or :model. The default value is :model.
:booster (struct of type EXGBoost.Booster) - The Booster to load the model into. If not provided, a new Booster will be created.
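The precedence rule can be seen with Map.merge/2 alone. In the sketch below, the parameter names and values are illustrative, and the argument order is an assumption inferred from the stated precedence, not taken from the EXGBoost source.

```elixir
# Parameters as they might come from the serialized model (illustrative values).
loaded_params = %{eta: 0.3, max_depth: 6}

# Parameters already set on the Booster passed via :booster (illustrative values).
provided_params = %{eta: 0.1, objective: :reg_squarederror}

# Map.merge/2 keeps the value from its second argument when a key appears in
# both maps, so putting the provided Booster's parameters second lets them win.
merged = Map.merge(loaded_params, provided_params)

IO.inspect(merged)
```

Here :eta ends up as 0.1 (from the provided Booster), while :max_depth and :objective are carried over unchanged because each appears in only one map.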
Save a Booster to the specified source.
Options
:to - The output format. Can be either :file or :buffer. The default value is :file.
:path (String.t/0) - The path to the file to save to. Required if :to is :file.
:serialize - The contents to serialize. Can be :config, :weights, or :model. The default value is :model.
:format - The format to serialize to. Can be either :json or :ubj. The default value is :json.
:overwrite (boolean/0) - Whether or not to overwrite the file if it already exists. The default value is false.
Set attributes for booster.
Key value pairs are passed as options. You can set an existing key to :nil to delete the attribute. Returns the booster.
Set parameters for booster. The parameters are passed as a keyword list. Please refer to EXGBoost.Parameters for a full list of parameters. A parameter can be set multiple times by passing a list of values for it. For example, set_params(booster, eval_metric: [:rmse, :auc]). Accepts both atoms and strings as keys. Nested keyword lists will simply be treated as more key-value pairs to be set. Returns the booster.
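One way to picture "setting a parameter multiple times" is to expand each list value into one key-value pair per element. The sketch below is a plain-Elixir model of that described behaviour, not the EXGBoost implementation; the expansion step is an assumption made for illustration, and it does not cover nested keyword lists.

```elixir
# Same shape as the example above, plus a scalar parameter.
params = [eval_metric: [:rmse, :auc], eta: 0.1]

# Assumed expansion: a list value yields one {key, value} pair per element,
# while a scalar value yields a single pair.
expanded =
  Enum.flat_map(params, fn
    {key, values} when is_list(values) -> Enum.map(values, fn value -> {key, value} end)
    {key, value} -> [{key, value}]
  end)

IO.inspect(expanded)
IO.inspect(Keyword.get_values(expanded, :eval_metric))
```

Under this model the :eval_metric key appears twice in the expanded keyword list, once for :rmse and once for :auc.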
Slice a model using boosting index. The slice m:n indicates taking all trees that were fit during the boosting rounds m, (m+1), (m+2), …, (n-1).
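The half-open range m:n can be checked on an ordinary list. The sketch below uses strings as stand-ins for the trees fit in each boosting round; it illustrates only the indexing convention and does not call EXGBoost.

```elixir
# Stand-ins for the trees fit in each of 6 boosting rounds (rounds 0..5).
rounds = Enum.map(0..5, fn i -> "round_#{i}" end)

m = 1
n = 4

# Rounds m, m+1, ..., n-1: start at index m and keep n - m entries.
sliced = Enum.slice(rounds, m, n - m)

IO.inspect(sliced)
```

With m = 1 and n = 4 the slice keeps rounds 1, 2, and 3; round 4 is excluded because the upper bound is exclusive.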
Update for one iteration, with objective function calculated internally.
If an objective function is provided rather than a number of iterations, this updates for one iteration, with objective function defined by the user.
See Custom Objective for details.