Scholar.ModelSelection (Scholar v0.4.1)


Module containing cross validation, splitting functions, and other model selection methods.

Summary

Functions

cross_validate(x, y, folding_fun, scoring_fun)
General interface of cross validation.

grid_search(x, y, folding_fun, scoring_fun, opts)
General interface of grid search.

k_fold_split(x, k)
Performs a K-fold split on the given data.

weighted_cross_validate(x, y, weights, folding_fun, scoring_fun)
General interface of weighted cross validation.

weighted_grid_search(x, y, weights, folding_fun, scoring_fun, opts)
General interface of weighted grid search.

Functions

cross_validate(x, y, folding_fun, scoring_fun)

General interface of cross validation.

Examples

iex> folding_fun = fn x -> Scholar.ModelSelection.k_fold_split(x, 3) end
iex> scoring_fun = fn x, y ->
...>   {x_train, x_test} = x
...>   {y_train, y_test} = y
...>   model = Scholar.Linear.LinearRegression.fit(x_train, y_train, fit_intercept?: true)
...>   y_pred = Scholar.Linear.LinearRegression.predict(model, x_test)
...>   mse = Scholar.Metrics.Regression.mean_square_error(y_test, y_pred)
...>   mae = Scholar.Metrics.Regression.mean_absolute_error(y_test, y_pred)
...>   [mse, mae]
...> end
iex> x = Nx.iota({7, 2})
iex> y = Nx.tensor([0, 1, 2, 0, 1, 1, 0])
iex> Scholar.ModelSelection.cross_validate(x, y, folding_fun, scoring_fun)
#Nx.Tensor<
  f32[2][3]
  [
    [1.5700000524520874, 1.2149654626846313, 0.005000002216547728],
    [1.100000023841858, 1.0735294818878174, 0.050000011920928955]
  ]
>
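
Each row of the result corresponds to one value returned by scoring_fun (here mse and mae) and each column to one fold, so fold scores can be aggregated along axis 1. A minimal sketch using Nx:

scores = Scholar.ModelSelection.cross_validate(x, y, folding_fun, scoring_fun)
# Average each metric over the folds (axis 1): one value for MSE, one for MAE
Nx.mean(scores, axes: [1])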

grid_search(x, y, folding_fun, scoring_fun, opts)

General interface of grid search.

opts must be a keyword list in which every value is a list of candidate values; the grid search is performed over every combination of those values.
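
As a sketch of what this expansion means (with hypothetical option names :a and :b, for illustration only), a keyword list of lists unfolds into the cartesian product of its values, and each resulting keyword list is passed to scoring_fun as its opts argument:

combinations =
  for a <- [0.1, 1.0], b <- [true, false] do
    [a: a, b: b]
  end
# [[a: 0.1, b: true], [a: 0.1, b: false], [a: 1.0, b: true], [a: 1.0, b: false]]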

Examples

iex> folding_fun = fn x -> Scholar.ModelSelection.k_fold_split(x, 3) end
iex> scoring_fun = fn x, y, opts ->
...>   {x_train, x_test} = x
...>   {y_train, y_test} = y
...>   model = Scholar.Linear.LogisticRegression.fit(x_train, y_train, opts)
...>   y_pred = Scholar.Linear.LogisticRegression.predict(model, x_test)
...>   mse = Scholar.Metrics.Regression.mean_square_error(y_test, y_pred)
...>   mae = Scholar.Metrics.Regression.mean_absolute_error(y_test, y_pred)
...>   [mse, mae]
...> end
iex> x = Nx.iota({7, 2})
iex> y = Nx.tensor([0, 1, 2, 0, 1, 1, 0])
iex> opts = [
...>   num_classes: [3],
...>   max_iterations: [10, 20, 50],
...>   alpha: [0.0, 0.1, 1.0]
...> ]
iex> Scholar.ModelSelection.grid_search(x, y, folding_fun, scoring_fun, opts)

k_fold_split(x, k)

Performs a K-fold split on the given data.

Examples

iex> x = Nx.iota({7, 2})
iex> Scholar.ModelSelection.k_fold_split(x, 2) |> Enum.to_list()
[
  {Nx.tensor(
    [
      [6, 7],
      [8, 9],
      [10, 11]
    ]
  ),
  Nx.tensor(
    [
      [0, 1],
      [2, 3],
      [4, 5]
    ]
  )},
  {Nx.tensor(
    [
      [0, 1],
      [2, 3],
      [4, 5]
    ]
  ),
  Nx.tensor(
    [
      [6, 7],
      [8, 9],
      [10, 11]
    ]
  )}
]
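
Note that Nx.iota({7, 2}) has seven rows but only six appear above: with k = 2 each fold holds div(7, 2) = 3 rows, so the trailing row [12, 13] is dropped. k_fold_split folds a single tensor, so matching folds for features and targets can be obtained by zipping two splits; a minimal sketch (presumably close to what cross_validate does for you internally):

x_folds = Scholar.ModelSelection.k_fold_split(x, 2)
y_folds = Scholar.ModelSelection.k_fold_split(Nx.tensor([0, 1, 2, 0, 1, 1, 0]), 2)
Stream.zip(x_folds, y_folds)
|> Enum.map(fn {{x_train, x_test}, {y_train, y_test}} ->
  # x_train/y_train and x_test/y_test now refer to the same rows in each fold
  {Nx.shape(x_train), Nx.shape(y_train)}
end)
# [{{3, 2}, {3}}, {{3, 2}, {3}}]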

weighted_cross_validate(x, y, weights, folding_fun, scoring_fun)

General interface of weighted cross validation.

Examples

iex> folding_fun = fn x -> Scholar.ModelSelection.k_fold_split(x, 3) end
iex> scoring_fun = fn x, y, weights ->
...>   {x_train, x_test} = x
...>   {y_train, y_test} = y
...>   {weights_train, _weights_test} = weights
...>   model = Scholar.Linear.LinearRegression.fit(x_train, y_train, fit_intercept?: true, sample_weights: weights_train)
...>   y_pred = Scholar.Linear.LinearRegression.predict(model, x_test)
...>   mse = Scholar.Metrics.Regression.mean_square_error(y_test, y_pred)
...>   mae = Scholar.Metrics.Regression.mean_absolute_error(y_test, y_pred)
...>   [mse, mae]
...> end
iex> x = Nx.iota({7, 2})
iex> y = Nx.tensor([0, 1, 2, 0, 1, 1, 0])
iex> weights = Nx.tensor([1, 2, 1, 2, 1, 2, 1])
iex> Scholar.ModelSelection.weighted_cross_validate(x, y, weights, folding_fun, scoring_fun)
#Nx.Tensor<
  f32[2][3]
  [
    [0.5010337233543396, 1.1419668197631836, 0.35123950242996216],
    [0.522727370262146, 1.0526316165924072, 0.590908944606781]
  ]
>

weighted_grid_search(x, y, weights, folding_fun, scoring_fun, opts)

General interface of weighted grid search.

If you want to use opts inside scoring_fun, it must accept them as an extra parameter, as in the example below.

Examples

iex> folding_fun = fn x -> Scholar.ModelSelection.k_fold_split(x, 3) end
iex> scoring_fun = fn x, y, weights, opts ->
...>   {x_train, x_test} = x
...>   {y_train, y_test} = y
...>   {weights_train, _weights_test} = weights
...>   opts = Keyword.put(opts, :sample_weights, weights_train)
...>   model = Scholar.Linear.RidgeRegression.fit(x_train, y_train, opts)
...>   y_pred = Scholar.Linear.RidgeRegression.predict(model, x_test)
...>   mse = Scholar.Metrics.Regression.mean_square_error(y_test, y_pred)
...>   mae = Scholar.Metrics.Regression.mean_absolute_error(y_test, y_pred)
...>   [mse, mae]
...> end
iex> x = Nx.iota({7, 2})
iex> y = Nx.tensor([0, 1, 2, 0, 1, 1, 0])
iex> weights = [Nx.tensor([1, 2, 1, 2, 1, 2, 1]), Nx.tensor([2, 1, 2, 1, 2, 1, 2])]
iex> opts = [
...>   alpha: [0, 1, 5],
...>   fit_intercept?: [true, false]
...> ]
iex> Scholar.ModelSelection.weighted_grid_search(x, y, weights, folding_fun, scoring_fun, opts)
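
Note that weights is given here as a list of two candidate weight tensors. Assuming each candidate weighting is combined with every opts combination (a sketch of the search size, not a documented contract):

# Hypothetical count of scored combinations in the example above
length(weights) * length(opts[:alpha]) * length(opts[:fit_intercept?])
#=> 2 * 3 * 2 = 12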