Scholar.Preprocessing.OneHotEncoder (Scholar v0.4.0)

Implements an encoder that converts integer values (a substitute for categorical data in tensors) into 0-1 vectors. The index of the 1 in each vector follows the sorted order of the values: for x < y, one_index(x) < one_index(y).

Currently the module supports only 1D tensors.
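
The ordering property can be checked by reading back the position of the 1 in each encoded row. The following is a minimal sketch, assuming Scholar 0.4.0 is installed through Mix.install (inside a Mix project, drop that line and use the project's dependencies). Nx.argmax/2 is used only to recover the indices and is not part of this module.

```elixir
# Assumption: run as a standalone script with network access for Mix.install.
Mix.install([{:scholar, "~> 0.4.0"}])

alias Scholar.Preprocessing.OneHotEncoder

tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
encoded = OneHotEncoder.fit_transform(tensor, num_categories: 4)

# Position of the 1 in each row, i.e. one_index(value) for every input value.
indices = encoded |> Nx.argmax(axis: 1) |> Nx.to_flat_list()

# Pair each distinct value with its index and sort by value.
pairs = tensor |> Nx.to_flat_list() |> Enum.zip(indices) |> Enum.uniq() |> Enum.sort()

IO.inspect(pairs)
# prints [{2, 0}, {3, 1}, {4, 2}, {56, 3}]
```

Sorting by value leaves the indices in increasing order as well, which is the property stated above.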

Summary

Functions

fit(tensor, opts)

Creates a mapping from values to one-hot vectors.

fit_transform(tensor, opts)

Applies encoding to the provided tensor directly. Equivalent to calling fit/2 and then transform/2 on the same data.

transform(arg1, tensor)

Encodes labels as a one-hot numeric tensor. Every value passed to transform/2 must have been seen by fit/2; otherwise an error occurs.

Functions

fit(tensor, opts)

Creates a mapping from values to one-hot vectors.

Options

  • :num_categories (pos_integer/0) - Required. The number of categories to be encoded.

Examples

iex> tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> Scholar.Preprocessing.OneHotEncoder.fit(tensor, num_categories: 4)
%Scholar.Preprocessing.OneHotEncoder{
  ordinal_encoder: %Scholar.Preprocessing.OrdinalEncoder{
    categories: Nx.tensor([2, 3, 4, 56])
  }
}
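
To illustrate what a mapping from sorted categories to one-hot rows amounts to, here is a sketch in plain Nx. It is not Scholar's implementation; it only reproduces the idea with broadcasting, and the categories tensor is typed in by hand from the output above.

```elixir
# Assumption: run as a standalone script; only Nx is needed for this sketch.
Mix.install([{:nx, "~> 0.9"}])

tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])

# Sorted distinct values, copied from the categories field shown above.
categories = Nx.tensor([2, 3, 4, 56])

# Compare every value (shape {7, 1}) with every category (shape {1, 4}).
# Broadcasting yields a {7, 4} tensor with exactly one 1 per row.
one_hot = Nx.equal(Nx.new_axis(tensor, 1), Nx.new_axis(categories, 0))

IO.inspect(Nx.to_list(one_hot))
```

Each row has its 1 in the column of the matching category, giving the same rows as the fit_transform/2 example in this document.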

fit_transform(tensor, opts)

Applies encoding to the provided tensor directly. Equivalent to calling fit/2 and then transform/2 on the same data.

Examples

iex> tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> Scholar.Preprocessing.OneHotEncoder.fit_transform(tensor, num_categories: 4)
#Nx.Tensor<
  u8[7][4]
  [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0]
  ]
>
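
The stated equivalence with fit/2 followed by transform/2 can be verified directly. This is a small sketch under the same assumptions as the examples above (Scholar 0.4.0 installed via Mix.install); the variable names are illustrative only.

```elixir
# Assumption: run as a standalone script with network access for Mix.install.
Mix.install([{:scholar, "~> 0.4.0"}])

alias Scholar.Preprocessing.OneHotEncoder

tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])

# One call.
direct = OneHotEncoder.fit_transform(tensor, num_categories: 4)

# Two calls on the same data.
two_step =
  tensor
  |> OneHotEncoder.fit(num_categories: 4)
  |> OneHotEncoder.transform(tensor)

# Element-wise comparison reduced to a single boolean.
same? = Nx.to_number(Nx.all(Nx.equal(direct, two_step))) == 1

IO.inspect(same?)
```

Keeping the two steps separate is useful when the fitted encoder has to be reused on other tensors, as in the transform/2 examples.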

transform(arg1, tensor)

Encodes labels as a one-hot numeric tensor. Every value passed to transform/2 must have been seen by fit/2; otherwise an error occurs.

Examples

iex> tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> encoder = Scholar.Preprocessing.OneHotEncoder.fit(tensor, num_categories: 4)
iex> Scholar.Preprocessing.OneHotEncoder.transform(encoder, tensor)
#Nx.Tensor<
  u8[7][4]
  [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0]
  ]
>

iex> tensor = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> encoder = Scholar.Preprocessing.OneHotEncoder.fit(tensor, num_categories: 4)
iex> new_tensor = Nx.tensor([2, 3, 4, 3, 4, 56, 2])
iex> Scholar.Preprocessing.OneHotEncoder.transform(encoder, new_tensor)
#Nx.Tensor<
  u8[7][4]
  [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0]
  ]
>
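
Because values not seen during fitting are rejected, it can help to check a tensor before calling transform/2. The sketch below is one possible pre-check, not part of Scholar: SeenCheck and all_seen?/2 are hypothetical names, and reading the fitted values from encoder.ordinal_encoder.categories assumes the struct layout shown in the fit/2 example.

```elixir
# Assumption: run as a standalone script with network access for Mix.install.
Mix.install([{:scholar, "~> 0.4.0"}])

defmodule SeenCheck do
  # Hypothetical helper, not part of Scholar.
  # Returns true when every element of `tensor` occurs in `categories`.
  def all_seen?(categories, tensor) do
    tensor
    |> Nx.new_axis(1)
    |> Nx.equal(Nx.new_axis(categories, 0))
    # Does each value match at least one category?
    |> Nx.any(axes: [1])
    # Do all values have a match?
    |> Nx.all()
    |> Nx.to_number()
    |> Kernel.==(1)
  end
end

alias Scholar.Preprocessing.OneHotEncoder

encoder = OneHotEncoder.fit(Nx.tensor([3, 2, 4, 56, 2, 4, 2]), num_categories: 4)

# Assumed field access, based on the struct printed in the fit/2 example.
categories = encoder.ordinal_encoder.categories

seen = SeenCheck.all_seen?(categories, Nx.tensor([2, 3, 56]))
unseen = SeenCheck.all_seen?(categories, Nx.tensor([2, 5]))

IO.inspect({seen, unseen})
# prints {true, false}
```

The second tensor contains 5, which was not present in the fitted data, so the check returns false for it.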