Scholar.Preprocessing.OneHotEncoder (Scholar v0.3.0)

Implements an encoder that converts integer values (stand-ins for categorical data in tensors) into 0-1 vectors. The index of the 1 in each vector follows the sorted order of the values: if x < y, then one_index(x) < one_index(y).

Currently the module supports only 1D tensors.
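
The ordering rule above can be illustrated with a minimal sketch in plain Elixir. It is not Scholar's implementation: the module name OneHotSketch and its functions are hypothetical, and plain lists stand in for 1D tensors. It only shows how sorting the distinct values fixes the position of the 1 in each row.

```elixir
defmodule OneHotSketch do
  # Hypothetical helper, not part of Scholar: lists stand in for 1D tensors.

  # Collect the distinct values and sort them. A value's position in this
  # list is the index of the 1 in its one-hot row.
  def fit(values), do: values |> Enum.uniq() |> Enum.sort()

  # Build one row per input value, with a single 1 at the value's index.
  def transform(classes, values) do
    Enum.map(values, fn value ->
      index = Enum.find_index(classes, &(&1 == value))

      if is_nil(index) do
        raise ArgumentError, "unseen value: #{inspect(value)}"
      end

      for i <- 0..(length(classes) - 1), do: if(i == index, do: 1, else: 0)
    end)
  end
end

classes = OneHotSketch.fit([3, 2, 4, 56, 2, 4, 2])
rows = OneHotSketch.transform(classes, [3, 2, 56])

IO.inspect(classes) # [2, 3, 4, 56]
IO.inspect(rows)    # [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
```

Because 2 < 3 < 4 < 56, the 1 for the value 2 sits at index 0 and the 1 for 56 sits at index 3, which matches the rows shown in the examples on this page.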

Summary

Functions

fit/2

Creates a mapping from values to one-hot vectors.

fit_transform/2

Applies the encoding to the provided tensor directly. Equivalent to calling fit/2 and then transform/2 on the same data.

transform/2

Encodes labels as a one-hot numeric tensor. Every value passed to transform/2 must have been seen by fit/2; otherwise an error occurs.

Functions

fit/2

Creates a mapping from values to one-hot vectors.

Options

  • :num_classes (pos_integer/0) - Required. Number of classes to be encoded.

Examples

iex> t = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> Scholar.Preprocessing.OneHotEncoder.fit(t, num_classes: 4)
%Scholar.Preprocessing.OneHotEncoder{
  encoder: %Scholar.Preprocessing.OrdinalEncoder{
    encoding_tensor: Nx.tensor(
      [
        [0, 2],
        [1, 3],
        [2, 4],
        [3, 56]
      ]
    )
  },
  one_hot: Nx.tensor(
    [
      [1, 0, 0, 0],
      [0, 1, 0, 0],
      [0, 0, 1, 0],
      [0, 0, 0, 1]
    ], type: :u8
  )
}

fit_transform(tensor, opts \\ [])

Applies the encoding to the provided tensor directly. Equivalent to calling fit/2 and then transform/2 on the same data.

Examples

iex> t = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> Scholar.Preprocessing.OneHotEncoder.fit_transform(t, num_classes: 4)
#Nx.Tensor<
  u8[7][4]
  [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0]
  ]
>
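
The stated equivalence can be checked with a short script. This is a sketch that uses only the calls shown on this page plus Nx.equal/2 and Nx.all/1. The Mix.install line and its version requirement are assumptions about how the dependency is fetched, so adjust them to your project.

```elixir
# Assumption: the script runs standalone and fetches Scholar (and Nx with it).
Mix.install([{:scholar, "~> 0.3.0"}])

alias Scholar.Preprocessing.OneHotEncoder

t = Nx.tensor([3, 2, 4, 56, 2, 4, 2])

# Two-step path: learn the mapping, then encode the same tensor.
two_step =
  t
  |> OneHotEncoder.fit(num_classes: 4)
  |> OneHotEncoder.transform(t)

# One-step path.
one_step = OneHotEncoder.fit_transform(t, num_classes: 4)

# Element-wise comparison reduced to a single flag tensor.
same? = Nx.all(Nx.equal(two_step, one_step))
IO.inspect(same?)
```

Keeping the fitted encoder around (the two-step path) is the better fit when the same mapping has to be applied to more than one tensor.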

transform/2

Encodes labels as a one-hot numeric tensor. Every value passed to transform/2 must have been seen by fit/2; otherwise an error occurs.

Examples

iex> t = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> encoder = Scholar.Preprocessing.OneHotEncoder.fit(t, num_classes: 4)
iex> Scholar.Preprocessing.OneHotEncoder.transform(encoder, t)
#Nx.Tensor<
  u8[7][4]
  [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0]
  ]
>

iex> t = Nx.tensor([3, 2, 4, 56, 2, 4, 2])
iex> encoder = Scholar.Preprocessing.OneHotEncoder.fit(t, num_classes: 4)
iex> new_tensor = Nx.tensor([2, 3, 4, 3, 4, 56, 2])
iex> Scholar.Preprocessing.OneHotEncoder.transform(encoder, new_tensor)
#Nx.Tensor<
  u8[7][4]
  [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0]
  ]
>
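
Since transform/2 fails on values that fit/2 never saw, it can help to look for such values beforehand. The following is a minimal sketch in plain Elixir. The module UnseenCheck and its function are hypothetical and not part of Scholar, and the inputs are ordinary lists (Nx.to_flat_list/1 can produce such a list from a tensor).

```elixir
defmodule UnseenCheck do
  # Hypothetical helper, not part of Scholar.
  # Returns the sorted distinct values of `new_values` that do not occur
  # in `fitted_values`.
  def unseen(fitted_values, new_values) do
    seen = MapSet.new(fitted_values)

    new_values
    |> Enum.reject(&MapSet.member?(seen, &1))
    |> Enum.uniq()
    |> Enum.sort()
  end
end

fitted = [3, 2, 4, 56, 2, 4, 2]

IO.inspect(UnseenCheck.unseen(fitted, [2, 3, 4, 3, 4, 56, 2])) # []
IO.inspect(UnseenCheck.unseen(fitted, [2, 5, 56, 7, 5]))       # [5, 7]
```

An empty result means every value is covered by the fitted mapping. A non-empty result lists the values that would need to be handled before encoding.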