Scholar.Cluster.GaussianMixture (Scholar v0.3.1)

Gaussian Mixture Model.

A Gaussian Mixture Model is a probabilistic model that assumes every data point is generated by choosing one of several fixed Gaussian distributions and then sampling from it. Its parameters are estimated with the Expectation-Maximization (EM) algorithm, an iterative procedure that alternates between two steps: the E-step, which computes the expected Gaussian assignment of each data point x under the current parameters, and the M-step, which updates the parameters to maximize the expected log-likelihood given those assignments. No iteration decreases the log-likelihood, but EM converges only to a local optimum, so the final result depends on the initial parameter values. For this reason the algorithm can be run several times from different initializations (see :num_runs), keeping the best result.
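For reference, the two EM steps described above are usually written as follows. This is the standard textbook form, not a transcription of Scholar's code: the symbols $\pi_k$ (weights), $\mu_k$ (means), $\Sigma_k$ (covariances), $r_{nk}$ (responsibility of component $k$ for point $x_n$) and $N_k$ are notation assumed here, and the library may differ in details such as the regularization added to the covariance diagonal.

```latex
% E-step: responsibility of component k for data point x_n
r_{nk} = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)}
              {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
N_k = \sum_{n=1}^{N} r_{nk}, \qquad
\pi_k = \frac{N_k}{N}, \qquad
\mu_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \, x_n, \qquad
\Sigma_k = \frac{1}{N_k} \sum_{n=1}^{N} r_{nk} \, (x_n - \mu_k)(x_n - \mu_k)^{\top}
```

The log-likelihood tracked across iterations is $\sum_{n=1}^{N} \log \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \Sigma_k)$.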

Time complexity is $O(NKD^3)$ for $N$ data points, $K$ Gaussian components and $D$ dimensions.

References:

[1] - Mixtures of Gaussians and the EM algorithm https://cs229.stanford.edu/notes2020spring/cs229-notes7b.pdf
[2] - Density Estimation with Gaussian Mixture Models https://mml-book.github.io/book/mml-book.pdf Chapter 11

Functions

fit(x, opts \\ [])

Fits a Gaussian Mixture Model for sample inputs x.

Options

:num_gaussians (pos_integer/0) - Required. The number of Gaussian distributions in the mixture.
:num_runs (pos_integer/0) - The number of times to initialize parameters and run the entire EM algorithm. The default value is 1.
:max_iter (pos_integer/0) - The maximum number of EM iterations to perform. The default value is 100.
:tol - The convergence threshold. The default value is 0.001.
:covariance_regularization_eps - A non-negative number added to each diagonal element of the covariance matrices to keep them positive definite. Usually a small number. The default value is 1.0e-6.
:key - Used for random number generation in parameter initialization. If the key is not provided, it is set to Nx.Random.key(System.system_time()).
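As an illustration of how the options above combine, the following sketch passes all of them explicitly. The option values are arbitrary, chosen only for demonstration, and the standalone-script setup with `Mix.install` and its version requirement is an assumption, not part of the original documentation.

```elixir
# Assumption: run as a standalone script; the version requirement is illustrative.
Mix.install([{:scholar, "~> 0.3"}])

# Same toy data as in the examples of this page: two well-separated groups.
x = Nx.tensor([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])

model =
  Scholar.Cluster.GaussianMixture.fit(x,
    # Required: number of mixture components.
    num_gaussians: 2,
    # Restart EM from three random initializations.
    num_runs: 3,
    # Iteration cap and convergence threshold.
    max_iter: 200,
    tol: 1.0e-4,
    # Value added to the covariance diagonals (here the documented default).
    covariance_regularization_eps: 1.0e-6,
    # Fixed key so that the initialization is reproducible.
    key: Nx.Random.key(12)
  )

IO.inspect(model.weights, label: "weights")
IO.inspect(Nx.shape(model.means), label: "means shape")
```

Passing a fixed :key makes repeated calls reproducible; leaving it out falls back to a key derived from the system time, as described above.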

Return Values

The function returns a struct with the following parameters:

:weights - The fractions of data sampled from each Gaussian, respectively.
:means - Means of the Gaussian components.
:covariances - Covariance matrices of the Gaussian components.
:precisions_cholesky - Cholesky decomposition of the precision matrices (inverses of the covariance matrices). This is useful for model inference.

Examples

iex> key = Nx.Random.key(12)
iex> x = Nx.tensor([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
iex> Scholar.Cluster.GaussianMixture.fit(x, num_gaussians: 2, key: key).means
Nx.tensor(
  [
    [1.0, 2.0],
    [10.0, 2.0]
  ]
)

predict(model, x)

Makes predictions with the given model on inputs x.

Return Values

It returns a tensor with the index of the assigned Gaussian for every input point.

Examples

iex> key = Nx.Random.key(12)
iex> x = Nx.tensor([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
iex> model = Scholar.Cluster.GaussianMixture.fit(x, num_gaussians: 2, key: key)
iex> Scholar.Cluster.GaussianMixture.predict(model, Nx.tensor([[8, 1], [2, 3]]))
Nx.tensor(
  [1, 0]
)

predict_prob(model, x)

Computes, with the given model, the probability of each Gaussian assignment for every input in x.

Return Values

It returns a tensor of probabilities of Gaussian assignments for every input point.

Examples

iex> key = Nx.Random.key(12)
iex> x = Nx.tensor([[1, 2], [1, 4], [1, 0], [10, 2], [10, 4], [10, 0]])
iex> model = Scholar.Cluster.GaussianMixture.fit(x, num_gaussians: 2, key: key)
iex> Scholar.Cluster.GaussianMixture.predict_prob(model, Nx.tensor([[8, 1], [2, 3]]))
Nx.tensor(
  [
    [0.0, 1.0],
    [1.0, 0.0]
  ]
)
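The probabilities above can be turned into hard assignments by taking the most probable component in each row. The sketch below does this on a literal copy of the tensor shown above, so it needs only Nx. That predict/2 chooses its labels in exactly this way is an assumption, although the result agrees with the predict/2 example for the same inputs. The `Mix.install` line and its version requirement are also illustrative.

```elixir
# Assumption: run as a standalone script; the version requirement is illustrative.
Mix.install([{:nx, "~> 0.7"}])

# Probabilities copied from the predict_prob/2 example:
# one row per input point, one column per Gaussian component.
probs = Nx.tensor([[0.0, 1.0], [1.0, 0.0]])

# Index of the most probable component for each point.
labels = Nx.argmax(probs, axis: 1)

IO.inspect(Nx.to_flat_list(labels))
# prints [1, 0]
```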