Scholar.Cluster.OPTICS (Scholar v0.4.0)

OPTICS (Ordering Points To Identify the Clustering Structure) is an algorithm for finding density-based clusters in spatial data.

It is closely related to DBSCAN: it finds core samples of high density and expands clusters from them. Unlike DBSCAN, it keeps the cluster hierarchy for a variable neighborhood radius. Clusters are then extracted using a DBSCAN-like method.
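To make "core sample" more concrete, here is a minimal plain-Elixir sketch of two quantities that the standard OPTICS formulation is built on: the core distance of a point and the reachability distance between two points. It is an illustration only, not Scholar's implementation. The module name `OpticsSketch` and its functions are hypothetical, and the sketch assumes Euclidean distance and that a point counts as its own first neighbor.

```elixir
defmodule OpticsSketch do
  # Hypothetical helper module for illustration; not part of Scholar.

  # Euclidean distance between two points given as lists of numbers.
  def distance(p, q) do
    Enum.zip(p, q)
    |> Enum.map(fn {a, b} -> (a - b) * (a - b) end)
    |> Enum.sum()
    |> :math.sqrt()
  end

  # Core distance: distance from `point` to its `min_samples`-th nearest
  # neighbor, counting the point itself as the first neighbor (assumption).
  # Returns nil when `points` has fewer than `min_samples` entries.
  def core_distance(points, point, min_samples) do
    points
    |> Enum.map(&distance(point, &1))
    |> Enum.sort()
    |> Enum.at(min_samples - 1)
  end

  # Reachability distance of `q` from `p`: the larger of the core distance
  # of `p` and the plain distance between `p` and `q`.
  def reachability_distance(points, p, q, min_samples) do
    max(core_distance(points, p, min_samples), distance(p, q))
  end
end

points = [[1, 2], [2, 5], [3, 6], [8, 7], [8, 8], [7, 3]]

# [8, 7] has [8, 8] at distance 1.0, so its core distance for min_samples: 2 is 1.0.
OpticsSketch.core_distance(points, [8, 7], 2)
# => 1.0

# [7, 3] is farther from [8, 7] than the core distance, so the plain distance wins.
OpticsSketch.reachability_distance(points, [8, 7], [7, 3], 2)
```

Under these assumptions, a point whose core distance is at most the chosen radius behaves as a core sample at that radius, which is why a single ordering can serve a range of radii.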

Functions

fit(x, opts \\ [])

Performs OPTICS clustering on x, a tensor of shape {n_samples, n_features}.

Options

  • :min_samples (pos_integer/0) - The number of samples in a neighborhood for a point to be considered as a core point. The default value is 5.

  • :max_eps - The maximum distance between two samples for one to be considered as in the neighborhood of the other. The default value, Nx.Constants.infinity(), identifies clusters across all scales.

  • :eps - The maximum distance between two samples for one to be considered as in the neighborhood of the other when clusters are extracted with the DBSCAN-like method. By default it takes the same value as :max_eps.

  • :algorithm (atom/0) - Algorithm used to compute the k-nearest neighbors. Possible values:

    * `:brute` - Brute-force search. See `Scholar.Neighbors.BruteKNN` for more details.
    
    * `:kd_tree` - k-d tree. See `Scholar.Neighbors.KDTree` for more details.
    
    * `:random_projection_forest` - Random projection forest. See `Scholar.Neighbors.RandomProjectionForest` for more details.
    
    * A module implementing `fit(data, opts)` and `predict(model, query)`. `predict/2` must return a tuple containing the indices
    of the k-nearest neighbors of the query points and the distances between the query points and those neighbors.
    The module must also accept `num_neighbors` as an argument.

    The default value is :brute.
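As a rough illustration of the custom-module contract described above, the following sketch implements an exact, brute-force neighbor search with Nx. Everything in it is an assumption made for illustration: the module name `MyApp.ExactNeighbors`, passing `num_neighbors` as a keyword option to `fit/2`, the `{indices, distances}` ordering of the returned tuple, and deriving `Nx.Container` so the struct can cross into numerical definitions. Whether OPTICS requires anything beyond the contract stated above should be checked against the Scholar source.

```elixir
# Assumes Nx is available (Scholar already depends on it).
defmodule MyApp.ExactNeighbors do
  # Hypothetical module; not part of Scholar.

  # Assumption: deriving Nx.Container lets the model be passed into defn code.
  @derive {Nx.Container, containers: [:data], keep: [:num_neighbors]}
  defstruct [:data, :num_neighbors]

  # Stores the training data and the requested number of neighbors.
  # Assumption: `num_neighbors` arrives as a keyword option.
  def fit(data, opts) do
    %__MODULE__{data: data, num_neighbors: Keyword.fetch!(opts, :num_neighbors)}
  end

  # Returns {indices, distances}, each of shape {n_queries, num_neighbors}.
  def predict(%__MODULE__{data: data, num_neighbors: k}, query) do
    # Pairwise differences via broadcasting: {n_queries, n_samples, n_features}.
    diff = Nx.subtract(Nx.new_axis(query, 1), Nx.new_axis(data, 0))

    # Euclidean distances: {n_queries, n_samples}.
    distances = diff |> Nx.multiply(diff) |> Nx.sum(axes: [2]) |> Nx.sqrt()

    # Indices of the k smallest distances in each row.
    indices =
      distances
      |> Nx.argsort(axis: 1)
      |> Nx.slice_along_axis(0, k, axis: 1)

    {indices, Nx.take_along_axis(distances, indices, axis: 1)}
  end
end

data = Nx.tensor([[1, 2], [2, 5], [3, 6], [8, 7], [8, 8], [7, 3]])
model = MyApp.ExactNeighbors.fit(data, num_neighbors: 2)

# For the query [8, 7], the two nearest rows are 3 ([8, 7] itself) and 4 ([8, 8]).
{indices, distances} = MyApp.ExactNeighbors.predict(model, Nx.tensor([[8, 7]]))
```

If the contract above is the full requirement, such a module could presumably be passed as `algorithm: MyApp.ExactNeighbors`; that usage is untested here.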

Return Values

The function returns a struct whose labels field is a tensor of shape {n_samples}, holding the cluster label for each point in the dataset given to fit. Noisy samples are labeled as -1.
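One common way to consume the labels is to group sample indices by cluster and set the noise aside. The sketch below works on a plain list so it runs without any dependencies. With Scholar, such a list could be obtained with `Nx.to_flat_list/1` on the labels tensor. The variable names are illustrative only.

```elixir
# Labels as a plain list; with Scholar this could come from
# Nx.to_flat_list(Scholar.Cluster.OPTICS.fit(x, eps: 2, min_samples: 2).labels).
labels = [-1, 0, 0, 1, 1, -1]

# Map each label to the indices of the samples that carry it.
groups =
  labels
  |> Enum.with_index()
  |> Enum.group_by(fn {label, _index} -> label end, fn {_label, index} -> index end)

# Noise is reported under -1; every other key is a cluster.
{noise, clusters} = Map.pop(groups, -1, [])

noise
# => [0, 5]

clusters
# => %{0 => [1, 2], 1 => [3, 4]}
```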

Examples

iex> x = Nx.tensor([[1, 2], [2, 5], [3, 6], [8, 7], [8, 8], [7, 3]])
iex> Scholar.Cluster.OPTICS.fit(x, min_samples: 2).labels
#Nx.Tensor<
  s32[6]
  [-1, -1, -1, -1, -1, -1]
>
iex> Scholar.Cluster.OPTICS.fit(x, eps: 4.5, min_samples: 2).labels
#Nx.Tensor<
  s32[6]
  [0, 0, 0, 1, 1, 1]
>
iex> Scholar.Cluster.OPTICS.fit(x, eps: 2, min_samples: 2).labels
#Nx.Tensor<
  s32[6]
  [-1, 0, 0, 1, 1, -1]
>
iex> Scholar.Cluster.OPTICS.fit(x, eps: 2, min_samples: 2, algorithm: :kd_tree, metric: {:minkowski, 1}).labels
#Nx.Tensor<
  s32[6]
  [-1, 0, 0, 1, 1, -1]
>
iex> Scholar.Cluster.OPTICS.fit(x, eps: 1, min_samples: 2).labels
#Nx.Tensor<
  s32[6]
  [-1, -1, -1, 0, 0, -1]
>
iex> Scholar.Cluster.OPTICS.fit(x, eps: 4.5, min_samples: 3).labels
#Nx.Tensor<
  s32[6]
  [0, 0, 0, 1, 1, -1]
>