Scholar.Metrics.Clustering (Scholar v0.4.0)

Metrics related to clustering algorithms.

Summary

Functions

silhouette_samples(x, labels, opts \\ [])

Compute the Silhouette Coefficient for each sample.

silhouette_score(x, labels, opts \\ [])

Compute the mean Silhouette Coefficient of all samples.

Functions

silhouette_samples(x, labels, opts \\ [])

Compute the Silhouette Coefficient for each sample.

The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.
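The description above can be made concrete with the standard textbook definition of the silhouette coefficient. This is a sketch of the usual formula, not a statement about Scholar's internals; the symbols $a(i)$, $b(i)$, $C_i$, and $d$ are notation introduced here, and the choice of distance $d$ is an assumption.

```latex
% C_i is the cluster containing sample i; d(i, j) is the distance between samples i and j.
a(i) = \frac{1}{|C_i| - 1} \sum_{j \in C_i,\ j \neq i} d(i, j)
\qquad
b(i) = \min_{C \neq C_i} \frac{1}{|C|} \sum_{j \in C} d(i, j)
\qquad
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\ b(i)\}}
```

Here $a(i)$ measures cohesion and $b(i)$ measures separation. By convention, a sample that is the only member of its cluster gets $s(i) = 0$, which is consistent with the `0.0` entries in the second example below, where labels 0 and 2 each have a single sample.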

The time complexity of the silhouette score is $O(N^2)$, where $N$ is the number of samples.

Options

  • :num_clusters (pos_integer/0) - Required. Number of clusters in the clustering.

Examples

iex> x = Nx.tensor([[0, 0], [1, 0], [1, 1], [3, 3], [4, 4.5]])
iex> labels = Nx.tensor([0, 0, 0, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_samples(x, labels, num_clusters: 2)
#Nx.Tensor<
  f32[5]
  [0.7647753357887268, 0.7781199216842651, 0.6754303574562073, 0.49344196915626526, 0.6627992987632751]
>
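As a sanity check, the first value in the output above can be reproduced by hand. The sketch below assumes the standard silhouette definition with Euclidean distances (an assumption about the implementation, although it matches the printed result). Here $a$ is the mean distance from sample 0 to the other members of its own cluster, and $b$ is the mean distance from sample 0 to the members of the other cluster.

```latex
\begin{aligned}
a &= \tfrac{1}{2}\bigl(\lVert (0,0) - (1,0) \rVert + \lVert (0,0) - (1,1) \rVert\bigr)
   = \tfrac{1}{2}\bigl(1 + \sqrt{2}\bigr) \approx 1.2071 \\
b &= \tfrac{1}{2}\bigl(\lVert (0,0) - (3,3) \rVert + \lVert (0,0) - (4,4.5) \rVert\bigr)
   = \tfrac{1}{2}\bigl(\sqrt{18} + \sqrt{36.25}\bigr) \approx 5.1317 \\
s &= \frac{b - a}{\max(a, b)} \approx \frac{5.1317 - 1.2071}{5.1317} \approx 0.7648
\end{aligned}
```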

iex> x = Nx.tensor([[0.1, 0], [0, 1], [22, 65], [42, 3], [4.2, 51]])
iex> labels = Nx.tensor([0, 1, 2, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_samples(x, labels, num_clusters: 3)
#Nx.Tensor<
  f32[5]
  [0.0, -0.9782054424285889, 0.0, -0.18546827137470245, -0.5929659008979797]
>

silhouette_score(x, labels, opts \\ [])

Compute the mean Silhouette Coefficient of all samples.
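In other words, the score is the arithmetic mean of the per-sample values returned by silhouette_samples. The notation $\bar{s}$ and $s(i)$ is introduced here for illustration; the numbers are the rounded per-sample values from the first silhouette_samples example.

```latex
\bar{s} = \frac{1}{N} \sum_{i=1}^{N} s(i)
        \approx \frac{0.76478 + 0.77812 + 0.67543 + 0.49344 + 0.66280}{5}
        = \frac{3.37457}{5}
        \approx 0.67491
```

This agrees with the value shown in the first example below.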

Options

  • :num_clusters (pos_integer/0) - Required. Number of clusters in the clustering.

Examples

iex> x = Nx.tensor([[0, 0], [1, 0], [1, 1], [3, 3], [4, 4.5]])
iex> labels = Nx.tensor([0, 0, 0, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_score(x, labels, num_clusters: 2)
#Nx.Tensor<
  f32
  0.6749133467674255
>

iex> x = Nx.tensor([[0.1, 0], [0, 1], [22, 65], [42, 3], [4.2, 51]])
iex> labels = Nx.tensor([0, 1, 2, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_score(x, labels, num_clusters: 3)
#Nx.Tensor<
  f32
  -0.35132792592048645
>