Scholar.Metrics.Clustering (Scholar v0.3.1)

Metrics related to clustering algorithms.

Summary

Functions

silhouette_samples(x, labels, opts \\ [])

Compute the Silhouette Coefficient for each sample.

silhouette_score(x, labels, opts \\ [])

Compute the mean Silhouette Coefficient of all samples.

Functions

silhouette_samples(x, labels, opts \\ [])

Compute the Silhouette Coefficient for each sample.

The silhouette value is a measure of how similar an object is to its own cluster (cohesion) compared to other clusters (separation). The silhouette ranges from −1 to +1, where a high value indicates that the object is well matched to its own cluster and poorly matched to neighboring clusters. If most objects have a high value, then the clustering configuration is appropriate. If many points have a low or negative value, then the clustering configuration may have too many or too few clusters.
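For reference, the standard definition of the silhouette value of sample $i$ can be written as follows. Here $a(i)$ is the mean distance from $i$ to the other members of its own cluster (cohesion) and $b(i)$ is the smallest mean distance from $i$ to the members of any other cluster (separation). These symbols are introduced for illustration and are not part of the Scholar API.

$$
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}
$$

A sample that is alone in its cluster is conventionally assigned $s(i) = 0$, which is consistent with the zeros in the second example below.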

The time complexity of computing the silhouette is $O(N^2)$, where $N$ is the number of samples, because every pair of samples contributes a distance.
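The quadratic cost comes from comparing every sample with every other sample. The following is a minimal sketch in plain Elixir (no Nx) of that pairwise computation, assuming Euclidean distance, at least two clusters, and no exact duplicate points. The module `SilhouetteSketch` and its functions are hypothetical names used only for illustration; this is not how Scholar implements the metric.

```elixir
defmodule SilhouetteSketch do
  # Hypothetical helper module for illustration; not part of Scholar.

  # Euclidean distance between two points given as lists of numbers.
  def distance(p, q) do
    Enum.zip(p, q)
    |> Enum.map(fn {a, b} -> (a - b) * (a - b) end)
    |> Enum.sum()
    |> :math.sqrt()
  end

  # Silhouette value of every sample; `points` and `labels` are plain lists.
  # Each sample is compared with all others, hence O(N^2) distance evaluations.
  def samples(points, labels) do
    indexed = Enum.with_index(Enum.zip(points, labels))

    for {{p, label}, i} <- indexed do
      # Mean distance from sample i to the other samples, per cluster label.
      mean_by_cluster =
        indexed
        |> Enum.reject(fn {_, j} -> j == i end)
        |> Enum.group_by(fn {{_, l}, _} -> l end, fn {{q, _}, _} -> distance(p, q) end)
        |> Map.new(fn {l, ds} -> {l, Enum.sum(ds) / length(ds)} end)

      case Map.pop(mean_by_cluster, label) do
        # Sample is alone in its cluster: silhouette is 0 by convention.
        {nil, _others} ->
          0.0

        # a: mean distance within own cluster, b: nearest other cluster.
        {a, others} ->
          b = others |> Map.values() |> Enum.min()
          (b - a) / max(a, b)
      end
    end
  end
end

points = [[0, 0], [1, 0], [1, 1], [3, 3], [4, 4.5]]
labels = [0, 0, 0, 1, 1]
IO.inspect(SilhouetteSketch.samples(points, labels))
```

On the data from the first example below, this sketch should agree with `silhouette_samples/3` up to floating-point precision. Unlike the Scholar function, it does not take a `:num_clusters` option, because it works on plain lists rather than fixed-shape tensors.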

Options

  • :num_clusters (pos_integer/0) - Required. Number of clusters in the clustering.

Examples

iex> x = Nx.tensor([[0, 0], [1, 0], [1, 1], [3, 3], [4, 4.5]])
iex> labels = Nx.tensor([0, 0, 0, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_samples(x, labels, num_clusters: 2)
#Nx.Tensor<
  f32[5]
  [0.7647753357887268, 0.7781199216842651, 0.6754303574562073, 0.49344196915626526, 0.6627992987632751]
>

iex> x = Nx.tensor([[0.1, 0], [0, 1], [22, 65], [42, 3], [4.2, 51]])
iex> labels = Nx.tensor([0, 1, 2, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_samples(x, labels, num_clusters: 3)
#Nx.Tensor<
  f32[5]
  [0.0, -0.9782054424285889, 0.0, -0.18546827137470245, -0.5929659008979797]
>

silhouette_score(x, labels, opts \\ [])

Compute the mean Silhouette Coefficient of all samples.
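
Written as a formula, with $s(i)$ denoting the per-sample Silhouette Coefficient returned by silhouette_samples and $N$ the number of samples (the symbol $\bar{s}$ is introduced here for illustration):

$$
\bar{s} = \frac{1}{N} \sum_{i=1}^{N} s(i)
$$

The documented examples are consistent with this unweighted mean: averaging the five per-sample values from the first silhouette_samples example gives approximately 0.6749, which matches the first result below.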

Options

  • :num_clusters (pos_integer/0) - Required. Number of clusters in the clustering.

Examples

iex> x = Nx.tensor([[0, 0], [1, 0], [1, 1], [3, 3], [4, 4.5]])
iex> labels = Nx.tensor([0, 0, 0, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_score(x, labels, num_clusters: 2)
#Nx.Tensor<
  f32
  0.6749133467674255
>

iex> x = Nx.tensor([[0.1, 0], [0, 1], [22, 65], [42, 3], [4.2, 51]])
iex> labels = Nx.tensor([0, 1, 2, 1, 1])
iex> Scholar.Metrics.Clustering.silhouette_score(x, labels, num_clusters: 3)
#Nx.Tensor<
  f32
  -0.35132792592048645
>