Scholar.Cluster.DBSCAN (Scholar v0.3.0)
Perform DBSCAN clustering from vector array or distance matrix.
DBSCAN (Density-Based Spatial Clustering of Applications with Noise) finds core samples of high density and expands clusters from them. It is well suited to data that contains clusters of similar density.
The time complexity is $O(N^2)$ for $N$ samples. The space complexity is $O(N^2)$.
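The quadratic bounds can be illustrated with a small sketch. It assumes the cost is dominated by a full pairwise distance matrix; the text above states only the bounds, so this is an illustration and not a description of the library's internals. The variable names (`diff`, `dist`, `neighbor_counts`) are hypothetical.

```elixir
Mix.install([:nx])

# Six 2-D points, the same ones used in the Examples section of this page.
x = Nx.tensor([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]], type: :f32)

# Broadcasting {n, 1, d} against {1, n, d} yields all n * n pairwise differences.
diff = Nx.subtract(Nx.new_axis(x, 1), Nx.new_axis(x, 0))

# Euclidean distance between every pair of rows: an {n, n} matrix,
# so n^2 entries to compute and to keep in memory.
dist = diff |> Nx.pow(2) |> Nx.sum(axes: [-1]) |> Nx.sqrt()
IO.inspect(Nx.shape(dist))

# Number of neighbors within eps for each point (a point counts as its own neighbor).
eps = 3
neighbor_counts = dist |> Nx.less_equal(eps) |> Nx.sum(axes: [1])
IO.inspect(Nx.to_flat_list(neighbor_counts))
```

With `min_samples: 2`, every point whose count is at least 2 would qualify as a core point under the definition given in the Options section.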
Functions
fit(x, opts \\ [])
Perform DBSCAN clustering from vector array or distance matrix.
Options
:eps - The maximum distance between two samples for them to be considered as in the same neighborhood. The default value is 0.5.
:min_samples (integer/0) - The number of samples (or total weight) in a neighborhood for a point to be considered as a core point. This includes the point itself. The default value is 5.
:metric - The function that measures the pairwise distance between two points. Possible values:
  {:minkowski, p} - Minkowski metric. By changing the value of the p parameter (a positive number or :infinity) we can set Manhattan (1), Euclidean (2), Chebyshev (:infinity), or any arbitrary $L_p$ metric.
  :cosine - Cosine metric.
  Anonymous function of arity 2 that takes two rank-2 tensors.
  The default value is &Scholar.Metrics.Distance.pairwise_minkowski/2.
:weights - The weights for each observation in x. If equal to nil, all observations are assigned equal weight.
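As a hedged illustration of how these options might be combined, the sketch below passes a non-default metric and explicit weights to `fit`. The option names come from the list above. The data, the unit weights, and the variable names `chebyshev` and `weighted` are assumptions chosen for the example, and the dependency requirement simply targets the documented 0.3.x series.

```elixir
Mix.install([{:scholar, "~> 0.3.0"}])

x = Nx.tensor([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]])

# Chebyshev distance, selected with {:minkowski, :infinity} as described above.
chebyshev =
  Scholar.Cluster.DBSCAN.fit(x, eps: 3, min_samples: 2, metric: {:minkowski, :infinity})

IO.inspect(chebyshev.labels)

# Explicit per-observation weights (illustrative values, one per row of x).
weights = Nx.tensor([1, 1, 1, 1, 1, 1])
weighted = Scholar.Cluster.DBSCAN.fit(x, eps: 3, min_samples: 2, weights: weights)

IO.inspect(weighted.labels)
```

Because `:min_samples` is described as a number of samples or a total weight, raising the weight of an observation should make it easier for its neighborhood to reach the threshold.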
Return Values
The function returns a struct with the following fields:
:core_sample_indices - Indices of core samples represented as a mask. The mask is a boolean array of shape {num_samples} where 1 indicates that the corresponding sample is a core sample and 0 otherwise.
:labels - Cluster labels for each point in the dataset given to fit(). Noisy samples are given the label -1.
Examples
iex> x = Nx.tensor([[1, 2], [2, 2], [2, 3], [8, 7], [8, 8], [25, 80]])
iex> Scholar.Cluster.DBSCAN.fit(x, eps: 3, min_samples: 2)
%Scholar.Cluster.DBSCAN{
  core_sample_indices: Nx.tensor(
    [1, 1, 1, 1, 1, 0], type: :u8
  ),
  labels: Nx.tensor(
    [0, 0, 0, 1, 1, -1]
  )
}
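The returned mask and labels can be post-processed with ordinary Nx and Enum calls. The sketch below is one possible approach, not part of the Scholar API. It copies the two tensors from the example output above instead of calling `fit`, and the names `core_indices`, `noise_mask`, and `num_clusters` are hypothetical. The cluster count assumes labels are consecutive integers starting at 0, as in the example.

```elixir
Mix.install([:nx])

# Values copied from the example output above.
core_mask = Nx.tensor([1, 1, 1, 1, 1, 0], type: :u8)
labels = Nx.tensor([0, 0, 0, 1, 1, -1])

# Number of core samples: the sum of the 0/1 mask.
num_core = core_mask |> Nx.sum() |> Nx.to_number()

# Row indices of the core samples, recovered from the mask with plain Elixir.
core_indices =
  for {1, i} <- Enum.with_index(Nx.to_flat_list(core_mask)), do: i

# Noisy samples carry the label -1.
noise_mask = Nx.equal(labels, -1)

# Assumption: labels run from 0 to k - 1, so the cluster count is max label + 1.
num_clusters = labels |> Nx.reduce_max() |> Nx.add(1) |> Nx.to_number()

IO.inspect({num_core, core_indices, Nx.to_flat_list(noise_mask), num_clusters})
```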