View Source Scholar.Neighbors.RandomProjectionForest (Scholar v0.4.0)

Random Projection Forest for k-Nearest Neighbor Search.

Each tree in a forest is constructed using a divide and conquer approach. We start with the entire dataset and at every node we project the data onto a random hyperplane and split it in the following way: the points with the projection smaller than or equal to the median are put into the left subtree and the points with projection greater than the median are put into the right subtree. We then proceed recursively with the left and right subtree.

In this implementation the trees are complete, i.e. there are 2^l nodes at level l. The leaves of the trees are arranged as blocks in the field indices. We use the same hyperplane for all nodes on the same level as in [2].

[1] - Randomized partition trees for nearest neighbor search
[2] - Fast Nearest Neighbor Search through Sparse Random Projections and Voting

Summary

Functions

fit(tensor, opts)

Grows a random projection forest.

fit_n(tensor, key, opts)

predict(forest, query)

Computes approximate nearest neighbors of query tensor using random projection forest. Returns the neighbor indices and distances from query points.

Functions

fit(tensor, opts)

Grows a random projection forest.

Options

:num_neighbors (pos_integer/0) - Required. The number of nearest neighbors.
:metric - The function that measures the distance between two points. The default value is :euclidean.
:min_leaf_size (pos_integer/0) - The minumum number of points in the leaf.
:num_trees (pos_integer/0) - Required. The number of trees in the forest.
:key - Used for random number generation in hyperplane initialization. If the key is not provided, it is set to Nx.Random.key(System.system_time()).

Examples

iex> key = Nx.Random.key(12)
iex> tensor = Nx.iota({5, 2})
iex> forest = Scholar.Neighbors.RandomProjectionForest.fit(tensor, num_neighbors: 2, num_trees: 3, key: key)
iex> forest.indices
#Nx.Tensor<
  u32[3][5]
  [
    [0, 1, 2, 3, 4],
    [0, 1, 2, 3, 4],
    [4, 3, 2, 1, 0]
  ]
>

fit_n(tensor, key, opts)

predict(forest, query)

Computes approximate nearest neighbors of query tensor using random projection forest. Returns the neighbor indices and distances from query points.

Examples

iex> key = Nx.Random.key(12)
iex> tensor = Nx.iota({5, 2})
iex> forest = Scholar.Neighbors.RandomProjectionForest.fit(tensor, num_neighbors: 2, metric: :squared_euclidean, num_trees: 3, key: key)
iex> query = Nx.tensor([[3, 4]])
iex> {neighbors, distances} = Scholar.Neighbors.RandomProjectionForest.predict(forest, query)
iex> neighbors
#Nx.Tensor<
  u32[1][2]
  [
    [1, 2]
  ]
>
iex> distances
#Nx.Tensor<
  f32[1][2]
  [
    [2.0, 2.0]
  ]
>