Vettore.Distance (Vettore v0.2.3)
Stand-alone distance/similarity helpers and the collection-agnostic MMR re-ranker exposed by the Rust NIF layer.
Summary
Functions
compress_f32_vector/1
Compress a float vector into its sign-bit representation (64 floats → 64 bits → one u64). Useful for ultra-fast binary similarity.
cosine/2
Cosine similarity in 0.0..1.0 ((dot + 1) / 2 after length-normalisation).
dot_product/2
Raw dot product (no post-processing).
euclidean/2
Similarity based on Euclidean (L2) distance.
hamming/2
Bit-wise Hamming distance between two compressed vectors (see compress_f32_vector/1).
mmr_rerank/5
MMR (Maximal-Marginal-Relevance) re-ranker that trades off query relevance and result diversity.
Types
@type alpha() :: float()
@type distance() :: String.t()
@type final_k() :: pos_integer()
@type vector() :: [float()]
@type vector_bits() :: [integer()]
Functions
@spec compress_f32_vector(vector()) :: vector_bits()
Compress a float vector into its sign-bit representation (64 floats → 64 bits → one u64). Useful for ultra-fast binary similarity.
Examples
iex> Vettore.Distance.compress_f32_vector([1.0, 2.0, 3.0])
[1, 0, 0,..... 1, 0, 0]
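The packing idea can be sketched in pure Elixir. This is not the NIF's implementation: the module name `SignBitSketch`, the rule "bit set when the value is strictly positive", and the least-significant-bit-first ordering are all assumptions made for illustration.

```elixir
defmodule SignBitSketch do
  import Bitwise

  # Hypothetical helper: packs up to 64 floats into one non-negative integer.
  # Assumed convention: bit i of a word is set when the i-th float of that
  # 64-element chunk is strictly positive.
  def compress(vector) do
    vector
    |> Enum.chunk_every(64)
    |> Enum.map(fn chunk ->
      chunk
      |> Enum.with_index()
      |> Enum.reduce(0, fn {x, i}, word ->
        if x > 0, do: word ||| (1 <<< i), else: word
      end)
    end)
  end
end

SignBitSketch.compress([1.0, -2.0, 3.0])
# => [5]  (bits 0 and 2 set)
```

Under these assumptions a 65-element vector spills into a second word, which is why the result is a list rather than a single integer.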
Cosine similarity in 0.0..1.0 ((dot + 1) / 2 after length-normalisation).
Examples
iex> Vettore.Distance.cosine([1.0, 0.0, 0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0, 0.0, 0.0])
1.0
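To make the (dot + 1) / 2 mapping concrete, here is a minimal pure-Elixir sketch of the formula as described. `CosineSketch` and its helpers are hypothetical names, the code assumes neither vector is all zeros, and the NIF may handle edge cases and precision differently.

```elixir
defmodule CosineSketch do
  # Plain dot product of two equal-length lists of numbers.
  def dot(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + x * y end)
  end

  # Euclidean length of a vector.
  def norm(v), do: :math.sqrt(dot(v, v))

  # Cosine of the angle (in -1.0..1.0), shifted and scaled into 0.0..1.0.
  # Assumes both vectors have non-zero length.
  def cosine(a, b) do
    cos = dot(a, b) / (norm(a) * norm(b))
    (cos + 1.0) / 2.0
  end
end

CosineSketch.cosine([1.0, 0.0], [0.0, 1.0])
# => 0.5  (orthogonal vectors land in the middle of the range)
```

With this mapping, vectors pointing in the same direction score 1.0 and opposite vectors score 0.0.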
Raw dot product (no post-processing).
Examples
iex> Vettore.Distance.dot_product([1.0, 0.0, 0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0, 0.0, 0.0])
2.0
Similarity based on Euclidean (L2) distance.
The result is in 0.0..1.0 via the mapping 1 / (1 + d), so that identical vectors yield 1.0.
Examples
iex> Vettore.Distance.euclidean([1.0, 0.0, 0.0, 1.0, 0.0, 0.0], [1.0, 0.0, 0.0, 1.0, 0.0, 0.0])
1.0
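The 1 / (1 + d) mapping can be illustrated with a short pure-Elixir sketch. `EuclideanSketch` is a hypothetical module written only to show the arithmetic; it is not the library's code.

```elixir
defmodule EuclideanSketch do
  # L2 distance d between two equal-length lists of numbers.
  def distance(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0.0, fn {x, y}, acc -> acc + (x - y) * (x - y) end)
    |> :math.sqrt()
  end

  # Maps the distance into a similarity: d = 0 gives 1.0, and the value
  # shrinks toward 0.0 as the vectors move apart.
  def similarity(a, b), do: 1.0 / (1.0 + distance(a, b))
end

EuclideanSketch.similarity([0.0, 0.0], [3.0, 4.0])
# distance is 5.0, so the similarity is 1 / 6
```

Because the distance is never negative, the similarity can reach 1.0 (identical vectors) but only approaches 0.0 for very distant vectors.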
@spec hamming(vector_bits(), vector_bits()) :: float()
Bit-wise Hamming distance between two compressed vectors (see compress_f32_vector/1).
Examples
iex> Vettore.Distance.hamming([1, 0, 0, 1, 0, 0], [1, 0, 0, 1, 0, 0])
0
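A bit-wise Hamming distance is commonly computed as the number of set bits in the XOR of the two inputs. The sketch below shows that idea in pure Elixir; `HammingSketch` is a hypothetical module, it assumes both lists hold non-negative integers of equal length, and it returns an integer count, whereas the spec above declares a float() return value.

```elixir
defmodule HammingSketch do
  import Bitwise

  # Counts the set bits of a non-negative integer.
  def popcount(0), do: 0
  def popcount(n) when n > 0, do: (n &&& 1) + popcount(n >>> 1)

  # XORs the lists element by element and sums the differing bits.
  def hamming(a, b) do
    a
    |> Enum.zip(b)
    |> Enum.reduce(0, fn {x, y}, acc -> acc + popcount(bxor(x, y)) end)
  end
end

HammingSketch.hamming([0b1011], [0b0010])
# => 2  (the XOR is 0b1001)
```

The same function works whether each list element is a single 0/1 bit or a packed word of many bits.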
@spec mmr_rerank(search_result(), embeddings(), distance(), alpha(), final_k()) :: embeddings()
MMR (Maximal-Marginal-Relevance) re-ranker that trades off query relevance and result diversity.
initial – list of {id, similarity_to_query} tuples (first-pass hits)
embeddings – {id, vector} pairs (dimension must be consistent)
distance – "euclidean" | "cosine" | "dot" | "binary"
alpha – 0 ⇢ only diversity, 1 ⇢ only query-relevance
final_k – length of the wanted output list
Examples
iex> Vettore.Distance.mmr_rerank([{"my_id", 0.0}], [{"my_id", [1.0, 2.0, 3.0]}], "euclidean", 0.5, 1)
[{"my_id", 0.0}]
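The relevance/diversity trade-off can be sketched as a greedy loop: at each step, pick the candidate that maximises alpha * relevance - (1 - alpha) * (highest similarity to anything already picked). The code below is a pure-Elixir illustration under stated assumptions, not the NIF's implementation: `MmrSketch` is a hypothetical module, it takes a similarity function instead of a distance name, it assumes similarities are non-negative, and it returns {id, mmr_score} pairs.

```elixir
defmodule MmrSketch do
  # initial:    [{id, similarity_to_query}]
  # embeddings: [{id, vector}]
  # sim_fun:    hypothetical 2-arity similarity function (stands in for the
  #             "euclidean" | "cosine" | ... distance name)
  def rerank(initial, embeddings, sim_fun, alpha, final_k) do
    pick(initial, [], Map.new(embeddings), sim_fun, alpha, final_k)
  end

  # Stop when candidates run out or enough items have been picked.
  defp pick(candidates, picked, _vecs, _sim_fun, _alpha, k)
       when candidates == [] or k == 0,
       do: Enum.reverse(picked)

  defp pick(candidates, picked, vecs, sim_fun, alpha, k) do
    scored =
      Enum.map(candidates, fn {id, relevance} ->
        # Redundancy: highest similarity to any already-picked item
        # (0.0 when nothing has been picked yet).
        redundancy =
          Enum.reduce(picked, 0.0, fn {pid, _score}, acc ->
            max(acc, sim_fun.(Map.fetch!(vecs, id), Map.fetch!(vecs, pid)))
          end)

        {id, alpha * relevance - (1 - alpha) * redundancy}
      end)

    {best_id, _score} = best = Enum.max_by(scored, fn {_id, score} -> score end)
    rest = Enum.reject(candidates, fn {id, _rel} -> id == best_id end)
    pick(rest, [best | picked], vecs, sim_fun, alpha, k - 1)
  end
end

# Similarity used for the demo: 1 / (1 + L2 distance).
sim = fn a, b ->
  d = a |> Enum.zip(b) |> Enum.reduce(0.0, fn {x, y}, acc -> acc + (x - y) * (x - y) end)
  1.0 / (1.0 + :math.sqrt(d))
end

initial = [{"a", 0.9}, {"b", 0.85}, {"c", 0.5}]
embeddings = [{"a", [1.0, 0.0]}, {"b", [1.0, 0.0]}, {"c", [0.0, 1.0]}]

MmrSketch.rerank(initial, embeddings, sim, 0.5, 2) |> Enum.map(&elem(&1, 0))
# => ["a", "c"]  ("b" duplicates "a", so the less relevant but distinct "c" wins)
```

With alpha set to 1.0 the redundancy term vanishes and the same call returns the two most relevant ids, "a" and "b".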