Similarity (Similarity v0.4.0)

Contains basic functions for similarity calculation.

Similarity.Cosine - easy cosine similarity calculation

Similarity.Simhash - simhash similarity calculation between two strings

Summary

Functions

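cosine(list_a, list_b)

Calculates the cosine similarity between two vectors.

cosine_srol(list_a, list_b)

Multiplies the cosine similarity by the square root of the length of the compared vectors.

dot_product(list_a, list_b, acc \\ 0)

Calculates the Euclidean dot product of two vectors.

magnitude(list, acc \\ 0)

Calculates the Euclidean magnitude of a vector.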

Functions

cosine(list_a, list_b)

Calculates the cosine similarity between two vectors.

https://en.wikipedia.org/wiki/Cosine_similarity#Definition

Example:

Similarity.cosine([1, 2, 3], [1, 2, 8])
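
The definition linked above divides the dot product of the two vectors by the product of their magnitudes. A minimal sketch of that formula follows; the module name CosineSketch and its helpers are hypothetical and are not taken from the library's source.

```elixir
defmodule CosineSketch do
  # Hypothetical illustration module; not the Similarity library's code.

  # Dot product: sum of the pairwise products of the elements.
  def dot_product(a, b) do
    a |> Enum.zip(b) |> Enum.reduce(0, fn {x, y}, acc -> acc + x * y end)
  end

  # Euclidean magnitude: square root of the sum of squared elements.
  def magnitude(v) do
    v |> Enum.reduce(0, fn x, acc -> acc + x * x end) |> :math.sqrt()
  end

  # Cosine similarity: dot product divided by the product of magnitudes.
  def cosine(a, b) do
    dot_product(a, b) / (magnitude(a) * magnitude(b))
  end
end

# Parallel vectors score close to 1.0; orthogonal vectors score 0.0.
IO.inspect(CosineSketch.cosine([1, 2, 3], [2, 4, 6]))
IO.inspect(CosineSketch.cosine([1, 0], [0, 1]))
```

Note that this sketch divides by zero if either vector has zero magnitude; how the library handles that case is not stated here.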

cosine_srol(list_a, list_b)

Multiplies the cosine similarity by the square root of the length of the compared vectors.

Here, srol stands for "square root of length".

This yields more comparable scores when the number of compared attributes varies from one pair to another. Consider using it instead of cosine/2 when the number of shared attributes differs.

Example:

Similarity.cosine_srol([1, 2, 3], [1, 2, 8])
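
To illustrate the scaling described above, here is a rough sketch. It assumes that both lists have the same length and that "length" means the element count of the first list; the module SrolSketch is hypothetical and may differ from the library's actual implementation.

```elixir
defmodule SrolSketch do
  # Hypothetical illustration module; not the Similarity library's code.

  # Plain cosine similarity: dot product over the product of magnitudes.
  def cosine(a, b) do
    dot = a |> Enum.zip(b) |> Enum.reduce(0, fn {x, y}, acc -> acc + x * y end)
    norm = fn v -> :math.sqrt(Enum.reduce(v, 0, fn x, acc -> acc + x * x end)) end
    dot / (norm.(a) * norm.(b))
  end

  # Assumption: the scaling factor is sqrt(length(a)), with a and b
  # having the same number of elements.
  def cosine_srol(a, b) do
    cosine(a, b) * :math.sqrt(length(a))
  end
end

# Both pairs have a cosine similarity of 1.0, but the pair that shares
# more attributes gets the higher scaled score.
IO.inspect(SrolSketch.cosine_srol([1, 1, 1, 1], [1, 1, 1, 1]))
IO.inspect(SrolSketch.cosine_srol(List.duplicate(1, 9), List.duplicate(1, 9)))
```

Under these assumptions, a perfect match on four attributes scores about 2.0 and a perfect match on nine scores about 3.0, which is what makes pairs with different attribute counts easier to rank against each other.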

dot_product(list_a, list_b, acc \\ 0)

Calculates the Euclidean dot product of two vectors.

https://en.wikipedia.org/wiki/Euclidean_vector#Dot_product

Example:

iex> Similarity.dot_product([1, 2], [3, 4])
11
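
The acc \\ 0 default in the signature suggests a tail-recursive accumulator. One way such a function could be written is sketched below; DotSketch is a hypothetical module, and the choice to raise on lists of unequal length is an assumption rather than documented behavior.

```elixir
defmodule DotSketch do
  # Hypothetical illustration module; not the Similarity library's code.

  # Function head declaring the default accumulator value.
  def dot_product(list_a, list_b, acc \\ 0)

  # Both lists exhausted: the accumulator holds the result.
  def dot_product([], [], acc), do: acc

  # Multiply the heads, add to the accumulator, recurse on the tails.
  # Assumption: lists of unequal length match no clause and raise
  # a FunctionClauseError.
  def dot_product([a | rest_a], [b | rest_b], acc) do
    dot_product(rest_a, rest_b, acc + a * b)
  end
end

# 1 * 3 + 2 * 4
IO.inspect(DotSketch.dot_product([1, 2], [3, 4]))
```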

magnitude(list, acc \\ 0)

Calculates the Euclidean magnitude of a vector.

https://en.wikipedia.org/wiki/Magnitude_(mathematics)#Euclidean_vector_space

Example:

iex> Similarity.magnitude([2])
2.0
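
The Euclidean magnitude is the square root of the sum of the squared elements. A small sketch using the same accumulator pattern as the signature above; MagnitudeSketch is a hypothetical module and not the library's source.

```elixir
defmodule MagnitudeSketch do
  # Hypothetical illustration module; not the Similarity library's code.

  # Function head declaring the default accumulator value.
  def magnitude(list, acc \\ 0)

  # List exhausted: take the square root of the accumulated sum of squares.
  def magnitude([], acc), do: :math.sqrt(acc)

  # Add the square of the head to the accumulator and recurse on the tail.
  def magnitude([x | rest], acc), do: magnitude(rest, acc + x * x)
end

# sqrt(3 * 3 + 4 * 4)
IO.inspect(MagnitudeSketch.magnitude([3, 4]))
```

Because :math.sqrt/1 always returns a float, the result is a float even for integer input, which matches the 2.0 shown in the example above.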

simhash(left, right, options \\ [])

For docs see Similarity.Simhash

sorensen_dice(left, right, options \\ [])

For docs see Similarity.SorensenDice