FuzzyCompare.SubstringComparison (fuzzy_compare v1.1.0)
This module provides a similarity measure for strings of different lengths. Compare its result with that of a plain Jaro distance on the same inputs:
iex> FuzzyCompare.SubstringComparison.similarity("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
0.9090909090909092
iex> String.jaro_distance("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
0.5399600399600399
Functions
The similarity function takes two strings as arguments and returns their substring similarity as a float between 0 and 1.
Substring matching works in three steps: the Myers difference algorithm generates a list of substrings common to both inputs, each of these substrings is compared with the Jaro-Winkler function against the shorter of the two input strings, and the maximum comparison value found is returned.
Take "DEUTSCHLAND" and "BUNDESREPUBLIK DEUTSCHLAND" as the input strings. This yields the matching substrings ["DE", "U", "TSCHLAND"].
We compare each of them to the shorter of the two input strings:
iex> String.jaro_distance("DE", "DEUTSCHLAND")
0.7272727272727272
iex> String.jaro_distance("U", "DEUTSCHLAND")
0.6969696969696969
iex> String.jaro_distance("TSCHLAND", "DEUTSCHLAND")
0.9090909090909092
The highest of these values, 0.9090909090909092, is returned.
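The steps described above can be sketched in a few lines of Elixir. This is a minimal illustration, not the library's actual implementation: the module name SubstringSketch is hypothetical, the sketch scores substrings with String.jaro_distance/2 (as the iex examples above do) rather than a Jaro-Winkler function, and returning 0.0 when the inputs share no substring is an assumption.

```elixir
defmodule SubstringSketch do
  # Hypothetical module for illustration only; not part of fuzzy_compare.

  # Returns a float between 0.0 and 1.0 describing how well the best
  # common substring matches the shorter of the two inputs.
  def similarity(left, right) when is_binary(left) and is_binary(right) do
    shorter = if String.length(left) <= String.length(right), do: left, else: right

    left
    # Step 1: diff the two strings; the :eq entries are the shared substrings.
    |> String.myers_difference(right)
    |> Keyword.get_values(:eq)
    # Step 2: score every shared substring against the shorter input.
    |> Enum.map(&String.jaro_distance(&1, shorter))
    # Step 3: keep the best score.
    |> best_score()
  end

  # Assumption: inputs without any common substring score 0.0.
  defp best_score([]), do: 0.0
  defp best_score(scores), do: Enum.max(scores)
end
```

Calling SubstringSketch.similarity("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND") follows the same path as the walkthrough: the common substrings are extracted, each is scored against "DEUTSCHLAND", and the largest score is returned.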