FuzzyCompare.SubstringComparison (fuzzy_compare v1.1.0)

This module compares strings of different lengths, for example when one string appears inside a longer one.

iex> FuzzyCompare.SubstringComparison.similarity("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
0.9090909090909092

iex> String.jaro_distance("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
0.5399600399600399

Functions

similarity(left, right)

@spec similarity(binary(), binary()) :: float()

The similarity function takes two strings as arguments and returns their substring similarity as a float between 0 and 1.

The substring matching works in three steps: it generates the list of substrings common to both inputs using the Myers difference algorithm, compares each of these substrings against the shorter of the two input strings with the Jaro-Winkler function, and returns the highest comparison value found.

Take "DEUTSCHLAND" and "BUNDESREPUBLIK DEUTSCHLAND" as the input strings. This yields the matching substrings ["DE", "U", "TSCHLAND"].
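
One way to obtain such a list of equal substrings is sketched below. It assumes the standard library's String.myers_difference/2 as the diff step, which this page does not confirm, and equal_substrings is a hypothetical helper that is not part of fuzzy_compare.

```elixir
# Hypothetical helper, not part of fuzzy_compare.
# String.myers_difference/2 returns a keyword list of :eq, :ins and :del
# segments; only the :eq segments are present in both strings.
equal_substrings = fn left, right ->
  left
  |> String.myers_difference(right)
  |> Keyword.get_values(:eq)
end

# Returns the list of substrings shared by both inputs, in order.
equal_substrings.("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
```

Two strings with no characters in common yield an empty list, so a caller has to decide what similarity to report in that case.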

Each of them is compared to the shorter of the two input strings:

iex> String.jaro_distance("DE", "DEUTSCHLAND")
0.7272727272727272

iex> String.jaro_distance("U", "DEUTSCHLAND")
0.6969696969696969

iex> String.jaro_distance("TSCHLAND", "DEUTSCHLAND")
0.9090909090909092

The highest of these values, 0.9090909090909092, is returned.
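
The whole procedure can be approximated with a minimal sketch like the following. It is not the library's implementation: SubstringSketch is a hypothetical module, it assumes String.myers_difference/2 for the diff and String.jaro_distance/2 for the comparison (as in the examples above), and returning 0.0 when no common substring exists is an assumption of this sketch.

```elixir
defmodule SubstringSketch do
  # Hypothetical module for illustration, not part of fuzzy_compare.

  def similarity(left, right) when is_binary(left) and is_binary(right) do
    # The shorter input is the reference every common substring is scored against.
    shorter = Enum.min_by([left, right], &String.length/1)

    scores =
      left
      |> String.myers_difference(right)
      # Keep only the segments that are equal in both strings.
      |> Keyword.get_values(:eq)
      |> Enum.map(&String.jaro_distance(&1, shorter))

    case scores do
      # Assumption of this sketch: no common substring means no similarity.
      [] -> 0.0
      _ -> Enum.max(scores)
    end
  end
end

SubstringSketch.similarity("DEUTSCHLAND", "BUNDESREPUBLIK DEUTSCHLAND")
```

Scoring against the shorter string is what lets a long common run such as "TSCHLAND" dominate the result even though the longer input contains a lot of unrelated text.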