ExFuzzywuzzy (ex_fuzzywuzzy v0.3.0)
ex_fuzzywuzzy is a fuzzy string matching library that uses a customizable measure to calculate a distance ratio.
Choose the ratio function that best fits your needs from those available, passing the two strings to be matched and, if needed, options that override the configured ones.
Available methods are:
- Simple ratio
- Quick ratio
- Partial ratio
- Token sort ratio
- Partial token sort ratio
- Token set ratio
- Partial token set ratio
- Best score ratio
Available options are:
- Similarity function (Levenshtein and Jaro-Winkler provided in library)
- Case sensitivity of the match
- Decimal precision of output score
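To make the three options concrete, here is a minimal standalone sketch of how they might be applied around a similarity function. `OptionsSketch`, its default values, and the assumption that the similarity function returns a float between 0.0 and 1.0 are all hypothetical and chosen for illustration; they are not taken from the library. The standard library's `String.jaro_distance/2` (plain Jaro, not Jaro-Winkler) stands in for the measure.

```elixir
defmodule OptionsSketch do
  # Hypothetical defaults for this sketch only; the library's real defaults may differ.
  defp defaults do
    [similarity_fn: &String.jaro_distance/2, case_sensitive: false, precision: 2]
  end

  # Scores two strings on a 0..100 scale, honouring the three options.
  def ratio(left, right, opts \\ []) do
    opts = Keyword.merge(defaults(), opts)
    similarity_fn = Keyword.fetch!(opts, :similarity_fn)

    # Fold case unless a case-sensitive match was requested.
    {left, right} =
      if Keyword.fetch!(opts, :case_sensitive) do
        {left, right}
      else
        {String.downcase(left), String.downcase(right)}
      end

    # Assumes similarity_fn returns a float in 0.0..1.0.
    Float.round(similarity_fn.(left, right) * 100, Keyword.fetch!(opts, :precision))
  end
end

IO.inspect(OptionsSketch.ratio("Fuzzy", "fuzzy"))
# => 100.0
IO.inspect(OptionsSketch.ratio("Fuzzy", "fuzzy", case_sensitive: true))
```

Options passed by the caller are merged over the defaults, which mirrors the "override the configured ones" behaviour described above.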
Here are some examples.
Simple ratio
iex> ExFuzzywuzzy.ratio("this is a test", "this is a test!")
96.55
Quick ratio
iex> ExFuzzywuzzy.quick_ratio("this is a test", "this is a test!")
100.0
Partial ratio
iex> ExFuzzywuzzy.partial_ratio("this is a test", "this is a test!")
100.0
Best score ratio
iex> ExFuzzywuzzy.best_score_ratio("this is a test", "this is a test!")
{:quick, 100.0}
Types
@type full_match_method() :: :standard | :quick | :token_sort | :token_set
Ratio methods available that match the full string
@type fuzzywuzzy_option() :: {:similarity_fn, ratio_calculator()} | {:case_sensitive, boolean()} | {:precision, non_neg_integer()}
Configurable runtime option types
@type fuzzywuzzy_options() :: [fuzzywuzzy_option()]
Configurable runtime options for ratio
@type match_method() :: full_match_method() | partial_match_method()
All ratio methods available
@type partial_match_method() :: :partial | :partial_token_sort | :partial_token_set
Ratio methods available that work on the best matching substring
Ratio calculator-like signature
Functions
@spec best_score_ratio(String.t(), String.t(), boolean(), fuzzywuzzy_options()) :: {match_method(), float()}
Calculates the ratio between the strings using various methods, returning the best score and algorithm
@spec partial_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Calculates the partial ratio between two strings, that is, the ratio between the best-matching substrings of length m (with m the length of the shorter string, as the examples suggest)
iex> partial_ratio("this is a test", "this is a test!")
100.0
iex> partial_ratio("yankees", "new york yankees")
100.0
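The idea of a best matching m-length substring can be sketched as a sliding window: take the shorter string, compare it with every window of the same length in the longer string, and keep the highest score. This is an illustrative sketch, not the library's implementation. `PartialSketch` is a hypothetical module, and `String.jaro_distance/2` is only a stand-in for the configured measure.

```elixir
defmodule PartialSketch do
  # similarity_fn must return a float in 0.0..1.0 (an assumption of this sketch).
  def partial_ratio(left, right, similarity_fn \\ &String.jaro_distance/2) do
    # Order the inputs so that the window length m comes from the shorter one.
    [shorter, longer] = Enum.sort_by([left, right], &String.length/1)
    m = String.length(shorter)

    # Score every m-length window of the longer string and keep the best.
    best =
      0..(String.length(longer) - m)
      |> Enum.map(fn offset -> similarity_fn.(shorter, String.slice(longer, offset, m)) end)
      |> Enum.max()

    Float.round(best * 100, 2)
  end
end

IO.inspect(PartialSketch.partial_ratio("yankees", "new york yankees"))
# => 100.0
```

Because "yankees" appears verbatim inside the longer string, one window is an exact match and the score is 100.0 regardless of the measure used.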
@spec partial_token_set_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Like token set ratio, but a partial ratio - instead of a full one - is applied
iex> partial_token_set_ratio("grizzly was a bear", "a grizzly inside a box")
100.0
iex> partial_token_set_ratio("grizzly was a bear", "be what you wear")
43.75
@spec partial_token_sort_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Like token sort ratio, but a partial ratio - instead of a standard one - is applied
iex> partial_token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100.0
iex> partial_token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
81.25
@spec process(String.t(), [String.t()], fuzzywuzzy_options()) :: String.t()
Processes a list of strings, finding the best match for a reference string. Not implemented yet
@spec quick_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Like standard ratio, but ignores any non-alphanumeric character
iex> quick_ratio("this is a test", "this is a test!")
100.0
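A rough sketch of the preprocessing step: strip the characters to be ignored, then score what is left. Whether the library also drops whitespace is not stated here, so this sketch assumes whitespace is kept. `QuickSketch` and `normalize/1` are hypothetical names, and `String.jaro_distance/2` stands in for the configured measure.

```elixir
defmodule QuickSketch do
  # Drops every character that is not a letter, a digit, or whitespace.
  # Keeping whitespace is an assumption of this sketch.
  def normalize(string) do
    String.replace(string, ~r/[^\p{L}\p{N}\s]/u, "")
  end

  # similarity_fn must return a float in 0.0..1.0 (an assumption of this sketch).
  def quick_ratio(left, right, similarity_fn \\ &String.jaro_distance/2) do
    Float.round(similarity_fn.(normalize(left), normalize(right)) * 100, 2)
  end
end

IO.inspect(QuickSketch.normalize("this is a test!"))
# => "this is a test"
IO.inspect(QuickSketch.quick_ratio("this is a test", "this is a test!"))
# => 100.0
```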
@spec ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Calculates the standard ratio between two strings as a percentage. It delegates the calculation to the chosen similarity measure and normalizes the output
iex> ratio("this is a test", "this is a test!")
96.55
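The value 96.55 equals 28 / 29 × 100: the two strings have 14 and 15 characters, and they differ by a single inserted "!". The sketch below reproduces that arithmetic with a length-normalized score based on the longest common subsequence. It is one reading that happens to agree with the example, not the library's implementation; `RatioSketch` and its helpers are hypothetical names.

```elixir
defmodule RatioSketch do
  # Score = 2 * LCS length / (sum of both lengths), as a percentage.
  def ratio(left, right) do
    a = String.graphemes(left)
    b = String.graphemes(right)
    total = length(a) + length(b)

    if total == 0 do
      100.0
    else
      Float.round(2 * lcs_length(a, b) / total * 100, 2)
    end
  end

  # Row-by-row dynamic programming for the longest common subsequence length.
  defp lcs_length(a, b) do
    initial = List.duplicate(0, length(b) + 1)

    a
    |> Enum.reduce(initial, fn x, prev -> next_row(x, b, prev) end)
    |> List.last()
  end

  # Builds the next table row from the previous one.
  defp next_row(x, b, prev) do
    {row, _left} =
      b
      |> Enum.zip(Enum.zip(prev, tl(prev)))
      |> Enum.reduce({[0], 0}, fn {y, {diag, up}}, {acc, left} ->
        cell = if x == y, do: diag + 1, else: max(up, left)
        {[cell | acc], cell}
      end)

    Enum.reverse(row)
  end
end

IO.inspect(RatioSketch.ratio("this is a test", "this is a test!"))
# => 96.55
```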
@spec token_set_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Calculates the token set ratio between two strings, that is, the ratio calculated after tokenizing each string, splitting the tokens into two sets (tokens that match fully and all other tokens), and then ordering them by set membership and, within each set, alphabetically
iex> token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
100.0
iex> token_set_ratio("fuzzy was a bear", "muzzy wuzzy was a bear")
78.95
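The rebuilding step described above can be sketched as follows: tokens shared by both strings come first, the remaining tokens follow, and each group is sorted alphabetically before the two rebuilt strings are compared. This is an illustrative sketch under those assumptions. `TokenSetSketch` and `rebuild_pair/2` are hypothetical names, and `String.jaro_distance/2` stands in for the configured measure, so scores other than exact matches will differ from the documented ones.

```elixir
defmodule TokenSetSketch do
  # Returns both strings rebuilt as "shared tokens, then other tokens".
  def rebuild_pair(left, right) do
    left_tokens = left |> String.split() |> MapSet.new()
    right_tokens = right |> String.split() |> MapSet.new()
    shared = MapSet.intersection(left_tokens, right_tokens)

    {rebuild(shared, MapSet.difference(left_tokens, shared)),
     rebuild(shared, MapSet.difference(right_tokens, shared))}
  end

  # similarity_fn must return a float in 0.0..1.0 (an assumption of this sketch).
  def token_set_ratio(left, right, similarity_fn \\ &String.jaro_distance/2) do
    {rebuilt_left, rebuilt_right} = rebuild_pair(left, right)
    Float.round(similarity_fn.(rebuilt_left, rebuilt_right) * 100, 2)
  end

  # Each group is sorted alphabetically, shared tokens first.
  defp rebuild(shared, rest) do
    Enum.join(Enum.sort(shared) ++ Enum.sort(rest), " ")
  end
end

IO.inspect(TokenSetSketch.rebuild_pair("fuzzy was a bear", "muzzy wuzzy was a bear"))
# => {"a bear was fuzzy", "a bear was muzzy wuzzy"}
IO.inspect(TokenSetSketch.token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear"))
# => 100.0
```

Duplicated tokens collapse when the strings are turned into sets, which is why the repeated "fuzzy" in the second call does not lower the score.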
@spec token_sort_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Calculates the token sort ratio between two strings, that is, the ratio calculated after tokenizing each string and sorting its tokens alphabetically
iex> token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100.0
iex> token_sort_ratio("fuzzy muzzy was a bear", "wuzzy fuzzy was a bear")
77.27
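The tokenize-and-sort step is small enough to sketch directly. Splitting on whitespace is an assumption of this sketch, `TokenSortSketch` and `sort_tokens/1` are hypothetical names, and `String.jaro_distance/2` stands in for the configured measure, so only exact-match scores are comparable with the documented ones.

```elixir
defmodule TokenSortSketch do
  # Splits on whitespace, sorts the tokens, and joins them back with spaces.
  def sort_tokens(string) do
    string |> String.split() |> Enum.sort() |> Enum.join(" ")
  end

  # similarity_fn must return a float in 0.0..1.0 (an assumption of this sketch).
  def token_sort_ratio(left, right, similarity_fn \\ &String.jaro_distance/2) do
    Float.round(similarity_fn.(sort_tokens(left), sort_tokens(right)) * 100, 2)
  end
end

IO.inspect(TokenSortSketch.sort_tokens("fuzzy wuzzy was a bear"))
# => "a bear fuzzy was wuzzy"
IO.inspect(TokenSortSketch.token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
# => 100.0
```

Sorting makes word order irrelevant, so two strings containing the same words in a different order are rebuilt into the same string.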
@spec weighted_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()
Weighted ratio. Not implemented yet