ExFuzzywuzzy (ex_fuzzywuzzy v0.3.0)

ex_fuzzywuzzy is a fuzzy string matching library that uses a customizable measure to calculate a distance ratio.

Choose the ratio function that best fits your needs, passing the two strings to match and, if needed, options that override the configured ones.

Available methods are:

  • Simple ratio
  • Quick ratio
  • Partial ratio
  • Token sort ratio
  • Partial token sort ratio
  • Token set ratio
  • Partial token set ratio
  • Best score ratio

Available options are:

  • Similarity function (Levenshtein and Jaro-Winkler are provided by the library)
  • Case sensitivity of the match
  • Decimal precision of output score

Here are some examples.

Simple ratio

iex> ExFuzzywuzzy.ratio("this is a test", "this is a test!")
96.55

Quick ratio

iex> ExFuzzywuzzy.quick_ratio("this is a test", "this is a test!")
100.0

Partial ratio

iex> ExFuzzywuzzy.partial_ratio("this is a test", "this is a test!")
100.0

Best score ratio

iex> ExFuzzywuzzy.best_score_ratio("this is a test", "this is a test!")
{:quick, 100.0}

Types

@type full_match_method() :: :standard | :quick | :token_sort | :token_set

Ratio methods available that match the full string

@type fuzzywuzzy_option() ::
  {:similarity_fn, ratio_calculator()}
  | {:case_sensitive, boolean()}
  | {:precision, non_neg_integer()}

Configurable runtime option types

@type fuzzywuzzy_options() :: [fuzzywuzzy_option()]

Configurable runtime options for ratio

@type match_method() :: full_match_method() | partial_match_method()

All ratio methods available

@type partial_match_method() :: :partial | :partial_token_sort | :partial_token_set

Ratio methods available that work on the best-matching substring

@type ratio_calculator() :: (String.t(), String.t() -> float())

Signature of a ratio calculator: a function that takes two strings and returns a float
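As a minimal sketch of what a function matching `ratio_calculator()` could look like, the snippet below wraps `String.jaro_distance/2` from the Elixir standard library and places it in an options list. The module name `MyMeasures` is hypothetical, and whether the library expects the returned float in the 0..1 range is an assumption not confirmed by this page.

```elixir
defmodule MyMeasures do
  # Hypothetical helper: any (String.t(), String.t() -> float()) function
  # satisfies the ratio_calculator() type shown above.
  @spec jaro(String.t(), String.t()) :: float()
  def jaro(left, right), do: String.jaro_distance(left, right)
end

# A keyword list built from the three documented option keys.
# The values here are illustrative, not the library defaults.
options = [similarity_fn: &MyMeasures.jaro/2, case_sensitive: false, precision: 1]

IO.inspect(options, label: "options")
```

Such a list would be passed as the last argument of a ratio function, for example `ExFuzzywuzzy.ratio(left, right, options)`.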

Functions

best_score_ratio(left, right, partial \\ false, options \\ [])

@spec best_score_ratio(String.t(), String.t(), boolean(), fuzzywuzzy_options()) ::
  {match_method(), float()}

Calculates the ratio between two strings with each of the available methods, returning the best score together with the method that produced it.
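The "try several methods, keep the best" idea can be sketched as follows. This is not the library's implementation: `BestScoreSketch`, the scorer names `:exact` and `:jaro`, and the scorer functions are all stand-ins chosen for illustration.

```elixir
defmodule BestScoreSketch do
  # scorers is a keyword list of {method_name, two-argument scoring function}.
  # Returns the {method_name, score} pair with the highest score.
  def best(scorers, left, right) do
    scorers
    |> Enum.map(fn {name, fun} -> {name, fun.(left, right)} end)
    |> Enum.max_by(fn {_name, score} -> score end)
  end

  # Illustrative scorers only; the real library uses its own ratio methods.
  def demo_scorers do
    [
      exact: fn l, r -> if(l == r, do: 100.0, else: 0.0) end,
      jaro: fn l, r -> Float.round(String.jaro_distance(l, r) * 100, 2) end
    ]
  end
end

BestScoreSketch.demo_scorers()
|> BestScoreSketch.best("this is a test", "this is a test!")
|> IO.inspect(label: "best")
```

The return shape mirrors the `{match_method(), float()}` tuple in the spec above.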

partial_ratio(left, right, options \\ [])

@spec partial_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Calculates the partial ratio between two strings, that is, the ratio between the best-matching m-length substrings.

iex> partial_ratio("this is a test", "this is a test!")
100.0

iex> partial_ratio("yankees", "new york yankees")
100.0
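To illustrate the best-matching-substring idea, here is a rough sketch that slides a window over the longer string and keeps the highest-scoring one. It assumes that m is the length of the shorter string and uses `String.jaro_distance/2` as a stand-in measure; `PartialSketch` is a hypothetical module, not part of the library, and both inputs are assumed to be non-empty.

```elixir
defmodule PartialSketch do
  # Returns {best_window, score} for the best m-length window of the longer
  # string, where m is the length of the shorter string (an assumption).
  # Both strings must be non-empty.
  def best_window(left, right, measure \\ &String.jaro_distance/2) do
    {short, long} =
      if String.length(left) <= String.length(right),
        do: {left, right},
        else: {right, left}

    m = String.length(short)

    long
    |> String.graphemes()
    # Every contiguous run of m graphemes, stepping one grapheme at a time.
    |> Enum.chunk_every(m, 1, :discard)
    |> Enum.map(&Enum.join/1)
    |> Enum.map(fn window -> {window, measure.(short, window)} end)
    |> Enum.max_by(fn {_window, score} -> score end)
  end
end

IO.inspect(PartialSketch.best_window("yankees", "new york yankees"))
```

With the "yankees" example above, the best window is the substring "yankees" itself, which is why a perfect score is plausible for that pair.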

partial_token_set_ratio(left, right, options \\ [])

@spec partial_token_set_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Like token set ratio, but a partial ratio is applied instead of a full one.

iex> partial_token_set_ratio("grizzly was a bear", "a grizzly inside a box")
100.0

iex> partial_token_set_ratio("grizzly was a bear", "be what you wear")
43.75

partial_token_sort_ratio(left, right, options \\ [])

@spec partial_token_sort_ratio(String.t(), String.t(), fuzzywuzzy_options()) ::
  float()

Like token sort ratio, but a partial ratio is applied instead of a standard one.

iex> partial_token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100.0

iex> partial_token_sort_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
81.25

@spec process(String.t(), [String.t()], fuzzywuzzy_options()) :: String.t()

Processes a list of strings, finding the best match for a reference string. Not implemented yet.

quick_ratio(left, right, options \\ [])

@spec quick_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Like standard ratio, but ignores any non-alphanumeric character.

iex> quick_ratio("this is a test", "this is a test!")
100.0
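The preprocessing this implies might look like the sketch below, which drops every character that is neither a letter, a digit, nor whitespace before comparing. Keeping whitespace is an assumption, and `QuickSketch` is a hypothetical module rather than the library's own code.

```elixir
defmodule QuickSketch do
  # Removes characters that are not Unicode letters, digits, or whitespace.
  # Whether the library also strips whitespace is not stated on this page.
  def strip_non_alphanumeric(string) do
    String.replace(string, ~r/[^\p{L}\p{N}\s]/u, "")
  end
end

IO.inspect(QuickSketch.strip_non_alphanumeric("this is a test!"))
```

After this step the two strings in the example above are identical, which is consistent with the documented score of 100.0.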

ratio(left, right, options \\ [])

@spec ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Calculates the standard ratio between two strings as a percentage. It delegates the calculation to the chosen measure and standardizes the output it produces.

iex> ratio("this is a test", "this is a test!")
96.55
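The standardization step can be pictured as scaling a similarity in the 0..1 range to a percentage and rounding it to the configured precision. The sketch below assumes that shape; `RatioSketch` is hypothetical. The documented 96.55 is consistent with a similarity of 28/29 (twice the 14 matching characters over the 29 characters in total), although this page does not confirm that the library computes it this way.

```elixir
defmodule RatioSketch do
  # Scales a similarity in 0.0..1.0 to a percentage rounded to `precision`
  # decimal places. The default precision of 2 is an assumption.
  def to_percentage(similarity, precision \\ 2)
      when is_float(similarity) and similarity >= 0.0 and similarity <= 1.0 do
    Float.round(similarity * 100.0, precision)
  end
end

IO.inspect(RatioSketch.to_percentage(28 / 29))
```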

token_set_ratio(left, right, options \\ [])

@spec token_set_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Calculates the token set ratio between two strings, that is, the ratio calculated after tokenizing each string and splitting the tokens into two sets (tokens that fully match in both strings and the remaining tokens), then sorting them by set membership and alphabetically.

iex> token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear")
100.0

iex> token_set_ratio("fuzzy was a bear", "muzzy wuzzy was a bear")
78.95
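The token splitting described above can be sketched with `MapSet`. This only shows the partitioning step, under the assumption that tokens are whitespace-separated; `TokenSetSketch` is a hypothetical module and the scoring that follows is left out.

```elixir
defmodule TokenSetSketch do
  # Splits the tokens of two strings into those shared by both and those
  # unique to each side, each group sorted alphabetically.
  def partition(left, right) do
    left_set = left |> String.split() |> MapSet.new()
    right_set = right |> String.split() |> MapSet.new()

    common = MapSet.intersection(left_set, right_set)

    {
      common |> MapSet.to_list() |> Enum.sort(),
      left_set |> MapSet.difference(common) |> MapSet.to_list() |> Enum.sort(),
      right_set |> MapSet.difference(common) |> MapSet.to_list() |> Enum.sort()
    }
  end
end

IO.inspect(TokenSetSketch.partition("fuzzy was a bear", "fuzzy fuzzy was a bear"))
IO.inspect(TokenSetSketch.partition("fuzzy was a bear", "muzzy wuzzy was a bear"))
```

In the first pair every token is shared and duplicates collapse, so nothing is left over on either side. In the second pair, "fuzzy" on one side and "muzzy" and "wuzzy" on the other remain unmatched.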

token_sort_ratio(left, right, options \\ [])

@spec token_sort_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Calculates the token sort ratio between two strings, that is, the ratio calculated after tokenizing each string and sorting its tokens alphabetically.
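A minimal sketch of the tokenize-and-sort step, assuming whitespace-separated tokens, is shown here. `TokenSortSketch` is a hypothetical module and does not reproduce the library's internals.

```elixir
defmodule TokenSortSketch do
  # Splits on whitespace, sorts the tokens alphabetically, and rejoins them.
  def normalize(string) do
    string
    |> String.split()
    |> Enum.sort()
    |> Enum.join(" ")
  end
end

IO.inspect(TokenSortSketch.normalize("fuzzy wuzzy was a bear"))
IO.inspect(TokenSortSketch.normalize("wuzzy fuzzy was a bear"))
```

Both calls yield the same normalized string, so two inputs that differ only in word order compare as identical.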

iex> token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear")
100.0

iex> token_sort_ratio("fuzzy muzzy was a bear", "wuzzy fuzzy was a bear")
77.27

@spec weighted_ratio(String.t(), String.t(), fuzzywuzzy_options()) :: float()

Weighted ratio. Not implemented yet.