Akin v0.2.0
Functions for comparing two strings for similarity, using a collection of string comparison algorithms for Elixir. Algorithms can be called individually or all together to return a map of metrics.
Options
Options are accepted in a keyword list (e.g. [ngram_size: 3]).
- algorithms: algorithms to use in the comparison. Accepts the name or a keyword list. Default is algorithms/0.
  - metric: algorithm metric. Default is both.
    - "string": uses string algorithms
    - "phonetic": uses phonetic algorithms
  - unit: algorithm unit. Default is both.
    - "whole": uses algorithms best suited for whole string comparison (distance)
    - "partial": uses algorithms best suited for partial string comparison (substring)
- level: level for double phonetic matching. Default is "normal".
  - "strict": both encodings for each string must match
  - "strong": the primary encoding for each string must match
  - "normal": the primary encoding of one string must match either encoding of the other string (default)
  - "weak": either the primary or secondary encoding of one string must match one encoding of the other string
- match_at: an algorithm score equal to or above this value is considered a match. Default is 0.9.
- ngram_size: number of contiguous letters to split strings into. Default is 2.
- short_length: length at which a string qualifies as "short" and receives a shortness boost. Used by Name Metric. Default is 8.
- stem: boolean indicating whether to compare the stemmed versions of the strings; uses Stemmer. Default is false.
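The four level values can be read as increasingly permissive rules over a pair of double metaphone encodings. The following is a minimal sketch of that reading, not Akin's implementation: the module LevelSketch, the function encodings_match?/3, and the {primary, secondary} tuple shape are all hypothetical names chosen for illustration, and the "A"/"B"/"C" values are placeholder encodings rather than real metaphone output.

```elixir
defmodule LevelSketch do
  # Hypothetical helper, not part of Akin.
  # Each string is represented by a {primary, secondary} tuple of
  # double metaphone encodings (an assumed shape for illustration).
  def encodings_match?({lp, ls}, {rp, rs}, level \\ "normal") do
    case level do
      # both encodings for each string must match
      "strict" -> lp == rp and ls == rs
      # the primary encoding for each string must match
      "strong" -> lp == rp
      # the primary encoding of one string matches either encoding of the other
      "normal" -> lp in [rp, rs] or rp in [lp, ls]
      # either encoding of one string matches one encoding of the other
      "weak" -> lp in [rp, rs] or ls in [rp, rs]
    end
  end
end

# Placeholder encodings: the primaries differ, but the left primary
# equals the right secondary.
LevelSketch.encodings_match?({"A", "B"}, {"C", "A"}, "strong")
LevelSketch.encodings_match?({"A", "B"}, {"C", "A"}, "normal")
```

Under this reading every pair that matches at a stricter level also matches at each looser one, so a caller can start at "strict" and relax the level only when too few matches come back.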
Summary
Functions
Compare two strings. Returns a map of algorithm metrics.
Compare two strings using logic specific to names. A match is determined by algorithm metrics equal to or higher than the match_at option. Returns a map with the two strings, a match flag (0 or 1), and the algorithm metrics.
Compare a string against a list of strings. Matches are determined by algorithm metrics equal to or higher than the match_at option. Returns a list of strings that are likely matches.
Compare a string against a list of strings. Matches are determined by algorithm metrics equal to or higher than the match_at option. Returns a list of strings that are likely matches, along with their algorithm metrics.
Returns a list of unique phonetic encodings produced by the single and double metaphone algorithms.
Functions
@spec compare(
  binary() | %Akin.Corpus{list: term(), original: term(), set: term(), stems: term(), string: term()},
  binary() | %Akin.Corpus{list: term(), original: term(), set: term(), stems: term(), string: term()},
  keyword()
) :: map()
Compare two strings. Returns a map of algorithm metrics.
Options accepted as a keyword list. If no options are given, default values will be used.
@spec match_name_metrics(binary(), binary(), Keyword.t()) :: %{
  left: binary(),
  match: 0 | 1,
  metrics: [any()],
  right: binary()
}
Compare two strings using logic specific to names. A match is determined by algorithm metrics equal to or higher than the match_at option. Returns a map with the two strings (left and right), a match flag (0 or 1), and the algorithm metrics.
@spec match_names(
  binary() | %Akin.Corpus{list: term(), original: term(), set: term(), stems: term(), string: term()},
  binary() | %Akin.Corpus{list: term(), original: term(), set: term(), stems: term(), string: term()} | list(),
  keyword()
) :: float()
Compare a string against a list of strings. Matches are determined by algorithm metrics equal to or higher than the match_at option. Returns a list of strings that are likely matches.
Future Plans
- if the name part is an initial, give the initials score its weight, otherwise reduce it
- if the initials score is significantly higher than the average of the others, reduce the initials score to the average of the others
- add options
  - "use_average", "top_three", and/or "average_of_top_three"
  - "group" to group results into strong matches and weak matches
  - "details" to include the scores in the result list
Compare a string against a list of strings. Matches are determined by algorithm metrics equal to or higher than the match_at option. Returns a list of strings that are likely matches, along with their algorithm metrics.
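The thresholding idea behind matching one string against a list can be sketched without Akin itself. The example below is an illustration under stated assumptions, not the library's code: MatchSketch and both of its functions are hypothetical names, and String.jaro_distance/2 from the Elixir standard library stands in for Akin's collection of algorithms as a single score between 0.0 and 1.0.

```elixir
defmodule MatchSketch do
  # Hypothetical helpers, not part of Akin. A single stand-in metric
  # (String.jaro_distance/2) replaces Akin's set of algorithms.

  # Keep only the candidates whose score is at or above match_at.
  def likely_matches(left, candidates, match_at \\ 0.9) do
    Enum.filter(candidates, fn right ->
      String.jaro_distance(left, right) >= match_at
    end)
  end

  # Same filter, but each surviving candidate is paired with its score.
  def likely_matches_with_metrics(left, candidates, match_at \\ 0.9) do
    candidates
    |> Enum.map(fn right -> {right, String.jaro_distance(left, right)} end)
    |> Enum.filter(fn {_right, score} -> score >= match_at end)
  end
end

MatchSketch.likely_matches("alice", ["alice", "bob"])
# returns ["alice"]
```

Lowering match_at admits looser matches; the default of 0.9 mirrors the default documented for the match_at option.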
@spec phonemes(
  binary() | %Akin.Corpus{list: term(), original: term(), set: term(), stems: term(), string: term()}
) :: list()
Returns a list of unique phonetic encodings produced by the single and double metaphone algorithms.