New York State Identification and Intelligence System (NYSIIS) phonetic encoding (Robert L. Taft, 1970).
NYSIIS was designed as a Soundex successor for English personal-name matching. Compared to Soundex it:
- keeps letters rather than digits, so the codes are pronounceable;
- is more discriminating in practice (Roberts and Doberts get different codes);
- handles common English-name patterns natively (MAC → MCC, KN → NN, PH/PF → FF, SCH → SSS, etc.).
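The prefix rewrites listed above can be sketched in a few lines. This is a minimal illustration, not this module's actual implementation: `PrefixSketch` and `rewrite_prefix/1` are hypothetical names, and only the four rules named in the list are covered.

```elixir
defmodule PrefixSketch do
  # Hypothetical helper for illustration; not part of Text.Phonetic.NYSIIS.
  # Each tuple is {prefix to look for, replacement}.
  @prefix_rules [
    {"MAC", "MCC"},
    {"SCH", "SSS"},
    {"KN", "NN"},
    {"PH", "FF"},
    {"PF", "FF"}
  ]

  # Upcases the name and rewrites the first matching prefix, if any.
  def rewrite_prefix(name) do
    upper = String.upcase(name)

    # find_value/3 returns the first non-nil result, or `upper` when no rule applies.
    Enum.find_value(@prefix_rules, upper, fn {from, to} ->
      if String.starts_with?(upper, from) do
        to <> String.replace_prefix(upper, from, "")
      end
    end)
  end
end

PrefixSketch.rewrite_prefix("MacDonald")
# => "MCCDONALD"
```

The full algorithm goes on to rewrite suffixes and interior letters, which is why the final code for MacDonald is shorter than this intermediate string.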
This module implements Taft's original algorithm, optionally with the
6-character truncation that the 1970 specification mandated. Pass
max_length: nil (the default) to skip truncation; pass 6 for the
classical fixed-length code.
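The effect of the truncation option can be pictured as a final slicing step. The sketch below assumes truncation is a plain prefix cut applied after encoding; `TruncateSketch` and `truncate/2` are hypothetical names, not functions of this module.

```elixir
defmodule TruncateSketch do
  # Hypothetical helper for illustration; not part of Text.Phonetic.NYSIIS.

  # nil means "no truncation": return the code unchanged.
  def truncate(code, nil), do: code

  # Otherwise keep at most max_length leading characters.
  def truncate(code, max_length) when is_integer(max_length) and max_length > 0 do
    String.slice(code, 0, max_length)
  end
end

TruncateSketch.truncate("MCDANALD", 6)
# => "MCDANA"
```

Codes shorter than the limit pass through unchanged in this sketch.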
When to use
NYSIIS is a strong default for fuzzy English name matching when you
want the matching key to remain readable. For maximum discrimination
on multi-cultural names, prefer Text.Phonetic.DoubleMetaphone.
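One common way to use a readable matching key is to bucket records by it. The sketch below is an assumption about usage, not something this module provides: `NameIndex` and `group_by_key/2` are hypothetical, and the encoder is passed in as a function so the example stays self-contained.

```elixir
defmodule NameIndex do
  # Hypothetical helper for illustration; not part of Text.Phonetic.NYSIIS.

  # Groups names under the key returned by `encoder`, dropping names whose
  # key is the empty string (that is, names that produced no code).
  def group_by_key(names, encoder) when is_function(encoder, 1) do
    names
    |> Enum.group_by(encoder)
    |> Map.delete("")
  end
end

# With the library available, the encoder would be the documented function:
#
#   NameIndex.group_by_key(
#     ["MacDonald", "McDonald", "Watkins"],
#     &Text.Phonetic.NYSIIS.encode/1
#   )
```

Because the keys stay alphabetic, the resulting map can be inspected by eye when tuning a matching pipeline.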
References
Taft, R. L. (1970). Name Search Techniques. New York State Identification and Intelligence System.
https://www.archives.gov/research/census/soundex/ describes the Soundex / NYSIIS lineage.
Functions
Returns the NYSIIS code for name.
Arguments
- name is a string. Non-Latin letters and diacritics are folded to ASCII via Text.Clean.unaccent/1 before encoding.
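To give a feel for what folding to ASCII involves, here is a rough approximation built only on the Elixir standard library. It is not how Text.Clean.unaccent/1 is implemented, and it is narrower: `FoldSketch` and `fold/1` are hypothetical names, and letters without a canonical decomposition (such as Ø) are simply dropped rather than transliterated.

```elixir
defmodule FoldSketch do
  # Hypothetical helper for illustration; an approximation only,
  # not the behaviour of Text.Clean.unaccent/1.
  def fold(name) do
    name
    # Split accented letters into base letter + combining mark.
    |> String.normalize(:nfd)
    # Remove the combining marks (Unicode category Mn).
    |> String.replace(~r/\p{Mn}/u, "")
    |> String.upcase()
    # Keep only the letters A-Z.
    |> String.replace(~r/[^A-Z]/, "")
  end
end

FoldSketch.fold("José")
# => "JOSE"
```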
Options
- :max_length truncates the resulting code to this length. Pass 6 for the classical Taft NYSIIS. Defaults to nil (no truncation).
Returns
- The NYSIIS code as an uppercase ASCII string. Returns "" for empty input or input with no Latin letters.
Examples
iex> Text.Phonetic.NYSIIS.encode("Watkins")
"WATCAN"
iex> Text.Phonetic.NYSIIS.encode("MacDonald")
"MCDANALD"
iex> Text.Phonetic.NYSIIS.encode("MacDonald", max_length: 6)
"MCDANA"
Returns true if name_a and name_b produce the same NYSIIS code.
Arguments
- name_a is a string.
- name_b is a string.
Options
Same as encode/2. The same options are applied to both inputs.
Returns
- true when both inputs produce a non-empty NYSIIS code and the codes are equal.
- false otherwise (including when either input is empty or contains no Latin letters).
Examples
iex> Text.Phonetic.NYSIIS.match?("MacDonald", "McDonald")
true
iex> Text.Phonetic.NYSIIS.match?("Smith", "Schmidt")
false
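The return rule described above (codes must be equal and non-empty) can be written out as a small sketch. `MatchSketch` and `same_code?/3` are hypothetical names, and the encoder is taken as an argument so the block runs without the library; it illustrates the documented semantics rather than the module's source.

```elixir
defmodule MatchSketch do
  # Hypothetical helper for illustration; not part of Text.Phonetic.NYSIIS.

  # True only when both names encode to the same non-empty code.
  def same_code?(name_a, name_b, encoder) when is_function(encoder, 1) do
    code_a = encoder.(name_a)
    # If code_a is non-empty and equal to the other code, both are non-empty.
    code_a != "" and code_a == encoder.(name_b)
  end
end

# With the library available, the encoder would be the documented function:
#
#   MatchSketch.same_code?("MacDonald", "McDonald", &Text.Phonetic.NYSIIS.encode/1)
```

The non-empty check matters because two inputs with no Latin letters would otherwise compare equal on their empty codes.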