Text.Phonetic.Metaphone (Text v0.5.0)


Metaphone phonetic encoding (Lawrence Philips, 1990).

A more discriminating phonetic encoder than Text.Phonetic.Soundex. Metaphone produces variable-length codes that better reflect English pronunciation rules, handling common digraphs (gh, kn, wr), silent letters, and context-dependent consonant pronunciation.

Differences from Soundex

  • Variable-length output. Metaphone codes are as long as needed to encode the word, unlike Soundex's fixed four-character output. Use encode/2 with a :max_length option to truncate.

  • Letters, not digits. The output uses letters that approximate the phonemes in the source — K for hard C/K, S for soft C/S/Z, 0 (the digit zero) for the unvoiced th in "thin", X for sh, etc.

  • Better discrimination. Phillips/Phyllips/Filips all encode to FLPS, whereas Thompson encodes to 0MPSN (the initial th becomes 0) and therefore does not share a code with Tomson, which begins with a plain T. Metaphone rarely groups genuine spelling variants more tightly than Soundex does, but it is much better at keeping unrelated words apart.

When to use

Metaphone is the standard choice for fuzzy English-name matching in search, deduplication, and record-linkage pipelines. For multilingual or non-Anglo names, Double Metaphone (not yet implemented in this module) produces better results.
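As a rough illustration of the deduplication use case, the sketch below buckets names by phonetic code and keeps only the buckets that contain more than one name. The module PhoneticGrouping and its group/2 function are hypothetical and not part of this library; the encoder is passed in as a function so the sketch runs on its own.

```elixir
defmodule PhoneticGrouping do
  # Hypothetical helper, not part of the Text library.
  # `encoder` is any one-argument function from a name to a phonetic code,
  # for example &Text.Phonetic.Metaphone.encode/1.
  def group(names, encoder) when is_function(encoder, 1) do
    names
    |> Enum.group_by(encoder)
    # Drop names that produced no code (empty or letter-free input).
    |> Map.delete("")
    # Keep only buckets with more than one name: the candidate duplicates.
    |> Enum.filter(fn {_code, bucket} -> length(bucket) > 1 end)
    |> Map.new()
  end
end
```

With the library loaded, PhoneticGrouping.group(names, &Text.Phonetic.Metaphone.encode/1) should place spellings that share a Metaphone code in the same bucket. Candidate duplicates found this way usually still need a second check, such as an edit-distance comparison, before records are merged.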

Algorithm

Implementation follows the rules described in Philips' 1990 article "Hanging on the Metaphone" (Computer Language magazine) and the rule summary at https://en.wikipedia.org/wiki/Metaphone. Where references disagree on edge cases, we follow the most commonly cited interpretation.
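To give a feel for the kind of rule involved, here is a minimal sketch of the word-initial exceptions usually listed for the original Metaphone (AE, GN, KN, PN and WR drop their first letter; X becomes S; WH becomes W). The module InitialRulesSketch is hypothetical, covers only this one step, and is not taken from this library's source; whether the library handles every one of these cases identically is an assumption.

```elixir
defmodule InitialRulesSketch do
  # Illustrative only: handles the word-initial exceptions of classic
  # Metaphone, not the full rule set, and is not the library's code.
  def normalize_initial(word) when is_binary(word) do
    word |> String.upcase() |> rewrite()
  end

  # Silent first letter: keep the second letter and the rest.
  defp rewrite("AE" <> rest), do: "E" <> rest
  defp rewrite("GN" <> rest), do: "N" <> rest
  defp rewrite("KN" <> rest), do: "N" <> rest
  defp rewrite("PN" <> rest), do: "N" <> rest
  defp rewrite("WR" <> rest), do: "R" <> rest
  # Initial WH is treated as W, initial X as S.
  defp rewrite("WH" <> rest), do: "W" <> rest
  defp rewrite("X" <> rest), do: "S" <> rest
  # Anything else is left unchanged.
  defp rewrite(other), do: other
end
```

After this step "Knight" and "Night" begin with the same letter, which is consistent with both producing the code NT in the examples for encode/2 and match?/3.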

Summary

Functions

encode(word, options \\ [])

Returns the Metaphone code for an English word.

match?(word_a, word_b, options \\ [])

Returns true if word_a and word_b produce the same Metaphone code (and both produce a non-empty code).

Functions

encode(word, options \\ [])

@spec encode(
  String.t(),
  keyword()
) :: String.t()

Returns the Metaphone code for an English word.

Arguments

  • word is a string. Non-letter characters are stripped before encoding.

Options

  • :max_length — truncate the output to this many characters. Defaults to no truncation. Pass 4 to mimic Soundex's fixed length.
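The truncation semantics can be sketched as follows. TruncateSketch and apply_max_length/2 are hypothetical names used only for illustration, and the sketch assumes that codes shorter than :max_length are returned unchanged rather than padded, which is what "truncate" suggests but the documentation does not state outright.

```elixir
defmodule TruncateSketch do
  # Hypothetical illustration of the documented :max_length behaviour;
  # the library's actual implementation may differ.
  def apply_max_length(code, options \\ []) when is_binary(code) do
    case Keyword.get(options, :max_length) do
      # No option given: return the full code.
      nil -> code
      # Keep at most n characters; shorter codes pass through unpadded.
      n when is_integer(n) and n >= 0 -> String.slice(code, 0, n)
    end
  end
end
```

Under that assumption, a five-character code such as 0MPSN is cut to 0MPS with max_length: 4, while a two-character code such as NT stays NT. This differs from Soundex, which pads short codes to four characters.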

Returns

  • An uppercase ASCII string. Returns "" for empty or letter-free input.

Examples

iex> Text.Phonetic.Metaphone.encode("Thompson")
"0MPSN"

iex> Text.Phonetic.Metaphone.encode("Phillips")
"FLPS"

iex> Text.Phonetic.Metaphone.encode("Knight")
"NT"

iex> Text.Phonetic.Metaphone.encode("Wright")
"RT"

iex> Text.Phonetic.Metaphone.encode("")
""

match?(word_a, word_b, options \\ [])

@spec match?(String.t(), String.t(), keyword()) :: boolean()

Returns true if word_a and word_b produce the same Metaphone code (and both produce a non-empty code).

Arguments

  • word_a is a string.

  • word_b is a string.

Options

Same as encode/2. The same options are applied to both inputs.

Returns

  • true when both inputs produce a non-empty Metaphone code and the codes are equal.

  • false otherwise.
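The contract above can be restated as a small sketch. MatchSketch and same_code?/4 are hypothetical and only mirror the documented behaviour; the encoder is passed in as a two-argument function (for example &Text.Phonetic.Metaphone.encode/2) so that the block runs without the library.

```elixir
defmodule MatchSketch do
  # Hypothetical restatement of the documented match?/3 contract,
  # not the library's implementation.
  # `encoder` is a function (word, options) -> code.
  def same_code?(encoder, word_a, word_b, options \\ [])
      when is_function(encoder, 2) do
    # The same options are applied to both inputs.
    code_a = encoder.(word_a, options)
    code_b = encoder.(word_b, options)
    # Two empty codes are equal but must not count as a match.
    code_a != "" and code_a == code_b
  end
end
```

The non-empty check matters in practice: without it, any two inputs with no letters (empty strings, punctuation, digits) would compare as equal.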

Examples

iex> Text.Phonetic.Metaphone.match?("knight", "night")
true

iex> Text.Phonetic.Metaphone.match?("Wright", "Right")
true

iex> Text.Phonetic.Metaphone.match?("Smith", "Schmidt")
false