# `Text.Phonetic.Soundex`
[🔗](https://github.com/kipcole9/text/blob/v0.5.0/lib/phonetic/soundex.ex#L1)

Soundex phonetic encoding (Russell-Odell, 1918).

Encodes a word as a four-character code so that names with roughly
the same English pronunciation share a key. Soundex is best known as
the index for US census records from 1880 onward, where it was used
to find surnames despite spelling variations on hand-filled forms
("Smith" vs "Smyth", "Robert" vs "Roberts").

The encoding is **deliberately lossy**:

* Only the first letter is preserved verbatim.

* `H`, `W`, and the vowels `A E I O U Y` are dropped after the first
  position.

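* The remaining consonants are mapped to one of six numeric classes
  based on phonetic similarity: `B F P V` → `1`, `C G J K Q S X Z` →
  `2`, `D T` → `3`, `L` → `4`, `M N` → `5`, `R` → `6`.

* Adjacent letters of the same class are collapsed to one digit.
  Letters separated only by `H` or `W` still count as adjacent
  ("Ashcraft" → `A261`), while a vowel between them keeps both digits
  ("Tymczak" → `T522`). The first letter's class takes part in this
  rule too ("Pfister" → `P236`).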

* The result is padded or truncated to four characters: one letter
  followed by three digits.
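
The rules above can be sketched in a few lines of Elixir. This is an illustrative sketch only, not the library's source: the module name `SoundexSketch` is hypothetical, and it assumes ASCII letters and the National Archives treatment of `H`, `W`, and vowels.

```elixir
defmodule SoundexSketch do
  # Hypothetical module for illustration; not the library's implementation.

  # The six consonant classes, keyed by uppercase ASCII letter.
  @codes %{
    ?B => ?1, ?F => ?1, ?P => ?1, ?V => ?1,
    ?C => ?2, ?G => ?2, ?J => ?2, ?K => ?2,
    ?Q => ?2, ?S => ?2, ?X => ?2, ?Z => ?2,
    ?D => ?3, ?T => ?3,
    ?L => ?4,
    ?M => ?5, ?N => ?5,
    ?R => ?6
  }

  def encode(word) when is_binary(word) do
    # Assumption: only ASCII letters are considered; everything else is ignored.
    letters =
      word
      |> String.upcase()
      |> String.to_charlist()
      |> Enum.filter(&(&1 in ?A..?Z))

    case letters do
      [] ->
        ""

      [first | rest] ->
        # Seed the "previous class" with the first letter's own class.
        digits = collect(rest, Map.get(@codes, first), [])

        [first | digits]
        |> Enum.take(4)
        |> List.to_string()
        |> String.pad_trailing(4, "0")
    end
  end

  defp collect([], _prev, acc), do: Enum.reverse(acc)

  # H and W are skipped without resetting the previous class.
  defp collect([c | rest], prev, acc) when c in [?H, ?W],
    do: collect(rest, prev, acc)

  defp collect([c | rest], prev, acc) do
    case Map.get(@codes, c) do
      # Vowel (or Y): dropped, but it separates equal classes.
      nil -> collect(rest, nil, acc)
      # Same class as the previous coded letter: collapse.
      ^prev -> collect(rest, prev, acc)
      code -> collect(rest, code, [code | acc])
    end
  end
end
```

Tracking the previous class in a single accumulator keeps the `H`/`W` rule and the vowel rule distinct: the first skips a letter without touching the accumulator, the second resets it.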

### When to use

Soundex is primarily useful for **English surname matching** — the
domain it was designed for. It is well-known and widely implemented,
which makes it a useful interchange format with legacy systems
(Oracle, MySQL, and many genealogy tools all expose it).

For modern fuzzy-name matching, consider Metaphone or Double
Metaphone instead. Both produce more discriminating codes, and Double
Metaphone in particular handles names of non-English origin better.
This module ships Soundex primarily for compatibility with those
legacy systems and as a baseline reference.

### Algorithm reference

Implementation follows the variant codified by the U.S. National
Archives at https://www.archives.gov/research/census/soundex.html,
which is the de-facto standard.

# `encode`

```elixir
@spec encode(String.t()) :: String.t()
```

Returns the Soundex code for an English word.

### Arguments

* `word` is a string. Non-letter characters are ignored. The first
  character of the result is the case-folded first letter of the
  input.

### Returns

* A four-character string of the form `<letter><digit><digit><digit>`,
  e.g. `"R163"`. Returns `""` for an empty or letter-free input.

### Examples

    iex> Text.Phonetic.Soundex.encode("Robert")
    "R163"

    iex> Text.Phonetic.Soundex.encode("Rupert")
    "R163"

    iex> Text.Phonetic.Soundex.encode("Rubin")
    "R150"

    iex> Text.Phonetic.Soundex.encode("Ashcraft")
    "A261"

    iex> Text.Phonetic.Soundex.encode("Tymczak")
    "T522"

    iex> Text.Phonetic.Soundex.encode("Pfister")
    "P236"

    iex> Text.Phonetic.Soundex.encode("Smith")
    "S530"

    iex> Text.Phonetic.Soundex.encode("Smyth")
    "S530"

    iex> Text.Phonetic.Soundex.encode("")
    ""
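
A common use of the code is as a lookup key. The sketch below groups names by their code so that spelling variants land in the same bucket. `NameIndex` and its functions are hypothetical helpers, not part of this library; the encoder is passed in as a one-argument function so the example stays self-contained.

```elixir
defmodule NameIndex do
  # Hypothetical helper for illustration; not part of the library.

  # Groups names by the code returned from `encoder`,
  # skipping names that encode to "" (no letters).
  def build(names, encoder) when is_function(encoder, 1) do
    names
    |> Enum.reject(&(encoder.(&1) == ""))
    |> Enum.group_by(encoder)
  end

  # Returns all indexed names that share a code with `name`.
  def lookup(index, name, encoder) when is_function(encoder, 1) do
    Map.get(index, encoder.(name), [])
  end
end

# Usage, assuming the library is available in the project:
#
#   encoder = &Text.Phonetic.Soundex.encode/1
#   index = NameIndex.build(["Robert", "Rupert", "Rubin"], encoder)
#   NameIndex.lookup(index, "Roberts", encoder)
```

Because empty codes are rejected when the index is built, a letter-free query finds nothing rather than matching every other letter-free entry.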

# `match?`

```elixir
@spec match?(String.t(), String.t()) :: boolean()
```

Returns `true` if `name_a` and `name_b` produce the same Soundex
code (and both produce a non-empty code).

### Arguments

* `name_a` is a string.

* `name_b` is a string.

### Returns

* `true` when both inputs produce a non-empty Soundex code and the
  codes are equal.

* `false` otherwise (including when either input is empty or
  contains no letters).

### Examples

    iex> Text.Phonetic.Soundex.match?("Robert", "Rupert")
    true

    iex> Text.Phonetic.Soundex.match?("Smith", "Schmidt")
    true

    iex> Text.Phonetic.Soundex.match?("Roberts", "Doberts")
    false

    iex> Text.Phonetic.Soundex.match?("anything", "")
    false
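
The documented behaviour amounts to comparing two codes while treating the empty code as "no match". A minimal sketch of that rule, assuming nothing about how the library implements it; `MatchSketch.same_code?/3` is a hypothetical name and takes the encoder as an argument:

```elixir
defmodule MatchSketch do
  # Hypothetical helper for illustration; not the library's implementation.

  # True only when both names yield the same non-empty code.
  def same_code?(name_a, name_b, encoder) when is_function(encoder, 1) do
    code_a = encoder.(name_a)
    code_a != "" and code_a == encoder.(name_b)
  end
end

# Usage, assuming the library is available in the project:
#
#   MatchSketch.same_code?("Robert", "Rupert", &Text.Phonetic.Soundex.encode/1)
```

Checking for the empty code first means two letter-free inputs never count as a match, even though their codes are equal.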

