Text.Phonetic.Cologne (Text v0.5.0)

Cologne phonetics (Kölner Phonetik), the German-language counterpart to Soundex.

Designed by Hans Joachim Postel in 1969 specifically for German names and German-language text. It encodes a word as a digit string so that similarly pronounced German names share a key. It handles German spelling variants particularly well: Müller / Mueller / Muller collapse to one code, and Meyer / Maier / Mayer / Meier collapse to another.

Encoding

Each letter maps to a digit 0–8:

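| Letter | Code |
| --- | --- |
| A, E, I, J, O, U, Y | 0 |
| H | (skipped) |
| B | 1 |
| P (not before H) | 1 |
| P before H | 3 |
| D, T (not before C, S, Z) | 2 |
| D, T before C, S, Z | 8 |
| F, V, W | 3 |
| G, K, Q | 4 |
| C (see context rules below) | 4 or 8 |
| X (see context rules below) | 48 or 8 |
| L | 5 |
| M, N | 6 |
| R | 7 |
| S, Z | 8 |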

Context rules for C and X are slightly elaborate (initial vs medial position; preceding and following letter); see the implementation for the full table.
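As a rough illustration of what those context rules look like, the sketch below follows the commonly published description of Kölner Phonetik. It is an assumption that this library uses exactly the same table. The module name `CologneContextSketch` and both function names are hypothetical and are not part of `Text.Phonetic.Cologne`.

```elixir
defmodule CologneContextSketch do
  # Hypothetical helper module, not the library's implementation.
  # Letters are uppercase codepoints; nil marks a word boundary.

  # Following letters that make a word-initial C "hard" (code 4).
  @initial_hard ~c"AHKLOQRUX"
  # Following letters that make a non-initial C "hard" (code 4).
  @medial_hard ~c"AHKOQUX"

  # Word-initial C: 4 before a letter in @initial_hard, otherwise 8.
  def c_code(nil, next), do: if(next in @initial_hard, do: "4", else: "8")

  # C directly after S or Z is always 8.
  def c_code(prev, _next) when prev in [?S, ?Z], do: "8"

  # Any other C: 4 before a letter in @medial_hard, otherwise 8.
  def c_code(_prev, next), do: if(next in @medial_hard, do: "4", else: "8")

  # X directly after C, K or Q is 8; everywhere else it expands to 48.
  def x_code(prev) when prev in [?C, ?K, ?Q], do: "8"
  def x_code(_prev), do: "48"
end
```

Under these assumed rules the C in "Schmidt" follows an S and is therefore coded 8, which is consistent with the documented key "862".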

Post-processing:

  1. Collapse runs of identical adjacent codes to a single code.
  2. Drop every 0 except the one in the first position.

Result: a digit string of variable length.
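The two post-processing steps can be sketched in isolation. This is a minimal illustration that assumes the per-letter codes have already been looked up in the table above; the module `ColognePostProcessSketch` and its `finish/1` function are hypothetical names, not part of the library.

```elixir
defmodule ColognePostProcessSketch do
  # Hypothetical helper, not the library's implementation.
  # raw_codes is a list of per-letter code strings, for example
  # ["6", "0", "5", "5", "0", "7"] for M, Ü, L, L, E, R.
  def finish(raw_codes) do
    digits =
      raw_codes
      |> Enum.join()
      |> String.graphemes()
      # Step 1: collapse runs of identical adjacent digits.
      |> Enum.dedup()

    case digits do
      [] ->
        ""

      [first | rest] ->
        # Step 2: keep a leading 0, drop every later 0.
        Enum.join([first | Enum.reject(rest, &(&1 == "0"))])
    end
  end
end

ColognePostProcessSketch.finish(["6", "0", "5", "5", "0", "7"])
# => "657"
```

The order of the steps matters: because zeros are removed only after runs are collapsed, two equal codes separated by a vowel both survive, so the raw codes 1, 0, 1, 0 yield "11" rather than "1".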

Examples

iex> Text.Phonetic.Cologne.encode("Müller")
"657"

iex> Text.Phonetic.Cologne.encode("Mueller")
"657"

iex> Text.Phonetic.Cologne.encode("Meyer")
"67"

iex> Text.Phonetic.Cologne.encode("Mayer")
"67"

iex> Text.Phonetic.Cologne.encode("Maier")
"67"

iex> Text.Phonetic.Cologne.encode("Schmidt")
"862"

iex> Text.Phonetic.Cologne.encode("Wikipedia")
"3412"

Reference

Postel, H. J. (1969). Die Kölner Phonetik: Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse. IBM-Nachrichten 19, 925–931.

Summary

Functions

encode(name)
Returns the Kölner-Phonetik code for name.

match?(name_a, name_b)
Returns true if name_a and name_b produce the same Kölner code (and both produce a non-empty code).

Functions

encode(name)

@spec encode(String.t()) :: String.t()

Returns the Kölner-Phonetik code for name.

Arguments

  • name is a string. German umlauts and ß (Ä, Ö, Ü, ß) are normalised before encoding, and other diacritics are folded via Text.Clean.unaccent/1.

Returns

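  • A digit string. The first character is always in the range 0–8; later characters are never 0, because post-processing drops every 0 after the first position. Returns "" for empty input or input containing no Latin letters.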

Examples

iex> Text.Phonetic.Cologne.encode("Schmitt")
"862"

iex> Text.Phonetic.Cologne.encode("Wikipedia")
"3412"

match?(name_a, name_b)

@spec match?(String.t(), String.t()) :: boolean()

Returns true if name_a and name_b produce the same Kölner code (and both produce a non-empty code).

Arguments

  • name_a is a string.

  • name_b is a string.

Returns

  • true when both inputs produce a non-empty Kölner code and the codes are equal.

  • false otherwise.
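The non-empty requirement means that two inputs with nothing to encode do not match each other, even though their codes are both "". A minimal sketch of that comparison follows. It takes the encoder as an argument so that it stays self-contained; `CologneMatchSketch`, `same_key?/3` and the stub encoder are hypothetical names used only for illustration, and the stub's keys are copied from the examples in this document.

```elixir
defmodule CologneMatchSketch do
  # Hypothetical helper, not the library's implementation.
  # encode is any 1-arity function that returns a key string,
  # or "" when the input contains nothing to encode.
  def same_key?(name_a, name_b, encode) when is_function(encode, 1) do
    key_a = encode.(name_a)

    # If key_a is non-empty and equal to the other key,
    # then both keys are non-empty.
    key_a != "" and key_a == encode.(name_b)
  end
end

# Stub encoder standing in for the real one.
keys = %{"Müller" => "657", "Mueller" => "657", "Schmidt" => "862"}
stub = fn name -> Map.get(keys, name, "") end

CologneMatchSketch.same_key?("Müller", "Mueller", stub)
# => true
CologneMatchSketch.same_key?("", "", stub)
# => false
```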

Examples

iex> Text.Phonetic.Cologne.match?("Müller", "Mueller")
true

iex> Text.Phonetic.Cologne.match?("Meyer", "Maier")
true

iex> Text.Phonetic.Cologne.match?("Schmidt", "Schneider")
false