Cologne phonetics (Kölner Phonetik), the German-language counterpart to Soundex.
Designed by Hans Joachim Postel in 1969 specifically for German names
and German-language text. Encodes a word as a digit string in which
similarly-pronounced German names share a key. It handles German
spelling variants well: Müller / Mueller / Muller share one code,
as do Meyer / Maier / Mayer / Meier.
Encoding
Each letter maps to one or two digits in the range 0–8 (H is skipped):
| Letter | Code |
|---|---|
| A, E, I, J, O, U, Y | 0 |
| H | (skipped) |
| B | 1 |
| P (not before H) | 1 |
| P before H | 3 |
| D, T (not before C, S, Z) | 2 |
| D, T before C, S, Z | 8 |
| F, V, W | 3 |
| G, K, Q | 4 |
| C — see context rules below | 4 or 8 |
| X — see context rules below | 48 or 8 |
| L | 5 |
| M, N | 6 |
| R | 7 |
| S, Z | 8 |
The codes for C and X depend on position (initial vs. medial) and on the preceding and following letters; see the implementation for the full table.
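For orientation, the context rules for C and X as they are commonly published for Kölner Phonetik can be sketched as below. This is an assumption based on the standard description of the algorithm, not a copy of this library's code, and the module and function names (CologneContextSketch, code_c/2, code_x/1) are hypothetical.

```elixir
defmodule CologneContextSketch do
  # Hypothetical helper module; NOT part of Text.Phonetic.Cologne.
  # Letters are single uppercase strings; nil means "no neighbouring letter".

  # Following letters that give C the code 4 (assumed from the published rules).
  @c_initial_hard ~w(A H K L O Q R U X)
  @c_medial_hard ~w(A H K O Q U X)

  # Word-initial C: 4 before one of the letters above, otherwise 8.
  def code_c(nil, next) do
    if next in @c_initial_hard, do: "4", else: "8"
  end

  # C directly after S or Z is always 8.
  def code_c(prev, _next) when prev in ["S", "Z"], do: "8"

  # Any other medial or final C: 4 before A, H, K, O, Q, U, X, otherwise 8.
  def code_c(_prev, next) do
    if next in @c_medial_hard, do: "4", else: "8"
  end

  # X after C, K or Q contributes only 8; anywhere else it contributes 48.
  def code_x(prev) when prev in ["C", "K", "Q"], do: "8"
  def code_x(_prev), do: "48"
end
```

Under these assumed rules, the C in "Schmidt" follows an S, so `CologneContextSketch.code_c("S", "H")` yields "8", which is consistent with the documented code "862".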
Post-processing:
- Collapse runs of identical adjacent codes to a single code.
- Drop every 0 except the one in the first position.
Result: a digit string of variable length.
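The two post-processing steps can be illustrated in isolation. The sketch below assumes the per-letter digits have already been concatenated into one raw string and applies the steps in the order listed; ColognePostprocessSketch and finish/1 are hypothetical names, not part of the library.

```elixir
defmodule ColognePostprocessSketch do
  # Hypothetical helper; NOT part of Text.Phonetic.Cologne.
  # `raw` is the concatenation of the per-letter codes, e.g. "605507".
  def finish(raw) when is_binary(raw) do
    raw
    |> String.graphemes()
    # Step 1: collapse runs of identical adjacent codes.
    |> Enum.dedup()
    # Step 2: drop every "0" except one in the first position.
    |> drop_inner_zeros()
    |> Enum.join()
  end

  defp drop_inner_zeros([]), do: []

  defp drop_inner_zeros([first | rest]) do
    [first | Enum.reject(rest, &(&1 == "0"))]
  end
end
```

For "Müller" the raw digits are 6, 0, 5, 5, 0, 7, so `ColognePostprocessSketch.finish("605507")` gives "657". Because runs are collapsed before zeros are dropped, two equal codes separated by a vowel both survive: "6060" becomes "66", not "6".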
Examples
iex> Text.Phonetic.Cologne.encode("Müller")
"657"
iex> Text.Phonetic.Cologne.encode("Mueller")
"657"
iex> Text.Phonetic.Cologne.encode("Meyer")
"67"
iex> Text.Phonetic.Cologne.encode("Mayer")
"67"
iex> Text.Phonetic.Cologne.encode("Maier")
"67"
iex> Text.Phonetic.Cologne.encode("Schmidt")
"862"
iex> Text.Phonetic.Cologne.encode("Wikipedia")
"3412"
Reference
Postel, H. J. (1969). Die Kölner Phonetik: Ein Verfahren zur Identifizierung von Personennamen auf der Grundlage der Gestaltanalyse. IBM-Nachrichten 19, 925–931.
Functions
encode(name)
Returns the Kölner-Phonetik code for name.
Arguments
- name is a string. German umlauts and ß (Ä, Ö, Ü, ß) are normalised before encoding, and other diacritics are folded via Text.Clean.unaccent/1.
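As a rough illustration of this kind of input normalisation, the sketch below uses only the Elixir standard library. It is an assumption about what such a step might look like: the exact umlaut mapping used by the library is not stated here, Text.Clean.unaccent/1 is not called, and CologneNormalizeSketch.normalize/1 is a hypothetical name.

```elixir
defmodule CologneNormalizeSketch do
  # Hypothetical pre-processing step; the library's own mapping may differ.
  def normalize(name) when is_binary(name) do
    name
    # Assumed mapping: sharp s becomes "ss".
    |> String.replace(["ß", "ẞ"], "ss")
    |> String.upcase()
    # Decompose accented letters into base letter plus combining marks,
    # then drop the marks (Ü becomes U, É becomes E, and so on).
    |> String.normalize(:nfd)
    |> String.replace(~r/\p{Mn}/u, "")
    # Keep only the Latin letters A to Z.
    |> String.replace(~r/[^A-Z]/u, "")
  end
end
```

In this sketch Ü folds to U rather than UE. For the encoding that difference does not matter, because every vowel maps to 0 and runs of 0 are collapsed, which is consistent with "Müller" and "Mueller" sharing the code "657".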
Returns
- A digit string. The first character is always 0–8. Returns "" for empty input or input containing no Latin letters.
Examples
iex> Text.Phonetic.Cologne.encode("Schmitt")
"862"
iex> Text.Phonetic.Cologne.encode("Wikipedia")
"3412"
match?(name_a, name_b)
Returns true if name_a and name_b produce the same Kölner
code (and both produce a non-empty code).
Arguments
- name_a is a string.
- name_b is a string.
Returns
- true when both inputs produce a non-empty Kölner code and the codes are equal.
- false otherwise.
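The non-empty condition matters because two inputs with no Latin letters both encode to "" and would otherwise compare as equal. A minimal sketch of the documented semantics, written against an arbitrary encoder function so that it stays self-contained (CologneMatchSketch.same_code?/3 is a hypothetical name, not the library's implementation):

```elixir
defmodule CologneMatchSketch do
  # Hypothetical stand-in for the documented match?/2 behaviour.
  # `encode` is any one-argument function that returns a code string.
  def same_code?(name_a, name_b, encode) when is_function(encode, 1) do
    code_a = encode.(name_a)
    # An empty code never matches, even against another empty code.
    code_a != "" and code_a == encode.(name_b)
  end
end
```

With the library available, the encoder argument would presumably be `&Text.Phonetic.Cologne.encode/1`.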
Examples
iex> Text.Phonetic.Cologne.match?("Müller", "Mueller")
true
iex> Text.Phonetic.Cologne.match?("Meyer", "Maier")
true
iex> Text.Phonetic.Cologne.match?("Schmidt", "Schneider")
false