Class ColognePhonetic
Encodes a string into a Cologne Phonetic value.
Inheritance
System.Object
ColognePhonetic
Assembly: Lucene.Net.Analysis.Phonetic.dll
Syntax
public class ColognePhonetic : object, IStringEncoder
Remarks
Implements the Kölner Phonetik (Cologne Phonetic) algorithm published by Hans Joachim Postel in 1969.
The Kölner Phonetik is a phonetic algorithm optimized for the German language. It is related to the well-known Soundex algorithm.
Algorithm
- Step 1:
After preprocessing (conversion to upper case, transcription of German umlauts, removal of non-alphabetical characters), the
letters of the supplied text are replaced by their phonetic code according to the following table.
Letter | Context | Code |
---|---|---|
A, E, I, J, O, U, Y | | 0 |
H | | - |
B | | 1 |
P | not before H | 1 |
D, T | not before C, S, Z | 2 |
F, V, W | | 3 |
P | before H | 3 |
G, K, Q | | 4 |
C | at onset before A, H, K, L, O, Q, R, U, X OR before A, H, K, O, Q, U, X except after S, Z | 4 |
X | not after C, K, Q | 48 |
L | | 5 |
M, N | | 6 |
R | | 7 |
S, Z | | 8 |
C | after S, Z OR at onset except before A, H, K, L, O, Q, R, U, X OR not before A, H, K, O, Q, U, X | 8 |
D, T | before C, S, Z | 8 |
X | after C, K, Q | 8 |

(Source: Wikipedia (de): Kölner Phonetik -- Buchstabencodes)
Example:
"Müller-Lüdenscheidt" => "MULLERLUDENSCHEIDT" => "6005507500206880022"
- Step 2:
Collapse of each run of identical consecutive code digits into a single digit.
Example:
"6005507500206880022" => "6050750206802"
- Step 3:
Removal of all "0" codes except at the beginning. As a result, the final code can contain two or more identical consecutive
digits when those digits were separated only by "0" codes before this step.
Example:
"6050750206802" => "65752682"
This class is thread-safe.
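Steps 2 and 3 above can be sketched in a few lines of self-contained C#. This is only an illustration of the post-processing as described in the remarks, not the library's code: the class `ColognePostProcessingSketch` and its methods `CollapseRepeats` and `RemoveInnerZeros` are hypothetical names, and the input is assumed to be the digit string already produced by Step 1.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of Lucene.Net: illustrates Steps 2 and 3 only.
public static class ColognePostProcessingSketch
{
    // Step 2: collapse every run of identical consecutive digits into one digit.
    public static string CollapseRepeats(string codes)
    {
        var sb = new StringBuilder(codes.Length);
        foreach (char c in codes)
        {
            // Append only when the digit differs from the last one kept.
            if (sb.Length == 0 || sb[sb.Length - 1] != c)
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }

    // Step 3: drop every '0' except one in the first position.
    public static string RemoveInnerZeros(string codes)
    {
        var sb = new StringBuilder(codes.Length);
        for (int i = 0; i < codes.Length; i++)
        {
            if (codes[i] != '0' || i == 0)
            {
                sb.Append(codes[i]);
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Digit string taken from the Step 1 example above.
        string afterStep1 = "6005507500206880022";
        string afterStep2 = CollapseRepeats(afterStep1);
        string afterStep3 = RemoveInnerZeros(afterStep2);
        Console.WriteLine(afterStep2); // 6050750206802
        Console.WriteLine(afterStep3); // 65752682
    }
}
```

The order of the two steps matters. For the digit string "505", Step 2 changes nothing, and Step 3 then yields "55", which is the case of repeated digits in the final code that Step 3 mentions.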
See: Wikipedia (de): Kölner Phonetik (in German)
since 1.5
Methods
Name | Description |
---|---|
Encode(String) | |
GetColognePhonetic(String) | Implements the Kölner Phonetik algorithm. In contrast to the initial description of the algorithm, this implementation does the encoding in one pass. |
IsEncodeEqual(String, String) | |