Class Soundex
Encodes a string into a Soundex value. Soundex is an encoding used to relate similar names, but it can also be used as a general-purpose scheme to find words with similar phonemes.
This class is thread-safe. It is not strictly immutable because of the Lucene.Net.Analysis.Phonetic.Language.Soundex.maxLength field, but that field is not actually used.
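
As a quick illustration of relating similar names, the following minimal sketch encodes two names with the shared US_ENGLISH instance and compares the codes. It assumes the project references the Lucene.Net.Analysis.Phonetic package; the class name `SoundexBasicUsage` is hypothetical.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

public static class SoundexBasicUsage
{
    public static void Main()
    {
        // The predefined instance can be shared, since the class is documented as thread-safe.
        Soundex encoder = Soundex.US_ENGLISH;

        string a = encoder.Encode("Robert");
        string b = encoder.Encode("Rupert");

        // Names that sound alike are expected to produce the same code.
        Console.WriteLine($"{a} {b} match={a == b}");
    }
}
```

Under classic American Soundex both names encode to R163, so the two codes should compare equal.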
Inheritance
Assembly: Lucene.Net.Analysis.Phonetic.dll
Syntax
public class Soundex : object, IStringEncoder
Constructors
Name | Description |
---|---|
Soundex() | Creates an instance using Lucene.Net.Analysis.Phonetic.Language.Soundex.US_ENGLISH_MAPPING. |
Soundex(Char[]) | Creates a soundex instance using the given mapping. This constructor can be used to provide an internationalized mapping for a non-Western character set. Every letter of the alphabet is "mapped" to a numerical value. This char array holds the values to which each letter is mapped. This implementation contains a default map for US_ENGLISH. If the mapping contains an instance of SILENT_MARKER then H and W are not given special treatment. |
Soundex(String) | Creates a Soundex instance using a custom mapping. This constructor can be used to customize the mapping, and/or possibly provide an internationalized mapping for a non-Western character set. If the mapping contains an instance of SILENT_MARKER then H and W are not given special treatment. Since: 1.4 |
Soundex(String, Boolean) | Creates a Soundex instance using a custom mapping. This constructor can be used to customize the mapping, and/or possibly provide an internationalized mapping for a non-Western character set. The Boolean argument overrides the default special treatment of H and W (see SILENT_MARKER). Since: 1.11 |
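
The four constructors can be sketched side by side as follows. This is an illustration only: it assumes the Lucene.Net.Analysis.Phonetic package is referenced, that US_ENGLISH_MAPPING_STRING is accessible as a static string, and that passing `false` as the Boolean argument turns off the special treatment of H and W. The class name `SoundexConstructorsDemo` is hypothetical.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

public static class SoundexConstructorsDemo
{
    public static void Main()
    {
        // Default constructor: US English mapping.
        var standard = new Soundex();

        // One code per letter A..Z, supplied as a string or as a char array.
        string mapping = Soundex.US_ENGLISH_MAPPING_STRING;
        var fromString = new Soundex(mapping);
        var fromChars = new Soundex(mapping.ToCharArray());

        // Assumption: 'false' disables the special handling of H and W.
        var noSpecialHW = new Soundex(mapping, false);

        foreach (var encoder in new[] { standard, fromString, fromChars, noSpecialHW })
        {
            Console.WriteLine(encoder.Encode("Ashcraft"));
        }
    }
}
```

The first three encoders use the same mapping and should print the same code. The last one may differ for names such as "Ashcraft", where an H sits between two consonants with the same code.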
Fields
Name | Description |
---|---|
SILENT_MARKER | The marker character used to indicate a silent (ignored) character. Silent characters are ignored except when they appear as the first character. Note: the US_ENGLISH_MAPPING_STRING does not use this mechanism because changing it might break existing code. Mappings that don't contain a silent marker code are treated as though H and W are silent. To override this, use the Soundex(String, Boolean) constructor. Since: 1.11 |
US_ENGLISH | An instance of Soundex using the US_ENGLISH_MAPPING mapping. This treats H and W as silent letters. Apart from when they appear as the first letter, they are ignored. They don't act as separators between duplicate codes. |
US_ENGLISH_GENEALOGY | An instance of Soundex using the mapping as per the Genealogy site (http://www.genealogy.com/articles/research/00000060.html). This treats vowels (AEIOUY), H and W as silent letters. Such letters are ignored (after the first) and do not act as separators when dropping duplicate codes. The codes for consonants are otherwise the same as for US_ENGLISH_MAPPING_STRING and US_ENGLISH_SIMPLIFIED. Since: 1.11 |
US_ENGLISH_MAPPING_STRING | This is a default mapping of the 26 letters used in US English. A value of 0 for a letter position means do not encode, but treat as a separator when it occurs between consonants with the same code. (This constant is provided as both an implementation convenience and to allow documentation to pick up the value for the constant values page.) Note that letters H and W are treated specially. They are ignored (after the first letter) and don't act as separators between consonants with the same code. |
US_ENGLISH_SIMPLIFIED | An instance of Soundex using the Simplified Soundex mapping, as described here: http://west-penwith.org.uk/misc/soundex.htm. This treats H and W the same as vowels (AEIOUY). Such letters aren't encoded (after the first), but they do act as separators when dropping duplicate codes. The mapping is otherwise the same as for US_ENGLISH. Since: 1.11 |
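
The difference between the three predefined variants comes down to whether H, W and the vowels separate duplicate codes. The self-contained sketch below re-implements that rule for illustration; it is not the library's code. The two mapping strings, the `'-'` silent marker, and all names in `SoundexVariantsSketch` are assumptions made for the example.

```csharp
using System;
using System.Text;

public static class SoundexVariantsSketch
{
    public const char Silent = '-';                                // assumed stand-in for SILENT_MARKER
    public const string Standard  = "01230120022455012623010202";  // assumed US English codes for A..Z
    public const string Genealogy = "-123-12--22455-12623-1-2-2";  // vowels, H, W marked silent

    public static string Encode(string name, string mapping, bool specialCaseHW)
    {
        var code = new StringBuilder(4);
        char last = '0';
        foreach (char raw in name ?? "")
        {
            char ch = char.ToUpperInvariant(raw);
            if (ch < 'A' || ch > 'Z') continue;               // skip non-letters
            char digit = mapping[ch - 'A'];
            if (code.Length == 0)                             // first letter is kept verbatim
            {
                code.Append(ch);
                last = digit;
                continue;
            }
            if (specialCaseHW && (ch == 'H' || ch == 'W')) continue; // ignored, not a separator
            if (digit == Silent) continue;                    // ignored, not a separator
            if (digit != '0' && digit != last) code.Append(digit);
            last = digit;                                     // a '0' letter resets 'last', so it separates
            if (code.Length == 4) break;
        }
        return code.Length == 0 ? "" : code.ToString().PadRight(4, '0');
    }

    public static void Main()
    {
        foreach (var name in new[] { "Dodds", "Dhdds" })
        {
            Console.WriteLine(name + ": "
                + Encode(name, Standard, true) + " "     // like US_ENGLISH
                + Encode(name, Standard, false) + " "    // like US_ENGLISH_SIMPLIFIED
                + Encode(name, Genealogy, false));       // like US_ENGLISH_GENEALOGY
        }
        // Dodds: D320 D320 D200
        // Dhdds: D200 D320 D200
    }
}
```

In "Dhdds" the H is skipped under the US_ENGLISH rule, so the second D merges with the first. Under the simplified rule the H acts as a separator and the second D is encoded.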
Properties
Name | Description |
---|---|
MaxLength | Gets or sets the maxLength. Standard Soundex |
Methods
Name | Description |
---|---|
Difference(String, String) | Encodes the strings and returns the number of characters in the two encoded strings that are the same. The return value ranges from 0 through 4: 0 indicates little or no similarity, and 4 indicates strong similarity or identical values. See: MS T-SQL DIFFERENCE. Since: 1.3 |
Encode(String) | Encodes a string using the soundex algorithm. |
GetSoundex(String) | Retrieves the Soundex code for a given string. |
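
The three methods can be exercised together as in the sketch below. It assumes the Lucene.Net.Analysis.Phonetic package is referenced; the class name `SoundexDifferenceDemo` is hypothetical.

```csharp
using System;
using Lucene.Net.Analysis.Phonetic.Language;

public static class SoundexDifferenceDemo
{
    public static void Main()
    {
        var soundex = new Soundex();

        // Encode and GetSoundex both return the Soundex code for a string.
        Console.WriteLine(soundex.Encode("Smith"));
        Console.WriteLine(soundex.GetSoundex("Smythe"));

        // Difference encodes both inputs and scores their similarity from 0 to 4.
        int score = soundex.Difference("Smith", "Smythe");
        Console.WriteLine(score);
    }
}
```

Under classic American Soundex both names encode to S530, so the score here should be 4, the top of the documented range.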