Class ArabicNormalizer
Normalizer for Arabic.
Normalization is done in place for efficiency, operating directly on a term buffer.
Normalization is defined as:
- Normalization of hamza with alef seat to a bare alef.
- Normalization of teh marbuta to heh.
- Normalization of dotless yeh (alef maksura) to yeh.
- Removal of Arabic diacritics (the harakat).
- Removal of tatweel (stretching character).
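The rules above can be sketched as a single in-place pass over a character buffer. This is a minimal illustration, not the Lucene.Net source: the class name `ArabicNormalizationSketch` is hypothetical, the code point values are the standard Unicode assignments for the characters named in the Fields table, and folding ALEF_MADDA into ALEF is an assumption based on its presence in that table.

```csharp
using System;

// Illustrative sketch only; ArabicNormalizationSketch is a hypothetical name,
// not part of Lucene.Net.
public static class ArabicNormalizationSketch
{
    // Standard Unicode code points for the characters involved.
    const char ALEF = '\u0627';
    const char ALEF_MADDA = '\u0622';       // assumed to fold to ALEF
    const char ALEF_HAMZA_ABOVE = '\u0623';
    const char ALEF_HAMZA_BELOW = '\u0625';
    const char YEH = '\u064A';
    const char DOTLESS_YEH = '\u0649';
    const char TEH_MARBUTA = '\u0629';
    const char HEH = '\u0647';
    const char TATWEEL = '\u0640';
    const char FATHATAN = '\u064B';         // first of the harakat listed in Fields
    const char SUKUN = '\u0652';            // last of the harakat listed in Fields

    // Normalizes the first len chars of s in place and returns the new length.
    public static int Normalize(char[] s, int len)
    {
        int write = 0;
        for (int read = 0; read < len; read++)
        {
            char c = s[read];

            // Removal rules: tatweel and the harakat (U+064B..U+0652) are dropped.
            if (c == TATWEEL || (c >= FATHATAN && c <= SUKUN))
            {
                continue;
            }

            // Replacement rules: fold variant letters to their base form.
            switch (c)
            {
                case ALEF_MADDA:
                case ALEF_HAMZA_ABOVE:
                case ALEF_HAMZA_BELOW:
                    c = ALEF;
                    break;
                case DOTLESS_YEH:
                    c = YEH;
                    break;
                case TEH_MARBUTA:
                    c = HEH;
                    break;
            }

            s[write++] = c;
        }
        return write;
    }
}
```

Because characters are only replaced or removed, the write position never overtakes the read position, so no second buffer is needed. Only the first `write` characters of the buffer are meaningful after the call; anything beyond that is stale data.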
Inheritance
System.Object
ArabicNormalizer
Inherited Members
System.Object.ToString()
System.Object.Equals(System.Object)
System.Object.Equals(System.Object, System.Object)
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.GetHashCode()
System.Object.GetType()
System.Object.MemberwiseClone()
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public class ArabicNormalizer
Fields
Name | Description |
---|---|
ALEF | |
ALEF_HAMZA_ABOVE | |
ALEF_HAMZA_BELOW | |
ALEF_MADDA | |
DAMMA | |
DAMMATAN | |
DOTLESS_YEH | |
FATHA | |
FATHATAN | |
HEH | |
KASRA | |
KASRATAN | |
SHADDA | |
SUKUN | |
TATWEEL | |
TEH_MARBUTA | |
YEH | |
Methods
Name | Description |
---|---|
Normalize(Char[], Int32) | Normalizes an input buffer of Arabic text. |
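A call to `Normalize` might look like the sketch below. This page lists only the parameter types, so two details are assumptions here: that the class lives in the `Lucene.Net.Analysis.Ar` namespace, and that the method returns the length of the buffer after normalization, as the Java Lucene method it was ported from does. The wrapper `ArabicNormalizerUsage.NormalizeTerm` is a hypothetical helper written for this example.

```csharp
using System;
using Lucene.Net.Analysis.Ar; // assumed namespace; requires the Lucene.Net.Analysis.Common package

// Hypothetical helper for illustration; not part of Lucene.Net.
public static class ArabicNormalizerUsage
{
    public static string NormalizeTerm(string term)
    {
        // Work on a copy, since normalization overwrites the buffer in place.
        char[] buffer = term.ToCharArray();

        var normalizer = new ArabicNormalizer();

        // Assumed: the return value is the number of valid chars left in the buffer.
        int newLength = normalizer.Normalize(buffer, buffer.Length);

        // Only the first newLength chars are meaningful after the call.
        return new string(buffer, 0, newLength);
    }
}
```

In an analysis chain the buffer would normally be the token's term buffer rather than a copied string, which is where the in-place design avoids extra allocations.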