Field SOUTH_EAST_ASIAN_TYPE

Characters in the class \p{Line_Break = Complex_Context} come from South East Asian scripts (Thai, Lao, Myanmar, Khmer, etc.). Sequences of these characters are kept together as a single token rather than broken up, because the logic required to break them at word boundaries is too complex for UAX#29.

See Unicode Line Breaking Algorithm: http://www.unicode.org/reports/tr14/#SA
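
The grouping behavior described above can be illustrated with a small standalone sketch. It does not use the Lucene.Net API, and everything in it is an assumption made for illustration: the class name `ComplexContextSketch`, its methods, and above all the block-range test, which only approximates Complex_Context membership (the real class is defined by the Unicode line-break data and excludes some characters in these blocks, such as digits and punctuation). The sketch splits on whitespace and on transitions between the approximated class and everything else, so a run of Thai text with no spaces comes out as one token.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class ComplexContextSketch
{
    // ASSUMPTION: a rough stand-in for \p{Line_Break = Complex_Context}.
    // It tests whole Unicode blocks, which is broader than the real class.
    public static bool IsApproxComplexContext(char c) =>
        (c >= '\u0E00' && c <= '\u0E7F') ||   // Thai block
        (c >= '\u0E80' && c <= '\u0EFF') ||   // Lao block
        (c >= '\u1000' && c <= '\u109F') ||   // Myanmar block
        (c >= '\u1780' && c <= '\u17FF');     // Khmer block

    // Splits on whitespace and on transitions into or out of the
    // approximated class. A run of such characters is never split further.
    public static List<(string Text, bool IsSouthEastAsian)> Tokenize(string input)
    {
        var tokens = new List<(string Text, bool IsSouthEastAsian)>();
        var current = new StringBuilder();
        bool currentIsSa = false;

        // Emits the pending token, if any, and resets the buffer.
        void Flush()
        {
            if (current.Length > 0)
            {
                tokens.Add((current.ToString(), currentIsSa));
                current.Clear();
            }
        }

        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                Flush();
                continue;
            }

            bool isSa = IsApproxComplexContext(c);
            if (current.Length > 0 && isSa != currentIsSa)
            {
                Flush();
            }

            currentIsSa = isSa;
            current.Append(c);
        }

        Flush();
        return tokens;
    }
}
```

For example, `ComplexContextSketch.Tokenize("hello สวัสดีครับ world")` yields three tokens, and the Thai run in the middle is returned whole and flagged as South East Asian even though it contains more than one word. In an analysis chain, a token carrying this type would typically be handed to a dictionary-based or script-specific segmenter if word-level tokens are needed.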

Assembly: Lucene.Net.Analysis.Common.dll

Syntax

public static readonly int SOUTH_EAST_ASIAN_TYPE

Returns

Type	Description
System.Int32
