Enum JapaneseTokenizerMode
Tokenization mode: this determines how the tokenizer handles compound and unknown words.
Assembly: Lucene.Net.Analysis.Kuromoji.dll
Syntax
public enum JapaneseTokenizerMode : int
Fields
Name | Description |
---|---|
EXTENDED | Extended mode outputs unigrams for unknown words. This API is experimental. |
NORMAL | Ordinary segmentation: no decomposition for compounds. |
SEARCH | Segmentation geared towards search: long nouns are decompounded, and the full compound token is also emitted as a synonym. |
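
Examples

The following is a minimal sketch of how a mode might be passed to a tokenizer, not an authoritative usage pattern. It assumes the `JapaneseTokenizer` class in the `Lucene.Net.Analysis.Ja` namespace with a constructor taking a reader, a user dictionary, a discard-punctuation flag, and the mode. It also assumes the Lucene.Net.Analysis.Kuromoji package is referenced and a C# 8 or later compiler. The sample text is arbitrary, and no particular output is claimed.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.Ja;
using Lucene.Net.Analysis.TokenAttributes;

public static class TokenizerModeExample
{
    public static void Main()
    {
        // Arbitrary sample input: a long compound noun.
        const string text = "関西国際空港";

        // Assumption: null means "no user dictionary"; true discards punctuation.
        using var tokenizer = new JapaneseTokenizer(
            new StringReader(text),
            null,
            true,
            JapaneseTokenizerMode.SEARCH);

        // The term attribute exposes the text of the current token.
        ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

        tokenizer.Reset();
        while (tokenizer.IncrementToken())
        {
            Console.WriteLine(term.ToString());
        }
        tokenizer.End();
    }
}
```

Replacing `JapaneseTokenizerMode.SEARCH` with `NORMAL` or `EXTENDED` and comparing the printed tokens is one way to see how each mode treats compounds and unknown words.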