Class CompoundWordTokenFilterBase
Base class for decomposition token filters.
You must specify the required LuceneVersion compatibility when creating CompoundWordTokenFilterBase:
- As of 3.1, CompoundWordTokenFilterBase correctly handles Unicode 4.0 supplementary characters in strings and char arrays provided as compound word dictionaries.
- As of 4.4, CompoundWordTokenFilterBase doesn't update offsets.
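Concrete subclasses such as DictionaryCompoundWordTokenFilter perform the actual decomposition. The following is a minimal sketch of how such a filter is typically wired into an analysis chain; the sample dictionary, input text, and stream setup are illustrative only and are not taken from this page.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Compound;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Sketch: decompose a German compound using a small, illustrative dictionary.
var dictionary = new CharArraySet(LuceneVersion.LUCENE_48,
    new[] { "donau", "dampf", "schiff", "fahrt" }, true /* ignore case */);

TokenStream stream = new WhitespaceTokenizer(LuceneVersion.LUCENE_48,
    new StringReader("Donaudampfschifffahrt"));
stream = new DictionaryCompoundWordTokenFilter(LuceneVersion.LUCENE_48, stream, dictionary);

var termAtt = stream.AddAttribute<ICharTermAttribute>();
stream.Reset();
while (stream.IncrementToken())
{
    // The original token is emitted first, followed by the matched subwords.
    Console.WriteLine(termAtt.ToString());
}
stream.End();
stream.Dispose();
```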
Inherited Members
System.Object.Equals(System.Object, System.Object)
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.GetType()
System.Object.MemberwiseClone()
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public abstract class CompoundWordTokenFilterBase : TokenFilter, IDisposable
Constructors
Fields
Name | Description |
---|---|
DEFAULT_MAX_SUBWORD_SIZE | The default for maximal length of subwords that get propagated to the output of this filter |
DEFAULT_MIN_SUBWORD_SIZE | The default for minimal length of subwords that get propagated to the output of this filter |
DEFAULT_MIN_WORD_SIZE | The default for minimal word length that gets decomposed |
m_dictionary | The compound word dictionary used for decomposition |
m_matchVersion | The LuceneVersion compatibility specified when the filter was created |
m_maxSubwordSize | Maximal length of subwords that get propagated to the output of this filter |
m_minSubwordSize | Minimal length of subwords that get propagated to the output of this filter |
m_minWordSize | Minimal word length that gets decomposed |
m_offsetAtt | The offset attribute of the current token |
m_onlyLongestMatch | Whether only the longest matching subword is propagated to the output |
m_termAtt | The term attribute of the current token |
m_tokens | The list of CompoundWordTokenFilterBase.CompoundToken instances produced by Decompose() |
Methods
Name | Description |
---|---|
Decompose() | Decomposes the current m_termAtt and places CompoundWordTokenFilterBase.CompoundToken instances in the m_tokens list. The original token should not be placed in the list, as it is automatically passed through this filter. |
IncrementToken() | |
Reset() | |
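The Decompose() contract described above can be made concrete with a small sketch. The subclass below is purely illustrative: the class name and the naive substring scan are not part of the library, and it assumes that m_tokens is a LinkedList of CompoundWordTokenFilterBase.CompoundToken and that CompoundToken's constructor takes the enclosing filter plus an (offset, length) pair, mirroring the Java original. Shipped subclasses such as DictionaryCompoundWordTokenFilter also honour m_onlyLongestMatch, which is omitted here for brevity.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Compound;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Illustrative subclass: queues every dictionary match found inside the
// current term as a CompoundToken.
public sealed class NaiveDecompoundingFilter : CompoundWordTokenFilterBase
{
    public NaiveDecompoundingFilter(LuceneVersion matchVersion, TokenStream input, CharArraySet dictionary)
        : base(matchVersion, input, dictionary)
    {
    }

    protected override void Decompose()
    {
        string term = m_termAtt.ToString();

        // Scan all substrings within the configured subword length bounds.
        for (int start = 0; start <= term.Length - m_minSubwordSize; start++)
        {
            for (int len = m_minSubwordSize; len <= m_maxSubwordSize && start + len <= term.Length; len++)
            {
                if (m_dictionary.Contains(term.Substring(start, len)))
                {
                    // Queue the subword; the original token is passed through by the
                    // base class itself, so it must not be added here.
                    // Assumption: CompoundToken takes (enclosing filter, offset, length).
                    m_tokens.AddLast(new CompoundToken(this, start, len));
                }
            }
        }
    }
}
```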