Class CompoundWordTokenFilterBase
Base class for decomposition token filters.
You must specify the required LuceneVersion compatibility when creating CompoundWordTokenFilterBase:
- As of 3.1, CompoundWordTokenFilterBase correctly handles Unicode 4.0 supplementary characters in strings and char arrays provided as compound word dictionaries.
- As of 4.4, CompoundWordTokenFilterBase doesn't update offsets.
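Concrete subclasses such as DictionaryCompoundWordTokenFilter perform the actual decomposition. The following is a minimal sketch of how such a filter is typically wired into an analysis chain; the sample dictionary, input text, and stream setup are illustrative only and are not taken from this page.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Compound;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Sketch: decompose a German compound using a small, illustrative dictionary.
var dictionary = new CharArraySet(LuceneVersion.LUCENE_48,
    new[] { "donau", "dampf", "schiff", "fahrt" }, true /* ignore case */);

TokenStream stream = new WhitespaceTokenizer(LuceneVersion.LUCENE_48,
    new StringReader("Donaudampfschifffahrt"));
stream = new DictionaryCompoundWordTokenFilter(LuceneVersion.LUCENE_48, stream, dictionary);

var termAtt = stream.AddAttribute<ICharTermAttribute>();
stream.Reset();
while (stream.IncrementToken())
{
    // The original token is emitted first, followed by the matched subwords.
    Console.WriteLine(termAtt.ToString());
}
stream.End();
stream.Dispose();
```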
Inherited Members
System.Object.Equals(System.Object, System.Object)
System.Object.ReferenceEquals(System.Object, System.Object)
System.Object.GetType()
System.Object.MemberwiseClone()
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public abstract class CompoundWordTokenFilterBase : TokenFilter, IDisposable
Constructors
Fields
Name | Description |
---|---|
DEFAULT_MAX_SUBWORD_SIZE | The default for maximal length of subwords that get propagated to the output of this filter |
DEFAULT_MIN_SUBWORD_SIZE | The default for minimal length of subwords that get propagated to the output of this filter |
DEFAULT_MIN_WORD_SIZE | The default for minimal word length that gets decomposed |
m_dictionary | The compound word dictionary used for decomposition |
m_matchVersion | The LuceneVersion compatibility specified when the filter was created |
m_maxSubwordSize | Maximal length of subwords that get propagated to the output of this filter |
m_minSubwordSize | Minimal length of subwords that get propagated to the output of this filter |
m_minWordSize | Minimal word length that gets decomposed |
m_offsetAtt | The offset attribute of the current token |
m_onlyLongestMatch | Whether only the longest matching subword is propagated to the output |
m_termAtt | The term attribute of the current token |
m_tokens | The list of CompoundWordTokenFilterBase.CompoundToken instances produced by Decompose() |
Methods
Name | Description |
---|---|
Decompose() | Decomposes the current m_termAtt and places CompoundWordTokenFilterBase.CompoundToken instances in the m_tokens list. The original token should not be placed in the list, as it is automatically passed through this filter. |
IncrementToken() | |
Reset() | |
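The Decompose() contract described above can be made concrete with a small sketch. The subclass below is purely illustrative: the class name and the naive substring scan are not part of the library, and it assumes that m_tokens is a LinkedList of CompoundWordTokenFilterBase.CompoundToken and that CompoundToken's constructor takes the enclosing filter plus an (offset, length) pair, mirroring the Java original. Shipped subclasses such as DictionaryCompoundWordTokenFilter also honour m_onlyLongestMatch, which is omitted here for brevity.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Compound;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Illustrative subclass: queues every dictionary match found inside the
// current term as a CompoundToken.
public sealed class NaiveDecompoundingFilter : CompoundWordTokenFilterBase
{
    public NaiveDecompoundingFilter(LuceneVersion matchVersion, TokenStream input, CharArraySet dictionary)
        : base(matchVersion, input, dictionary)
    {
    }

    protected override void Decompose()
    {
        string term = m_termAtt.ToString();

        // Scan all substrings within the configured subword length bounds.
        for (int start = 0; start <= term.Length - m_minSubwordSize; start++)
        {
            for (int len = m_minSubwordSize; len <= m_maxSubwordSize && start + len <= term.Length; len++)
            {
                if (m_dictionary.Contains(term.Substring(start, len)))
                {
                    // Queue the subword; the original token is passed through by the
                    // base class itself, so it must not be added here.
                    // Assumption: CompoundToken takes (enclosing filter, offset, length).
                    m_tokens.AddLast(new CompoundToken(this, start, len));
                }
            }
        }
    }
}
```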