Class Tokenizer
A Tokenizer is a TokenStream whose input is a TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call ClearAttributes() before setting attributes.
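The subclassing contract above can be illustrated with a minimal sketch. `SimpleWhitespaceTokenizer` is a hypothetical class written for this example, not part of the library. It assumes the `Lucene.Net.Analysis` and `Lucene.Net.Analysis.TokenAttributes` namespaces and the `ICharTermAttribute` and `IOffsetAttribute` attribute interfaces as found in Lucene.Net 4.8; names may differ in other builds.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes; // assumed namespace (Lucene.Net 4.8)

// Hypothetical example: splits the input on whitespace.
public sealed class SimpleWhitespaceTokenizer : Tokenizer
{
    private readonly ICharTermAttribute termAtt;
    private readonly IOffsetAttribute offsetAtt;
    private int offset; // number of characters read from m_input so far

    public SimpleWhitespaceTokenizer(TextReader input)
        : base(input)
    {
        termAtt = AddAttribute<ICharTermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        // Required by the contract: clear state left over from the previous token.
        ClearAttributes();

        int start = -1;
        int c;
        while ((c = m_input.Read()) != -1)
        {
            offset++;
            if (char.IsWhiteSpace((char)c))
            {
                if (start >= 0) break; // whitespace ends the current token
                continue;              // skip leading whitespace
            }
            if (start < 0) start = offset - 1;
            termAtt.Append((char)c);
        }

        if (start < 0) return false; // input exhausted, no more tokens

        // Map raw offsets back through any CharFilter wrapped around the input.
        offsetAtt.SetOffset(CorrectOffset(start), CorrectOffset(start + termAtt.Length));
        return true;
    }

    public override void End()
    {
        base.End();
        int finalOffset = CorrectOffset(offset);
        offsetAtt.SetOffset(finalOffset, finalOffset);
    }

    public override void Reset()
    {
        base.Reset();
        offset = 0; // start counting again for the new input
    }
}
```

The class is declared `sealed` because Lucene.Net expects a concrete TokenStream to be sealed or to seal its IncrementToken() override.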
Assembly: DistributedLucene.Net.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
Constructors
Name | Description |
---|---|
Tokenizer(AttributeSource.AttributeFactory, TextReader) | Construct a token stream processing the given input using the given AttributeSource.AttributeFactory. |
Tokenizer(TextReader) | Construct a token stream processing the given input. |
Fields
Name | Description |
---|---|
m_input | The text source for this Tokenizer. |
Methods
Name | Description |
---|---|
CorrectOffset(Int32) | Returns the corrected offset. If m_input is a CharFilter subclass, this method calls its CorrectOffset(Int32); otherwise it returns the given offset unchanged. |
Dispose(Boolean) | Releases resources associated with this stream. If you override this method, always call base.Dispose(disposing); otherwise some internal state will not be correctly reset. |
Reset() | |
SetReader(TextReader) | Expert: Set a new reader on the Tokenizer. Typically, an analyzer (in its tokenStream method) will use this to re-use a previously created tokenizer. |
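As a rough sketch of how SetReader(TextReader) fits into reuse, the helper below feeds a new reader to an existing tokenizer and collects its terms. `TokenizerReuse` and `Collect` are hypothetical names introduced for this example. The sketch assumes the consumer workflow of Lucene.Net 4.8 (Reset(), IncrementToken() until it returns false, End(), then release the stream) and assumes that in this build Dispose() is the call that releases the previous input so that SetReader can be called again; newer builds may use a separate Close() method for that step.

```csharp
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.TokenAttributes; // assumed namespace (Lucene.Net 4.8)

public static class TokenizerReuse
{
    // Hypothetical helper: runs an existing tokenizer over a new reader.
    public static List<string> Collect(Tokenizer tokenizer, TextReader reader)
    {
        var terms = new List<string>();
        ICharTermAttribute termAtt = tokenizer.AddAttribute<ICharTermAttribute>();

        tokenizer.SetReader(reader); // swap in the new input
        try
        {
            tokenizer.Reset();       // must be called before the first IncrementToken()
            while (tokenizer.IncrementToken())
            {
                terms.Add(termAtt.ToString());
            }
            tokenizer.End();         // lets the stream record its final state
        }
        finally
        {
            tokenizer.Dispose();     // assumed to release the input for the next SetReader call
        }
        return terms;
    }
}
```

Calling `Collect` repeatedly with the same tokenizer instance and different readers avoids constructing a new tokenizer for every document, which is the re-use pattern the SetReader description refers to.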