Class Tokenizer
A Tokenizer is a TokenStream whose input is a TextReader.
This is an abstract class; subclasses must override IncrementToken().
NOTE: Subclasses overriding IncrementToken() must call Lucene.Net.Util.AttributeSource.ClearAttributes() before setting attributes.
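
The contract in the note above can be sketched with a minimal subclass. This is an illustration only, not code from the library: `WholeInputTokenizer` is a hypothetical class, and the attribute types (`ITermAttribute`, `IOffsetAttribute`), the generic `AddAttribute<T>()` call, and the `Lucene.Net.Analysis.Tokenattributes` namespace are assumed from the Lucene.Net 3.0.x API, so they may differ in other builds.

```csharp
using System.IO;
using System.Text;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes; // assumed namespace (Lucene.Net 3.0.x)

// Hypothetical example: emits the entire input as a single token.
public sealed class WholeInputTokenizer : Tokenizer
{
    private readonly ITermAttribute termAtt;     // assumed attribute interface
    private readonly IOffsetAttribute offsetAtt; // assumed attribute interface
    private bool done;

    public WholeInputTokenizer(TextReader reader) : base(reader)
    {
        termAtt = AddAttribute<ITermAttribute>();
        offsetAtt = AddAttribute<IOffsetAttribute>();
    }

    public override bool IncrementToken()
    {
        if (done) return false;

        // Clear state left over from any previous token
        // before setting new attribute values.
        ClearAttributes();
        done = true;

        // Read everything from the inherited 'input' field.
        var text = new StringBuilder();
        var buffer = new char[256];
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            text.Append(buffer, 0, read);
        }
        if (text.Length == 0) return false;

        termAtt.SetTermBuffer(text.ToString());
        // Offsets go through CorrectOffset so that a CharStream input
        // can map them back to positions in the original text.
        offsetAtt.SetOffset(CorrectOffset(0), CorrectOffset(text.Length));
        return true;
    }

    public override void Reset(TextReader reader)
    {
        base.Reset(reader);
        done = false; // allow the tokenizer to be reused on the new reader
    }
}
```

Calling `ClearAttributes()` first matters most when filters downstream add attributes of their own: without it, values set for one token could leak into the next.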
Namespace: Lucene.Net.Analysis
Assembly: Lucene.Net.NetCore.dll
Syntax
public abstract class Tokenizer : TokenStream, IDisposable
Constructors
Name | Description |
---|---|
Tokenizer() | Construct a tokenizer with null input. |
Tokenizer(AttributeSource) | Construct a tokenizer with null input using the given AttributeSource. |
Tokenizer(AttributeSource, IO.TextReader) | Construct a token stream processing the given input using the given AttributeSource. |
Tokenizer(AttributeSource.AttributeFactory) | Construct a tokenizer with null input using the given AttributeFactory. |
Tokenizer(AttributeSource.AttributeFactory, IO.TextReader) | Construct a token stream processing the given input using the given AttributeFactory. |
Tokenizer(IO.TextReader) | Construct a token stream processing the given input. |
Fields
Name | Description |
---|---|
input | The text source for this Tokenizer. |
Methods
Name | Description |
---|---|
CorrectOffset(Int32) | Return the corrected offset. If input is a CharStream subclass, this method calls its CorrectOffset(Int32); otherwise it returns the given offset unchanged. |
Dispose(Boolean) | |
Reset(IO.TextReader) | Expert: Reset the tokenizer to a new reader. Typically, an analyzer (in its ReusableTokenStream method) will use this to reuse a previously created tokenizer. |
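
The reuse pattern described for Reset(IO.TextReader) might look like the sketch below. It assumes the Lucene.Net 3.0.x API: `WhitespaceTokenizer` as a concrete Tokenizer, `ITermAttribute` with a `Term` property, and the generic `AddAttribute<T>()` method. `TokenizerReuseDemo` and `Drain` are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Tokenattributes; // assumed namespace (Lucene.Net 3.0.x)

public static class TokenizerReuseDemo
{
    // Hypothetical helper: consumes the tokenizer and collects each term's text.
    public static List<string> Drain(Tokenizer tokenizer)
    {
        // AddAttribute returns the existing instance if one is already registered.
        ITermAttribute term = tokenizer.AddAttribute<ITermAttribute>();
        var terms = new List<string>();
        while (tokenizer.IncrementToken())
        {
            terms.Add(term.Term);
        }
        return terms;
    }

    public static void Main()
    {
        using (var tokenizer = new WhitespaceTokenizer(new StringReader("first text")))
        {
            Console.WriteLine(string.Join(",", Drain(tokenizer)));

            // Point the same tokenizer instance at a new reader
            // instead of constructing a second tokenizer.
            tokenizer.Reset(new StringReader("second text"));
            Console.WriteLine(string.Join(",", Drain(tokenizer)));
        }
    }
}
```

Reusing one instance avoids rebuilding the tokenizer and its attributes for every document, which is why an analyzer would typically cache it.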