Class CharTokenizer
An abstract base class for simple, character-oriented tokenizers.
Namespace:
Assembly: Lucene.Net.NetCore.dll
Syntax
public abstract class CharTokenizer : Tokenizer, IDisposable
Constructors
Name | Description |
---|---|
CharTokenizer(AttributeSource, IO.TextReader) | |
CharTokenizer(AttributeSource.AttributeFactory, IO.TextReader) | |
CharTokenizer(IO.TextReader) | |
Methods
Name | Description |
---|---|
End() | |
IncrementToken() | |
IsTokenChar(Char) | Returns true if and only if the character should be included in a token. The tokenizer emits each maximal run of adjacent characters that satisfy this predicate as a single token. Characters for which it returns false mark token boundaries and are not included in any token. |
Normalize(Char) | Called on each token character to normalize it before it is added to the token. The default implementation returns the character unchanged. Subclasses may override it to, for example, lowercase tokens. |
Reset(IO.TextReader) | |
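
Example

The way IsTokenChar(Char) and Normalize(Char) interact can be illustrated with a minimal standalone sketch. It does not reference Lucene.Net and is not the library's implementation: the class name `CharTokenizerSketch`, the method `Tokenize`, and the two delegate parameters are hypothetical stand-ins for the methods a subclass would override, and details such as attribute handling and offsets are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical illustration only; this type is not part of Lucene.Net.
public static class CharTokenizerSketch
{
    // Splits text into maximal runs of characters accepted by isTokenChar,
    // passing each accepted character through normalize before storing it.
    public static List<string> Tokenize(
        string text,
        Func<char, bool> isTokenChar,
        Func<char, char> normalize)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();

        foreach (char c in text)
        {
            if (isTokenChar(c))
            {
                // Token character: normalize it and extend the current token.
                current.Append(normalize(c));
            }
            else if (current.Length > 0)
            {
                // Boundary character: close the current token and drop the boundary.
                tokens.Add(current.ToString());
                current.Clear();
            }
        }

        // Flush a token that runs to the end of the input.
        if (current.Length > 0)
        {
            tokens.Add(current.ToString());
        }

        return tokens;
    }
}
```

Under these assumptions, `CharTokenizerSketch.Tokenize("Hello, World 42!", char.IsLetter, char.ToLowerInvariant)` yields the two tokens `hello` and `world`: the comma, spaces, digits, and exclamation mark fail the predicate, so they only act as boundaries.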