Class StreamTokenizer
Parses a stream into a set of defined tokens, one at a time. The different types of tokens that can be found are numbers, identifiers, quoted strings, and different comment styles. The class can be used for limited processing of source code of programming languages like Java, although it is nowhere near a full parser.
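The token-at-a-time model described above can be sketched as a simple read loop. This is a minimal illustration, not an authoritative usage pattern. It uses only members listed on this page, and it assumes that the namespace is `Lucene.Net.Support` (the page does not state one), that the `TT_*` constants can be accessed through the type name, and that `TokenType` is an integer.

```csharp
using System;
using System.IO;
// Assumption: this page does not name the namespace; adjust to match your assembly.
using Lucene.Net.Support;

public static class StreamTokenizerDemo
{
    public static void Main()
    {
        var tokenizer = new StreamTokenizer(new StringReader("width = 42"));

        // Each NextToken() call reads one token and records its type in TokenType.
        for (tokenizer.NextToken();
             tokenizer.TokenType != StreamTokenizer.TT_EOF;
             tokenizer.NextToken())
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_WORD)
            {
                Console.WriteLine("word:   " + tokenizer.StringValue);
            }
            else if (tokenizer.TokenType == StreamTokenizer.TT_NUMBER)
            {
                Console.WriteLine("number: " + tokenizer.NumberValue);
            }
            else
            {
                // A single character is reported as its integer value.
                Console.WriteLine("char:   " + (char)tokenizer.TokenType);
            }
        }
    }
}
```

The loop checks `TokenType` after every call rather than relying on the return value of `NextToken()`, since the return type is not shown on this page.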
Inheritance
Assembly: DistributedLucene.Net.dll
Syntax
public class StreamTokenizer : object
Constructors
Name | Description |
---|---|
StreamTokenizer(Stream) | Constructs a new StreamTokenizer with |
StreamTokenizer(TextReader) | Constructs a new StreamTokenizer with r as source reader. The tokenizer's initial state is as follows:
|
Fields
Name | Description |
---|---|
TT_EOF | The constant representing the end of the stream. |
TT_EOL | The constant representing the end of the line. |
TT_NUMBER | The constant representing a number token. |
TT_WORD | The constant representing a word token. |
Properties
Name | Description |
---|---|
IsEOLSignificant | Specifies whether the end of a line is significant and should be returned
as TT_EOL in TokenType by this tokenizer.
|
LineNumber | Gets the current line number. |
LowerCaseMode | Specifies whether word tokens should be converted to lower case when they
are stored in StringValue. |
NumberValue | Contains a number if the current token is a number (TokenType == TT_NUMBER). |
SlashSlashComments | Specifies whether "slash-slash" (C++-style) comments shall be recognized.
This kind of comment ends at the end of the line.
|
SlashStarComments | Specifies whether "slash-star" (C-style) comments shall be recognized.
Slash-star comments cannot be nested and end when a star-slash
combination is found.
|
StringValue | Contains a string if the current token is a word (TokenType == TT_WORD). |
TokenType | After calling NextToken(), TokenType contains the type of the token that has been read. When a single character is read, its value converted to an integer is stored in TokenType. For a quoted string, the value is the quote character. Otherwise, its value is one of TT_EOF, TT_EOL, TT_NUMBER, or TT_WORD.
|
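The properties above can be combined to control how line ends, letter case, and comments are handled. The following is a sketch under stated assumptions: the boolean properties are taken to be settable, the namespace `Lucene.Net.Support` is a guess because this page does not state one, and the sample input is purely illustrative.

```csharp
using System;
using System.IO;
// Assumption: this page does not name the namespace; adjust to match your assembly.
using Lucene.Net.Support;

public static class TokenizerOptionsDemo
{
    public static void Main()
    {
        var source = "Alpha /* block comment */ BETA\ngamma // line comment\n";

        // Assumption: these properties expose public setters.
        var tokenizer = new StreamTokenizer(new StringReader(source))
        {
            IsEOLSignificant = true,    // report line ends as TT_EOL
            LowerCaseMode = true,       // lower-case word tokens in StringValue
            SlashSlashComments = true,  // recognize "slash-slash" comments
            SlashStarComments = true    // recognize "slash-star" comments
        };

        for (tokenizer.NextToken();
             tokenizer.TokenType != StreamTokenizer.TT_EOF;
             tokenizer.NextToken())
        {
            if (tokenizer.TokenType == StreamTokenizer.TT_EOL)
            {
                Console.WriteLine("<end of line> LineNumber = " + tokenizer.LineNumber);
            }
            else if (tokenizer.TokenType == StreamTokenizer.TT_WORD)
            {
                Console.WriteLine("word: " + tokenizer.StringValue);
            }
        }
    }
}
```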
Methods
Name | Description |
---|---|
CommentChar(Int32) | Specifies that the character |
NextToken() | Parses the next token from this tokenizer's source stream or reader. The type of the token is stored in the TokenType property; additional information may be stored in the NumberValue or StringValue properties. |
OrdinaryChar(Int32) | Specifies that the character |
OrdinaryChars(Int32, Int32) | Specifies that the characters in the range from |
ParseNumbers() | Specifies that this tokenizer shall parse numbers. |
PushBack() | Indicates that the current token should be pushed back and returned again the next time NextToken() is called. |
QuoteChar(Int32) | Specifies that the character |
ResetSyntax() | Specifies that all characters shall be treated as ordinary characters. |
ToString() | Returns the state of this tokenizer in a readable format. |
WhitespaceChars(Int32, Int32) | Specifies that the characters in the range from |
WordChars(Int32, Int32) | Specifies that the characters in the range from |
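The syntax methods above can be used to build a character table from scratch, and PushBack() to re-read a token. The sketch below is hedged: several method descriptions in the table are truncated, so the behavior assumed here (range arguments are inclusive character codes, `QuoteChar` marks a string delimiter) follows the conventions of Java's `java.io.StreamTokenizer` and should be checked against the actual implementation. The namespace and the `Describe` helper are hypothetical.

```csharp
using System;
using System.IO;
// Assumption: this page does not name the namespace; adjust to match your assembly.
using Lucene.Net.Support;

public static class CustomSyntaxDemo
{
    public static void Main()
    {
        var tokenizer = new StreamTokenizer(new StringReader("name 'Ada Lovelace'"));

        tokenizer.ResetSyntax();            // treat every character as ordinary
        tokenizer.WordChars('a', 'z');      // assumed: inclusive range of word characters
        tokenizer.WordChars('A', 'Z');
        tokenizer.WhitespaceChars(0, ' ');  // assumed: control characters and space separate tokens
        tokenizer.QuoteChar('\'');          // assumed: single quote delimits quoted strings

        tokenizer.NextToken();
        Console.WriteLine(Describe(tokenizer));

        // PushBack() makes the next NextToken() call return the current token again.
        tokenizer.PushBack();
        tokenizer.NextToken();
        Console.WriteLine(Describe(tokenizer));

        tokenizer.NextToken();
        Console.WriteLine(Describe(tokenizer));
    }

    // Hypothetical helper for this example only; not part of StreamTokenizer.
    private static string Describe(StreamTokenizer t)
    {
        if (t.TokenType == StreamTokenizer.TT_WORD) return "word: " + t.StringValue;
        if (t.TokenType == StreamTokenizer.TT_NUMBER) return "number: " + t.NumberValue;
        if (t.TokenType == StreamTokenizer.TT_EOF) return "end of stream";
        return "character code: " + t.TokenType;
    }
}
```

The `char` arguments convert implicitly to the `Int32` parameters shown in the method signatures.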