Class StreamTokenizer

Parses a stream into a set of defined tokens, one at a time. The different types of tokens that can be found are numbers, identifiers, quoted strings, and different comment styles. The class can be used for limited processing of source code of programming languages like Java, although it is nowhere near a full parser.

Inheritance

System.Object

StreamTokenizer

Assembly: DistributedLucene.Net.dll

Syntax

public class StreamTokenizer : object

Constructors

Name	Description
StreamTokenizer(Stream)	Constructs a new StreamTokenizer with `input` as source input stream. This constructor is deprecated; use the constructor that takes a TextReader as an argument instead.
StreamTokenizer(TextReader)	Constructs a new StreamTokenizer with `r` as source reader. The tokenizer's initial state is as follows: All byte values 'A' through 'Z', 'a' through 'z', and '\u00A0' through '\u00FF' are considered to be alphabetic. All byte values '\u0000' through '\u0020' are considered to be white space. `/` is a comment character. Single quote `'` and double quote `"` are string quote characters. Numbers are parsed. Line ends are considered to be white space rather than separate tokens. C-style and C++-style comments are not recognized.
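
The initial state described for StreamTokenizer(TextReader) can be illustrated with a small sketch. It is written in Java against `java.io.StreamTokenizer`, whose documented behavior these descriptions match, on the assumption that this port behaves the same way. `nextToken()`, `ttype`, `sval`, and `nval` stand in for NextToken(), TokenType, StringValue, and NumberValue. The class `DefaultStateDemo` and its `tokenize` helper are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class DefaultStateDemo {
    // Describes every token produced with the tokenizer's default syntax table.
    static List<String> tokenize(String input) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(input));
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                switch (st.ttype) {
                    case StreamTokenizer.TT_WORD:
                        out.add("WORD:" + st.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        out.add("NUMBER:" + st.nval);
                        break;
                    case '"':
                    case '\'':
                        // For a quoted string, ttype is the quote character
                        // and sval holds the text between the quotes.
                        out.add("STRING:" + st.sval);
                        break;
                    default:
                        // Any other value is an ordinary single character.
                        out.add("CHAR:" + (char) st.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("width = 42 'px' / trailing comment"));
    }
}
```

With the default syntax table, the input above yields a word, the ordinary character `=`, the number 42.0, and the quoted string `px`. Everything from `/` to the end of the line is skipped as a comment.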

Fields

Name	Description
TT_EOF	The constant representing the end of the stream.
TT_EOL	The constant representing the end of the line.
TT_NUMBER	The constant representing a number token.
TT_WORD	The constant representing a word token.

Properties

Name	Description
IsEOLSignificant	Specifies whether the end of a line is significant and should be returned as TT_EOL in TokenType by this tokenizer. `true` if EOL is significant, `false` otherwise.
LineNumber	Gets the current line number.
LowerCaseMode	Specifies whether word tokens should be converted to lower case when they are stored in StringValue. `true` if StringValue should be converted to lower case, `false` otherwise.
NumberValue	Contains a number if the current token is a number (TokenType == TT_NUMBER).
SlashSlashComments	Specifies whether "slash-slash" (C++-style) comments shall be recognized. This kind of comment ends at the end of the line. `true` if `//` should be recognized as the start of a comment, `false` otherwise.
SlashStarComments	Specifies whether "slash-star" (C-style) comments shall be recognized. Slash-star comments cannot be nested and end when a star-slash combination is found. `true` if `/*` should be recognized as the start of a comment, `false` otherwise.
StringValue	Contains a string if the current token is a word (TokenType == TT_WORD).
TokenType	After NextToken() is called, contains the type of the token that was read. For a single-character token, the value is that character converted to an integer. For a quoted string, the value is the quote character that delimits the string. Otherwise, the value is one of the following: TT_WORD (the token is a word); TT_NUMBER (the token is a number); TT_EOL (the end of a line has been reached; returned only if IsEOLSignificant is `true`); TT_EOF (the end of the stream has been reached).
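
The following sketch shows how the comment, end-of-line, and lower-case options interact. It uses Java's `java.io.StreamTokenizer` as a stand-in, assuming this port behaves the same way: `slashSlashComments(true)`, `slashStarComments(true)`, `eolIsSignificant(true)`, and `lowerCaseMode(true)` correspond to setting SlashSlashComments, SlashStarComments, IsEOLSignificant, and LowerCaseMode to `true`. The class `CommentOptionsDemo` and its `tokenize` helper are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class CommentOptionsDemo {
    // Tokenizes source text with both comment styles enabled, line ends
    // reported as tokens, and word tokens folded to lower case.
    static List<String> tokenize(String source) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(source));
        st.slashSlashComments(true); // recognize "// ..." up to the end of the line
        st.slashStarComments(true);  // recognize "/* ... */" (not nestable)
        st.eolIsSignificant(true);   // report line ends as TT_EOL
        st.lowerCaseMode(true);      // lower-case word tokens stored in sval
        List<String> out = new ArrayList<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == StreamTokenizer.TT_EOL) {
                    out.add("EOL");
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    out.add(st.sval);
                } else if (st.ttype == StreamTokenizer.TT_NUMBER) {
                    out.add(String.valueOf(st.nval));
                } else {
                    out.add(String.valueOf((char) st.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Int X = 1; // init\n/* skip */ Y\n"));
    }
}
```

Both comments disappear from the token stream, each line end shows up as its own token, and `Int`, `X`, and `Y` come back in lower case.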

Methods

Name	Description
CommentChar(Int32)	Specifies that the character `ch` shall be treated as a comment character.
NextToken()	Parses the next token from this tokenizer's source stream or reader. The type of the token is stored in the TokenType property; additional information may be stored in the NumberValue or StringValue properties.
OrdinaryChar(Int32)	Specifies that the character `ch` shall be treated as an ordinary character by this tokenizer. That is, it has no special meaning as a comment character, word component, white space, string delimiter or number.
OrdinaryChars(Int32, Int32)	Specifies that the characters in the range from `low` to `hi` shall be treated as ordinary characters by this tokenizer. That is, they have no special meaning as a comment character, word component, white space, string delimiter or number.
ParseNumbers()	Specifies that this tokenizer shall parse numbers.
PushBack()	Indicates that the current token should be pushed back and returned again the next time NextToken() is called.
QuoteChar(Int32)	Specifies that the character `ch` shall be treated as a quote character.
ResetSyntax()	Specifies that all characters shall be treated as ordinary characters.
ToString()	Returns the state of this tokenizer in a readable format.
WhitespaceChars(Int32, Int32)	Specifies that the characters in the range from `low` to `hi` shall be treated as whitespace characters by this tokenizer.
WordChars(Int32, Int32)	Specifies that the characters in the range from `low` to `hi` shall be treated as word characters by this tokenizer. A word consists of a word character followed by zero or more word or number characters.
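
The syntax-table methods are typically combined: start from ResetSyntax() and then declare only the character classes that are needed. The sketch below does this for a simple `key = value` format and uses PushBack() to handle a key with no value. It is written in Java against `java.io.StreamTokenizer` (`resetSyntax`, `wordChars`, `whitespaceChars`, `quoteChar`, `commentChar`, `pushBack`), assuming this port's methods behave the same way. The class `CustomSyntaxDemo`, the `parse` helper, and the input format itself are hypothetical.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class CustomSyntaxDemo {
    // Parses "key = value" entries; a key that is not followed by '=' maps to "true".
    static Map<String, String> parse(String text) {
        StreamTokenizer st = new StreamTokenizer(new StringReader(text));
        st.resetSyntax();            // every character is now ordinary
        st.wordChars('a', 'z');
        st.wordChars('A', 'Z');
        st.wordChars('0', '9');      // digits are word characters, so "8080" stays text
        st.wordChars('.', '.');
        st.whitespaceChars(0, ' ');
        st.quoteChar('"');
        st.commentChar('#');

        Map<String, String> result = new LinkedHashMap<>();
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype != StreamTokenizer.TT_WORD) {
                    continue;        // ignore stray ordinary characters
                }
                String key = st.sval;
                if (st.nextToken() == '=') {
                    int valueType = st.nextToken();
                    if (valueType == StreamTokenizer.TT_WORD || valueType == '"') {
                        result.put(key, st.sval);
                    } else {
                        st.pushBack(); // not a value: let the main loop see it again
                    }
                } else {
                    // The token after the key belongs to the next entry,
                    // so return it to the tokenizer before continuing.
                    st.pushBack();
                    result.put(key, "true");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "host = example.org  # comment\nverbose\nname = \"My App\"\nport=8080";
        System.out.println(parse(text));
    }
}
```

Because the syntax table is reset first, number parsing is off and `8080` is returned as a word. The bare key `verbose` is detected by reading one token ahead and pushing it back.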

Extension Methods

Number.IsNumber(Object)

SystemTypesHelpers.toString(Object)

SystemTypesHelpers.equals(Object, Object)