Class StringTokenizer
The StringTokenizer class allows an application to break a string into tokens by performing code point comparison. The StringTokenizer methods do not distinguish among identifiers, numbers, and quoted strings, nor do they recognize and skip comments.
Inheritance
Assembly: DistributedLucene.Net.dll
Syntax
public class StringTokenizer : object
Remarks
The set of delimiters (the code points that separate tokens) may be specified either at creation time or on a per-token basis.
An instance of StringTokenizer behaves in one of two ways, depending on whether it was created with the returnDelimiters flag having the value true or false:
- If returnDelimiters is false, delimiter code points serve to separate tokens. A token is a maximal sequence of consecutive code points that are not delimiters.
- If returnDelimiters is true, delimiter code points are themselves considered to be tokens. In this case a token will be received for each delimiter code point.
A token is thus either one delimiter code point, or a maximal sequence of consecutive code points that are not delimiters.
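The two modes can be compared side by side. The following is a minimal sketch, assuming the class lives in the Lucene.Net.Support namespace (inferred from the member names on this page); the ReturnDelimitersDemo class and its Tokenize helper are illustrative names, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Support; // assumed namespace, inferred from this page

public static class ReturnDelimitersDemo
{
    // Hypothetical helper: collects every remaining token into a list.
    public static List<string> Tokenize(string input, string delimiters, bool returnDelimiters)
    {
        var tokenizer = new StringTokenizer(input, delimiters, returnDelimiters);
        var tokens = new List<string>();
        while (tokenizer.HasMoreTokens())
        {
            tokens.Add(tokenizer.NextToken());
        }
        return tokens;
    }

    public static void Main()
    {
        // Flag false: ',' and ';' only separate tokens.
        Console.WriteLine(string.Join("|", Tokenize("a,b;c", ",;", false))); // a|b|c

        // Flag true: each delimiter comes back as its own token.
        Console.WriteLine(string.Join("|", Tokenize("a,b;c", ",;", true)));  // a|,|b|;|c
    }
}
```

Returning the delimiters is useful when the caller needs to know which separator appeared between two tokens, for example to tell ',' from ';' in the input above.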
A StringTokenizer object internally maintains a current position within the string to be tokenized. Some operations advance this current position past the code point processed.
A token is returned by taking a substring of the string that was used to create the StringTokenizer object.
The following example uses a StringTokenizer with the default delimiters:
StringTokenizer st = new StringTokenizer("this is a test");
while (st.HasMoreTokens()) {
    // Print each token on its own line.
    Console.WriteLine(st.NextToken());
}
This prints the following output:
this
is
a
test
The following example uses a StringTokenizer with user-specified delimiters:
StringTokenizer st = new StringTokenizer(
    "this is a test with supplementary characters \ud800\ud800\udc00\udc00",
    " \ud800\udc00");
while (st.HasMoreTokens()) {
    // Print each token on its own line.
    Console.WriteLine(st.NextToken());
}
This prints the following output:
this
is
a
test
with
supplementary
characters
\ud800
\udc00
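The remarks also mention that delimiters may be supplied on a per-token basis, through the NextToken(String) overload. The following is a minimal sketch of that usage, assuming the Lucene.Net.Support namespace; PerTokenDelimitersDemo and SplitKeyValue are illustrative names. Note that the new delimiter set must include the separator the tokenizer is currently positioned on ('=' here), otherwise that character becomes part of the next token.

```csharp
using System;
using Lucene.Net.Support; // assumed namespace, inferred from this page

public static class PerTokenDelimitersDemo
{
    // Hypothetical helper: reads the first key and value from "key=value;..." input.
    public static string[] SplitKeyValue(string input)
    {
        // Start with '=' as the only delimiter to read the key.
        var tokenizer = new StringTokenizer(input, "=");
        string key = tokenizer.NextToken();

        // Change delimiters for this token: skip the '=' and stop at ';'.
        // Spaces are not delimiters here, so they stay inside the value.
        string value = tokenizer.NextToken("=;");

        return new[] { key, value };
    }

    public static void Main()
    {
        string[] parts = SplitKeyValue("name=Ada Lovelace;role=engineer");
        Console.WriteLine(parts[0]); // name
        Console.WriteLine(parts[1]); // Ada Lovelace
    }
}
```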
Constructors
Name | Description |
---|---|
StringTokenizer(String) | Constructs a new StringTokenizer for the parameter string using whitespace as the delimiter. The returnDelimiters flag is set to false. |
StringTokenizer(String, String) | Constructs a new StringTokenizer for the parameter string using the specified delimiters. The returnDelimiters flag is set to false. |
StringTokenizer(String, String, Boolean) | Constructs a new StringTokenizer for the parameter string using the specified delimiters, returning the delimiters as tokens if the Boolean parameter is true. |
Methods
Name | Description |
---|---|
CountTokens() | Returns the number of unprocessed tokens remaining in the string. |
HasMoreTokens() | Returns true if unprocessed tokens remain. |
NextToken() | Returns the next token in the string as a String. |
NextToken(String) | Returns the next token in the string as a String, using the specified delimiters. |
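The relationship between CountTokens(), HasMoreTokens(), and NextToken() can be seen by counting while reading. The following is a minimal sketch, assuming the Lucene.Net.Support namespace, an integer return value from CountTokens(), and that CountTokens() itself does not advance the current position; CountTokensDemo and CountsWhileReading are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Support; // assumed namespace, inferred from this page

public static class CountTokensDemo
{
    // Hypothetical helper: records the remaining-token count before reading
    // and again after each token is consumed.
    public static int[] CountsWhileReading(string input)
    {
        var tokenizer = new StringTokenizer(input);
        var counts = new List<int>();
        counts.Add(tokenizer.CountTokens());
        while (tokenizer.HasMoreTokens())
        {
            tokenizer.NextToken();               // advances the current position
            counts.Add(tokenizer.CountTokens()); // counts only what is left
        }
        return counts.ToArray();
    }

    public static void Main()
    {
        int[] counts = CountsWhileReading("this is a test");
        Console.WriteLine(string.Join(",", counts)); // 4,3,2,1,0
    }
}
```

Checking HasMoreTokens() before each NextToken() call, as the loop does, avoids asking for a token after the input is exhausted.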