Constructor WikipediaTokenizer
WikipediaTokenizer(TextReader)
Creates a new instance of the WikipediaTokenizer. Attaches the input to a newly created JFlex scanner.
Declaration
public WikipediaTokenizer(TextReader input)
Parameters
Type | Name | Description |
---|---|---|
System.IO.TextReader | input | The input System.IO.TextReader |
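A minimal usage sketch for this overload, assuming the Lucene.Net.Analysis.Common package is referenced and that the tokenizer is consumed through the usual Reset / IncrementToken / End sequence. The sample text and the class name `BasicWikipediaTokenizerExample` are illustrative only.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Wikipedia;

public static class BasicWikipediaTokenizerExample
{
    public static void Main()
    {
        // Illustrative wiki markup; any TextReader can serve as the input.
        using (var reader = new StringReader("'''Lucene''' is a [[search engine]] library."))
        using (var tokenizer = new WikipediaTokenizer(reader))
        {
            // Attributes expose the current token's text and its type.
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();
            ITypeAttribute type = tokenizer.AddAttribute<ITypeAttribute>();

            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                Console.WriteLine($"{term} [{type.Type}]");
            }
            tokenizer.End();
        }
    }
}
```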
WikipediaTokenizer(TextReader, Int32, ICollection<String>)
Creates a new instance of the WikipediaTokenizer. Attaches the input to a newly created JFlex scanner.
Declaration
public WikipediaTokenizer(TextReader input, int tokenOutput, ICollection<string> untokenizedTypes)
Parameters
Type | Name | Description |
---|---|---|
System.IO.TextReader | input | The input |
System.Int32 | tokenOutput | One of TOKENS_ONLY, UNTOKENIZED_ONLY, BOTH |
System.Collections.Generic.ICollection<System.String> | untokenizedTypes | Untokenized types |
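The following sketch shows one way to call this overload, assuming the `BOTH` and `CATEGORY` constants are exposed on WikipediaTokenizer and that a `HashSet<string>` is an acceptable `ICollection<string>`. The sample text and class name are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Wikipedia;

public static class UntokenizedTypesExample
{
    public static void Main()
    {
        // Token types that should also be emitted as one untokenized token.
        ICollection<string> untokenizedTypes = new HashSet<string>
        {
            WikipediaTokenizer.CATEGORY
        };

        using (var reader = new StringReader("[[Category:Search engines]] Some '''bold''' text."))
        using (var tokenizer = new WikipediaTokenizer(reader, WikipediaTokenizer.BOTH, untokenizedTypes))
        {
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();
            ITypeAttribute type = tokenizer.AddAttribute<ITypeAttribute>();

            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                Console.WriteLine($"{term} [{type.Type}]");
            }
            tokenizer.End();
        }
    }
}
```

Passing `WikipediaTokenizer.TOKENS_ONLY` or `WikipediaTokenizer.UNTOKENIZED_ONLY` instead of `BOTH` selects the other two output modes listed for `tokenOutput`.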
WikipediaTokenizer(AttributeSource.AttributeFactory, TextReader, Int32, ICollection<String>)
Creates a new instance of the WikipediaTokenizer. Attaches the input to a newly created JFlex scanner. Uses the given AttributeSource.AttributeFactory.
Declaration
public WikipediaTokenizer(AttributeSource.AttributeFactory factory, TextReader input, int tokenOutput, ICollection<string> untokenizedTypes)
Parameters
Type | Name | Description |
---|---|---|
AttributeSource.AttributeFactory | factory | The AttributeSource.AttributeFactory to use |
System.IO.TextReader | input | The input |
System.Int32 | tokenOutput | One of TOKENS_ONLY, UNTOKENIZED_ONLY, BOTH |
System.Collections.Generic.ICollection<System.String> | untokenizedTypes | Untokenized types |
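A short sketch of the factory overload, assuming `AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY` from Lucene.Net.Util is available as the factory. A custom factory would be passed in the same position. The class name and sample text are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Wikipedia;
using Lucene.Net.Util;

public static class FactoryOverloadExample
{
    public static void Main()
    {
        // Assumption: the default factory is used here purely for illustration.
        AttributeSource.AttributeFactory factory =
            AttributeSource.AttributeFactory.DEFAULT_ATTRIBUTE_FACTORY;

        // An empty collection means no types are requested as untokenized.
        ICollection<string> untokenizedTypes = new HashSet<string>();

        using (var reader = new StringReader("Plain text with an [[internal link]]."))
        using (var tokenizer = new WikipediaTokenizer(
            factory, reader, WikipediaTokenizer.TOKENS_ONLY, untokenizedTypes))
        {
            ICharTermAttribute term = tokenizer.AddAttribute<ICharTermAttribute>();

            tokenizer.Reset();
            while (tokenizer.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            tokenizer.End();
        }
    }
}
```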