Class WordlistLoader
Loader for text files that represent a list of stopwords.
See IOUtils to obtain System.IO.TextReader instances. @lucene.internal
Inheritance
Inherited Members
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public class WordlistLoader
Methods
Name | Description |
---|---|
GetLines(Stream, Encoding) | Reads the given stream and returns the non-comment lines containing data, decoded using the given character encoding. A comment line is any line that starts with the character "#". |
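GetSnowballWordSet(TextReader, CharArraySet) | Reads stopwords from a stopword list in Snowball format. In the Snowball format, lines may contain multiple words separated by whitespace, the comment character is the vertical line, and lines may contain trailing comments. |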
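GetSnowballWordSet(TextReader, LuceneVersion) | Reads stopwords from a stopword list in Snowball format. In the Snowball format, lines may contain multiple words separated by whitespace, the comment character is the vertical line, and lines may contain trailing comments. |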
GetStemDict(TextReader, CharArrayMap<String>) | Reads a stem dictionary. Each line contains a word followed by its stem, separated by a tab character (i.e. two tab-separated words). |
GetWordSet(TextReader, CharArraySet) | Reads lines from a System.IO.TextReader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). Every line of the System.IO.TextReader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer). |
GetWordSet(TextReader, LuceneVersion) | Reads lines from a System.IO.TextReader and adds every line as an entry to a CharArraySet (omitting leading and trailing whitespace). Every line of the System.IO.TextReader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer). |
GetWordSet(TextReader, String, CharArraySet) | Reads lines from a System.IO.TextReader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). Every line of the System.IO.TextReader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer). |
GetWordSet(TextReader, String, LuceneVersion) | Reads lines from a System.IO.TextReader and adds every non-comment line as an entry to a CharArraySet (omitting leading and trailing whitespace). Every line of the System.IO.TextReader should contain only one word. The words need to be in lowercase if you make use of an Analyzer which uses LowerCaseFilter (like StandardAnalyzer). |
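
The two file formats described in the table can be illustrated with a minimal sketch. It does not call Lucene.NET and is not the library's implementation: `WordlistFormatSketch`, `ParseSnowball`, and `ParseStemDict` are hypothetical names, and the results are plain .NET collections rather than CharArraySet or CharArrayMap. Skipping stem-dictionary lines that have no tab is also an assumption made here for simplicity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative only: a hypothetical helper class, not part of Lucene.NET.
public static class WordlistFormatSketch
{
    // Snowball format: the vertical line starts a comment (whole-line or
    // trailing), and a line may hold several whitespace-separated words.
    public static ISet<string> ParseSnowball(TextReader reader)
    {
        var words = new SortedSet<string>(StringComparer.Ordinal);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            int comment = line.IndexOf('|');
            if (comment >= 0)
            {
                line = line.Substring(0, comment); // drop the comment part
            }
            // A null separator array makes Split break on any whitespace.
            foreach (string word in line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
            {
                words.Add(word);
            }
        }
        return words;
    }

    // Stem dictionary: each line is a word, a tab, then the stem.
    // Assumption: lines without a tab are skipped rather than treated as errors.
    public static IDictionary<string, string> ParseStemDict(TextReader reader)
    {
        var stems = new Dictionary<string, string>(StringComparer.Ordinal);
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            // Split into at most two parts so the stem keeps any later tabs.
            string[] parts = line.Split(new[] { '\t' }, 2);
            if (parts.Length == 2)
            {
                stems[parts[0]] = parts[1];
            }
        }
        return stems;
    }
}
```

Passing `ParseSnowball` a `StringReader` over the three lines `i | first person`, `me my`, and `| note` would yield the words i, me, and my. With the real API, the same input would go to GetSnowballWordSet together with either an existing CharArraySet or a LuceneVersion.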