Class CzechAnalyzer
Analyzer for the Czech language.
Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.
You must specify the required LuceneVersion compatibility when creating CzechAnalyzer:
- As of 3.1, words are stemmed with CzechStemFilter
- As of 2.9, StopFilter preserves position increments
- As of 2.4, tokens incorrectly identified as acronyms are corrected (see LUCENE-1068)
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public sealed class CzechAnalyzer : StopwordAnalyzerBase, IDisposable
Constructors
Name | Description |
---|---|
CzechAnalyzer(LuceneVersion) | Builds an analyzer with the default stop words (DefaultStopSet). |
CzechAnalyzer(LuceneVersion, CharArraySet) | Builds an analyzer with the given stop words. |
CzechAnalyzer(LuceneVersion, CharArraySet, CharArraySet) | Builds an analyzer with the given stop words and a set of words to be excluded from stemming by the CzechStemFilter. |
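The three constructors above could be used roughly as follows. This is a minimal sketch, not a definitive usage pattern: it assumes the Lucene.NET 4.8 packages, where `LuceneVersion.LUCENE_48` is available and `CharArraySet` (from `Lucene.Net.Analysis.Util`) can be built from a collection of strings. The stop words and the stem-exclusion word are illustrative choices, not values taken from the library.

```csharp
using System;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CzechAnalyzerConstruction
{
    public static void Main()
    {
        // Assumption: targeting Lucene.NET 4.8 behaviour.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // 1. Default stop words (CzechAnalyzer.DefaultStopSet).
        using (var defaultAnalyzer = new CzechAnalyzer(version))
        {
            Console.WriteLine("Default stop words: " + CzechAnalyzer.DefaultStopSet.Count);
        }

        // 2. Custom stop words (illustrative list), matched case-insensitively.
        var stopWords = new CharArraySet(version, new[] { "a", "nebo", "že" }, true);
        using (var customStops = new CzechAnalyzer(version, stopWords))
        {
            Console.WriteLine("Custom stop words: " + stopWords.Count);
        }

        // 3. Custom stop words plus words that should not be stemmed.
        //    "praha" is a hypothetical example of a term to keep unstemmed.
        var stemExclusions = new CharArraySet(version, new[] { "praha" }, true);
        using (var withExclusions = new CzechAnalyzer(version, stopWords, stemExclusions))
        {
            Console.WriteLine("Stem exclusions: " + stemExclusions.Count);
        }
    }
}
```

Each analyzer is wrapped in `using` because the class implements `IDisposable`.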
Fields
Name | Description |
---|---|
DEFAULT_STOPWORD_FILE | File containing default Czech stopwords. |
Properties
Name | Description |
---|---|
DefaultStopSet | Returns the set of default Czech stopwords. |
Methods
Name | Description |
---|---|
CreateComponents(String, TextReader) | Creates TokenStreamComponents used to tokenize all the text in the provided System.IO.TextReader. |
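`CreateComponents` is normally not called directly; the components it builds are consumed through the analyzer's token stream. The sketch below shows one way to inspect the resulting terms, assuming Lucene.NET 4.8 and its `GetTokenStream` and `ICharTermAttribute` APIs. The field name `"body"` and the sample sentence are illustrative assumptions.

```csharp
using System;
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Cz;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class CzechAnalyzerTokens
{
    public static void Main()
    {
        // Illustrative input; any Czech text works here.
        const string text = "Prahou protéká řeka Vltava a je to krásné město.";

        using (var analyzer = new CzechAnalyzer(LuceneVersion.LUCENE_48))
        using (TokenStream stream = analyzer.GetTokenStream("body", new StringReader(text)))
        {
            // The term attribute exposes the text of the current token.
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                  // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                // Terms arrive lowercased, with stop words removed and stems applied.
                Console.WriteLine(term.ToString());
            }
            stream.End();                    // finalizes end-of-stream state
        }
    }
}
```

Printing the terms this way is a quick check on how a given stop word set or stem exclusion set affects what would be indexed.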