Namespace Lucene.Net.Analysis.Nl
Classes
DutchAnalyzer
Analyzer for the Dutch language.
Supports an external list of stopwords (words that will not be indexed at all), an external list of exclusions (words that will not be stemmed, but are still indexed), and an external list of word-stem pairs that override the algorithm (dictionary stemming). A default set of stopwords is used unless an alternative list is specified; the exclusion list is empty by default.
You must specify the required LuceneVersion compatibility when creating DutchAnalyzer:
- As of 3.6, DutchAnalyzer(LuceneVersion, CharArraySet) and DutchAnalyzer(LuceneVersion, CharArraySet, CharArraySet) also populate the default entries for the stem override dictionary.
- As of 3.1, Snowball stemming is done with SnowballFilter, LowerCaseFilter is used prior to StopFilter, and Snowball stopwords are used by default.
- As of 2.9, StopFilter preserves position increments.
NOTE: This class uses the same LuceneVersion dependent settings as StandardAnalyzer.
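The constructor overloads listed above can be exercised roughly as follows. This is a minimal sketch, not a definitive usage pattern: it assumes Lucene.Net 4.8 (LuceneVersion.LUCENE_48), and the word lists, field name "body", and sample sentence are made up for illustration.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class DutchAnalyzerSketch
{
    public static void Main()
    {
        // Assumption: the 4.8 compatibility level; pick the version your index was built with.
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // Hypothetical custom lists (ignoreCase = true).
        // Stopwords are dropped entirely; exclusions are indexed but not stemmed.
        var stopwords = new CharArraySet(version, new[] { "de", "het", "een" }, true);
        var exclusions = new CharArraySet(version, new[] { "fietsen" }, true);

        using (Analyzer analyzer = new DutchAnalyzer(version, stopwords, exclusions))
        using (TokenStream ts = analyzer.GetTokenStream("body", "De fietsen staan in het huis"))
        {
            ICharTermAttribute term = ts.AddAttribute<ICharTermAttribute>();

            ts.Reset();                       // required before the first IncrementToken()
            while (ts.IncrementToken())
            {
                Console.WriteLine(term.ToString());
            }
            ts.End();                         // finalize end-of-stream state
        }
    }
}
```

To use the default stopword set instead, the single-argument form new DutchAnalyzer(version) should be sufficient.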
DutchStemFilter
A TokenFilter that stems Dutch words.
It supports a table of words that should not be stemmed at all. The stemmer used can be changed at runtime after the filter object is created (as long as it is a DutchStemmer).
To prevent terms from being stemmed, use an instance of KeywordMarkerFilter, or a custom TokenFilter that sets the KeywordAttribute, before this TokenStream.
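The filter ordering described above might look like the following sketch. It makes several assumptions: Lucene.Net 4.8, where the concrete set-based keyword marker is SetKeywordMarkerFilter (KeywordMarkerFilter itself being abstract in that version), and where DutchStemFilter is believed to be marked obsolete in favor of SnowballFilter. The class name KeywordAwareDutchAnalyzer and the protected term are hypothetical.

```csharp
using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Miscellaneous;
using Lucene.Net.Analysis.Nl;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

// Hypothetical analyzer that marks selected terms as keywords before stemming.
public sealed class KeywordAwareDutchAnalyzer : Analyzer
{
    private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;

    // Terms in this set get the keyword flag set and are skipped by the stemmer.
    private readonly CharArraySet protectedTerms =
        new CharArraySet(MatchVersion, new[] { "fietsen" }, true);

    protected override TokenStreamComponents CreateComponents(string fieldName, TextReader reader)
    {
        Tokenizer source = new StandardTokenizer(MatchVersion, reader);
        TokenStream result = new LowerCaseFilter(MatchVersion, source);

        // The keyword marker must come BEFORE the stem filter in the chain.
        result = new SetKeywordMarkerFilter(result, protectedTerms);

        // Assumption: DutchStemFilter carries an [Obsolete] attribute in 4.8,
        // so the warning is suppressed here for the sake of the illustration.
#pragma warning disable CS0618
        result = new DutchStemFilter(result);
#pragma warning restore CS0618

        return new TokenStreamComponents(source, result);
    }
}
```

Placing the marker after the stem filter would have no effect, since the stemmer would already have rewritten the term by the time the flag is set.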
DutchStemmer
A stemmer for Dutch words.
The algorithm is an implementation of the Dutch stemming algorithm in Martin Porter's Snowball project.