Class QueryAutoStopWordAnalyzer
An Analyzer used primarily at query time to wrap another analyzer and provide a layer of protection which prevents very common words from being passed into queries.
For very large indexes, the cost of reading TermDocs for a very common word can be high. This analyzer was created after experience with a 38-million-document index in which one term appeared in around 50% of the documents, causing TermQueries for that term to take 2 seconds.
Assembly: Lucene.Net.Analysis.Common.dll
Syntax
[Serializable]
public sealed class QueryAutoStopWordAnalyzer : AnalyzerWrapper, IDisposable
Constructors
Name | Description |
---|---|
QueryAutoStopWordAnalyzer(LuceneVersion, Analyzer, IndexReader) | Creates a new QueryAutoStopWordAnalyzer with stop words calculated for all indexed fields from terms with a document frequency percentage greater than defaultMaxDocFreqPercent |
QueryAutoStopWordAnalyzer(LuceneVersion, Analyzer, IndexReader, ICollection<String>, Int32) | Creates a new QueryAutoStopWordAnalyzer with stop words calculated for the given selection of fields from terms with a document frequency greater than the given Int32 threshold |
QueryAutoStopWordAnalyzer(LuceneVersion, Analyzer, IndexReader, ICollection<String>, Single) | Creates a new QueryAutoStopWordAnalyzer with stop words calculated for the given selection of fields from terms with a document frequency percentage greater than the given Single threshold |
QueryAutoStopWordAnalyzer(LuceneVersion, Analyzer, IndexReader, Int32) | Creates a new QueryAutoStopWordAnalyzer with stop words calculated for all indexed fields from terms with a document frequency greater than the given Int32 threshold |
QueryAutoStopWordAnalyzer(LuceneVersion, Analyzer, IndexReader, Single) | Creates a new QueryAutoStopWordAnalyzer with stop words calculated for all indexed fields from terms with a document frequency percentage greater than the given Single threshold |
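The constructor overloads differ mainly in whether the threshold is a fraction of documents (Single) or an absolute document count (Int32). The following is a minimal sketch, not taken from the library's documentation: it assumes the Lucene.Net and Lucene.Net.Analysis.Common packages are referenced, that the class lives in the Lucene.Net.Analysis.Query namespace, and it uses a hypothetical four-document in-memory index with a field named "body".

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Query;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

public static class ConstructorSketch
{
    private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;

    public static void Main()
    {
        // Hypothetical sample data: "the" occurs in every document.
        string[] bodies = { "the quick fox", "the lazy dog", "the fox", "the end" };

        using (var dir = new RAMDirectory())
        {
            // WhitespaceAnalyzer is used so that no built-in stop list removes "the" at index time.
            var config = new IndexWriterConfig(MatchVersion, new WhitespaceAnalyzer(MatchVersion));
            using (var writer = new IndexWriter(dir, config))
            {
                foreach (string body in bodies)
                {
                    var doc = new Document();
                    doc.Add(new TextField("body", body, Field.Store.NO));
                    writer.AddDocument(doc);
                }
            } // disposing the writer commits the documents

            using (var reader = DirectoryReader.Open(dir))
            {
                Analyzer inner = new WhitespaceAnalyzer(MatchVersion);

                // Single overload: threshold expressed as a fraction of all documents.
                using (var byPercent = new QueryAutoStopWordAnalyzer(MatchVersion, inner, reader, 0.5f))
                // Int32 overload: threshold expressed as an absolute document count.
                using (var byCount = new QueryAutoStopWordAnalyzer(MatchVersion, inner, reader, 2))
                {
                    Console.WriteLine("By percentage: " + string.Join(", ", byPercent.GetStopWords("body")));
                    Console.WriteLine("By count:      " + string.Join(", ", byCount.GetStopWords("body")));
                }
            }
        }
    }
}
```

With this sample data, only "the" appears in more than two of the four documents, so both analyzers would be expected to report it as the sole stop word for "body". Note that the literal suffix decides the overload: `0.5f` selects the Single constructor, while `2` selects the Int32 one.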
Fields
Name | Description |
---|---|
defaultMaxDocFreqPercent | |
Methods
Name | Description |
---|---|
GetStopWords() | Provides information on which stop words have been identified for all fields |
GetStopWords(String) | Provides information on which stop words have been identified for a field |
GetWrappedAnalyzer(String) | |
WrapComponents(String, TokenStreamComponents) | |
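The two GetStopWords overloads can be used to inspect what the analyzer decided to filter. The sketch below is illustrative only and rests on several assumptions: the Lucene.Net and Lucene.Net.Analysis.Common packages are referenced, the index and its field names ("title", "body") are hypothetical, GetStopWords() is assumed to return Term instances, and GetStopWords(String) is assumed to return plain strings.

```csharp
using System;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Core;
using Lucene.Net.Analysis.Query;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;
using Lucene.Net.Util;

public static class StopWordInspectionSketch
{
    private const LuceneVersion MatchVersion = LuceneVersion.LUCENE_48;

    // Hypothetical helper: adds one document with a title and a body field.
    private static void AddDoc(IndexWriter writer, string title, string body)
    {
        var doc = new Document();
        doc.Add(new TextField("title", title, Field.Store.NO));
        doc.Add(new TextField("body", body, Field.Store.NO));
        writer.AddDocument(doc);
    }

    public static void Main()
    {
        using (var dir = new RAMDirectory())
        {
            var config = new IndexWriterConfig(MatchVersion, new WhitespaceAnalyzer(MatchVersion));
            using (var writer = new IndexWriter(dir, config))
            {
                AddDoc(writer, "report one", "of mice and men");
                AddDoc(writer, "report two", "of cats and dogs");
                AddDoc(writer, "report three", "of birds and bees");
                AddDoc(writer, "summary", "a short note");
            }

            using (var reader = DirectoryReader.Open(dir))
            // Restrict stop word calculation to the "body" field (ICollection<String> overload).
            using (var analyzer = new QueryAutoStopWordAnalyzer(
                MatchVersion, new WhitespaceAnalyzer(MatchVersion), reader, new[] { "body" }, 0.5f))
            {
                // Stop words across every analyzed field; each Term prints as "field:text".
                foreach (Term term in analyzer.GetStopWords())
                {
                    Console.WriteLine("all fields -> " + term);
                }

                // Stop words for a single field.
                foreach (string word in analyzer.GetStopWords("body"))
                {
                    Console.WriteLine("body -> " + word);
                }

                // "title" was not in the field selection, so nothing is expected here.
                Console.WriteLine("title count: " + analyzer.GetStopWords("title").Length);
            }
        }
    }
}
```

Checking the identified stop words this way before deploying a threshold is a cheap sanity test: if a term that users genuinely search for shows up in the list, the threshold is probably too low for that index.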