Class CommonTermsQuery
A query that executes high-frequency terms in an optional sub-query to prevent
slow queries due to "common" terms like stopwords. This query
builds two queries from the terms added via Add(Term): low-frequency
terms are added to a required boolean clause and high-frequency terms are
added to an optional boolean clause. The optional clause is only executed if
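the required "low-frequency" clause matches. Scores produced by this query
will differ slightly from those of a plain BooleanQuery scorer, mainly due to
differences in the number of leaf queries (the Similarity.Coord(int, int) factor) in the required boolean clause in most cases.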
CommonTermsQuery has several advantages over stopword filtering at index or query time: a term is "classified" by its actual document frequency in the index, so the query can prevent slow queries even across domains without specialized stopword files.
Note: if the query contains only high-frequency terms, it is rewritten into a plain conjunction query, i.e., all high-frequency terms must match for a document to match.
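The split into low- and high-frequency terms described above can be sketched without any Lucene.NET dependency. This is only an illustration, not the library's implementation: the class name `CommonTermsSketch`, the method `Classify`, and the plain dictionary of document frequencies are hypothetical, and the interpretation of the cutoff (a value below 1 is a fraction of the document count, otherwise an absolute document frequency) is an assumption about how `m_maxTermFrequency` is applied.

```csharp
using System;
using System.Collections.Generic;

public static class CommonTermsSketch
{
    // Splits terms into low- and high-frequency groups by document frequency.
    // Assumption: a cutoff below 1 is a fraction of maxDoc; otherwise it is
    // treated as an absolute document frequency.
    public static (List<string> Low, List<string> High) Classify(
        IDictionary<string, int> docFreqs, int maxDoc, float maxTermFrequency)
    {
        int cutoff = maxTermFrequency >= 1f
            ? (int)maxTermFrequency
            : (int)Math.Ceiling(maxTermFrequency * maxDoc);

        var low = new List<string>();   // would feed the required clause
        var high = new List<string>();  // would feed the optional clause
        foreach (var pair in docFreqs)
        {
            if (pair.Value > cutoff)
            {
                high.Add(pair.Key);
            }
            else
            {
                low.Add(pair.Key);
            }
        }
        return (low, high);
    }
}
```

With a hypothetical index of 1000 documents and a cutoff of 0.1f, a term found in 900 documents would land in the high-frequency group, while a term found in 12 documents would stay in the low-frequency group.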
Collection initializer note: To create and populate a CommonTermsQuery in a single statement, you can use the following example as a guide:
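    // Assumes: using Lucene.Net.Index; using Lucene.Net.Queries; using Lucene.Net.Search;
    // The Occur values and the 0.001f cutoff are illustrative, not recommended defaults.
    var query = new CommonTermsQuery(Occur.SHOULD, Occur.SHOULD, 0.001f) {
        new Term("field", "microsoft"),
        new Term("field", "office")
    };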
Inherited Members
Assembly: DistributedLucene.Net.Queries.dll
Syntax
public class CommonTermsQuery : Query, IEnumerable<Term>
Constructors
Name | Description |
---|---|
CommonTermsQuery(Occur, Occur, Single) | Creates a new CommonTermsQuery |
CommonTermsQuery(Occur, Occur, Single, Boolean) | Creates a new CommonTermsQuery |
Fields
Name | Description |
---|---|
m_disableCoord | |
m_highFreqBoost | |
m_highFreqMinNrShouldMatch | |
m_highFreqOccur | |
m_lowFreqBoost | |
m_lowFreqMinNrShouldMatch | |
m_lowFreqOccur | |
m_maxTermFrequency | |
m_terms | |
Properties
Name | Description |
---|---|
HighFreqMinimumNumberShouldMatch | Gets or sets the minimum number of high-frequency optional BooleanClauses that must be satisfied to produce a match on the high-frequency terms part of the query. This property accepts a float value in the range [0..1) as a fraction of the actual query terms in the high-frequency clause, or a number >=1 as an absolute number of clauses that need to match. By default no optional clauses are necessary for a match (unless there are no required clauses). If this property is set, the specified number of clauses is required. |
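IsCoordDisabled | Returns true iff Similarity.Coord(int, int) is disabled in scoring for the high- and low-frequency query instances. The top-level query always disables coords. |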
LowFreqMinimumNumberShouldMatch | Gets or sets the minimum number of low-frequency optional BooleanClauses that must be satisfied to produce a match on the low-frequency terms part of the query. This property accepts a float value in the range [0..1) as a fraction of the actual query terms in the low-frequency clause, or a number >=1 as an absolute number of clauses that need to match. By default no optional clauses are necessary for a match (unless there are no required clauses). If this property is set, the specified number of clauses is required. |
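The two interpretations of the minimum-should-match value (a fraction in [0..1) versus an absolute count >=1) can be illustrated with a small standalone helper. This is a minimal sketch, not the library's code: the class `MinShouldMatchSketch` and the method `Resolve` are hypothetical names, and rounding the fractional result to the nearest integer is an assumption about how CalcLowFreqMinimumNumberShouldMatch(Int32) and CalcHighFreqMinimumNumberShouldMatch(Int32) behave.

```csharp
using System;

public static class MinShouldMatchSketch
{
    // Resolves a configured value into a required clause count.
    // value in [0..1): fraction of numOptional; value >= 1: absolute count.
    public static int Resolve(float value, int numOptional)
    {
        if (value >= 1.0f || value == 0.0f)
        {
            // Absolute number of clauses, or 0 for "no optional clause required".
            return (int)value;
        }
        // Assumption: fractional results are rounded to the nearest integer.
        return (int)Math.Round(value * numOptional);
    }
}
```

Under these assumptions, a value of 0.5f with four optional clauses requires two of them to match, while a value of 2f requires two clauses regardless of how many optional clauses exist.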
Methods
Name | Description |
---|---|
Add(Term) | Adds a term to the CommonTermsQuery |
BuildQuery(Int32, TermContext[], Term[]) | |
CalcHighFreqMinimumNumberShouldMatch(Int32) | |
CalcLowFreqMinimumNumberShouldMatch(Int32) | |
CollectTermContext(IndexReader, IList<AtomicReaderContext>, TermContext[], Term[]) | |
Equals(Object) | |
ExtractTerms(ISet<Term>) | |
GetEnumerator() | Returns an enumerator that iterates through the m_terms collection. |
GetHashCode() | |
NewTermQuery(Term, TermContext) | Builds a new TermQuery instance. This is intended for subclasses that wish to customize the generated queries. |
Rewrite(IndexReader) | |
ToString(String) | |