Class BM25Similarity
BM25 Similarity. Introduced in Stephen E. Robertson, Steve Walker, Susan Jones, Micheline Hancock-Beaulieu, and Mike Gatford. Okapi at TREC-3. In Proceedings of the Third Text REtrieval Conference (TREC 1994). Gaithersburg, USA, November 1994.
@lucene.experimental
Assembly: DistributedLucene.Net.dll
Syntax
public class BM25Similarity : Similarity
Constructors
Name | Description |
---|---|
BM25Similarity() | BM25 with these default values: k1 = 1.2, b = 0.75. |
BM25Similarity(Single, Single) | BM25 with the supplied parameter values. |
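
The two constructors can be exercised as in the minimal sketch below. It assumes the class lives in the Lucene.Net.Search.Similarities namespace, as in upstream Lucene.NET (this assembly may differ), and that the two Single arguments are k1 followed by b, as in upstream Lucene. The values 1.6 and 0.5 are arbitrary illustrations, not recommendations.

```csharp
using System;
// Assumption: namespace as in upstream Lucene.NET; adjust if this assembly differs.
using Lucene.Net.Search.Similarities;

public static class Bm25ConstructionSketch
{
    public static void Main()
    {
        // Parameterless constructor: uses the default parameter values.
        var defaults = new BM25Similarity();

        // Two-argument constructor: assumed order is (k1, b), as in upstream Lucene.
        var tuned = new BM25Similarity(1.6f, 0.5f);

        // Count overlap tokens (position increment 0) when computing norms.
        tuned.DiscountOverlaps = false;

        // K1 and B are read-only views of the values supplied at construction.
        Console.WriteLine($"defaults: k1={defaults.K1}, b={defaults.B}");
        Console.WriteLine($"tuned:    k1={tuned.K1}, b={tuned.B}, discountOverlaps={tuned.DiscountOverlaps}");
    }
}
```

Because norms are computed at index time, a change to DiscountOverlaps only affects documents indexed after the change.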
Properties
Name | Description |
---|---|
B | Returns the b parameter. |
DiscountOverlaps | Gets or sets whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. The default is true, meaning overlap tokens do not count when computing norms. |
K1 | Returns the k1 parameter. |
Methods
Name | Description |
---|---|
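AvgFieldLength(CollectionStatistics) | The default implementation computes the average as sumTotalTermFreq / maxDoc, or returns 1 if the index does not store sumTotalTermFreq (for example, for a field that omits frequency information). |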
ComputeNorm(FieldInvertState) | |
ComputeWeight(Single, CollectionStatistics, TermStatistics[]) | |
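DecodeNormValue(Byte) | The default implementation returns 1 / f^2, where f is the float value decoded from the norm byte. |
EncodeNormValue(Single, Int32) | The default implementation encodes boost / sqrt(length) into a single byte. If you change this, change DecodeNormValue(Byte) to match. |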
GetSimScorer(Similarity.SimWeight, AtomicReaderContext) | |
Idf(Int64, Int64) | Implemented as log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5)). |
IdfExplain(CollectionStatistics, TermStatistics) | Computes a score factor for a simple term and returns an explanation for that score factor. The default implementation uses Idf(docFreq, maxDoc), with docFreq taken from the TermStatistics and maxDoc from the CollectionStatistics. Note that MaxDoc is used instead of Lucene.Net.Index.IndexReader.IntNumDocs because DocFreq is also used: when DocFreq is inaccurate, MaxDoc is inaccurate in the same direction. In addition, MaxDoc is more efficient to compute. |
IdfExplain(CollectionStatistics, TermStatistics[]) | Computes a score factor for a phrase. The default implementation sums the idf factor for each term in the phrase. |
ScorePayload(Int32, Int32, Int32, BytesRef) | The default implementation returns 1. |
SloppyFreq(Int32) | Implemented as 1 / (distance + 1). |
ToString() |
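
Several of the methods above are small closed-form formulas. The standalone sketch below restates them without depending on the library. It assumes the definitions used by upstream Lucene's BM25Similarity; the class name Bm25FormulaSketch and the PhraseIdf helper are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical helper class; not part of the library.
public static class Bm25FormulaSketch
{
    // Assumed formula: log(1 + (numDocs - docFreq + 0.5) / (docFreq + 0.5)).
    public static float Idf(long docFreq, long numDocs)
    {
        return (float)Math.Log(1 + (numDocs - docFreq + 0.5D) / (docFreq + 0.5D));
    }

    // Phrase idf: the sum of the idf factor of each term in the phrase.
    public static float PhraseIdf(long[] docFreqs, long maxDoc)
    {
        float sum = 0f;
        foreach (long docFreq in docFreqs)
        {
            sum += Idf(docFreq, maxDoc);
        }
        return sum;
    }

    // Assumed formula: 1 / (distance + 1), so exact matches (distance 0) score 1.
    public static float SloppyFreq(int distance)
    {
        return 1.0f / (distance + 1);
    }

    // Assumed formula: sumTotalTermFreq / maxDoc, or 1 when the index
    // does not store sumTotalTermFreq (reported as a non-positive value).
    public static float AvgFieldLength(long sumTotalTermFreq, long maxDoc)
    {
        if (sumTotalTermFreq <= 0)
        {
            return 1f;
        }
        return (float)(sumTotalTermFreq / (double)maxDoc);
    }

    public static void Main()
    {
        // A rarer term (docFreq 10) gets a larger idf than a common one (docFreq 100).
        Console.WriteLine(Idf(10, 1000) > Idf(100, 1000)); // True
        Console.WriteLine(SloppyFreq(3));                  // 0.25
        Console.WriteLine(AvgFieldLength(100, 10));        // 10
        Console.WriteLine(AvgFieldLength(-1, 10));         // 1
    }
}
```

Passing maxDoc rather than the live document count to Idf mirrors the note on IdfExplain: docFreq and maxDoc both include deleted documents, so their errors move in the same direction.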