Class SimilarityBase
A subclass of Similarity that provides a simplified API for its descendants. Subclasses are only required to implement the Score(BasicStats, Single, Single) and ToString() methods. Implementing Explain(Explanation, BasicStats, Int32, Single, Single) is optional, because SimilarityBase already provides a basic explanation of the score and the term frequency. However, implementers of a subclass are encouraged to include as much detail about the scoring method as possible.
Note: multi-word queries such as phrase queries are scored differently than under Lucene's default ranking algorithm. The default algorithm "fakes" an IDF value for the phrase as a whole (since it does not know the phrase's actual IDF), whereas this class scores a phrase as the sum of its individual term scores.
This API is experimental and might change in incompatible ways in the next release.
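The note on phrase queries can be illustrated with a minimal, self-contained sketch. It does not use any Lucene.Net types: `PhraseScoringSketch`, `ScoreTerm`, and `ScorePhrase` are hypothetical names, the tf-idf style formula is a placeholder, and the sketch assumes that each term is scored with the phrase's frequency and its own document frequency.

```csharp
using System;
using System.Linq;

// Hypothetical helper; not part of Lucene.Net.
public static class PhraseScoringSketch
{
    // Placeholder per-term formula (frequency times a smoothed IDF).
    // An assumption for illustration, not the formula of any real subclass.
    public static float ScoreTerm(float freq, long docFreq, long numDocs)
    {
        double idf = Math.Log((numDocs + 1.0) / (docFreq + 1.0));
        return (float)(freq * idf);
    }

    // The phrase score is the sum of the individual term scores.
    // No single IDF value is estimated for the phrase as a whole.
    public static float ScorePhrase(float phraseFreq, long[] termDocFreqs, long numDocs)
    {
        return termDocFreqs.Sum(df => ScoreTerm(phraseFreq, df, numDocs));
    }
}
```

For a two-term phrase, `ScorePhrase(2f, new long[] { 9, 99 }, 999)` returns the same value as adding the two `ScoreTerm` results by hand.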
Assembly: DistributedLucene.Net.dll
Syntax
public abstract class SimilarityBase : Similarity
Constructors
Name | Description |
---|---|
SimilarityBase() | Sole constructor. (For invocation by subclass constructors, typically implicit.) |
Properties
Name | Description |
---|---|
DiscountOverlaps | Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. By default this is true. This API is experimental. |
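As a rough sketch of what discounting overlaps means for the document length that feeds the norm: tokens with a position increment of 0 (for example, synonyms injected at the same position) are subtracted from the field length when the flag is set. `DiscountOverlapsSketch` and `EffectiveLength` are hypothetical names, and the exact computation inside ComputeNorm(FieldInvertState) may differ.

```csharp
// Hypothetical helper; not part of Lucene.Net.
public static class DiscountOverlapsSketch
{
    // fieldLength: total number of tokens indexed for the field in one document.
    // numOverlaps: number of those tokens whose position increment is 0.
    public static int EffectiveLength(int fieldLength, int numOverlaps, bool discountOverlaps)
    {
        return discountOverlaps ? fieldLength - numOverlaps : fieldLength;
    }
}
```

With 10 tokens of which 3 are overlaps, the effective length is 7 when overlaps are discounted and 10 otherwise.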
Methods
Name | Description |
---|---|
ComputeNorm(FieldInvertState) | Encodes the document length in the same way as TFIDFSimilarity. |
ComputeWeight(Single, CollectionStatistics, TermStatistics[]) | |
DecodeNormValue(Byte) | Decodes a normalization factor (document length) stored in an index. |
EncodeNormValue(Single, Single) | Encodes the length to a byte via SmallSingle. |
Explain(Explanation, BasicStats, Int32, Single, Single) | Subclasses should implement this method to explain the score. The default implementation does nothing. |
Explain(BasicStats, Int32, Explanation, Single) | Explains the score. The implementation here provides a basic explanation in the format "Score(name-of-similarity, doc=doc-id, freq=term-frequency), computed from:" and attaches the score (computed via the Score(BasicStats, Single, Single) method) and the explanation for the term frequency. Subclasses content with this format may add further details in Explain(Explanation, BasicStats, Int32, Single, Single). |
FillBasicStats(BasicStats, CollectionStatistics, TermStatistics) | Fills all member fields defined in BasicStats in the supplied BasicStats instance. |
GetSimScorer(Similarity.SimWeight, AtomicReaderContext) | |
Log2(Double) | Returns the base two logarithm of the argument. |
NewStats(String, Single) | Factory method to return a custom stats object. |
Score(BasicStats, Single, Single) | Scores the document. Subclasses must apply their scoring formula in this method. |
ToString() | Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well. |
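The two required overrides, Score and ToString, can be sketched without referencing the library. The code below uses hypothetical stand-in types (`StatsStandIn`, `SimilarityBaseStandIn`, `ScaledTfIdf`) that only mirror the shape described in the table; they are not the Lucene.Net classes, and the scoring formula is an arbitrary example. The `Log2` helper assumes the usual change-of-base identity.

```csharp
using System;
using System.Globalization;

// Hypothetical stand-in for two of the statistics a stats object might carry.
public sealed class StatsStandIn
{
    public long NumberOfDocuments { get; set; }
    public long DocFreq { get; set; }
}

// Hypothetical stand-in for the documented contract:
// subclasses must supply Score and ToString.
public abstract class SimilarityBaseStandIn
{
    public abstract float Score(StatsStandIn stats, float freq, float docLen);
    public abstract override string ToString();

    // Base-two logarithm via change of base: log2(x) = ln(x) / ln(2).
    public static double Log2(double x)
    {
        return Math.Log(x) / Math.Log(2.0);
    }
}

// Example subclass with one parameter; the formula is illustrative only.
public sealed class ScaledTfIdf : SimilarityBaseStandIn
{
    private readonly float scale;

    public ScaledTfIdf(float scale)
    {
        this.scale = scale;
    }

    public override float Score(StatsStandIn stats, float freq, float docLen)
    {
        if (docLen <= 0f) return 0f; // avoid dividing by an empty document length
        double idf = Log2((stats.NumberOfDocuments + 1.0) / (stats.DocFreq + 1.0));
        double tf = freq / docLen;   // length-normalized term frequency
        return (float)(scale * tf * idf);
    }

    // Reports the similarity's name and its parameter value.
    public override string ToString()
    {
        return string.Format(CultureInfo.InvariantCulture, "ScaledTfIdf(scale={0})", scale);
    }
}
```

Including the parameter value in ToString keeps the name informative wherever it is displayed, such as in score explanations. A real subclass would derive from SimilarityBase and receive a BasicStats instance instead of the stand-ins used here.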