Class BloomFilteringPostingsFormat
A PostingsFormat useful for low doc-frequency fields such as primary keys. Bloom filters are maintained in a ".blm" file which offers "fast-fail" for reads in segments known to have no record of the key. A choice of delegate PostingsFormat is used to record all other Postings data.
A BloomFilterFactory can be passed to tailor Bloom filter settings on a per-field basis. The default configuration is DefaultBloomFilterFactory, which allocates a bitset of roughly 8 MB and hashes values using MurmurHash2. This should be suitable for most purposes.
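The "fast-fail" behaviour can be illustrated with a minimal, self-contained sketch. This is not Lucene.NET's FuzzySet or DefaultBloomFilterFactory: the `TinyBloomFilter` class, its parameters, and the seeded FNV-1a hash (standing in for MurmurHash2) are assumptions made purely for illustration.

```csharp
using System.Collections;
using System.Text;

// Illustrative only: a tiny Bloom filter, not the FuzzySet used by Lucene.NET.
public sealed class TinyBloomFilter
{
    private readonly BitArray bits;
    private readonly int hashCount;

    public TinyBloomFilter(int bitCount, int hashCount)
    {
        this.bits = new BitArray(bitCount);
        this.hashCount = hashCount;
    }

    // Seeded FNV-1a hash reduced to a bit position (a stand-in for MurmurHash2).
    private int Position(string key, int seed)
    {
        unchecked
        {
            uint hash = 2166136261u ^ (uint)seed;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619u;
            }
            return (int)(hash % (uint)bits.Length);
        }
    }

    // Sets one bit per hash function for the given key.
    public void Add(string key)
    {
        for (int i = 0; i < hashCount; i++)
        {
            bits[Position(key, i)] = true;
        }
    }

    // false: the key was definitely never added, so the lookup can be skipped.
    // true:  the key may be present, so the real postings must be consulted.
    public bool MightContain(string key)
    {
        for (int i = 0; i < hashCount; i++)
        {
            if (!bits[Position(key, i)])
            {
                return false;
            }
        }
        return true;
    }
}
```

After `Add("doc-1")`, `MightContain("doc-1")` always returns true, because a Bloom filter has no false negatives. For a key that was never added, the result is usually false but can occasionally be true (a false positive), which is why a positive answer still requires a lookup in the delegate's postings data.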
The format of the blm file is as follows:
- BloomFilter (.blm) --> Header, DelegatePostingsFormatName, NumFilteredFields, Filter^NumFilteredFields, Footer (that is, Filter repeated NumFilteredFields times)
- Filter --> FieldNumber, FuzzySet
- FuzzySet --> See Serialize(DataOutput)
- Header --> CodecHeader (WriteHeader(DataOutput, String, Int32))
- DelegatePostingsFormatName --> String (WriteString(String)). The name of a PostingsFormat registered through the service provider.
- NumFilteredFields --> Uint32 (WriteInt32(Int32))
- FieldNumber --> Uint32 (WriteInt32(Int32)). The number of the field in this segment.
- Footer --> CodecFooter (WriteFooter(IndexOutput))
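The grammar above can be read as a simple data model. The following sketch mirrors it in memory only. `BlmFile` and `BlmFilter` are hypothetical names, not Lucene.NET types, the Header and Footer are omitted, and the serialized FuzzySet is treated as an opaque byte array.

```csharp
using System.Collections.Generic;

// Hypothetical model of one Filter entry: FieldNumber followed by a FuzzySet.
public sealed class BlmFilter
{
    // The number of the field in this segment.
    public int FieldNumber { get; set; }

    // The serialized FuzzySet, kept opaque in this sketch.
    public byte[] FuzzySet { get; set; } = new byte[0];
}

// Hypothetical model of the .blm body between the Header and the Footer.
public sealed class BlmFile
{
    public string DelegatePostingsFormatName { get; set; } = "";

    // One entry per filtered field.
    public List<BlmFilter> Filters { get; } = new List<BlmFilter>();

    // NumFilteredFields is the count that precedes the Filter entries.
    public int NumFilteredFields => Filters.Count;
}
```

Modelling NumFilteredFields as a derived count makes the relationship in the grammar explicit: the file stores the count first, then exactly that many Filter entries.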
@lucene.experimental
Assembly: Lucene.Net.Codecs.dll
Syntax
public sealed class BloomFilteringPostingsFormat : PostingsFormat
Constructors
Name | Description |
---|---|
BloomFilteringPostingsFormat() | Used only by core Lucene at read time, via service-provider instantiation. Do not use at write time in application code. |
BloomFilteringPostingsFormat(PostingsFormat) | Creates Bloom filters for a selection of fields in the index. The filters are recorded as a set of bitsets, held as a segment summary in an additional "blm" file. All other postings data is encoded by the supplied delegate PostingsFormat. This constructor uses DefaultBloomFilterFactory to configure the per-field Bloom filters. |
BloomFilteringPostingsFormat(PostingsFormat, BloomFilterFactory) | Creates Bloom filters for a selection of fields in the index. The filters are recorded as a set of bitsets, held as a segment summary in an additional "blm" file. All other postings data is encoded by the supplied delegate PostingsFormat. The supplied BloomFilterFactory configures the per-field Bloom filters. |
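A minimal sketch of the two write-time constructors follows. The helper class `BloomFormats` is hypothetical, and the namespaces and the choice of Lucene41PostingsFormat as the delegate are assumptions based on the Lucene.NET 4.8 package layout. The code needs references to the Lucene.Net and Lucene.Net.Codecs assemblies, and it does not show how the format is wired into a codec for a particular field.

```csharp
using Lucene.Net.Codecs;
using Lucene.Net.Codecs.Bloom;
using Lucene.Net.Codecs.Lucene41;

// Hypothetical helper showing the two write-time constructors.
public static class BloomFormats
{
    // Uses DefaultBloomFilterFactory implicitly.
    public static PostingsFormat WithDefaultFactory()
    {
        // Assumption: Lucene41PostingsFormat as the delegate for all other postings data.
        return new BloomFilteringPostingsFormat(new Lucene41PostingsFormat());
    }

    // Passes a BloomFilterFactory explicitly; a custom subclass could be supplied instead.
    public static PostingsFormat WithExplicitFactory()
    {
        return new BloomFilteringPostingsFormat(
            new Lucene41PostingsFormat(),
            new DefaultBloomFilterFactory());
    }
}
```

The parameterless constructor is deliberately absent here, since it is reserved for read-time instantiation by Lucene itself.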
Fields
Name | Description |
---|---|
VERSION_CHECKSUM | |
VERSION_CURRENT | |
VERSION_START | |
Methods
Name | Description |
---|---|
FieldsConsumer(SegmentWriteState) | |
FieldsProducer(SegmentReadState) | |