Class CompactLabelToOrdinal
An efficient LabelToOrdinal implementation that stores all labels in a CharBlockArray and references them from a configurable number of HashArrays.
Because the HashArrays do not handle collisions, colliding labels are stored in a CollisionMap.
This data structure grows by adding a new HashArray whenever the number of collisions in the CollisionMap exceeds loadFactor * GetMaxOrdinal().
Growing also reinserts all colliding labels into the HashArrays, which may reduce the number of collisions.
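The growth rule described above can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions, not the Lucene.Net implementation: `SketchLabelToOrdinal`, its members, the FNV-style `Slot` hash, and the `NotFound` sentinel are all hypothetical names, plain strings stand in for FacetLabel, and a `Dictionary` stands in for the CollisionMap. It shows hash arrays that never resolve collisions themselves, an overflow map for colliding labels, and the growth check of collisions against loadFactor times the highest ordinal seen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified model of the scheme described above.
public sealed class SketchLabelToOrdinal
{
    public const int NotFound = -1; // hypothetical sentinel for a missing label

    private sealed class Entry
    {
        public readonly string Label;
        public readonly int Ordinal;
        public Entry(string label, int ordinal) { Label = label; Ordinal = ordinal; }
    }

    private readonly float loadFactor;
    private readonly int slotsPerArray;
    private readonly List<Entry[]> hashArrays = new List<Entry[]>();
    // Stand-in for the CollisionMap: labels whose slot is taken in every hash array.
    private readonly Dictionary<string, int> collisionMap = new Dictionary<string, int>();
    private int maxOrdinal; // highest ordinal seen so far, plus one

    public SketchLabelToOrdinal(int slotsPerArray, float loadFactor, int numHashArrays)
    {
        this.slotsPerArray = slotsPerArray;
        this.loadFactor = loadFactor;
        for (int i = 0; i < numHashArrays; i++) hashArrays.Add(new Entry[slotsPerArray]);
    }

    public int HashArrayCount => hashArrays.Count;
    public int CollisionCount => collisionMap.Count;

    public void AddLabel(string label, int ordinal)
    {
        maxOrdinal = Math.Max(maxOrdinal, ordinal + 1);
        // Growth rule: too many collisions relative to the number of ordinals.
        if (collisionMap.Count > loadFactor * maxOrdinal) Grow();
        if (!TryPlace(label, ordinal)) collisionMap[label] = ordinal;
    }

    public int GetOrdinal(string label)
    {
        for (int i = 0; i < hashArrays.Count; i++)
        {
            Entry e = hashArrays[i][Slot(label, i)];
            if (e != null && e.Label == label) return e.Ordinal;
        }
        return collisionMap.TryGetValue(label, out int ordinal) ? ordinal : NotFound;
    }

    // Adds one more hash array, then retries every colliding label.
    private void Grow()
    {
        hashArrays.Add(new Entry[slotsPerArray]);
        var colliding = new List<KeyValuePair<string, int>>(collisionMap);
        collisionMap.Clear();
        foreach (var pair in colliding)
            if (!TryPlace(pair.Key, pair.Value)) collisionMap[pair.Key] = pair.Value;
    }

    // Uses the first hash array whose slot is free or already holds this label.
    private bool TryPlace(string label, int ordinal)
    {
        for (int i = 0; i < hashArrays.Count; i++)
        {
            int slot = Slot(label, i);
            Entry e = hashArrays[i][slot];
            if (e == null || e.Label == label)
            {
                hashArrays[i][slot] = new Entry(label, ordinal);
                return true;
            }
        }
        return false;
    }

    // Deterministic FNV-1a style hash, seeded differently for each hash array.
    private int Slot(string label, int arrayIndex)
    {
        unchecked
        {
            uint h = 2166136261u + (uint)arrayIndex * 16777619u;
            foreach (char c in label) h = (h ^ c) * 16777619u;
            return (int)(h % (uint)slotsPerArray);
        }
    }
}

public static class Demo
{
    public static void Main()
    {
        var map = new SketchLabelToOrdinal(slotsPerArray: 8, loadFactor: 0.15f, numHashArrays: 2);
        for (int i = 0; i < 100; i++) map.AddLabel("label/" + i, i);
        Console.WriteLine(map.GetOrdinal("label/42")); // prints 42
        Console.WriteLine(map.HashArrayCount);         // grows beyond the initial 2
    }
}
```

A lower load factor makes the sketch grow sooner, trading memory for fewer entries in the overflow map. The real class stores label characters in a shared CharBlockArray rather than one object per label, which is where its memory savings come from.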
To set the loadFactor, see the constructor CompactLabelToOrdinal(Int32, Single, Int32).
This data structure has a much lower memory footprint (~30%) compared to a Java HashMap<String, Integer>. It also uses only a small fraction of the objects a HashMap would use, which limits GC overhead. For 3M unique labels, ingestion was also ~50% faster than with a HashMap.
This API is experimental and might change in incompatible ways in the next release.
Assembly: DistributedLucene.Net.Facet.dll
Syntax
public class CompactLabelToOrdinal : LabelToOrdinal
Constructors
Name | Description |
---|---|
CompactLabelToOrdinal(Int32, Single, Int32) | Sole constructor. |
Fields
Name | Description |
---|---|
DefaultLoadFactor | Default maximum load factor. |
TERMINATOR_CHAR | |
Properties
Name | Description |
---|---|
SizeOfMap | How many labels. |
Methods
Name | Description |
---|---|
AddLabel(FacetLabel, Int32) | |
GetOrdinal(FacetLabel) | |
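
The members listed above might be used together roughly as follows. This is a hedged sketch that needs the Lucene.Net facet assembly to compile. The meaning of the three constructor arguments (initial capacity, load factor, number of HashArrays), the namespace of FacetLabel, its string-components constructor, and static access to DefaultLoadFactor are assumptions not stated on this page.

```csharp
using System;
using Lucene.Net.Facet.Taxonomy;              // assumed namespace of FacetLabel
using Lucene.Net.Facet.Taxonomy.WriterCache;  // namespace of CompactLabelToOrdinal

public static class CompactLabelToOrdinalUsage
{
    public static void Main()
    {
        // Assumed argument order: initial capacity, load factor, number of HashArrays.
        var labels = new CompactLabelToOrdinal(1024, CompactLabelToOrdinal.DefaultLoadFactor, 3);

        // Assumes FacetLabel can be built from its path components.
        var author = new FacetLabel("Author", "Mark Twain");

        // The caller supplies the ordinal that belongs to the label.
        labels.AddLabel(author, 1);

        // Look the label up again; this should return the ordinal added above.
        Console.WriteLine(labels.GetOrdinal(author));
    }
}
```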