Class NumericUtils
This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.
To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: the center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
This class generates the terms needed to achieve this. First, the numerical integer values are converted to bytes: the integer values (32 bit or 64 bit) are made unsigned, and the bits are converted to ASCII chars carrying 7 bits each. The resulting byte[] sorts like the original integer value (even under UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (the number of bits removed) used during encoding.
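As a rough illustration of the encoding described above, the following self-contained C# sketch encodes a 32-bit value as a shift byte followed by 7-bit chunks, then decodes it again. It does not call NumericUtils. The class name PrefixCodingSketch, its Encode and Decode methods, and the ShiftStart value are hypothetical and chosen only for this example.

```csharp
using System;

public static class PrefixCodingSketch
{
    // Hypothetical marker added to the shift in the first byte (assumed value).
    private const int ShiftStart = 0x60;

    // Encodes value with its lowest `shift` bits removed.
    public static byte[] Encode(int value, int shift)
    {
        if (shift < 0 || shift > 31)
            throw new ArgumentOutOfRangeException(nameof(shift), "shift must be 0..31");

        // Flipping the sign bit makes signed values order like unsigned ones.
        uint sortableBits = unchecked((uint)value) ^ 0x80000000u;
        sortableBits >>= shift;

        // One byte per 7 remaining bits, plus one leading byte for the shift.
        int payloadLength = (31 - shift) / 7 + 1;
        byte[] term = new byte[payloadLength + 1];
        term[0] = (byte)(ShiftStart + shift);
        for (int i = payloadLength; i > 0; i--)
        {
            term[i] = (byte)(sortableBits & 0x7f); // 7 bits per byte keeps it ASCII
            sortableBits >>= 7;
        }
        return term;
    }

    // Decodes a term produced by Encode; the removed low bits come back as zero.
    public static int Decode(byte[] term)
    {
        int shift = term[0] - ShiftStart;
        uint sortableBits = 0;
        for (int i = 1; i < term.Length; i++)
            sortableBits = (sortableBits << 7) | term[i];
        return unchecked((int)((sortableBits << shift) ^ 0x80000000u));
    }

    public static void Main()
    {
        byte[] full = Encode(1000, 0);   // full precision
        byte[] coarse = Encode(1000, 8); // lowest 8 bits removed
        Console.WriteLine(BitConverter.ToString(full));
        Console.WriteLine(BitConverter.ToString(coarse));
        Console.WriteLine(Decode(coarse)); // low 8 bits are zero after decoding
    }
}
```

Because every payload byte stays below 0x80 and the most significant chunk comes first, comparing two terms of equal shift byte by byte gives the same order as comparing the original integers.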
To also index floating point numbers, this class supplies two methods to convert them to integer values by changing their bit layout: DoubleToSortableInt64(Double) and SingleToSortableInt32(Single). Converting floating point numbers to integers and back causes no precision loss (although the integer form itself is not usable as a number). Other data types, such as dates, can easily be converted to integer values.
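A minimal sketch of how such a bit-layout conversion can work, assuming IEEE 754 doubles: take the raw bits and, for negative values, flip the lower 63 bits so that signed integer comparison matches numeric order. This illustrates the general technique and is not the library's code. SortableDoubleSketch and its method names are hypothetical.

```csharp
using System;

public static class SortableDoubleSketch
{
    // Maps a double to a long whose signed ordering matches the double's ordering.
    public static long ToSortable(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        // Negative doubles have the sign bit set. Flipping the other 63 bits
        // reverses their order so that more negative values map to smaller longs.
        if (bits < 0) bits ^= 0x7fffffffffffffffL;
        return bits;
    }

    // Inverse of ToSortable: the same XOR restores the original bit pattern.
    public static double FromSortable(long sortable)
    {
        if (sortable < 0) sortable ^= 0x7fffffffffffffffL;
        return BitConverter.Int64BitsToDouble(sortable);
    }

    public static void Main()
    {
        double[] samples = { -10.5, -0.25, 0.0, 0.25, 10.5 };
        foreach (double d in samples)
        {
            long s = ToSortable(d);
            Console.WriteLine($"{d} -> {s} -> {FromSortable(s)}");
        }
    }
}
```

The mapping only rearranges bits, which is why the round trip is exact.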
For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream.
This class can also be used to generate lexicographically sortable (according to UTF8SortedAsUTF16Comparer) representations of numeric data types for other uses (e.g. sorting).
This is an internal Lucene API. Available since 2.9; the API changed in a non-backwards-compatible way in 4.0.
Inheritance
Assembly: DistributedLucene.Net.dll
Syntax
public sealed class NumericUtils : object
Fields
Name | Description |
---|---|
BUF_SIZE_INT32 | The maximum term length (used for byte[] buffer size) for encoding int values. NOTE: This was BUF_SIZE_INT in Lucene |
BUF_SIZE_INT64 | The maximum term length (used for byte[] buffer size) for encoding long values. NOTE: This was BUF_SIZE_LONG in Lucene |
PRECISION_STEP_DEFAULT | The default precision step used by Int32Field, SingleField, Int64Field, DoubleField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter. |
SHIFT_START_INT32 | Integers are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT32+shift in the first byte. NOTE: This was SHIFT_START_INT in Lucene |
SHIFT_START_INT64 | Longs are stored at lower precision by shifting off lower bits. The shift count is stored as SHIFT_START_INT64+shift in the first byte. NOTE: This was SHIFT_START_LONG in Lucene |
Methods
Name | Description |
---|---|
DoubleToSortableInt64(Double) | Converts a double value to a sortable signed long. NOTE: This was doubleToSortableLong() in Lucene |
FilterPrefixCodedInt32s(TermsEnum) | Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of 0. NOTE: This was filterPrefixCodedInts() in Lucene |
FilterPrefixCodedInt64s(TermsEnum) | Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of 0. NOTE: This was filterPrefixCodedLongs() in Lucene |
GetPrefixCodedInt32Shift(BytesRef) | Returns the shift value from a prefix encoded int. NOTE: This was getPrefixCodedIntShift() in Lucene |
GetPrefixCodedInt64Shift(BytesRef) | Returns the shift value from a prefix encoded long. NOTE: This was getPrefixCodedLongShift() in Lucene |
Int32ToPrefixCoded(Int32, Int32, BytesRef) | Returns prefix coded bits after reducing the precision by the given number of shift bits. NOTE: This was intToPrefixCoded() in Lucene |
Int32ToPrefixCodedBytes(Int32, Int32, BytesRef) | Returns prefix coded bits after reducing the precision by the given number of shift bits. NOTE: This was intToPrefixCodedBytes() in Lucene |
Int64ToPrefixCoded(Int64, Int32, BytesRef) | Returns prefix coded bits after reducing the precision by the given number of shift bits. NOTE: This was longToPrefixCoded() in Lucene |
Int64ToPrefixCodedBytes(Int64, Int32, BytesRef) | Returns prefix coded bits after reducing the precision by the given number of shift bits. NOTE: This was longToPrefixCodedBytes() in Lucene |
PrefixCodedToInt32(BytesRef) | Returns an int from prefix coded bytes. NOTE: This was prefixCodedToInt() in Lucene |
PrefixCodedToInt64(BytesRef) | Returns a long from prefix coded bytes. NOTE: This was prefixCodedToLong() in Lucene |
SingleToSortableInt32(Single) | Converts a float value to a sortable signed int. NOTE: This was floatToSortableInt() in Lucene |
SortableInt32ToSingle(Int32) | Converts a sortable int back to a float. NOTE: This was sortableIntToFloat() in Lucene |
SortableInt64ToDouble(Int64) | Converts a sortable long back to a double. NOTE: This was sortableLongToDouble() in Lucene |
SplitInt32Range(NumericUtils.Int32RangeBuilder, Int32, Int32, Int32) | Splits an int range recursively. This method is used by NumericRangeQuery. NOTE: This was splitIntRange() in Lucene |
SplitInt64Range(NumericUtils.Int64RangeBuilder, Int32, Int64, Int64) | Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its AddRange(BytesRef, BytesRef) method. This method is used by NumericRangeQuery. NOTE: This was splitLongRange() in Lucene |
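
To make the recursive range splitting more concrete, here is a self-contained C# sketch of the general trie-splitting idea for 64-bit values: the edges of the range are emitted at fine precision, and the remaining center is passed on to the next coarser level. It is a simplified illustration under stated assumptions, not the library's implementation. RangeSplitSketch, Split, and the tuple layout are hypothetical names, and the sketch collects sub-ranges in a list instead of calling a range builder.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSplitSketch
{
    // Splits [min, max] into sub-ranges, each tagged with the shift (number of
    // low bits ignored) at which that sub-range can be matched.
    public static List<(long Min, long Max, int Shift)> Split(int precisionStep, long min, long max)
    {
        if (precisionStep < 1)
            throw new ArgumentOutOfRangeException(nameof(precisionStep));

        var ranges = new List<(long Min, long Max, int Shift)>();
        if (min > max) return ranges;

        unchecked // the wrap-around checks below rely on overflow not throwing
        {
            for (int shift = 0; ; shift += precisionStep)
            {
                long diff = 1L << (shift + precisionStep);
                long mask = ((1L << precisionStep) - 1L) << shift;
                long lowBits = (1L << shift) - 1L; // bits already dropped at this level

                // An edge exists if the bound is not aligned to the next coarser block.
                bool hasLower = (min & mask) != 0L;
                bool hasUpper = (max & mask) != mask;
                long nextMin = (hasLower ? min + diff : min) & ~mask;
                long nextMax = (hasUpper ? max - diff : max) & ~mask;
                bool wrapped = nextMin < min || nextMax > max;

                if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped)
                {
                    // No coarser level is usable: emit what is left and stop.
                    ranges.Add((min, max | lowBits, shift));
                    break;
                }

                if (hasLower) ranges.Add((min, min | mask | lowBits, shift));
                if (hasUpper) ranges.Add((max & ~mask, max | lowBits, shift));

                // Continue with the aligned center at the next coarser precision.
                min = nextMin;
                max = nextMax;
            }
        }
        return ranges;
    }

    public static void Main()
    {
        foreach (var r in Split(4, 3, 40))
            Console.WriteLine($"[{r.Min}, {r.Max}] at shift {r.Shift}");
    }
}
```

In this sketch each emitted sub-range carries full-precision inclusive bounds, so the sub-ranges together cover exactly the requested interval. A real builder would instead turn each sub-range into a term range at the given shift.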