Class NumericUtils

This helper class generates prefix-encoded representations of numerical values and supplies converters that represent float/double values as sortable integers/longs.

To execute range queries quickly in Apache Lucene, a range is divided recursively into multiple intervals for searching: the center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.
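The recursive split described above can be sketched as a standalone Java program (Java being the language of the original Lucene). This is an illustration under assumptions, not the library's code: the class `RangeSplitSketch`, the method `split`, and the `{min, max, shift}` result layout are hypothetical names chosen for the example, and the loop follows the general shape of Lucene 4.x range splitting for 64-bit values.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeSplitSketch {
    /** Returns sub-ranges as {min, max, shift}; each is matched at precision `shift`. */
    static List<long[]> split(int precisionStep, long minBound, long maxBound) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<long[]> out = new ArrayList<>();
        if (minBound > maxBound) {
            return out; // empty range, nothing to match
        }
        for (int shift = 0; ; shift += precisionStep) {
            // Bits that distinguish values at this level but not at the next, coarser one.
            long diff = 1L << (shift + precisionStep);
            long mask = ((1L << precisionStep) - 1L) << shift;
            boolean hasLower = (minBound & mask) != 0L;   // lower edge is not aligned
            boolean hasUpper = (maxBound & mask) != mask; // upper edge is not aligned
            long nextMin = (hasLower ? minBound + diff : minBound) & ~mask;
            long nextMax = (hasUpper ? maxBound - diff : maxBound) & ~mask;
            boolean wrapped = nextMin < minBound || nextMax > maxBound;
            if (shift + precisionStep >= 64 || nextMin > nextMax || wrapped) {
                // No coarser level is usable: emit what is left and stop.
                add(out, minBound, maxBound, shift);
                break;
            }
            if (hasLower) add(out, minBound, minBound | mask, shift);
            if (hasUpper) add(out, maxBound & ~mask, maxBound, shift);
            minBound = nextMin;
            maxBound = nextMax;
        }
        return out;
    }

    private static void add(List<long[]> out, long min, long max, int shift) {
        // Set the bits that were shifted away so the max bound covers its whole bucket.
        out.add(new long[] {min, max | ((1L << shift) - 1L), shift});
    }

    public static void main(String[] args) {
        for (long[] r : split(4, 3L, 300L)) {
            System.out.println("[" + r[0] + ", " + r[1] + "] at shift " + r[2]);
        }
    }
}
```

For the range 3..300 with a precision step of 4, this sketch yields the two boundary ranges [3, 15] and [288, 300] at shift 0 and the center range [16, 287] at shift 4, so the center is covered by 17 coarse terms instead of 272 exact ones.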

This class generates the terms needed to achieve this. First, the numerical integer values are converted to bytes: the integer value (32 bit or 64 bit) is made unsigned and its bits are converted to ASCII chars, 7 bits per char. The resulting byte[] sorts like the original integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.
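As a rough illustration of the encoding just described, the following standalone Java sketch encodes a 64-bit value into a shift-prefixed term with 7 bits per byte. The names `PrefixCodedLongSketch`, `encode`, and `compareTerms` are hypothetical, and the prefix base `0x20` is an assumption taken from Lucene 4.x; this page does not state the constant's value.

```java
final class PrefixCodedLongSketch {
    // Assumed base for the shift prefix (0x20 in Lucene 4.x); not stated on this page.
    static final int SHIFT_START_LONG = 0x20;

    /** Encodes value at reduced precision: the lowest `shift` bits are dropped. */
    static byte[] encode(long value, int shift) {
        if (shift < 0 || shift > 63) {
            throw new IllegalArgumentException("shift must be in 0..63");
        }
        int nChars = (63 - shift) / 7 + 1;      // 7 payload bits per byte
        byte[] term = new byte[nChars + 1];     // +1 for the shift prefix
        term[0] = (byte) (SHIFT_START_LONG + shift);
        // Flipping the sign bit maps signed order onto unsigned order.
        long sortableBits = (value ^ 0x8000000000000000L) >>> shift;
        for (int i = nChars; i > 0; i--) {
            term[i] = (byte) (sortableBits & 0x7f); // stays within 7-bit ASCII
            sortableBits >>>= 7;
        }
        return term;
    }

    /** Plain lexicographic byte comparison (every byte produced above is below 0x80). */
    static int compareTerms(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return Integer.compare(a.length, b.length);
    }

    public static void main(String[] args) {
        long[] values = {-5L, 3L, 1_000_000L};
        for (int i = 0; i + 1 < values.length; i++) {
            int cmp = compareTerms(encode(values[i], 0), encode(values[i + 1], 0));
            System.out.println(values[i] + " vs " + values[i + 1] + " -> " + Integer.signum(cmp));
        }
    }
}
```

Because the sign bit is flipped before the bits are cut into 7-bit groups, byte-wise comparison of two full-precision terms agrees with numeric comparison of the original values, and two values that differ only in their dropped low bits encode to the same term.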

To also index floating point numbers, this class supplies two methods that convert them to integer values by changing their bit layout: DoubleToSortableInt64(Double) and SingleToSortableInt32(Single). Converting floating point numbers to integers and back loses no precision (only the integer form itself is not usable as a number). Other data types like dates can easily be converted to longs or ints (e.g. date to long).
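The bit-layout change can be sketched in a few lines of standalone Java. This mirrors the approach used by the original Lucene 4.x methods as far as I know, but treat it as an illustration: the class name `SortableDoubleSketch` is hypothetical, and the sketch does not call the Lucene.NET API.

```java
final class SortableDoubleSketch {
    /** Maps a double to a long whose signed order matches the double's order. */
    static long doubleToSortableLong(double value) {
        long bits = Double.doubleToLongBits(value); // collapses all NaNs to one pattern
        // Negative doubles sort in reverse when viewed as raw bits,
        // so flip every bit except the sign bit for them.
        if (bits < 0) {
            bits ^= 0x7fffffffffffffffL;
        }
        return bits;
    }

    /** Inverse of doubleToSortableLong; the same XOR undoes itself. */
    static double sortableLongToDouble(long sortable) {
        if (sortable < 0) {
            sortable ^= 0x7fffffffffffffffL;
        }
        return Double.longBitsToDouble(sortable);
    }

    public static void main(String[] args) {
        double[] samples = {-3.5, -0.0, 0.0, 1.25, Double.POSITIVE_INFINITY, Double.NaN};
        for (double d : samples) {
            long s = doubleToSortableLong(d);
            System.out.println(d + " -> " + s + " -> " + sortableLongToDouble(s));
        }
    }
}
```

The conversion is a pure bit rearrangement, which is why the round trip is lossless. The 32-bit variant would follow the same pattern with `Float.floatToIntBits` and the mask `0x7fffffff`.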

For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream, which can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

This class can also be used to generate lexicographically sortable (according to UTF8SortedAsUTF16Comparer) representations of numeric data types for other usages (e.g. sorting).

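NOTE: This API is for internal purposes only and might change in incompatible ways in the next release. Available since 2.9; the API changed in a non-backwards-compatible way in 4.0.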

Inheritance

System.Object

NumericUtils

Assembly: DistributedLucene.Net.dll

Syntax

public sealed class NumericUtils : object

Fields

Name	Description
BUF_SIZE_INT32	The maximum term length (used for byte[] buffer size) for encoding int values. NOTE: This was BUF_SIZE_INT in Lucene
BUF_SIZE_INT64	The maximum term length (used for byte[] buffer size) for encoding long values. NOTE: This was BUF_SIZE_LONG in Lucene
PRECISION_STEP_DEFAULT	The default precision step used by Int32Field, SingleField, Int64Field, DoubleField, NumericTokenStream, NumericRangeQuery, and NumericRangeFilter.
SHIFT_START_INT32	Integers are stored at lower precision by shifting off lower bits. The shift count is stored as `SHIFT_START_INT32+shift` in the first byte. NOTE: This was SHIFT_START_INT in Lucene
SHIFT_START_INT64	Longs are stored at lower precision by shifting off lower bits. The shift count is stored as `SHIFT_START_INT64+shift` in the first byte. NOTE: This was SHIFT_START_LONG in Lucene

Methods

Name	Description
DoubleToSortableInt64(Double)	Converts a double value to a sortable signed long. The value is converted by getting its IEEE 754 floating-point "double format" bit layout and then swapping some bits, so that the result can be compared as a long. The precision is not reduced by this, and the value can easily be used as a long. In the resulting sort order, `NaN` is greater than positive infinity. NOTE: This was doubleToSortableLong() in Lucene
FilterPrefixCodedInt32s(TermsEnum)	Filters the given TermsEnum by accepting only prefix coded 32 bit terms with a shift value of `0`. NOTE: This was filterPrefixCodedInts() in Lucene
FilterPrefixCodedInt64s(TermsEnum)	Filters the given TermsEnum by accepting only prefix coded 64 bit terms with a shift value of `0`. NOTE: This was filterPrefixCodedLongs() in Lucene
GetPrefixCodedInt32Shift(BytesRef)	Returns the shift value from a prefix encoded int. NOTE: This was getPrefixCodedIntShift() in Lucene
GetPrefixCodedInt64Shift(BytesRef)	Returns the shift value from a prefix encoded long. NOTE: This was getPrefixCodedLongShift() in Lucene
Int32ToPrefixCoded(Int32, Int32, BytesRef)	Returns prefix coded bits after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.Offset` will always be 0. NOTE: This was intToPrefixCoded() in Lucene
Int32ToPrefixCodedBytes(Int32, Int32, BytesRef)	Returns prefix coded bits after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.Offset` will always be 0. NOTE: This was intToPrefixCodedBytes() in Lucene
Int64ToPrefixCoded(Int64, Int32, BytesRef)	Returns prefix coded bits after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.Offset` will always be 0. NOTE: This was longToPrefixCoded() in Lucene
Int64ToPrefixCodedBytes(Int64, Int32, BytesRef)	Returns prefix coded bits after reducing the precision by `shift` bits. This method is used by NumericTokenStream. After encoding, `bytes.Offset` will always be 0. NOTE: This was longToPrefixCodedBytes() in Lucene
PrefixCodedToInt32(BytesRef)	Returns an int from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value. NOTE: This was prefixCodedToInt() in Lucene
PrefixCodedToInt64(BytesRef)	Returns a long from prefixCoded bytes. Rightmost bits will be zero for lower precision codes. This method can be used to decode a term's value. NOTE: This was prefixCodedToLong() in Lucene
SingleToSortableInt32(Single)	Converts a float value to a sortable signed int. The value is converted by getting its IEEE 754 floating-point "float format" bit layout and then swapping some bits, so that the result can be compared as an int. The precision is not reduced by this, and the value can easily be used as an int. In the resulting sort order, `NaN` is greater than positive infinity. NOTE: This was floatToSortableInt() in Lucene
SortableInt32ToSingle(Int32)	Converts a sortable int back to a float. NOTE: This was sortableIntToFloat() in Lucene
SortableInt64ToDouble(Int64)	Converts a sortable long back to a double. NOTE: This was sortableLongToDouble() in Lucene
SplitInt32Range(NumericUtils.Int32RangeBuilder, Int32, Int32, Int32)	Splits an int range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its AddRange(BytesRef, BytesRef) method. This method is used by NumericRangeQuery. NOTE: This was splitIntRange() in Lucene
SplitInt64Range(NumericUtils.Int64RangeBuilder, Int32, Int64, Int64)	Splits a long range recursively. You may implement a builder that adds clauses to a BooleanQuery for each call to its AddRange(BytesRef, BytesRef) method. This method is used by NumericRangeQuery. NOTE: This was splitLongRange() in Lucene
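
To make the decoding rows above concrete (reading the shift back from the first byte, and the rightmost bits being zero for lower precision codes), here is a standalone Java sketch of a 32-bit encode/decode round trip. It is an illustration only: `PrefixCodedIntSketch`, `encode`, `shiftOf`, and `decode` are hypothetical names, a plain `byte[]` stands in for BytesRef, and the prefix base `0x60` is an assumption taken from Lucene 4.x rather than from this page.

```java
final class PrefixCodedIntSketch {
    // Assumed base for the shift prefix of 32-bit terms (0x60 in Lucene 4.x).
    static final int SHIFT_START_INT = 0x60;

    /** Encodes value with the lowest `shift` bits removed, 7 bits per byte. */
    static byte[] encode(int value, int shift) {
        if (shift < 0 || shift > 31) {
            throw new IllegalArgumentException("shift must be in 0..31");
        }
        int nChars = (31 - shift) / 7 + 1;
        byte[] term = new byte[nChars + 1];
        term[0] = (byte) (SHIFT_START_INT + shift);
        int sortableBits = (value ^ 0x80000000) >>> shift; // flip sign, drop low bits
        for (int i = nChars; i > 0; i--) {
            term[i] = (byte) (sortableBits & 0x7f);
            sortableBits >>>= 7;
        }
        return term;
    }

    /** Reads the shift back from the prefix byte. */
    static int shiftOf(byte[] term) {
        int shift = term[0] - SHIFT_START_INT;
        if (shift < 0 || shift > 31) {
            throw new NumberFormatException("not a prefix coded int term");
        }
        return shift;
    }

    /** Rebuilds the value; the bits removed during encoding come back as zeros. */
    static int decode(byte[] term) {
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            sortableBits = (sortableBits << 7) | term[i];
        }
        return (sortableBits << shiftOf(term)) ^ 0x80000000;
    }

    public static void main(String[] args) {
        byte[] exact = encode(1000, 0);
        byte[] coarse = encode(1000, 4);
        System.out.println("shift 0 decodes to " + decode(exact));
        System.out.println("shift " + shiftOf(coarse) + " decodes to " + decode(coarse));
    }
}
```

In this sketch, 1000 encoded with a shift of 4 decodes to 992, because the four lowest bits were discarded at encoding time; a shift of 0 round-trips exactly.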

Extension Methods

Number.IsNumber(Object)

SystemTypesHelpers.toString(Object)

SystemTypesHelpers.equals(Object, Object)