Namespace Lucene.Net.Util

Classes

AlreadySetException

Thrown when Set(T) is called more than once.

ArrayUtil

Methods for manipulating arrays.

@lucene.internal

Attribute

Base class for Attributes that can be added to an AttributeSource.

Attributes are used to add data in a dynamic, yet type-safe way to a source of usually streamed objects, e.g. a TokenStream.

An AttributeSource contains a list of different Attributes, and methods to add and get them. There can only be a single instance of an attribute in the same AttributeSource instance. This is ensured by passing the actual type of the IAttribute to AddAttribute<T>(), which then checks whether an instance of that type is already present. If so, it returns that instance; otherwise it creates a new instance and returns it.
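The get-or-create behavior described above can be sketched as follows. This is a minimal illustration only: `TinyAttributeSource` and `TermText` are hypothetical names, not Lucene.NET types, and the real AttributeSource resolves an interface type through a factory rather than calling a parameterless constructor.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for an attribute source; NOT the Lucene.NET class.
public sealed class TinyAttributeSource
{
    // At most one instance per attribute type.
    private readonly Dictionary<Type, object> attributes = new Dictionary<Type, object>();

    // Returns the existing instance of T, or creates, stores, and returns a new one.
    public T AddAttribute<T>() where T : class, new()
    {
        if (attributes.TryGetValue(typeof(T), out object existing))
        {
            return (T)existing;
        }
        var created = new T();
        attributes[typeof(T)] = created;
        return created;
    }
}

// Hypothetical attribute type used only for the illustration.
public sealed class TermText
{
    public string Value = "";
}
```

Calling `AddAttribute<TermText>()` twice on the same source yields the same object, so state written through one reference is visible through the other.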

AttributeSource.AttributeFactory

An AttributeSource.AttributeFactory creates instances of Attributes.

AttributeSource.State

This class holds the state of an AttributeSource.

BaseDocIdSetTestCase<T>

Base test class for DocIdSets.

Bits

Bits.MatchAllBits

Bits impl of the specified length with all bits set.

Bits.MatchNoBits

Bits impl of the specified length with no bits set.

BitUtil

A variety of high efficiency bit twiddling routines.

@lucene.internal

BroadWord

Methods and constants inspired by the article "Broadword Implementation of Rank/Select Queries" by Sebastiano Vigna, January 30, 2012:

algorithm 1: BitCount(Int64), count of set bits in a long,
algorithm 2: Select(Int64, Int32), selection of a set bit in a long,
bytewise signed smaller <₈ operator: SmallerUpTo7_8(Int64, Int64),
shortwise signed smaller <₁₆ operator: SmallerUpto15_16(Int64, Int64),
some of the Lk and Hk constants that are used by the above: L8 L8_L, H8 H8_L, L9 L9_L, L16 L16_L and H16 H8_L.

@lucene.internal
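As an illustration of the kind of routine listed above, here is a sketch of the well-known sideways-addition population count. It is not copied from BroadWord; `BitCountSketch` is a hypothetical name, and the multiplier `0x0101010101010101` is assumed to play the role of the L8 constant.

```csharp
public static class BitCountSketch
{
    // Counts the set bits of a 64-bit value using sideways addition.
    public static int BitCount(long value)
    {
        unchecked
        {
            ulong x = (ulong)value;
            // Each 2-bit field now holds the number of set bits it contained.
            x = x - ((x >> 1) & 0x5555555555555555UL);
            // Sum adjacent 2-bit fields into 4-bit fields.
            x = (x & 0x3333333333333333UL) + ((x >> 2) & 0x3333333333333333UL);
            // Sum adjacent 4-bit fields into bytes.
            x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0FUL;
            // Multiplying by 0x01 repeated in every byte accumulates all byte sums in the top byte.
            return (int)((x * 0x0101010101010101UL) >> 56);
        }
    }
}
```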

ByteBlockPool

ByteBlockPool.Allocator

Abstract class for allocating and freeing byte blocks.

ByteBlockPool.DirectAllocator

A simple ByteBlockPool.Allocator that never recycles.

ByteBlockPool.DirectTrackingAllocator

A simple ByteBlockPool.Allocator that never recycles, but tracks how much total RAM is in use.

BytesRef

Represents byte[], as a slice (offset + length) into an existing byte[]. The Bytes property should never be null; use EMPTY_BYTES if necessary.

Important note: Unless otherwise noted, Lucene uses this class to represent terms that are encoded as UTF8 bytes in the index. To convert them to a .NET string (which is UTF16), use Utf8ToString(). Using code like new String(bytes, offset, length) to do this is wrong, as it does not respect the correct character set and may return wrong results (depending on the platform's defaults)!
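To make the point about character sets concrete, the sketch below decodes a slice (offset + length) of a byte[] explicitly as UTF-8 using only the .NET base class library. `Utf8SliceSketch` is a hypothetical helper for illustration; it is not how Utf8ToString() is implemented.

```csharp
using System.Text;

public static class Utf8SliceSketch
{
    // Decodes bytes[offset .. offset + length) as UTF-8, independent of platform defaults.
    public static string SliceToString(byte[] bytes, int offset, int length)
    {
        return Encoding.UTF8.GetString(bytes, offset, length);
    }
}
```

Note that the length of a slice is counted in bytes, not characters: a non-ASCII character such as "é" occupies two bytes in UTF-8.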

BytesRefArray

A simple append-only random-access BytesRef array that stores full copies of the appended bytes in a ByteBlockPool.

Note: this class is not thread-safe!

@lucene.internal @lucene.experimental

BytesRefHash

BytesRefHash is a special-purpose hash-map-like data structure optimized for BytesRef instances. BytesRefHash maintains mappings of byte arrays to ids (Map<BytesRef,int>), storing the hashed bytes efficiently in contiguous storage. The mapping to the id is encapsulated inside BytesRefHash and is guaranteed to be increased for each added BytesRef.

Note: A BytesRef instance passed to Add(BytesRef) must not be longer than BYTE_BLOCK_SIZE-2. The internal storage is limited to 2GB total byte storage.

@lucene.internal

BytesRefHash.BytesStartArray

Manages allocation of the per-term addresses.

BytesRefHash.DirectBytesStartArray

A simple BytesRefHash.BytesStartArray that tracks memory allocation using a private Counter instance.

BytesRefHash.MaxBytesLengthExceededException

Thrown if a BytesRef exceeds the BytesRefHash limit of BYTE_BLOCK_SIZE-2.

BytesRefIterator

LUCENENET specific class to make the syntax of creating an empty IBytesRefIterator the same as it was in Lucene. Example:

var iter = BytesRefIterator.Empty;

CharsRef

Represents char[], as a slice (offset + length) into an existing char[]. The Chars property should never be null; use EMPTY_CHARS if necessary.

@lucene.internal

CollectionUtil

Methods for manipulating (sorting) collections. Sort methods work directly on the supplied lists and don't copy to/from arrays before/after. For medium size collections as used in the Lucene indexer that is much more efficient.

@lucene.internal

CommandLineUtil

Class containing some useful methods used by command line tools.

Constants

Some useful constants.

Counter

Simple counter class.

@lucene.internal @lucene.experimental

DisposableThreadLocal<T>

Java's builtin ThreadLocal has a serious flaw: it can take an arbitrarily long amount of time to dereference the things you had stored in it, even once the ThreadLocal instance itself is no longer referenced. This is because there is a single master map stored for each thread, which all ThreadLocals share, and that master map only periodically purges "stale" entries.

While not technically a memory leak, because eventually the memory will be reclaimed, it can take a long time and you can easily hit an OutOfMemoryException because from the GC's standpoint the stale entries are not reclaimable.

This class works around that by only enrolling WeakReference values into the ThreadLocal, and separately holding a hard reference to each stored value. When you call Dispose(), these hard references are cleared and the GC is then free to reclaim the space used by the objects stored in it.

You should not call Dispose() until all threads are done using the instance.

@lucene.internal

DocIdBitSet

Simple DocIdSet and DocIdSetIterator backed by a BitArray.

DoubleBarrelLRUCache

LUCENENET specific class to nest the DoubleBarrelLRUCache.CloneableKey so it can be accessed without referencing the generic closing types of DoubleBarrelLRUCache<TKey, TValue>.

DoubleBarrelLRUCache.CloneableKey

Object providing clone(); the key class must subclass this.

DoubleBarrelLRUCache<TKey, TValue>

Simple concurrent LRU cache, using a "double barrel" approach where two ConcurrentHashMaps record entries.

At any given time, one hash is primary and the other is secondary. Get(TKey) first checks primary, and if that's a miss, checks secondary. If secondary has the entry, it's promoted to primary (NOTE: the key is cloned at this point). Once primary is full, the secondary is cleared and the two are swapped.

This is not as space efficient as other possible concurrent approaches (see LUCENE-2075): to achieve perfect LRU(N) it requires 2*N storage. But, this approach is relatively simple and seems in practice to not grow unbounded in size when under hideously high load.

@lucene.internal
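The primary/secondary scheme described above can be sketched as follows. This is a deliberately simplified, single-threaded illustration: `DoubleBarrelSketch` is a hypothetical name, it uses plain dictionaries instead of concurrent maps, and it omits the key cloning performed on promotion.

```csharp
using System.Collections.Generic;

// Simplified, NOT thread-safe illustration of the "double barrel" idea.
public sealed class DoubleBarrelSketch<TKey, TValue>
{
    private Dictionary<TKey, TValue> primary = new Dictionary<TKey, TValue>();
    private Dictionary<TKey, TValue> secondary = new Dictionary<TKey, TValue>();
    private readonly int maxSize;
    private int countdown;

    public DoubleBarrelSketch(int maxSize)
    {
        this.maxSize = maxSize;
        countdown = maxSize;
    }

    public bool TryGet(TKey key, out TValue value)
    {
        // Check primary first.
        if (primary.TryGetValue(key, out value))
        {
            return true;
        }
        // On a miss, check secondary and promote the entry to primary.
        if (secondary.TryGetValue(key, out value))
        {
            Put(key, value);
            return true;
        }
        return false;
    }

    public void Put(TKey key, TValue value)
    {
        primary[key] = value;
        // Once maxSize puts have happened, clear secondary and swap the two maps.
        if (--countdown == 0)
        {
            secondary.Clear();
            (primary, secondary) = (secondary, primary);
            countdown = maxSize;
        }
    }
}
```

Entries that are never read again fall out after two swaps, which is why up to 2*N entries may be held to approximate LRU(N).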

English

Converts numbers to English strings for testing.

@lucene.internal

ExcludeServiceAttribute

Base class for Attribute types that exclude services from Reflection scanning.

FailOnNonBulkMergesInfoStream

Hackidy-Häck-Hack to cause a test to fail on non-bulk merges.

FailureMarker

A run listener that detects suite/test failures. We need it because failures due to thread leaks happen outside of any rule contexts.

FieldCacheSanityChecker

Provides methods for sanity checking that entries in the FieldCache are not wasteful or inconsistent.

Lucene 2.9 introduced numerous enhancements into how the FieldCache is used by the low levels of Lucene searching (for Sorting and ValueSourceQueries) to improve both the speed of sorting and the reopening of IndexReaders. But these changes have shifted the usage of FieldCache from "top level" IndexReaders (frequently a MultiReader or DirectoryReader) down to the leaf level SegmentReaders. As a result, existing applications that directly access the FieldCache may find RAM usage increase significantly when upgrading to 2.9 or later. This class provides an API for these applications (or their unit tests) to check at run time if the FieldCache contains "insane" usages of the FieldCache.

@lucene.experimental

FieldCacheSanityChecker.Insanity

Simple container for a collection of related FieldCache.CacheEntry objects that in conjunction with each other represent some "insane" usage of the IFieldCache.

FieldCacheSanityChecker.InsanityType

An enumeration of the different types of "insane" behavior that may be detected in an IFieldCache.

FilterIterator<T>

An iterator implementation that filters elements with a boolean predicate.

FixedBitSet

BitSet of fixed length (numBits), backed by accessible (GetBits()) long[], accessed with an int index, implementing IBits and DocIdSet. If you need to manage more than 2.1B bits, use Int64BitSet.

@lucene.internal

FixedBitSet.FixedBitSetIterator

A DocIdSetIterator which iterates over set bits in a FixedBitSet.

GrowableByteArrayDataOutput

A DataOutput that can be used to build a byte[].

@lucene.internal

IndexableBinaryStringTools

Provides support for converting byte sequences to strings and back again. The resulting strings preserve the original byte sequences' sort order.

The strings are constructed using a Base 8000h encoding of the original binary data - each char of an encoded string represents a 15-bit chunk from the byte sequence. Base 8000h was chosen because it allows for all lower 15 bits of char to be used without restriction; the surrogate range [U+D800-U+DFFF] does not represent valid chars, and would require complicated handling to avoid them and allow use of char's high bit.

Although unset bits are used as padding in the final char, the original byte sequence could contain trailing bytes with no set bits (null bytes): padding is indistinguishable from valid information. To overcome this problem, a char is appended, indicating the number of encoded bytes in the final content char.

@lucene.experimental

InfoStream

Debugging API for Lucene classes such as IndexWriter and SegmentInfos.

NOTE: Enabling infostreams may cause performance degradation in some components.

@lucene.internal

InPlaceMergeSorter

Sorter implementation based on the merge-sort algorithm that merges in place (no extra memory will be allocated). Small arrays are sorted with insertion sort.

@lucene.internal

Int32BlockPool

A pool for int blocks similar to ByteBlockPool.

NOTE: This was IntBlockPool in Lucene

@lucene.internal

Int32BlockPool.Allocator

Abstract class for allocating and freeing int blocks.

Int32BlockPool.DirectAllocator

A simple Int32BlockPool.Allocator that never recycles.

Int32BlockPool.SliceReader

An Int32BlockPool.SliceReader that can read slices written by an Int32BlockPool.SliceWriter.

@lucene.internal

Int32BlockPool.SliceWriter

An Int32BlockPool.SliceWriter that allows writing multiple integer slices into a given Int32BlockPool.

@lucene.internal

Int32sRef

Represents int[], as a slice (offset + length) into an existing int[]. The Int32s member should never be null; use EMPTY_INT32S if necessary.

NOTE: This was IntsRef in Lucene

@lucene.internal

Int64BitSet

BitSet of fixed length (numBits), backed by accessible (GetBits()) long[], accessed with a long index. Use it only if you intend to store more than 2.1B bits, otherwise you should use FixedBitSet.

NOTE: This was LongBitSet in Lucene

@lucene.internal

Int64sRef

Represents long[], as a slice (offset + length) into an existing long[]. The Int64s member should never be null; use EMPTY_INT64S if necessary.

NOTE: This was LongsRef in Lucene

@lucene.internal

Int64Values

Abstraction over an array of longs. This class extends NumericDocValues so that we don't need to add another level of abstraction every time we want, e.g., to use the PackedInt32s utility classes to represent a NumericDocValues instance.

NOTE: This was LongValues in Lucene

@lucene.internal

IntroSorter

Sorter implementation based on a variant of the quicksort algorithm called introsort: when the recursion level exceeds the log of the length of the array to sort, it falls back to heapsort. This prevents quicksort from running into its worst-case quadratic runtime. Small arrays are sorted with insertion sort.

@lucene.internal

IOUtils

This class emulates the new Java 7 "Try-With-Resources" statement. Remove once Lucene is on Java 7.

@lucene.internal

LineFileDocs

Minimal port of benchmark's LineDocSource + DocMaker, so tests can enumerate docs from a line file created by benchmark's WriteLineDoc task.

LuceneTestCase

LuceneTestCase.ConcurrentMergeSchedulerFactories

Contains a list of all the Func<IConcurrentMergeSchedulers> to be tested. Delegate method allows them to be created on their target thread instead of the test thread and also ensures a separate instance is created in each case (which can affect the result of the test).

LUCENENET specific

LuceneTestCase.SuppressCodecsAttribute

Annotation for test classes that should avoid certain codec types (because they are expensive, for example).

LuceneTestCase.SuppressTempFileChecks

LuceneVersionExtensions

Extension methods to the LuceneVersion enumeration to provide version comparison and parsing functionality.

MapOfSets<TKey, TValue>

Helper class for keeping Lists of Objects associated with keys. WARNING: this class is not thread-safe.

@lucene.internal

MathUtil

Math static utility methods.

MergedIterator<T>

Provides a merged sorted view from several sorted iterators.

If built with removeDuplicates set to true and an element appears in multiple iterators then it is deduplicated, that is, this iterator returns the sorted union of elements.

If built with removeDuplicates set to false then all elements in all iterators are returned.

Caveats:

The behavior is undefined if the iterators are not actually sorted.
Null elements are unsupported.
If removeDuplicates is set to true and if a single iterator contains duplicates then they will not be deduplicated.
When elements are deduplicated it is not defined which one is returned.
If removeDuplicates is set to false then the order in which duplicates are returned isn't defined.

@lucene.internal

NamedServiceFactory<TService>

LUCENENET specific abstract class containing common functionality for named service factories.

NullInfoStream

Prints nothing. Just to make sure tests pass with and without enabled InfoStream without actually making noise.

@lucene.experimental

NumericUtils

This is a helper class to generate prefix-encoded representations for numerical values and supplies converters to represent float/double values as sortable integers/longs.

To quickly execute range queries in Apache Lucene, a range is divided recursively into multiple intervals for searching: The center of the range is searched only with the lowest possible precision in the trie, while the boundaries are matched more exactly. This reduces the number of terms dramatically.

This class generates terms to achieve this: First the numerical integer values need to be converted to bytes. For that, integer values (32 bit or 64 bit) are made unsigned and the bits are converted to ASCII chars with 7 bits each. The resulting byte[] is sortable like the original integer value (even using UTF-8 sort order). Each value is also prefixed (in the first char) by the shift value (number of bits removed) used during encoding.

To also index floating point numbers, this class supplies two methods to convert them to integer values by changing their bit layout: DoubleToSortableInt64(Double), SingleToSortableInt32(Single). You will have no precision loss by converting floating point numbers to integers and back (only that the integer form is not usable). Other data types like dates can easily be converted to longs or ints (e.g. date to long).
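The bit-layout change can be illustrated with the following sketch. It assumes the common technique of flipping the 63 non-sign bits of negative values so that signed integer order matches numeric order; `SortableBitsSketch` is a hypothetical name, and details such as NaN normalization in the actual NumericUtils methods are not reproduced here.

```csharp
using System;

public static class SortableBitsSketch
{
    // Maps a double to a long whose signed order matches the double's numeric order.
    public static long DoubleToSortableInt64(double value)
    {
        long bits = BitConverter.DoubleToInt64Bits(value);
        // Negative doubles: a larger magnitude has larger raw bits, so invert the 63 lower bits.
        if (bits < 0)
        {
            bits ^= 0x7FFFFFFFFFFFFFFFL;
        }
        return bits;
    }

    // Inverse of DoubleToSortableInt64; the XOR is its own inverse and preserves the sign bit.
    public static double SortableInt64ToDouble(long sortable)
    {
        if (sortable < 0)
        {
            sortable ^= 0x7FFFFFFFFFFFFFFFL;
        }
        return BitConverter.Int64BitsToDouble(sortable);
    }
}
```

Because the mapping only rearranges bits, converting to the sortable form and back returns exactly the original value.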

For easy usage, the trie algorithm is implemented for indexing inside NumericTokenStream that can index int, long, float, and double. For querying, NumericRangeQuery and NumericRangeFilter implement the query part for the same data types.

This class can also be used to generate lexicographically sortable (according to UTF8SortedAsUTF16Comparer) representations of numeric data types for other usages (e.g. sorting).

@lucene.internal @since 2.9, API changed non backwards-compliant in 4.0

NumericUtils.Int32RangeBuilder

Callback for SplitInt32Range(NumericUtils.Int32RangeBuilder, Int32, Int32, Int32). You need to override only one of the methods.

NOTE: This was IntRangeBuilder in Lucene

@lucene.internal @since 2.9, API changed non backwards-compliant in 4.0

NumericUtils.Int64RangeBuilder

Callback for SplitInt64Range(NumericUtils.Int64RangeBuilder, Int32, Int64, Int64). You need to override only one of the methods.

NOTE: This was LongRangeBuilder in Lucene

@lucene.internal @since 2.9, API changed non backwards-compliant in 4.0

OfflineSorter

On-disk sorting of byte arrays. Each byte array (entry) is composed of the following fields:

(two bytes) length of the following byte array,
exactly the above count of bytes for the sequence to be sorted.
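The entry layout above can be sketched with plain streams as follows. `LengthPrefixedSketch` is a hypothetical helper, and the choice of an unsigned big-endian two-byte length is an assumption made for the illustration; the source only states that the length occupies two bytes.

```csharp
using System;
using System.IO;

public static class LengthPrefixedSketch
{
    // Writes a two-byte length (assumed big-endian, unsigned) followed by the entry bytes.
    public static void WriteEntry(Stream output, byte[] entry)
    {
        if (entry.Length > ushort.MaxValue)
        {
            throw new ArgumentException("Entry too long for a two-byte length prefix.");
        }
        output.WriteByte((byte)(entry.Length >> 8));
        output.WriteByte((byte)entry.Length);
        output.Write(entry, 0, entry.Length);
    }

    // Reads one entry, or returns null when the stream is exhausted.
    public static byte[] ReadEntry(Stream input)
    {
        int high = input.ReadByte();
        if (high < 0)
        {
            return null; // clean end of stream
        }
        int low = input.ReadByte();
        if (low < 0)
        {
            throw new EndOfStreamException("Truncated length prefix.");
        }
        int length = (high << 8) | low;
        var entry = new byte[length];
        int read = 0;
        // Stream.Read may return fewer bytes than requested, so loop until the entry is complete.
        while (read < length)
        {
            int n = input.Read(entry, read, length - read);
            if (n <= 0)
            {
                throw new EndOfStreamException("Truncated entry.");
            }
            read += n;
        }
        return entry;
    }
}
```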

OfflineSorter.BufferSize

A bit more descriptive unit for constructors.

OfflineSorter.ByteSequencesReader

Utility class to read length-prefixed byte[] entries from an input. Complementary to OfflineSorter.ByteSequencesWriter.

OfflineSorter.ByteSequencesWriter

Utility class to emit length-prefixed byte[] entries to an output stream for sorting. Complementary to OfflineSorter.ByteSequencesReader.

OfflineSorter.SortInfo

Sort info (debugging mostly).

OpenBitSet

An "open" BitSet implementation that allows direct access to the array of words storing the bits.

NOTE: This can be used in .NET any place where a java.util.BitSet is used in Java.

Unlike java.util.BitSet, the fact that bits are packed into an array of longs is part of the interface. This allows efficient implementation of other algorithms by someone other than the author. It also allows one to efficiently implement alternate serialization or interchange formats.

OpenBitSet is faster than java.util.BitSet in most operations and much faster at calculating cardinality of sets and results of set operations. It can also handle sets of larger cardinality (up to 64 * 2**32-1).

The goals of OpenBitSet are the fastest implementation possible, and maximum code reuse. Extra safety and encapsulation may always be built on top, but if that's built in, the cost can never be removed (and hence people re-implement their own version in order to get better performance).

Performance Results

Test system: Pentium 4, Sun Java 1.5_06 -server -Xbatch -Xmx64M

BitSet size = 1,000,000

Results are java.util.BitSet time divided by OpenBitSet time.

            cardinality  IntersectionCount  Union  NextSetBit  Get   GetIterator
50% full    3.36         3.96               1.44   1.46        1.99  1.58
1% full     3.31         3.90               -      1.04        -     0.99

Test system: AMD Opteron, 64 bit linux, Sun Java 1.5_06 -server -Xbatch -Xmx64M

BitSet size = 1,000,000

Results are java.util.BitSet time divided by OpenBitSet time.

            cardinality  IntersectionCount  Union  NextSetBit  Get   GetIterator
50% full    2.50         3.50               1.00   1.03        1.12  1.25
1% full     2.51         3.49               -      1.00        -     1.02

OpenBitSetDISI

OpenBitSet with added methods to bulk-update the bits from a DocIdSetIterator. (DISI stands for DocIdSetIterator).

OpenBitSetIterator

An iterator to iterate over set bits in an OpenBitSet. This is faster than NextSetBit(Int64) for iterating over the complete set of bits, especially when the density of the bits set is high.

PagedBytes

Represents a logical byte[] as a series of pages. You can write-once into the logical byte[] (append only), using copy, and then retrieve slices (BytesRef) into it using fill.

@lucene.internal

PagedBytes.PagedBytesDataInput

PagedBytes.PagedBytesDataOutput

PagedBytes.Reader

Provides methods to read BytesRefs from a frozen PagedBytes.

Paths

The static accessor class for file paths used in testing.

PForDeltaDocIdSet

DocIdSet implementation based on pfor-delta encoding.

This implementation is inspired by LinkedIn's Kamikaze (http://data.linkedin.com/opensource/kamikaze) and Daniel Lemire's JavaFastPFOR (https://github.com/lemire/JavaFastPFOR).

Contrary to the original PFOR paper, exceptions are encoded with FOR instead of Simple16.

PForDeltaDocIdSet.Builder

A builder for PForDeltaDocIdSet.

PrintStreamInfoStream

LUCENENET specific stub to assist with migration to TextWriterInfoStream.

PriorityQueue<T>

A PriorityQueue<T> maintains a partial ordering of its elements such that the element with least priority can always be found in constant time. Put() and Pop() operations require log(size) time.

NOTE: this class will pre-allocate a full array of length maxSize+1 if instantiated via the PriorityQueue(Int32, Boolean) constructor with prepopulate set to true. That maximum size can grow as we insert elements over time.

@lucene.internal

QueryBuilder

Creates queries from the Analyzer chain.

Example usage:

    QueryBuilder builder = new QueryBuilder(analyzer);
    Query a = builder.CreateBooleanQuery("body", "just a test");
    Query b = builder.CreatePhraseQuery("body", "another test");
    Query c = builder.CreateMinShouldMatchQuery("body", "another test", 0.5f);

This can also be used as a subclass for query parsers to make it easier to interact with the analysis chain. Factory methods such as NewTermQuery(Term) are provided so that the generated queries can be customized.

QuickPatchThreadsFilter

Last minute patches. TODO: remove when integrated in system filters in rr.

RamUsageEstimator

Estimates the size (memory representation) of .NET objects.

@lucene.internal

RecyclingByteBlockAllocator

A ByteBlockPool.Allocator implementation that recycles unused byte blocks in a buffer and reuses them in subsequent calls to GetByteBlock().

Note: this class is not thread-safe.

@lucene.internal

RecyclingInt32BlockAllocator

An Int32BlockPool.Allocator implementation that recycles unused int blocks in a buffer and reuses them in subsequent calls to GetInt32Block().

Note: this class is not thread-safe.

NOTE: This was RecyclingIntBlockAllocator in Lucene

@lucene.internal

RefCount<T>

Manages reference counting for a given object. Extensions can override Release() to do custom logic when reference counting hits 0.

RollingBuffer

LUCENENET specific class to allow referencing static members of RollingBuffer<T> without referencing its generic closing type.

RollingBuffer<T>

Acts like a forever growing T[], but internally uses a circular buffer to reuse instances of T.

@lucene.internal

RunListenerPrintReproduceInfo

A suite listener printing a "reproduce string". This ensures test result events are always captured properly even if exceptions happen at initialization or suite/hooks level.

SentinelInt32Set

A native hash-based set where one value is reserved to mean "EMPTY" internally. The space overhead is fairly low as there is only one power-of-two sized int[] to hold the values. The set is re-hashed when adding a value that would make it >= 75% full. Consider extending and overriding Hash(Int32) if the values might be poor hash keys; Lucene docids should be fine. The internal fields are exposed publicly to enable more efficient use at the expense of better O-O principles.

To iterate over the integers held in this set, simply use code like this:

SentinelInt32Set set = ...
foreach (int v in set.keys) 
{
    if (v == set.EmptyVal)
        continue;
    //use v...
}

NOTE: This was SentinelIntSet in Lucene

@lucene.internal

ServiceNameAttribute

LUCENENET specific abstract class for attributes that can be used to override the default convention-based names of services. For example, "Lucene40Codec" will by convention be named "Lucene40". Using the CodecNameAttribute, the name can be overridden with a custom value.

SetOnce<T>

A convenient class which offers a semi-immutable object wrapper implementation which allows one to set the value of an object exactly once, and retrieve it many times. If Set(T) is called more than once, AlreadySetException is thrown and the operation will fail.

@lucene.experimental
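The set-once contract can be sketched as follows. `SetOnceSketch<T>` is a hypothetical type written for illustration; it throws InvalidOperationException where the documented class throws AlreadySetException, and it makes no claim about the memory-visibility guarantees of the real implementation.

```csharp
using System;
using System.Threading;

// Hypothetical illustration of a set-exactly-once wrapper; not the Lucene.NET class.
public sealed class SetOnceSketch<T>
{
    private int isSet; // 0 = not yet set, 1 = set
    private T value;

    // Stores the value on the first call; every later call fails.
    public void Set(T newValue)
    {
        // Atomically flip the flag from 0 to 1; only one caller can win.
        if (Interlocked.CompareExchange(ref isSet, 1, 0) != 0)
        {
            throw new InvalidOperationException("The value has already been set.");
        }
        value = newValue;
    }

    // Returns the stored value, or default(T) if Set has not been called.
    public T Get()
    {
        return value;
    }
}
```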

SloppyMath

Math functions that trade off accuracy for speed.

SmallSingle

Floating point numbers smaller than 32 bits.

NOTE: This was SmallFloat in Lucene

@lucene.internal

Sorter

Base class for sorting algorithms implementations.

@lucene.internal

SPIClassIterator<S>

Helper class for loading SPI classes from classpath (META-INF files). This is a light impl of java.util.ServiceLoader but is guaranteed to be bug-free regarding classpath order and does not instantiate or initialize the classes found.

@lucene.internal

StackTraceHelper

StringHelper

Methods for manipulating strings.

@lucene.internal

TestRuleAssertionsRequired

Require assertions for Lucene/Solr packages.

TestRuleFieldCacheSanity

TestRuleIgnoreAfterMaxFailures

TestRuleIgnoreTestSuites

TestRuleMarkFailure

A rule for marking failed tests and suites.

TestRuleStoreClassName

Stores the suite name so you can retrieve it from

TestSecurityManager

TestUtil

General utility methods for Lucene unit tests.

TextWriterInfoStream

InfoStream implementation over a TextWriter.

NOTE: This is analogous to PrintStreamInfoStream in Lucene.

@lucene.internal

ThrottledIndexOutput

Intentionally slow IndexOutput for testing.

TimeUnits

Time unit constants for use in annotations.

TimSorter

Sorter implementation based on the TimSort algorithm.

This implementation is especially good at sorting partially-sorted arrays and sorts small arrays with binary sort.

NOTE: There are a few differences with the original implementation:

The extra amount of memory to perform merges is configurable. This allows small merges to be very fast while large merges will be performed in-place (slightly slower). You can make sure that the fast merge routine will always be used by having maxTempSlots equal to half of the length of the slice of data to sort.
Only the fast merge routine can gallop (the one that doesn't run in-place) and it only gallops on the longest slice.

@lucene.internal

ToStringUtils

Helper methods to ease implementing ToString().

UnicodeUtil

Class to encode .NET's UTF16 char[] into UTF8 byte[] without always allocating a new byte[] as Encoding.UTF8.GetBytes(string) does.

@lucene.internal

VirtualMethod

A utility for keeping backwards compatibility on previously abstract methods (or similar replacements).

Before the replacement method can be made abstract, the old method must be kept deprecated. If somebody still overrides the deprecated method in a non-final class, you must keep track of this and maybe delegate to the old method in the subclass. The cost of reflection is minimized by the following usage of this class:

Define static final fields in the base class (BaseClass), where the old and new method are declared:

 static final VirtualMethod<BaseClass> newMethod =
  new VirtualMethod<BaseClass>(BaseClass.class, "newName", parameters...);
 static final VirtualMethod<BaseClass> oldMethod =
  new VirtualMethod<BaseClass>(BaseClass.class, "oldName", parameters...);

This enforces the singleton status of these objects, as the maintenance of the cache would otherwise be too costly. If you try to create a second instance of VirtualMethod for the same method/baseClass combination, an exception is thrown.

To detect if e.g. the old method was overridden by a more far subclass on the inheritance path to the current instance's class, use a non-static field:

 final boolean isDeprecatedMethodOverridden =
  oldMethod.getImplementationDistance(this.getClass()) > newMethod.getImplementationDistance(this.getClass());

 // alternatively (more readable):
 final boolean isDeprecatedMethodOverridden =
  VirtualMethod.compareImplementationDistance(this.getClass(), oldMethod, newMethod) > 0;

GetImplementationDistance(Type) returns the distance of the subclass that overrides this method. The one with the larger distance should be preferred. This way, more complicated method rename scenarios can also be handled (think of 2.9 deprecations).

@lucene.internal

WAH8DocIdSet

DocIdSet implementation based on word-aligned hybrid encoding on words of 8 bits.

This implementation doesn't support random-access but has a fast DocIdSetIterator which can advance in logarithmic time thanks to an index.

The compression scheme is simplistic and should work well with sparse and very dense doc id sets while being only slightly larger than a FixedBitSet for incompressible sets (overhead<2% in the worst case) in spite of the index.

Format: The format is byte-aligned. An 8-bits word is either clean, meaning composed only of zeros or ones, or dirty, meaning that it contains between 1 and 7 bits set. The idea is to encode sequences of clean words using run-length encoding and to leave sequences of dirty words as-is.

Token     Clean length+    Dirty length+    Dirty words
1 byte    0-n bytes        0-n bytes        0-n bytes

Token encodes whether clean means full of zeros or ones in the first bit, the number of clean words minus 2 on the next 3 bits and the number of dirty words on the last 4 bits. The higher-order bit is a continuation bit, meaning that the number is incomplete and needs additional bytes to be read.
Clean length+: If clean length has its higher-order bit set, you need to read a vint (ReadVInt32()), shift it by 3 bits on the left side and add it to the 3 bits which have been read in the token.
Dirty length+ works the same way as Clean length+ but on 4 bits and for the length of dirty words.
Dirty words are the dirty words; there are Dirty length of them.

This format cannot encode sequences of less than 2 clean words and 0 dirty words. The reason is that if you find a single clean word, you should rather encode it as a dirty word. This takes the same space as starting a new sequence (since you need one byte for the token) but will be lighter to decode. There is however an exception for the first sequence. Since the first sequence may start directly with a dirty word, the clean length is encoded directly, without subtracting 2.

There is an additional restriction on the format: the sequence of dirty words is not allowed to contain two consecutive clean words. This restriction exists to make sure no space is wasted and to make sure iterators can read the next doc ID by reading at most 2 dirty words.

@lucene.experimental

WAH8DocIdSet.Builder

A builder for WAH8DocIdSets.

WAH8DocIdSet.WordBuilder

Word-based builder.

WeakIdentityMap<TKey, TValue>

Implements a combination of java.util.WeakHashMap and java.util.IdentityHashMap. Useful for caches that need to key off of a == comparison instead of a .Equals(object).

This class is not a general-purpose implementation! It intentionally violates the general contract of a dictionary, which mandates the use of the Equals method when comparing objects. This class is designed for use only in the rare cases wherein reference-equality semantics are required.

This implementation was forked from Apache CXF but modified to not implement the standard dictionary interface and without any set views on it, as those are error-prone and inefficient if not implemented carefully. The map only contains enumerator implementations over the values and not-GCed keys. Lucene's implementation also supports null keys, but those are never weak!

The map supports two modes of operation:

reapOnRead = true: This behaves identically to a java.util.WeakHashMap where it also cleans up the reference queue on every read operation (Get(Object), ContainsKey(Object), Count, GetValueEnumerator()), freeing map entries of already GCed keys.
reapOnRead = false: This mode does not call Reap() on every read operation. In this case, the reference queue is only cleaned up on write operations (like Put(TKey, TValue)). This is ideal for maps with few entries where the keys are unlikely to be garbage collected, but there are lots of Get(Object) operations. The code can still call Reap() to manually clean up the queue without doing a write operation.

@lucene.internal

Interfaces

IAccountable

An object whose RAM usage can be computed.

@lucene.internal

IAttribute

Base interface for attributes.

IAttributeReflector

This interface is used to reflect contents of AttributeSource or Attribute.

IBits

Interface for Bitset-like structures.

@lucene.experimental

IBytesRefIterator

A simple iterator interface for BytesRef iteration.

IMutableBits

Extension of IBits for live documents.

IServiceListable

LUCENENET specific contract that provides support for AvailableCodecs(), AvailableDocValuesFormats(), and AvailablePostingsFormats(). Implement this interface in addition to ICodecFactory, IDocValuesFormatFactory, or IPostingsFormatFactory to provide optional support for the above methods when providing a custom implementation. If this interface is not supported by the corresponding factory, an exception will be thrown from the above methods.

RollingBuffer.IResettable

Implement to reset an instance.

TestRuleIgnoreTestSuites.NestedTestSuite

Marker interface for nested suites that should be ignored if executed in stand-alone mode.

Enums

LuceneVersion

Used by certain classes to match version compatibility across releases of Lucene.

WARNING: When changing the version parameter that you supply to components in Lucene, do not simply change the version at search-time, but instead also adjust your indexing code to match, and re-index.
