Namespace Lucene.Net.Store
Classes
BaseDirectory
Base implementation for a concrete Directory.
@lucene.experimental
BaseDirectoryWrapper
Calls check index on close.
BufferedChecksum
Wraps another IChecksum with an internal buffer to speed up checksum calculations.
BufferedChecksumIndexInput
Simple implementation of ChecksumIndexInput that wraps another input and delegates calls.
BufferedIndexInput
Base implementation class for buffered IndexInput.
BufferedIndexOutput
Base implementation class for buffered IndexOutput.
ByteArrayDataInput
DataInput backed by a byte array. WARNING: this class omits all low-level checks.
@lucene.experimental
ByteArrayDataOutput
DataOutput backed by a byte array. WARNING: this class omits most low-level checks, so be sure to test heavily with assertions enabled.
@lucene.experimental
ByteBufferIndexInput
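Base IndexInput implementation that uses an array of ByteBuffers to represent a file.
Because Java's ByteBuffer uses an int to address its values, a file larger than int.MaxValue bytes must be accessed through multiple byte buffers.
For efficiency, this class requires that the buffer sizes are a power of two (chunkSizePower).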
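The power-of-two requirement means a file position can be split into a buffer index and an in-buffer offset with a shift and a mask instead of a division. The following is a minimal sketch of that idea only; ChunkedBytes is a hypothetical type that uses plain byte arrays, not the actual ByteBufferIndexInput implementation.

```csharp
using System;

// Hypothetical illustration; not part of Lucene.Net.
public sealed class ChunkedBytes
{
    private readonly byte[][] chunks;
    private readonly int chunkSizePower;
    private readonly long chunkSizeMask;

    public ChunkedBytes(byte[] data, int chunkSizePower)
    {
        if (chunkSizePower < 1 || chunkSizePower > 30)
            throw new ArgumentOutOfRangeException(nameof(chunkSizePower));

        this.chunkSizePower = chunkSizePower;
        this.chunkSizeMask = (1L << chunkSizePower) - 1;

        // Split the input into chunks of 2^chunkSizePower bytes (the last one may be shorter).
        int chunkSize = 1 << chunkSizePower;
        int count = (data.Length + chunkSize - 1) / chunkSize;
        chunks = new byte[count][];
        for (int i = 0; i < count; i++)
        {
            int start = i * chunkSize;
            int length = Math.Min(chunkSize, data.Length - start);
            chunks[i] = new byte[length];
            Array.Copy(data, start, chunks[i], 0, length);
        }
    }

    public byte ReadByte(long position)
    {
        int chunkIndex = (int)(position >> chunkSizePower); // which chunk
        int offset = (int)(position & chunkSizeMask);       // where inside that chunk
        return chunks[chunkIndex][offset];
    }
}
```

With chunkSizePower set to 2 (4-byte chunks), position 9 resolves to chunk 2 at offset 1.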
ChecksumIndexInput
Extension of IndexInput, computing checksum as it goes. Callers can retrieve the checksum via Checksum.
CompoundFileDirectory
Class for accessing a compound stream. This class implements a directory, but is limited to only read operations. Directory methods that would normally modify data throw an exception.
All files belonging to a segment have the same name with varying extensions.
The extensions correspond to the different file formats used by the Codec.
When using the Compound File format these files are collapsed into a single .cfs file (except for the LiveDocsFormat), with a corresponding .cfe file indexing its sub-files.
Files:
- .cfs: An optional "virtual" file consisting of all the other index files for systems that frequently run out of file handles.
- .cfe: The "virtual" compound file's entry table holding all entries in the corresponding .cfs file.
Description:
- Compound (.cfs) --> Header, FileData^FileCount
- Compound Entry Table (.cfe) --> Header, FileCount, <FileName, DataOffset, DataLength>^FileCount, Footer
- Header --> WriteHeader(DataOutput, String, Int32)
- FileCount --> WriteVInt32(Int32)
- DataOffset,DataLength --> WriteInt64(Int64)
- FileName --> WriteString(String)
- FileData --> raw file data
- Footer --> WriteFooter(IndexOutput)
Notes:
- FileCount indicates how many files are contained in this compound file. The entry table that follows has that many entries.
- Each directory entry contains a long pointer to the start of this file's data section, the file's length, and a string with that file's name.
@lucene.experimental
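The relationship between the entry table and the data file can be pictured as a name-to-slice lookup. The sketch below is an in-memory model under stated assumptions: SliceEntry and CompoundReaderSketch are hypothetical names, the byte array stands in for the .cfs file, the dictionary stands in for the parsed .cfe table, and headers and footers are ignored.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical model of one entry-table row: where a sub-file starts and how long it is.
public sealed class SliceEntry
{
    public long Offset { get; }
    public long Length { get; }

    public SliceEntry(long offset, long length)
    {
        Offset = offset;
        Length = length;
    }
}

// Hypothetical read-only view over a compound file; not a Lucene.Net type.
public sealed class CompoundReaderSketch
{
    private readonly byte[] compoundData;                     // stands in for the .cfs contents
    private readonly Dictionary<string, SliceEntry> entries;  // stands in for the .cfe table

    public CompoundReaderSketch(byte[] compoundData, Dictionary<string, SliceEntry> entries)
    {
        this.compoundData = compoundData;
        this.entries = entries;
    }

    // Returns a copy of the bytes belonging to the named sub-file.
    public byte[] ReadFile(string name)
    {
        if (!entries.TryGetValue(name, out SliceEntry entry))
            throw new FileNotFoundException("No sub-file named " + name);

        var result = new byte[entry.Length];
        Array.Copy(compoundData, entry.Offset, result, 0, entry.Length);
        return result;
    }
}
```

Only read access is modelled, which mirrors the read-only nature of the class described above.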
CompoundFileDirectory.FileEntry
Offset/Length for a slice inside of a compound file
DataInput
Abstract base class for performing read operations of Lucene's low-level data types.
DataInput may only be used from one thread, because it is not thread safe (it keeps internal state like file position). To allow multithreaded use, every DataInput instance must be cloned before it is used in another thread. Subclasses must therefore implement Clone(), returning a new DataInput that operates on the same underlying resource but is positioned independently.
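The cloning contract can be illustrated with a small stand-in type. CursorInput below is hypothetical and does not derive from DataInput; it only shows a clone that shares the underlying bytes while keeping its own position.

```csharp
using System;

// Hypothetical reader; illustrates "same resource, independent position".
public sealed class CursorInput : ICloneable
{
    private readonly byte[] data; // shared between the original and its clones
    private int position;         // private to each instance

    public CursorInput(byte[] data)
    {
        this.data = data;
    }

    public int Position => position;

    public byte ReadByte() => data[position++];

    // A shallow copy shares the array reference and copies the current position.
    public object Clone() => MemberwiseClone();
}
```

After cloning, reads on the clone advance only the clone's position, so each thread can work with its own instance.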
DataOutput
Abstract base class for performing write operations of Lucene's low-level data types.
DataOutput may only be used from one thread, because it is not thread safe (it keeps internal state like file position).
Directory
A Directory is a flat list of files. Files may be written once, when they are created. Once a file is created it may only be opened for read, or deleted. Random access is permitted both when reading and writing.
The .NET I/O APIs are not used directly; instead, all I/O goes through this API. This permits things such as:
- implementation of RAM-based indices;
- implementation of indices stored in a database;
- implementation of an index as a single file.
Directory locking is implemented by an instance of LockFactory, and can be changed for each Directory instance using SetLockFactory(LockFactory).
Directory.IndexInputSlicer
Allows creating one or more sliced IndexInput instances from a single file handle. Some Directory implementations may be able to efficiently map slices of a file into memory when only certain parts of a file are required.
@lucene.internal @lucene.experimental
Directory.Status
FileSwitchDirectory
Expert: A Directory instance that switches files between two other Directory instances.
Files with the specified extensions are placed in the primary directory; others are placed in the secondary directory. The provided ISet{string} must not change once passed to this class, and must allow multiple threads to call contains at once.
@lucene.experimental
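The routing rule described above amounts to an extension lookup. The following is a sketch under assumptions: ExtensionRouter is a hypothetical helper, and treating everything after the last '.' as the extension is an assumption about how file names are split.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of Lucene.Net.
public static class ExtensionRouter
{
    // Returns the text after the last '.', or "" when the name has no '.'.
    public static string GetExtension(string name)
    {
        int i = name.LastIndexOf('.');
        return i == -1 ? "" : name.Substring(i + 1);
    }

    // Files whose extension is in the set go to the primary directory, all others to the secondary.
    public static string Choose(string fileName, ISet<string> primaryExtensions)
    {
        return primaryExtensions.Contains(GetExtension(fileName)) ? "primary" : "secondary";
    }
}
```

Because the set is only read, a set that is never modified after construction can be queried from several threads at once, which is the constraint stated above.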
FilterDirectory
Directory implementation that delegates calls to another directory. This class can be used to add limitations on top of an existing Directory implementation, such as rate limiting (RateLimitedDirectoryWrapper), or to add additional sanity checks for tests. However, if you plan to write your own Directory implementation, you should consider directly extending Directory or BaseDirectory rather than trying to reuse the functionality of existing Directory implementations by extending this class.
@lucene.internal
FlushInfo
A FlushInfo provides information required for a FLUSH context. It is used as part of an IOContext in case of FLUSH context.
FSDirectory
Base class for Directory implementations that store index files in the file system.
There are currently three core subclasses:
- SimpleFSDirectory is a straightforward implementation using System.IO.FileStream. However, it has poor concurrent performance (multiple threads will bottleneck) as it synchronizes when multiple threads read from the same file.
- NIOFSDirectory uses java.nio's FileChannel's positional io when reading to avoid synchronization when reading from the same file. Unfortunately, due to a Windows-only Sun JRE bug this is a poor choice for Windows, but on all other platforms this is the preferred choice. Applications using System.Threading.Thread.Interrupt or System.Threading.Tasks.Task should use SimpleFSDirectory instead. See NIOFSDirectory java doc for details.
- MMapDirectory uses memory-mapped IO when reading. This is a good choice if you have plenty of virtual memory relative to your index size, e.g. if you are running on a 64 bit runtime, or you are running on a 32 bit runtime but your index sizes are small enough to fit into the virtual memory space. Applications using System.Threading.Thread.Interrupt or System.Threading.Tasks.Task should use SimpleFSDirectory instead. See MMapDirectory doc for details.
Unfortunately, because of system peculiarities, there is no single overall best implementation. Therefore, we've added the Open(String) method (or one of its overloads), to allow Lucene to choose the best FSDirectory implementation given your environment, and the known limitations of each implementation. For users who have no reason to prefer a specific implementation, it's best to simply use Open(String) (or one of its overloads). For all others, you should instantiate the desired implementation directly.
The locking implementation is by default NativeFSLockFactory, but can be changed by passing in a custom LockFactory instance.
FSDirectory.FSIndexOutput
Writes output with
FSLockFactory
Base class for file system based locking implementation.
IndexInput
Abstract base class for input from a file in a Directory. A random-access input stream. Used for all Lucene index input operations.
IndexInput may only be used from one thread, because it is not thread safe (it keeps internal state like file position). To allow multithreaded use, every IndexInput instance must be cloned before it is used in another thread. Subclasses must therefore implement Clone(), returning a new IndexInput that operates on the same underlying resource but is positioned independently. Lucene never closes cloned IndexInputs; it only closes the original one.
The original instance must take care that cloned instances throw an exception when the original one is closed.
IndexOutput
Abstract base class for output to a file in a Directory. A random-access output stream. Used for all Lucene index output operations.
IndexOutput may only be used from one thread, because it is not thread safe (it keeps internal state like file position).
InputStreamDataInput
A DataInput wrapping a plain Stream.
IOContext
Lock
An interprocess mutex lock.
Typical use might look like:
var result = Lock.With.NewAnonymous<string>(
    @lock: directory.MakeLock("my.lock"),
    lockWaitTimeout: Lock.LOCK_OBTAIN_WAIT_FOREVER,
    doBody: () =>
    {
        //... code to execute while locked ...
        return "the result";
    }).Run();
Lock.With<T>
Utility class for executing code with exclusive access.
LockFactory
Base class for Locking implementation. Directory uses instances of this class to implement locking.
Lucene uses NativeFSLockFactory by default for FSDirectory-based index directories.
Special care needs to be taken if you change the locking implementation: first be certain that no writer is in fact writing to the index; otherwise you can easily corrupt your index. Be sure to make the LockFactory change on all Lucene instances and clean up all leftover lock files before starting the new configuration for the first time. Different implementations cannot work together!
If you suspect that some LockFactory implementation is not working properly in your environment, you can easily test it by using VerifyingLockFactory, LockVerifyServer and LockStressTest.
LockObtainFailedException
This exception is thrown when the write.lock could not be acquired. This happens when a writer tries to open an index that another writer already has open.
LockReleaseFailedException
This exception is thrown when the write.lock could not be released.
LockStressTest
Simple standalone tool that forever acquires & releases a lock using a specific LockFactory. Run without any args to see usage.
LockVerifyServer
Simple standalone server that must be running when you use VerifyingLockFactory. This server simply verifies that at most one process holds the lock at a time. Run without any args to see usage.
MergeInfo
A MergeInfo provides information required for a MERGE context. It is used as part of an IOContext in case of MERGE context.
MMapDirectory
File-based Directory implementation that uses memory-mapped I/O for reading.
NOTE: memory mapping uses up a portion of the
virtual memory address space in your process equal to the
size of the file being mapped. Before using this class,
be sure you have plenty of virtual address space, e.g. by
using a 64 bit runtime, or a 32 bit runtime with indexes that are
guaranteed to fit within the address space.
On 32 bit platforms also consult MMapDirectory(DirectoryInfo, LockFactory, Int32)
if you have problems with mmap failing because of fragmented
address space. If you get an
NOTE: Accessing this class either directly or
indirectly from a thread while it's interrupted can close the
underlying channel immediately if at the same time the thread is
blocked on IO. The channel will remain closed and subsequent access
to MMapDirectory will throw a
MMapDirectory.MMapIndexInput
MockDirectoryWrapper
This is a Directory wrapper that adds methods intended to be used only by unit tests. It also adds a number of features useful for testing:
- Instances created by LuceneTestCase are tracked to ensure they are closed by the test.
- When a MockDirectoryWrapper is closed, it will throw an exception if it has any open files against it (with a stacktrace indicating where they were opened from).
- When a MockDirectoryWrapper is closed, it runs CheckIndex to test if the index was corrupted.
- MockDirectoryWrapper simulates some "features" of Windows, such as refusing to write/delete to open files.
MockDirectoryWrapper.Failure
Objects that represent fail-able conditions. Objects of a derived class are created and registered with the mock directory. After registration, each object will be invoked once for each first write of a file, giving the object a chance to throw a System.IO.IOException.
MockDirectoryWrapper.FakeIOException
Use this when throwing fake
MockIndexInputWrapper
Used by MockDirectoryWrapper to create an input stream that keeps track of when it's been closed.
MockIndexOutputWrapper
Used by MockRAMDirectory to create an output stream that will throw a System.IO.IOException on fake disk full, track max disk space actually used, and maybe throw random System.IO.IOExceptions.
MockLockFactoryWrapper
Used by MockDirectoryWrapper to wrap another factory and track open locks.
NativeFSLockFactory
Implements LockFactory using native OS file locks. For NFS based access to an index, it's recommended that you try SimpleFSLockFactory first and work around the one limitation that a lock file could be left when the runtime exits abnormally.
The primary benefit of NativeFSLockFactory is that locks (not the lock file itself) will be properly removed (by the OS) if the runtime has an abnormal exit.
Note that, unlike SimpleFSLockFactory, the existence of leftover lock files in the filesystem is fine because the OS will free the locks held against these files even though the files still remain. Lucene will never actively remove the lock files, so although you see them, the index may not be locked.
Special care needs to be taken if you change the locking implementation: first be certain that no writer is in fact writing to the index; otherwise you can easily corrupt your index. Be sure to make the LockFactory change on all Lucene instances and clean up all leftover lock files before starting the new configuration for the first time. Different implementations cannot work together!
If you suspect that this or any other LockFactory is not working properly in your environment, you can easily test it by using VerifyingLockFactory, LockVerifyServer and LockStressTest.
NCacheDirectory
Base class for Directory implementations that store index files in NCache.
The locking implementation is by default NativeFSLockFactory, but can be changed by passing in a custom LockFactory instance.
NIOFSDirectory
An FSDirectory implementation that uses
This class only uses
NOTE: NIOFSDirectory is not recommended on Windows because of a bug in how FileChannel.read is implemented in Sun's JRE. Inside of the implementation the position is apparently synchronized. See here for details.
NOTE: Accessing this class either directly or
indirectly from a thread while it's interrupted can close the
underlying file descriptor immediately if at the same time the thread is
blocked on IO. The file descriptor will remain closed and subsequent access
to NIOFSDirectory will throw a
NIOFSDirectory.NIOFSIndexInput
Reads bytes with the Read(FileStream, ByteBuffer, Int64)
extension method for
NoLockFactory
Use this LockFactory to disable locking entirely. Only one instance of this lock is created. You should call GetNoLockFactory() to get the instance.
NRTCachingDirectory
Wraps a RAMDirectory around any provided delegate directory, to be used during NRT search.
This class is likely only useful in a near-real-time context, where indexing rate is lowish but reopen rate is highish, resulting in many tiny files being written. This directory keeps such segments (as well as the segments produced by merging them, as long as they are small enough), in RAM.
This is safe to use: when your app calls Commit(), all cached files will be flushed from the cache and sync'd.
Here's a simple example usage:
Directory fsDir = FSDirectory.Open(new DirectoryInfo("/path/to/index"));
NRTCachingDirectory cachedFSDir = new NRTCachingDirectory(fsDir, 5.0, 60.0);
IndexWriterConfig conf = new IndexWriterConfig(Version.LUCENE_48, analyzer);
IndexWriter writer = new IndexWriter(cachedFSDir, conf);
This will cache all newly flushed segments and all merges whose expected segment size is <= 5 MB, unless the net cached bytes exceed 60 MB, at which point no writes will be cached (until the net bytes fall below 60 MB).
@lucene.experimental
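The caching rule in the example can be read as a simple predicate over the expected segment size and the bytes already cached. The sketch below encodes that reading of the prose only; CachePolicySketch and its parameters are hypothetical, and the library's actual decision logic may differ in detail.

```csharp
using System;

// Hypothetical policy object; models the prose above, not the NRTCachingDirectory source.
public sealed class CachePolicySketch
{
    private readonly long maxMergeSizeBytes;
    private readonly long maxCachedBytes;

    public CachePolicySketch(double maxMergeSizeMB, double maxCachedMB)
    {
        maxMergeSizeBytes = (long)(maxMergeSizeMB * 1024 * 1024);
        maxCachedBytes = (long)(maxCachedMB * 1024 * 1024);
    }

    // isMerge: true for a merge, false for a newly flushed segment.
    public bool ShouldCache(bool isMerge, long expectedSegmentBytes, long currentlyCachedBytes)
    {
        // Nothing is cached once the total would exceed the cache budget.
        if (currentlyCachedBytes + expectedSegmentBytes > maxCachedBytes)
            return false;

        // Merges are cached only when the expected segment is small enough.
        if (isMerge && expectedSegmentBytes > maxMergeSizeBytes)
            return false;

        return true;
    }
}
```

With the values from the example (5.0 and 60.0), a 4 MB merge is cached while a 6 MB merge goes straight to the delegate directory.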
OutputStreamDataOutput
A DataOutput wrapping a plain Stream.
RAMDirectory
A memory-resident Directory implementation. Locking implementation is by default the SingleInstanceLockFactory but can be changed with SetLockFactory(LockFactory).
Warning: This class is not intended to work with huge indexes. Everything beyond several hundred megabytes will waste resources (GC cycles), because it uses an internal buffer size of 1024 bytes, producing millions of byte[1024] arrays. This class is optimized for small memory-resident indexes. It also has bad concurrency on multithreaded environments.
It is recommended to materialize large indexes on disk and use MMapDirectory, which is a high-performance directory implementation working directly on the file system cache of the operating system, so copying data to heap space is not useful.
RAMFile
Represents a file in RAM as a list of byte[] buffers.
@lucene.internal
RAMInputStream
A memory-resident IndexInput implementation.
@lucene.internal
RAMOutputStream
A memory-resident IndexOutput implementation.
@lucene.internal
RateLimitedDirectoryWrapper
A Directory wrapper that allows IndexOutput rate limiting using IO context (IOContext.UsageContext) specific rate limiters (RateLimiter).
@lucene.experimental
RateLimiter
Abstract base class to rate limit IO. Typically implementations are shared across multiple IndexInputs or IndexOutputs (for example, those involved in merging). Those IndexInputs and IndexOutputs would call Pause(Int64) whenever they want to read bytes or write bytes.
RateLimiter.SimpleRateLimiter
Simple class to rate limit IO.
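One way to picture a simple rate limiter is as a running "earliest allowed time" that advances with every batch of bytes. The sketch below is an assumption-laden illustration: PacingSketch is a hypothetical type, the clock is passed in explicitly so the arithmetic can be checked, and it returns the wait time instead of sleeping.

```csharp
using System;

// Hypothetical pacing calculator; not the SimpleRateLimiter source.
public sealed class PacingSketch
{
    private readonly double nanosPerByte;
    private long lastNanos;

    public PacingSketch(double mbPerSec, long startNanos)
    {
        nanosPerByte = 1e9 / (mbPerSec * 1024 * 1024);
        lastNanos = startNanos;
    }

    // Returns how many nanoseconds the caller should wait after accounting for `bytes`.
    public long Pause(long bytes, long nowNanos)
    {
        long target = lastNanos + (long)(bytes * nanosPerByte);

        // If the caller is already behind schedule, restart the budget from now
        // so that idle time does not accumulate into a later burst.
        lastNanos = Math.Max(target, nowNanos);

        return Math.Max(0L, target - nowNanos);
    }
}
```

A caller sharing one instance across several outputs would invoke Pause before each write and sleep for the returned duration.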
SimpleFSDirectory
A straightforward implementation of FSDirectory using System.IO.FileStream.
SimpleFSDirectory.SimpleFSIndexInput
Reads bytes with
SimpleFSLockFactory
Implements LockFactory using
Special care needs to be taken if you change the locking implementation: first be certain that no writer is in fact writing to the index; otherwise you can easily corrupt your index. Be sure to make the LockFactory change on all Lucene instances and clean up all leftover lock files before starting the new configuration for the first time. Different implementations cannot work together!
If you suspect that this or any other LockFactory is not working properly in your environment, you can easily test it by using VerifyingLockFactory, LockVerifyServer and LockStressTest.
SingleInstanceLockFactory
Implements LockFactory for a single in-process instance, meaning all locking will take place through this one instance. Only use this LockFactory when you are certain all IndexReaders and IndexWriters for a given index are running against a single shared in-process Directory instance. This is currently the default locking for RAMDirectory.
SurrogateNCacheDirectory
Base class for Directory implementations that store index files in the file system.
There are currently three core subclasses:
- SimpleFSDirectory is a straightforward implementation using System.IO.FileStream. However, it has poor concurrent performance (multiple threads will bottleneck) as it synchronizes when multiple threads read from the same file.
- NIOFSDirectory uses java.nio's FileChannel's positional io when reading to avoid synchronization when reading from the same file. Unfortunately, due to a Windows-only Sun JRE bug this is a poor choice for Windows, but on all other platforms this is the preferred choice. Applications using System.Threading.Thread.Interrupt or System.Threading.Tasks.Task<TResult> should use SimpleFSDirectory instead. See NIOFSDirectory java doc for details.
- MMapDirectory uses memory-mapped IO when reading. This is a good choice if you have plenty of virtual memory relative to your index size, e.g. if you are running on a 64 bit runtime, or you are running on a 32 bit runtime but your index sizes are small enough to fit into the virtual memory space. Applications using System.Threading.Thread.Interrupt or System.Threading.Tasks.Task should use SimpleFSDirectory instead. See MMapDirectory doc for details.
Unfortunately, because of system peculiarities, there is
no single overall best implementation. Therefore, we've
added the
The locking implementation is by default NativeFSLockFactory, but can be changed by passing in a custom LockFactory instance.
TestHelper
This class provides access to package-level features defined in the store package. It is used for testing only.
TrackingDirectoryWrapper
A delegating Directory that records which files were written to and deleted.
VerifyingLockFactory
A LockFactory that wraps another LockFactory and verifies that each lock obtain/release is "correct" (never results in two processes holding the lock at the same time). It does this by contacting an external server (LockVerifyServer) to assert that at most one process holds the lock at a time. To use this, you should also run LockVerifyServer on the host & port matching what you pass to the constructor.
Interfaces
IDirectoryEventListener
Enums
IOContext.UsageContext
IOContext.UsageContext is an enumeration that specifies the context in which the Directory is being used.
NOTE: This was Context in Lucene.