Class IndexReader
IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.
Concrete subclasses of IndexReader are usually constructed with a call to one of the static Open() methods, e.g. Open(Directory, Boolean).
For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.
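As a rough illustration of how document numbers are used, the sketch below walks every document number from 0 up to MaxDoc and skips deleted ones. It relies only on members listed on this page (MaxDoc, IsDeleted(Int32), Document(Int32)); the helper class name `LiveDocuments` is hypothetical, and the `Lucene.Net.Index` and `Lucene.Net.Documents` namespaces are assumed.

```csharp
using System.Collections.Generic;
using Lucene.Net.Documents;
using Lucene.Net.Index;

// Hypothetical helper, not part of Lucene.Net.
public static class LiveDocuments
{
    // Lazily yields the stored fields of every non-deleted document.
    // The document numbers used here are only valid for this reader;
    // they should not be persisted between sessions.
    public static IEnumerable<Document> Enumerate(IndexReader reader)
    {
        int maxDoc = reader.MaxDoc; // one greater than the largest possible document number
        for (int docId = 0; docId < maxDoc; docId++)
        {
            // Document(Int32) does not check for deletion, so check explicitly.
            if (reader.IsDeleted(docId))
            {
                continue;
            }
            yield return reader.Document(docId);
        }
    }
}
```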
An IndexReader can be opened on a directory for which an IndexWriter is already open, but in that case it cannot be used to delete documents from the index.
NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to.
NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static Open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the IsDeleted method. You must explicitly specify false if you want to make changes with the resulting IndexReader.
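A minimal sketch of opening a read-only reader, assuming the caller already has a `Directory` instance (how it is created is outside this page). It uses Open(Directory, Boolean), NumDocs() and the IDisposable implementation shown in the Syntax section; the class and method names are hypothetical, and the `Lucene.Net.Index` and `Lucene.Net.Store` namespaces are assumed.

```csharp
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper, not part of Lucene.Net.
public static class ReadOnlyReaderExample
{
    public static int CountDocuments(Directory directory)
    {
        // readOnly: true, because this code never deletes documents or changes norms.
        using (IndexReader reader = IndexReader.Open(directory, true))
        {
            return reader.NumDocs();
        } // Dispose() closes the files associated with the reader.
    }
}
```

Passing false instead would be needed only if the reader is later used for DeleteDocument, SetNorm or UndeleteAll.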
NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of their methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
Inheritance
Namespace:
Assembly: Lucene.Net.NetCore.dll
Syntax
public abstract class IndexReader : ICloneable, IDisposable
Constructors
Name | Description |
---|---|
IndexReader() |
Fields
Name | Description |
---|---|
DEFAULT_TERMS_INDEX_DIVISOR | |
hasChanges |
Properties
Name | Description |
---|---|
CommitUserData | Retrieve the String userData optionally passed when the index was committed. |
DeletesCacheKey | |
FieldCacheKey | Expert |
HasDeletions | Returns true if any documents have been deleted |
IndexCommit | Expert: return the IndexCommit that this reader has opened. This is only implemented by those readers that correspond to a Directory with its own segments_N file. WARNING: this API is new and experimental and may suddenly change. |
Item[Int32] | Returns the stored fields of the nth Document in this index. NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IsDeleted(Int32) with the requested document ID to verify the document is not deleted. |
MaxDoc | Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index. |
NumDeletedDocs | Returns the number of deleted documents. |
RefCount | Expert: returns the current refCount for this reader |
TermInfosIndexDivisor | For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened. |
UniqueTermCount | Returns the number of unique terms (across all fields) in this reader. This method returns long, even though internally Lucene cannot handle more than 2^31 unique terms, for a possible future when this limitation is removed. |
Version | Version number when this IndexReader was opened. Not implemented in the IndexReader base class. If this reader is based on a Directory (i.e., was created by calling Open(Directory, Boolean), or Reopen() on a reader based on a Directory), then this method returns the version recorded in the commit that the reader opened. This version is advanced every time Commit() is called.
If instead this reader is a near real-time reader (i.e., obtained by a call to GetReader(), or by calling Reopen() on a near real-time reader), then this method returns the version of the last commit done by the writer. Note that even as further changes are made with the writer, the version will not change until a commit is completed. Thus, you should not rely on this method to determine when a near real-time reader should be opened. Use IsCurrent() instead.
|
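The MaxDoc description above mentions sizing a per-document array. A small sketch of that idea, using only the MaxDoc and HasDeletions properties and IsDeleted(Int32); the `LiveDocMask` class is a hypothetical name and the `Lucene.Net.Index` namespace is assumed.

```csharp
using Lucene.Net.Index;

// Hypothetical helper, not part of Lucene.Net.
public static class LiveDocMask
{
    // Returns an array with one slot per document number;
    // live[n] is true when document n has not been deleted.
    public static bool[] Build(IndexReader reader)
    {
        var live = new bool[reader.MaxDoc];
        bool hasDeletions = reader.HasDeletions;
        for (int docId = 0; docId < live.Length; docId++)
        {
            // Skip the per-document check entirely when nothing is deleted.
            live[docId] = !hasDeletions || !reader.IsDeleted(docId);
        }
        return live;
    }
}
```

Because document numbers are ephemeral, such an array is only meaningful for the reader it was built from.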
Methods
Name | Description |
---|---|
AcquireWriteLock() | Does nothing by default. Subclasses that require a write lock for index modifications must implement this method. |
Clone() | Efficiently clones the IndexReader (sharing most internal state). On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned. Like Reopen(), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
|
Clone(Boolean) | Clones the IndexReader and optionally changes readOnly. A readOnly reader cannot open a writeable reader. |
Close() | |
Commit() | Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics). |
Commit(IDictionary<String, String>) | Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics). |
DecRef() | Expert: decreases the refCount of this IndexReader instance. If the refCount drops to 0, then pending changes (if any) are committed to the index and this reader is closed. |
DeleteDocument(Int32) | Deletes the document with the given document number. |
DeleteDocuments(Term) | Deletes all documents that have the given term indexed. |
Directory() | Returns the directory associated with this index. The default implementation returns the directory specified by subclasses when delegating to the IndexReader(Directory) constructor, or throws an UnsupportedOperationException if one was not specified. |
Dispose() | Closes files associated with this index. Also saves any new deletions to disk. No other methods should be called after this has been called. |
Dispose(Boolean) | |
DocFreq(Term) | Returns the number of documents containing the given term. |
DoClose() | Implements close. |
DoCommit(IDictionary<String, String>) | Implements commit. |
Document(Int32) | Returns the stored fields of the nth Document in this index. NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IsDeleted(Int32) with the requested document ID to verify the document is not deleted. |
Document(Int32, FieldSelector) | Get the Document at the nth position. NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call IsDeleted(Int32) with the requested document ID to verify the document is not deleted. |
DoDelete(Int32) | Implements deletion of the document with the given document number. |
DoSetNorm(Int32, String, Byte) | Implements setNorm in subclass. |
DoUndeleteAll() | Implements actual undeleteAll() in subclass. |
EnsureOpen() | |
Flush() | |
Flush(IDictionary<String, String>) | |
GetCommitUserData(Directory) | Reads the commitUserData previously passed when the index in the given Directory was committed. |
GetCurrentVersion(Directory) | Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index. |
GetFieldNames(IndexReader.FieldOption) | Get a list of unique field names that exist in this index and have the specified field option information. |
GetSequentialSubReaders() | Expert: returns the sequential sub readers that this reader is logically composed of. For example, IndexSearcher uses this API to drive searching by one sub reader at a time. If this reader is not composed of sequential child readers, it should return null. If this method returns an empty array, that means this reader is a null reader (for example a MultiReader that has no sub readers). NOTE: You should not try using sub-readers returned by this method to make any changes (setNorm, deleteDocument, etc.). While this might succeed for one composite reader (like MultiReader), it will most likely lead to index corruption for other readers (like DirectoryReader obtained through Open(Directory, Boolean)). Use the parent reader directly. |
GetTermFreqVector(Int32, TermVectorMapper) | Map all the term vectors for all fields in a Document |
GetTermFreqVector(Int32, String) | Return a term frequency vector for the specified document and field. The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If termvectors had been stored with positions or offsets, a TermPositionVector is returned. |
GetTermFreqVector(Int32, String, TermVectorMapper) | Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the ITermFreqVector. |
GetTermFreqVectors(Int32) | Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null. The term vectors that are returned may either be of type ITermFreqVector or of type TermPositionVector if positions or offsets have been stored. |
HasNorms(String) | Returns true if there are norms stored for this field. |
IncRef() | Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding DecRef() in a finally clause; otherwise the reader may never be closed. Note that Close() simply calls DecRef(), which means that the IndexReader will not really be closed until DecRef() has been called for all outstanding references. |
IndexExists(Directory) | Returns true if an index exists in the specified directory. |
IsCurrent() | Check whether any new changes have occurred to the index since this reader was opened.
If this reader is based on a Directory (i.e., was created by calling Open(Directory, Boolean), or Reopen() on a reader based on a Directory), then this method checks if any further commits have occurred in that directory.
If instead this reader is a near real-time reader (i.e., obtained by a call to GetReader(), or by calling Reopen() on a near real-time reader), then this method checks if either a new commit has occurred, or any new uncommitted changes have taken place via the writer. Note that even if the writer has only performed merging, this method will still return false.
In any event, if this returns false, you should call Reopen() to get a new reader that sees the changes.
|
IsDeleted(Int32) | Returns true if document n has been deleted |
IsOptimized() | Checks if the index is optimized (if it has a single segment and no deletions). Not implemented in the IndexReader base class. |
LastModified(Directory) | Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date; use IsCurrent() instead. |
ListCommits(Directory) | Returns all commit points that exist in the Directory. Normally, because the default is KeepOnlyLastCommitDeletionPolicy, there would be only one commit point. But if you're using a custom IndexDeletionPolicy then there could be many commits. Once you have a given commit, you can open a reader on it by calling Open(IndexCommit, Boolean). There must be at least one commit in the Directory, else this method throws an exception. |
Main(String[]) | Prints the filename and size of each file within a given compound file. Add the -extract flag to extract files to the current working directory. In order to make the extracted version of the index work, you have to copy the segments file from the compound index into the directory where the extracted files are stored. |
Norms(String) | Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents. |
Norms(String, Byte[], Int32) | Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents. |
NumDocs() | Returns the number of documents in this index. |
Open(IndexCommit, IndexDeletionPolicy, Boolean) | Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Open(IndexCommit, IndexDeletionPolicy, Boolean, Int32) | Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Open(IndexCommit, Boolean) | Expert: returns an IndexReader reading the index in the given IndexCommit. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Open(Directory, IndexDeletionPolicy, Boolean) | Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Open(Directory, IndexDeletionPolicy, Boolean, Int32) | Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Open(Directory, Boolean) | Returns an IndexReader reading the index in the given Directory. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader. |
Reopen() | Refreshes an IndexReader if the index has changed since this instance was (re)opened. Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.
If the index has not changed since this instance was (re)opened, then this call is a no-op and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.
You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method. If they differ, close the old reader and switch to the new one.
Be sure to synchronize that comparison and swap so that other threads, if present, can never use the old reader after it has been closed and before it has been switched to the new reader. NOTE: If this reader is a near real-time reader (obtained from GetReader()), Reopen() will simply call writer.GetReader() again for you, though this may change in the future. |
Reopen(IndexCommit) | Expert: reopen this reader on a specific commit point. This always returns a readOnly reader. If the specified commit point matches what this reader is already on, and this reader is already readOnly, then this same instance is returned; if it is not already readOnly, a readOnly clone is returned. |
Reopen(Boolean) | Just like Reopen(), except you can change the readOnly of the original reader. If the index is unchanged but readOnly is different then a new reader will be returned. |
SetNorm(Int32, String, Byte) | Expert: Resets the normalization factor for the named field of the named document. The norm represents the product of the field's Boost and its length normalization. NOTE: If this field does not store norms, then this method call will silently do nothing. |
SetNorm(Int32, String, Single) | Expert: Resets the normalization factor for the named field of the named document. |
TermDocs() | Returns an unpositioned TermDocs enumerator. |
TermDocs(Term) | Returns an enumeration of all the documents which contain the given term. The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration. |
TermPositions() | Returns an unpositioned TermPositions enumerator. |
TermPositions(Term) | Returns an enumeration of all the documents which contain the given term, together with the positions of the term within each document. This positional information facilitates phrase and proximity searching. The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration. |
Terms() | Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling Terms(), Next() must be called on the resulting enumeration before calling other methods such as Term. |
Terms(Term) | Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. |
UndeleteAll() | Undeletes all documents currently marked as deleted in this index. |
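The Reopen(), IncRef() and DecRef() descriptions above can be combined into one pattern for sharing a reader between threads while refreshing it. The following is only a sketch under stated assumptions: `SharedReader`, `Acquire`, `Release` and `Refresh` are hypothetical names, the `Lucene.Net.Index` namespace is assumed, and the holder is assumed to own the single reference the reader has when it is handed in. Synchronization uses a private lock object rather than the reader itself, as the thread-safety note recommends.

```csharp
using Lucene.Net.Index;

// Hypothetical holder, not part of Lucene.Net.
public sealed class SharedReader
{
    private readonly object _sync = new object(); // our own lock object, never the reader
    private IndexReader _current;                 // the holder owns one reference to this reader

    public SharedReader(IndexReader initial)
    {
        _current = initial;
    }

    // Borrow the current reader. Every Acquire() must be paired with Release().
    public IndexReader Acquire()
    {
        lock (_sync)
        {
            _current.IncRef();
            return _current;
        }
    }

    // Return a borrowed reader, ideally from a finally clause.
    public static void Release(IndexReader reader)
    {
        reader.DecRef();
    }

    // Swap in a refreshed reader if the index has changed.
    public void Refresh()
    {
        lock (_sync)
        {
            IndexReader newReader = _current.Reopen();
            if (newReader != _current)
            {
                // Drop the holder's reference to the old reader. It is closed
                // only after every borrower has released it as well.
                _current.DecRef();
                _current = newReader;
            }
        }
    }
}
```

A caller would typically wrap its work in try/finally: call Acquire(), use the returned reader, then call Release() in the finally clause so the reference count always returns to its previous value.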