Class DirectoryTaxonomyWriter
ITaxonomyWriter which uses a Directory to store the taxonomy information on disk, and keeps an additional in-memory cache of some or all categories.
In addition to the permanently stored information in the Directory, efficiency dictates that we also keep an in-memory cache of recently seen or all categories, so that we do not need to go back to disk on every category addition to see which ordinal the category already has, if any. An ITaxonomyWriterCache object determines the specific caching algorithm used.
This class offers some hooks for extending classes to control the IndexWriter instance that is used. See OpenIndexWriter(Directory, IndexWriterConfig).
@lucene.experimental
Inheritance
Assembly: DistributedLucene.Net.Facet.dll
Syntax
public class DirectoryTaxonomyWriter : object, ITaxonomyWriter, IDisposable, ITwoPhaseCommit, IIdentifiableSurrogate
Constructors
Name | Description |
---|---|
DirectoryTaxonomyWriter(Directory) | Creates a new instance opened with CREATE_OR_APPEND. |
DirectoryTaxonomyWriter(Directory, OpenMode) | Creates a new instance with a default cache as defined by DefaultTaxonomyWriterCache(). |
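
As a minimal sketch of how these constructors might be used, the example below opens a writer over an in-memory directory, adds the same category twice, and commits. It assumes the namespaces and types of upstream Lucene.NET 4.8 (`RAMDirectory`, `FacetLabel`, `OpenMode` in `Lucene.Net.Index`); this assembly may expose them differently, and the category names are purely illustrative.

```csharp
using System;
using Lucene.Net.Facet.Taxonomy;            // FacetLabel (assumed namespace)
using Lucene.Net.Facet.Taxonomy.Directory;  // DirectoryTaxonomyWriter (assumed namespace)
using Lucene.Net.Index;                     // OpenMode (assumed namespace)
using Lucene.Net.Store;                     // RAMDirectory (assumed namespace)

public static class TaxonomyWriterExample
{
    public static void Main()
    {
        // Disposal runs in reverse order: the writer first, then the directory.
        using (var dir = new RAMDirectory())
        using (var writer = new DirectoryTaxonomyWriter(dir, OpenMode.CREATE_OR_APPEND))
        {
            // Adding a category returns its ordinal; parent categories are added as needed.
            int first = writer.AddCategory(new FacetLabel("Author", "Lisa"));

            // Adding the same category again should return the ordinal it already has.
            int again = writer.AddCategory(new FacetLabel("Author", "Lisa"));

            // Make the changes visible to readers opened on the directory.
            writer.Commit();

            Console.WriteLine(first == again);
            Console.WriteLine(writer.Count);
        }
    }
}
```

The second `AddCategory` call is the case the in-memory cache is meant to speed up: the ordinal can be answered without going back to the on-disk index.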
Fields
Name | Description |
---|---|
INDEX_EPOCH | Property name of user commit data that contains the index epoch. The epoch changes whenever the taxonomy is recreated (i.e. opened with CREATE). Applications should not use this property in their commit data because it will be overridden by this taxonomy writer. |
Properties
Name | Description |
---|---|
CommitData | |
Count | |
Directory | Returns the Directory of this taxonomy writer. |
TaxonomyEpoch | Expert: returns current index epoch, if this is a near-real-time reader. Used by DirectoryTaxonomyReader to support NRT. @lucene.internal |
Methods
Name | Description |
---|---|
AddCategory(FacetLabel) | |
AddTaxonomy(Directory, DirectoryTaxonomyWriter.IOrdinalMap) | Takes the categories from the given taxonomy directory, and adds the missing ones to this taxonomy. Additionally, it fills the given DirectoryTaxonomyWriter.IOrdinalMap with a mapping from the original ordinal to the new ordinal. |
Commit() | |
CreateIndexWriterConfig(OpenMode) | Creates the IndexWriterConfig that is used for opening the internal index writer. Extensions can configure the IndexWriter as they see fit, including setting a MergeScheduler or IndexDeletionPolicy, a different RAM size, etc. NOTE: internal docids of the configured index must not be altered. For that reason, categories are never deleted from the taxonomy index. In addition, the merge policy in effect must not merge non-adjacent segments. |
DefaultTaxonomyWriterCache() | Defines the default ITaxonomyWriterCache to use in constructors which do not specify one. The current default is Cl2oTaxonomyWriterCache constructed with the parameters (1024, 0.15f, 3), i.e., the entire taxonomy is cached in memory while building it. |
Dispose() | Frees used resources and closes the underlying IndexWriter, which commits whatever changes were made to it to the underlying Directory. |
GetSurrogateId() | |
OpenIndexWriter(Directory, IndexWriterConfig) | Opens the internal index writer, which contains the taxonomy data. Extensions may provide their own IndexWriter implementation or instance. NOTE: the instance this method returns will be disposed when Dispose() is called. NOTE: the merge policy in effect must not merge non-adjacent segments. See the comment in CreateIndexWriterConfig(OpenMode) for the logic behind this. |
PrepareCommit() | Prepares most of the work needed for a two-phase commit. See IndexWriter.PrepareCommit(). |
ReplaceTaxonomy(Directory) | Replaces the current taxonomy with the given one. This method should generally be called in conjunction with AddIndexes(Directory[]) to replace both the taxonomy as well as the search index content. |
Rollback() | Rolls back changes to the taxonomy writer and closes the instance. After this method is called the instance becomes unusable (calling any of its API methods will yield an exception). |
SetCacheMissesUntilFill(Int32) | Sets the number of cache misses before an attempt is made to read the entire taxonomy into the in-memory cache. This taxonomy writer holds an in-memory cache of recently seen categories to speed up operation. On each cache miss, the on-disk index needs to be consulted. When an existing taxonomy is opened, many such slow disk reads are needed until the cache is filled, so it is more efficient to read the entire taxonomy into memory at once. This complete read is done after a certain number (defined by this method) of cache misses. If the number is set to 0, the entire taxonomy is read into memory on first use, without fetching individual categories first. NOTE: it is assumed that this method is called immediately after the taxonomy writer has been created. |
SetCommitData(IDictionary<String, String>) | |
Unlock(Directory) | Forcibly unlocks the taxonomy in the named directory. Caution: this should only be used by failure recovery code, when it is known that no other process nor thread is in fact currently accessing this taxonomy. This method is unnecessary if your Directory uses a NativeFSLockFactory instead of the default SimpleFSLockFactory. When the "native" lock is used, a lock does not stay behind forever when the process using it dies. |
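
The fill-after-N-misses policy described for SetCacheMissesUntilFill(Int32) can be illustrated with a small self-contained sketch. It is not the library's implementation: `FillOnMissCache`, its members, and the dictionary standing in for the on-disk index are all hypothetical, and the exact point at which the real writer triggers the bulk read may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration of "read everything after N cache misses".
public sealed class FillOnMissCache
{
    private readonly IDictionary<string, int> disk;  // stands in for the on-disk taxonomy
    private readonly Dictionary<string, int> cache = new Dictionary<string, int>();
    private readonly int missesUntilFill;
    private int misses;

    public int DiskReads { get; private set; }       // single-category disk lookups so far
    public bool IsFilled { get; private set; }       // true once the bulk read has happened

    public FillOnMissCache(IDictionary<string, int> disk, int missesUntilFill)
    {
        this.disk = disk;
        this.missesUntilFill = missesUntilFill;
    }

    // Returns the ordinal of the category, or -1 if it is unknown.
    public int Find(string category)
    {
        if (cache.TryGetValue(category, out int ordinal)) return ordinal;
        if (IsFilled) return -1;  // the cache is complete, so a miss means "not present"

        misses++;
        if (misses >= missesUntilFill)
        {
            // Threshold reached: load the whole taxonomy in one pass.
            foreach (var entry in disk) cache[entry.Key] = entry.Value;
            IsFilled = true;
            return cache.TryGetValue(category, out ordinal) ? ordinal : -1;
        }

        // Below the threshold: consult the slow store for this one category.
        DiskReads++;
        if (disk.TryGetValue(category, out ordinal))
        {
            cache[category] = ordinal;
            return ordinal;
        }
        return -1;
    }
}

public static class FillOnMissCacheDemo
{
    public static void Main()
    {
        var disk = new Dictionary<string, int> { ["a"] = 1, ["b"] = 2, ["c"] = 3 };
        var cache = new FillOnMissCache(disk, 2);

        Console.WriteLine(cache.Find("a"));   // first miss: one single-category disk read
        Console.WriteLine(cache.Find("b"));   // second miss: triggers the bulk fill
        Console.WriteLine(cache.IsFilled);
    }
}
```

With a threshold of 0 the first miss already triggers the bulk read, which corresponds to reading the entire taxonomy on first use.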