Class SpellChecker
Spell checker class (main class), initially inspired by the David Spencer code.
Example Usage (C#):
SpellChecker spellchecker = new SpellChecker(spellIndexDirectory);
// config is assumed to be an IndexWriterConfig you have already created;
// the final argument controls whether the spell check index is fully merged.
// To index a field of a user index:
spellchecker.IndexDictionary(new LuceneDictionary(my_lucene_reader, a_field), config, false);
// To index a file containing words:
spellchecker.IndexDictionary(new PlainTextDictionary(new FileInfo("myfile.txt")), config, false);
string[] suggestions = spellchecker.SuggestSimilar("misspelt", 5);
Inheritance
Assembly: Lucene.Net.Suggest.dll
Syntax
public class SpellChecker : IDisposable
Constructors
Name | Description |
---|---|
SpellChecker(Store.Directory) | Use the given directory as a spell checker index with a LevensteinDistance as the default StringDistance. The directory is created if it doesn't exist yet. |
SpellChecker(Store.Directory, IStringDistance) | Use the given directory as a spell checker index. The directory is created if it doesn't exist yet. |
SpellChecker(Store.Directory, IStringDistance, IComparer<SuggestWord>) | Use the given directory as a spell checker index with the given IStringDistance measure and the given IComparer<SuggestWord>. |
Fields
Name | Description |
---|---|
DEFAULT_ACCURACY | The default minimum score to use, if not specified by setting Accuracy or overriding with SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single). |
F_WORD | Field name for each word in the ngram index. |
Properties
Name | Description |
---|---|
Accuracy | Gets or sets the accuracy (minimum score) used to decide whether a suggestion is included, unless overridden in SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single). The value must satisfy 0 < minScore < 1; the default is DEFAULT_ACCURACY. |
Comparer | Gets or sets the IComparer<SuggestWord> for this SpellChecker instance. |
StringDistance | Gets or sets the IStringDistance implementation for this SpellChecker instance. |
Methods
Name | Description |
---|---|
ClearIndex() | Removes all terms from the spell check index. |
Dispose() | Disposes the underlying IndexSearcher used by this SpellChecker. |
Exist(String) | Check whether the word exists in the index. |
IndexDictionary(IDictionary, IndexWriterConfig, Boolean) | Indexes the data from the given IDictionary. |
SetSpellIndex(Store.Directory) | Sets a different index as the spell checker index, or re-opens the existing index if the value is the same as the one given in the constructor. |
SuggestSimilar(String, Int32) | Suggests similar words. The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching spell-checked word from the hits that Lucene found, so you usually have to request several suggestions (numSug) to get the true best match. For example, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion. |
SuggestSimilar(String, Int32, IndexReader, String, SuggestMode) | Calls SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single) as SuggestSimilar(word, numSug, ir, field, suggestMode, this.accuracy). |
SuggestSimilar(String, Int32, IndexReader, String, SuggestMode, Single) | Suggests similar words (optionally restricted to a field of an index). The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching spell-checked word from the hits that Lucene found, so you usually have to request several suggestions (numSug) to get the true best match. For example, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion. |
SuggestSimilar(String, Int32, Single) | Suggests similar words. The Lucene similarity used to fetch the most relevant n-grammed terms is not the same as the edit distance strategy used to pick the best matching spell-checked word from the hits that Lucene found, so you usually have to request several suggestions (numSug) to get the true best match. For example, if numSug == 1, don't count on that suggestion being the best one. Set this value to at least 5 for a good suggestion. |
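
The SuggestSimilar descriptions above distinguish between fetching candidate terms and re-scoring them by edit distance against a minimum score. The following is a minimal, self-contained sketch of that second step only. It does not use Lucene.Net: the class `SuggestionRerankSketch` and its methods are hypothetical names introduced for illustration, and the similarity formula (1 minus the edit distance divided by the longer word's length) is an assumption about how a normalized edit-distance measure is commonly defined, not a statement of what SpellChecker does internally.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration; not part of Lucene.Net.
public static class SuggestionRerankSketch
{
    // Classic dynamic-programming Levenshtein edit distance (two rolling rows).
    public static int EditDistance(string a, string b)
    {
        var prev = new int[b.Length + 1];
        var curr = new int[b.Length + 1];
        for (int j = 0; j <= b.Length; j++) prev[j] = j;

        for (int i = 1; i <= a.Length; i++)
        {
            curr[0] = i;
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;
                curr[j] = Math.Min(Math.Min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            var tmp = prev; prev = curr; curr = tmp;
        }
        return prev[b.Length];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    public static float Similarity(string a, string b)
    {
        int max = Math.Max(a.Length, b.Length);
        if (max == 0) return 1f;
        return 1f - (float)EditDistance(a, b) / max;
    }

    // Re-scores candidates, drops those below minScore, keeps the best numSug.
    public static string[] Rerank(string word, IEnumerable<string> candidates, int numSug, float minScore)
    {
        return candidates
            .Select(c => new { Word = c, Score = Similarity(word, c) })
            .Where(x => x.Score >= minScore)
            .OrderByDescending(x => x.Score)
            .Take(numSug)
            .Select(x => x.Word)
            .ToArray();
    }
}
```

Under these assumptions, a candidate that ranks first in the fetch order can still lose to a later candidate once edit distance is applied, which is why asking for only one suggestion may miss the best match. A call such as `SuggestionRerankSketch.Rerank("misspelt", new[] { "mister", "misspelled", "misspell" }, 2, 0.6f)` re-orders the candidates by similarity and discards any that fall below the minimum score.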