Class StandardAnalyzer

Filters StandardTokenizer with StandardFilter, LowerCaseFilter and StopFilter, using a list of English stop words.

You must specify the required LuceneVersion compatibility when creating StandardAnalyzer:

As of 3.4, Hiragana and Han characters are no longer wrongly split from their combining characters. If you specify an earlier version number, the earlier (broken) behavior is kept for backwards compatibility.
As of 3.1, StandardTokenizer implements Unicode text segmentation, and StopFilter correctly handles Unicode 4.0 supplementary characters in stopwords. ClassicTokenizer and ClassicAnalyzer are the pre-3.1 implementations of StandardTokenizer and StandardAnalyzer.
As of 2.9, StopFilter preserves position increments.
As of 2.4, tokens incorrectly identified as acronyms are corrected (see LUCENE-1068).
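
The version argument and the filter chain described above can be seen in a short sketch. It assumes the Lucene.NET 4.8 packages are referenced and uses LuceneVersion.LUCENE_48; the class name StandardAnalyzerDemo, the helper Tokenize, and the field name "body" are hypothetical names chosen for illustration only.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class StandardAnalyzerDemo
{
    // Hypothetical helper: runs text through StandardAnalyzer and collects the terms.
    public static List<string> Tokenize(string text)
    {
        var terms = new List<string>();

        // The LuceneVersion argument selects the compatibility behavior listed above.
        using (var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48))
        using (TokenStream stream = analyzer.GetTokenStream("body", text))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();

            stream.Reset();                       // required before the first IncrementToken()
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());       // copy the term; the attribute is reused
            }
            stream.End();                         // finalize end-of-stream state
        }

        return terms;
    }

    public static void Main()
    {
        foreach (string t in Tokenize("The Quick Brown Fox"))
        {
            Console.WriteLine(t);
        }
    }
}
```

With the default stop words, "The" should be dropped and the remaining terms lowercased, since LowerCaseFilter and StopFilter both run after the tokenizer.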

Inheritance

System.Object

Analyzer

StopwordAnalyzerBase

StandardAnalyzer

Inherited Members

StopwordAnalyzerBase.m_stopwords

StopwordAnalyzerBase.m_matchVersion

StopwordAnalyzerBase.StopwordSet

StopwordAnalyzerBase.LoadStopwordSet(Boolean, Type, String, String)

StopwordAnalyzerBase.LoadStopwordSet(FileInfo, LuceneVersion)

StopwordAnalyzerBase.LoadStopwordSet(TextReader, LuceneVersion)

Lucene.Net.Analysis.Analyzer.NewAnonymous(System.Func<System.String, System.IO.TextReader, Lucene.Net.Analysis.TokenStreamComponents>)

Lucene.Net.Analysis.Analyzer.NewAnonymous(System.Func<System.String, System.IO.TextReader, Lucene.Net.Analysis.TokenStreamComponents>, Lucene.Net.Analysis.ReuseStrategy)

Lucene.Net.Analysis.Analyzer.NewAnonymous(System.Func<System.String, System.IO.TextReader, Lucene.Net.Analysis.TokenStreamComponents>, System.Func<System.String, System.IO.TextReader, System.IO.TextReader>)

Lucene.Net.Analysis.Analyzer.GetTokenStream(System.String, System.IO.TextReader)

Analyzer.GetTokenStream(String, String)

Lucene.Net.Analysis.Analyzer.InitReader(System.String, System.IO.TextReader)

Analyzer.GetPositionIncrementGap(String)

Analyzer.GetOffsetGap(String)

Analyzer.Strategy

Analyzer.Dispose()

Lucene.Net.Analysis.Analyzer.GetObjectData(System.Runtime.Serialization.SerializationInfo, System.Runtime.Serialization.StreamingContext)

Analyzer.GLOBAL_REUSE_STRATEGY

Analyzer.PER_FIELD_REUSE_STRATEGY

System.Object.ToString()

System.Object.Equals(System.Object)

System.Object.Equals(System.Object, System.Object)

System.Object.ReferenceEquals(System.Object, System.Object)

System.Object.GetHashCode()

System.Object.GetType()

System.Object.MemberwiseClone()

Assembly: Lucene.Net.Analysis.Common.dll

Syntax

[Serializable]
public sealed class StandardAnalyzer : StopwordAnalyzerBase, IDisposable

Constructors

Name	Description
StandardAnalyzer(LuceneVersion)	Builds an analyzer with the default stop words (STOP_WORDS_SET).
StandardAnalyzer(LuceneVersion, CharArraySet)	Builds an analyzer with the given stop words.
StandardAnalyzer(LuceneVersion, TextReader)	Builds an analyzer with the stop words from the given reader.
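
As an illustration of the second constructor, the sketch below supplies a custom stop word set instead of STOP_WORDS_SET. It assumes Lucene.NET 4.8 and the CharArraySet constructor that takes a version, a collection of strings, and an ignore-case flag; the class name CustomStopWordsDemo, the helper Tokenize, the field name "body", and the stop word "lucene" are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Analysis.Util;
using Lucene.Net.Util;

public static class CustomStopWordsDemo
{
    // Hypothetical helper: analyzes text with a caller-supplied stop word list.
    public static List<string> Tokenize(string text, IEnumerable<string> stopWords)
    {
        const LuceneVersion version = LuceneVersion.LUCENE_48;

        // ignoreCase: true, so the set matches regardless of how entries are cased.
        var stopSet = new CharArraySet(version, stopWords, true);

        var terms = new List<string>();
        using (var analyzer = new StandardAnalyzer(version, stopSet))
        using (TokenStream stream = analyzer.GetTokenStream("body", text))
        {
            ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
            stream.Reset();
            while (stream.IncrementToken())
            {
                terms.Add(term.ToString());
            }
            stream.End();
        }
        return terms;
    }

    public static void Main()
    {
        // Only "lucene" is a stop word here; the default English list is not applied.
        foreach (string t in Tokenize("Lucene is the best", new[] { "lucene" }))
        {
            Console.WriteLine(t);
        }
    }
}
```

Because the custom set replaces the default list rather than extending it, common English words such as "is" and "the" pass through unless they are included explicitly.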

Fields

Name	Description
DEFAULT_MAX_TOKEN_LENGTH	Default maximum allowed token length.
STOP_WORDS_SET	An unmodifiable set containing some common English words that are usually not useful for searching.

Properties

Name	Description
MaxTokenLength	Sets the maximum allowed token length. A token that exceeds this length is discarded. This setting only takes effect the next time GetTokenStream(String, TextReader) or GetTokenStream(String, String) is called.
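
A minimal sketch of how MaxTokenLength might be used, assuming Lucene.NET 4.8 and that the property is set before the token stream is requested. The class name MaxTokenLengthDemo, the helper Tokenize, the field name "body", and the limit of 5 are hypothetical values for illustration.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

public static class MaxTokenLengthDemo
{
    // Hypothetical helper: analyzes text with a caller-chosen token length limit.
    public static List<string> Tokenize(string text, int maxTokenLength)
    {
        var terms = new List<string>();
        using (var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48))
        {
            // Set the limit first; it applies to streams obtained afterwards.
            analyzer.MaxTokenLength = maxTokenLength;

            using (TokenStream stream = analyzer.GetTokenStream("body", text))
            {
                ICharTermAttribute term = stream.AddAttribute<ICharTermAttribute>();
                stream.Reset();
                while (stream.IncrementToken())
                {
                    terms.Add(term.ToString());
                }
                stream.End();
            }
        }
        return terms;
    }

    public static void Main()
    {
        // "enormousword" is longer than 5 characters and is expected to be discarded.
        foreach (string t in Tokenize("tiny enormousword ok", 5))
        {
            Console.WriteLine(t);
        }
    }
}
```

Setting the property on an analyzer whose stream is already being consumed would not affect that stream, which is why the sketch assigns it before calling GetTokenStream.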

Methods

Name	Description
CreateComponents(String, TextReader)

Extension Methods

Number.IsNumber(Object)

SystemTypesHelpers.toString(Object)

SystemTypesHelpers.equals(Object, Object)