Interface ITermToBytesRefAttribute
This attribute is requested by TermsHashPerField to index the contents. This attribute can be used to customize the final byte[] encoding of terms.
Consumers of this attribute retrieve the BytesRef property once up front, and then invoke FillBytesRef() for each term. Example:
ITermToBytesRefAttribute termAtt = tokenStream.GetAttribute<ITermToBytesRefAttribute>();
BytesRef bytes = termAtt.BytesRef;
while (tokenStream.IncrementToken())
{
    // You must call termAtt.FillBytesRef() before doing something with the bytes.
    // This encodes the term value (internally it might be a char[], etc.) into the bytes.
    int hashCode = termAtt.FillBytesRef();
    if (IsInteresting(bytes))
    {
        // Because the bytes are reused by the attribute (like CharTermAttribute's char[] buffer),
        // you should make a copy if you need persistent access to the bytes; otherwise they will
        // be rewritten across calls to IncrementToken().
        DoSomethingWith(new BytesRef(bytes));
    }
}
...
This API is experimental. It is a very expert API; for UTF-8 terms, please use CharTermAttribute and its implementation of FillBytesRef().
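The buffer-reuse warning in the example comments can be illustrated without any Lucene dependency. The following is a minimal sketch, not the library's implementation: `ByteSlice`, `ReusingTermSource`, and `ReuseDemo` are hypothetical stand-ins invented for this illustration (a BytesRef-like view and an attribute-like source that refills one shared buffer). It only shows why keeping the shared reference loses earlier terms, while keeping a copy does not.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for a BytesRef-like view: a byte[] plus a valid length.
public sealed class ByteSlice
{
    public byte[] Bytes = new byte[16];
    public int Length;

    // Deep copy: a snapshot that later refills cannot touch.
    public ByteSlice Snapshot()
    {
        var copy = new ByteSlice { Bytes = new byte[Length], Length = Length };
        Array.Copy(Bytes, copy.Bytes, Length);
        return copy;
    }

    public override string ToString() => Encoding.UTF8.GetString(Bytes, 0, Length);
}

// Hypothetical stand-in for the attribute: one shared slice, refilled per term.
public sealed class ReusingTermSource
{
    public ByteSlice Slice { get; } = new ByteSlice();

    // Encodes the term as UTF-8 into the shared slice, overwriting the previous term.
    public void Fill(string term)
    {
        int needed = Encoding.UTF8.GetByteCount(term);
        if (Slice.Bytes.Length < needed) Slice.Bytes = new byte[needed];
        Slice.Length = Encoding.UTF8.GetBytes(term, 0, term.Length, Slice.Bytes, 0);
    }
}

public static class ReuseDemo
{
    // Collects each term either as a snapshot (copy == true) or as the shared reference.
    public static List<string> Collect(bool copy)
    {
        var source = new ReusingTermSource();
        ByteSlice shared = source.Slice;   // fetched once, up front
        var kept = new List<ByteSlice>();
        foreach (string term in new[] { "alpha", "beta" })
        {
            source.Fill(term);             // refills the same shared slice
            kept.Add(copy ? shared.Snapshot() : shared);
        }

        var result = new List<string>();
        foreach (ByteSlice s in kept) result.Add(s.ToString());
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Collect(copy: false))); // prints "beta,beta"
        Console.WriteLine(string.Join(",", Collect(copy: true)));  // prints "alpha,beta"
    }
}
```

Without the copy, both kept entries point at the same slice, so only the last term survives. The same reasoning is why the example above passes a copy to DoSomethingWith rather than the shared bytes.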
Assembly: DistributedLucene.Net.dll
Syntax
public interface ITermToBytesRefAttribute : IAttribute
Properties
Name | Description |
---|---|
BytesRef | Retrieve this attribute's BytesRef. The bytes are updated from the current term when the consumer calls FillBytesRef(). |
Methods
Name | Description |
---|---|
FillBytesRef() | Updates the bytes of this attribute's BytesRef to contain the current term's final encoding. |