NCache makes Lucene distributed and scalable by implementing the Lucene API on top of NCache’s distributed architecture. NCache also supports another Lucene feature: GeoSpatial Indexes. Let’s see how to use GeoSpatial Indexes with Distributed Lucene.
Full-Text Searching with Distributed Lucene is divided into two phases: indexing and searching. In the indexing phase, an analyzer creates indexes from some text. Then, the searching phase only uses those indexes.
Since we’re working with GeoSpatial Indexes, we want to index longitude and latitude coordinates in our documents and then search data based on those stored locations.
How to use GeoSpatial Indexes with Distributed Lucene
To index documents with geo-spatial coordinates and perform location-based searching, Distributed Lucene uses Spatial4n, “a GeoSpatial Library for .NET.”
Before starting our sample application, we should have NCache installed, and a Distributed Lucene cache already created. To learn how to configure a Distributed Lucene cache, check Create a Distributed Lucene with Persistence Cache.
1. Index Some Landmarks
Let’s use GeoSpatial Indexes to store our favorite landmarks from our past travels.
First, let’s create a Console application and install the NuGet package Lucene.Net.Spatial.NCache.
In the Program.cs file, let’s index some landmarks around Paris. Each landmark has a name, longitude, and latitude. Something like this,
```csharp
using DistributedLucene.Net.Spatial;
using Landmarks;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Spatial.Prefix;
using Lucene.Net.Spatial.Prefix.Tree;
using Lucene.Net.Store;
using Lucene.Net.Util;
using Spatial4n.Core.Context;

const LuceneVersion LuceneVersion = LuceneVersion.LUCENE_48;
const string CacheName = "DemoLuceneCache";
const string IndexName = "landmarks";
const string NameFieldName = "name";
const string LocationFieldName = "landmarkLocation";

// 1. Let's index some locations around Paris
var favoriteLandmarks = new Landmark[]
{
    new Landmark("Eiffel Tower", new Location(48.858093, 2.294694)),
    new Landmark("Sacre Coeur", new Location(48.886452, 2.343121)),
    new Landmark("Louvre Museum", new Location(48.860294, 2.338629)),
    new Landmark("Palace of Versailles", new Location(48.804722, 2.121782)),
    new Landmark("Disneyland Paris", new Location(48.867374, 2.784018)),
    new Landmark("Arc de Triomphe", new Location(48.873756, 2.294946))
};
IndexLandmarks(favoriteLandmarks);

// Later, we will search all landmarks close to a reference point here...

static void IndexLandmarks(IEnumerable<Landmark> landmarks)
{
    // Create a SpatialStrategy
    var context = SpatialContext.GEO;
    var strategy = new RecursivePrefixTreeStrategy(
        new GeohashPrefixTree(context, maxLevels: 11),
        fieldName: LocationFieldName);

    // Open a directory
    using var indexDirectory = NCacheDirectory.Open(CacheName, IndexName);

    var config = new IndexWriterConfig(LuceneVersion, new StandardAnalyzer(LuceneVersion))
    {
        OpenMode = OpenMode.CREATE
    };

    // Create a writer
    using var writer = new IndexWriter(indexDirectory, config);

    foreach (var landmark in landmarks)
    {
        // Create a SpatialDocument from our Landmark
        var document = landmark.ToSpatialDocument(strategy);

        // Add a document
        writer.AddDocument(document, strategy);
    }

    // Write all documents
    writer.Commit();
}

public record Location(double Latitude, double Longitude);
public record Landmark(string Name, Location Position);
```
If you’re familiar with Full-Text Indexing with Distributed Lucene, creating GeoSpatial Indexes follows the same steps. We have to open an NCache directory, create a writer, and add documents to the writer.
But, instead of using analyzers to index text, we need a SpatialStrategy. A strategy turns points and shapes into indexable fields.
To index our landmarks, we used RecursivePrefixTreeStrategy. This strategy supports searching for non-point shapes. To create it, we used a Geohash-based tree and a field name. We will use the same field name later to create documents.
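To get a feel for what a Geohash-based tree works with, here is a minimal sketch of the standard geohash encoding in plain C#. It is not the Lucene.NET or Spatial4n implementation; EncodeGeohash is a hypothetical helper written only for illustration, and linking maxLevels: 11 to an 11-character hash is an assumption.

```csharp
using System;
using System.Text;

// Hypothetical helper (not part of Lucene.NET or Spatial4n):
// encodes a latitude/longitude pair as a geohash of the given length.
static string EncodeGeohash(double latitude, double longitude, int length)
{
    const string Base32 = "0123456789bcdefghjkmnpqrstuvwxyz";
    double latMin = -90, latMax = 90, lonMin = -180, lonMax = 180;
    var hash = new StringBuilder(length);
    bool useLongitude = true; // bits alternate, starting with longitude
    int bitCount = 0, value = 0;

    while (hash.Length < length)
    {
        if (useLongitude)
        {
            // Halve the longitude range and record which half holds the point
            double mid = (lonMin + lonMax) / 2;
            if (longitude >= mid) { value = (value << 1) | 1; lonMin = mid; }
            else { value <<= 1; lonMax = mid; }
        }
        else
        {
            // Same for the latitude range
            double mid = (latMin + latMax) / 2;
            if (latitude >= mid) { value = (value << 1) | 1; latMin = mid; }
            else { value <<= 1; latMax = mid; }
        }
        useLongitude = !useLongitude;

        // Every five bits become one base-32 character
        if (++bitCount == 5)
        {
            hash.Append(Base32[value]);
            bitCount = 0;
            value = 0;
        }
    }
    return hash.ToString();
}

// Two landmarks from our list, encoded with 11 characters
Console.WriteLine(EncodeGeohash(48.858093, 2.294694, 11)); // Eiffel Tower
Console.WriteLine(EncodeGeohash(48.873756, 2.294946, 11)); // Arc de Triomphe
```

Each extra character narrows the cell, so nearby points share a common prefix; both landmarks above start with u09. A prefix tree can exploit that by indexing a shape under the prefixes of the cells it covers.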
Distributed Lucene introduces a new type of document: SpatialDocument. It’s a Lucene document with some shapes attached, such as a point, rectangle, or circle.
This is the ToSpatialDocument() extension method we used to convert our landmarks into SpatialDocument instances,
```csharp
using Lucene.Net.Documents;
using Lucene.Net.Spatial;
using Spatial4n.Core.Context;
using static System.FormattableString;
using Document = Lucene.Net.Documents.Document;

namespace Landmarks;

public static class LocationExtensions
{
    public static SpatialDocument ToSpatialDocument(this Landmark landmark, SpatialStrategy strategy)
    {
        // Store the landmark name as a regular string field
        var document = new Document
        {
            new StringField("name", landmark.Name, Field.Store.YES)
        };

        // MakePoint() takes x (longitude) first, then y (latitude)
        var point = SpatialContext.GEO.MakePoint(landmark.Position.Longitude, landmark.Position.Latitude);

        // Store the location as "x y" so we can read it back after searching
        document.Add(new StoredField(strategy.FieldName, Invariant($"{point.X} {point.Y}")));

        return new SpatialDocument
        {
            Document = document,
            Shapes = new[] { point }
        };
    }
}
```
Notice that we stored the landmark name in a string field and a string representation of the landmark location in another field named after the SpatialStrategy field name. Then, we created a SpatialDocument with a regular Lucene document and a point for the landmark location.
To create points, we used the SpatialContext.GEO factory instead of using constructors directly. Its MakePoint() method takes the longitude (x) first and the latitude (y) second.
2. Search Closest Landmarks
Now that we have indexed our favorite landmarks, let’s find up to five of them within 30 kilometers of the Paris airport.
Let’s update the Program.cs file to include a SearchAround() method,
```csharp
using DistributedLucene.Net.Spatial;
using Landmarks;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Spatial;
using Lucene.Net.Spatial.Prefix;
using Lucene.Net.Spatial.Prefix.Tree;
using Lucene.Net.Spatial.Queries;
using Lucene.Net.Store;
using Lucene.Net.Util;
using Spatial4n.Core.Context;
using Spatial4n.Core.Distance;

const LuceneVersion LuceneVersion = LuceneVersion.LUCENE_48;
const string CacheName = "DemoLuceneCache";
const string IndexName = "landmarks";
const string NameFieldName = "name";
const string LocationFieldName = "landmarkLocation";

// 2. Let's find five landmarks, 30Km around the airport
// and print them
var airport = new Landmark("Charles de Gaulle Airport", new Location(49.009724, 2.547778));
SearchAround(5, airport, 30);

static void SearchAround(int landmarkCount, Landmark referenceLandmark, int distanceInKm)
{
    // Create a reader
    using var indexDirectory = NCacheDirectory.Open(CacheName, IndexName);
    using var reader = DirectoryReader.Open(indexDirectory);
    var searcher = new IndexSearcher(reader);

    // ToPoint() is an extension method that is not listed in this post
    var startingPoint = referenceLandmark.ToPoint();
    var context = SpatialContext.GEO;
    var sortByName = new Sort(new SortField(NameFieldName, SortFieldType.STRING));

    // Create a circle of 30km around a reference point
    var spatialArgs = new SpatialArgs(
        SpatialOperation.Intersects,
        context.MakeCircle(startingPoint, DistanceUtils.Dist2Degrees(distanceInKm, DistanceUtils.EARTH_MEAN_RADIUS_KM)));

    var strategy = new RecursivePrefixTreeStrategy(
        new GeohashPrefixTree(context, maxLevels: 11),
        fieldName: LocationFieldName);

    // Create a filter using the same strategy
    var filter = strategy.MakeFilter(spatialArgs);

    // Search documents
    var documents = searcher.Search(new MatchAllDocsQuery(), filter, landmarkCount, sortByName);
    foreach (var scoreDoc in documents.ScoreDocs)
    {
        var document = searcher.Doc(scoreDoc.Doc);

        // Create a result tuple
        var (name, awayInKm) = document.ToResponse(startingPoint);

        Console.WriteLine($"Name: {name}");
        Console.WriteLine($"Distance: {awayInKm}");
    }
}
```
Instead of searching for documents that contain some keywords, with GeoSpatial Indexes, we perform searches based on their positions.
To find all landmarks close to the airport, we passed a match-all query, a filter, a count, and a sort order to the Search() method. We created the filter from the same SpatialStrategy that we used for indexing. And we built a SpatialArgs to query all documents inside a circle centered at a starting point, with the radius converted from kilometers to degrees. Like this,
```csharp
var spatialArgs = new SpatialArgs(
    SpatialOperation.Intersects,
    context.MakeCircle(startingPoint, DistanceUtils.Dist2Degrees(distanceInKm, DistanceUtils.EARTH_MEAN_RADIUS_KM)));
```
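To make the unit conversion in that circle concrete, here is the arithmetic worked out for our 30 km radius. It assumes that Dist2Degrees() divides the distance by the sphere radius to get an angle in radians and then converts it to degrees, and that EARTH_MEAN_RADIUS_KM is 6371.0087714 km. Both are assumptions about Spatial4n worth checking against its documentation.

```latex
\theta_{\text{deg}}
  = \frac{d}{R} \cdot \frac{180}{\pi}
  = \frac{30}{6371.0087714} \cdot \frac{180}{\pi}
  \approx 0.2698^{\circ}
```

Under those assumptions, the circle passed to MakeCircle() has a radius of roughly 0.27 degrees.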
Then, we created a response object from each found document and calculated the distance from the starting point. Like this,
```csharp
using Spatial4n.Core.Context;
using Spatial4n.Core.Distance;
using Spatial4n.Core.Shapes;
using System.Globalization;
using Document = Lucene.Net.Documents.Document;

namespace Landmarks;

public static class LocationExtensions
{
    public static (string Name, double DistanceInKm) ToResponse(this Document document, IPoint startingPoint)
    {
        var name = document.GetField("name").GetStringValue();
        var location = document.GetField("landmarkLocation").GetStringValue();

        var positions = location.Split(' ');
        var x = double.Parse(positions[0], CultureInfo.InvariantCulture);
        var y = double.Parse(positions[1], CultureInfo.InvariantCulture);

        var distanceInDeg = SpatialContext.GEO.CalcDistance(startingPoint, x, y);
        var distanceInKm = distanceInDeg * DistanceUtils.DEG_TO_KM;

        return (name, distanceInKm);
    }
}
```
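Notice that we parsed the stored landmark location and passed it, along with the starting point, to CalcDistance(). Since CalcDistance() returns a distance in degrees, we multiplied it by DistanceUtils.DEG_TO_KM to get kilometers.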
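To sanity-check the distances without a cache running, here is a minimal sketch of the haversine formula in plain C#. It assumes that the geographic context measures great-circle distance with haversine by default and that the mean Earth radius is 6371.0087714 km; both are assumptions, and HaversineKm is a hypothetical helper, not a Spatial4n API.

```csharp
using System;

// Hypothetical helper (not part of Spatial4n): great-circle distance
// in kilometers between two latitude/longitude pairs given in degrees.
static double HaversineKm(double lat1, double lon1, double lat2, double lon2)
{
    // Assumed mean Earth radius, in kilometers
    const double EarthMeanRadiusKm = 6371.0087714;

    static double ToRadians(double degrees) => degrees * Math.PI / 180.0;

    double dLat = ToRadians(lat2 - lat1);
    double dLon = ToRadians(lon2 - lon1);

    double a = Math.Pow(Math.Sin(dLat / 2), 2)
             + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
             * Math.Pow(Math.Sin(dLon / 2), 2);

    // Central angle in radians; clamp to guard against rounding above 1
    double centralAngle = 2 * Math.Asin(Math.Min(1.0, Math.Sqrt(a)));

    return centralAngle * EarthMeanRadiusKm;
}

// Airport coordinates from the search example
const double AirportLat = 49.009724, AirportLon = 2.547778;

Console.WriteLine($"Eiffel Tower: {HaversineKm(AirportLat, AirportLon, 48.858093, 2.294694):F1} km");
Console.WriteLine($"Palace of Versailles: {HaversineKm(AirportLat, AirportLon, 48.804722, 2.121782):F1} km");
```

With this formula, the Eiffel Tower comes out at about 25 km from the airport, inside the 30 km circle, while the Palace of Versailles falls outside it.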
Running the program prints the name and distance of the indexed landmarks found within 30 kilometers of the airport, up to five of them.
Conclusion
To implement GeoSpatial indexes for Distributed Lucene, we need SpatialDocument instead of a plain Document.
Since NCache implements the Lucene.NET API, we can scale our Lucene code with NCache by changing a few lines of code and by following a few naming conventions.
To learn about other recent NCache features, check What’s New in NCache? For more details about GeoSpatial Indexes with Distributed Lucene, check Distributed Lucene Geo-Spatial API.
To follow along with the code we wrote in this post, check my NCache Demo repository on GitHub.