Query a Distributed Cache with SQL-Like Aggregate Functions

Today, distributed cache is widely used to achieve scalability and performance in high traffic applications. Distributed cache offloads your database servers by serving cached data from in-memory store. In addition, few distributed caches also provide you SQL-like query capability, which you can easily query your distributed cache the way you query your database i.e. “SELECT employee WHERE employee.city=‘New York’”

First of all, most distributed caches don’t even provide SQL-like querying capabilities. Even a few that do provide this have a very limited support for it. They only provide searching of distributed cache based on simple criteria. Whereas, there are several scenarios where you have to find the result based on aggregate functions i.e. “SELECT COUNT(employee) WHERE salary > 1000” or “SELECT SUM(salary) WHERE employee.city = ‘New York’”. In order, to achieve this you have to first query the distributed cache and then calculate the aggregate function on fetched cache data.

This approach has two major drawbacks. First is that you have to execute query on distributed cache, which involves fetching of all the data from distributed cache to cache client. This data may vary from MBs to GBs, and this operation becomes more expensive when you are also paying for the consumed network bandwidth. Moreover, mostly you don’t need this data after you are done with aggregate function calculations.

Second drawback is that it involves custom programming for aggregate function calculation. This adds extra man hours and still most of the complex scenarios cannot be covered. It would be much nicer if you could continue to develop application for the purpose that it is being built and not worry about designing and implementing these extra features yourself.

These are the reasons why NCache provides you the flexibility to query distributed cache using aggregate functions like COUNT, SUM, MIN, MAX and AVG as part of its Object Query Language (OQL). Using NCache OQL aggregate functions, you can easily perform the required aggregate calculations inside the distributed cache domain. This approach not only avoids the network traffic, but also provides you much better performance in term of aggregate functions calculation. This is because all the selections and calculations are done inside the distributed cache domain and no network trips are involved.

Here is the code sample to search NCache using OQL aggregate queries:

public void Main(string[] args)

    String query = "SELECT COUNT(Business.Product) WHERE
                           this.ProductID > ?  AND this.Price < ?";

    Hashtable param = new Hashtable();
    param.Add("ProductID", 100);
    param.Add("Price", 50);

    // Fetch the cache items matching this search criteria
    IDictionary searchResults = _cache.SearchEntries(query, values);

For more reduced query execution time, NCache runs the SQL-like query in parallel by distributing it to all the cache servers  just like the map-reduce mechanism. In addition, you can use NCache OQL aggregate queries in both .NET and Java applications.

In summary, NCache provides you not only the scalability and performance, but also the flexibility of searching distributed cache using SQL like aggregate function.

Download NCache Trial | NCache Details

Avatar for Iqbal Khan

Leave a Reply

Your email address will not be published. Required fields are marked *