Indexing
The purpose of an index is to speed up the search for relevant data. For searching purposes, all searchable attributes should be indexed. Without an index, a search would have to scan every item, which would require considerable time and computing power. The trade-off is the additional storage required for the index and the extra time each update takes to maintain it, in exchange for the time saved during information retrieval.
NCache uses indexes internally in several modules, both to speed up searching on the cache and to keep track of cached data for other purposes.
Cache Searching Index
NCache provides the Cache Searching Index to speed up the execution of queries on the cache. To make data searchable, all that is needed is to index the searchable attributes. NCache does not search on a non-indexed attribute, because doing so would require traversing the whole cache to find the related data, which would degrade cache performance.
Indexes are also defined against the actual data types of the cached items. Through indexing, NCache knows what kind of data will be stored and stores it in a searchable form.
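The difference between an indexed lookup and a full traversal can be illustrated with a minimal sketch. This is a conceptual toy, not the NCache API: the class AttributeIndexedStore and its methods are hypothetical names, and items are plain dictionaries. It keeps one index per searchable attribute and refuses queries on attributes that were not indexed, mirroring the behavior described above.

```python
from collections import defaultdict


class AttributeIndexedStore:
    """Toy key-value store with per-attribute indexes (hypothetical, not NCache)."""

    def __init__(self, indexed_attributes):
        self._items = {}
        # attribute name -> attribute value -> set of cache keys
        self._indexes = {name: defaultdict(set) for name in indexed_attributes}

    def add(self, key, item):
        self._items[key] = item
        # Every insert also updates the indexes; this is the update cost
        # that is traded for faster retrieval.
        for name, index in self._indexes.items():
            if name in item:
                index[item[name]].add(key)

    def query(self, attribute, value):
        # Non-indexed attributes are rejected instead of scanning everything.
        if attribute not in self._indexes:
            raise KeyError(f"attribute {attribute!r} is not indexed")
        return sorted(self._indexes[attribute].get(value, set()))


store = AttributeIndexedStore(["category"])
store.add("p1", {"category": "Beverages", "price": 10})
store.add("p2", {"category": "Dairy", "price": 5})
store.add("p3", {"category": "Beverages", "price": 7})
beverage_keys = store.query("category", "Beverages")
```

The query touches only the index entry for the requested value, so its cost does not grow with the number of unrelated items in the store.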
NCache provides two ways to define indexes.
Pre-Defined Index (Static Index)
Runtime Index (Dynamic Index)
Pre-Defined Index (Static Index)
To search custom class objects, such as Product, in a clustered cache, these indexes need to be defined in advance. The cache then stores data of that type in a form that can be fetched easily through queries. Indexes can be configured for public, private, and protected primitive data members (including string).
This kind of index is defined before the cache starts. To define new indexes, the cache must first be stopped and then restarted after the modifications so that the changes are deployed. When data of an indexed type is inserted into the cache, NCache automatically detects its type, and the data can be used in search queries afterwards.
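The lifecycle of a pre-defined index can be sketched as follows. This is an illustration under stated assumptions, not NCache's implementation or configuration format: StaticIndexCache, define_index, and the Product fields are hypothetical. The sketch only shows that index definitions are locked while the cache is running and that the type of an inserted object is detected automatically.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    category: str
    price: float


class StaticIndexCache:
    """Toy cache whose index definitions can change only while stopped."""

    def __init__(self):
        self._running = False
        self._definitions = {}          # type name -> tuple of attribute names
        self._index = defaultdict(set)  # (type, attribute, value) -> cache keys

    def define_index(self, type_name, attributes):
        if self._running:
            raise RuntimeError("stop the cache before changing index definitions")
        self._definitions[type_name] = tuple(attributes)

    def start(self):
        self._running = True

    def stop(self):
        self._running = False

    def insert(self, key, item):
        # The type is detected from the object itself, not passed by the caller.
        type_name = type(item).__name__
        for attribute in self._definitions.get(type_name, ()):
            entry = (type_name, attribute, getattr(item, attribute))
            self._index[entry].add(key)

    def search(self, type_name, attribute, value):
        return sorted(self._index.get((type_name, attribute, value), set()))


cache = StaticIndexCache()
cache.define_index("Product", ["category"])
cache.start()
cache.insert("p1", Product("Chai", "Beverages", 18.0))
found = cache.search("Product", "category", "Beverages")
```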
Runtime Index (Dynamic Index)
If an application has attributes that are defined and used at runtime, then runtime indexing should be used. For example, some calculated values related to a product's sales may be assigned to that product. Whenever a customer purchases a product, the sale details are saved in the cache so that a daily sales trend report can be generated from the cached values later. Because this value is defined at runtime, it is a runtime attribute of the product.
NCache automatically creates indexes for such data when the cache client adds it. This means that indexes for runtime attributes do not need to be configured before they are used.
Runtime indexes can be further categorized into the following types:
Tag Index
Named Tag Index
Tag Index
One or more identification marks, called tags, can be associated with cache items. Through tags, the user can associate keyword(s) with cache items. A collection of cached items can also be found in, or removed from, the cache by specifying tags. Tags are only allowed in string format. For every new tag, a new entry is created in the tag index, and all related cached items are associated with that entry.
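A tag index of this kind can be pictured as a map from each tag to the set of keys that carry it. The sketch below is a conceptual model only; TagIndex and its method names are hypothetical and do not correspond to NCache API calls.

```python
from collections import defaultdict


class TagIndex:
    """Toy tag index: tag string -> set of cache keys (hypothetical, not NCache)."""

    def __init__(self):
        self._items = {}
        self._tags = defaultdict(set)

    def add(self, key, value, tags):
        # Tags are only allowed in string format.
        for tag in tags:
            if not isinstance(tag, str):
                raise TypeError("tags must be strings")
        self._items[key] = value
        for tag in tags:
            # A tag seen for the first time gets a new entry in the index.
            self._tags[tag].add(key)

    def get_by_tag(self, tag):
        return {key: self._items[key] for key in sorted(self._tags.get(tag, set()))}

    def remove_by_tag(self, tag):
        keys = self._tags.pop(tag, set())
        for key in keys:
            self._items.pop(key, None)
            # Drop the removed key from every other tag entry as well.
            for other_keys in self._tags.values():
                other_keys.discard(key)
        return len(keys)


tags = TagIndex()
tags.add("c1", "Alice", ["East Coast", "VIP"])
tags.add("c2", "Bob", ["East Coast"])
tags.add("c3", "Carol", ["VIP"])
east_coast = tags.get_by_tag("East Coast")
removed = tags.remove_by_tag("VIP")
```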
Named Tag Index
If a higher level of tagging is needed, where tags can have different names or data types and the requirement is to query data related to a specific type of tag, then Named Tags should be used.
"Named Tags" is an enhancement of "Tags". Named tags allow the user to store, under a string name, additional information (of any type) required to query the object. For example, a "Highest-Sale-Time" named tag with a time value can be stored when a product's sales are the highest of the whole day. Named tags allow the data to be indexed according to the required attributes at runtime. Later on, a query can be constructed using named tags to fetch the desired result set. Multiple named tags can be associated with one cached item, and one named tag can be associated with multiple cached items.
When data is added with a named tag that does not yet exist in the cache, a new index is created for this named tag. All the related data is assigned to that index for searching.
Unlike a tag index, a named tag index can be defined for all primitive types, string, and date time. This gives more flexibility to add a wide range of searchable data with different data types.
If multiple applications share the same cache and all of them add named tags, make sure that each named tag has a homogeneous data type. For example, if one client adds the named tag "ProductID" with the String data type, then all other clients should add values of "ProductID" to the same cache in String format, not as Integer or any other type.
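The named tag behavior described above, including the requirement that every client use the same data type for a given tag name, can be modeled with a small sketch. NamedTagIndex and its methods are hypothetical names chosen for illustration; how NCache itself reacts to a type mismatch is not stated here, so raising an error is an assumption of this sketch.

```python
from collections import defaultdict
from datetime import datetime


class NamedTagIndex:
    """Toy named-tag index: tag name -> value -> cache keys (not the NCache API)."""

    def __init__(self):
        self._types = {}  # tag name -> type of the first value stored under it
        self._index = defaultdict(lambda: defaultdict(set))

    def add(self, key, named_tags):
        # Validate every tag first so a mismatch leaves the index untouched.
        for name, value in named_tags.items():
            expected = self._types.get(name)
            if expected is not None and type(value) is not expected:
                raise TypeError(
                    f"named tag {name!r} holds {expected.__name__} values, "
                    f"got {type(value).__name__}"
                )
        for name, value in named_tags.items():
            # A new named tag creates a new index on first use.
            self._types.setdefault(name, type(value))
            self._index[name][value].add(key)

    def query(self, name, predicate):
        matches = set()
        for value, keys in self._index.get(name, {}).items():
            if predicate(value):
                matches |= keys
        return sorted(matches)


index = NamedTagIndex()
index.add("p1", {"ProductID": "1001", "Highest-Sale-Time": datetime(2024, 1, 5, 14, 30)})
index.add("p2", {"ProductID": "1002", "Highest-Sale-Time": datetime(2024, 1, 5, 9, 0)})
afternoon_peaks = index.query("Highest-Sale-Time", lambda t: t.hour >= 12)
```

Recording the type of the first value per tag name is one simple way to detect the situation where two clients disagree on the type of "ProductID".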
Eviction Index
Eviction is a useful NCache feature: when the cache is full, NCache evicts existing data to accommodate incoming data. By removing a configured percentage of the data, eviction keeps cache operations running smoothly while staying within the cache size limit.
NCache provides different policies for eviction. These policies decide which data is evicted when the cache is full. NCache provides the following policies for eviction:
- Priority Based Eviction: The eviction index is kept by priority, from high to low. When the cache is full, the eviction thread selects the keys with low priority from the index and evicts them from the cache.
- Least Recently Used (LRU): The eviction index is kept by the time each key was last used. When the cache is full, the eviction thread selects the least recently used keys from the index and evicts them from the cache.
- Least Frequently Used (LFU): The eviction index is kept by the number of times each key has been used. When the cache is full, the eviction thread selects the least frequently used keys from the index and evicts them from the cache.
Note
To perform eviction, NCache maintains an eviction index over the cached data.
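The three policies differ only in what the eviction index records for each key. The following sketch makes that concrete; it is a simplified single-threaded model, not NCache's implementation. EvictingCache, evict_ratio, and the use of a logical counter in place of wall-clock time are assumptions made for illustration, and a production index would keep its entries ordered rather than sorting on every eviction.

```python
import itertools


class EvictingCache:
    """Toy cache with an eviction index: key -> score, lowest score evicted first."""

    POLICIES = ("priority", "lru", "lfu")

    def __init__(self, capacity, policy="lru", evict_ratio=0.25):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy {policy!r}")
        self.capacity = capacity
        self.policy = policy
        self.evict_ratio = evict_ratio    # share of capacity removed per eviction
        self._items = {}
        self._index = {}                  # the eviction index
        self._clock = itertools.count()   # logical time for LRU

    def _record_use(self, key):
        if self.policy == "lru":
            self._index[key] = next(self._clock)           # time of last use
        elif self.policy == "lfu":
            self._index[key] = self._index.get(key, 0) + 1  # number of uses
        # For "priority" the score is fixed when the item is stored.

    def put(self, key, value, priority=0):
        if key not in self._items and len(self._items) >= self.capacity:
            self.evict()
        self._items[key] = value
        if self.policy == "priority":
            self._index[key] = priority
        else:
            self._record_use(key)

    def get(self, key):
        value = self._items[key]
        self._record_use(key)
        return value

    def evict(self):
        count = max(1, int(self.capacity * self.evict_ratio))
        victims = sorted(self._index, key=self._index.get)[:count]
        for key in victims:
            del self._items[key]
            del self._index[key]
        return victims

    def keys(self):
        return sorted(self._items)


lru = EvictingCache(capacity=4, policy="lru", evict_ratio=0.5)
for name in ("a", "b", "c", "d"):
    lru.put(name, name.upper())
lru.get("a")
lru.get("b")
lru.put("e", "E")  # cache is full, so the two least recently used keys go
```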
Expiration Index
Expiration is a property attached to every item being cached. It allows each cached object to be evaluated by its age, which determines when the object expires. When an item's expiration invalidates it, the item is removed from the cache.
NCache keeps a DateTime expiration index for cache keys in order to remove them efficiently on expiration. The expiration manager expires items after the clean interval.
The clean interval is the periodic interval after which expired items are removed from the cache. The clean interval for a cache can be changed in the configuration settings.
After the clean interval has elapsed, the expiration manager gets the keys from the index, and the expired items are removed from the cache. After removing the expired items, the expiration thread sleeps for the clean interval duration specified in the cache configuration. NCache provides two types of expiration, explained in Data Invalidation Strategies:
- Absolute Expiration
- Sliding Expiration
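The interaction between the expiration index and the clean interval can be sketched as below. This is a conceptual model with hypothetical names (ExpirationIndex, add_absolute, add_sliding, touch, clean), not NCache's implementation. Times are plain numbers passed in by the caller so the example stays deterministic; in a real cache, clean would be called by the expiration thread once per clean interval.

```python
import heapq


class ExpirationIndex:
    """Toy expiration index ordered by expiry time (hypothetical, not NCache)."""

    def __init__(self):
        self._expiry = {}  # key -> (expires_at, sliding window or None)
        self._heap = []    # (expires_at, key), earliest expiry first

    def add_absolute(self, key, expires_at):
        self._expiry[key] = (expires_at, None)
        heapq.heappush(self._heap, (expires_at, key))

    def add_sliding(self, key, window, now):
        self._expiry[key] = (now + window, window)
        heapq.heappush(self._heap, (now + window, key))

    def touch(self, key, now):
        # Accessing a sliding item pushes its expiry forward by one window.
        entry = self._expiry.get(key)
        if entry is not None and entry[1] is not None:
            self.add_sliding(key, entry[1], now)

    def clean(self, now):
        """Return the keys that have expired by `now`; run once per clean interval."""
        expired = []
        while self._heap and self._heap[0][0] <= now:
            expires_at, key = heapq.heappop(self._heap)
            current = self._expiry.get(key)
            # Skip stale heap entries left behind by touch().
            if current is not None and current[0] == expires_at:
                del self._expiry[key]
                expired.append(key)
        return expired


expirations = ExpirationIndex()
expirations.add_absolute("report", expires_at=10)
expirations.add_sliding("session", window=5, now=0)
expirations.touch("session", now=4)        # session now expires at 9
first_pass = expirations.clean(now=6)
second_pass = expirations.clean(now=9)
third_pass = expirations.clean(now=12)
```

Because the index is ordered by expiry time, each cleaning pass inspects only the entries that are due, rather than every key in the cache.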