Cache Data Format
When the client sends an object to the cache, the user object (e.g. an instance of the Product class) is stored in one of the two data formats in NCache: Binary or Object. The data format is an encoding of the data for storage in the cache, and the format you choose depends on the type of operations you expect to perform over the cache.
Note that when data travels across the network, it has to be serialized regardless of the data format being used. Serialization saves the state of the object as a linear stream of bytes for socket communication. The serialized object is then reconstructed into its original form when the client needs it, provided that the assemblies for the custom object (e.g. Product.dll) are available.
While serialization is particularly useful for transporting objects over a network, it carries a resource overhead: CPU and I/O are consumed during both serialization and deserialization. If compression or encryption is configured, additional overhead may be incurred.
The following discussion uses an object of the Product class as the example.
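To make the serialization round trip concrete, here is a minimal Python sketch. It is not NCache code: the Product fields are assumptions made for illustration, and JSON stands in for whatever wire format NCache actually uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Product:
    """Hypothetical user object; the field names are assumptions."""
    product_id: int
    name: str
    unit_price: float


def serialize(product: Product) -> bytes:
    """Flatten the object's state into a linear stream of bytes."""
    return json.dumps(asdict(product)).encode("utf-8")


def deserialize(payload: bytes) -> Product:
    """Rebuild the object; this only works where the Product class is available."""
    return Product(**json.loads(payload.decode("utf-8")))


original = Product(product_id=1001, name="Chai", unit_price=18.0)
payload = serialize(original)    # what would travel over the socket
restored = deserialize(payload)  # what the receiving side reconstructs
```

The deserialize step depends on the Product definition being present, which mirrors the requirement that the custom object's assemblies be available wherever deserialization happens.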
Binary Format
In the Binary format, user objects (e.g. a Product object) are serialized on the client and stored on the server in that same serialized form. Every time an item is requested from the server, the client receives the binary form of the item, which it then deserializes into a Product object locally. The size of an item in Binary format is significantly smaller than in Object format. Furthermore, serialization and deserialization take place on the client side only, and only once per operation: the item is serialized when it is added and deserialized when it is fetched. This saves serialization and deserialization cost, which is especially noticeable when sizeable data is added to or fetched from the cache.
Object Format
In the Object format, the user object is still serialized before being sent to the server. The server receives the serialized object, deserializes it back into its Product form, and stores it in that form. When the client requests the object, the server takes the deserialized object from the cache, serializes it, and sends it to the client. The client then deserializes the binary data into its original form (in this case, Product). This additional serialization and deserialization on the server side doubles the cost of the process. Moreover, the custom object libraries must also be deployed on the server side, because deserialization takes place on the server.
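The difference between the two formats can be sketched with a toy model that counts serialization work on each side. This is an illustrative simulation only: ToyCacheServer, Codec, and round_trip are hypothetical names, JSON stands in for the real wire format, and nothing here is NCache API.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Product:
    product_id: int
    name: str


class Codec:
    """Counts serialize/deserialize calls made by one side (client or server)."""

    def __init__(self) -> None:
        self.calls = 0

    def serialize(self, product: Product) -> bytes:
        self.calls += 1
        return json.dumps(asdict(product)).encode("utf-8")

    def deserialize(self, payload: bytes) -> Product:
        self.calls += 1
        return Product(**json.loads(payload.decode("utf-8")))


class ToyCacheServer:
    """Stores items as received bytes ("binary") or as rebuilt objects ("object")."""

    def __init__(self, data_format: str) -> None:
        if data_format not in ("binary", "object"):
            raise ValueError("data_format must be 'binary' or 'object'")
        self.data_format = data_format
        self.codec = Codec()  # tracks server-side work only
        self.store = {}

    def add(self, key: str, payload: bytes) -> None:
        if self.data_format == "object":
            # Object format: the server rebuilds the object before storing it.
            self.store[key] = self.codec.deserialize(payload)
        else:
            # Binary format: the bytes are stored exactly as received.
            self.store[key] = payload

    def get(self, key: str) -> bytes:
        item = self.store[key]
        if self.data_format == "object":
            # Object format: the server must serialize again for the network.
            return self.codec.serialize(item)
        return item


def round_trip(data_format: str) -> tuple:
    """Add one item and fetch it back; report (item, client calls, server calls)."""
    client = Codec()
    server = ToyCacheServer(data_format)
    server.add("p1", client.serialize(Product(1, "Chai")))
    fetched = client.deserialize(server.get("p1"))
    return fetched, client.calls, server.codec.calls
```

In this model, one add followed by one fetch costs two client-side calls in either format. The "binary" server does no serialization work, while the "object" server adds two calls of its own, which matches the doubling described above.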
Uses of Data Formats
Binary Format
Binary Format is beneficial if most of your processing is on the client side and the operations performed are basic ones such as add, update, fetch, and remove. For example, suppose a Product object is fetched from the cache to display its contents. With Binary format, the item only needs to be deserialized once it reaches the client, which keeps the processing cost to a minimum. With Object format, the server must first serialize the stored Product object and send it to the client, which then deserializes it. This increases the overall cost of serialization and deserialization.
Similarly, if an object is added to the cache using Binary format, it is serialized before it is sent over the network and stored as it is. If Object format is used, the item is first serialized on the client side and then deserialized on the server side before being stored.
Object Format
Object format is required if items must be processed locally on the server side, without being fetched by the client and added back, as is the case with Entry Processors, Aggregators, and MapReduce.
For example, suppose you regularly execute an Entry Processor over your cache items. The processor operates on Product objects. If the user object is stored on the server side as a Product object, the code executes directly on the item. If the object is stored in binary format, the item must be deserialized first before the Entry Processor can run.
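The server-side case can be sketched in the same toy style. Here stock_value is a hypothetical read-only processor, ToyServerStore is an invented stand-in for the server's storage, and JSON again substitutes for the real binary encoding; none of this is the actual Entry Processor API.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, Union


@dataclass
class Product:
    name: str
    unit_price: float
    units_in_stock: int


def stock_value(product: Product) -> float:
    """Hypothetical read-only processor that runs on the server."""
    return product.unit_price * product.units_in_stock


class ToyServerStore:
    """Server-side storage that keeps items as bytes or as live objects."""

    def __init__(self, data_format: str) -> None:
        if data_format not in ("binary", "object"):
            raise ValueError("data_format must be 'binary' or 'object'")
        self.data_format = data_format
        self.deserializations = 0
        self.items: Dict[str, Union[bytes, Product]] = {}

    def put(self, key: str, product: Product) -> None:
        """Seed the store directly; the network hop is left out of this sketch."""
        if self.data_format == "binary":
            self.items[key] = json.dumps(asdict(product)).encode("utf-8")
        else:
            self.items[key] = product

    def run_processor(self, key: str, processor: Callable[[Product], float]) -> float:
        item = self.items[key]
        if isinstance(item, bytes):
            # Binary format: the item must be rebuilt before the code can run.
            self.deserializations += 1
            item = Product(**json.loads(item.decode("utf-8")))
        return processor(item)
```

Both stores return the same result for the same item, but only the "binary" store increments its deserialization counter each time the processor runs.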
Moreover, Object format is valuable in cases where a data source is involved, such as loading data from the data source on startup (Cache Startup Loader), saving changes made to cache items to the data source (Write-Through), or reading data from the data source (Read-Through). This is because the data is stored in the data source in the object format, no matter what the format is in the cache. If the item is stored in Object format, no additional deserialization is needed apart from the socket serialization for data transfer.
Configuring Data Format
By default, NCache uses the Binary Format. However, a Local cache or a Client cache defaults to Object format if it is InProc and to Binary format if it is OutProc. You can configure the format for an OutProc cache according to your requirements.
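The default rule can be written as a small decision function. This is only a restatement of the paragraph above in Python: the function name and the cache-type labels ("local", "client", "clustered") are hypothetical, and it returns Binary for every combination the text does not single out.

```python
from enum import Enum


class DataFormat(Enum):
    BINARY = "Binary"
    OBJECT = "Object"


# Hypothetical labels for the cache types that the rule singles out.
LOCAL_OR_CLIENT = {"local", "client"}


def default_data_format(cache_type: str, in_proc: bool) -> DataFormat:
    """Return the default format as described in the text above."""
    if cache_type.lower() in LOCAL_OR_CLIENT and in_proc:
        return DataFormat.OBJECT
    return DataFormat.BINARY
```

For instance, default_data_format("local", True) yields DataFormat.OBJECT, while the same cache type running OutProc yields DataFormat.BINARY.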
You can configure the data format through the Manager in two ways:
Method 1: During cache creation:
Method 2: Through the Main tab:
- Click a cache in Cache Explorer to open cache settings.
- Go to the Main tab.
- Change the cache Data Format as per your requirement.
With Object format, the class assemblies must be deployed on the server so that it can deserialize the data in the cache.
- Click on the Deploy Task Libraries button at the bottom left of the MapReduce tab.
- Browse for the required assemblies, select them and then click Open.
- Right click on the cache name in Cache Explorer and select Apply Configuration option.