Over a million developers have joined DZone.
{{announcement.body}}
{{announcement.title}}

Two-Phase-Commit for In-Memory Caches - Part II

DZone's Guide to

Two-Phase-Commit for In-Memory Caches - Part II

What happens when an in-memory cache serves as a layer on top of a persistent database? You need data consistency via read- or write-throughs.

· Database Zone
Free Resource

Traditional relational databases weren’t designed for today’s customers. Learn about the world’s first NoSQL Engagement Database purpose-built for the new era of customer experience.

Generally, persistent disk-oriented systems will require the additional 3rd phase in commit protocol in order to ensure data consistency in case of failures. In my previous blog I covered why the 2-Phase-Commit protocol (without 3rd phase) is sufficient to handle failures for distributed in-memory caches. The explanation was based on the open source GridGain architecture, however it can be applied to any in-memory distributed system.

In this blog we will cover a case when an in-memory cache serves as a layer on top of a persistent database. In this case the database serves as a primary system of records, and distributed in-memory cache is added for performance and scalability reasons to accelerate reads and (sometimes) writes to the data. Cache must be kept consistent with database which means that a cache transaction must merge with the database transaction.

When we add a persistent store to an in-memory cache, our primary goal is to make sure that the cache will remain consistent with on-disk database at all times. 

In order to keep the data consistent between memory and database, data is automatically loaded on demand whenever a read happens and the data cannot be found in cache. This behavior is called read-through. Alternatively, whenever a write operation happens, data is stored in cache and is automatically persisted to the database.  This behavior is called write-through. Additionally, there is also a mode called write-behind which batches up the writes in memory and flushes them to the database in one bulk operation (we will not be covering this mode here).


Figure 1: Read-through operation for K2                                

When we add a persistent store to the 2-Phase-Commit protocol, in order to merge cache and database transactions into one, the coordinator will have to write the transactional changes to the database before it sends the Commit message to the other participants. This way, if database transaction fails, the coordinator can still send the Rollback message to everyone involved, so that the cached data will remain consistent with database. Figure below illustrates this behavior.
Figure 2: Two-Phase-Commit with In-Memory-Cache and Database
Handling failures is actually more straight forward whenever a database is present rather than when it is not. We always assume that the database must have the utmost up-to-date copy, and it is acceptable to reload data from the database into cache whenever in doubt (see  Figure 1). Just like in my previous blog, the most challenging scenario here is when the coordinator node crashes (potentially together with other nodes), because in this case we cannot tell whether it crashed before it was able to commit to the database or not. Other failure scenarios are handled the same way with database present as without.
Whenever we cannot tell whether the database commit had happened or not, we can simply reload the relevant data from database into cache upon committing the transaction. This effectively ensures that database and cache always remain in consistent state.

Conclusion

When working with in-memory caches, we can always manage to keep the data within transactions consistent by slightly enhancing the standard 2-Phase-Commit protocol. The main advantage of in-memory vs. disk is that failure handling does not introduce any additional overhead, and we do not need to add an expensive 3rd phase to the 2-Phase-Commit protocol in order to keep caches consistent with databases in case of failures.

Learn how the world’s first NoSQL Engagement Database delivers unparalleled performance at any scale for customer experience innovation that never ends.

Topics:
opinion ,database ,persistence ,performance ,cloud platforms

Published at DZone with permission of Dmitriy Setrakyan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

THE DZONE NEWSLETTER

Dev Resources & Solutions Straight to Your Inbox

Thanks for subscribing!

Awesome! Check your inbox to verify your email so you can start receiving the latest in tech news and resources.

X

{{ parent.title || parent.header.title}}

{{ parent.tldr }}

{{ parent.urlSource.name }}