One-Phase-Commit: Fast Transactions For In-Memory Caches

By Dmitriy Setrakyan · Sep. 29, 14

In my previous blogs I have talked at length about the 2-Phase-Commit transaction protocol for in-memory caches, and how in-memory caches can handle failures much more efficiently than disk-based databases. In this blog I want to cover another very important optimization that can be applied to in-memory caches, specifically for cases where data is partitioned across the network.

In-memory caches, and specifically in-memory data grids such as GridGain or Oracle Coherence, often employ a technique called data partitioning: every key in the key set is assigned to a partition, and every partition is assigned to a specific cluster member. Assigning a key to a partition is usually easy and is done much like a hash map does it:
key.hashCode() % N, where N is the total number of partitions.
Assigning a partition to a cluster node is a little trickier, because in case of failures or cluster topology changes the amount of repartitioning has to be minimal. Various algorithms can be employed here, such as Rendezvous Hashing or Consistent Hashing, which we will not discuss here. Let's assume that after applying one of these partitioning algorithms, your in-memory cache has distributed data evenly among cluster nodes:


Figure 1: Cache Partition Distribution
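As a rough illustration of the two mapping steps above, here is a minimal, library-agnostic sketch in Java. The class and method names (PartitionMapper, partitionFor, primaryNodeFor) are hypothetical, not GridGain or Coherence APIs, and the partition-to-node step is deliberately naive:

import java.util.List;

public class PartitionMapper {
    private final int partitionCount;   // N, the total number of partitions
    private final List<String> nodes;   // current cluster members, e.g. node IDs

    public PartitionMapper(int partitionCount, List<String> nodes) {
        this.partitionCount = partitionCount;
        this.nodes = nodes;
    }

    // Step 1: key -> partition, exactly as in the formula above: key.hashCode() % N.
    public int partitionFor(Object key) {
        // Math.abs guards against negative hash codes; real data grids use
        // better-spread hash functions, but the principle is the same.
        return Math.abs(key.hashCode() % partitionCount);
    }

    // Step 2: partition -> node. A naive modulo placement; real grids use
    // Rendezvous or Consistent Hashing so that topology changes cause
    // minimal repartitioning.
    public String primaryNodeFor(int partition) {
        return nodes.get(partition % nodes.size());
    }
}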
Usually, to achieve the best performance, we need to minimize the number of cluster nodes participating in a transaction. This can be done by ensuring that all the entries in the transaction belong to the same partition, which consequently ensures that they all belong to the same node. It also ensures that the backup copies for these entries are grouped together on some other node (and secondary backups are grouped together as well, and so on). Such custom key mapping is called custom affinity mapping.

For example, if we have Employee objects and Company objects, we can ensure that all employees working for the same company are mapped to the same partition by providing a custom affinity key for Employees, which in this case is the "companyId".
Custom affinity mapping helps us ensure that all objects within a single transaction are mapped to the same cluster node, thus collocating the whole transaction logic on that node and minimizing the number of nodes involved in the transaction.
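As a sketch of this idea (the class below is hypothetical and not tied to a specific data grid API), the cache key for an Employee exposes companyId as its affinity key, so the partition is computed from the company rather than from the employee itself:

// Hypothetical cache key for an Employee, collocated with its Company by companyId.
public class EmployeeKey {
    private final long employeeId;
    private final long companyId;   // the custom affinity key

    public EmployeeKey(long employeeId, long companyId) {
        this.employeeId = employeeId;
        this.companyId = companyId;
    }

    // The grid would use this value, not the whole key, when computing the
    // partition, so all employees of a company and the company itself share
    // a partition and therefore a primary node (and a backup node).
    public Object affinityKey() {
        return companyId;
    }

    @Override public int hashCode() { return Long.hashCode(employeeId); }

    @Override public boolean equals(Object other) {
        return other instanceof EmployeeKey
                && ((EmployeeKey) other).employeeId == employeeId;
    }
}

Real data grids expose this hook in product-specific ways, for example an annotation on the key field or a pluggable affinity resolver; the class above only illustrates the concept.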
The 1-Phase-Commit optimization is possible only when we have 1 primary and 1 backup copy. Such deployments are the most common for distributed caches. If we add 2 backup copies, then we have to resort to 2-Phase-Commit; however, adding a 2nd backup copy is generally considered wasteful from a memory standpoint and is rarely done. The diagram below illustrates the 1-Phase optimization:

Figure 2: 1-Phase-Commit for Collocated Partition Transaction
The first deviation from standard transactions is that the client node now sends the whole transaction logic to the primary node. This is possible because we ensure in advance that all the keys we are transacting on are mapped to the same partition on that node. Once the primary node receives the transaction logic, it acquires all the locks locally and sends only one commit message (without the prepare message) to the backup node.
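A condensed sketch of the commit path on the primary node, under the assumptions above (all entries in one partition, exactly one backup). The names OnePhaseCommitPrimary, BackupChannel, and execute are illustrative, not a real grid API:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative 1-Phase-Commit on the primary node: no separate prepare round-trip.
public class OnePhaseCommitPrimary {

    // Stand-in for the network channel to the single backup node.
    interface BackupChannel {
        void commit(long txId, Map<Object, Object> changes);   // one message, no prepare
    }

    private final ConcurrentHashMap<Object, Object> localStore = new ConcurrentHashMap<>();
    private final BackupChannel backup;

    public OnePhaseCommitPrimary(BackupChannel backup) {
        this.backup = backup;
    }

    // The client ships the whole set of writes here because every key it touches
    // is known (via affinity mapping) to live in this node's partition.
    public synchronized void execute(long txId, Map<Object, Object> writes) {
        // "synchronized" stands in for acquiring all per-key locks locally.
        localStore.putAll(writes);      // apply the transaction on the primary copy
        backup.commit(txId, writes);    // single commit message to the backup node
        // On return, the primary acknowledges the commit back to the client.
    }
}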

Now let's analyze failure scenarios. Failures of client nodes or backup nodes are not very interesting, as they do not affect the primary copies. Failures of primary nodes are a bit trickier, but they are still safe. If the primary node crashes before it sends the commit message, then the backup transaction never starts, so there are no side effects. If the primary node fails during or after receiving the commit acknowledgement from the backup node, then the backup transaction has already been committed, and data consistency is again not violated.

The hardest case is this: even though the data remains consistent under any cluster failure, how does the client node know whether the transaction was committed if it never received the final acknowledgement, i.e. if the primary node failed before it was able to send the acknowledgement to the client? In this case the "recovery protocol", described in detail in my previous blog, is initiated. Essentially, a message is sent to the backup node asking whether the transaction was committed. Since the backup node keeps a backlog of completed transaction IDs in memory, it can always check that backlog. If the backlog does not contain the given transaction ID, the backup node adds it in the "rolled back" state and replies to the client with a rollback acknowledgement. If the backup node later does receive the commit request for the same transaction ID from the now-failed primary node, it can verify in the backlog that the transaction was rolled back and safely ignore the request.
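A sketch of the backup-side recovery check just described; the backlog structure and method names (resolve, onCommit) are illustrative, not a documented API:

import java.util.concurrent.ConcurrentHashMap;

// Illustrative recovery handling on the backup node, keyed by transaction ID.
public class BackupRecovery {

    enum TxState { COMMITTED, ROLLED_BACK }

    // The in-memory backlog of transaction IDs the backup node has seen or resolved.
    private final ConcurrentHashMap<Long, TxState> backlog = new ConcurrentHashMap<>();

    // Recovery query from the client: was this transaction committed?
    // If the ID is unknown, record it as rolled back and answer accordingly.
    public TxState resolve(long txId) {
        return backlog.computeIfAbsent(txId, id -> TxState.ROLLED_BACK);
    }

    // Commit message from the primary node. Returns false if recovery has already
    // marked this transaction as rolled back, so the late commit is ignored.
    public boolean onCommit(long txId) {
        TxState previous = backlog.putIfAbsent(txId, TxState.COMMITTED);
        return previous != TxState.ROLLED_BACK;
    }
}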

Conclusion 

By ensuring that all objects participating in a transaction are mapped to the same logical partition, we can remove the whole "prepare" phase from the distributed commit protocol, thus converting the standard 2-Phase-Commit into a very lightweight 1-Phase-Commit transaction.

Published at DZone with permission of Dmitriy Setrakyan, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
