Conflict Resolution: Using Last-Write-Wins vs. CRDTs
LWW isn't always the wrong approach to use. However, CRDTs can simplify development when it comes to geo-distributed applications.
Some say that the cloud is quite reliable and easily forget about the outages from 2017. Let's be more specific: a single region may not be that reliable in the cloud, but multi-region deployments are! Cloud infrastructure makes deploying multi-region, geo-distributed applications simpler to a certain degree; many public clouds afford a wide choice of regions.
That's where the simplicity ends, however. Developing active-active geo-distributed applications is still not easy. Imagine we are maintaining a shopping cart — it is a private session of a user trying to dress up for Halloween.
At time t1, the user adds a costume to the cart.
At time t2, before the cart could replicate over to the European datacenter, the U.S. datacenter fails. The user is routed to the European datacenter.
At time t3, the user adds a mask that goes with the costume.
At time t4, the U.S. datacenter recovers. Perfect! What's in the shopping cart now?
This is a fairly straightforward case of simple geo-failover. However, even here, the failover makes the user's own writes conflict with each other: the U.S. copy of the cart holds only the costume, and the European copy holds only the mask.
Common Techniques for Resolving Conflicts
NoSQL databases provide a number of techniques to resolve such write conflicts. Many use LWW (last-write-wins). Simply put, last-write-wins identifies the latest write and keeps that. The catch is deciding which write really is the latest: we all know that clocks on machines drift, and it is easy to get skewed results. Even with NTP in place, skews can be in the half-second range, and maybe more.
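To make the clock problem concrete, here is a minimal sketch of how skew can make LWW keep the older write. The Write class, the lww_winner function, and the skew values are all hypothetical and chosen only for illustration; no particular database is being modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: str
    true_time: float   # when the write really happened, in seconds
    clock_skew: float  # error of the local clock on the node that took the write

    @property
    def timestamp(self) -> float:
        # The database only ever sees the node's local clock reading.
        return self.true_time + self.clock_skew


def lww_winner(a: Write, b: Write) -> Write:
    """Keep whichever write carries the larger recorded timestamp."""
    return a if a.timestamp >= b.timestamp else b


# The second write really happens 0.2 s after the first one, but its node's
# clock runs 0.2 s slow while the first node's clock runs 0.3 s fast.
first = Write("old address", true_time=10.0, clock_skew=+0.3)
second = Write("new address", true_time=10.2, clock_skew=-0.2)

winner = lww_winner(first, second)
print(winner.value)  # the older write wins because of the skew
```

The write that actually came last is discarded, and nothing in the stored data reveals that this happened.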
These databases implement LWW by keeping a timestamp of some sort with each write, which helps them decide which write came last. Databases like DynamoDB or Cassandra use LWW to process writes. With these databases, strongly consistent writes require writing to a quorum of replicas, and thus will cost you more money. (See Q: What is the consistency model of Amazon DynamoDB? or Cassandra consistency levels.)
In some cases, LWW can be acceptable. However, in this case, the cost of LWW is a lost update: once the two datacenters reconcile after t4 (call it time t5), you lose either the costume or the mask. That will be one upset user on Halloween.
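The lost update can be sketched in a few lines. This is an illustrative model only: LWWRegister is a hypothetical class that stores the whole cart as one value with one timestamp, and the timestamps simply reuse the t1 and t3 labels from the timeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWWRegister:
    """The whole cart stored as a single value with a single timestamp."""
    value: frozenset
    timestamp: float

    def write(self, value, timestamp):
        return LWWRegister(frozenset(value), timestamp)

    def merge(self, other):
        # Last write wins: keep the copy with the larger timestamp, drop the other.
        return self if self.timestamp >= other.timestamp else other


empty = LWWRegister(frozenset(), 0.0)

# t1: the U.S. datacenter records the costume.
us_cart = empty.write({"costume"}, timestamp=1.0)

# t3: the European datacenter never saw the costume, so it writes a cart
# that contains only the mask.
eu_cart = empty.write({"mask"}, timestamp=3.0)

# t5: the datacenters reconcile and the later write replaces the earlier one.
merged = us_cart.merge(eu_cart)
print(sorted(merged.value))  # only the mask is left
```

Both datacenters converge on the same cart, so the data is consistent. It is just not what the user put in.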
We know what the ideal result would be here: to keep both items in the cart and merge them. The good news is that advanced techniques can easily give you this result — conflict-free replicated data types (CRDTs) are designed to do that.
CRDTs build the conflict resolution technique into the data type. For a set like a shopping cart, they can use OR-Sets (observed-remove sets), which provide the ideal result for the shopping cart. The idea with CRDTs is that each "type" (like a shopping cart) acts with intelligence to resolve conflicts automatically. With CRDTs, at time t5 you get a shopping cart with both the costume and the mask.
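A minimal, state-based sketch of an observed-remove set shows why the merge keeps both items. The ORSet class below is a teaching example written for this scenario, not the implementation used by any particular database. It tags every add with a unique id and remembers removed tags as tombstones, which a production system would also need to garbage-collect.

```python
import uuid


class ORSet:
    """Observed-remove set: every add gets a unique tag, and a remove only
    cancels the tags that the removing replica has actually observed."""

    def __init__(self):
        self.adds = {}        # element -> set of unique add tags
        self.removed = set()  # tombstones: tags that have been removed

    def add(self, element):
        self.adds.setdefault(element, set()).add(uuid.uuid4().hex)

    def remove(self, element):
        self.removed |= self.adds.get(element, set())

    def elements(self):
        # An element is present if at least one of its add tags is still live.
        return {e for e, tags in self.adds.items() if tags - self.removed}

    def merge(self, other):
        # Union of add tags and union of tombstones from both replicas.
        merged = ORSet()
        for replica in (self, other):
            for element, tags in replica.adds.items():
                merged.adds.setdefault(element, set()).update(tags)
            merged.removed |= replica.removed
        return merged


us_cart = ORSet()
us_cart.add("costume")  # t1, U.S. datacenter

eu_cart = ORSet()
eu_cart.add("mask")     # t3, European datacenter

cart = us_cart.merge(eu_cart)  # t5, after the U.S. datacenter recovers
print(sorted(cart.elements()))  # both the costume and the mask
```

Because the merge is a union of tags, it gives the same cart regardless of which datacenter merges first. If one replica removes an item while another concurrently adds it again, the new add carries a tag the remover never observed, so the item stays in the cart.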
However, there is a big problem here. Databases typically don't care to understand whether your table/document represents a shopping cart or a product inventory. There are databases that do understand structures, such as Redis, with its data structures, and Riak. With CRDTs, a type like a counter, stack, list, or queue uses its own technique to resolve conflicts, one that is more intelligent than LWW. Instead of blindly picking one of the instances of the data, CRDTs implement sets, counters, etc. correctly when concurrent writes come together. For example, Redis Sets are commonly used for bags of items and use OR-Sets. Each type, combined with its methods, drives conflict resolution based on the sequence of updates.
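A counter is a good example of a type that needs a different technique from a set. The sketch below is a hypothetical PN-counter (positive-negative counter), one common CRDT design for counters; the class name, replica ids, and amounts are made up for illustration and are not taken from Redis or Riak.

```python
from collections import Counter


class PNCounter:
    """Each replica tracks its own increments and decrements separately.
    Merging takes the per-replica maximum, so no update is counted twice
    and none is lost."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.increments = Counter()  # replica id -> total incremented there
        self.decrements = Counter()  # replica id -> total decremented there

    def increment(self, amount=1):
        self.increments[self.replica_id] += amount

    def decrement(self, amount=1):
        self.decrements[self.replica_id] += amount

    def value(self):
        return sum(self.increments.values()) - sum(self.decrements.values())

    def merge(self, other):
        for rid, n in other.increments.items():
            self.increments[rid] = max(self.increments[rid], n)
        for rid, n in other.decrements.items():
            self.decrements[rid] = max(self.decrements[rid], n)


us = PNCounter("us")
eu = PNCounter("eu")

us.increment(5)  # concurrent updates in two regions
eu.increment(2)
eu.decrement(1)

us.merge(eu)
eu.merge(us)
print(us.value(), eu.value())  # both replicas agree on 5 + 2 - 1
```

With LWW, one region's counter value would simply overwrite the other's. Here the merged value reflects every increment and decrement from both regions, and merging the same state again changes nothing.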
LWW isn't always the wrong approach to use. However, CRDTs can simplify development when it comes to geo-distributed apps. If you are simply looking for more reliable failover or looking to handle concurrent writes across geographies correctly, CRDTs can give you a leg up over LWW.