On Eventually Consistent Data
Eventually consistent data seems to be the buzzword nowadays, especially in any NoSQL discussion. For those not versed in tech talk, having eventually consistent data means that you’re willing to give up the guarantee that your data is consistent at every moment in order to gain in other areas. The most common gain is performance or throughput: not having to check for consistency on every operation can speed up an application tremendously. The price is that you cannot guarantee that your data will be consistent at any particular point in time.
I’ll give an example. Imagine you have an online bookstore. If you strive for consistent data, then for every book ordered you have to check whether that book is available. Even more, you have to ‘put the book aside’ while the customer is finishing the order. Otherwise, another customer could order the same copy of the book (if, for example, it’s the last one in stock) and ‘steal’ it from the first customer by being quicker. Needless to say, this results in a very complex system. If, on the other hand, you strive for eventually consistent data, you accept orders without first checking whether you have the book in stock. You assume you’ll be able to deliver the ordered goods even if you don’t have enough books in stock to fulfill the orders immediately. The result is that you can serve your clients faster, but with the added risk that you’ll have to do some extra work to get your orders actually delivered. In other words, it’s your responsibility to make sure your data eventually becomes consistent. Depending on the case, this could be easy or hard to accomplish.
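The two approaches in the bookstore example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class names `StrictBookstore` and `EventualBookstore` are made up for this sketch, stock lives in memory, and ‘extra work’ is reduced to returning a list of titles that have to be backordered.

```python
import threading
from collections import deque


class StrictBookstore:
    """Consistent approach: check and reserve stock in one atomic step."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._lock = threading.Lock()

    def order(self, title):
        # The lock turns "is it available?" and "put it aside" into a single
        # step, so two customers cannot both claim the last copy.
        with self._lock:
            if self._stock.get(title, 0) <= 0:
                return False  # order rejected: nothing left to reserve
            self._stock[title] -= 1
            return True


class EventualBookstore:
    """Eventually consistent approach: accept now, reconcile later."""

    def __init__(self, stock):
        self._stock = dict(stock)
        self._pending = deque()

    def order(self, title):
        # No stock check at all: every order is accepted immediately.
        self._pending.append(title)
        return True

    def reconcile(self):
        """Match accepted orders against stock; return titles to backorder."""
        backorders = []
        while self._pending:
            title = self._pending.popleft()
            if self._stock.get(title, 0) > 0:
                self._stock[title] -= 1
            else:
                backorders.append(title)  # the "extra work" mentioned above
        return backorders
```

With a single copy in stock, the strict store rejects the second order on the spot, while the eventual store accepts both and only discovers the shortage when `reconcile()` runs.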
Most developers have grown accustomed to transactional, consistent data. Even for me it’s still hard to accept that eventually consistent data can be enough. However, a lot of systems today are eventually consistent, especially on the internet, and most of us won’t even realize it. That’s the real beauty of the entire concept and the reason why I’ve come to accept that sometimes it’s enough to have your data in a non-consistent state at a given point in time: users simply don’t care how you fulfill their request, as long as it gets fulfilled (sooner rather than later, but in any case eventually). Most stores that are asynchronous in nature (for example webshops) utilize this mechanism.
Most systems don’t need the level of performance that is the usual reason for choosing eventual consistency. However, it is possible to build your application in such a way that eventual consistency can be supported in the future. To achieve that you could, for example, adopt a CQRS-based architecture that initially updates its views synchronously whenever data changes, effectively ensuring transactional data consistency. When the need arises, you can switch to asynchronous view updates, or even event sourcing, in order to split up write and read operations.
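A minimal sketch of that idea, assuming a toy in-memory setup: the names `ReadView`, `CommandSide`, `change_stock` and `flush` are hypothetical and not taken from any CQRS framework. The command side owns the write model and publishes each change to a separate read view, either immediately (synchronous, consistent) or through a queue that is drained later (asynchronous, eventually consistent).

```python
from collections import deque


class ReadView:
    """Query side: a separate view of stock levels, built from change events."""

    def __init__(self):
        self.stock = {}

    def apply(self, event):
        title, delta = event
        self.stock[title] = self.stock.get(title, 0) + delta


class CommandSide:
    """Write side: owns the write model and publishes changes to the view."""

    def __init__(self, view, synchronous=True):
        self._stock = {}          # write model
        self._view = view
        self._synchronous = synchronous
        self._queue = deque()     # only used in asynchronous mode

    def change_stock(self, title, delta):
        self._stock[title] = self._stock.get(title, 0) + delta
        event = (title, delta)
        if self._synchronous:
            # View is updated as part of the same call: reads are consistent.
            self._view.apply(event)
        else:
            # View lags behind until flush() runs: reads are eventually consistent.
            self._queue.append(event)

    def flush(self):
        """Stand-in for a background worker that delivers queued events."""
        while self._queue:
            self._view.apply(self._queue.popleft())
```

The point of the design is that callers and the read view do not change when `synchronous` is flipped to `False`; only the moment at which the view catches up does.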
However, systems that rely on eventual consistency are also a lot more complex and require real in-depth knowledge of the problem domain and of the issues you’re solving by using eventually consistent data stores. To become consistent, such a system needs to be allowed to ‘evolve’ into a consistent state. That makes for some very hard problem solving when it comes to programming, as you can’t guarantee when the system has evolved into a state in which it can be considered consistent.
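One practical consequence is that code (and tests in particular) can no longer assert on state right after a write; it has to wait for convergence and give up at some point. A small sketch of such a helper follows. The name `wait_until` and its default timeout and polling interval are assumptions made for illustration, not part of any particular data store’s API.

```python
import time


def wait_until(predicate, timeout=1.0, interval=0.01):
    """Poll predicate() until it returns True or the timeout expires.

    Returns True if the condition was observed in time, False otherwise.
    The condition is always checked at least once.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False  # the system did not converge within the deadline
        time.sleep(interval)
```

A `False` result does not prove the system is broken, only that it had not converged within the chosen window, which is part of what makes these systems hard to debug.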
The point I’m trying to make is that most projects don’t need eventually consistent data from the beginning. It’s a need that arises when certain conditions occur in a project. Whenever you hear someone at the inception of a project say ‘Well, we need an eventually consistent data store’, in 99% of the cases you can pull the ‘Bullshit!’ card. While applications that use eventually consistent data stores can be an order of magnitude faster than regular applications (most of the time), those systems are also an order of magnitude more complex and notoriously hard to debug. There’s no doubt that developing systems on eventually consistent data stores can be a lot of fun, but again, clients don’t care about your fun. My advice? Use an architecture that allows for, but does not mandate, eventual consistency, such as CQRS. And start off simple but agile. In other words, don’t implement a solution for a problem you don’t have (yet).
Published at DZone with permission of Lieven Doclo.