
Database Synchronisation Is an Integration Pattern!


Integration is not only about APIs. What if database synchronization were an integration pattern in its own right?



I recently looked through the DZone Integration Zone, and almost all of it was about APIs: how to design them, test them, use the latest framework, and so on. But it would be a mistake to forget that there are many other things to integrate!


What Is Change Data Capture?

Change data capture, or CDC, is a technology that keeps two data sources synchronized by capturing the changes made in one and applying them to the other. The synchronization can be bi-directional, and the tools are designed to make setup simple: choose the tables you need to synchronize and, if needed, the tool will create the corresponding tables in the target database. You can also map to tables that already exist in the target; it’s up to you!
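To make the idea concrete, here is a minimal sketch in Python of the "apply" half of CDC. It is not how any particular product works: real CDC tools typically read the source database's transaction log, whereas this sketch assumes the changes have already been captured as a list. The `ChangeEvent` type, its field names, and the `apply_changes` function are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class ChangeEvent:
    """One captured change. Field names are illustrative assumptions."""
    op: str                                # "insert", "update" or "delete"
    table: str
    key: int
    row: Optional[Dict[str, Any]] = None   # new row image; None for deletes


def apply_changes(target: Dict[str, Dict[int, Dict[str, Any]]],
                  events: List[ChangeEvent]) -> None:
    """Replay captured changes, in order, against the target store."""
    for ev in events:
        # Create the target table on first use, as many CDC tools can do.
        table = target.setdefault(ev.table, {})
        if ev.op == "delete":
            table.pop(ev.key, None)
        else:
            # Inserts and updates both overwrite the row with its new image.
            table[ev.key] = dict(ev.row or {})


# Changes captured on the source, in commit order (made-up sample data).
events = [
    ChangeEvent("insert", "customers", 1, {"name": "Alice", "city": "Paris"}),
    ChangeEvent("insert", "customers", 2, {"name": "Bob", "city": "Lyon"}),
    ChangeEvent("update", "customers", 1, {"name": "Alice", "city": "Nice"}),
    ChangeEvent("delete", "customers", 2),
]

replica: Dict[str, Dict[int, Dict[str, Any]]] = {}
apply_changes(replica, events)
print(replica)
```

The target ends up holding only the final state of each surviving row. Note that nothing in the replay knows about business rules: it copies data, not behavior.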

Several CDC solutions exist, and their promises are attractive. For one of my clients, the tool synchronized data within seconds while using only about 0.1% of the CPU on the source system. Yes, it’s as magical as that!

What Not to Do With It

Database synchronization can sound like an anti-pattern or, if you prefer, a dangerous pattern. Every piece of data is governed by business rules, and with synchronization those rules end up managed by at least two applications owned by two different teams. I strongly advise against having two systems that can each change the data in their own database.

You’ll have serious consistency issues, and CDC will never synchronize business rules between two applications. Use data synchronization only for data that you will expose read-only, or data that you will feed into your own processing, such as BI.

Why Does It Matter?

It matters because you often cannot expose data directly from your plain old legacy system, as it could not handle the extra load. Your legacy system may also be tailored to batch processing, whereas you need to expose data to a customer portal whose users don’t want to wait.

Let’s take an example. All your data and business rules live in a mainframe, and you want to build a new portal for your customers. Mostly, customers want to see their own data, look at offers, and browse a catalog. It is generally considered that about 80% of a portal’s actions are reads. So how do you handle the remaining 20%?

My portal is not just a catalog of data! The customer wants to interact, and you told me not to use CDC for anything other than read operations. Well, just remember the CQRS (Command Query Responsibility Segregation) pattern, which says that read operations and write/update/delete operations do not have to be handled by the same system.
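The split can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `LegacySystem`, `ReadReplica`, and `Portal` are hypothetical stand-ins, and `refresh_from` plays the role that a CDC tool would play continuously in a real setup.

```python
from typing import Dict, List


class LegacySystem:
    """Stand-in for the mainframe: owns the data and the business rules."""

    def __init__(self) -> None:
        self.orders: Dict[int, dict] = {}

    def place_order(self, order_id: int, item: str, qty: int) -> None:
        if qty <= 0:  # the business rule lives only on the write side
            raise ValueError("quantity must be positive")
        self.orders[order_id] = {"item": item, "qty": qty}


class ReadReplica:
    """Stand-in for the synchronized copy: it exposes queries only."""

    def __init__(self) -> None:
        self._orders: Dict[int, dict] = {}

    def refresh_from(self, legacy: LegacySystem) -> None:
        # Assumption: a CDC tool would keep this copy up to date instead.
        self._orders = {k: dict(v) for k, v in legacy.orders.items()}

    def list_orders(self) -> List[dict]:
        return [dict(id=k, **v) for k, v in sorted(self._orders.items())]


class Portal:
    """Routes commands to the legacy system and queries to the replica."""

    def __init__(self, legacy: LegacySystem, replica: ReadReplica) -> None:
        self._legacy = legacy
        self._replica = replica

    def order(self, order_id: int, item: str, qty: int) -> None:
        self._legacy.place_order(order_id, item, qty)   # write path

    def my_orders(self) -> List[dict]:
        return self._replica.list_orders()              # read path


legacy, replica = LegacySystem(), ReadReplica()
portal = Portal(legacy, replica)

portal.order(1, "book", 2)
before_sync = portal.my_orders()   # the replica has not caught up yet
replica.refresh_from(legacy)
after_sync = portal.my_orders()
print(before_sync, after_sync)
```

The read side lags behind the write side until the next synchronization, which is the trade-off this kind of split accepts.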

Yes, for writes you must directly “attack” your legacy system. You are free to use an asynchronous mechanism for that, so that you don’t overload the legacy system. Users understand that confirming a delivery can take time, but they don’t want to wait when searching for goods on your site.
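One possible shape for such an asynchronous mechanism is a queue in front of the legacy system, sketched here with Python's standard `queue` and `threading` modules. The names `legacy_worker` and `submit_delivery_confirmation` are hypothetical, and appending to a list stands in for the slow legacy call; a production setup would more likely use a message broker.

```python
import queue
import threading
from typing import List


def legacy_worker(inbox: queue.Queue, confirmed: List[int]) -> None:
    """Drain the queue one command at a time, at the legacy system's pace."""
    while True:
        order_id = inbox.get()
        if order_id is None:          # sentinel: no more work
            break
        confirmed.append(order_id)    # stand-in for the slow legacy call


inbox: queue.Queue = queue.Queue()
confirmed: List[int] = []
worker = threading.Thread(target=legacy_worker, args=(inbox, confirmed))
worker.start()


def submit_delivery_confirmation(order_id: int) -> str:
    """Called by the portal: enqueue the command and answer immediately."""
    inbox.put(order_id)
    return "accepted"


statuses = [submit_delivery_confirmation(i) for i in (101, 102, 103)]

inbox.put(None)   # tell the worker to stop once the queue is drained
worker.join()
print(statuses, confirmed)
```

The portal answers "accepted" right away, while a single worker feeds the legacy system in arrival order, so a burst of requests never reaches it all at once.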

Different Solutions?

Many solutions exist, such as Attunity, Syncsort, and Talend. I honestly do not know of any that are free or open source, and pricing varies a lot between them. Each has its own strengths; an interesting one is Syncsort, which can synchronize a mainframe with a Cloudera Hadoop installation.

So, if one of these fits your needs, don’t hesitate to take a look. You can find a list in the Gartner Magic Quadrant for Data Integration Tools. Be careful: this list mixes CDC solutions with other kinds of tools, such as data virtualization with Denodo (very interesting, by the way!).

