What to Do When Data Goes out of Sync?

We've all said this at some point. So, how do we fix it? Read this article to find out.

By Akshat Kansal · Jun. 22, 18 · Opinion

Oh! The data is out of sync.

I am sure many of us have heard this more than once while building systems to support scale or to give users a better experience.

We have all seen situations where we want the contents of a database in other systems as well, for example, in a Hadoop cluster for analytics, in Elasticsearch for better search, or in a cache so that applications stay fast.

If the data never changed, this would be easy: we could take a snapshot of the database and load it into the other system.

However, reality has a different story to tell. By the time we finish loading the snapshot, the data is already stale, which is rarely acceptable when downstream systems are expected to be current.

So, what do we do in case we need real-time data in other systems?

Most of us end up asking our applications to write to multiple systems. Every time the application writes to the database, it also updates the cache for faster retrieval, reindexes the search system, and sends the data to analytics.

Is there any problem with this approach? Probably not, until we hear that the cache is out of sync or holds stale data, or that a change someone made is not reflected in analytics because the sync job failed or has not yet pushed the data. Over time, this approach runs into race conditions and reliability issues. What we end up with is data drift across multiple systems, a big team of engineers rebuilding caches and making sure data is available everywhere, and tons of monitoring infrastructure.
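To make the race condition concrete, here is a minimal sketch in Python. The `database` and `cache` dictionaries and the helper functions are hypothetical in-memory stand-ins, not a real client API. Two clients update the same key, each writing to the database first and the cache second, and the four writes happen to interleave in an unlucky order.

```python
# Hypothetical in-memory stand-ins for a real database and a real cache.
database = {}
cache = {}


def write_db(key, value):
    database[key] = value


def write_cache(key, value):
    cache[key] = value


# Each client performs a dual write: database first, then cache.
# This list is one possible interleaving of the four resulting writes.
interleaving = [
    (write_db, "price", 10),     # client A writes the database
    (write_db, "price", 20),     # client B writes the database
    (write_cache, "price", 20),  # client B updates the cache
    (write_cache, "price", 10),  # client A updates the cache, late
]

for step, key, value in interleaving:
    step(key, value)

# The database kept client B's value, the cache kept client A's.
in_sync = database["price"] == cache["price"]
print(database, cache, in_sync)
```

Neither client did anything wrong, and no write failed, yet the two stores now disagree and will keep disagreeing until something rewrites the key.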

Now, let's look at this from a different angle. Let's treat the writes to the database as a stream: every time a change happens in the database, it becomes a new message in the stream. If we apply those messages to another system in the same order, we end up with an exact copy of the data in that system. This is typically how database replication works.
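The idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `SourceDatabase`, `Change`, and `apply_changes` are hypothetical names, and the in-memory list stands in for whatever ordered change log a real database exposes.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Change:
    offset: int                 # position of this change in the log
    op: str                     # "upsert" or "delete"
    key: str
    value: Optional[Any] = None


class SourceDatabase:
    """Toy database that records every write in an ordered change log."""

    def __init__(self):
        self.rows = {}
        self.log = []

    def upsert(self, key, value):
        self.rows[key] = value
        self.log.append(Change(len(self.log), "upsert", key, value))

    def delete(self, key):
        self.rows.pop(key, None)
        self.log.append(Change(len(self.log), "delete", key))


def apply_changes(target, changes, from_offset=0):
    """Apply changes in log order, starting at from_offset.

    Returns the offset to resume from on the next call.
    """
    next_offset = from_offset
    for change in changes[from_offset:]:
        if change.op == "upsert":
            target[change.key] = change.value
        elif change.op == "delete":
            target.pop(change.key, None)
        next_offset = change.offset + 1
    return next_offset


source = SourceDatabase()
source.upsert("user:1", {"name": "Ada"})
source.upsert("user:2", {"name": "Bob"})
source.upsert("user:1", {"name": "Ada L."})
source.delete("user:2")

# A downstream system (say, a search index) replays the log from the start.
search_index = {}
offset = apply_changes(search_index, source.log)

# Later writes are picked up by resuming from the stored offset.
source.upsert("user:3", {"name": "Cy"})
offset = apply_changes(search_index, source.log, offset)
```

Because the application writes only to the database and every downstream system consumes the same ordered log, there is a single order of events and nothing for concurrent writers to race on. Remembering the offset also lets a consumer that was down resume where it stopped instead of rebuilding from a snapshot.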

This approach to building systems is called Change Data Capture. It is already in use at companies such as Yelp, Facebook, and LinkedIn.

I am very excited about this because it lets us unlock the value of data we already have. We can feed the changes into a central hub where they can be enriched with event streams and data from other databases in real time. This makes it much easier to experiment with a minimal risk of data corruption.

I will write another post on how to implement it.
