Common Data Migration Mistakes

Data migration doesn't have to be complicated. Read on to get some great advice on the data migration process and leave the headaches for someone else.

by Garrett Alley · Mar. 26, 19 · Opinion

Data migration can seem like a simple task: move data from one place to another. How hard can it be? We start a project with the best of intentions, thinking, "What could possibly go wrong?"

Unfortunately, migrating data can be more complex than it looks, and many of the challenges come from things we forgot to do or assumed we didn't need to do. Let's take a look at a few of the common issues that can trip you up when migrating data.

Waiting Until the Target Is Ready to Get Started

Often when migrating data, people wait until the target is ready before they get started. This is a mistake, because a large part of the work in migrating data is the careful planning and scoping of the project. You'll need to gather requirements and agree on metrics for success. You'll also need to plan schema mapping, data mapping, backup and recovery, security, and go-live. Each of these steps takes considerable time. And once all that planning is done, the source data still has to be cleansed and normalized before it can be moved. If you wait until the target is ready to go, you may be behind schedule before you even start.

Surprises in Your Data

Part of your planning should include an assessment of your sources and their dependencies. You need to perform an inventory of all your data assets, and the associated applications, to find dependencies. Pay close attention to the upstream and downstream applications affected by your data migration. A complex project may have between 60 and 80 different data objects coming in from a hundred or so different applications. When you discover new source data or dependencies late in the game, it can throw off your migration timeline and add complexity to your project.
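The inventory described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `INVENTORY` dictionary, the application names, and the `find_dependencies` helper are all hypothetical, and a real inventory would come from your own catalog or documentation rather than a hand-written dictionary.

```python
from collections import defaultdict

# Hypothetical inventory: each application lists the data objects it writes and reads.
INVENTORY = {
    "billing":   {"writes": {"invoices"}, "reads": {"customers"}},
    "crm":       {"writes": {"customers"}, "reads": set()},
    "reporting": {"writes": set(), "reads": {"invoices", "customers"}},
}

def find_dependencies(inventory, migrating):
    """Return (upstream, downstream) applications for each data object being migrated."""
    upstream = defaultdict(set)    # applications that write the object
    downstream = defaultdict(set)  # applications that read the object
    for app, io in inventory.items():
        for obj in io["writes"] & migrating:
            upstream[obj].add(app)
        for obj in io["reads"] & migrating:
            downstream[obj].add(app)
    return dict(upstream), dict(downstream)

# Which applications are affected if we migrate the "customers" object?
upstream, downstream = find_dependencies(INVENTORY, {"customers"})
```

Running a check like this early, for every object in scope, is one way to surface affected applications before they become late surprises.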

Skipping Data Cleansing

Sometimes when migrating data, it seems easier to just move the data and clean it once it is in the target. But the time to clean your data is before you move it. If you were moving to a new house, would you take the contents of your garbage can with you? Likely not. So why would you move bad data? If you move the data without cleansing it, you'll perpetuate the problems that existed in the source data.

Before you move your data, you should take the time to perform a data profile. A data profile is a thorough examination of your existing data. Profiling your data will help you to understand whether there are blank or null values, whether the data is unique or duplicated, and whether the data patterns and values fall into the range you expect. After you perform a thorough data profile, you'll need to perform data mapping to plan how the source types will correlate to the types in the target.

Next, you'll cleanse and validate your data. This involves removing extraneous data, filling in missing data, normalizing data (making it conform to a pattern that is compatible with other data), and masking sensitive data. You may also need to transform and enrich the data. Data transformation is the process of converting data from one format or structure into another. Some of these processes must be done before you extract the data, while others can be done after extracting the data but before loading it to the target. A flexible ETL tool can help ease some of the work in this process.
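To make the profiling step concrete, here is a minimal sketch in plain Python. It assumes the source rows are already available as dictionaries; the `profile_column` helper, the sample rows, and the expected age range of 0 to 120 are made up for illustration, and a dedicated profiling or ETL tool would report far more than this.

```python
from collections import Counter

def profile_column(rows, column, expected_range=None):
    """Summarize one column: missing values, duplicates, and out-of-range values."""
    values = [row.get(column) for row in rows]
    # Treat None and empty strings as missing.
    present = [v for v in values if v is not None and v != ""]
    counts = Counter(present)
    out_of_range = []
    if expected_range is not None:
        low, high = expected_range
        out_of_range = [v for v in present if not (low <= v <= high)]
    return {
        "total": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "duplicates": sorted(v for v, n in counts.items() if n > 1),
        "out_of_range": out_of_range,
    }

# Made-up sample rows, for illustration only.
rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": None},
    {"id": 2, "age": 212},
    {"id": 3, "age": 41},
]

id_profile = profile_column(rows, "id")                            # flags the repeated id 2
age_profile = profile_column(rows, "age", expected_range=(0, 120))  # flags the null and 212
```

Even a small report like this tells you which columns need cleansing, deduplication, or a mapping decision before anything is moved.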

Not Hiring Experts

Often, the perception of a data migration project is that it is a "lift and shift" operation. This perception leads project leaders to skimp when hiring or assigning staff to the project. Migrating data takes an understanding of the complexities of data profiling, data cleansing, and security requirements, among other things. It is easy to underestimate just how complex and challenging data migrations can become, and spending less on experts can cost you in the long run. If you move bad data, or if you neglect security, you can end up with poor data quality or, worse, a security breach. At the very least, it can take a long time for a newbie to ramp up, and your project can be severely delayed.

No Rollback Plan

Sometimes when you are migrating data, there is a lot of pressure to keep moving forward, and it might seem tempting to push your changes to the target and fix any issues after you have moved the data. A better way to handle this is to have a rollback plan for the various stages of the project. This involves performing checks at each stage and having backups configured in case you need to roll back changes. While this may seem more tedious, it will save you headaches down the road.
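As a rough sketch of "check at each stage, roll back on failure," the example below loads one batch inside a database transaction and only commits if a row-count check passes. It uses Python's built-in `sqlite3` module with an in-memory database purely for illustration; the `target` table, the `migrate_with_rollback` helper, and the row-count check are assumptions, and a transaction rollback like this complements backups rather than replacing them.

```python
import sqlite3

def migrate_with_rollback(conn, rows, expected_count):
    """Load one batch into the target table; roll back if the load or the check fails."""
    try:
        conn.executemany("INSERT INTO target (id, name) VALUES (?, ?)", rows)
        (loaded,) = conn.execute("SELECT COUNT(*) FROM target").fetchone()
        if loaded != expected_count:
            raise ValueError(f"expected {expected_count} rows, found {loaded}")
        conn.commit()
        return True
    except (sqlite3.Error, ValueError):
        conn.rollback()  # undo everything this batch inserted
        return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()

ok = migrate_with_rollback(conn, [(1, "Ada"), (2, "Grace")], expected_count=2)
# The second batch reuses id 1, so the insert fails and the whole batch is rolled back.
failed = migrate_with_rollback(conn, [(3, "Linus"), (1, "Duplicate")], expected_count=4)
remaining = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

The point of the design is that a failed stage leaves the target exactly as it was before the stage began, so you can fix the source data and retry instead of repairing a half-loaded target.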

Published at DZone with permission of Garrett Alley, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
