Data Migration With AWS DMS and Terraform IaC

Learn how to migrate data from any supported source database to a supported target database using AWS Database Migration Service (AWS DMS) and Terraform IaC.

By Sameer Danave · May. 30, 24 · Tutorial

Data is the new oil—a saying I often hear, and it couldn't be more accurate in today's highly interconnected world. Data migration is crucial for organizations worldwide, from startups aiming to scale rapidly to enterprises seeking to modernize IT infrastructure.

However, as a tech enthusiast, I've often found myself navigating the complexities of moving large volumes of data across different environments. Whether it is a one-time event or ongoing replication, a data migration that is poorly planned, run manually rather than scripted, or inadequately tested can cause issues during the migration and increase delays or downtime.

To take this challenge head-on, I've spoken with several technology heads to understand how AWS DMS streamlines data migration journeys. AWS DMS provides a platform to execute migrations effectively with minimal downtime. I've also realized that we can completely automate this process using Terraform IaC to trigger migration from any supported source database to a target database. Using Terraform, we can create the infrastructure required for the target nodes and the AWS DMS resources, which then complete the data migration automatically.

In this blog, we'll dive into the intricacies of data migration using AWS DMS and Terraform IaC. We'll cover:

  • What is AWS Database Migration Service (AWS DMS)?
  • How to automate data migration using AWS DMS and Terraform IaC
  • Key benefits and features of AWS DMS

Let's get started!

What Is AWS DMS (Database Migration Service)?

AWS DMS (Database Migration Service) is a cloud-based tool that facilitates database migration to the AWS Cloud by replicating data from any supported source to any supported target. It also supports change data capture (CDC), which replicates changes from source to target on an ongoing basis.

AWS DMS Architectural Overview

Use Cases of AWS DMS

AWS Database Migration Service (AWS DMS) supports many use cases, from like-to-like migrations to complex cross-platform transitions.

Homogeneous Data Migration 

Homogeneous database migration migrates data between identical or similar databases. This one-step process is straightforward due to the consistent schema structure and data types between the source and target databases.

Homogeneous Database Migration

Heterogeneous Database Migration

Heterogeneous database migration involves transferring data between different databases, such as Oracle to Amazon Aurora, Oracle to PostgreSQL, or SQL Server to MySQL. This process requires converting the source schema and code to match the target database. 

Using the AWS Schema Conversion Tool, this migration becomes a two-step procedure: schema transformation and data migration. Source schema and code conversion involve transforming tables, views, stored procedures, functions, data types, synonyms, etc. Any objects that the AWS Schema Conversion Tool can't automatically convert are clearly marked for manual conversion to complete the migration.

DMS Schema Conversion

Heterogeneous Database Migrations

Prerequisites for AWS DMS

The following are prerequisites for AWS DMS data migration:

  • Access to source and target endpoints through firewall and security groups
  • Source endpoint connection
  • Target endpoint connection
  • Replication instance
  • Target schema or database
  • CloudWatch event to trigger the Lambda function
  • Lambda function to start the replication task
  • Resource limit increase

AWS DMS Components

Before migrating with AWS DMS, let's understand its components.

Replication Instance

Replication instances are managed Amazon EC2 instances that run replication jobs. They connect to the source data store, read and format the data for the target, and load it into the target data store.

Replication Instance

Source and Target Endpoints

AWS DMS uses endpoints to connect to source and target databases, allowing it to migrate data from a source endpoint to a target endpoint.

Supported Source Endpoints

Supported source endpoints include Google Cloud for MySQL, Amazon RDS for PostgreSQL, Microsoft SQL Server, Oracle Database, Amazon DocumentDB, PostgreSQL, Microsoft Azure SQL Database, IBM DB2, Amazon Aurora with MySQL compatibility, MongoDB, Amazon RDS for Oracle, Amazon S3, Amazon RDS for MariaDB, Amazon RDS for Microsoft SQL Server, MySQL, Amazon RDS for MySQL, Amazon Aurora with PostgreSQL compatibility, MariaDB, and SAP Adaptive Server Enterprise (ASE).

Supported Target Endpoints

Supported target endpoints include PostgreSQL, SAP Adaptive Server Enterprise (ASE), Google Cloud for MySQL, IBM DB2, MySQL, Amazon RDS for Microsoft SQL Server, Oracle Database, Amazon RDS for MariaDB, Amazon Aurora with MySQL compatibility, MariaDB, Amazon S3, Amazon RDS for PostgreSQL, Microsoft SQL Server, Amazon DocumentDB, Microsoft Azure SQL Database, Amazon RDS for Oracle, MongoDB, Amazon Aurora with PostgreSQL compatibility, and Amazon RDS for MySQL.

Replication Tasks

Replication tasks move data from a source endpoint to a target endpoint. A task specifies the tables and schemas to migrate and any special processing requirements, such as logging, control table data, and error handling. A replication task must be created before the migration starts; creating it means defining the migration type, the source and target endpoints, and the replication instance.
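As a rough illustration of what "specifying tables and schemas" and "logging" look like in practice, the Python sketch below builds a minimal table-mapping document with one selection rule and a minimal task-settings document that turns logging on. The schema name `sales` and the function names are hypothetical, and real tasks usually carry many more settings than shown here.

```python
import json


def build_table_mappings(schema_name: str, table_pattern: str = "%") -> str:
    """Return a table-mapping JSON document with a single selection rule.

    The rule includes every table in `schema_name` whose name matches
    `table_pattern` ("%" is the wildcard for "all tables").
    """
    mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-" + schema_name,
                "object-locator": {
                    "schema-name": schema_name,
                    "table-name": table_pattern,
                },
                "rule-action": "include",
            }
        ]
    }
    return json.dumps(mappings, indent=2)


def build_task_settings(enable_logging: bool = True) -> str:
    """Return a minimal task-settings JSON document (logging only)."""
    settings = {"Logging": {"EnableLogging": enable_logging}}
    return json.dumps(settings, indent=2)


if __name__ == "__main__":
    # "sales" is a made-up schema name used only for this example.
    print(build_table_mappings("sales"))
    print(build_task_settings())
```

Both strings would then be passed to whatever creates the replication task, whether that is Terraform, the console, or an SDK call.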

A replication task supports the following migration types:

  • Full Load: Migrates existing data only.
  • Full Load with CDC (Change Data Capture): Migrates existing data and continuously replicates changes.
  • CDC Only (Change Data Capture): Continuously replicates only the changes in data.
  • Validation Only: Focuses solely on data validation.

These types lead to three main phases:

  • Migration of Existing Data (Full Load): AWS DMS transfers data from the source tables to the target tables.
  • Cached Changes Application: While the full load is in progress, changes to the tables being loaded are cached on the replication server. Once the full load for a table is complete, AWS DMS applies the cached changes.
  • Ongoing Replication (Change Data Capture): Initially, a transaction backlog causes the target database to lag behind the source. Over time, this backlog is processed and replication reaches a steady flow.

Working through these phases in order lets AWS DMS carry out the migration methodically while maintaining data integrity and consistency.

CloudWatch Events

Amazon EventBridge (formerly CloudWatch Events) delivers notifications about AWS DMS events, such as replication task initiation or deletion and replication instance creation or removal. EventBridge receives these events and routes notifications based on predefined rules.

Lambda Function

We use an AWS Lambda function to initiate replication tasks. When an event signaling task creation occurs in AWS DMS, the Lambda function is automatically triggered by the configured EventBridge rules.
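A handler along these lines could do the job. This is a minimal Python sketch, not the function used in the original setup. It assumes the replication task ARN appears in the event's top-level `resources` list, which should be checked against a real event, and it accepts an optional `dms_client` argument (a hypothetical addition) so the logic can be exercised without AWS access.

```python
from typing import Optional


def extract_task_arn(event: dict) -> Optional[str]:
    """Return the first replication task ARN found in the event, if any.

    Assumption: the task ARN is one of the entries in the event's
    top-level "resources" list. DMS task ARNs contain ":task:".
    """
    for resource in event.get("resources", []):
        if ":task:" in resource:
            return resource
    return None


def lambda_handler(event, context, dms_client=None):
    """Start the replication task referenced by the incoming event."""
    if dms_client is None:
        # Imported lazily so the module can be loaded without boto3.
        import boto3

        dms_client = boto3.client("dms")

    task_arn = extract_task_arn(event)
    if task_arn is None:
        return {"started": False, "reason": "no replication task ARN in event"}

    dms_client.start_replication_task(
        ReplicationTaskArn=task_arn,
        StartReplicationTaskType="start-replication",
    )
    return {"started": True, "task_arn": task_arn}
```

A production handler would also need to handle the case where the task is not yet ready to start, for example by catching the resulting error and retrying later.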

Resource Limits

AWS Database Migration Service (DMS) applies default resource quotas, which are soft limits. They can be increased through an AWS support ticket when a migration needs more capacity.

Critical AWS DMS resource limits include:

  • Endpoints per user account: 1000 (default)
  • Endpoints per replication instance: 100 (default)
  • Tasks per user account: 600 (default)
  • Tasks per replication instance: 200 (default)
  • Replication instances per user account: 60 (default)

For example, to migrate 100 databases from an on-premises MySQL source to RDS MySQL, we use the following calculation:

  • Tasks per database: 1
  • Endpoints per database: 2
  • Endpoints per replication instance: 100

Databases per replication instance = Endpoints per replication instance / Endpoints per database = 100 / 2 = 50. At one task per database, that is 50 tasks per instance, well within the limit of 200 tasks per replication instance.

This means we can migrate up to 50 databases per replication instance. Using two replication instances, we can migrate all 100 databases efficiently in one go. This approach exemplifies the strategic use of resource quotas for effective database migration.
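The same arithmetic can be written as a small Python helper, which is handy when the database count or the quotas change. The function names are made up for this sketch, and the default quota values are the ones listed above.

```python
import math


def databases_per_instance(
    endpoints_per_db: int = 2,
    tasks_per_db: int = 1,
    endpoint_quota: int = 100,
    task_quota: int = 200,
) -> int:
    """How many databases one replication instance can serve.

    Both the endpoint quota and the task quota apply, so the smaller
    of the two resulting capacities is the binding one.
    """
    by_endpoints = endpoint_quota // endpoints_per_db
    by_tasks = task_quota // tasks_per_db
    return min(by_endpoints, by_tasks)


def instances_needed(database_count: int, **quota_overrides: int) -> int:
    """Number of replication instances needed for `database_count` databases."""
    capacity = databases_per_instance(**quota_overrides)
    return math.ceil(database_count / capacity)


if __name__ == "__main__":
    print(databases_per_instance())  # capacity of one instance
    print(instances_needed(100))     # instances for 100 databases
```

With the numbers from the example, `databases_per_instance()` gives 50 and `instances_needed(100)` gives 2.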

How To Automate Data Migration With Terraform IaC: Overview

Terraform and DMS automate and secure data migration, simplifying the process while managing AWS infrastructure efficiently. 

Here's a step-by-step overview of this seamless and secure migration process:

Step 1: Fetching Migration Database List

Retrieve a list of databases to be migrated.

Step 2: Database Creation (Homogeneous Migration)

For homogeneous migrations, create the target schema or database structures ahead of the data transfer.

Step 3: Replication Subnet Group Creation

Create replication subnet groups to ensure seamless network communication for data movement.

Step 4: Source/Target Connection Endpoints

Create source and target connection endpoints for each database to be migrated.

Step 5: Replication Instance Creation

Create replication instances to handle the data migration process. 

Step 6: Lambda Integration With CloudWatch Events

Integrate a CloudWatch event and Lambda function to initiate replication tasks.

Step 7: Replication Task Creation and Assignment

Create and assign replication tasks to replication instances, setting up the migration. 

Step 8: Migration Task Initiation

Start the migration task for each database.

Migration Process & Workflow Diagram

Architecture Overview for Data Migration Automation

AWS DMS with Terraform Infrastructure as Code (IaC) automates the data migration. The automation begins with a Jenkins pipeline whose input parameters tailor the migration process to each run, offering flexibility and adaptability.

Here's a detailed overview of the architecture:

AWS DMS Architecture with Terraform IAC

Step 1: Jenkins Pipeline Parameters

The Jenkins pipeline for AWS DMS starts by defining essential input parameters, such as region and environment details, Terragrunt module specifics, and migration preferences. 

Key input parameters include:

  • AWS_REGION: Populates the region list from the repository.
  • APP_ENVIRONMENT: Populates the application environment list from the repository.
  • TG_MODULE: Populates the Terragrunt module folder list from the repository.
  • TG_ACTION: Allows users to select a Terragrunt action (plan, validate, or apply).
  • TG_EXTRA_FLAGS: Allows users to pass extra flags to Terragrunt.
  • FETCH_DBLIST: Determines how the migration DB list is generated (AUTOMATIC or MANUAL).
  • CUSTOM_DBLIST: SQL Server custom Database list for migration if FETCH_DBLIST is selected as MANUAL.
  • MIGRATION_TYPE: Allows users to choose the DMS migration type (full-load, full-load-and-cdc, cdc).
  • START_TASKS: Allows users to turn migration task execution on or off.
  • TEAMS: MS Teams channel for build notifications.

Step 2: Execution Stages

Based on the input parameters, the pipeline progresses through distinct execution stages:

  • Source Code Checkout for IAC: The pipeline begins by checking out the source code for IAC, establishing a solid foundation for the following steps.
  • Migration Database List: Depending on the FETCH_DBLIST setting, the pipeline automatically fetches the migration database list from the source instance or uses a manual list.
  • Schema or Database Creation: The target instance is prepared by creating the necessary schema or database structures for data migration.
  • Terraform/Terragrunt Execution: The pipeline executes Terraform or Terragrunt modules to facilitate the AWS DMS migration process.
  • Notifications: Updates are sent via email or MS Teams throughout the migration process.

Step 3: Automatic and Manual List Fetching

With FETCH_DBLIST set to AUTOMATIC, a shell script fetches the migration database list from the source instance. Alternatively, users can manually provide a selective list for migration.
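The selection logic can be sketched in Python as follows. The original pipeline uses a shell script for the automatic fetch, so the `fetch_from_source` callable here is a stand-in for it. Two further assumptions are made: that CUSTOM_DBLIST arrives as a comma-separated string, and that SQL Server system databases should be skipped.

```python
from typing import Callable, List

# SQL Server system databases, skipped here by assumption.
SYSTEM_DATABASES = {"master", "model", "msdb", "tempdb"}


def resolve_db_list(
    fetch_mode: str,
    custom_dblist: str,
    fetch_from_source: Callable[[], List[str]],
) -> List[str]:
    """Return the ordered, de-duplicated list of databases to migrate."""
    if fetch_mode == "MANUAL":
        names = [name.strip() for name in custom_dblist.split(",")]
    elif fetch_mode == "AUTOMATIC":
        names = fetch_from_source()
    else:
        raise ValueError(f"unknown FETCH_DBLIST value: {fetch_mode!r}")

    seen = set()
    result = []
    for name in names:
        # Drop blanks, system databases, and repeats while keeping order.
        if not name or name.lower() in SYSTEM_DATABASES or name in seen:
            continue
        seen.add(name)
        result.append(name)
    return result
```

For example, `resolve_db_list("MANUAL", "sales, hr", lambda: [])` would return the two names supplied by the user without contacting the source instance.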

Step 4: Migration Types

The Terraform/Terragrunt module initiates a cdc, full-load-and-cdc, or full-load migration, as specified in MIGRATION_TYPE.

Step 5: Automation Control

Initiate the migration task, either manually or automatically, with START_TASKS.

Step 6: Credentials Management

For security, retrieve database credentials from AWS Secrets Manager while executing DMS Terraform/Terragrunt modules. 
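In the pipeline described here, this lookup happens inside the Terraform/Terragrunt modules. Purely to illustrate the idea, the Python sketch below reads a secret and extracts the credentials. It assumes the secret is stored as a JSON string with `username` and `password` keys, and the `secrets_client` argument is a hypothetical hook for testing.

```python
import json


def load_db_credentials(secret_id: str, secrets_client=None) -> dict:
    """Fetch database credentials from AWS Secrets Manager.

    Assumption: the secret's SecretString is a JSON object that has
    "username" and "password" keys.
    """
    if secrets_client is None:
        # Imported lazily so the module can be loaded without boto3.
        import boto3

        secrets_client = boto3.client("secretsmanager")

    response = secrets_client.get_secret_value(SecretId=secret_id)
    secret = json.loads(response["SecretString"])
    return {"username": secret["username"], "password": secret["password"]}
```

Keeping credentials in Secrets Manager means they never need to appear in pipeline parameters or in the IaC source.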

Step 7: Endpoint Creation

Establish endpoints for target and source instances, facilitating seamless connection and data transfer. 

Step 8: Replication Instances

Create replication instances based on the database count or quota limits. 

Step 9: CloudWatch Integration

Configure AWS CloudWatch events to trigger a Lambda function after AWS DMS replication tasks are created.

Step 10: Replication Task Configuration

Create replication tasks for individual databases and assign them to available replication instances for optimized data transfer.
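One simple way to spread tasks across instances is to fill each instance up to its capacity before moving to the next. The Python sketch below does that. The `dms-instance-N` naming and the default capacity of 50 are illustrative assumptions, and the original modules may distribute tasks differently.

```python
from typing import Dict, List


def assign_to_instances(
    databases: List[str], per_instance: int = 50
) -> Dict[str, List[str]]:
    """Map replication instance names to the databases they will migrate.

    Databases are assigned in order, `per_instance` at a time.
    """
    if per_instance < 1:
        raise ValueError("per_instance must be at least 1")

    assignments: Dict[str, List[str]] = {}
    for index, database in enumerate(databases):
        # Hypothetical naming scheme: dms-instance-1, dms-instance-2, ...
        instance = f"dms-instance-{index // per_instance + 1}"
        assignments.setdefault(instance, []).append(database)
    return assignments
```

With 100 databases and the default capacity, this produces two instances with 50 databases each.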

Step 11: Task Automation

Once replication tasks reach the Ready state, the Lambda function starts them automatically.

Step 12: Monitoring Migration

Use the AWS DMS Console for real-time monitoring of data migration progress, gaining insights into the migration journey.
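The same progress figures shown in the console can also be read programmatically, which may be useful for pipeline notifications. The Python sketch below is one possible approach using the `describe_replication_tasks` call. The function name and the shape of the returned summary are choices made for this example, and `dms_client` is an optional hook for testing.

```python
from typing import Optional


def task_progress(task_arn: str, dms_client=None) -> Optional[dict]:
    """Return a short progress summary for one replication task.

    Returns None when no task with the given ARN is found.
    """
    if dms_client is None:
        # Imported lazily so the module can be loaded without boto3.
        import boto3

        dms_client = boto3.client("dms")

    response = dms_client.describe_replication_tasks(
        Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}]
    )
    tasks = response.get("ReplicationTasks", [])
    if not tasks:
        return None

    task = tasks[0]
    stats = task.get("ReplicationTaskStats", {})
    return {
        "status": task.get("Status"),
        "full_load_percent": stats.get("FullLoadProgressPercent", 0),
        "tables_loaded": stats.get("TablesLoaded", 0),
        "tables_errored": stats.get("TablesErrored", 0),
    }
```

A pipeline stage could poll this summary and post it to the notification channel until the task stops or errors out.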

Step 13: Ongoing Changes

Seamlessly replicate ongoing changes into the target instance after the migration, ensuring data consistency.

Step 14: Automated Validation

Automatically validate migrated data against source and target instances based on provided validation configurations to reinforce data integrity.
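As a hedged sketch of what such a validation configuration and follow-up check might look like, the Python below builds task settings with validation enabled and lists tables that have not reached the `Validated` state. The function names are invented for this example, only the first page of table statistics is read, and tables still pending validation are reported along with genuine mismatches.

```python
import json
from typing import List


def validation_task_settings() -> str:
    """Return task settings JSON with data validation switched on."""
    return json.dumps({"ValidationSettings": {"EnableValidation": True}})


def tables_not_validated(task_arn: str, dms_client=None) -> List[str]:
    """List "schema.table" names whose validation state is not "Validated".

    Simplification: reads only the first page of results.
    """
    if dms_client is None:
        # Imported lazily so the module can be loaded without boto3.
        import boto3

        dms_client = boto3.client("dms")

    response = dms_client.describe_table_statistics(ReplicationTaskArn=task_arn)
    return [
        f"{table['SchemaName']}.{table['TableName']}"
        for table in response.get("TableStatistics", [])
        if table.get("ValidationState") != "Validated"
    ]
```

An empty result from `tables_not_validated` is a reasonable gate before moving on to the completion and cutover steps.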

Step 15: Completion and Configuration

Ensure user migration and database configurations are completed post-validation.

Step 16: Target Testing and Validation

Update the application configuration to use the target instance for testing to ensure functionality.

Step 17: Cutover Replication

Execute cutover replication from the source instance after thorough testing, taking a final snapshot of the source instance to conclude the process.

Key Features and Benefits of AWS DMS With Terraform

AWS DMS with Terraform IAC offers several benefits: cost-efficiency, ease of use, minimized downtime, and robust replication.

Cost Optimization

AWS DMS offers a cost-effective pricing model: you pay for the compute resources used and for any additional log storage.

Ease of Use

The migration process is simplified with no need for specific drivers or application installations and often no changes to the source database. One-click resource creation streamlines the entire migration journey.

Continuous Replication and Minimal Downtime

AWS DMS replicates continuously from the source database while it remains operational, which keeps downtime minimal and makes the switch to the target database seamless.

Ongoing Replication

Maintaining synchronization between source and target databases with ongoing replication tasks ensures data consistency.

Diverse Source/Target Support

AWS DMS supports migrations from like-to-like (e.g., MySQL to MySQL) to heterogeneous migrations (e.g., Oracle to Amazon Aurora) across SQL, NoSQL, and text-based targets.

Database Consolidation

AWS DMS with Terraform can easily consolidate multiple source databases into a single target database, which applies to homogeneous and heterogeneous migrations.

Efficiency in Schema Conversion and Migration

AWS DMS minimizes manual effort in tasks such as migrating users, stored procedures, triggers, and schema conversion while validating the target database against application functionality.

Automated Provisioning With Terraform IAC

Leverage Terraform for automated creation and destruction of AWS DMS replication tasks, ideal for managing migrations involving multiple databases.

Automated Pipeline Integration

Integrate seamlessly with CI/CD pipelines for efficient migration management, monitoring, and progress tracking.

Conclusion

This blog explained in detail how AWS DMS and Terraform IaC can be combined to automate data migration. It serves as a guide to how the two technologies work together, equipping businesses with the tools for an optimized digital transformation.
