Automating Database Migrations With Containers and Git
Script manifest files are used to deliver custom database migrations.
Join the DZone community and get the full member experience.Join For Free
Managing database migrations for multiple environments and teams can be challenging. This article describes how Git, database containers, and cloned databases, are combined for custom dev, test, and staging environments, each delivered in seconds.
While Git is often used with database containers, this approach introduces two new elements. Rather than restoring databases from backups, or building databases from source, identical secure production database environments are cloned and delivered in seconds. Database clones are writable, and easily incorporate data masking and synthetic test data. The second element is a script manifest file used in creation and application of personalized migration scripts.
A developer can work on a feature branch with a production database clone with feature scripts applied automatically. At the same time, testing can work on a release branch with an identical production database clone, with a set of release scripts applied. A pipeline stage can test a release branch rollback with a third identical secure production database clone, with upgrade and rollback scripts applied automatically.
This article focuses on SQL Server, but the same methods support Postgres and MySQL as well.
Any public or private Git repo can be used, including GitHub, GitLab, or Git on a private VM.
A database image is a set of production databases, restored from backups, or database files, with data masking scripts applied. Users can specify scripts from a Git branch and repo, to support specific dev, test, and staging environments. When transaction log backups are available the image can be updated incrementally, forever.
A script manifest file is a text file listing scripts to be run.
A database container is a database instance providing database services.
Database clones are writable databases delivered from an image that are delivered in seconds and require only 40 MB of storage on delivery.
Building a Database Image
A dockerfile includes one more paths to production database backups and data masking scripts. At run time a Git repo is cloned into the container file system, with a specified branch checkout. At run time users or a pipeline provide two environment variables, highlighted in red, specifying the Git branch and a script manifest file (more on this below). The final step includes a PowerShell command that creates a concatenated “all.sql” script, reflecting the order of scripts in the manifest file, which is then run.
The image is built using standard docker command:
>docker build -t microservice1 c:\path\to\dockerfile
A manifest is a text file listing the paths for scripts relative to the root of the Git repo, in the order they are to be run. Multiple manifest files can be used. One manifest can specify upgrade scripts, and a second includes upgrade scripts followed by rollback scripts. A manifest.txt file could include:
Development and Test Environments
Development begins with delivery of a SQL Server container with a production database clone, but with no Git based scripts applied. As work progresses, scripts are committed to the repo, and a manifest file is created and committed listing scripts in the order they are to be run.
Users and pipelines provision environments using docker commands, or restful API. Log output for scripts is available via the Rest API. Environments are used to unit test the SQL scripts and applications, with specific manifest files and Git branches. Delivery includes user inputs highlighted in red.
Release Branch Testing
As work progresses, developer branches are merged into a release branch, with updated manifest files. Testers and DevOps pipelines can now deliver release testing environments using the same database image, referencing the release branch and updated manifest files. Database migrations with Git supports a complete dev/test life cycle with a consistent database environment.
Data Governance and Security
While this approach simplifies and automates database development and test, it also creates a secure centralized data repository. Where organizations struggle with VM and instance sprawl, containers allow for consolidation on the container host. A single database image easily supports 20 to 50 simultaneous environments, for a potential 95% reduction in storage.
Opinions expressed by DZone contributors are their own.