DevOps (a portmanteau of development and operations) is a practice that emphasizes collaboration between software developers and other information-technology (IT) professionals while automating the process of software delivery and infrastructure changes.
DevOps focuses on organizational culture, while continuous delivery and continuous integrations are mainly about automation and tests, which require a trustworthy source control. An ever-increasing number of organizations are implementing DevOps and continuous delivery process. They are fueled by reports of the benefits, which include quicker time to market, reduced costs, and higher quality products.
Effective DevOps and continuous delivery must encompass the database, but the database represents some unique challenges not faced in the application. Here we will address some of these challenges and provide best practices to handle them.
In order to generate the correct database change scripts, the build phase must have information on the current structure and source control structure. But only having the current and source control structure, as is the case with standard compare and sync tools, is not enough.
Simply comparing two environments does not provide insight regarding the nature of the differences, for example:
- A case where the difference conflicts with an emergency fix.
- The trunk/stash/QA environment was already updated with other changes from a different branch.
- The later environment (trunk/stash/QA) is more up-to-date regarding specific objects – thus the difference should not be part of the delta changes script.
This missing information is only available with baseline aware analysis. The input for the database build phase should absolutely be taken from the source control repository which includes only changes that were checked-in and does not include changes that are still in work-in-progress mode. This brings us to the starting point of the process—the source control and how to make sure the build process retrieves the relevant changes.
Develop Using a Reliable Database Source Control
In this phase, developers introduce changes to the database structure, reference lookup content, or logic in the database (procedures, function, etc.)
The two common approaches of database development are:
- Using a shared database environment for the team.
- Using a private database environment for each developer.
Both methods have many advantages and challenges.
Using a shared database environment reduces the code merges for the database code and also reduces the complexity and cost of updating the database structure based on the source control. Using a private database environment causes many merges of the database code, but reduces the potential of code overriding by another colleague. In addition, a private database environment may have other factors to consider such as management overhead, licenses, hardware, and cost.
The primary reason why the private environment method is not commonly used, relates to how developers publish changes from their private workspace environment to the integration environment. Publishing changes should not revert changes made by someone else and updating the private environment from the source control repository should not revert work-in-progress.
The same process of building the native code using only changes which are documented in the source control repository should be applied to database code changes. Developers work on the native code in the IDE and then check-in the changes to the source control repository without any additional manual steps. Having a file-based script that a developer is maintaining for his/her changes will create a few challenges that will be difficult to resolve and will require a lot of time.
How to guarantee that the version control repository correctly represents the database structure that was tested.
If Developer A makes a number of changes to a script and Developer B makes other changes to the script, none of the developers can execute his/her entire script because the script overrides (or reverts) the changes introduced by the other developer.
In addition, there are other challenges that occur in deployment phase but originate in previous phases:
- Controlling the order of the execution of scripts created by several developers.
- Maintaining the change scripts on a release scope change.
Instead of running many small scripts (in the same order they’ve been executed in QA) which may change the same object several times, execute fewer scripts, and change the object only once—this is difficult to practice as it is expensive to generate the script from scratch and test it.
Source Control – Single Source of Truth?
Anyone with sufficient database credentials may log in to the database, introduce a change, and forget to apply the change in the relevant script of the file-based version control. This is what has reportedly happened in finance, insurance, online travel, algo-trading, gaming, and other industries.
Database Deployment Logic
Another unique challenge from the database point of view is how deployment is done. Can the database deployment process act as the native code—replacing the existing database/table in production with the new database/table from the development? Or does it have to alter the existing database structure in production from the current state to the target state to preserve the data?
Deploying native code artifacts—binaries of Java, C#, C++—is done by copying the new binaries and overriding the existing ones (current state has no effect on the binary content). Deploying database code changes is done by changing the structure of the database or schema from the given state (current state) to the target state (end point). When executing a script in the database, the given state (current point) must be the same as it was when the script was generated otherwise the outcome is not predictable.
When using the correct method, all steps of automated database Continuous Delivery are possible, from the build, to deploy, to test.