This section lists a set of CD-related patterns and anti-patterns in a logical order, from delivery to deployment, along with some requirements that are external to the automated process but necessary to facilitate its adoption.
The delivery phase extends continuous integration: all code changes are automatically deployed to a test environment to qualify the source code and verify a version's compliance before it is deployed to production.
CD lets developers automate testing beyond just unit tests so they can verify application updates across multiple dimensions before deploying to production. These tests can take several forms:
- UI testing
- Load testing
- Integration testing
- Regression testing
- API reliability testing
The main goal is to help developers thoroughly validate updates and preemptively discover issues to minimize impact on the customer experience. As an important part of the DevOps methodology, the automated testing phase is an excellent opportunity to break down silos by combining the efforts of development, quality assurance (QA), and operations teams to improve the reliability of the deliverable.
Pattern: Automate the verification and validation of software to include unit, component, capacity, functional, and deployment tests
Anti-Pattern: Use manual tests to verify and validate software
Mock the Environment
Mocking means creating a fake version of an external or internal service that can replace the real one so that developers can test the source code faster and more reliably, using unit tests, for example. Mocking is important for ensuring the portability and reproducibility of the CI pipeline. In CD, these same processes can also be used by the operations and QA teams to perform various tests.
Pattern: Run tests the same way on any platform (laptop, on-premises, cloud, container orchestration platform, etc.) to always have the same result with mocked data
Anti-Pattern: Run tests that will potentially fail based on the status of the infrastructure
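As a minimal sketch of this pattern using Python's built-in `unittest.mock`, the function below talks to a hypothetical pricing service through an injected client, so the test never touches real infrastructure:

```python
from unittest.mock import Mock

# Hypothetical function under test (names are illustrative): it reads a
# price from an external pricing service through an injected client.
def get_price(client, sku):
    return client.get(f"/prices/{sku}")["amount"]

# Mock the external service so the test is fast, deterministic, and runs
# identically on a laptop, in CI, or inside a container.
client = Mock()
client.get.return_value = {"amount": 42}

assert get_price(client, "SKU-1") == 42
client.get.assert_called_once_with("/prices/SKU-1")
```

Because the mock's behavior is fixed in the test itself, the result no longer depends on the status of any infrastructure.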
Define Release Conventions
Convention definitions are necessary in any work involving multiple teams, such as in the case of automated release management processes. Conventions make it possible to easily identify the status of the deliverable in its lifecycle, facilitate the automation of each step, and guide the onboarding of new people.
A deliverable can have several conventions:
- Package naming – An application's name separated by dashes
- Package suffix:
SNAPSHOT – Package deployed in the development environment
RC (Release Candidate) – Package deployed in the QA testing environment
RELEASE – Package ready for production
- Semantic versioning convention – Standardize the distribution of any artifacts based on code changes
Pattern: Define an enterprise-wide release convention that all development teams follow to standardize artifact management and facilitate automation
Anti-Pattern: Do not increment the application version to overwrite the previous artifact
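A convention like the one above is easy to enforce automatically. The sketch below validates artifact names against a hypothetical `<name>-<semver>-<STAGE>` convention; the pattern and artifact names are illustrative:

```python
import re

# Illustrative release convention: <app-name>-<major.minor.patch>-<STAGE>,
# where STAGE is SNAPSHOT (dev), RC (QA), or RELEASE (production).
CONVENTION = re.compile(
    r"^(?P<name>[a-z][a-z0-9-]*)-"
    r"(?P<version>\d+\.\d+\.\d+)-"
    r"(?P<stage>SNAPSHOT|RC|RELEASE)$"
)

def parse_artifact(artifact):
    match = CONVENTION.match(artifact)
    if not match:
        raise ValueError(f"artifact does not follow the release convention: {artifact}")
    return match.groupdict()

print(parse_artifact("billing-service-1.4.2-RC"))
# {'name': 'billing-service', 'version': '1.4.2', 'stage': 'RC'}
```

Running such a check in the pipeline rejects non-compliant artifacts before they reach any environment.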
In order to promote artifacts from one environment to another, you must first decouple the application from its configuration. Artifact promotion is also necessary to separate the CI process (build) from the CD process (deploy, run), following the twelve-factor app principles. Promoting an artifact consists of building the deliverable once, using that same build in all environments, and changing only the configuration.
Pattern: Build once and promote the same artifact from environment to environment, while using different configurations each time
Anti-Pattern: Rebuild the deliverable for each environment
A hotfix is generally defined as a patch to a live system due to a bug or vulnerability that meets a certain level of risk and severity. Normally, a hotfix is created as an urgent action against problems that need to be fixed immediately and outside of the normal git workflow. As part of a software development cycle, the development team should have a flexible definition of a hotfix and an internal method for determining what qualifies as one. When a critical bug in a production version must be resolved, a hotfix branch may be branched off from the corresponding tag on the main branch that marks the production version. That way, team members can continue working on the development branch while another person prepares a quick production fix.
Pattern: Deploy a hotfix as soon as possible; test the code in staging before moving it to production
Anti-Pattern: Schedule the deployment of a hotfix; test directly in production
The deployment phase takes the features validated by the QA team in a staging environment and flags them as ready for production — whether for manual or automatic deployment into the production environment.
With CD, software is built in a way that enables it to be released into production at any time. To do this, a CD pipeline involves production-type test environments in which a new software version is automatically deployed for testing. Once the code is validated, the CD process requires human intervention to approve production deployments, and then the deployment itself is done by automation.
Pattern: Build your binaries once, while deploying the binaries to multiple target environments as necessary
Anti-Pattern: Build software in every stage of the deployment pipeline
Blue-green deployment is a technique that eliminates downtime and reduces risk by running two identical production environments that are called blue and green. At any given time, only one of the environments is live, with the live environment serving all production traffic. For example, let's say blue is currently active and green is inactive. When preparing a new version of software, the deployment and the final stage of testing take place in an environment that is not live: in this example, green. Once the software is deployed and fully tested in green, all the incoming requests can be routed to green instead of blue. Green is now live, and blue is inactive.
When using this technique, if something unexpected happens with the new version on green, the traffic can immediately return to the latest version in the blue environment.
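The blue-green switch described above can be sketched as follows; the router abstraction and its hooks are hypothetical:

```python
# Minimal sketch of a blue-green switch, assuming a router abstraction
# that maps 100% of traffic to exactly one environment at a time.
class BlueGreenRouter:
    def __init__(self):
        self.live = "blue"    # blue serves production traffic
        self.idle = "green"   # green receives the next release

    def deploy_and_test(self, deploy, smoke_test):
        deploy(self.idle)             # release goes to the idle environment
        return smoke_test(self.idle)  # final testing happens off the live path

    def switch(self):
        # Instant cutover; the old live environment becomes the
        # rollback target if anything unexpected happens.
        self.live, self.idle = self.idle, self.live

router = BlueGreenRouter()
if router.deploy_and_test(lambda env: None, lambda env: True):
    router.switch()
assert router.live == "green"
```

Calling `switch()` again is the rollback: traffic returns to the previous version with no redeployment.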
Canary deployment is a strategy of deploying a software version incrementally. The idea is to first deploy the change to a small subset of servers, test it, and then deploy the change to the rest of the servers. The target environment's infrastructure is updated in small phases (e.g., 10% > 25% > 75% > 100%) to limit the impact of downtime on users: if the canary deployment fails, the rest of the users or servers are not impacted.
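The phased rollout can be sketched as a loop over traffic percentages; the `shift_traffic` hook and health check below are hypothetical:

```python
# Sketch of a canary rollout: traffic shifts to the new version in small
# phases, and a failed health check stops the rollout before the remaining
# users are affected (phases and hooks are illustrative).
PHASES = [10, 25, 75, 100]  # percentage of traffic on the new version

def canary_rollout(shift_traffic, healthy):
    for percent in PHASES:
        shift_traffic(percent)  # route this share of traffic to the canary
        if not healthy():
            shift_traffic(0)    # roll all traffic back to the stable version
            return False
    return True

calls = []
assert canary_rollout(calls.append, lambda: True)
assert calls == [10, 25, 75, 100]
```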
Within the context of CD, feature flags allow developers to release software faster with less risk and more control. Feature flags play a key part in CD schemes where features are constantly being deployed but are not necessarily available for all the users in production. It's important to treat every change equally, meaning that even if it's an emergency, developers must avoid making changes or performing other work outside the scope of the CD pipeline. Feature flags support this practice by disabling a feature when it is no longer needed, and vice versa.
Pattern: Deploy new features or services to production but limit access dynamically for testing purposes
Anti-Pattern: Wait until a feature is fully complete before committing the source code
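A feature flag check can be sketched in a few lines; the flag store, flag names, and user groups below are illustrative:

```python
# Minimal feature-flag sketch: code for an unfinished feature can be merged
# and deployed, but it only runs for the users you enable (all names here
# are hypothetical).
FLAGS = {
    "new-checkout": {"enabled": True, "allowed_users": {"qa-team", "beta"}},
}

def is_enabled(flag, user):
    config = FLAGS.get(flag, {"enabled": False})
    return config["enabled"] and user in config.get("allowed_users", set())

assert is_enabled("new-checkout", "qa-team")
assert not is_enabled("new-checkout", "anonymous")
```

Because the flag is evaluated at runtime, a feature can be switched off without a new deployment, keeping every change inside the CD pipeline.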
GitOps is a methodology that aims to optimize the time/effort between the development and operations team members. The main components of the GitOps methodology are:
- A versioning source control tool that acts as a single source of truth for declarative infrastructure and app configurations
- An automated process to match the production environment to the state described in the repository
GitOps improves continuous delivery by empowering the process with verifiable and auditable changes, automated deployments, and rollbacks in case of failure.
Pattern: Write all deployment processes in scripts, check them into the version control system, and run the scripts as part of the single delivery system
Anti-Pattern: Use deployment documentation instead of automation; perform manual deployments or partially manual deployments
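The reconciliation loop at the heart of GitOps can be sketched as follows; the application names, versions, and `apply_change` hook are illustrative:

```python
# Sketch of GitOps reconciliation: the repository declares the desired
# state, and an automated process converges the running environment
# toward it (data and hook are hypothetical).
def reconcile(desired, actual, apply_change):
    changed = []
    for app, version in desired.items():
        if actual.get(app) != version:
            apply_change(app, version)  # e.g., trigger a deployment
            actual[app] = version
            changed.append(app)
    return changed

desired = {"billing-service": "1.4.2", "web": "2.0.0"}  # from git
actual = {"billing-service": "1.4.1", "web": "2.0.0"}   # observed state
assert reconcile(desired, actual, lambda a, v: None) == ["billing-service"]
assert actual == desired
```

Rollback becomes a git revert: restoring a previous commit changes the desired state, and the same loop converges the environment back.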
A disposable environment is an on-demand environment based on the automation of cloud infrastructure provisioning, configuration, deployment, and deletion processes. It relies on CD automation to deploy software the same way as done in another environment. This practice can be complex because it requires a good understanding of the application architecture to deploy all its dependencies.
Disposable environments have different roles:
- Create a reliable and repeatable end-to-end test configuration in a similar but separate on-demand environment, such as a production architecture-based disposable environment, to test a new functionality
- Create a dedicated environment for support engineers to replicate customer bugs with exact versions of languages and dependencies
- Start a new stable environment dedicated to product demonstrations for customers
Pattern: Utilize the automated provisioning, scripted deployment, and scripted database patterns; any environment should be capable of terminating and launching at will
Anti-Pattern: Fix environments to "DEV," "QA," or other predetermined environments
Experimentation and feature management must work hand in hand. Experimentation is an important way for a company to validate ideas before launching new products. With development teams using CI and CD processes, flags that control the deployment of new experiences can mitigate the risk of launching something unproven to everyone at the same time. By first running an A/B test with some of the traffic, developers can test and gradually optimize a new feature. Once the best user experience has been achieved, it can be deployed in a controlled manner across the entire customer base to reduce the risk of technical issues related to the publishing process.
Pattern: Automate tests to verify the behavior of the application; continually run these tests to provide near real-time benchmarking
Anti-Pattern: Never benchmark the performance of new features
Controlling access to automation tools used in a CD pipeline is necessary as these processes provide a measure of stability and resilience for an application, and they often have access to confidential data (e.g., company source code, passwords). Securing their access is essential to avoid any human error due to a lack of attention or malicious behavior that could negatively impact the company.
Pattern: Secure all actions on the deployment job's orchestration platform by defining a role-based access policy to scripts and pipelines
Anti-Pattern: Allow anyone to launch any job at any time without controls or checkpoints
The four-eyes principle means that any activity by an individual within the organization must be approved by at least two people. This control mechanism is used to facilitate delegation of authority and increase transparency. This approach not only ensures process efficiency by allowing rapid decision-making while ensuring effective monitoring and control, but also leads to cultural change to minimize risk.
Pattern: Define at least two reviewers for each project and apply a rule to obtain at least one approval to deploy new code into production
Anti-Pattern: Bypass peer validation and deploy changes
The process of rolling back means returning the system to its last working state. This ensures the system can be restored immediately after a failure, minimizing disruption to the business.
Designing a CD pipeline requires determining predefined steps to roll back the application manually or automatically after a deployment failure. The rollback process occurs only for components of an application that were not skipped during the last deployment. Combining an AIOps platform with the CI/CD pipeline can form a truly automated deployment pipeline that not only facilitates deployment but can also verify and recover from any anomalies or failures that are observed in a production environment. Reducing downtime, therefore, minimizes the impact on customers.
Pattern: Provide an automated single-command rollback of changes after an unsuccessful deployment
Anti-Pattern: Manually undo changes applied in a recent deployment; shut down production instances while changes are undone
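A single-command rollback can be sketched by recording every successful deployment; the version numbers and `redeploy` hook below are illustrative:

```python
# Sketch of a single-command rollback, assuming every successful deployment
# is recorded: rolling back means redeploying the previous known-good
# version (versions and hook are hypothetical).
history = ["1.4.1", "1.4.2"]  # stack of deployed versions, newest last

def rollback(redeploy):
    if len(history) < 2:
        raise RuntimeError("no previous version to roll back to")
    history.pop()           # discard the failed version
    redeploy(history[-1])   # one command restores the last good state
    return history[-1]

assert rollback(lambda version: None) == "1.4.1"
```

Keeping this history in the pipeline itself is what makes the rollback a single automated action rather than a manual undo.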
Dissociate Configuration from Code
Adding flexibility to the application's management strategy is essential in a world where dynamic platforms (e.g., for container orchestration) are becoming more prominent and widely used. Source code and configuration files are two distinct components and should have their own management strategy. Source code should work the same way in all environments because it is immutable. Configuration files are an external part of the application that matter only during execution and can be overridden before starting the application.
Dissociating configuration files and source code makes it easier to maintain and secure them for many reasons, including:
- Operations can update the contents of a configuration file without changing the application's source code.
- The same artifact can be promoted without having to rebuild it.
- The application is easily portable into a new environment.
- Security teams can better audit the access to sensitive data.
Pattern: Capture changes between environments as configuration information; externalize all variable values from the application configuration into build/deployment-time properties
Anti-Pattern: Hardcode values inside the source code or per target environment
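A minimal sketch of the pattern: the same artifact reads its settings from the environment, so only deployment-time variables change per stage (the variable names are illustrative):

```python
import os

# Sketch of externalized configuration (variable names are hypothetical):
# the code is identical in every environment; only the environment
# variables set at deployment time differ.
def load_config():
    return {
        "db_url": os.environ.get("DB_URL", "sqlite:///dev.db"),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
    }

os.environ.pop("DB_URL", None)
assert load_config()["db_url"] == "sqlite:///dev.db"   # no override: dev default
os.environ["DB_URL"] = "postgres://prod-db/app"        # set by the CD pipeline
assert load_config()["db_url"] == "postgres://prod-db/app"
```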
Release Your Data
Databases are the cornerstones of all modern software projects; no project of any scale beyond a prototype can function without some form of a database. Continuous database integration is the integration of the database schema and logical changes in application development efforts. Applying the same principles of integration and deployment patterns to the database allows all database changes to flow through the pipeline of each software version, synchronized with the application code.
The main goal is to keep a release's code aligned with the schema of a database, which is essential when launching a new feature — and even more crucial during a rollback, where backward compatibility must be ensured.
Pattern: Ensure your application is backward and forward compatible with your database so you can deploy and roll back each independently
Anti-Pattern: Keep a strong dependence between the database schema and the application source code, not being able to deploy one without the other
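One common way to keep the application and schema independently deployable is to make only additive, backward-compatible changes first. A sketch with SQLite, where the table and column names are illustrative:

```python
import sqlite3

# Sketch of a backward-compatible schema change: adding a column with a
# default lets the previous application version keep running against the
# new schema, so app and database can be deployed or rolled back
# independently (table and column names are hypothetical).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
db.execute("INSERT INTO orders (amount) VALUES (9.99)")

# Expand phase: old code still runs "SELECT amount FROM orders" unchanged;
# new code can start reading and writing the new column.
db.execute("ALTER TABLE orders ADD COLUMN currency TEXT DEFAULT 'USD'")

assert db.execute("SELECT amount, currency FROM orders").fetchone() == (9.99, "USD")
```

Destructive changes (dropping or renaming the old column) are deferred until no deployed version depends on it anymore.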
Observability is now an essential component of any architecture to effectively manage a system, determine whether it is working properly, and decide what needs to be fixed, modified, or improved on any level, CD processes included.
The impact of collecting deployment pipeline metrics is oftentimes minimized or forgotten. DevOps indicators are data points that directly reveal the performance of a development pipeline and help to quickly identify and eliminate any bottlenecks in the CD process. These metrics can be used to track the progress of a DevOps transition or measure the adoption of automated processes and associated tools by development teams.
Four metrics that are important to measure:
- Lead time for changes – The time needed to push new changes to production.
- Deployment frequency – How often builds are deployed to an environment.
- Change failure rate – How many changes result in defects.
- Mean time to recovery – The time required to recover from failure.
All these metrics are essential for measuring the speed at which teams can correct existing bugs, develop new functionality, and deploy to production. Organizing and managing projects around these measurements is a direct way to improve a company's competitive advantage.
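As a sketch, the four metrics above can be derived directly from deployment records; the data below is illustrative:

```python
from datetime import datetime, timedelta

# Illustrative deployment records (timestamps are hypothetical).
deployments = [
    {"committed": datetime(2024, 1, 1, 9), "deployed": datetime(2024, 1, 1, 17), "failed": False},
    {"committed": datetime(2024, 1, 2, 9), "deployed": datetime(2024, 1, 2, 13), "failed": True,
     "recovered": datetime(2024, 1, 2, 14)},
]

# Lead time for changes: commit-to-production duration, averaged.
lead_time = sum((d["deployed"] - d["committed"] for d in deployments), timedelta()) / len(deployments)
# Deployment frequency: deployments per day over the observed 2-day window.
frequency = len(deployments) / 2
# Change failure rate: share of deployments that caused a defect.
failures = [d for d in deployments if d["failed"]]
change_failure_rate = len(failures) / len(deployments)
# Mean time to recovery: failure-to-recovery duration, averaged.
mttr = sum((d["recovered"] - d["deployed"] for d in failures), timedelta()) / len(failures)

assert lead_time == timedelta(hours=6)
assert change_failure_rate == 0.5
assert mttr == timedelta(hours=1)
```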
Pattern: Deploy software one instance at a time while conducting behavior-driven monitoring; if an error is detected during the incremental deployment, a rollback release is initiated to revert changes
Anti-Pattern: Conduct non-incremental deployments without monitoring
Every year, companies across industries adopt the DevOps philosophy, but without the necessary data to support their decisions, every deployment can be a risk. One way to mitigate such risk and allow for quick and safe changes — to both overall deployment processes and the software itself — is to enrich the continuous delivery processes with log data coupled with strategic log management.
Logs can offer insight into a specific event in time. Log data can also be used to forensically identify any potential issues proactively before they cause real problems in a system. But logs — especially in modern, hyperconnected, ecosystem-based services — require appropriate optimization to be effective:
- Structure logs – Logging in an unstructured format dramatically increases the complexity of detecting patterns in your logs. Using a JSON format is highly recommended.
- Classify logs – Each log must have a severity level assigned to quickly be able to identify a potential source of error and link the event to other metrics.
- Create actionable alerts – A correctly formatted log line brings more context to the information than a simple metric, making it easier to interpret an error and the actions needed to resolve it.
- Centralize logs – Forwarding system and application logs onto a single platform has several advantages, including accessibility, readability, and ease of correlation with other events.
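A structured JSON log line can be produced with Python's standard logging module; the logger and field names below are illustrative:

```python
import json
import logging

# Sketch of structured JSON logging: every record carries a severity level
# and machine-readable fields, which makes centralized parsing and alerting
# straightforward (field names are hypothetical).
class JsonFormatter(logging.Formatter):
    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            "pipeline": getattr(record, "pipeline", None),
        })

record = logging.LogRecord("cd", logging.ERROR, __file__, 0,
                           "deployment failed", None, None)
record.pipeline = "billing-service"
line = JsonFormatter().format(record)
assert json.loads(line)["level"] == "ERROR"
```

Because each line is valid JSON with a severity field, a centralized platform can filter, correlate, and alert on it without fragile text parsing.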
Pattern: Define an intelligent alert process based on the CD pipeline logs to notify the team capable of resolving the problem
Anti-Pattern: Leave logs on the servers and wait for someone to notify others of an error present in the pipeline
DevOps principles like CD help teams develop and deploy applications at a higher velocity. While software delivery speed is an advantage, it is equally critical to ensure the software delivery pipeline is compliant with governance policies and industry standards. Audits are intended to ensure the effectiveness of controls put in place (e.g., dependency verification, code quality control). A way to help mitigate the risk of noncompliance is to aggregate data generated and collected throughout CI and CD workflows into easy-to-read reports that address customers' security and auditing requirements.
Pattern: Produce weekly reports on applications created and deployed; schedule a monthly review with development teams to proactively determine impact on security policies
Anti-Pattern: Wait for the annual infrastructure audit to determine if an application is no longer compliant
We often think about observability in technical terms, but external reporting with data that can be understood by all teams and stakeholders, from technical to executive, is a huge added value. For instance, knowing how often a deployment fails for a service or how many deployments are made during a development cycle or a sprint are critical pieces of information that can be used to improve the SDLC.
Pattern: Define standard data points and export them for every team/project for delivery to stakeholders
Anti-Pattern: Create custom reports with different datasets for each team
Communication and Documentation
Communication between teams is the key to the success of any project. The transition to the DevOps methodology has proven that working in silos is not an effective method and that team collaboration through clear and transparent communication is necessary.
Generate Release Notes
An easy-to-automate means of communication is the creation of release notes to share information about a deployment with all teams. These release notes can take different formats, such as an email or a chat message. The objective is to disseminate information to the entire engineering team — both for widespread visibility and transparency and so that everyone can interpret it and take (or not take) any necessary actions.
The most effective and informative release notes contain details like:
- The name of the application and its new version
- The ticket number(s) attached to this version to identify the work performed
- The release date to correlate with potential anomalies detected following the deployment
- The team leading the project and potentially the developer who carried out production
- A summarized runbook for rolling back in case of a major incident
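Release notes with the details above are easy to generate automatically; the sketch below assumes ticket references like JIRA-123 appear in commit messages, and all names are illustrative:

```python
import re

# Sketch of automated release-note generation from commit messages and a
# tag (application, ticket, and team names are hypothetical).
def release_notes(app, version, commits, owner):
    tickets = sorted({t for c in commits for t in re.findall(r"[A-Z]+-\d+", c)})
    lines = [f"{app} {version}", f"Owner: {owner}", "Tickets:"]
    lines += [f"  - {t}" for t in tickets]
    return "\n".join(lines)

notes = release_notes("billing-service", "1.4.2-RELEASE",
                      ["JIRA-101 fix rounding", "JIRA-99 add VAT"],
                      "payments-team")
assert "JIRA-101" in notes and "JIRA-99" in notes
```

Posting this output to email or chat as the last step of the pipeline gives every team the same view of what was deployed and why.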
Pattern: Automatically notify all engineering team members to coordinate projects and further actions
Anti-Pattern: Wait until an error occurs during a deployment for dev and ops teams to collaborate and identify issues with the latest release currently in production
Perform Root-Cause Analyses
Another important communication format, which aligns with the Site Reliability Engineering (SRE) methodology, is identifying, then reporting on, the root causes of a problem that occurred after a deployment. A root-cause analysis prompts a postmortem, where all teams can join to reflect on and discuss the issue — as well as identify the actions that should be carried out to prevent future occurrences of the problem. Postmortems should then be written up, defining the necessary actions and their respective owners. These tasks must be completed within a specified timeframe after issue identification to ensure that the defect has been fixed before the next deployment.
Common actions in a postmortem:
- Review the outcomes and results
- Identify what went well and what did not
- Give everyone the chance to speak
- Identify post actions and assignees
Pattern: Question everything: ask "why" of every symptom until discovering the root cause
Anti-Pattern: Accept the symptom as the root cause of the problem
Some means of communication are more appropriate than others depending on the information you want to share — documentation included, which is an important component in the DevOps approach. Disseminating information to a large number of people is crucial to foster the understanding of automation principles and adoption of such practices. We have at our disposal today many tools that allow us to share information according to the target audience:
- Engineers – Often prefer reading deployment scripts, which is not always the simplest approach but is probably the most up-to-date documentation.
- Project managers – Will prefer centralized documentation, such as a wiki, with a certain level of information to interpret and understand different aspects of a pipeline without having to understand the specific technique.
- Managers – Tend to prefer high-level documentation describing the overall process, the catalog of applications, application statuses, and the teams or people attached to the project — all information that will allow them to quickly identify and route any information from other teams regarding their projects.
The main purpose of workflow documentation is to align teams in order to promote mutual assistance and process optimization, which can be achieved with automation (e.g., release notes can be automatically generated based on commit messages and the created tag).
Pattern: Define a standard for documentation for every audience
Anti-Pattern: Do not document deployment and configuration workflows; allow only a few employees to access the catalog of applications and other assets