Immutable CI/CD for Resilient Delivery

Learn more about immutable CI/CD for constructing a robust pipeline and some challenges you might face.

This article is featured in the new DZone Guide to DevOps: Implementing Cultural Change. Get your free copy for insightful articles, industry stats, and more!

The main goal of an immutable CI/CD system is to maximize the availability of delivery pipelines (where unavailability might stem from planned downtime, accidental outages, or blocking issues in the mechanics of the pipeline).

In this article, I will explain what immutable CI/CD is, as well as the associated practices and benefits. But let's start with what resilient delivery means.

What Is a Resilient Delivery (CI/CD) System?

A delivery (CI/CD) system encompasses all of the tools we use in the delivery process — not only CI and CD but also artifact management, deployment tools, test tools, etc. But the delivery system also includes the underlying CI/CD infrastructure, the source repositories for the applications (as well as the pipeline definitions), and the pipeline orchestration code itself. In short, it's everything that our pipelines need, directly or indirectly, to be able to execute.

A simple example of a delivery system, including tools, repos, infrastructure, and pipelines.

Once we see the delivery system as yet another system in our portfolio, we can start thinking not only about the features we need from it (for example, support for secure code analysis or rolling deployments) but also about our operational requirements (for example, availability expectations or disaster recovery time).

We can define, in general terms, a resilient CI/CD system as a highly available system with a short time to recover from failures and disasters. The exact metrics depend on each organization's context.

Fast and safe software delivery requires high availability and scalability from the CI/CD system. It is no longer acceptable for builds and deployments to be halted for hours due to CI/CD infrastructure upgrades, tooling updates, or a new plugin installation.

Queuing up multiple small changes because the CI/CD system is not available creates a delivery bottleneck. Once the system is back up, the risk of failed builds, tests, or deployments increases significantly and negatively affects the team's pace. This, in turn, can derail the team's sprint (or iteration) objectives and the business expectations placed on the team.

We Have Immutable Infrastructure and Applications; Why Not Immutable Tools?

We have been talking about immutable infrastructure as a concept with well-understood benefits and challenges since at least 2013. I find this definition by Josh Stella particularly comprehensive. It has also withstood the test of time:

"Immutable infrastructure provides stability, efficiency, and fidelity to your applications through automation... the basic idea is that you create and operate your infrastructure using the programming concept of immutability: once you instantiate something, you never change it. Instead, you replace it with another instance to make changes or ensure proper behavior."

Docker showed up around that time as well (a quick fact-check shows Docker was first released in March 2013, and version 1.0 came out 15 months later), and its rapidly growing adoption in the following years gave us the tools to quickly build and manage machine image artifacts. Such images are a key enabler for immutability.

Creating a new image whenever an application or service requires a change to the machine (container) where it runs became standard practice, and deploying immutable containers (and pods, with Kubernetes) became the way containerized applications get updated. Patching or installing new packages inside a running container is (correctly) seen as a risk, given containers' ephemeral nature.

So, if immutable infrastructure for applications is now mainstream, why aren't we doing the same for the tools we use? Well, there are a few challenges.

But first, let's acknowledge that modern development and operations tools that run on-premises have generally been containerized, and vendors provide official Docker images for them. That frees us from having to install such tools manually or automate the installation ourselves.

The challenges with immutability for tools manifest themselves when we get to the post-install administration, configuration, and upgrading activities.

Challenge #1: Lack of Control

Container images embed the necessary technical stack to run a tool, but we still don't control the tool's internal architecture like we do with our own applications. That means we have to discover and adapt to what we're given when automating the instantiation of the tool.

(Pro tip: When evaluating new tools, consider the ease of adoption within an immutability approach in terms of tool design and logging accessibility.)

For example, it's common to configure things like admin credentials and tool preferences via the UI or interactive scripts. Are those settings stored in a config file? If so, where and in which format? We can run a sandboxed manual configuration to find out. Then, we can create a config file version already customized for our needs. We can then copy this version into the tool container through our Dockerfile.
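As a minimal sketch of that last step (the image name, tag, and config path below are hypothetical; substitute whatever your tool actually uses):

```dockerfile
# Start from the vendor's official image (hypothetical name and tag).
FROM vendor/sometool:1.2.3

# Bake in the configuration file we captured from a sandboxed manual setup,
# already customized with our preferences. The tool starts fully configured,
# so no post-install clicking around is needed.
COPY config.yml /opt/sometool/config.yml
```

Because the configuration now lives in source control next to the Dockerfile, every new instance of the tool starts from the exact same known-good state.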

If the tool was designed to store configuration in an internal database, then this approach becomes more cumbersome but still worth investigating. (Is there an API that we could call to update the database programmatically and avoid inconsistencies? If not, it's probably the right time to switch to a modern tool.)

Challenge #2: More Moving Parts

Another challenge relates to the multiple aspects that need to be automated to instantiate a tool in an immutable fashion.

Besides the aforementioned tool configuration, some other aspects that might need to be addressed include:

• Filesystems and databases where execution data is stored (e.g. pipeline execution logs and artifacts)

• Installation of required plugins (a common need for CI/CD tools)

• Configuration of source and pipeline code repositories

• Installation of any OS utilities we might need to automate the above

Again, we need to understand, on a tool-by-tool basis, what kind of mechanisms each tool provides to help us perform these tasks. For example, Jenkins CI provides a shell script that pre-installs a list of custom plugins. In our Dockerfile, we can make use of it:

COPY plugins.txt /usr/share/jenkins/ref/plugins.txt
RUN xargs /usr/local/bin/install-plugins.sh < /usr/share/jenkins/ref/plugins.txt

Here, plugins.txt identifies the set of plugins that the script will install.
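For illustration, plugins.txt lists one plugin per line in plugin-id:version form; the plugin IDs below are common Jenkins plugins, and the versions are placeholders you would pin to match your Jenkins release:

```text
git:4.11.0
workflow-aggregator:2.7
credentials-binding:1.27
```

Pinning explicit versions (rather than "latest") is what keeps the resulting image reproducible.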


Challenge #3: Production Data

For a CI/CD tool, production data consists of (at a minimum) logs, artifacts, workspaces, and stage/pipeline results.

How do we keep the production data that our tools store intact when switching to a new version of the CI/CD system? First, we need to identify where and how the data gets stored. Then, we need to find an adequate solution that ensures the data persists across the different environments where each version of the CI/CD system runs.

For example, CI/CD tools might store their file-based data in the server's filesystem. Using Docker volumes, we can mount our persistent filesystem(s) into the container running that server in the new CI/CD version. And, of course, we can and should establish adequate backup and restore policies as well.
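As a sketch of the volume approach using Jenkins (the service and volume names here are illustrative; jenkins/jenkins:lts is the official image and /var/jenkins_home is where Jenkins keeps its file-based data), a Docker Compose definition can declare a named volume so that production data outlives any one container:

```yaml
version: "3.8"
services:
  ci-server:
    image: jenkins/jenkins:lts     # official Jenkins image
    ports:
      - "8080:8080"
    volumes:
      - ci_home:/var/jenkins_home  # logs, workspaces, and job results live here
volumes:
  ci_home: {}                      # named volume; survives container replacement
```

When a new immutable version of the CI server image is rolled out, the replacement container mounts the same named volume and picks up the existing production data.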

What Is Immutable CI/CD?

We can define immutable CI/CD as a delivery system whose toolchain (CI tools, CD tools, plugins, build/test/deploy tools, etc.) does not change once the system is instantiated. Fixes, updates, new tools, new releases, and new plugins can only be introduced by instantiating a new version of the CI/CD system altogether.

Immutable CI/CD provides stability to the delivery system, reducing downtime and unexpected failures due to changes on the fly. When coupled with scalable infrastructure and adequate logging and monitoring, the end result is a resilient, highly available, and scalable delivery system that can meet modern software delivery needs.

There are, of course, some challenges to this approach today.

Challenge #1: Upfront Investment

Earlier in this article, we saw that we need to invest some effort into understanding the architecture of a tool in order to be able to codify and automate all the setup and configuration for our needs. To achieve immutable CI/CD, we multiply that effort by all the tools we are using in the system.

However, in my experience with open-source toolchains, it's really the CI and CD tools that demand most of the work. Tools that do specific jobs tend to have less configuration and apply sensible defaults. That makes them easier to adopt out-of-the-box (where "box" now means an official Docker image).

The upfront investment pays dividends every time we are able to update the delivery system without disrupting the work of the software delivery teams. It's a gift that keeps on giving.

There is also good news: vendors and creators are increasingly aware of the need to up their game in terms of configurability and availability of their tools in a DevOps, "everything as code" world. There's a growing awareness that automation and immutability are not just for the applications we build but also for the tools we need to build and deliver those applications.

Challenge #2: Zero Downtime Deployments

Many CI/CD tools today are still architected around a client/server approach that was not designed with scalability in mind. That makes it difficult to gradually move incoming traffic (build and pipeline requests) to a new instance of the delivery system.

The best approach today is to use blue-green deployments, with a hard switch from the current delivery system version to the new one. If we want to do this during working hours, we need to freeze new build and pipeline requests and either abort or wait for current pipelines to finish execution.

For (near) zero-downtime deployments, we would need to wait for the current system to be idle and only then switch to the new version. This might be impossible, however, particularly for development teams distributed across many time zones.
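A minimal sketch of that drain-then-switch logic, assuming a hypothetical CI API call that reports the number of running pipelines and a callable that performs the blue-green switch (neither is a real tool's API; new build requests must already be frozen upstream before this runs):

```python
import time

def drain_and_switch(active_pipelines, switch_traffic,
                     poll_interval=5.0, timeout=3600):
    """Wait until no pipelines are running, then flip to the new system.

    active_pipelines: callable returning the current number of running
    pipelines (stand-in for a CI server API call).
    switch_traffic: callable that performs the blue-green switch.
    Raises TimeoutError if the system never drains within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while active_pipelines() > 0:
        if time.monotonic() > deadline:
            raise TimeoutError("pipelines still running; switch aborted")
        time.sleep(poll_interval)
    switch_traffic()
```

The timeout matters in practice: on a busy, globally distributed team the system may never go idle on its own, which is exactly why a hard switch during a planned freeze is often the pragmatic choice.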

Useful Side Effects of Immutable CI/CD

In my experience, I've found that pursuing the goal of immutable CI/CD brings about beneficial side effects, such as the ability to quickly spin up new CI/CD environments from zero. This is not only useful for disaster recovery but also allows organizations to distribute their delivery system (per team, department, or business unit, for example) instead of centralizing all the pipelines on a single massive server. And because everything is defined as code, propagating changes across multiple instances of the delivery system becomes rather straightforward.

Finally, it promotes testing changes to the CI/CD system in isolation (with the help of example tech stacks and ephemeral environments for deployments to create minimal yet realistic test cases for pipeline functionality).

Getting Started

If you've been feeling the pain of an overloaded and fragile delivery system, get started now. Account for an upfront investment but expect massive gains, especially with a large number of development teams using the same CI/CD system.

If you haven't felt the growing pains yet, at least ensure you communicate clearly when and how the delivery system is being updated. Meanwhile, start creating a "staging" delivery system to smoke test changes in isolation and reduce downtime for the "production" delivery system. Also get started with pipeline-as-code, if you haven't yet.
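If Jenkins is your CI tool, for example, getting started with pipeline-as-code means committing a Jenkinsfile next to your application source; a minimal declarative sketch (the shell steps are placeholders for your own build and test scripts) looks like:

```groovy
pipeline {
    agent any
    stages {
        stage('Build') {
            steps {
                sh './build.sh'   // placeholder: your build command
            }
        }
        stage('Test') {
            steps {
                sh './test.sh'    // placeholder: your test command
            }
        }
    }
}
```

With pipeline definitions versioned alongside the code, they become one more artifact that a freshly instantiated CI/CD system can pick up automatically.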

Opinions expressed by DZone contributors are their own.
