A Fatal Impedance Mismatch for Continuous Delivery
Most of the time, when organizations pursue a continuous delivery capability, they’re doing so in pursuit of increased agility. They want to be able to release software at will, with as little delay as possible between the decision to implement a feature and the feature’s availability to end users.
I’m a big fan of agility, and I agree that agility and continuous delivery go hand in hand. Unfortunately, some ill-conceived approaches to implementing agility can prove fatal to a continuous delivery program. In this post we’re going to take a look at one that occurs in larger organizations, and see one reason why it can be challenging to implement continuous delivery in such environments.
Software development in the enterprise
One fairly common configuration in large enterprises is a shared production environment with multiple development groups creating software to be released into it. Sometimes the development groups have the ability to push their own changes into production. But often there’s some central release team, whether on the software side of the house or on the infrastructure/operations side (I’ll call them IT in this post), that controls the changes going into the production environment. Here’s what this red-flag (but common) configuration looks like:
Let’s see what tends to happen in such enterprises.
The quest for agility leads to development siloing
When there are multiple development groups, they usually want to be able to do things their own way. They set up their own source repos, configuration repos, continuous integration infrastructure, artifact repos, test infrastructure (tools, environments) and deployment infrastructure. They have their own approaches to using source control (including branching strategies), architectural standards, software versioning schemes and so on. They see themselves as being third-party software development shops, or at least analogous to them. Releasing and operationalizing the software is largely somebody else’s concern. They certainly don’t want some central team telling them how to do their jobs.
There’s a reason the development groups want things this way: agility. The central release team is either seen to be a barrier to agility, or in many cases, actually is a barrier to agility. There are tons of reasons for both the perception and the reality here. If the central team lives in the IT organization instead of living in a software organization, the chance for misalignment is very high. Common challenges include:
- IT doesn’t understand best practices around software development (e.g., continuous integration and unit testing).
- IT takes on a broad ITIL/ITSM scope when the development groups would prefer that they focus on infrastructure like IaaS providers do.
- IT chooses big enterprise toolsets that aren’t designed around continuous delivery, integration with development-centric tools and so forth.
- IT prioritizes concerns differently than the development groups do. In many cases IT is trying to throttle change whereas development is trying to increase change velocity.
But even if the central team manages to escape the challenges above, fundamentally shared services balance competing concerns across multiple customers, and they’re therefore usually suboptimal for any one customer. All it takes is for one or two developers to say, “I can do better” (a pretty common refrain from developers), and suddenly we end up with a bunch of development teams doing things their own way.
This is really bad for continuous delivery. Let’s see why.
Development siloing creates a fatal impedance mismatch
Let’s start with a little background.
Because continuous delivery aims to support increased deployment rates into production, it becomes especially important to test the deployment mechanism itself (including rollbacks). The challenge, though, is that any given production deployment is a one-time, high-stakes activity. So we need a way to know that the deployment is going to work.
Continuous delivery solves this through something called the deployment pipeline. This is a metaphorical pipeline carrying work from the developer’s machine all the way through to the production environment. The key insight from a deployment testing perspective is that earlier stages of the pipeline (development, continuous integration) involve high-volume, low-risk deployment activity, whereas later stages (systems integration testing, production) involve low-volume, high-risk deployment activity. If we make earlier stages as production-like as possible and we use the same deployment automation throughout the pipeline, then we have a pretty good way to ensure that production deployments will work. Here’s an example of such a pipeline; your environments may be different depending on the needs of your organization:
The area of any given environment in the pyramid represents the volume of deployment activity occurring in that environment. For development, the volume is large indeed since it happens across entire development teams. But notice that everything other than the production deployment itself helps test the production deployment.
As you can see, even the developer’s local development environment (e.g., the developer’s workstation or laptop) should be a part of the pipeline if feasible, since that’s where the greatest deployment volume occurs. One way to do this, for instance, is to run local production-like VMs (say via Vagrant and VirtualBox), and then use configuration management tools like Chef or Puppet along with common app deployment tools or scripts throughout the pipeline.
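To make "the same deployment automation throughout the pipeline" a bit more concrete, here is a minimal Python sketch. Everything in it is hypothetical: the `Environment` fields, the host names, the repository URL and the steps that `deploy` emits all stand in for whatever tooling (Chef, Puppet, scripts) an organization really uses. The only point is that a single routine runs in every environment, and the environments differ in data rather than in code path.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Per-environment data. All names and fields here are illustrative."""
    name: str
    hosts: tuple
    artifact_repo: str


def deploy(env, app, version):
    """Return the ordered steps for deploying `app` at `version` into `env`.

    The same function serves every pipeline stage, so each local or CI
    deployment also exercises the code path used for production.
    """
    artifact = f"{env.artifact_repo}/{app}/{version}/{app}-{version}.tar.gz"
    steps = [f"fetch {artifact}"]
    for host in env.hosts:
        steps.append(f"install {app}-{version} on {host}")
        steps.append(f"smoke-test {app} on {host}")
    return steps


# Hypothetical environments: only the data differs between them.
REPO = "https://artifacts.example.com"
LOCAL = Environment("local", ("localhost",), REPO)
PRODUCTION = Environment("production", ("prod-1", "prod-2"), REPO)

if __name__ == "__main__":
    for env in (LOCAL, PRODUCTION):
        print(env.name)
        for step in deploy(env, "storefront", "1.4.2"):
            print("  " + step)
```

A real pipeline would execute these steps rather than list them, but the shape is the same: a developer deploying to a local VM is running the very routine that will later run against production.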
With that background in place, we’re now in position to understand why development siloing is bad for continuous delivery. When development teams see themselves as wholly separate from the operations side of the house, two major problems arise.
Problem 1: Production deployments aren’t sufficiently tested
This happens because the siloed development and operations teams use different deployment systems. In one example with which I’m personally familiar, the development team wrote its own deployment automation system, and stored its app configuration and versionless binaries in Amazon S3. The ops team, on the other hand, used hybrid Chef/custom deployment automation, sourcing app configuration from Subversion and versioned binaries from Artifactory.
Generically, here’s what we have:
The earlier pipeline stages use a completely different configuration management scheme than the later stages. Because of the siloing, only a small region of the pyramid tests the production deployment. So when it’s time to deploy to SIT, there’s a good chance that things won’t work. And that’s true of production too.
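A rough back-of-the-envelope calculation shows how much deployment testing the siloing throws away. The weekly deployment counts below are invented purely for illustration, and the stage names are assumptions modeled on the environments mentioned in this post. The sketch computes what share of pre-production deployments actually rehearse the production deployment mechanism.

```python
# Hypothetical deployments per week in each pipeline stage (made-up numbers).
WEEKLY_DEPLOYMENTS = {
    "development": 400,
    "continuous integration": 150,
    "systems integration testing": 10,
    "production": 1,
}


def rehearsal_share(volumes, stages_using_production_tooling):
    """Fraction of pre-production deployments that use production's tooling."""
    pre_production = {s: n for s, n in volumes.items() if s != "production"}
    rehearsals = sum(
        n for s, n in pre_production.items()
        if s in stages_using_production_tooling
    )
    return rehearsals / sum(pre_production.values())


# Standardized pipeline: every stage uses the production deployment system.
standardized = rehearsal_share(
    WEEKLY_DEPLOYMENTS,
    {"development", "continuous integration", "systems integration testing"},
)

# Siloed pipeline: only SIT shares tooling with production.
siloed = rehearsal_share(WEEKLY_DEPLOYMENTS, {"systems integration testing"})

print(f"standardized: {standardized:.1%}")
print(f"siloed: {siloed:.1%}")
```

With these invented numbers, the siloed pipeline rehearses the production mechanism in fewer than 2% of its pre-production deployments. The exact figure doesn’t matter; what matters is that the high-volume stages contribute nothing.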
The next problem is closely related, and even more serious.
Problem 2: Impedance mismatch between development and operations
Having two disjointed pipeline segments means that there’s an impedance mismatch that absolutely prevents continuous delivery from occurring in anything beyond the most trivial of deployments:
From personal experience I can tell you that this impedance mismatch is a continuous delivery killer. Keep in mind that the whole goal of continuous delivery is to minimize cycle time, which is the time between the decision to implement a change (feature, enhancement, bugfix, whatever) and its delivery to users. So if you have a gap in the pipeline, where people have to rebuild packages because development’s packaging scheme doesn’t match operations’ packaging scheme, painstakingly copy configuration files from one repository to another, and so forth, cycle time goes out the window. Add to that the fact that we’re not even exercising the production deployment system on the change in question until late in the development cycle, and cycle time takes another hit as the teams work through the inevitable problems and miscommunications.
Avoiding the impedance mismatch
In Continuous Delivery, Humble and Farley make the point that while they’re generally supportive of a “let-many-flowers-bloom” approach to software development, standardized configuration management and deployment automation are an exception to the rule. Of course, there will be differences between development and production for cost or efficiency reasons (e.g., we might provision VMs from scratch with every production release, but this would be too time-consuming in development), but the standard should be production, and deviations from that in earlier environments should be principled rather than gratuitous.
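One way to keep deviations principled is to require that each one be written down with a reason, and to flag any difference from production that isn’t. The following Python sketch illustrates the idea. The setting names and values are hypothetical (loosely based on the tools mentioned earlier in this post), and `gratuitous_deviations` is an invented helper, not part of any real tool.

```python
def gratuitous_deviations(production, environment, documented):
    """Return settings that differ from production without a recorded reason.

    production, environment: dicts mapping setting name to value.
    documented: dict mapping setting name to the reason it may differ.
    """
    differing = {
        key
        for key in production.keys() | environment.keys()
        if production.get(key) != environment.get(key)
    }
    return sorted(differing - set(documented))


# Hypothetical settings, for illustration only.
production = {
    "provisioning": "fresh VM per release",
    "config_source": "subversion",
    "artifact_repo": "artifactory",
}
development = {
    "provisioning": "long-lived VM",
    "config_source": "s3",
    "artifact_repo": "artifactory",
}
documented = {
    "provisioning": "rebuilding VMs on every deploy is too slow for development",
}

print(gratuitous_deviations(production, development, documented))
```

Here the provisioning difference has a stated cost-based reason and passes, while the differing configuration source has none and gets flagged. A check like this could run in continuous integration so that undocumented drift from production is caught early.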
So to avoid the impedance mismatch, it’s important to educate everybody on the importance of standardizing the pipeline across environments. If there’s a central release team, then that means all the development teams have to use whatever configuration management infrastructure it uses, since otherwise we’ll end up with the disjointed pipeline segments and impedance mismatch. But even if development teams can push their own production changes, it’s worth considering having all the teams use the same configuration management infrastructure anyway, since this approach can create deployment testing economies.
Some teams resist standardization instinctively, largely because they see it as stifling innovation or agility. Sometimes this is true, but for continuous delivery, standardization is required to deliver the desired agility. It can be useful to highlight cases where those teams accepted standardization for good reasons (e.g., a standardized look and feel across teams for enhanced user experience, or standardized development practices reflecting lessons learned), and then explain why continuous delivery is in fact another place where it’s required.
One sort of objection I’ve heard to a standardized pipeline came from the idea that the (internal) development team was essentially a third-party software vendor, and as such, ought not have to know anything about how the software is deployed into production. In particular it ought not have to adopt whatever standards are in place for production deployments.
This objection raises an interesting issue: it’s important to establish the big-picture model for how the development and operations teams will work together. If the development team is really going to be like a third-party vendor, where it’s independent of any given production environment, then it’s correct to decouple its development flow from any given flow into production. But then you’re not going to see continuous delivery any more than you would expect to see continuous delivery of software products from a vendor like Microsoft or Atlassian into your own production environment. Here leadership will have to choose between the external vs. internal development models. If the decision is to pursue the internal development model and continuous delivery, then pipeline standardization across environments is a must.