As a software engineering and management consultant, I don't often get to talk shop with my mom. But I recently received this email from her:
I think I understand better what this "continuous delivery" you talk about means.
They made an update in the software I use to do my job. I didn't find out until midday today after opening a support ticket because none of my work had been downloaded to the remote site, so that team could start working on it.
As they did some digging, they discovered that the update had broken the automated system that takes work from my system and transfers it to the remote system. The two systems are used at other sites as well and tickets were flooding in. So they "fixed" it and broke something else. We've all lost two days of work and it's still broken.
For practitioners in the software development industry, when your mother is asking about Continuous Delivery (CD) and DevOps, it's hard to deny the impact they've made on the way we think about shipping software. Indeed, the practices CD and DevOps espouse have brought us some of the most innovative methodologies the industry has seen in the last decade for software companies to address the desire to ship features (and thus create customer value) at a more rapid pace.
But there is often a subtle disconnect when organizations start to experiment with shipping their products more quickly, one handily seen in my mother's experience of "continuous delivery": yes, we may be shipping bits to customers more quickly, but where do software quality and testing fit into the process? Certainly they need to be accounted for somewhere, or else we're just "shipping garbage, but quicker," right?
It turns out this is often a more complex organizational conversation than we might initially think. The good news is, in my experience helping companies in a number of different industries with myriad software products, I can assure you: it's not an uncommon conversation for the organization to have, even if it may be an uncomfortable one at first. There are some methods you can use to move the conversation about testing, quality, and test automation forward in a productive way. You can help explain that in a world moving ever more toward continuous integration, continuous delivery, and... well... continuity, thinking about quality and automation and focused investment in those areas is not only necessary: it's actually critical to your software company's survival in this new world of "Continuous Everything."
The Automation Dichotomy
"Automate early, automate often." "It's not 'done' until it's automated." We hear these mantras constantly around the office water cooler and in scrum meetings. One of DevOps' major tenets is a laser-focus on automation and the importance of its role in software and infrastructure delivery.
And yet, these rallying cries are not new to anyone responsible for supporting software development or delivery, like QA or release engineers. In fact, those of us who were around before the "DevOps explosion" were often the ones harping on the importance of automation as others in the organization stared back blankly at us. And for all the cawing we (myself included!) did about it, it did little to change our industry's lack of automation. If it had, the DevOps movement wouldn't still be continually beating that drum.
To be fair, this is not a post facto indictment of those individuals and teams: in context, it made perfect sense why automation wasn't traditionally all that huge of a priority. Part of it, certainly, was the nascent set of tools we had to work with; sustained investment in automation was just more costly, and often involved an organization building its own frameworks to solve its (supposedly) totally-custom automation problems.
But the more important detail: back when we were shipping software on plastic discs that we shoved into boxes and shipped around the planet, there was just more slack time in the end-to-end delivery process. When we waterfalled changes through a four-month development cycle, followed by a two-month QA cycle and a one-month release cycle, there was a lot of room for QA and release engineers to go behind the curtains, "do their magic," and emerge with a tested product, all packaged and shiny and ready for customers. The means weren't as important to the business as the end, because the extra time of doing it manually, or re-doing it because it was done manually, was a rounding error in the larger scope of a product release cycle.
That world no longer exists. Look at a company like Google, which ships its Chrome browser every six weeks: suddenly, a test that requires a person to sit down and run through a set of steps is not only very noticeable in that schedule; if the intention is to test every commit in isolation (something that team lives by), it simply does not scale, no matter how many QA engineers are hired. And therein lies the dichotomy: we have, as an industry, extolled the importance of automation for years, but quite frankly, we just weren't that serious about it. As someone who got caught in the trap of repeating a long, tedious manual release process because the organization refused to invest at all in automation, I can say that is a depressing realization to come to.
But in the context of this new continuous world, it's also a powerful one. We now have numerous examples of companies (Google, Amazon, Netflix, Etsy) whose leadership moved from paying lip service to automation to being almost fanatical about it. And that's paid off for them.
We can use their software delivery velocities (and the direct translation to success in the marketplace) as proof that yelling "automate as much testing as early as possible!" while whispering "but... there is no budget in the schedule to automate it properly, much less budget for people or training" is a known anti-pattern. And one that will fail.
This alone can prompt your team's conversations around test automation's role in your software delivery process and can be useful in directly addressing the dichotomy between promises to automate and still-lacking automation.
Death by a Thousand Ad-hoc Cuts
One of the most difficult aspects of transitioning test automation into a continuous integration or CD pipeline is the often painful move away from ad-hoc, manual, and "here, follow this script" processes. Starting down this road often requires a "re-imagining" of the entire quality testing process, from how tests are implemented and executed, to the methods and abstractions used to perform that testing. A common sentiment experienced during this journey is "Yeah, I get that we're moving things to be automated, but we only run this test at release time, or every few releases, so can we leave it out of the larger automation initiative?"
Fortunately, both Continuous Delivery and DevOps make the heuristic for distinguishing reusable test automation from manual, infrequent, ad-hoc tests extremely simple: if a test is not in the continuous integration pipeline, expressed in code which a framework can run, from building the entire test environment all the way to producing a clear pass/fail report, then by default, it is not a reusable, automated process. And therefore, logically, it falls into the realm of an "ad-hoc" process. Continuous Delivery and DevOps advocates are extremely uncompromising on this point.
On one hand, that can be nice, as it creates an incredibly clear line in the sand about what tests our team should consider "automated" versus "ad-hoc": it's not considered to be automated until the test runs without intervention on a system where anyone—and I mean anyone—can either induce it to run via a code check-in or click a button, see the test run, and get their results.
On the other hand, this admittedly-inflexible line can turn out to be pretty frustrating for the team: we may be working with legacy software that doesn't have appropriate (or even any!) test hooks; our testing stack may be complex, bulky, and difficult to provision; we may have tests we only run infrequently, for whatever reason (usually related to how painful and/or manual those tests are). The rigidity of the definition isn't intended to be unempathetic to the problems test automation teams face. Rather, it's intended to remove ambiguity, so the tough conversations can be had, the organization can align itself on what it really means to build "quality automation" and, most importantly, get the resources necessary to add those test hooks, fix the provisioning problem, and rework or discard infrequently run tests that may no longer be providing value.
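The heuristic described above is mechanical enough to sketch in code. The following is a minimal Python illustration, not a real tool: the QualityCheck record, its field names, and the two sample processes are all hypothetical, chosen only to mirror the criteria in the definition above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class QualityCheck:
    """Hypothetical description of one test process (all names are illustrative)."""
    name: str
    in_ci_pipeline: bool          # part of the continuous integration pipeline
    expressed_as_code: bool       # a framework can run it; no human-followed script
    provisions_environment: bool  # builds its entire test environment itself
    reports_pass_fail: bool       # ends with a clear pass/fail report
    runs_unattended: bool         # needs no manual intervention while running
    anyone_can_trigger: bool      # started by a code check-in or a button click


def missing_criteria(check: QualityCheck) -> List[str]:
    """List the criteria this process does not yet meet, in definition order."""
    return [f.name for f in fields(check)
            if f.name != "name" and not getattr(check, f.name)]


def classify(check: QualityCheck) -> str:
    """A process is "automated" only if every criterion holds; otherwise "ad-hoc"."""
    return "ad-hoc" if missing_criteria(check) else "automated"


# Two made-up processes: one fully in the pipeline, one run only at release time.
regression_suite = QualityCheck("API regression suite",
                                True, True, True, True, True, True)
release_checklist = QualityCheck(
    name="release-time checklist",
    in_ci_pipeline=False,
    expressed_as_code=True,
    provisions_environment=False,
    reports_pass_fail=True,
    runs_unattended=False,
    anyone_can_trigger=False,
)

for check in (regression_suite, release_checklist):
    print(check.name, "->", classify(check), missing_criteria(check))
```

Listing the unmet criteria, rather than only the verdict, gives the team a concrete starting point for the conversation about test hooks, provisioning, and resources.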
In concert with this effort, successful organizations have a second group approach the problems of automating integration, functional, and system tests; this group tackles validating intended behaviors instead of disproving defects. These efforts usually play out as improvements in frameworks to drive tests, test environment deployment improvements, and better methods or hooks to test integration points or system interactions, maybe even by routing a small share of real, live traffic to the new version of the application, a technique called "canary deployments."
Certainly, test automation is never really done and even companies far down the road of implementing this pattern never get to 100% coverage, nor do these efforts ever "meet in the middle." But as core components that were once under-served by automated unit and regression tests gain more coverage, and as the pain of test environment handling and integration testing is soothed, more time becomes available for implementing new and novel tests... all of them, hopefully, driven automatically by our continuous integration system.
Breaking the (Perceived) Quality Bottleneck
One of the most difficult misconceptions to cope with when arguing for a focused investment in test automation and rolling out an automation strategy is the idea that such an effort will create a bottleneck or increase the overhead the QA process already creates. This is especially common when it's considered in the context of a decree that we must "increase our velocity and ship faster!" If these various types of tests are run in an automatic continuous integration environment (or, even better, a continuous delivery pipeline) nothing could be further from the truth. (In fact, in my experience, protestations to the contrary are actually veiled conversations about engineering resources and, unfortunately, admissions about the value the organization really puts on quality.)
When unit, functional, integration, and acceptance tests are fully automated (by the simple definition we've now established), these tests not only raise the entire organization's visibility into the quality state of our software, they also present a fascinating window into how the organization reacts when failure occurs: do a developer and a test automation engineer get together to tackle the problem? Or do we comment out the test, so the build goes green and we can ship? And if we do the latter, and it comes back to haunt us, we at least have visibility into the risk analysis decision that led us astray.
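One way to keep that risk decision visible is to mark a failing test as skipped, with a stated reason, rather than commenting it out. The sketch below uses Python's standard unittest module; the transfer function, the test names, and the skip reason are all hypothetical stand-ins, loosely modeled on the broken work transfer from the email at the top of this article.

```python
import unittest


def transfer(items):
    """Hypothetical stand-in for the system that moves work to a remote site."""
    return list(items)


class TransferTests(unittest.TestCase):
    """Illustrative tests; names and scenarios are assumptions, not a real suite."""

    def test_work_reaches_remote_site(self):
        self.assertEqual(transfer(["job-1"]), ["job-1"])

    # Skipped, not commented out: the reason is recorded in every test report.
    @unittest.skip("known failure after latest update; risk accepted for this release")
    def test_transfer_survives_update(self):
        self.fail("kept in the suite so the decision stays visible")


def run_suite():
    """Run the tests and return the result object for inspection."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TransferTests)
    result = unittest.TestResult()
    suite.run(result)
    return result


result = run_suite()
for test, reason in result.skipped:
    print("SKIPPED:", test.id(), "-", reason)
```

The build still goes green, but the skipped test and its reason show up in the report each time the pipeline runs, which a commented-out test never does.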
Many of the foundations of DevOps come from the ideas of W. Edwards Deming. Deming famously wrote, "Quality can not be inspected into a product or service; it must be built into it." It is an old adage, originally describing a manufacturing problem, but it's something seasoned QA engineers have been saying for years. When you account for all of the back-and-forth communication and rework that occurs when testing is not part of the pipeline through which commits flow from developer to customer, that cost is much higher than any claimed "bottleneck" that adding testing to the pipeline would create. And by adding it, we push the responsibility for quality back to anyone (not just developers) putting commits into that pipeline, instead of holding on to the pipe dream that a product can be handed off to QA and they can "quality-ify" it for the customer.
Toward Quality Automation
To many, the rallying around automation is not new. The calls to automate as early as possible are not new. The benefits have not suddenly been "discovered": they've been known to practitioners for years. So the interesting aspect of this new focus and attention on automation, in both testing and software delivery, is: if we knew the benefits all along, why weren't we able to get traction?
The answer, of course, is a complex one and is nuanced for every organization. What is clear: the world of software delivery has changed; the pace with which we're expected to deliver software is increasing; and if we want to have any chance of maintaining product quality, to say nothing of improving quality, investment in test automation is no longer an option or a secondary consideration.
Hopefully, the techniques outlined above provide some hints on how to move the conversation forward in your own organization, so that your team can get real, lasting traction on test automation.