5 Keys to Delivering a Successful DevOps Strategy
The successful release and deployment of software systems is an important task that requires a firm DevOps strategy; here are the keys to successful DevOps operations.
The release and reliable deployment of software systems have long been difficult and time-consuming processes. As the software industry moved to a more rapid release cadence, the ability to deploy and release frequently became more valuable, particularly for software hosted on the internet. The practice of DevOps, which aspires to match that increased cadence using automation, has inevitably risen to meet the challenge. To enable such unheard-of efficiency, the processes and tools involved have to be tuned to the extreme. This article examines some key hacks or practices that avoid common pitfalls and can unleash the ability to deliver value to customers at a breakneck pace.
Hack #1: Make DevOps Fit Your Culture
“Devops is about finding ways to adapt and innovate social structure, culture, and technology together in order to work more effectively.”
― Jennifer Davis & Ryn Daniels, Effective DevOps
DevOps isn’t so much a prescribed list of practices, but rather a philosophy that aims to combine some aspects of software development with operations in order to maximize the value delivered to customers. This doesn’t mean that operations folks will write code, or that software engineers will operate the systems. It does, however, imply a high level of automation to smooth the transition of software systems from development to production, a traditionally fraught transition full of errors and finger-pointing.
DevOps is a philosophy with its foundation in incremental change — this means, for those more accustomed to traditional methods, shorter cycles of requirements through delivery resulting in a more evolutionary approach. Another foundation of DevOps is introspection, where the process itself is constantly evaluated and adjusted. Keys to the best cultural fit:
- DevOps is a never-ending journey starting from where you are now. It is not a checklist that you can complete.
- DevOps processes require change beyond IT. Product management, security, engineering, sales, operations, customer success, and customers themselves all have a role to play in defining and delivering customer value.
- The incremental approach starts with the approach to introducing the DevOps philosophy internally. It’s critical to develop a plan, and then constantly re-evaluate and adjust.
Hack #2: The Need For Speed
“If it hurts, do it more frequently, and bring the pain forward.”
― Jez Humble, Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation
The frequent delivery model espoused by DevOps implies frequent builds, testing, and deployment. The scale of your teams and the cadence of delivery (which may vary by product) can put an enormous strain on compute (and possibly network) resources. Investing in the resources needed for a responsive pipeline, whether a private or cloud platform, is critical for overall success.
The need to scale testing, including integration testing, is often overlooked until it becomes a bottleneck. Automated unit testing is critical, but without automated integration testing, true visibility into the state of the system is not possible. Integration testing should cover end-to-end operation, security, load, and resiliency. These tests are potentially time- and resource-intensive, but essential to truly measure the quality of deliveries. If a regression run takes eight hours, that delay will be felt all the way up the pipeline. This implies the performant creation and teardown of virtual test environments.
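One common way to shrink a long regression run is to shard the suite across parallel workers. The sketch below is illustrative only: `run_shard` is a placeholder for whatever actually executes a slice of tests (a container, a VM, or a CI job), and the names are assumptions, not a real API.

```python
import concurrent.futures

def run_shard(tests):
    # Placeholder for invoking the real test runner on this slice;
    # in a real pipeline this might launch a container or CI job.
    return {"tests": tests, "passed": len(tests)}

def shard(tests, workers):
    """Split the suite into roughly equal slices, one per worker."""
    return [tests[i::workers] for i in range(workers)]

def run_suite(tests, workers=4):
    """Run all shards in parallel and collect their results."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_shard, shard(tests, workers)))
```

With four workers, an eight-hour serial run drops toward two hours, assuming the shards are balanced and the environment can host them concurrently.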
Hack #3: Availability
Hand in hand with performance is availability. The “continuous” in continuous delivery implies a highly available pipeline. Like the proverbial chain, the DevOps pipeline is only as available as its least available link. Select highly available components, or at least components that perform their task idempotently so they can be rerun in case of partial failure.
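An idempotent pipeline step checks the current state before acting, so rerunning it after a partial failure is harmless. This is a minimal sketch under the assumption of a simple in-memory record of deployed versions; the names are illustrative, not a real deployment API.

```python
# Stand-in for what the target environment reports as currently live.
deployed = {}

def deploy(service, version):
    """Idempotent: if the desired version is already live, do nothing.

    A rerun after a partial pipeline failure therefore costs nothing
    and cannot double-apply the change.
    """
    if deployed.get(service) == version:
        return "noop"
    deployed[service] = version
    return "deployed"
```

The same convergence pattern ("make the world match the desired state, skip if it already does") is what lets a failed pipeline be restarted from the top rather than carefully resumed mid-run.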
Availability can be attained by using highly available systems on-premises or by using cloud SaaS solutions that can scale and heal themselves without user intervention (for a price). GitHub Actions and CircleCI are popular SaaS services that provide highly available cloud-based DevOps platforms. Both support on-premises “runners” that enable local testing operations, so you can blend the cloud service with your local environment tools as needed.
Hack #4: Gather and Use Metrics
“If you can’t measure it, you can’t improve it.”
― Peter Drucker
Like any software system, the DevOps pipeline itself needs to evolve and improve over time. In order to improve and support hacks 2 and 3, the systems performing the automated test and delivery must be measured and improved where possible. The metrics associated with continuous delivery should be tied to goals. Metrics like testing time performance (unit and regression), failure rates, cost (if cloud-hosted), and actual availability (downtime) should be evaluated on a consistent basis.
Ultimately, these low-level metrics will tie back to fundamental business metrics that measure performance, such as lead time, deployment frequency, and mean time to restore (from failure). The DevOps toolchain performance underlies all of these.
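Two of those business metrics, deployment frequency and mean time to restore, fall straight out of timestamped pipeline records. A hedged sketch, assuming a simple list-of-events record format of my own invention:

```python
from datetime import datetime, timedelta

def mttr(incidents):
    """Mean time to restore: average of (restored - failed) per incident.

    `incidents` is a list of (failed_at, restored_at) datetime pairs.
    """
    gaps = [restored - failed for failed, restored in incidents]
    return sum(gaps, timedelta()) / len(gaps)

def deploy_frequency(deploy_times, window_days):
    """Deployments per day over the observation window."""
    return len(deploy_times) / window_days
```

The point is less the arithmetic than the habit: emit timestamps from the pipeline, derive the metrics automatically, and review them on the same cadence as the releases themselves.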
Hack #5: Promote Focused and Meaningful Testing
"Many people who chase test coverage metrics usually should be doing something more useful instead."
― Martin Fowler
A common development testing metric is “code coverage”. The goal of this metric is to establish what percentage of the potential logic paths in the code have been exercised by tests. As such, coverage is a good tool for pointing the way to areas of the code that may need correction or firming up. Of course, coverage tools are limited to analyzing the code they are given. No coverage tool will tell you what code you should have written to handle certain conditions. For example, it won’t tell you that your exception handling is inadequate, or that you’ve neglected to handle a particular return status.
There is a temptation to use a particular code coverage percentage as a hard limit or gate to permit a release. This is a mistake because developers will deliver that number at the expense of quality when under pressure. It is better to understand that test coverage is not an actual measurement of code quality. What it does provide is a level of assurance that the developer checked their work, and (more importantly) a means of detecting breaking code changes in the future long after the original author has moved on. Good test coverage should be a goal but not a hard limit. To catch those “errors of omission”, a robust integration testing capability can help.
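The gaming problem is easy to demonstrate. In this contrived example, the first test executes every line of the function (100% coverage) yet asserts nothing, so the bug sails through; only the second, meaningful check exposes it. The function and its bug are invented for illustration.

```python
def apply_discount(price, pct):
    # Bug: pct is treated as a fraction, but callers pass a percentage,
    # so apply_discount(100, 10) yields -900 instead of 90.
    return price - price * pct

def coverage_gaming_test():
    """Executes every line of apply_discount, asserts nothing.

    A coverage tool reports 100% for this; quality is unmeasured.
    """
    apply_discount(100, 10)
    return True

def meaningful_test():
    """The check that matters: 10% off 100 should be 90."""
    return apply_discount(100, 10) == 90.0
```

A team pressured to hit a coverage gate can ship the first test and satisfy the number; only assertions on behavior catch the defect.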
It should be mentioned for completeness that not all code is equal. Code that is widely reused, and/or that can perform destructive acts (for example, deleting customer data), needs a much higher level of test coverage and scrutiny. Writing tests is time-consuming and ongoing, so limited resources need to be focused first, and comprehensively, on critical code.
In order to have full confidence in automated test results, an integration test environment should mimic production as much as possible. This can be fairly straightforward in the case of a hosted web application, or almost infinite in complexity in the case of an application that customers install at their premises. The more complex and varied the target environments are, the more testing is required.
The basic goal of DevOps is the streamlined delivery of value to customers. It seeks to break down traditional barriers between developers and operators to enable frequent releases of features and bug fixes. The ability to deliver frequently with confidence requires a high level of automation, especially around testing, from the unit level through end-to-end.
One of the great benefits of virtualization technology (and the cloud by extension) is the ability to spin up arbitrary collections of servers and network configurations (testing sandboxes). With the proper orchestration tools, specific test configurations can be spun up and torn down on demand, greatly increasing the testing surface area and enabling a very high level of automation and quality confidence.
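The spin-up/teardown lifecycle maps naturally onto a context manager: provision on entry, destroy on exit, with teardown guaranteed even when tests fail. The provisioning calls below are placeholders for your actual orchestration tool (Terraform, a cloud SDK, etc.); only the lifecycle shape is the point.

```python
from contextlib import contextmanager

@contextmanager
def sandbox(name):
    # Stand-in for real provisioning (e.g., creating VMs and networks).
    env = {"name": name, "status": "up"}
    try:
        yield env
    finally:
        # Teardown runs even if the test body raises, so sandboxes
        # never leak and never linger to skew the next run.
        env["status"] = "torn_down"
```

A test job then becomes `with sandbox("integration-42") as env: run_tests(env)`, and the orchestration guarantees a clean slate for every run.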
Opinions expressed by DZone contributors are their own.