While there are many ways to do DevOps correctly, there are specific cardinal sins that will cause you to run afoul of the Church of DevOps. From lacking an incident management tool that handles critical alerts to treating DevOps as a job title, there are many ways to hurt your status as a class-A DevOps shop. To achieve excellence in DevOps, executives need to avoid committing the cardinal sins discussed below.
1. You Treat DevOps as a Title, Not a Philosophy
In speaking to directors of engineering at numerous companies, I have heard the phrase, "if you have 'DevOps' in your title, you’re doing it wrong." The point of this statement is that DevOps is a philosophy, not a title. You shouldn’t assume that you can simply put the word DevOps in someone’s title and get anywhere near implementing a DevOps-focused enterprise.
As Matt Juszczak of Bitilancer writes:
Calling yourself or somebody else a DevOps Engineer, a DevOps System Administrator or a DevOps Tester reflects a fundamental misunderstanding about DevOps. This confusion is contributing to a lot of project and program frustrations and failures that are hurting teams and companies, sidetracking careers, and creating backlash for recruiters.
Instead, DevOps needs to be understood as a methodology that takes software development and shifts it “left.” More concisely,
DevOps is the practice of operations and development engineers participating together in the entire service lifecycle, from design through the development process to production support.
2. You Don’t Have Buy-In From Everyone From the CIO on Down
DevOps success is ultimately about transforming the way the business thinks about software and its importance to commercial success. DevOps needs to be seen as fundamentally a business change, not a technology change. While DevOps is often associated with new tools and practices, the real change is the new way of working that aligns technology with business strategy.
Teams can have bits and pieces of DevOps philosophy sprinkled through a project’s implementation, but for the organization as a whole to benefit from DevOps, it requires support from the top. This means that for success, your CIO needs to be a DevOps champion – to buy in, support and in some cases be able to lead this effort.
3. You Don’t Focus on Metrics
In a previous blog article, I quoted Peter Drucker, who famously said, “If you can’t measure it, you can’t improve it.” With that in mind, it is important to measure every step of the DevOps lifecycle to ensure that you are, for example, actually increasing your release frequency, decreasing MTTR, or minimizing the change failure rate. If you are not measuring these numbers, then you have no idea whether you are doing awesome or are still a work in progress.
The right metrics are essential to making sure that your DevOps transformation is successful. It’s important, though, to go beyond technology metrics. Metrics such as Mean Time To Resolution (MTTR) or Mean Time Between Failures (MTBF) are important, but you should also focus on process and people metrics. Monthly or daily active users and the development-to-deployment lead time are also important metrics to consider when measuring your current effectiveness.
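As a rough illustration of what measuring these numbers can look like, here is a minimal Python sketch that computes MTTR, change failure rate, and mean development-to-deployment lead time from simple records. The `Incident` and `Deployment` classes and their field names are assumptions made for this example; in practice the data would come from your own incident and deployment tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Incident:
    """Hypothetical incident record (field names are assumptions)."""
    detected_at: datetime
    resolved_at: datetime


@dataclass
class Deployment:
    """Hypothetical deployment record (field names are assumptions)."""
    committed_at: datetime  # when the change was committed
    deployed_at: datetime   # when the change reached production
    failed: bool = False    # True if it caused an incident or a rollback


def mean_time_to_resolution(incidents: List[Incident]) -> Optional[timedelta]:
    """Average time from detection to resolution, or None with no incidents."""
    if not incidents:
        return None
    total = sum((i.resolved_at - i.detected_at for i in incidents), timedelta())
    return total / len(incidents)


def change_failure_rate(deployments: List[Deployment]) -> Optional[float]:
    """Fraction of deployments marked as failed, or None with no deployments."""
    if not deployments:
        return None
    return sum(1 for d in deployments if d.failed) / len(deployments)


def mean_lead_time(deployments: List[Deployment]) -> Optional[timedelta]:
    """Average time from commit to production, or None with no deployments."""
    if not deployments:
        return None
    total = sum((d.deployed_at - d.committed_at for d in deployments), timedelta())
    return total / len(deployments)


# Example usage with made-up data.
t0 = datetime(2024, 1, 1, 9, 0)
incidents = [
    Incident(t0, t0 + timedelta(minutes=30)),
    Incident(t0, t0 + timedelta(minutes=90)),
]
deployments = [
    Deployment(t0, t0 + timedelta(hours=2)),
    Deployment(t0, t0 + timedelta(hours=4), failed=True),
    Deployment(t0, t0 + timedelta(hours=6)),
    Deployment(t0, t0 + timedelta(hours=8)),
]

print(mean_time_to_resolution(incidents))  # 1:00:00
print(change_failure_rate(deployments))    # 0.25
print(mean_lead_time(deployments))         # 5:00:00
```

Tracking these values release over release, rather than as one-off snapshots, is what tells you whether the trend is actually improving.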
4. You Focus on DevOps as a Race to Acquire More Tools
Just as DevOps cannot be seen as a title in an org chart, it also cannot be thought of simply as a matter of tools. In the DevOps world, there’s been an explosion of tools for release (Jenkins, Travis, TeamCity), configuration management (Puppet, Chef, Ansible, CFEngine), orchestration (Zookeeper, Noah, Mesos), monitoring, virtualization and containerization (AWS, OpenStack, Vagrant, Docker) and many more. DevOps engineers are famous for their love of new tools, but at some point, engineers need to focus specifically on achieving goals.
At the very least, a new DevOps tool should follow the Hippocratic oath and should do no harm to any team. A true DevOps solution will appeal to Dev, Ops, and Security. As one engineer writes,
If daily routines will be impacted by the new “DevOps” tool, getting early buy-in from affected teams is key. Otherwise it will be extremely hard to get the other team to adopt the solution and it will never realize its full potential.
DevOps is about breaking down silos and barriers so employees can get work done more quickly. That means having management buy-in, not just buying more tools.
5. You Think That Failure Is Unacceptable
Companies might be automating correctly and have management buy-in, but the DevOps team gets it wrong when they don’t embrace failure. Netflix, for example, actually tries to anticipate failure so they are ready for when scenarios such as downed servers or non-functioning code do show up.
On the philosophical end, management needs to realize that failure is part of the practice of creating and releasing code. Rather than holding painful post-mortems that focus on finger-pointing, teams need to hold constructive, blameless post-mortems that focus on understanding the issues and how they can be avoided in the future. Ideally, a failed release is met with:
“New tests...built around your mistake so that it’s caught next time and everyone acts like it’s simply another day. This is when you know your company has adopted an important DevOps philosophy.”
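To make the idea of tests built around a mistake concrete, here is a minimal Python sketch. It assumes a hypothetical incident in which a blank timeout setting crashed a service at startup; the `parse_timeout_seconds` helper and the scenario are invented for illustration. After the fix, the team adds regression tests that pin the corrected behavior so the same mistake is caught before the next release.

```python
def parse_timeout_seconds(raw: str, default: int = 30) -> int:
    """Hypothetical config helper: parse a timeout, falling back to a default.

    In the imagined incident, a blank value reached int() and raised
    ValueError at startup. The fix treats blank input as "use the default".
    """
    raw = raw.strip()
    if not raw:
        return default
    value = int(raw)
    if value <= 0:
        raise ValueError("timeout must be a positive number of seconds")
    return value


# Regression tests written after the (hypothetical) incident.
def test_blank_timeout_falls_back_to_default():
    assert parse_timeout_seconds("") == 30
    assert parse_timeout_seconds("   ") == 30


def test_valid_timeout_is_parsed():
    assert parse_timeout_seconds("45") == 45
    assert parse_timeout_seconds(" 5 ") == 5
```

Functions named `test_*` like these would be picked up by a test runner such as pytest and run on every build, which is what turns a one-time failure into a permanent safeguard.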
6. You Maintain the Divide Between Devs and Ops
Gene Kim notes that effective DevOps “emphasizes the performance of the entire system, as opposed to the performance of a specific silo of work or department.”
As has been written in many articles describing the problems of DevOps, Dev and Ops cannot sit in silos that don’t speak to one another. Devs cannot create code and throw it over the wall when they are finished and expect Ops to deploy it. That just doesn’t work. Instead, Devs and Ops need to work as one team.
Often, this comes down to having both Devs and Ops on-call. If Devs see that their code is causing numerous problems and is waking them up in the middle of the night, they will be more conscientious about writing and testing their code. Similarly, if Ops sees the pressures Devs are facing, they are more likely to be sympathetic.
7. You Don’t Use Critical Alerting Tools
Relying on ineffective critical alerting tools to notify engineers of critical incidents will magnify many of the other DevOps sins. If critical alerting tools are not part of the fundamental philosophy of a DevOps team, then:
- The team’s ability to focus on metrics is diminished. If you don’t know when an incident occurs, for example, then how can you decrease MTTR?
- Effective post-mortems will be much more difficult. Critical alerting tools like OnPage allow teams to see where the alerting might have gone wrong or whether incorrect information was delivered in an alert.
- The divide between Devs and Ops will continue. By putting both teams on ‘on-call’ duty, each can see what alerts were created. By having a clear vision of what is creating alerts, there will inevitably be empathy built on both sides.
Critical alerting tools are key to decreasing downtime, maintaining customer satisfaction and resolving issues quickly. It is indeed quite sinful to ignore these points and continue with tools that don’t effectively alert your on-call team.
Committing any of these DevOps sins does not mean that a company is irredeemable. Instead, recognizing the fault is probably the first step toward correcting it. The best advice is to work at unraveling one sin at a time.