I was at DevOps World last week (nothing like Disney World, by the way) and happened to be paying attention to a talk by a chap called Jonathan who worked at Barclays Bank. He briefly mentioned a couple of KPIs that they measure to track the success of their DevOps initiative. He mentioned these:
This list looked quite good to me. I thought, “They sound pretty sensible; I’ll remember those for the next time someone asks me about DevOps KPIs.” The reason I thought this, you see, is that I get asked “What are good DevOps KPIs?” almost every week. Colleagues, clients, friends & family, random strangers, the dog… Everyone asks me. It’s like I’m wearing a T-shirt that says “Ask me about DevOps KPIs” or something.
So, the time has come to formulate a decent answer. Or, more specifically, to write a blog post on it, so I can then tell people to read my blog! Hurrah!
A couple of months ago, while discussing a DevOps transformation with a global telecoms company, the subject of metrics and KPIs came up. We’d spent the previous hour or so hearing about how one particular part of the business was so unique and different to all the others, and that any DevOps transformation would need to be specifically tailored to accommodate this business’s unique demands. I totally agree with this approach. However, when the subject of KPIs came up, the “one-size-fits-all” approach was favored.
It’s common for organizations to want KPIs that span the whole organization. It’s convenient and allows management to compare and contrast (for whatever good that’ll bring). But does this “one-size-fits-all” approach work? Or does it encourage the wrong behaviors?
Personally, I think that you need to be very careful about selecting your KPIs and metrics. A line often attributed to Peter Drucker says that “you can’t manage what you can’t measure,” which sounds sensible enough, but it leads us towards trying to measure everything (because we want to manage as much as we can, right?).
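That’s where things get a bit tricky. As soon as we start measuring things, they change. This is similar to the observer effect (for people, it’s often called the Hawthorne effect), and it’s closely related to Goodhart’s Law, which says that when a measure becomes a target, it ceases to be a good measure. What I’m talking about specifically is people changing their behaviors because they’re being measured.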
Once you measure something, it changes.
If we’re being measured on utilization level, we try to expand our work to fill the time we have available, in order to look fully utilized. It’s what people do! By doing this, people lose the “downtime” they used to have, the time when people are most creative, and as a result, innovation suffers.
So, What Should We Measure?
It depends on what you’re trying to achieve and what side-effects you’re able to tolerate. Think very carefully about how your metrics and KPIs could be interpreted by both subordinates and management.
For example, I’m currently working with a team that until recently measured the age of stories in the backlog. The thought was, the larger the number, the longer it’s taking to get stuff done. The reality was different. There was an increasing number of low priority stories, which were often (and quite legitimately) overlooked in favor of higher priority stories, and those neglected stories kept pushing the average age up. So, what did the metric really prove: that the team was slow or that the team was effective at prioritizing?
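To make the backlog example a bit more concrete, here’s a rough Python sketch. Everything in it is made up for illustration: the story IDs, the dates, the two priority labels, and the function names are my assumptions, not that team’s real data or tooling. It simply shows how one headline number (average story age) can say “slow” while the same data, split by priority, says “prioritizing well”.

```python
from datetime import date
from statistics import mean

# Hypothetical backlog snapshot: (story id, priority, date created).
# These values are invented purely for illustration.
BACKLOG = [
    ("S-1", "high", date(2024, 5, 27)),
    ("S-2", "high", date(2024, 5, 30)),
    ("S-3", "low", date(2024, 1, 3)),
    ("S-4", "low", date(2024, 2, 2)),
    ("S-5", "low", date(2024, 3, 3)),
]


def mean_age_days(stories, today):
    """Average age in days of the given stories, as of `today`."""
    return mean((today - created).days for _, _, created in stories)


def mean_age_by_priority(stories, today):
    """Average age in days, computed separately for each priority."""
    groups = {}
    for story in stories:
        groups.setdefault(story[1], []).append(story)
    return {prio: mean_age_days(group, today) for prio, group in groups.items()}


TODAY = date(2024, 6, 1)  # fixed "today" so the numbers are repeatable
overall = mean_age_days(BACKLOG, TODAY)
by_priority = mean_age_by_priority(BACKLOG, TODAY)

print(f"Average age, whole backlog: {overall} days")
for prio, age in by_priority.items():
    print(f"Average age, {prio} priority: {age} days")
```

With this invented data, the whole-backlog average is dominated by three old low priority stories, while the high priority stories are only a few days old. Same backlog, two very different stories, depending on which number ends up on the dashboard.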
Generally speaking, I think most stats need to be accompanied by a narrative, otherwise they’re open to misinterpretation. However, we know that there’s often very little room for narrative and that the fear of misinterpretation drives people to try to “game” the stats (that is to say, manipulate the results so that they look better). And this is another reason why we have to be very careful when we’re planning KPIs and reporting metrics.
In 2014, Gartner produced a report entitled “Data-Driven DevOps: Use Metrics to Help Guide Your Journey” in which they listed a range of typical DevOps metrics, categorized by type, such as “Business Performance,” “Operational Efficiency,” and so on. I’ve picked out a few of the metrics in the table below. I’ve also added some others which I’ve been using in one form or another. This is by no means an exhaustive list of DevOps KPIs, but it might be somewhere to start if you’re looking for inspiration.
Measuring Tangibles and Intangibles
One thing to be conscious of is that you can’t really measure things like culture and collaboration directly. Culture, for example, is an intangible asset, and you can only really measure the results of the culture, rather than the culture itself. The same goes for collaboration.
In the table above, be wary of things like happiness, value, and sharing, as these can be hard to measure directly and are somewhat subjective.