I originally wrote this post for the NDepend blog. You can check out the original here, at their site. Take a look at NDepend while you’re there; if static analysis interests you in the .NET space, it’s a must-try.
As I’ve probably mentioned before, many of my clients pay me to come do assessments of their codebases, application portfolios, and software practice. And, as you can no doubt imagine, some of my sturdiest, trustiest tools in the tool chest for this work are various forms of static analysis.
Sometimes I go to client sites, by plane, train, or automobile (okay, never by train). Sometimes I just remote in. Sometimes I do fancy write-ups. Sometimes, I present my findings with spiffy slide decks. And sometimes, I simply deliver a verbal report without fanfare. The particulars vary, but what never varies is why I’m there.
Here’s a hint: I’m never there because the client wants to pay my rate to brag about how everything is great with their software.
Where Does It All Go Wrong?
Given what I’m describing here, one might conclude that I’m some sort of code snob and that I am, at the very least, heavily judging everyone’s code. And, while I’ll admit that every now and then I think, “the Daily WTF would love this,” mostly I’m not judging at all – just cataloging. After all, I wasn’t sitting with you during the pre-release death march, nor was I the one thinking, “someone is literally screaming at me, so global variable it is.”
I earnestly tell developers at client sites that I don’t know that I’d have done a lot better walking a mile in their shoes. What I do know is that I’d have, in my head, a clearer map from “global variable today” to “massive pain tomorrow” and be better able to articulate it to management. But, on the whole, I’m like a home inspector checking out a home that was rented and subsequently trashed by a rock band; I’m writing up an assessment of the damage, not judging their lifestyle.
But for my clients, I’m asked to do more than inspect and catalog. I also have to do root cause analysis and offer suggestions. So, “maybe pass a house rule limiting renters to a single bottle of whiskey per night,” to return to the inspection metaphor. And cataloging all of these has led me to be a veritable human encyclopedia of preventable software development mistakes.
I was contemplating some of these mistakes recently and asking myself, “which was the biggest one” and “which would have been the most preventable with even simple analysis in place?” It was interesting to realize, after a while, that the clear answer was not at all what you’d expect.
Some of the Biggies
Before the best candidate, some obvious runners-up occurred to me that line up with the kind of thing you might expect. There was the juggernaut assembly. It was a .NET solution with only one project, but man, what a project. It probably should have been about 30 projects, and, when I asked why it wasn’t, there was an awkward silence.
It turned out that there had been talk of splitting it up, and there had even been attempts. Where things got sticky, however, was that there was a rather large and complex dependency cycle among some of the more prominent namespaces. Efforts had revolved around turning namespaces into assemblies, and, while dependency cycles between namespaces are tolerated by the compiler, cycles between assemblies… not so much. The “great split-up” then became one of those things that the team resolved to do “next time we get some breathing room.” And, as anyone (including those saying it) would likely have predicted, “next time” never came.
Had there been relatively basic static analysis in place, these folks could have seen a warning about it the first time someone created a cycle from the previously acyclic graph. As it stood, who knows how many months or years elapsed between its introduction and discovery.
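To make that kind of warning concrete, here is a minimal sketch of a cycle check over a namespace dependency graph. It is not how NDepend or any particular tool implements it; the function name `find_cycle`, the dictionary shape, and the `App.*` namespaces are all made up for illustration, and I’ve written it in Python only to keep it short.

```python
from typing import Dict, List, Optional, Set

# Hypothetical input shape: each namespace maps to the namespaces it depends on.
DependencyGraph = Dict[str, Set[str]]


def find_cycle(deps: DependencyGraph) -> Optional[List[str]]:
    """Return one dependency cycle as a path (first node repeated at the end),
    or None if the graph is acyclic. Uses a depth-first search."""
    IN_PROGRESS, DONE = 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        state[node] = IN_PROGRESS
        path.append(node)
        for target in sorted(deps.get(node, ())):
            if state.get(target) == IN_PROGRESS:
                # We came back to a node that is still on the current path: a cycle.
                return path[path.index(target):] + [target]
            if target not in state:
                found = visit(target)
                if found:
                    return found
        path.pop()
        state[node] = DONE
        return None

    for node in sorted(deps):
        if node not in state:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Illustrative namespaces only; these are not from any real codebase.
acyclic = {
    "App.Web": {"App.Domain"},
    "App.Domain": {"App.Data"},
    "App.Data": set(),
}
# The same graph after one extra "using" closes the loop.
cyclic = {**acyclic, "App.Data": {"App.Web"}}
```

Run as part of a build, a check like this would fail on the day `cyclic` came into being rather than months or years later, when splitting the assembly is already painful.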
Of course, there are others that are easy to explain. There was the method with a cyclomatic complexity pushing four digits, which someone probably would have wrangled before it reached three digits had anything been flagging it. There was the untested class that every other class in the codebase touched, directly or indirectly (I’m sure you can predict one of the problems I heard about there). There was the codebase with the lowest cohesion score I’ve ever seen, accompanied by complaints of weird bugs in components caused by changes in other, "unrelated" components.
The Worst of All
But the worst case I’ve seen was not really like these. It wasn’t a matter of some dreadful pocket of code or some mistake that could have been caught early. Instead, it was an entire codebase that never should have been.
I’m going to change some details here so as not to offer clues as to the client’s true identity, so let’s just say that I was doing a tour consulting with a large shop with a large application portfolio. Historically, they’d had dozens or even hundreds of Java applications, but they were starting to dip their toe into .NET, and specifically C#.
By the time I’d gotten there, they’d taken a converted, long-tenured Java developer and tasked him with building out a "framework" to enable rapid development of future .NET applications within the company. They’d also hired on a bunch of .NET folks to assist in this. The codebase I found was disconcerting. There were anti-patterns and common pitfall errors galore, as well as a strained use of inheritance and zany, unnecessary runtime binding schemes. The most amazing feature, though, was a base “DataTransferObject” class from which every property bag object in the application inherited. In its instance constructor, it iterated over all of its own reflected properties and stored, in an instance variable, a hash mapping each property’s string name to its value. Every simple DTO in the system took 0.25 seconds to be instantiated.
It was a mess. And it was a mess that they were furiously prototyping all over the organization, in spite of the diplomatic protests of some of the newer .NET dev hires.
Static Analysis as Reality Check
You might wonder how this is a case that static analysis could have solved. After all, they could have been dinged for the excessive inheritance, but there aren’t any “do you have an explanation-defying reflection scheme in your constructor?” queries, nor are there any obvious warnings for “are you relating objects with lots of magic strings?” Static analysis wouldn’t have caught these errors per se.
But what it would have done is light up like a well-decorated Christmas tree on this nascent codebase, indicating to anyone who was looking that there was a sizable gulf between their code and what the industry considers good code. And that might just have caused someone in a position to do so to put the brakes on rolling out this boondoggle en masse, before it was too late.
There isn’t any single line of code that’s going to bring your business to its knees (in all likelihood, anyway), nor is there going to be a specific tipping point with method complexity, fan-in, or anything like that. Those are mistakes that get made and get corrected. But static analysis, as a whole, shines a bright light on whether the trusted staff at an organization knows what it’s doing or not.
The biggest mistake I have seen and continue to see, without question, is that organizations trust a single, tenured developer to be infallible and to steer the ship with only subordinates as copilots. That developer is the guardian, but no one guards the guardian... until you introduce static analysis to do it.