I’ve gotten some questions about the idea of what we do being scientific or not, and that raises some interesting discussion points. This is a “You Asked for It” post, and in this one, I’m just going to dive right into the reader question. Don’t bury the lead, as it were.
Having attended many workshops on Agile from prominent players in the field, as well as working in teams attempting to use Agile, I can’t help but think that there is nothing Scientific about it at all. Most lectures and books pander to pop psychology, and the tenets of Agile themselves are not backed up by any serious studies as far as I’m aware.
In my opinion we have a Software Engineering Methodology Racket. It’s all anecdotes and common sense, with almost no attention on studying outcomes and collecting evidence.
So my question is: Do you think there is a lack of evidence based software engineering and academic rigor in the industry? Do too many of us simply jump on the latest fad without really questioning the claims made by their creators?
I love this. It’s refreshingly skeptical, and it captures a sentiment that I share and understand. Also notice that this isn’t a person saying, “I’ve heard about this stuff from a distance, and it’s wrong,” but rather, “I’ve lived this stuff and I’m wondering if the emperor lacks clothes.” I try to avoid criticizing things unless I’ve lived them myself, and I tend to try something if my impulse is to criticize.
I’ll offer two short answers to the questions first. Yes and yes. There is certainly a lack of evidence-based methodology around what we do, and I attribute this largely to the fact that it’s really, really hard to gather and interpret the evidence. And, in the absence of science, yes, we collectively tend to turn to tribal ritual. It’s a God of the Gaps scenario; we haven’t yet figured out why it rains, so we dance and assume that it pleases a rain God.
Evidence is Hard
When I say “evidence is hard,” please don’t hear my voice sounding satirically whiny in your mind. I’m not being snarky; I’m being dead serious. It’s really hard to gather and synthesize evidence of efficacy in the field of software engineering. I understand this, perhaps more keenly than some, because I took graduate level CS and software engineering courses where I read and wrote white papers on relevant industry topics.
I can’t recall the exact details, but I recall reading a paper in a course called, “Advanced Topics in Software Engineering” that made a study of the relationship between the length of methods in a code base and the incidence of bugs. Doubtless, this was a well-conceived attempt to answer the question “how long should your methods be?” once and for all. I’d say this is deceptively simple, but I doubt anyone is deceived. You can, no doubt, imagine problems right off the bat, particularly in attempting this a decade ago.
We probably have to pick a tech stack immediately, and hope it’s representative. After all, not all lines of code are created equal across languages and paradigms. We also have to go out and find a code base that’s in something resembling production because, absent that, “bug” is sort of meaningless. But, even assuming we find that, how do we know the rate of bug reporting is representative? I mean, what if the code is in production and it has no users? What if it has users that just don’t care enough to report bugs? And what exactly is a bug, anyway? You can see where this is going.
But before I finish asking study-confounding questions, let’s get to perhaps the most fundamentally bothersome ones. What production code can we find that isn’t jealously guarded as intellectual property, and whose outcomes can we measure that don’t mind their dirty laundry being aired? Do you think, especially 10 years ago, that Microsoft and Apple were in a race to let the world measure their OS method lengths against their defect counts?
And so when we do these types of studies, we conduct them using open source projects to represent the average software project, and we use people like grad students and undergrads to represent the average developer. Even if the methodology were flawless and the results compelling for any given study, we’ve successfully studied something that doesn’t begin to represent the industry.
I’m not saying that there haven’t been studies that peeked inside corporate walls; I can’t make that claim. I’m just looking to present an inherent challenge that has existed when it comes to software construction that never existed for, say, biology. Biology would be a lot harder to get right if nearly every organism on Earth were owned by some company or another.
I don’t want to belabor the point unnecessarily, so if you accept my premise that it’s really hard to run experiments correlating software construction techniques with qualitative outcomes, you’ll agree that we’re faced with a daunting, but theoretically possible task. We could, theoretically, pry people’s jealous fingers off of their source code (and one might argue that we’re moving slowly in this direction as an industry) and start to inject more transparency into approaches and outcomes.
But even with that, it’s important to note that there are fundamental differences between what experimental scientists do and what we, as commercial software developers, do. A physicist, for instance, observes phenomena in the world at large, hypothesizes as to cause, runs experiments to confirm or falsify, and molds those into theories. The scientist studies nature and makes predictions.
Let’s consider three actors in the realm of physics, as a science.
- A physicist, who runs electricity through things to see if they explode.
- An electrical engineer, who takes the knowledge of what explodes from the physicist and designs circuitry for houses.
- An electrician, who wires houses using the circuits designed by the electrical engineer.
I list these to illustrate that there are layers of abstraction on top of actual science. Is an electrician a scientist, and does the electrician use science? Well, no, not really. His work isn’t advancing the cause of physics, even if he is indirectly using its principles.
Let’s do a quick exercise that might be a bit sobering when we think of “computer science.” We’ll consider another three actors.
- Discrete mathematician, looking to win herself a Fields medal for a polynomial time factoring algorithm.
- R&D programmer, taking the best factoring algorithms and turning them into RSA libraries.
- Line of business programmer, securing company’s SharePoint against script kiddies uploading porn.
Programming is knowledge work and non-repetitive, so the comparison is unfair in some ways. But, nevertheless, what we do is a lot more like what an electrician does than what a scientist does. We’re not getting paid to run experiments — we’re getting paid to build things.
Which means that, if we’re going to be applying science to the field of programming, we’re not the scientists. We’re the subjects.
Here’s a third set of actors.
- Neuroscientist studies cognition and sees how humans most effectively process sequential instructions.
- Language designer uses that knowledge to build a programming language that is maximally grokkable by programmers.
- Savvy programming teams use that programming language.
As I said, we, as programmers, build things. If science is going to be part of the discussion, it’s almost certainly going to be drawn from disciplines around cognition, psychology, and systems theory, and it’s going to be applied to things that we can use to be more effective.
To return to the original thrust of the question, then, it’s apparent that there’s some sort of impedance mismatch. “How do we make humans most effective at automating and building things” is not a question that we solve by writing and tweaking how we write software, but rather, by studying humans and systems. Writing software and then looking back at our experience writing that software doesn’t produce science; it produces anecdotes. And, to quote something I heard from a man named Mark Crislip, whose podcast I love (I don’t think it’s originally his), “the plural of anecdote is ‘anecdotes,’ not ‘data’.”
There are a lot of people out there who make a lot of money describing the best ways to write software. Is software best written in offices, cubicles, or flat tables in warehouse-like spaces? Should you write automated tests before or after you write production code? Heck, if you want to see some strangely angry people come out of the woodwork, go on Twitter and talk about whether you should estimate software projects. And, do you know what? It’s all largely based on personal experience, shared collectively and writ large.
Certainly, there are practices that have higher success rates than others, and even measurably so. Electricians can certainly tell you which wire caps have lasted the longest and which have resulted in repeat visits. But nevertheless, any process approach should come with the heavy, ever-present caveat of, “this is something I’ve tried and found success with, so it’s possible that you might also.” I’d be intensely leery of any stronger suggestions or promises for success when it comes to software process. One of the best things to take away from the good lessons I’ve seen offered by the agile movement is “inspect and adapt.” It’s not prescriptive, and no one’s getting certified, but it’s certainly honest and well intentioned. All you can really do for now is try stuff, see if it works and adjust accordingly as you go.
This post was a response to a reader question. Want to ask me one? You can tweet me at @daedtech.