Last week, a post I wrote, “The Myth of the Software Rewrite,” became pretty popular. This generated a lot of comments and discussion, so I decided just to write a follow-up post to address the discussion, as opposed to typing a blog post’s worth of thoughts, distributed over 20 or 30 comments. This is that post.
First of all, I want to be clear about what I’m talking about. I’m talking specifically about a situation where the prime, determining factor in whether or not to rewrite the software is that the development group has made a mess and is clamoring to rewrite it. In essence, they’re declaring bankruptcy — “we’re in over our heads and need outside assistance to wipe the slate clean so we can have a fresh start.” They’re telling the business and their stakeholders that the only path to joy is letting them start over.
Here are some situations that the article was not meant to address:
- The business decides it wants a rewrite (which makes me skeptical, but I’m not addressing business decisions).
- Piecemeal rewrite, a chunk at a time (because this is, in fact, what I would advocate).
- A rewrite because the original made design assumptions that have become completely obsolete (e.g. designed around disk space being extremely expensive).
- Rewriting the software to significantly expand or alter the offering (e.g. “we need to move from web to mobile devices and offer some new features, so let’s start fresh.”).
A Lesson From Joseph Heller
Joseph Heller is the author of one of my all-time favorite works of fiction, Catch-22. Even if you’ve never read this book, you’re probably familiar with the term from everyday conversation. A Catch-22 is a paradoxical, no-win situation. Consider an example from the book.
John Yossarian, the ‘protagonist,’ is an anti-heroic bombardier in World War II. Among his character foibles is an intense desire not to participate in the war by flying missions. He’d prefer to stay on the ground, where it’s safe. To advance this interest, he attempts to convince an army doctor that he’s insane and thus not able to fly missions. The army doctor responds with the eponymous Catch-22: “anyone who wants to get out of combat duty isn’t really crazy.”
If you take this to its logical conclusion, the only way that Yossarian could be too crazy to fly missions is if he actually wanted to fly missions. And if he wanted to fly them, he wouldn’t be noteworthy and he wouldn’t be trying to get out of flying them in the first place.
I mention this vis-à-vis software rewrites for a simple reason. The only team I would trust with a rewrite is a team that didn’t view rewriting the software as necessary or even preferable.
It’s the people who know how to manufacture small wins and who can inch back incrementally from the brink that I trust to start a codebase clean and keep it clean. People who view a periodic bankruptcy as “just the way it goes” are the people who are going to lead you to bankruptcy.
The Business of Rewrites
But let’s leave the cost-benefit analysis of code bankruptcy aside for a moment. I had a lot of comments wondering whether, and even asserting that, the cost of a rewrite would be less than the cost of refactoring (which is interesting, as it appears to presuppose that “refactor” is a giant, monolithic activity rather than an incremental one). As far as I could tell, this was a simple man-hours calculation of time spent multiplied by salary.
Let’s assume that this is true. Let’s assume that it will take X hours to completely rewrite the code base from scratch and that it will take, say, 1.5X hours to work the existing code into a state where it’s of comparable maintainability to the shiny, new code. Further, let’s give the team the benefit of the doubt and assume that they do truly learn from their first mess and make things better, resulting in hypotheticals where either path leads to code of the same maintainability.
What about the logistics, from a business perspective? The legacy code base is in production and either generating top line revenue or saving bottom line revenue. Do you just put the brakes on that while the rewrite occurs?
Probably not. Instead, the business probably splits the software group, picking a “tiger team” of the thought leaders behind the original mess, and tasking them with a rewrite. Then the B team of maintenance programmers sits in break/fix mode on the existing code base, issuing fixes, patches, and small features while the tiger team cordons itself off and generates V2.0 awesomeness.
So you have one team working on V1, and another team working on V2. To illustrate how I think that would go, I’ll offer a hokey little allegory.
The Awful Chase
Let’s say that we have two friends, Alice and Bob, who want to meet for dinner. Alice is on the far north side of town, and Bob is right in the city center, near the restaurant district. Alice calls Bob, and they agree on a plan. Since Alice is a really fast runner and in really good shape, she’ll jog to Bob’s location, and then they’ll walk the remaining few blocks south together so that Alice can cool down in time for dinner. Obviously, it’s critical that Bob remain right where he is unless there’s an emergency, since Alice is running to catch up with him.
Everything starts out according to plan, but when Alice is about a quarter of the way there, Bob notices an elderly man struggling to carry his groceries to his apartment, a block south. Bob dutifully lends him a hand, but calls Alice to let her know about his new location. She’s a bit annoyed because she has to stop jogging, pull over, and answer the phone, but whatcha gonna do? You kind of have to help the old man.
This happens a few more times, with Bob drifting further south. No single time is a big deal, but Alice starts to get annoyed. This is throwing her plan to meet Bob and then cool down all out of whack. She reluctantly asks Bob to stop helping elderly people, and Bob agrees. Alice starts jogging again, with the matter settled.
About halfway to Bob’s new location, Alice gets another call from Bob. Annoyed, she pulls over and answers and is surprised to hear Bob talking over a noisy crowd. “You won’t believe this. My boss just called and I mentioned that we were going to dinner, and she just happened to be at the restaurant. I kinda had to meet her to say hi. You understand. So, what I’ll do is just wait here for you and have a beer. We’ll just eat when you get here.”
Seriously annoyed, Alice hangs up and starts to run faster because the evening’s plans are starting without her. She gets that Bob has obligations to outside parties, but it’s still frustrating, particularly since now she has to alter her route to head directly to the restaurant. But, she makes good time and is optimistic until the next call from Bob comes in. And Bob sounds a little drunk.
“Hey Alice! My boss finished up and left, but then my neighbor stopped in and ordered us a few rounds. I’m totally sorry about this, but I got a little drunk.”
“Whatever, Bob, I’m almost there. We can order dinner when I arrive.”
“Well, that’s the thing. I got sorta hungry and kinda ate a bunch of nachos, so I’m not really hungry anymore. We’re going to take the last ferry across the river now and go to the riverfront bar, so just meet us there. They have food there, too.”
“Bob! There’s no good place to cross the river here – I’ll have to double back a ways and rethink my whole approach! You can’t –”
“Sorry, Alice, gotta go. I feel bad, but I’ll see you soon.”
Alice, now furious, doubles back to where she can cross the river via a bridge. Instead of being three quarters of the way there, she’s back to being only halfway there again. So, she runs even harder to make up lost time. Problem is, her ankles are starting to hurt, since she’s actually covered a lot more distance than she originally planned to, at a much higher speed. She’s not sure that this is sustainable and she knows she’s going to be hurting in the morning. She grits her teeth and makes progress, eventually arriving at the riverfront bar, exhausted and ready to eat.
Just then, she notices a voicemail. “Alice, it’s Bob. Listen, I was feeling pretty bad about earlier, and my neighbors bailed anyway, so I decided to head back to the original restaurant. See you soon!”
Staring at her phone in disbelief, Alice decides to call a cab and go home.
The Reality of Rewrites
If a proposed rewrite meets the criteria I outlined at the beginning of the post, then you have a team looking to create an identical copy of a production system. They’re doing this because the production system is a mess, and the only way they can see it not being a mess is if they start all over.
So what you wind up with is two teams, where one represents the finish line and the other represents a runner, sprinting as fast as possible, to get to the finish line. Except that, with the rewrite, the finish line moves at the whims of customers, and the frantic sprinting causes the rewrite to start to be as much of a mess as the original system.
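To put rough numbers on that moving finish line, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function name `weeks_to_parity`, the idea of measuring work in abstract "units," and the sample rates are all hypothetical and don't come from any real project.

```python
def weeks_to_parity(initial_gap, rewrite_rate, v1_change_rate, max_weeks=520):
    """Return the week in which the rewrite catches up to V1, or None.

    initial_gap     -- units of behavior V2 must reproduce on day one (hypothetical)
    rewrite_rate    -- units the rewrite team delivers per week (hypothetical)
    v1_change_rate  -- units of fixes/features the maintenance team adds
                       to V1 per week, which V2 must then also reproduce
    max_weeks       -- give up after this many weeks
    """
    gap = initial_gap
    for week in range(1, max_weeks + 1):
        gap += v1_change_rate   # the finish line moves
        gap -= rewrite_rate     # the runner advances
        if gap <= 0:
            return week
    return None  # never caught up within max_weeks


print(weeks_to_parity(100, 5, 0))  # V1 frozen: 20
print(weeks_to_parity(100, 5, 3))  # V1 still changing: 50
print(weeks_to_parity(100, 5, 5))  # V1 changes as fast as V2 is built: None
```

In this toy model, a rewrite that would take 20 weeks against a frozen V1 takes 50 weeks once V1 keeps moving at a bit more than half the rewrite team's pace, and it never finishes if V1 changes as fast as V2 is built. The model is generous to the rewrite, too, since it ignores the extra mess created by all that sprinting.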
You could conceive of exceptions to this, I suppose (and I was never intending to define some kind of immutable law of nature without exceptions). One commenter mentioned a rewrite from Microsoft, and it stands to reason that a company with massive pockets, legions of developers, and 3-year release cycles for boxed software could pull this off. But that doesn’t mean that it would be the common case, or even that it would be a good idea (an approach that fails to doom a “too big to fail” company isn’t saying much).
If you’re going to recover from a messy situation, you’re going to need a team that can improve the situation while continuing to keep the software producing business value. That isn’t accomplished by declaring bankruptcy and opting for a rewrite. That’s accomplished by a lot of discipline, belt-tightening, and dogged, incremental improvement.