Recently, I was catching up with a former colleague. He mentioned a service that I wrote years ago, and how it has since become known as the Career Killer. Basically everyone who touched the Career Killer ended up leaving the company. If the company wanted to have > 0 developers, the only solution at this point was to take a few months and refactor this service completely.
I have two things to say about this. First, that code was at 85% unit test coverage when I left so don't go blaming me. Second, this huge refactoring? It's not going to work.
Every codebase has at least one component that is widely hated and feared. It does too much, it has too many states, too many other entities call it. When it comes time to pay down technical debt, you should definitely focus on this component. However, if you have an incomplete understanding of this component and you stop everything to completely rewrite it, your odds of success are low. That component, as scary and complex as it appears, is actually way more scary and complex than you think.
How do you think that component got into this unfortunate shape? Is it because the company hired a nincompoop and let him run wild in the codebase for years? Or is it because the component was originally a sound abstraction, but its scope of responsibilities grew over the years due to changing requirements? (For the sake of my ego, I'm hoping the Career Killer is the latter.) In all likelihood, this component arrived at its current, scary state via smart people with good intentions. You know what you are right now? Smart people with good intentions. If you proceed with a big refactor, you'll trade one form of technical debt for another.
To truly pay this debt down, you need to untangle the complexity around the problem. You need to spend time looking at all the clients calling this component. You need to spend time talking with your colleagues, learning more about the component's history and how it's used. You need to make a few simplifying changes around the periphery of the component and see what works. Each week, you spend a little more time and untangle the problem just a little bit more. Given a long enough timeframe, you'll eventually untangle all of the complexity and bring a teeny bit of order to the universe.
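If you want a cheap starting point for "looking at all the clients," here is a rough sketch of a caller inventory. Everything in it is an assumption for illustration: the function name `count_callers`, the `src` directory, the `CareerKiller` symbol, and the choice to scan `.py` files. It only counts whole-word text mentions, so it will also pick up comments and the definition itself; treat the output as a list of files to go read, not as a real call graph.

```python
import re
from collections import Counter
from pathlib import Path


def count_callers(root, symbol, suffixes=(".py",)):
    """Count whole-word mentions of `symbol` in each source file under `root`.

    Hypothetical helper: a plain text search, not a real call-graph analysis.
    """
    pattern = re.compile(r"\b" + re.escape(symbol) + r"\b")
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            text = path.read_text(encoding="utf-8", errors="ignore")
            hits = len(pattern.findall(text))
            if hits:
                # Key by path relative to the root so the report stays readable.
                counts[str(path.relative_to(root))] = hits
    return counts


if __name__ == "__main__":
    # "src" and "CareerKiller" are placeholders; substitute your own.
    for filename, hits in count_callers("src", "CareerKiller").most_common():
        print(f"{hits:4d}  {filename}")
```

Run it once a week and watch whether the list gets shorter. The files with the most mentions are probably the clients worth talking to your colleagues about first.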
Practically speaking, what do you do here? Rather than spending 3 full months on a complete refactor, spend 25% of your time on it over the next year. It's the same time commitment either way (a quarter of twelve months is three months), but with the 25% plan, you get time to analyze and plan. You get time to untangle.