A Google search for “define: legacy” gives:
Noun: An amount of money or property left to someone in a will.
Adjective: Denoting software or hardware that has been superseded but is difficult to replace because of its wide use.
Most of us have worked on legacy code, are working on it now, or are even creating it (!!). Just the thought of working on legacy code gives developers sleepless nights (yeah, a slight exaggeration, but required to highlight the pain). We have yelled and cursed while working on legacy code. Let's step back, reflect a bit on legacy code, and see how it can be tackled (if at all!).
What is legacy code?
At the start of the post I tried to get a definition for the term “legacy”. (We wish the noun definition applied to us, though indirectly legacy code is one of the reasons we get paid.) The adjective definition is somewhat closer to our “legacy code”.
Legacy code is something:
- that was written previously (maybe years ago, maybe only months ago, by another developer or by the same one) and continues to work well enough to satisfy the customer's needs.
- that is being written as I write this post, does just enough to implement the requirements, and will acquire legacy code stature in no time.
Legacy code which was written previously
Quite often you wish you could develop a system from scratch, but you end up working on code that was written 10 years ago or even earlier. We yell and curse at whoever wrote that code, and we spend more time understanding it than adding something new or fixing a bug. Any attempt to change the code puts us in danger of breaking some other functionality. Sometimes the code is written in a language we are not familiar with (there can't be anything worse than reading code that is difficult to understand and in an unfamiliar language), and at other times just getting the code to build and run is a huge task.
Why is it so?
- legacy code is devoid of a good number of tests, so it becomes difficult to check whether a change breaks something you are not aware of. And if you are not sure of your changes, you try to keep them as minimal as possible: just enough to fix the issue, not to improve the quality.
- when the code was written there were most likely no peer code reviews. Code reviews are often a good way to find places where the code can be refactored and improved. It's always an advantage to have another pair of eyes read the code, and it helps improve the quality of the code to some extent.
- legacy code was often written under cramped deadlines, where the priority was just to get the software out of the door and into the market as early as possible. Under such circumstances it's the quality of the code that takes a hit.
- the platform/language used for the code might not produce code that's very readable or manageable.
- legacy code lacks the required documentation/specification and is not readable enough for its logic to be understood.
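To make the first point concrete, one way to get some safety before changing untested code is to write tests that simply record what the code does today (sometimes called characterization tests). Here is a minimal sketch in Python. The function legacy_shipping_cost is a made-up stand-in for an untested legacy routine, and its numbers and the test names are purely hypothetical:

```python
import unittest


def legacy_shipping_cost(weight_kg, express):
    """Hypothetical stand-in for an old routine nobody dares to touch."""
    cost = 5.0
    if weight_kg > 10:
        cost += (weight_kg - 10) * 0.5
    if express:
        cost = cost * 2 + 1
    return cost


class ShippingCostCharacterization(unittest.TestCase):
    """Records what the code does today, not what it ought to do."""

    def test_light_parcel(self):
        self.assertEqual(legacy_shipping_cost(2, False), 5.0)

    def test_heavy_parcel(self):
        self.assertEqual(legacy_shipping_cost(20, False), 10.0)

    def test_heavy_express_parcel(self):
        self.assertEqual(legacy_shipping_cost(20, True), 21.0)
```

Saved in a file, this can be run with python -m unittest. The expected values are whatever the existing code returns right now, so a later change that alters the behaviour by accident makes a test fail instead of going unnoticed.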
Legacy code which is being written
Yes, we all write legacy code. We can't call it that right away, but 6 months or a year down the line the code causes the same pain as legacy code. In a few places all the reasons stated above still hold, but considering the current advancements in languages and platforms (the advent of object orientation and other paradigms, TDD, agile practices) there should be fewer concerns about the testability, manageability and readability of the code. The possible reasons why we still write legacy code are:
- code that lacks tests. Yes! You can find a lot of code that goes in without tests. Requiring that code meet some percentage of code coverage can help to a great extent. The wrong notion that “tests have to be written by QA or a test engineer” is often also a reason for code without tests.
- code which doesn't follow good design principles. Object orientation helps in writing code which is readable, manageable and extensible, but it can just as well be used to write code which is a nightmare for developers. (I haven't worked with other programming paradigms, so I may be a bit biased towards object-oriented programming. Please feel free to add points for other paradigms.)
- lack of good code reviews. A lot of organisations might feel that code reviews are a waste of developer time, some developers feel that code reviews don't count towards their performance, and some do a bad job of them, reviewing just for the heck of it.
I have listed just a few possible reasons why legacy code is the way it is. There may be more that I have missed; feel free to chime in and I would be more than happy to add them to the post.
There are a lot of other aspects to legacy code, such as how to work with it. I would like to cover working with legacy code sometime in the future. There's also a book, “Working Effectively with Legacy Code” by Michael Feathers. I haven't read it yet, but it might be a useful read.