This article is part of a series about Java Forum Nord 2015, a conference that took place in Hannover, Germany. Links to articles about other talks I visited there can be found below.
Falk Sippach talked about one of the hot topics of the conference: dealing with legacy code. This seems to be a growing concern in software craftsmanship. In his talk, Falk addressed ways of improving the quality of a given codebase. Two of his sources of knowledge are the famous books "Refactoring" by Martin Fowler and "Working Effectively with Legacy Code" by Michael Feathers.
An important thing to note is that "Refactoring" assumes that the code is testable and that tests already exist to catch errors introduced by refactoring. This is often not the case in real-world legacy applications. Falk talked about the need to refactor when noticing code smells such as long methods and incorrect code comments (and also mentioned the old religious war over whether code comments are per se a bad thing). "Working Effectively with Legacy Code" makes the opposite assumption: that there are no tests.
Following different lines of thought, Falk concluded that automated tests are one of the most important things to have when building a good codebase. And this is where the problems start: To understand legacy code, refactoring is often necessary. To refactor safely, tests are necessary. To write tests, one has to understand the code. A perfect cyclic dependency! Falk proposed the following steps to resolve this dependency:
- Identify what to change.
- Identify what to test.
- Break dependencies.
- Write the tests.
- Modify and refactor.
Falk talked about methods that assist a developer in refactoring after these steps:
The "Golden Master" method ensures on a very abstract level that a codebase doesn't change its behavior while being changed. Behavior is made visible and readable by adding log statements. They don't change the code much and don't break anything. After writing these statements, the complete codebase is copied to a safe place and becomes the "Golden Master". While the original codebase is being modified, its log output can be compared to the output of the Golden Master. If both logs are equal, it's likely that the application works as before. If the output differs, the refactoring changed the behavior of the application. Falk also mentioned ways to deal with generated data in these tests, which are also called "characterization tests".
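To make the comparison concrete, here is a minimal sketch of the idea, not the code Falk showed. All class names and the pricing logic are hypothetical, and a simple list of strings stands in for real log files: `GoldenMasterPricing` plays the frozen copy, `RefactoredPricing` the code being changed.

```java
import java.util.ArrayList;
import java.util.List;

// Frozen copy of the original code: the "Golden Master" (hypothetical example logic).
class GoldenMasterPricing {
    static int total(int quantity, int unitPriceCents, List<String> log) {
        int total = quantity * unitPriceCents;
        if (quantity >= 10) {
            total = total - total / 10; // 10 % bulk discount
            log.add("discount applied");
        }
        log.add("total=" + total);
        return total;
    }
}

// Working copy that is being refactored; it must produce the same log lines.
class RefactoredPricing {
    static int total(int quantity, int unitPriceCents, List<String> log) {
        int total = applyBulkDiscount(quantity, quantity * unitPriceCents, log);
        log.add("total=" + total);
        return total;
    }

    private static int applyBulkDiscount(int quantity, int total, List<String> log) {
        if (quantity < 10) {
            return total;
        }
        log.add("discount applied");
        return total - total / 10;
    }
}

class GoldenMasterComparison {
    // Runs both versions with the same inputs and compares the recorded log output.
    static boolean logsMatch(int[][] inputs) {
        List<String> expected = new ArrayList<>();
        List<String> actual = new ArrayList<>();
        for (int[] input : inputs) {
            GoldenMasterPricing.total(input[0], input[1], expected);
            RefactoredPricing.total(input[0], input[1], actual);
        }
        return expected.equals(actual);
    }

    public static void main(String[] args) {
        int[][] inputs = { {1, 500}, {10, 500}, {25, 199} };
        System.out.println(logsMatch(inputs) ? "logs are equal" : "behavior changed");
    }
}
```

In a real project the Golden Master log would more likely be written to a file once and then diffed against the output of each later run; keeping both versions in one program just keeps the sketch self-contained.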
Subclass to test
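This method makes it possible to test methods that are not public, such as protected or package-private ones. A new class is created that extends the class under test and exposes these methods to the test. That way, they become accessible and testable while the class under test doesn't change. Note that in Java a subclass cannot call private methods of its parent, so private methods would first need their visibility relaxed.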
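A minimal sketch of such a test subclass could look like this. `ReportGenerator` and its method are hypothetical stand-ins for a legacy class; the example uses a protected method, because Java lets a subclass reach protected methods (and package-private ones within the same package) but not private ones.

```java
// Hypothetical legacy class; formatLine is not part of its public API.
class ReportGenerator {
    protected String formatLine(String name, int amount) {
        return name + ": " + amount;
    }
}

// Test-only subclass that widens the visibility of the method under test.
class TestableReportGenerator extends ReportGenerator {
    @Override
    public String formatLine(String name, int amount) {
        return super.formatLine(name, amount);
    }
}

class SubclassToTestDemo {
    public static void main(String[] args) {
        TestableReportGenerator generator = new TestableReportGenerator();
        System.out.println(generator.formatLine("coffee", 3)); // prints "coffee: 3"
    }
}
```

The subclass lives in the test sources only, so the production class stays untouched.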
Extract Pure Functions
Pure functions and functional programming in general were a big topic at the conference. One reason for this is the introduction of lambda expressions in Java 8. A pure function has no side effects and ensures "referential transparency". This means that its result depends only on its input parameters: the function gives the same result whenever it is called with the same parameters. Pure functions are much easier to understand than methods with side effects and are also easy to write tests for. Hence, extracting this type of method from legacy code is a good deed.
The DRY principle should be familiar to every developer these days: "Don't repeat yourself". This means that a thought or concept should only be written once and used where needed, instead of having copies of this code all over the codebase. However, Falk proposed removing duplication only when there are at least three copies of one concept. This is called the rule of three.
Very often, legacy code has "god classes" that span thousands of lines of code and seem to do a million different things. These classes grow even larger over time because the basic intent is not visible anymore and it's easy to "just add a few more lines". These classes don't respect the single responsibility principle, which supports testability and readability of the code. Hence, large classes have to be refactored into smaller classes.
Dependency Inversion, one of the SOLID principles, states that higher-level modules should not depend on lower-level modules; both should depend on abstractions, and abstractions should not depend on details. This sounds confusing. I rephrased it for myself as "have clear and nice dependencies between your modules, packages and classes". A lot of tools out there support dependency management.
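As an illustration, here is a small sketch of extracting a pure function from a method that works on mutable state. The `ShoppingCart` class and its discount rule are made up for this example.

```java
// Hypothetical legacy class that stores its result in a mutable field.
class ShoppingCart {
    private final int quantity;
    private final int unitPriceCents;
    private int totalCents; // updated as a side effect of updateTotal()

    ShoppingCart(int quantity, int unitPriceCents) {
        this.quantity = quantity;
        this.unitPriceCents = unitPriceCents;
    }

    // Method with a side effect: reads fields, writes a field, returns nothing.
    void updateTotal() {
        totalCents = totalWithDiscount(quantity, unitPriceCents);
    }

    int getTotalCents() {
        return totalCents;
    }

    // Extracted pure function: the result depends only on the parameters.
    static int totalWithDiscount(int quantity, int unitPriceCents) {
        int total = quantity * unitPriceCents;
        return quantity >= 10 ? total - total / 10 : total;
    }
}
```

The calculation can now be tested with a single call such as `ShoppingCart.totalWithDiscount(10, 500)`, without constructing a cart or inspecting its state afterwards.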
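A very reduced sketch of what the result of such a split might look like is shown below. The class names and responsibilities are invented for illustration; a real god class would of course be far larger. Validation and formatting have been moved into their own classes, and the former god class only coordinates them.

```java
// Extracted responsibility: knows only how to check e-mail addresses (deliberately simplistic).
class EmailValidator {
    boolean isValid(String email) {
        if (email == null) {
            return false;
        }
        int at = email.indexOf('@');
        return at > 0 && at < email.length() - 1;
    }
}

// Extracted responsibility: knows only how to build the greeting text.
class GreetingFormatter {
    String greet(String name) {
        return "Hello, " + name + "!";
    }
}

// What remains of the hypothetical god class: it delegates to the small classes.
class CustomerService {
    private final EmailValidator validator = new EmailValidator();
    private final GreetingFormatter formatter = new GreetingFormatter();

    String register(String name, String email) {
        if (!validator.isValid(email)) {
            throw new IllegalArgumentException("invalid e-mail: " + email);
        }
        return formatter.greet(name);
    }
}
```

Each of the small classes can now be tested on its own, which is much harder while the logic is buried inside one large class.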
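A small sketch may make the principle less abstract. All names here are hypothetical: the high-level `ReminderService` depends only on the `MessageSender` interface, and the concrete senders are details that implement it.

```java
import java.util.ArrayList;
import java.util.List;

// Abstraction that both the high-level and the low-level code depend on.
interface MessageSender {
    void send(String recipient, String text);
}

// High-level module: knows the abstraction, not the concrete way of sending.
class ReminderService {
    private final MessageSender sender;

    ReminderService(MessageSender sender) {
        this.sender = sender;
    }

    void remind(String recipient) {
        sender.send(recipient, "Your invoice is due.");
    }
}

// Low-level detail: one possible implementation.
class ConsoleMessageSender implements MessageSender {
    @Override
    public void send(String recipient, String text) {
        System.out.println(recipient + ": " + text);
    }
}

// Another implementation that only records messages, which is handy in tests.
class RecordingMessageSender implements MessageSender {
    final List<String> sent = new ArrayList<>();

    @Override
    public void send(String recipient, String text) {
        sent.add(recipient + ": " + text);
    }
}
```

Because `ReminderService` receives its sender through the constructor, a test can pass in a `RecordingMessageSender` and check `sent` afterwards, while production code passes in a real implementation.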
An interesting thought Falk mentioned is "Refactoring is exhausting". I can only agree. When refactoring, I have to pause more often than in "normal" programming / implementation mode. It's good to see that other developers have the same issue.
Here are Falk's slides.
Other Content of Java Forum Nord 2015
These are the talks I visited:
- Keynote by Adam Bien
- "Reflecting Software Architectures" by Stefan Zörner
- "Yes we scan - Software analysis with JQAssistant" by Dirk Mahler (blog post by Dirk Mahler himself)
- "Mastering legacy code in x simple steps" by Falk Sippach
- "Swimming upstream in the container revolution: Containerless Continuous Delivery" by Stéphane Nicoll
- "Functional Programming in Java 8" by Nicole Rauch
- "where has my software architecture gone?" by Oliver Gierke
At Java Forum Nord 2015, Falk Sippach presented a number of methods for changing legacy code without changing its behavior.