How To Make Legacy Code More Testable
Implementing automated testing in legacy code, focusing on refactoring for better testability using principles like SOLID, DRY, and KISS.
Much has already been said about the importance of automated tests: they provide a great safety net when modifying system components, alerting to issues much earlier in the development lifecycle. As a result, bugs are prevented from ever reaching production environments.
When we're working with legacy code that has very low automated test coverage (or no test coverage at all), building automated tests can be difficult and frustrating. The initial effort of setting up automated tests is frequently more than the team can afford at the time, and we end up deferring it indefinitely.
This creates a snowball effect: as complexity grows over time, changes become riskier and more difficult to implement, the gap to be covered by automated tests grows larger, and even though we spend more time maintaining the software, we are less willing to spend time improving the testability of the codebase. In this article, we'll look at some best practices for getting started with automated testing on legacy code.
Part 1: Types of Automated Tests
Before we move on, it’s important to look at the different types of automated tests we can write and understand which ones are better suited to the problem we’re trying to solve.
End-To-End Tests
End-to-end (E2E) tests are automated implementations that simulate the system's final user interacting with it. Typically, they represent a specific flow (or user story), such as "a user can reply to a comment," and would include all the steps required for a real user to do that in the system, such as clicking buttons, typing text, scrolling, and so on, thus encompassing the entire software stack. They are frequently regarded as the most valuable tests to implement because they simulate user interactions and ensure that the entire system functions properly as a whole.
The downside is that E2E tests are typically much slower to run and flakier than other automated tests due to their many dependencies, making them more difficult to maintain.
Because these tests simulate system interaction rather than testing specific components of code, their application on legacy systems is not fundamentally different from that on newer systems. Aside from minor changes, such as establishing UI identifiers, the test tools will typically run independently of the system's internal code. For this reason, we won't be focusing on E2E tests for the rest of this article.
Unit Tests
Unit tests typically focus on the smallest components of software, such as functions and methods. For example, a function that calculates the total cost of items in a shopping cart can be the subject of a unit test that validates several input and output scenarios. Unit testing is the most common type of automated testing and usually the easiest to set up; it is critical for detecting bugs early on and thus makes changes safer.
Unit tests can become a problem if they are overly numerous or overly detailed, resulting in high maintenance costs or long CI waiting times. Over-reliance on unit tests can also lead to the omission of larger system-level issues.
This type of test is frequently difficult to implement in legacy code due to tight coupling and a lack of modularity. For older tech stacks, the available tooling for unit tests may pose an extra challenge when trying to set up new automated tests in an existing codebase.
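To make this concrete, here is a minimal sketch of a unit test for the shopping-cart example mentioned above. The `cartTotal` method and the prices are invented for illustration; in a real project these checks would typically live in a JUnit test class rather than a `main` method.

```java
import java.util.List;

public class CartUtils {
    // The hypothetical method under test: sums the prices of items in a cart.
    public static double cartTotal(List<Double> prices) {
        double total = 0.0;
        for (double p : prices) {
            total += p;
        }
        return total;
    }

    public static void main(String[] args) {
        // Minimal assertions standing in for a proper unit test framework.
        if (cartTotal(List.of(10.0, 20.0, 30.0)) != 60.0) {
            throw new AssertionError("expected 60.0 for three items");
        }
        if (cartTotal(List.of()) != 0.0) {
            throw new AssertionError("expected 0.0 for an empty cart");
        }
        System.out.println("All cart tests passed");
    }
}
```

Note how small and focused the test cases are: each one pins down a single input/output scenario, which is exactly what makes unit tests cheap to run and easy to diagnose when they fail.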
Integration Tests
Integration tests are commonly used in the software development lifecycle to ensure that multiple interconnected components work seamlessly within one system. Integration testing, for example, can be used to test the interaction between a database and a server or for API calls. It is critical for detecting problems with interfaces and the interaction of integrated units.
Integration tests can be more difficult to set up and execute than unit tests, especially when there’s a dependency on external systems. As they have more dependencies to execute, they are typically slower and less reliable than unit tests.
In the context of legacy code, integration tests tend to be particularly hard to implement. Validating the complex interactions between different parts of a system that may not have been well-documented is certainly a challenging (and often not rewarding) task. It’s not rare to see integration tests that mock so many dependencies that they end up being too distant from the actual production behavior.
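As a hedged illustration of that trade-off, the sketch below wires a hypothetical service to an in-memory stand-in for its repository instead of mocking every call, which keeps the test closer to production behavior. All class names here are invented for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical port that the real system would back with a database.
interface UserRepository {
    String findNameById(int id);
    void save(int id, String name);
}

// In-memory implementation used only by the integration-style test.
class InMemoryUserRepository implements UserRepository {
    private final Map<Integer, String> store = new HashMap<>();
    public String findNameById(int id) { return store.get(id); }
    public void save(int id, String name) { store.put(id, name); }
}

// The component under test: it depends on the port, not on a concrete database.
class GreetingService {
    private final UserRepository repository;
    GreetingService(UserRepository repository) { this.repository = repository; }
    String greet(int id) {
        String name = repository.findNameById(id);
        return name == null ? "Hello, stranger" : "Hello, " + name;
    }
}

public class GreetingServiceIT {
    public static void main(String[] args) {
        UserRepository repository = new InMemoryUserRepository();
        repository.save(1, "Ada");
        GreetingService service = new GreetingService(repository);
        // The test exercises service and repository together, not in isolation.
        if (!service.greet(1).equals("Hello, Ada")) throw new AssertionError();
        if (!service.greet(2).equals("Hello, stranger")) throw new AssertionError();
        System.out.println("Integration-style test passed");
    }
}
```

Swapping the in-memory repository for a real database connection would turn this into a full integration test; the point is that the service code itself does not change either way.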
Performance Tests
Performance tests measure a system's speed, responsiveness, and stability under varying loads. They are critical in ensuring that the software meets performance requirements and can handle expected traffic. In practice, they take the form of load testing, stress testing, and spike testing.
The primary drawback of performance tests is that they can consume a significant amount of resources and time. They can also be difficult to replicate precisely, resulting in inconsistent results.
Performance testing is critical for legacy systems in scenarios like when they are being integrated with newer technologies or when the traffic volume is expected to increase significantly. However, we won’t cover performance tests in this article because the strategies tend to be much more specific to each system and architecture, and this would require a dedicated article.
Recap
These are just a few of the types of automated tests used in the industry; we will concentrate on the most common ones: unit and integration tests.
Part 2: What We Should Test
Now that we've decided on a strategy for unit or integration tests, the next step is to figure out what kind of code we're dealing with and what testing strategies should be used for each of them. Of course, there are numerous ways to categorize types of code, but consider the following:
- Trivial code: This is code like getters/setters and other similar low-complexity boilerplate code that has little or no impact on the application's functionality or performance. They usually exist only because a specific language or framework requires it, and testing them is pointless except in very specific scenarios.
- Core code: This is the most important part of the codebase because it contains business logic, data transformation, application rules, and all application decision-making. Core code is typically found in a system's backend, but this is not always the case. Other layers, such as UI components, may also represent a significant portion of the logic and be considered system core code. In most cases, we should concentrate our efforts on writing automated tests, particularly unit and integration tests, on core code.
- Orchestrators: These are the components that connect to other components or act as bridges between them. In web development, for example, it is common to have a controller that accepts user interaction (input) and invokes the appropriate core logic to process the data and return the response. As long as these components are merely orchestrators, there is little point in directly testing them with unit tests, for example, but they will almost certainly be tested as part of some integration tests.
- Spaghetti code: This is the code that combines two or three types of code described above in one place. It's not uncommon for it to grow into massive classes with thousands of lines of code. Even when they're not that bad, it's difficult to create tests because the smallest components lack well-defined responsibilities. The issue here is that this code almost certainly contains core code that we want to test, but getting started is difficult. What is the solution? Refactoring! But that's a topic for the next installment.
Part 3: Refactoring
Code refactoring is the process of restructuring existing code without changing its external behavior. Its primary goal is to clean up and simplify code design to improve readability and maintainability. As we're aiming for testability, the best place to start is by breaking large functions down into smaller, more focused functions, making the code more modular, easier to understand, and easier to maintain.
Looking back at the types of code we just saw, our main goal in legacy codebases is to break spaghetti code (the fourth type) into smaller, more manageable pieces that look more like the first three types, and then start writing tests for the pieces that are most meaningful for what we need.
Code Example
To get more practical, let's consider the code below:
public class UserDataProcessor {
    public void processUserData(String name, int age, String address) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be empty");
        }
        if (age < 0 || age > 120) {
            throw new IllegalArgumentException("Invalid age");
        }
        if (address == null || address.isEmpty()) {
            throw new IllegalArgumentException("Address cannot be empty");
        }
        // Perform the main processing logic
        // ...
        Logger.log("Processed data - Name: " + name + ", Age: " + age + ", Address: " + address);
    }
}
In this example, the processUserData method performs validation, processing, and logging. We can improve readability and maintainability by breaking it down into smaller functions:
public class UserDataProcessor {
    public void processUserData(String name, int age, String address) {
        validateUserData(name, age, address);
        processMainLogic(name, age, address);
        logProcessedData(name, age, address);
    }

    private void validateUserData(String name, int age, String address) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be empty");
        }
        if (age < 0 || age > 120) {
            throw new IllegalArgumentException("Invalid age");
        }
        if (address == null || address.isEmpty()) {
            throw new IllegalArgumentException("Address cannot be empty");
        }
    }

    private void processMainLogic(String name, int age, String address) {
        // Perform the main processing logic
        // ...
    }

    private void logProcessedData(String name, int age, String address) {
        Logger.log("Processed data - Name: " + name + ", Age: " + age + ", Address: " + address);
    }
}
And here's the best part: we don't have to complete everything all at once. We can always start by extracting into functions only the parts that will be changed or are more critical. In the preceding example, we could have only broken down one of the tasks and left the others in the original function until the next opportunity. The source code lives as long as the software, and we can improve it iteratively. Even in the preceding example, we could go a step further and divide the validateUserData function into three smaller functions:
public class UserDataProcessor {
    public void processUserData(String name, int age, String address) {
        validateUserData(name, age, address);
        processMainLogic(name, age, address);
        logProcessedData(name, age, address);
    }

    private void validateUserData(String name, int age, String address) {
        validateUserName(name);
        validateUserAge(age);
        validateUserAddress(address);
    }

    private void validateUserName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be empty");
        }
    }

    private void validateUserAge(int age) {
        if (age < 0 || age > 120) {
            throw new IllegalArgumentException("Invalid age");
        }
    }

    private void validateUserAddress(String address) {
        if (address == null || address.isEmpty()) {
            throw new IllegalArgumentException("Address cannot be empty");
        }
    }

    private void processMainLogic(String name, int age, String address) {
        // Perform the main processing logic
        // ...
    }

    private void logProcessedData(String name, int age, String address) {
        Logger.log("Processed data - Name: " + name + ", Age: " + age + ", Address: " + address);
    }
}
We could think of many other ways to make it more modular. The validator functions could, for example, be part of a UserDataValidator class, and so on. This may appear to be overkill for this example (and we'll stop there), but the important takeaway is that refactoring legacy code can be viewed as a continuous improvement process.
To return to testing, any of these small pieces is now much easier to test than the entire initial function. This function was simple enough that we could break it down fully in one or two iterations. In real-world scenarios, we may need to start by extracting one small part of a larger function and writing tests only for that part. As we progress through this process, testing each component becomes easier than it was before the refactoring.
Side Note: Testing Private Functions
As we broke down the original public function, new, smaller functions were created. However, these are now private functions, which may raise an alert: testing private functions is often regarded as bad practice. We can agree with that principle in general and still opt to test these smaller functions, knowing that we'll be in a better state (with smaller, tested chunks of code) than before (with one large, untested chunk of code).
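As a sketch of what this unlocks, the extracted validators can now be exercised directly. To keep the example self-contained and runnable, the methods below are restructured as static members of a standalone class; this restructuring is an illustration, not the article's code verbatim.

```java
public class UserDataValidator {
    // Same logic as the extracted validators, made static for this sketch.
    static void validateUserName(String name) {
        if (name == null || name.isEmpty()) {
            throw new IllegalArgumentException("Name cannot be empty");
        }
    }

    static void validateUserAge(int age) {
        if (age < 0 || age > 120) {
            throw new IllegalArgumentException("Invalid age");
        }
    }

    public static void main(String[] args) {
        // Happy path: valid inputs should not throw.
        validateUserName("Alice");
        validateUserAge(30);

        // Invalid input should throw IllegalArgumentException.
        boolean threw = false;
        try {
            validateUserAge(200);
        } catch (IllegalArgumentException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected exception for age 200");
        System.out.println("Validator tests passed");
    }
}
```

Each validator can now be covered by a handful of tiny, fast tests, which was impractical when validation, processing, and logging all lived in one method.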
Principles of Refactoring
There is one more important aspect of refactoring to discuss: What makes a good refactoring? In software engineering, we frequently implement different solutions to the same problem, and the same is true for refactoring. There is no single way to refactor code, and decisions are influenced by both personal preferences and system constraints. However, some coding principles can help us decide what and how to refactor. We won't go into detail about clean code and coding principles here, but let's take a quick look at three of the most well-known coding principles: SOLID, DRY, and KISS.
SOLID
Probably the most famous one in this list, the name SOLID is an acronym for these five Object-Oriented programming principles:
- Single responsibility principle: Each class or module should have one and only one responsibility and therefore do only one thing.
- Open-closed principle: Software components should be open for extension but closed for modification.
- Liskov substitution principle: Objects of a superclass should be seamlessly replaceable with any object of its subclasses.
- Interface segregation principle: Clients should not be forced to depend upon interfaces that they do not use.
- Dependency inversion principle: Dependency should be on abstractions (interfaces, abstract classes, etc) rather than concretes (specific subclass implementation).
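To ground the last principle, here is a minimal dependency-inversion sketch: the high-level `OrderService` depends on a `Notifier` abstraction, so tests can substitute a recording fake without touching any real messaging infrastructure. All names here are invented for illustration.

```java
// Abstraction the high-level code depends on (dependency inversion).
interface Notifier {
    void notify(String message);
}

// A fake used in tests; production could plug in an email or SMS notifier.
class RecordingNotifier implements Notifier {
    String lastMessage;
    public void notify(String message) { lastMessage = message; }
}

class OrderService {
    private final Notifier notifier;
    OrderService(Notifier notifier) { this.notifier = notifier; }
    void placeOrder(String item) {
        // ... business logic would go here ...
        notifier.notify("Order placed: " + item);
    }
}

public class DipExample {
    public static void main(String[] args) {
        RecordingNotifier notifier = new RecordingNotifier();
        new OrderService(notifier).placeOrder("book");
        // The fake lets us assert on the interaction without real side effects.
        if (!"Order placed: book".equals(notifier.lastMessage)) {
            throw new AssertionError();
        }
        System.out.println("Dependency inversion sketch passed");
    }
}
```

This is precisely the shape that makes legacy code hard to test when it is absent: a service that constructs its own concrete notifier internally cannot be exercised without triggering the real side effect.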
DRY
This principle, an acronym for "Don't Repeat Yourself," is all about reducing code repetition. Instead of repeating the same code in multiple places, we should isolate the common logic and reference it from everywhere else. A common rule of thumb here is the "rule of three": the first time, we write the code; the second time, we duplicate it; the third time, we refactor it to make it reusable.
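A tiny illustration of the idea, with all names invented for the example: the price-formatting logic is extracted once rather than repeated at every call site, so a change to the format happens in exactly one place.

```java
import java.util.Locale;

public class PriceFormatting {
    // The shared logic, extracted so every caller references one definition.
    // Locale.ROOT keeps the decimal separator stable across environments.
    static String formatPrice(double amount) {
        return String.format(Locale.ROOT, "$%.2f", amount);
    }

    static String receiptLine(String item, double amount) {
        return item + ": " + formatPrice(amount);
    }

    static String invoiceTotal(double amount) {
        return "Total due: " + formatPrice(amount);
    }

    public static void main(String[] args) {
        if (!receiptLine("Book", 12.5).equals("Book: $12.50")) throw new AssertionError();
        if (!invoiceTotal(99.0).equals("Total due: $99.00")) throw new AssertionError();
        System.out.println("DRY sketch passed");
    }
}
```

Beyond avoiding drift between copies, the extracted `formatPrice` is also a natural unit-test target, which ties DRY back to testability.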
KISS
The goal of this last one is to keep our code as simple as possible, and the acronym stands for "Keep It Simple, Stupid." This favors avoiding unnecessary complexity, particularly over-engineering, which is common in some projects.
Conclusion
In this article, we looked at how to make legacy systems easier to test, thereby improving their maintainability and reliability. We investigated the various types of automated tests and determined which ones are best suited for different code types. Refactoring legacy code is a key practice to gradually improve testability on legacy systems and thus gradually improve their test coverage.