Do You Trust Your Unit Tests?

Mutation testing is most practical for code that changes daily and is business critical.

I have been thinking about the value of unit tests. Sometimes I discover tests with no asserts, or with asserts that are wrong for the context. Sometimes I am tempted to remove tests that no longer seem useful to me, but I have no one to discuss the removal with, so I am never sure. And it always raises the question of how many of our unit tests are actually valuable.

Of course, when using TDD, we all start with unit tests, which shield us from mistakes and cover almost 100% of the production code. But what about later, when the implementation has existed for some time? What happens when production code changes and someone is not diligent enough to cover all the modified code? Of course, there should be a code review process and a code coverage threshold, but in my eyes that is somehow not enough. It is perhaps more of a philosophical problem: how can I trust my production code if I do not trust my test suite? I am not surprised that there is a saying in philosophy: “Who watches the watchmen?”

I ended up with this conclusion. I believe that the code review process cannot discover all the issues in newly written code. More importantly, I do not trust code coverage. Or, to put it better, I believe that measuring code coverage has a significant positive impact on code quality, but it does not ensure that all the code does what it should. Coverage only says which code was called during the unit tests and how many times. It says nothing about whether the results match the expected behavior, and nothing about how many different paths or error scenarios the tests exercised. With an excellent code review process we can catch most of these situations, but not all of them. Even when several people work on the same thing, unit tests with close to 100% code coverage are no guarantee that the code is bug-free.
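To make the coverage point concrete, here is a minimal sketch in plain Java with no test framework. The class, the `discountPercent` method, and the discount rule are all hypothetical and invented for illustration. The first "test" executes every line and both branches, so a coverage tool would report the method as fully covered, yet it cannot fail. Only the second one checks results.

```java
// Hypothetical example: all names and the discount rule are assumptions.
final class CoverageDemo {

    // Production code: orders of 100 or more get a 10% discount.
    static int discountPercent(int orderTotal) {
        if (orderTotal >= 100) {
            return 10;
        }
        return 0;
    }

    // Executes both branches but asserts nothing, so it passes
    // no matter what discountPercent returns.
    static void coverageOnlyTest() {
        discountPercent(150);
        discountPercent(50);
    }

    // Executes the same branches and also checks the returned values.
    static void assertingTest() {
        check(discountPercent(150) == 10, "150 should get a 10% discount");
        check(discountPercent(50) == 0, "50 should get no discount");
    }

    // Tiny stand-in for a test framework assertion.
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        coverageOnlyTest();
        assertingTest();
        System.out.println("both tests passed");
    }
}
```

Both tests produce the same coverage figure, which is why the figure alone says little about how much the suite can be trusted.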

What can we do about that? Is it even possible to fix? I would say there is one more step we can incorporate into our software quality process.

Welcome mutation testing!

Mutation Testing

The idea behind mutation testing (MT) is quite simple, and it is surprising that I stumbled on it only recently. The idea is to modify existing production code slightly (producing so-called "mutants"). For example, we could change the meaning of an "if" or "while" statement and run the test suite against the mutated code. If the test suite fails against this slight modification (the mutant is killed), that is good, because our unit test safety net caught the unexpected behavior (a semantic change). If the test suite does not fail (the mutant survived), that is bad, because our tests do not cover the full complexity of the production code: someone could make the same semantic change without writing a test, and no test would fail.
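The following sketch shows a killed and a surviving mutant side by side. It is only an illustration: a mutation testing tool would generate the mutant automatically, whereas here it is written by hand as a second method, and the `isAdult` rule and both "suites" are hypothetical. Each suite is modeled as a method that returns true when all of its checks pass against a given implementation.

```java
import java.util.function.IntPredicate;

// Hypothetical example: the mutant is hand-written for illustration only.
final class MutantDemo {

    // Original production code: adults are 18 or older.
    static boolean isAdult(int age) {
        return age >= 18;
    }

    // Mutant: the conditional boundary ">=" was changed to ">".
    static boolean isAdultMutant(int age) {
        return age > 18;
    }

    // Weak suite: checks values far from the boundary only.
    static boolean weakSuitePasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5);
    }

    // Stronger suite: also checks the values on both sides of the boundary.
    static boolean strongSuitePasses(IntPredicate impl) {
        return impl.test(30) && !impl.test(5)
                && impl.test(18) && !impl.test(17);
    }

    public static void main(String[] args) {
        // The weak suite passes against the mutant: the mutant survived.
        System.out.println("weak suite vs mutant passes: "
                + weakSuitePasses(MutantDemo::isAdultMutant));
        // The strong suite fails against the mutant: the mutant is killed.
        System.out.println("strong suite vs mutant passes: "
                + strongSuitePasses(MutantDemo::isAdultMutant));
    }
}
```

Both suites pass against the original `isAdult`, and both give the same line coverage. Only the second one notices that the boundary moved.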

Uncle Bob has written a great article about MT, where he discusses the philosophy of MT and some of its actual benefits. The diligent reader can find the theoretical background there or in other resources on the topic. Instead, I’ll try to discuss practical usage and the areas where the benefits of MT are greatest.

Cost of Addressing All Mutants

There is a high probability that some mutants will survive your test suite. Even for the simplest code (as in my minimalistic example), the effort to kill all the mutants seems exhausting. When I first addressed the surviving mutants, it seemed better to modify the production code slightly than to respond to every mutant with more tests. However, even after several attempts at modifying the production code, I was not able to make the test code more cohesive. So my second attempt was to find out what each surviving mutant meant and how to respond to it properly. Well, it was tough. Fixing one method with one simple if-else statement took me over an hour before all the mutants were killed.

I believe that even with growing practical knowledge of MT, the cost of addressing all the surviving mutants remains very significant.

So When to Use Mutation Testing?

Short answer: in the most business-critical part of the application.

Long answer: it depends. I agree with Uncle Bob, who stated that it is irrelevant for code with anything other than 100% code coverage. However, with today's technologies it is very impractical to achieve that goal, and the same seems true of MT. Applying MT to the whole codebase appears very impractical because the cost of killing all surviving mutants is so high. It may not even be necessary, as codebases often have places that are not 100% covered by unit tests anyway. These places are mostly not business critical and don’t change regularly.

MT is most practical for code that changes daily and that contains critical business functionality or has a high probability of mistakes. There should be a prioritization process to identify the places where the return on investment is best.

I would also say that any code designated as a library should be covered by MT. A library is likely to be used in many different situations, paths, and circumstances, so it seems very practical to have library code challenged by MT.

I would also use MT in a project that was not written in TDD style, where you do not trust the codebase enough. Using MT and making the appropriate changes in the test suite can help you increase the trustworthiness of practically every codebase.

Who Should Be Handling MT — Programmers Or Testers?

Who should handle the MT output? Testers verify the programmer’s work by checking the results. So, in fact, they play the role of guarding the programmer’s guards (the unit tests). On the other hand, it seems impossible to handle MT output without programming knowledge, and maybe even without skills in software testing in general. So the responsibility sits closer to the programmers, since MT primarily focuses on unit tests, which programmers also maintain.

I would say that handling surviving mutants can be the perfect task for a programmer and a tester working as a pair. Given the possibility of false positives (paths which are impossible), the tester can easily spot behavior that doesn’t need to be treated as a surviving mutant. Moreover, they can learn more about the product than in any other situation.
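One kind of false positive is a mutant that changes the code but not its behavior, often called an equivalent mutant. No test can kill it, so someone has to recognize it and set it aside. The sketch below is a hypothetical, hand-written example: the loop condition `<` is mutated to `!=`, but because the index starts at 0 and grows by one, both versions stop at the same point for every possible array.

```java
// Hypothetical example of an equivalent mutant, written by hand for illustration.
final class EquivalentMutantDemo {

    // Original: index of the first negative value, or -1 if there is none.
    static int firstNegative(int[] values) {
        for (int i = 0; i < values.length; i++) {
            if (values[i] < 0) {
                return i;
            }
        }
        return -1;
    }

    // Mutant: "<" replaced by "!=" in the loop condition.
    // Since i starts at 0 and increases by 1, the loop ends at the same index.
    static int firstNegativeMutant(int[] values) {
        for (int i = 0; i != values.length; i++) {
            if (values[i] < 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[][] inputs = { {}, {1, 2, 3}, {-1}, {4, -2, -7}, {0, 0, -5} };
        for (int[] input : inputs) {
            // Prints the same result twice for every input.
            System.out.println(firstNegative(input) + " " + firstNegativeMutant(input));
        }
    }
}
```

A mutant like this will be reported as surviving however good the test suite is, which is where a second pair of eyes helps in deciding that no new test is needed.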

Quality Assurance Testing

I am pretty confident that "quality assurance testing" is not a term known in the community. By this term, I mean a test suite that checks whether the codebase meets, for example, coding standards that are impossible to check by static analysis, e.g. the presence of some annotation on particular classes.
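A check of that kind could look like the following sketch, which uses only standard Java reflection. Everything in it is an assumption made for illustration: the `@Audited` annotation, the three service classes, and the hard-coded list of classes to inspect. A real project would more likely discover the classes by scanning a package.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical example: the annotation and the classes are invented for illustration.
final class AnnotationCheckDemo {

    // Must be retained at runtime, otherwise reflection cannot see it.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface Audited {}

    @Audited static class PaymentService {}
    @Audited static class RefundService {}
    static class ReportService {}  // deliberately missing the annotation

    // Assumption: the classes to check are listed by hand.
    static List<Class<?>> serviceClasses() {
        return Arrays.asList(PaymentService.class, RefundService.class, ReportService.class);
    }

    // Returns every class in the list that lacks @Audited.
    static List<Class<?>> missingAudited(List<Class<?>> classes) {
        List<Class<?>> missing = new ArrayList<>();
        for (Class<?> candidate : classes) {
            if (!candidate.isAnnotationPresent(Audited.class)) {
                missing.add(candidate);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        for (Class<?> offender : missingAudited(serviceClasses())) {
            System.out.println("missing @Audited: " + offender.getSimpleName());
        }
    }
}
```

In a real suite, the result of `missingAudited` would be asserted to be empty, so that a newly added class without the annotation breaks the build.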

Tools for the Java Landscape

Just for completeness, let’s do a review of available tools:

In Conclusion

I am still surprised that MT has been around for decades, and even more surprised that it is not yet part of most software projects, because the benefits are worth the cost and time needed. In every project, MT should challenge the most critical parts of the application, reducing the possibility of errors and improving software quality. While it is hard work to get rid of the surviving mutants for complex mutations, the benefits of a comprehensive test suite that leaves little room for mistakes make MT the best approach to increasing trust in unit tests.

MT is a big step towards having quality software, and it is a great teammate to code coverage, code review, and static analysis.

Published at DZone with permission of Michal Davidek , DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
