
How do you test your tests? – Mutation analysis of Java programs with PIT


Software testing aims at checking the correctness of a program. But how can you check the correctness of your tests? Quis custodiet ipsos custodes? Mutation analysis can help you evaluate the quality of a test suite.

The basic principle of mutation analysis is to insert faults into a program, then run the test suite to check whether the faults are detected. First, mutation operators create different versions of the program (called 'mutants'), each containing a specific kind of fault. These faults usually mimic mistakes that programmers often make. Then the test suite is run on each mutant. If the test suite detects the inserted fault, the mutant is considered 'killed'. The result of the analysis is the mutation score, which is the percentage of mutants that have been killed.
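The principle above can be illustrated with a minimal sketch. It assumes hand-written mutants modelled as lambdas, whereas a real tool generates them automatically; the class name, the mutants, and the two test suites are all hypothetical and chosen only for illustration.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

// Illustrative sketch only: mutants are written by hand as lambdas here,
// whereas a real tool generates them automatically from the program.
class MutationScoreSketch {

    // Program under test: returns the larger of two ints.
    static final IntBinaryOperator ORIGINAL = (a, b) -> a > b ? a : b;

    // Three hand-written mutants, each containing one inserted fault.
    static final List<IntBinaryOperator> MUTANTS = List.of(
        (a, b) -> a < b ? a : b, // relational operator replaced
        (a, b) -> a,             // always returns the first argument
        (a, b) -> b              // always returns the second argument
    );

    // A test case is modelled as a predicate that is true when the test passes.
    static final List<Predicate<IntBinaryOperator>> WEAK_SUITE = List.of(
        max -> max.applyAsInt(3, 1) == 3,
        max -> max.applyAsInt(2, 2) == 2
    );

    // Same suite plus one case where the second argument is the larger one.
    static final List<Predicate<IntBinaryOperator>> STRONGER_SUITE = List.of(
        max -> max.applyAsInt(3, 1) == 3,
        max -> max.applyAsInt(2, 2) == 2,
        max -> max.applyAsInt(1, 3) == 3
    );

    // A mutant is killed as soon as at least one test fails on it.
    static boolean isKilled(IntBinaryOperator mutant,
                            List<Predicate<IntBinaryOperator>> suite) {
        return suite.stream().anyMatch(test -> !test.test(mutant));
    }

    // Mutation score: percentage of mutants killed by the suite.
    static double mutationScore(List<IntBinaryOperator> mutants,
                                List<Predicate<IntBinaryOperator>> suite) {
        long killed = mutants.stream().filter(m -> isKilled(m, suite)).count();
        return 100.0 * killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println("Weak suite score: " + mutationScore(MUTANTS, WEAK_SUITE));
        System.out.println("Stronger suite score: " + mutationScore(MUTANTS, STRONGER_SUITE));
    }
}
```

In this sketch the weak suite leaves one mutant alive (the one that always returns its first argument), and adding a single test case with the larger value in second position kills it.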

Why not just cover all the branches?

Mutation analysis can be considered a test criterion ("test until all mutants are killed"). How, then, does it compare to other test criteria, such as branch coverage? This is a tricky question, because the answer depends heavily on the mutation operators that are used. One difference is clear, though: to kill a mutant you not only need to execute the instruction where the fault is inserted, you also need an oracle able to detect this fault.
You should do both. Structural coverage criteria are fast to evaluate, which gives you fast feedback, and they force you to cover instructions where no faults have been inserted by the mutation operators. Mutation analysis is dynamic and takes longer to evaluate (in the worst-case scenario, you need to execute all the test cases on all the mutants), but it ensures that your test suite is able to detect specific kinds of faults.
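The difference between covering a branch and detecting a fault in it can be sketched as follows. This is a hypothetical example: the branch recording is done by hand rather than by a coverage tool, and the method and test names are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntUnaryOperator;

// Illustrative sketch: a test can cover every branch and still miss a fault
// when it has no oracle on the result.
class CoverageVersusOracleSketch {

    // Branches recorded by hand, standing in for a coverage tool.
    static final Set<String> BRANCHES_HIT = new HashSet<>();

    // Original program: absolute value.
    static int abs(int x) {
        if (x < 0) {
            BRANCHES_HIT.add("negative");
            return -x;
        }
        BRANCHES_HIT.add("non-negative");
        return x;
    }

    // Mutant: the negation has been removed in the negative branch.
    static int absMutant(int x) {
        if (x < 0) {
            BRANCHES_HIT.add("negative");
            return x;
        }
        BRANCHES_HIT.add("non-negative");
        return x;
    }

    // Executes both branches but checks nothing about the results.
    static boolean weakTest(IntUnaryOperator f) {
        f.applyAsInt(-5);
        f.applyAsInt(5);
        return true;
    }

    // Same inputs, with an oracle on the returned values.
    static boolean strongTest(IntUnaryOperator f) {
        return f.applyAsInt(-5) == 5 && f.applyAsInt(5) == 5;
    }

    public static void main(String[] args) {
        System.out.println("weak test passes on mutant: "
            + weakTest(CoverageVersusOracleSketch::absMutant));
        System.out.println("branches hit: " + BRANCHES_HIT.size());
        System.out.println("strong test passes on mutant: "
            + strongTest(CoverageVersusOracleSketch::absMutant));
    }
}
```

Both tests reach full branch coverage in this sketch, but only the one with assertions on the returned values fails on the mutant.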

PIT – bytecode-based mutation analysis

PIT is a tool for the mutation analysis of Java programs. It works on bytecode and in memory, which means that it is rather fast (all things considered) and you do not have to manage extra versions of your source code. PIT requires Java 5 or above, and works with JUnit 4 or TestNG 6. Note that as JUnit 4 is able to run JUnit 3 test cases, you can still use PIT with legacy JUnit 3 tests. Another interesting feature of PIT is that it first measures the coverage of your test cases, so it only runs the test cases that cover the mutated instruction. This means faster execution, especially if your tests have a low coverage of your code.
PIT can be executed from the command line, with Ant, or with Maven. Several options are available, which let you specify the classes to mutate, the operators to use, the tests to run, the output format (HTML, CSV, or XML; the default is HTML), etc. It is possible to exclude some methods, or even some method calls (for instance if you do not want to test non-functional calls). There are also a Gradle plugin, an Eclipse plugin, and a Sonar plugin, all developed by third parties.
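The idea of running only the tests that cover a mutated instruction can be sketched as a simple lookup. This is not PIT's implementation: the coverage data, test names, and line numbers below are made up for illustration, and the sketch assumes Java 9 or later for `Map.of` and `Set.of`.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Illustrative sketch of coverage-based test selection (not PIT's own code).
class TestSelectionSketch {

    // Hypothetical coverage data: test name -> line numbers it executes.
    static final Map<String, Set<Integer>> COVERAGE = Map.of(
        "testEmptyGrid", Set.of(10, 11, 12),
        "testSingleRow", Set.of(10, 14, 15),
        "testToString", Set.of(20, 21)
    );

    // Returns, in alphabetical order, the tests that execute the mutated line.
    // Only these tests can possibly kill a mutant inserted on that line.
    static List<String> testsCovering(int mutatedLine) {
        return COVERAGE.entrySet().stream()
            .filter(entry -> entry.getValue().contains(mutatedLine))
            .map(Map.Entry::getKey)
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println("Tests to run for a mutant on line 14: " + testsCovering(14));
        System.out.println("Tests to run for a mutant on line 99: " + testsCovering(99));
    }
}
```

A mutant on a line that no test executes yields an empty list, so it can be reported as surviving without running anything.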

Running example

For the remainder of this article, I will be using "Game of Life", an open-source demonstration project for the book Jenkins: The Definitive Guide. Specifically, I will use the gameoflife-core Maven module.

Initialization with maven

First, you need to add the PIT Maven plugin to your pom.xml configuration. (Make sure you are using the latest version!)
The <configuration> tag can contain all the options you need (see the documentation for more information).

Running PIT

To run PIT, you just need to execute the org.pitest:pitest-maven:mutationCoverage goal. The console shows the results, and a detailed report in the specified format can be found in target/pit-reports/YYYYMMDDHHmm (as long as you don't clean target, you can keep several reports).

HTML Report

The index of the report shows the mutation score (the percentage of killed mutants, called mutation coverage here) as well as the line coverage. There is also a summary for each package.

You can also view the result for each package and for each class. Here all the mutants have been killed, except one in GridWriter.

To have more information on the surviving mutant, we need to go to the class view, which gives line by line information on the mutants and the coverage of the tests.

A note to the left of each line indicates how many mutants have been inserted, with a more detailed report at the end of the page. Here we can see that two mutants were introduced at line 14 of “GridWriter.java”: one was killed, but the other survived. The surviving mutant was created by the “conditionals boundary mutator”, which changes the boundary of a relational operator in the condition on that line (for example, it turns > into >=).

Here the mutant survived, which means that no test case run on this code has failed. It could mean that there are no test cases where row.length is zero (test data problem), or it could mean that there is such a test case, but that its assertions are not able to detect that the instruction in the block has been executed (oracle problem).
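The two explanations above (test data problem and oracle problem) can be reproduced on a small, hypothetical example. The code below is not the actual GridWriter source; it only assumes a condition on `row.length` whose boundary is changed from `>` to `>=`, and the test names are invented for illustration.

```java
import java.util.function.Function;

// Illustrative sketch: why a conditionals-boundary mutant can survive.
class BoundaryMutantSketch {

    // Hypothetical original: writes a row followed by a newline, if non-empty.
    static String original(String[] row) {
        StringBuilder out = new StringBuilder();
        if (row.length > 0) {
            out.append(String.join("", row)).append('\n');
        }
        return out.toString();
    }

    // Mutant: the boundary of the condition is changed from > to >=.
    static String mutant(String[] row) {
        StringBuilder out = new StringBuilder();
        if (row.length >= 0) {
            out.append(String.join("", row)).append('\n');
        }
        return out.toString();
    }

    // Test data problem: no empty row is ever used, so both versions agree.
    static boolean nonEmptyRowTest(Function<String[], String> writer) {
        return writer.apply(new String[] {"*", "."}).equals("*.\n");
    }

    // Oracle problem: an empty row is used, but the assertion is too weak.
    static boolean weakOracleTest(Function<String[], String> writer) {
        return writer.apply(new String[0]) != null;
    }

    // Empty row plus a precise assertion: this test fails on the mutant.
    static boolean strictTest(Function<String[], String> writer) {
        return writer.apply(new String[0]).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println("non-empty row test passes on mutant: "
            + nonEmptyRowTest(BoundaryMutantSketch::mutant));
        System.out.println("weak oracle test passes on mutant: "
            + weakOracleTest(BoundaryMutantSketch::mutant));
        System.out.println("strict test passes on mutant: "
            + strictTest(BoundaryMutantSketch::mutant));
    }
}
```

Only the last test combines the boundary input with an assertion precise enough to distinguish the two versions.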

Equivalent mutants

Equivalent mutants are one of the most difficult problems in mutation analysis, because detecting them is undecidable in the general case. An equivalent mutant is a mutant that cannot be distinguished from the original program by any test. For instance, these two snippets are equivalent:

    int index = 0;
    while (…) {
        if (index == 10) {

and

    int index = 0;
    while (…) {
        if (index >= 10) {

One particular case where equivalent mutants can appear is non-functional code, such as calls to a logging framework. It is possible to filter out calls to some methods, and PIT already excludes calls to the major logging frameworks.
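To make the `==` versus `>=` example concrete, here is a sketch with one possible completion of the loop. The loop condition and body are elided in the snippets above, so everything beyond the `index` comparison is an assumption: the sketch supposes that `index` starts at 0, increases by exactly one per iteration, and that the loop exits when the condition holds.

```java
// Illustrative sketch of an equivalent mutant, under the assumptions that
// index starts at 0 and only ever increases by one.
class EquivalentMutantSketch {

    // Hypothetical completion of the original loop.
    static int original(int limit) {
        int index = 0;
        int steps = 0;
        while (steps < limit) {
            if (index == 10) {
                break;
            }
            index++;
            steps++;
        }
        return index;
    }

    // Mutant: == replaced by >=. Since index can never skip over 10,
    // both conditions become true at exactly the same iteration.
    static int mutant(int limit) {
        int index = 0;
        int steps = 0;
        while (steps < limit) {
            if (index >= 10) {
                break;
            }
            index++;
            steps++;
        }
        return index;
    }

    // Compares both versions on a range of inputs. Agreement on a finite
    // range is evidence, not a proof, of equivalence.
    static boolean agreeUpTo(int maxLimit) {
        for (int limit = 0; limit <= maxLimit; limit++) {
            if (original(limit) != mutant(limit)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("Same result for limits 0..1000: " + agreeUpTo(1000));
    }
}
```

No test can kill such a mutant, so it lowers the mutation score without pointing at any weakness in the test suite.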

What to do next?

I encourage you to experiment with mutation testing. It will give you a new point of view on your tests, and will allow you to improve the test data as well as the oracles. Take a look at the list of mutation operators implemented by PIT to get an idea of the kinds of faults it will force you to detect. Also, PIT is neither the only framework for mutation testing of Java programs nor the first. If you are interested, there is a detailed comparison on the PIT website.


Published at DZone with permission of Romain Delamare. See the original article here.
