Fighting Fragility With Property-Based Testing

In traditional unit testing, we set up our tests around edge cases. Jqwik validates against a whole range of possible inputs, making it better at catching regressions.

By Jasper Sprengers · Dec. 24, 21 · Tutorial

However long you work in software, you always feel late to the party. You encounter some seemingly cutting-edge new tool only to learn it has been around for decades, sometimes inspired by research papers from 1970. Still, you can’t keep up with everything and have a life. Property-based testing (PBT) is such an established technology and it deserves more attention. Java has the Jqwik library, Scala has ScalaCheck and Python has Hypothesis. 

Check the links at the end for some good tutorials. Here I want to skip the nitty-gritty and focus in detail on its killer feature, which is the ability to warn when some change to production code is no longer sufficiently covered by a test suite.

For the uninitiated: PBT validates so-called properties. I like the definition from ScalaCheck best: A property is a high-level specification of behavior that should hold for a range of data points. For example, a property might state that the size of a list returned from a method should always be greater than or equal to the size of the list passed to that method. This property should hold no matter what list is passed.
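
To make that definition concrete, here is a minimal hand-rolled sketch of the list-size property in plain Java, without any PBT library. The class name and the method `withTerminator` are made up for illustration, and the fixed seed only serves to keep the run repeatable. A real framework would generate the lists for you, and do so far more cleverly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ListSizeProperty {

    // Hypothetical method under test: appends a terminating marker, so the
    // result can never be shorter than the input.
    static List<String> withTerminator(List<String> input) {
        List<String> result = new ArrayList<>(input);
        result.add("END");
        return result;
    }

    // The property: for any list, the returned list is at least as long.
    static boolean sizeNeverShrinks(List<String> input) {
        return withTerminator(input).size() >= input.size();
    }

    // Checks the property for `tries` generated lists and returns the first
    // counterexample, or null if none was found.
    static List<String> findCounterexample(int tries, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < tries; i++) {
            int size = random.nextInt(20); // list sizes 0 to 19
            List<String> input = new ArrayList<>();
            for (int j = 0; j < size; j++) {
                input.add("item" + random.nextInt(1000));
            }
            if (!sizeNeverShrinks(input)) {
                return input;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> counterexample = findCounterexample(1000, 42L);
        System.out.println(counterexample == null
                ? "Property held for all generated lists"
                : "Property failed for " + counterexample);
    }
}
```

Note that the check never mentions a concrete list. It only states what must hold for whatever list comes in.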

To test properties a PBT framework creates data that is arbitrary – not random! – and constrained within a specific range, for example, integers between 10 and 100. Jqwik has an extensive API to create such arbitrary data and clever logic to try out as many scenarios as needed to break a test. What’s the point of trying out so many possible inputs when usually only a few are relevant? Bear with me.
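
As a rough idea of what "arbitrary but constrained" could look like, the following plain-Java sketch produces integers between 10 and 100. It deliberately starts with both boundaries and then fills up with seeded pseudo-random values in between. This is a simplification with hypothetical names, not how jqwik implements its generators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ConstrainedIntegers {

    // Produces `count` integers within [min, max]: the two boundaries first,
    // then pseudo-random values in between.
    // Assumes min <= max, count >= 2 and a range small enough that
    // max - min + 1 does not overflow an int.
    static List<Integer> generate(int min, int max, int count, long seed) {
        Random random = new Random(seed);
        List<Integer> values = new ArrayList<>();
        values.add(min);
        values.add(max);
        while (values.size() < count) {
            values.add(min + random.nextInt(max - min + 1));
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(generate(10, 100, 10, 1L));
    }
}
```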

Finding the Relevant Edge Cases

For practical purposes, the range of valid inputs to a method under test is often infinite, especially with strings and large numeric types. PBT cannot try them all, and why should it? If results are predictable for a range of input values, we traditionally need only to validate the significant edge cases. If, say, a function that squares an integer returns 25 for 5 and 100 for 10, the results for inputs 6 to 9 hold no surprises. Testing 10, -10, and 0 should be enough. Traditional unit testing relies on handpicked scenarios because it doesn’t make sense to test everything. Or so you think.

Imagine an embarrassingly simple function that calculates some monetary amount based on a person’s age. Here’s the specification:

  • A valid age must lie between zero and 125. Let’s be on the safe side.
  • Only people 18 years and older are eligible.
  • An eligible age returns 200 euros.
  • A non-eligible age returns zero.

It seems we have three significant numbers: 0, 18, and 125.

Java
 
public class BenefitCalculator {
    int calculateBenefitForAge(int age) {
        if (age < 0 || age > 125)
            throw new IllegalArgumentException("Age is out of range [0-125]: " + age);
        return age < 18 ? 0 : 200;
    }
}


If you rephrase the rules as general statements about a range of values, you get the properties:

  • any input less than zero is not acceptable
  • any input greater than 125 is not acceptable
  • any input between 0 and 17 returns zero
  • any input between 18 and 125 returns 200

In traditional scenario-based unit testing, we set up our parameters around the edges of significant values. For numbers we take the nearest neighbor that produces a different result than said value:

  • 0 does not throw, but -1 does.
  • 125 does not throw, but 126 does. 
  • 18 returns 200 while 17 returns 0.

JUnit’s parameterized tests are an elegant mechanism to test the valid age range and the returned benefit amount without too much duplication.

Java
 
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.CsvSource;

// The test is invoked once for each entry in the array. The comma-separated
// values must match the test method parameters in number and type.
@ParameterizedTest
@CsvSource({"0,-1", "125,126"})
public void test_valid_age_range(int inRange, int outOfRange) {
    calculator.calculateBenefitForAge(inRange);
    assertThatThrownBy(() -> calculator.calculateBenefitForAge(outOfRange)).hasMessageStartingWith("Age is out of range");
}

@ParameterizedTest
@CsvSource({"17,0", "18,200"})
public void validate_benefit_amount_for_age(int age, int benefit) {
    assertThat(calculator.calculateBenefitForAge(age)).isEqualTo(benefit);
}


Don’t Stop at the Edges

PBT on the other hand does not stop at the edges. It explores the entire range. Here are our four properties expressed in code: 

Java
 
import static org.assertj.core.api.Assertions.assertThatThrownBy;

import net.jqwik.api.ForAll;
import net.jqwik.api.Property;
import net.jqwik.api.constraints.IntRange;
import net.jqwik.api.constraints.Negative;

@Property
public void for_every_input_greater_than_125_the_function_throws(@ForAll @IntRange(min = 126) int age) {
    assertThatThrownBy(() -> calculator.calculateBenefitForAge(age));
}

@Property
public void for_every_input_less_than_zero_the_function_throws(@ForAll @Negative int age) {
    assertThatThrownBy(() -> calculator.calculateBenefitForAge(age));
}

// The min of @IntRange defaults to 0, so this covers 0 to 17.
@Property
public boolean any_input_between_0_and_17_returns_0(@ForAll @IntRange(max = 17) int age) {
    return calculator.calculateBenefitForAge(age) == 0;
}

@Property
public boolean any_input_between_18_and_125_returns_200(@ForAll @IntRange(min = 18, max = 125) int age) {
    return calculator.calculateBenefitForAge(age) == 200;
}


@Property marks the method as a jqwik test. Nothing is needed at the class level. @ForAll instructs the framework to try arbitrary values for the age parameter it annotates. @IntRange constrains those values to the range that the property covers. A property fails when it returns false or throws.

Maybe I did not win you over yet. The parameterized approach is less verbose than PBT (two methods versus four) and arguably more readable. It is certainly more performant: by default jqwik runs a property a thousand times, or fewer if it exhausts all possible values first. All this checking seems excessive. After all, there’s no scenario between 18 and 125 where the code would suddenly behave differently.

But on a point of principle: traditional unit tests do not (in)validate properties, they only check handpicked examples. Suppose we augment the logic and squeeze in a special case for persons between 40 and 64 years old. We add the following property:

Any input between 40 and 64 returns 300

But we’re not done! The existing properties must be adjusted and augmented:

  • any input less than zero is not acceptable
  • any input greater than 125 is not acceptable
  • any input between 0 and 17 returns zero
  • any input between 18 and 125 returns 200 now becomes "any input between 18 and 39 returns 200"
  • any input between 40 and 64 returns 300 (new)
  • any input between 65 and 125 returns 200 (new)

Java
 
public int calculateBenefitForAge(int age) {
    if (age < 0 || age > 125) {
        throw new IllegalArgumentException("Age is out of range [0-125]: " + age);
    } else if (age < 18) {
        return 0;
    } else if (age >= 40 && age < 65) {
        return 300;
    } else {
        return 200;
    }
}


This code change has invalidated property 4, but the two original parameterized tests still succeed. That is because the tests are strictly correct: the edge-case parameters 17 and 18 still hold. But the suite tacitly suggests that every valid age of 18 or over yields the same output, and that is no longer true. 17 and 18 are no longer the whole truth. The tests turn a blind eye to the numbers 40 and 65, which constitute new edge-case scenarios. That doesn’t feel right. This is a simple code change, with possibly big repercussions. After any change in behavior, you would expect at least one test to fail.
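
The difference can be reproduced without any test framework. The sketch below copies the revised logic into a throwaway class (all names are invented for this illustration) and runs both kinds of check against it: the two handpicked examples, and the old property evaluated for every age from 18 to 125.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

class StaleTestDemo {

    // The revised production logic from the listing above.
    static int calculateBenefitForAge(int age) {
        if (age < 0 || age > 125) {
            throw new IllegalArgumentException("Age is out of range [0-125]: " + age);
        } else if (age < 18) {
            return 0;
        } else if (age >= 40 && age < 65) {
            return 300;
        } else {
            return 200;
        }
    }

    // The two handpicked examples from the original parameterized test.
    static boolean originalExamplesStillPass() {
        return calculateBenefitForAge(17) == 0 && calculateBenefitForAge(18) == 200;
    }

    // The original property "any input between 18 and 125 returns 200",
    // checked for every age in that range. Returns the first age that breaks it.
    static OptionalInt firstCounterexample() {
        return IntStream.rangeClosed(18, 125)
                .filter(age -> calculateBenefitForAge(age) != 200)
                .findFirst();
    }

    public static void main(String[] args) {
        System.out.println("Examples still pass: " + originalExamplesStillPass());
        System.out.println("Property broken at: " + firstCounterexample());
    }
}
```

The examples keep passing, while the range check stumbles on age 40, the first value the handpicked examples never looked at.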

We Practice Strict TDD, Except on Friday Afternoon

Now the above is never a problem because as TDD adepts we develop our tests and production code strictly in tandem, right? You even add the extra test cases before you touch the production code. Unless of course, it was time for Friday afternoon drinks, in which case you can add the test after your holidays, obviously.

So much for sarcasm. Proper PBT is a little more verbose, as you write a separate method for each property. But that is a much better safeguard to ensure that the suite is in sync with the production code. A code change is more likely to break a property than a parameterized test. The property that guaranteed return value 200 for inputs 18 to 125 now fails instantly. You tell me if adding these four extra edge cases is more user-friendly than the PBT approach.

Java
 
@CsvSource({"17,0", "18,200", "39,200", "40,300", "64,300", "65,200"})


More Unknown Unknowns

PBT is also great at other ‘unknown unknowns’. Imagine some calculation where a non-obvious input leads to a division by zero further down the line. Or consider something far less intricate, for the more mathematically challenged like yours truly.

Java
 
// Utility method in NumberUtils, called statically by the property below.
public static int square(int input) {
    return input * input;
}


This won’t work: for any value greater than 46340 or less than -46340, the square is too large for a Java int and overflows. It’s easy to miss, but jqwik will tell you so:

Java
 
@Property
public boolean all_int_ranges_are_valid(@ForAll int input){
    return NumberUtils.square(input) >= 0;
}


Shell
 
Property [SquarePropertyTest:all int ranges are valid] failed with sample {0=46341}
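
The sample in this report is no coincidence: 46341 is the smallest positive input whose square overflows. As a rough illustration of how an arbitrary failing value can be narrowed down to that boundary, here is a brute-force search in plain Java. It is only a stand-in with invented names; jqwik's actual algorithm is smarter than walking up from zero.

```java
import java.util.function.IntPredicate;

class ShrinkSketch {

    // Same overflow-prone method as in the article.
    static int square(int input) {
        return input * input;
    }

    // The property under test: a square should never be negative.
    static boolean squareIsNonNegative(int input) {
        return square(input) >= 0;
    }

    // Brute-force stand-in for shrinking: given a known failing positive
    // value, walk up from zero and return the first value that also
    // violates the property. Assumes failing > 0.
    static int smallestFailingValue(IntPredicate property, int failing) {
        for (int candidate = 0; candidate < failing; candidate++) {
            if (!property.test(candidate)) {
                return candidate;
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        int firstFailure = 1_000_000; // some failing input found by chance
        System.out.println(smallestFailingValue(ShrinkSketch::squareIsNonNegative, firstFailure));
        // prints 46341
    }
}
```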


Jqwik doesn’t arrive at this edge case by accident. After the first failure, it zooms in to find the failing value that is closest to a passing value in a process called shrinking. PBT forces you to think and refine your properties and assists in the process. You would probably use a long for the return type and add an explicit range check at the top of the method.
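
Such a fix could look like the sketch below. It shows the two options as separate methods: widening the result to a long, and keeping the int but rejecting inputs whose square does not fit. The class and method names are assumptions for the sake of the example.

```java
class SafeSquare {

    // Widening to long before multiplying avoids the overflow: the largest
    // possible result, Integer.MIN_VALUE squared, is 2^62 and fits in a long.
    static long square(int input) {
        return (long) input * input;
    }

    // Alternative that keeps the int return type and rejects inputs whose
    // square does not fit, instead of silently wrapping around.
    static int squareExact(int input) {
        if (input > 46340 || input < -46340) {
            throw new IllegalArgumentException("Square does not fit in an int: " + input);
        }
        return input * input;
    }

    public static void main(String[] args) {
        System.out.println(square(46341));      // 2147488281
        System.out.println(squareExact(46340)); // 2147395600
    }
}
```

With either variant, a property stating that the result is never negative holds for every int that the method accepts.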

Who Watches the Watchmen?

Although very different technologies, PBT shares a trait with mutation testing. They add a touch of controlled randomness to improve the robustness of your tests. Mutation testing does so by purposely messing up your byte code (introducing so-called mutants), the idea being that these changes should cause existing tests to fail, called "killing the mutants". PBT tries your tests across a wide range of values that you assume should yield a predictable result. It puts those very assumptions to the test. The results will often catch you off guard and lead to more robust tests and production code.
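
To picture what killing a mutant means, consider this hand-made example in plain Java. Real mutation testing tools generate their mutants automatically. Here the mutant is written out at source level, with invented names, and the range check of the benefit calculator is left out for brevity.

```java
import java.util.function.IntUnaryOperator;

class MutantDemo {

    // The original eligibility rule.
    static int original(int age) {
        return age < 18 ? 0 : 200;
    }

    // A hand-made mutant: the boundary condition `<` has become `<=`.
    static int mutant(int age) {
        return age <= 18 ? 0 : 200;
    }

    // A weak suite that stays away from the boundary.
    static boolean weakSuitePasses(IntUnaryOperator calculator) {
        return calculator.applyAsInt(10) == 0 && calculator.applyAsInt(30) == 200;
    }

    // The edge-case suite: 17 and 18 sit on either side of the boundary.
    static boolean edgeCaseSuitePasses(IntUnaryOperator calculator) {
        return calculator.applyAsInt(17) == 0 && calculator.applyAsInt(18) == 200;
    }

    public static void main(String[] args) {
        // A suite kills the mutant when it fails against the mutated code.
        System.out.println("Weak suite kills mutant: " + !weakSuitePasses(MutantDemo::mutant));
        System.out.println("Edge-case suite kills mutant: " + !edgeCaseSuitePasses(MutantDemo::mutant));
    }
}
```

Both suites pass against the original code, but only the edge-case suite fails against the mutant. The weak suite lets it survive, which is exactly the kind of gap mutation testing is meant to expose.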

Further Reading

  • Sample project – Code samples for this article on GitLab
  • Jqwik – homepage of the jqwik project, with extensive documentation
  • Property-based testing patterns – common patterns that can help you distill properties from business rules.
  • What is Property-Based Testing? – a more theoretical look at the principles.

 
