Principles for Creating Maintainable and Evolvable Tests
Having [automated] unit/integration/functional/… tests is great but it is too easy for them to become a hindrance, making any change to the system painful and slow – up to the point where you throw them away. How to avoid this curse of rigid tests, too brittle, too intertwined, too coupled to the implementation details? Surely following the principles of clean code not only for production code but also for tests will help but is it enough? No, it is not. Based on a discussion on our recent course with Kent Beck, I think that the following three principles below are important to have decoupled, easy to evolve tests:
- Tests tell a story
- True unit tests + decoupled higher-level integration tests (-> Mike Cohn’s Layers of the Test Automation Pyramid)
- More functional composition of the processing
(Disclaimer: All good ideas here come from Kent Beck and my course-mates. All misconceptions are genuinely mine.)
1. Tests tell a story
If you think about your tests as telling a story – single test methods telling simple stories about small features, whole test classes about particular larger-scale features, the whole test suite about your software then you end up with better, more decoupled, and easier to understand tests. This implies that when somebody reads the test, s/he feels as if reading a story.
This actually corresponds nicely to what Gojko Adzic is teaching about acceptance tests. i.e. that they should be a “living documentation” of the system for the use of the business users, and this is their by far greatest benefit (and not, as it is usually believed, that you have a set of regression tests or that you can show that the system works).
Why should you write your tests as stories?
- It forces you to concentrate on telling what the code is supposed to do as opposed to how it should do it. Your tests will be therefore more decoupled from the implementation and thus more maintainable and more likely to discover a defect in the way how a requirement is implemented.
- The story-telling approach forces you to abstract from unimportant details, f. ex. by creating an abstraction layer between the test and the low-level details of what’s being tested (the layer may consist of something as simple as a helper method or something more elaborate)
- The tests will be easier to understand. As we all know, code is read many more times than written so understandability is very important. If your tests are easy to read and grasp, they will serve as a very good documentation of the code under test. Future generations of programmers working on it will love you.
- If you find it difficult to write the test in a story-like manner then something is likely wrong with your API and you should change it.
How should you write tests to read as stories?
- Not: should_return_20_if_sk_or_ba – this tells me nothing: what is 20? what is ba, sk? (For the curious: airlines, namely SAS and British Airways)
- Yes: should_give_discount_for_preferred_airlines – this tells me what and why is done
- Proper level of abstraction – As mentioned above, to get the true
benefit out of your tests and make them ready to live long you need to
keep them on the proper level of abstraction
- Move unimportant details away from the test, into helper methods, objects (such as an ObjectMother) or setUp. Don’t obscure the main logic of the test. (Of course, if you need extensive setup or if you have a lot of low-level code in your test, something is rather wrong with your test approach, the design of the tested code, or both. Fix it first.)
- For example some people claim that having a loop in a test is too low-level – if you need it then perhaps your API is insufficient, its users might need to loop too, so why not just provide a suitable method that would do it for you?
- By the way, nobody says that it is easy to write tests on the right level of abstraction. But it pays off.
- A (rather conceived) example (click to expand):
- Minimal coupling to implementation details – The more your test code resembles your implementation code the less useful it is. The whole point of tests is doing the same thing differently (and usually much more simply) then your implementation so that you are more likely to catch bugs in it. Copying & pasting code from the implementation and adjusting it slightly is therefore a really bad thing to do. If you can’t do things more simply (are sure you can’t?!) try to do them at least differently.
- Mostly public API usage – To keep your tests as decoupled and maintainable as possible, not compromising the evolvability of your code, it seems reasonable to me to try to stick to the public API of the tested class as much as possible. If you want to check some low-level implementation details to be sure you’ve done them right, create another test case for it so that when the implementation changes, you can just throw it away while the test exercising the “story” implemented by the code will be able to live happily on. (See Never Mix Public and Private Unit Tests!)
- TestCases per Fixture (a separate test class for each of different setup needs) – I hope you already know that it is normal to have more test cases for one business class. Actually JUnit forces you to it e.g. by requiring the Parametrized test’s data to be on the class level. If you group tests that require the same setup, you can move the setup code to the @Before method and thus keep the tests themselves much simpler and easier to read. Of course it is always little difficult to find the right balance between the number of test cases, fixtures, and test methods.
(Sidenote: Also the xUnit Patterns book states Tests as Documentation as one of the main goals of testing.)
2. True unit tests + decoupled higher-level integration tests
(-> Mike Cohn’s Layers of the Test Automation Pyramid)
For discussing this JUnit perhaps isn’t a good example – it is very special because it uses itself to test its behavior and the tests must fail even if there is a defect in the test framework – but still we can perhaps learn something from it. What surprised me a lot is that it has a high number of integration tests*, around 35%. The recipe for long-lived, evolvable tests seems to be to write:
- True unit tests, i.e. tests that only check one class and don’t depend on its collaboration with other classes. Such a test is not affected by changes to those collaborations (that are covered by the integration tests) and if the tested class itself changes, the test is likely to either keep up with the change – or, if it is a large-scale change, you just throw it away (likely together with the class being tested) and write a new test for the new design. (I don’t claim this is easier to implement and it also requires a particular way of coding and structuring the software [see #3 below] but it certainly seems to be a way to go.)
- Integration tests that check the collaboration of
multiple objects – not necessarily the whole application or a subsystem,
just about any piece of definable functionality. Again, the integration
test should be telling a story about what the module does – and this
story itself is unlikely to change even if the way how the group of
objects internally implement it evolves. It is thus crucial to test on
the right level where you aren’t bothered too much by concrete
implementation details yet you aren’t too far from the code you want to
- Kent mentioned a nice example of how they refactored the way JUnit 4.5 manages and executes individual phases of testing by replacing nested method calls with command objects (which made it possible to introduce @Rule) – thanks to the integration test being on the level “I give some input to the execution subsystem and expect a particular output”, they still worked. If they were on a lower level and depended on the fact that there was a series of nested method calls, the refactoring would be much harder.
*) What is an integration tests? According to one of the possible definitions, unit test is a test where seeing the failure message you can immediately pinpoint the piece of code or even line where the problem originated. Contrary to that, if an integration test fails, you usually can’t say why and have to dig into it or perhaps debug it a little.
3. More functional composition of the processing (i.e. kill the mocks!)
Do you need a mocking framework to write tests? Then you might be doing it in a suboptimal way. (Disclaimer: There surely are different yet equally good approaches to nearly anything .) When Kent was explaining the way he composes his programs he draw a picture similar to this one:
What is interesting about it? It is not the typical network of objects where one object does something and calls another one to *continue* with the processing or to do a part of it. It is composed of two types of objects: workers that receive an input and produce an output, and integrators that delegate the work to individual workers and pass it from one to another with minimal logic of their own. The workers are very functional and thus easy to test with true unit tests (and as said, if a different implementation is required, you just may throw the worker with its test away and create a new one) while the integrators should be as simple as possible (so the likelihood of a defect is smaller) and are covered by integration tests. (Everything fits nicely together, doesn’t it?)
Given that most of bigger business software systems live for quite a number of years, it’s essential to write tests in such a way that they enhance and not limit the evolvability of the system. It is not easy but we must make efforts towards that.
To succeed we must structure our code and tests in a particular way and approach our test methods, test classes, and test suites as telling stories about what the code under test does.
- Uncle Bob: Manual Mocking: Resisting the Invasion of Dots and Parentheses – uncle Bob explains why he usually uses hand-coded stubs, talks about the difference between testing choreography and testing behavior, and when a mocking framework is really necessary