The perils of long-running test suites
First of all, automated test suites are part of the commit and integration process. This means they are run before every remote commit (that is, whenever your code leaves your machine and gets integrated into the main branch).
Chances are that the suite is also part of the deployment pipeline, an automated sequence of tasks that packages up your code into binaries or archives (for interpreted code), tests it in every way known to man and ships it to a Quality Assurance machine and then to the production environment. The deployment pipeline concept is treated in the Continuous Delivery book.
So what if your suite takes 20 minutes to run?
- A commit is a dreaded occasion. No one wants to take a coffee break while waiting for the suite, so people will either commit without running it, or make giant commits at the end of a day of work.
- The Continuous Integration system picks up multiple commits in every new job, because more than one commit lands between runs. A failure can then no longer be fixed with a single revert.
- The load on the CI machine increases and slows down other projects.
- The integration time increases: to be sure you have integrated your code, you have to wait not only for the test suite to finish on your machine, but also for the same result to come out of CI. That's a minimum of 2 times the suite execution time, so don't go around telling people to integrate more often in these conditions, or it will be the only thing they do all day.
- If you want to test or run something in the late stages of the pipeline, such as on a QA machine, feedback is slow, which limits the number of change sets you can get to QA each day.
- The feedback from production is slowed down in the same way. You can never fully reproduce the production environment, even if you usually have a good model of it. Tests can cover all of the code but not every execution path, and therefore can only prove the presence of bugs, not their absence. If you really broke a feature in production, would you like to know it 10 minutes after your commit, or after an hour, when it's no longer clear which change caused it?
For the benefit of all of us who encounter a slow test suite, I put together a series of ideas to speed it up.
Improvements can be made over the whole build process, but some are language-dependent (e.g. incremental compilation), so I'll stick to the ones applicable to xUnit test suites.
- Test profiling and optimization. xUnit frameworks (such as JUnit or PHPUnit) can produce an XML output that contains the execution time measured for each single test. Take a look at the 10 tests with the highest execution time, and simplify or delete the ones that do not provide a good value/cost ratio.
- Test deletion. Tests may be redundant, especially when developed after the code and without a clear picture of the responsibilities of the SUT.
- Test substitution. Test a behavior at the unit level instead of going through the whole system; for example, tests that internally rely on Apache or MySQL processes may substitute a Test Double for the HTTP- or database-accessing code.
- Fakes. If the system interacts with a Test Double many times, provide a low-overhead implementation of the external system. A classic solution for testing queries: instead of going mad stubbing fixed results for every one of them, provide a SQLite instance in place of the real MySQL one and execute them over there.
- Caching. In some projects, the database schema can be reused between tests or between commits. In others, even a basic data set that populates it can be recycled between tests. Test isolation is sacred, but test speed is not far down the list.
- Divide the test suite into multiple stages: unit tests, functional and end-to-end tests, integration with other systems. If the unit test stage fails, fail the build; otherwise mark it as green and go on to the next stage.
- If all else fails, separate the current project into multiple ones (different Bounded Contexts) that can evolve independently. Their suites will be smaller, and the feedback rate in each of them will increase accordingly; for example, if you divide in half a project with a test suite that runs in T, you can get features in production on both of the new components in T/2.
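As a minimal sketch of the profiling idea in the list above, the following Python script ranks tests by execution time from a JUnit-style XML report. The sample report, the test names and the `slowest_tests` helper are all made up for illustration; the sketch assumes each `testcase` element carries `classname`, `name` and `time` attributes, which you should check against what your own framework emits.

```python
import xml.etree.ElementTree as ET

# Hypothetical JUnit-style report, inlined so the sketch is self-contained.
# A real report would be read from the file written by your test runner.
SAMPLE_REPORT = """<testsuites>
  <testsuite name="OrderTest" tests="3" time="12.500">
    <testcase classname="OrderTest" name="testCheckout" time="9.800"/>
    <testcase classname="OrderTest" name="testTotal" time="0.012"/>
    <testcase classname="OrderTest" name="testEmptyCart" time="2.688"/>
  </testsuite>
</testsuites>"""


def slowest_tests(report_xml, limit=10):
    """Return up to `limit` (seconds, test name) pairs, slowest first."""
    root = ET.fromstring(report_xml)
    timings = []
    # iter() also finds test cases inside nested <testsuite> elements.
    for case in root.iter("testcase"):
        name = "{}::{}".format(case.get("classname", "?"), case.get("name", "?"))
        timings.append((float(case.get("time", 0)), name))
    return sorted(timings, reverse=True)[:limit]


if __name__ == "__main__":
    for seconds, name in slowest_tests(SAMPLE_REPORT):
        print("{:8.3f}s  {}".format(seconds, name))
```

The tests at the top of this ranking are the first candidates for simplification, substitution with a Test Double, or deletion.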
You don't have to accept hour-long build and test execution times. It's a form of technical debt that you can fight to improve the feedback rate of the build: the sooner a build fails or passes, the less time you wait to choose between reworking a feature, monitoring it in production, or going on to the next story.
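To make the Fakes idea from the list above concrete, here is a rough sketch in Python using an in-memory SQLite database in place of a real database server. The `UserRepository` class, the `users` table and the `make_fake_database` helper are hypothetical names invented for this example. It also assumes your queries are portable between the two engines, which is not always the case (SQL dialects and parameter placeholder styles differ between drivers).

```python
import sqlite3


class UserRepository:
    """Hypothetical data-access class that receives its connection from outside."""

    def __init__(self, connection):
        self.connection = connection

    def add(self, name):
        # The "?" placeholder is sqlite3's style; other drivers may use another.
        self.connection.execute("INSERT INTO users (name) VALUES (?)", (name,))
        self.connection.commit()

    def names_starting_with(self, prefix):
        cursor = self.connection.execute(
            "SELECT name FROM users WHERE name LIKE ? ORDER BY name",
            (prefix + "%",),
        )
        return [row[0] for row in cursor.fetchall()]


def make_fake_database():
    """Create a throwaway in-memory database with the schema the tests need."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    return connection


if __name__ == "__main__":
    repository = UserRepository(make_fake_database())
    for name in ("alice", "bob", "alba"):
        repository.add(name)
    print(repository.names_starting_with("al"))
```

Because the queries really execute, there is no need to stub a fixed result for each of them, and each test can start from a fresh database at very low cost.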