We all know that Continuous Integration (CI) can deliver real, tangible benefits to our projects. So why are the benefits so elusive? Many software projects claim to be exercising CI, but have builds that run for 30 minutes or more, or worse, just have a nightly build. Build times that exceed a few minutes are excessive; all too commonly we see build times reaching 20 minutes to an hour or two. Real projects tend to have build times that gradually increase as the project evolves, resulting in a failure to reach the full potential that CI promises to deliver. Luckily for many of these projects, there's a solution that can get your project back on track.
I've seen many projects with this problem, and they all have the same pattern: a monolithic build that builds components in a serial fashion. While many of these projects are modular, where modules can be built independently, dependencies between modules prevent them from being built in parallel.
So how do we get shorter build times? We cheat.
By building modules in independent CI jobs, we can get faster feedback for independent modules of our project without reducing the total build time. As we make discrete changes to the source code, only the affected modules will rebuild.
With this approach each module is built separately from the others. As a result we get very fast feedback on the impact of our changes, possibly even before downstream modules have finished building. Furthermore, in many cases only part of the project needs building, which reduces the total build time. I know, I said that total build time is not reduced using this approach, but in many cases it is in fact reduced, since we don't always need to build the whole project!
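To make "only the affected modules will rebuild" concrete, here is a minimal Python sketch. The module names and the `DEPENDS_ON` graph are hypothetical and stand in for whatever modules your project has. It assumes a change to one module requires rebuilding that module plus everything downstream of it.

```python
from collections import deque

# Hypothetical module graph: each key lists the modules it depends on.
DEPENDS_ON = {
    "core": [],
    "model": ["core"],
    "ui": ["model"],
    "reports": ["model"],
    "docs": [],
}

def modules_to_rebuild(changed, depends_on):
    """Return the changed module plus every module downstream of it."""
    # Invert the graph: module -> modules that depend on it directly.
    dependents = {name: [] for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            dependents[dep].append(name)

    # Breadth-first walk over the downstream modules.
    affected = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for downstream in dependents[current]:
            if downstream not in affected:
                affected.add(downstream)
                queue.append(downstream)
    return affected

print(sorted(modules_to_rebuild("model", DEPENDS_ON)))  # ['model', 'reports', 'ui']
```

In this made-up graph a change to `model` leaves `core` and `docs` untouched, which is where the saving in total build time comes from.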
So how do we employ this approach in practice? These are the steps that I took on a recent project:
- Identify logical modules of related functionality in your project. For Eclipse projects, this could be groups of related plug-ins. For my project, which involves about 100 Eclipse plug-ins (OSGi bundles), this meant splitting the bundles into about 7 modules.
- Next, change your build process so that each module builds independently. To do this I started with the module that has no dependencies on other modules. After creating a CI job to build it, I moved on to the next module, which could now download its dependencies from the first CI job. I repeated this process until each module was built by an independent CI job that downloads its dependencies as binaries from other CI jobs.
- After all modules were building on the CI server using binary dependencies, some minor tweaks were required to limit job concurrency and avoid unnecessary thrashing on the build server. This is where the power of Hudson really shines.
- Configure each module's job under Post-build Actions to trigger the next module in the series of dependent modules: check Build other projects and list the downstream jobs there.
- Using the Locks and Latches plugin, ensure that all module jobs use the same lock, which prevents them from all building at once.
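The ordering behind these steps can be sketched in a few lines of Python. This is only an illustration: the module names are hypothetical, and the script does not talk to Hudson at all. It just works out an order in which the jobs could be created (dependencies first) and which job each one would list under Build other projects, assuming the jobs run as a single serial chain.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical modules; each key lists the modules whose binaries it downloads.
DEPENDS_ON = {
    "core": [],
    "model": ["core"],
    "ui": ["model"],
    "reports": ["model"],
}

def job_order(depends_on):
    """Order modules so that each one comes after everything it depends on."""
    return list(TopologicalSorter(depends_on).static_order())

def trigger_chain(order):
    """Map each job to the job it triggers next; the last job triggers none."""
    return {job: nxt for job, nxt in zip(order, order[1:] + [None])}

order = job_order(DEPENDS_ON)
for job, nxt in trigger_chain(order).items():
    print(f"{job} -> {nxt or '(end of chain)'}")
```

Modules with no dependency between them (here `ui` and `reports`) can appear in either order. Chaining them serially mirrors the shared lock, which keeps the jobs from all building at once.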
Here's what I ended up with:
Before: 1 job, 1100 tests, 30 minute build time.
After: 7 jobs, 1075 tests, 20 minute 30 second build time, with a breakdown as follows, in dependency order:
| Job | Build time |
|-----|------------|
| Job 1 | 20 seconds |
| Job 2 | 26 seconds |
| Job 3 | 1 minute 31 seconds |
| Job 4 | 5 minutes 54 seconds |
| Job 5 | 5 minutes 45 seconds |
| Job 6 | 5 minutes 12 seconds |
| Job 7 | 1 minute 12 seconds |
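As a quick check on the breakdown above, here is a small Python sketch that adds up the per-job times and finds the slowest job. The durations are copied from the table; the helper names are only illustrative.

```python
# Per-job durations from the breakdown above, as (minutes, seconds).
JOB_DURATIONS = {
    "Job 1": (0, 20),
    "Job 2": (0, 26),
    "Job 3": (1, 31),
    "Job 4": (5, 54),
    "Job 5": (5, 45),
    "Job 6": (5, 12),
    "Job 7": (1, 12),
}

def to_seconds(minutes, seconds):
    """Convert a (minutes, seconds) pair into total seconds."""
    return minutes * 60 + seconds

def fmt(total_seconds):
    """Format a number of seconds as 'Xm YYs'."""
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}m {seconds:02d}s"

seconds = {job: to_seconds(*d) for job, d in JOB_DURATIONS.items()}
serial_total = sum(seconds.values())
longest = max(seconds, key=seconds.get)

print("Sum of job times:", fmt(serial_total))         # 20m 20s
print("Longest job:", longest, fmt(seconds[longest]))  # Job 4 5m 54s
```

The per-job figures add up to 20 minutes 20 seconds, a few seconds under the quoted total, and no single job takes longer than 5 minutes 54 seconds.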
The net result is that most of the time we get feedback within 6 minutes, where previously we had to wait 30 minutes. Even if you can't parallelize your build, splitting it into independent modules that build serially can produce measurable benefits such as faster feedback and shorter build times.