Large-Scale Continuous Integration Requires Code Modularity
Where large development teams and codebases are involved, code modularity is a key enabler for continuous delivery. At a high level this shouldn’t be too terribly surprising—it’s easier to move a larger number of smaller pieces through the deployment pipeline than it is to push a single bigger thing through.
But it’s instructive to take a closer look. In this post we’ll examine the relationship between continuous integration (which sits at the front end of the continuous delivery deployment pipeline) and code modularity. Code modularity helps at other points in the pipeline—for example, releases—but we’ll save that for a future post.
The impact of too many developers working against a single codebase
For a given codebase, continuous integration (CI) scales poorly with the number of developers. Fundamentally there are two forces at work here: as developers are added, (1) the size of the codebase increases, and (2) commit velocity increases. These forces conspire in some nasty ways to create a painful situation around builds. Let’s take a look:
- Individual builds take longer. As the size of the codebase increases, it obviously takes longer to compile, test, deploy, generate reports and so forth.
- More broken builds. Even if developers are disciplined about running private builds before committing, any given commit has a nonzero chance of breaking the build. So the more commits, the more broken builds.
- A broken build has a greater impact. In a “stop the line” shop, more developers means more people blocked when somebody breaks the build. In other shops, people just keep committing on top of broken builds, making it more difficult to actually fix the build. Either way it’s bad.
- Increased cycle times. After a certain point, the commit velocity and individual build times, taken together, become sufficiently high that the CI builds run constantly throughout the workday. In effect the build infrastructure is unavailable to service commits on a near-real-time basis, which means that developers must wait longer to get feedback on commits. It also means that when builds do occur, they involve multiple stacked-up commits, making it less clear exactly whose change broke the build. This again increases feedback cycle times. (Note that there are some techniques outside of modularization that can help here, such as running concurrent builds on a build grid.) Once the feedback cycle takes more than about ten or fifteen minutes, developers stop paying attention to build feedback.
- Individual commits become more likely to break the build. Even though the global commit velocity increases, individual developers may commit less often because committing is a painful and risky activity. Changelist sizes increase, which makes any given commit more likely to result in a broken build.
- Delayed integration. Painful and risky builds create an incentive to develop against branches and merge later, which is exactly the opposite of continuous integration. Integrations involving such branches consume disproportionately more time.
- General disruption of development activities. Ultimately the problems above become very serious indeed: developers spend a lot of time blocked, and the situation becomes a huge and costly distraction for both developers and management.
- Difficult to make improvements. When everybody is working on the same codebase, it’s harder to see where the problems are. It could be that a certain foundational bit of the architecture is especially apt to break the build, but there aren’t enough tests in place for it. (Meanwhile some other highly stable part of the system is consuming the “build budget” with its comprehensive test suite.) Or perhaps certain teams have better processes in place (e.g., a policy of running private builds prior to committing) than others. Or it may be that some individual developers are simply careless about their commits. It’s hard to know, and thus difficult to manage and improve.
There are various possible responses to the challenges above. One can, for example, scale the build infrastructure either vertically (e.g., more powerful build servers) or horizontally (e.g., build grids to eliminate build queuing). Another tactic is to manage test suites and the tests themselves more carefully: individual tests shouldn’t run too long, test suites shouldn’t run too long, and people should use test doubles (stubs, mocks, and the like) where appropriate. But such responses, while genuinely useful, are more like optimizations than root-cause fixes. Vertical scaling eventually hits a wall, and horizontal scaling can become expensive if resources are treated as if they’re free, which often happens with virtualized infrastructure. Limiting test suite run times is of course necessary, but if it’s done over too broad a scope, it results in insufficient coverage.
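To make the test-double point concrete, here is a minimal sketch in Java. The names (RateService, PricingEngine) are invented for illustration, and JUnit 5 is assumed to be on the test classpath; the idea is simply that a hand-rolled stub replaces a slow, network-bound dependency so the test stays fast enough to run on every commit.

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

// Hypothetical dependency: in production this would call a remote service.
interface RateService {
    double currentRate(String currency);
}

// Hypothetical class under test: depends only on the RateService interface.
class PricingEngine {
    private final RateService rates;

    PricingEngine(RateService rates) {
        this.rates = rates;
    }

    double priceInCurrency(double basePrice, String currency) {
        return basePrice * rates.currentRate(currency);
    }
}

class PricingEngineTest {

    @Test
    void convertsUsingSuppliedRate() {
        // A hand-rolled stub stands in for the slow remote call,
        // keeping this test millisecond-fast in the CI build.
        RateService stubRates = currency -> 1.25;

        PricingEngine engine = new PricingEngine(stubRates);

        assertEquals(125.0, engine.priceInCurrency(100.0, "EUR"), 0.0001);
    }
}
```

The same shape works with a mocking library such as Mockito; the important thing is that nothing in the test touches the network or a database.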
The root cause is too many cooks in the kitchen.
Enable continuous integration by modularizing the codebase
It would be incorrect to conclude that continuous integration works only for small teams. CI works just fine even with large teams developing against large codebases. The trick is to break up the codebase so that everybody isn’t committing against the same thing. But what does that mean?
Here’s what it doesn’t mean: it doesn’t mean that each team should branch the codebase and work off of branches until it’s time to merge. This just creates huge per-branch change inventory that has to be integrated at the end (or more likely toward the middle) of the release. Again this is the opposite of continuous integration.
Instead, it’s the codebase itself that needs to be broken up. Instead of one large app or system with a single source repo and everybody committing against the trunk, the app or system should be modularized. If we can carve that up into services, client libraries, utility libraries, components, or whatever, then we should do that. There’s no one-size-fits-all prescription for deciding when a module should get its own source repo (as opposed, say, to having a bunch of Maven modules in a single source repo), but we can apply judgment based on the coherence and size of the code as well as the number of associated developers.
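As a rough illustration of what such a carve-up can look like at the code level, here is a hypothetical Java sketch (the artifact and type names, such as orders-client and OrderStatusClient, are invented for this example): the team that owns the order service publishes a small client library, and consuming teams compile against that library rather than against the service’s internals.

```java
// orders-client: published as its own artifact (e.g. com.example:orders-client),
// and the only thing other teams' builds depend on.
package com.example.orders.client;

public interface OrderStatusClient {
    /** Returns a coarse status such as "NEW", "PAID", or "SHIPPED". */
    String statusOf(String orderId);
}
```

A consumer in a separate module (and, quite possibly, a separate repo) then looks like this:

```java
// checkout module: depends on orders-client only, so its CI build compiles
// and tests without pulling in the order service's codebase.
package com.example.checkout;

import com.example.orders.client.OrderStatusClient;

public class CheckoutView {
    private final OrderStatusClient orders;

    public CheckoutView(OrderStatusClient orders) {
        this.orders = orders;
    }

    public String banner(String orderId) {
        return "Your order is " + orders.statusOf(orderId);
    }
}
```

Each module now has its own, smaller build and its own commit stream, which is exactly what the CI problems above call for.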
Modularizing the code helps with the various continuous integration problems we highlighted above by reducing the size of the build, reducing the commit velocity, and removing incentives to delay integration. It has other important advantages outside of continuous integration, such as decoupling teams from a release planning perspective, making it possible to be more surgical when doing production rollbacks, and so forth. But the advantages to continuous integration are huge.
Note that code modularization brings its own challenges. Code modules require interfaces, which in turn require coordination between teams. SOA/platform approaches will likely require significant architectural attention to address issues of service design, service evolution, governance and so forth. Moreover there will need to be systems integration testing to ensure that all the modules play nicely together, especially when they are evolving at different rates in a loosely coupled fashion. But the costs here are enabling in nature, with a return on investment: greater architectural integrity and looser coupling between teams. The costs we highlighted earlier in the post are pure technical debt.
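As for the systems integration testing mentioned above, one minimal sketch (assuming the order service is deployed to a shared test environment at the hypothetical URL below, and again assuming JUnit 5) is a simple smoke test that exercises the deployed module over HTTP:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import org.junit.jupiter.api.Test;

// Runs against deployed modules rather than compiled code, so it catches
// contract drift between independently released services.
class OrderServiceSmokeIT {

    private static final String ORDERS_BASE = "https://orders.test.example.com"; // hypothetical

    private final HttpClient http = HttpClient.newHttpClient();

    @Test
    void orderServiceAnswersStatusRequests() throws Exception {
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(ORDERS_BASE + "/orders/smoke-test-order/status"))
                .GET()
                .build();

        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());

        // A cheap "do the pieces still talk to each other?" check,
        // typically run on a schedule rather than on every commit.
        assertEquals(200, response.statusCode());
    }
}
```

Tests like this complement, rather than replace, the fast per-module CI builds described above.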