Continuous Integration and Pull Requests
There is a well-known tension between the Feature Branches model and Continuous Integration: approaches range along a spectrum from six-month-long, version-based branches to the deployment of every commit. However, as Jez Humble notes, some disconnection from the trunk of a codebase is always present, even if only as the local working copy, which is in effect a small and (hopefully) short-lived fork.
Today I want to explore one particular kind of Feature Branch: the Pull Request.
Pull requests take their name from GitHub and, more generally, from the terminology of distributed version control. Within this model, every developer has a published repository that is forked from the project's original one. The original repository is the only one that has features such as issue tracking, so the forked ones are second-class citizens.
Instead of proposing a patch for integration with the trunk, contributors to a project can publish a full version of the project (usually a feature branch), to be merged into the trunk (master branch) of the mainline by someone with write access. A contributor who already has write access can open a pull request from a branch in the main repository, with the goal of getting the code reviewed and approved.
This process has a great advantage over the patch proposal model in that it moves the time cost of integration onto the contributor rather than the maintainer; since every fork is a full-fledged copy of the project, the test suite can be run publicly, so that every pull request is accompanied by a Travis CI build certifying that the fork is green. The Git model by itself automatically keeps important information in the fork, such as the point in time and the branch it was forked from.
In open source, the pull request model does wonders:
- It recognizes maintainers as a bottleneck in accepting contributions. As stated above, it simplifies their work by providing full projects to merge along with metadata, instead of lists of patches.
- It allows everyone to temporarily use their own published fork while a new feature is being reviewed and accepted upstream.
- As a contributor and maintainer of PHP open source projects, I can tell you it avoids unfinished work.
Let's expand a bit on the last point by making a comparison with the opposite model, Continuous Integration. Open source projects cannot employ Continuous Integration between different contributors because of:
- Cadence: many contributions arrive during the weekend or with random frequency. Many open source projects are run and developed by volunteers, not by full-time employees.
- Communication difficulties: it's difficult to get code reviewed once it is in the trunk, whereas pull requests remain listed as open until they are reviewed.
- Atomicity: CI allows multiple integrations with the trunk, even when the first ones do not add value but are part of a refactoring. Pull requests are usually feature complete as the interactions with the trunk are human-based and costly.
Judging by their success, Pull Requests are the next best thing in open source: they allow for an orderly process of review and integration of finished code into the project's mainline. Continuous Integration is difficult to get to work without a team that works together, in similar hours and with goals shared on a Scrum board; pull requests allow distributed development by volunteers in spare time, while maintaining code quality.
In proprietary code
Pull requests, however, are also employed to develop proprietary code. The main argument for them is that they put the quality of the code ahead of development speed, as I read in an account of the experiences of the Google iOS team (can't find the link anymore unfortunately).
If you do not pair program all the time like XP teams, pull requests are a way to get your code reviewed by another person before it gets into production, and so to raise the quality of the application. With this goal in mind you have to trade away some speed, because each pull request adds cycle time: another person has to see and approve your commits.
You can wait for a day, or at least until the next stand-up meeting, before enough people approve the code (the quorum required can be from one to three team members, leaving aside the fact that a +1 given after reading the code is worth less than the constant review of pair programming).
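The quorum rule described above can be sketched as a small merge gate. This is only an illustration: the function name `has_quorum`, the reviewer names, and the choice to ignore the author's own +1 are assumptions of this sketch, not something a specific tool prescribes.

```python
from typing import Iterable


def has_quorum(author: str, approvers: Iterable[str], quorum: int = 2) -> bool:
    """Return True when enough distinct reviewers have approved.

    Assumption of this sketch: the author's own +1 does not count,
    and repeated approvals by the same person count only once.
    """
    distinct_reviewers = {name for name in approvers if name != author}
    return len(distinct_reviewers) >= quorum


# Hypothetical team members, for illustration only.
print(has_quorum("alice", ["bob", "carol"], quorum=2))         # True
print(has_quorum("alice", ["bob", "bob", "alice"], quorum=2))  # False
```

A higher quorum buys more eyes on the change at the price of a longer wait, which is the trade-off between quality and cycle time discussed here.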
Pull requests have little overhead, however, as they can be dealt with in sequence as a list of notifications, and each can be arbitrarily small. I think it's good to organize commits in pull requests because the process forces you to identify the reasons behind a code change and to explain it to a colleague.
However, the review takes place before the code gets into CI and into the environments where end-to-end and other slower tests are run, so you may find surprises later than usual, only after the merge of the pull request. I know of no one who runs the whole pipeline (end-to-end tests with other projects, load or cross-browser tests) on feature branches, only the usual test suite. It would be infeasible if only for the machine time required, and it would have to be constantly re-run after commits on the trunk to keep feature branches a fast-forward.
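To make the fast-forward condition concrete: a feature branch merges as a fast-forward only while the current tip of the trunk is already part of the branch's history. The sketch below models this on a toy commit graph; the commit names and the helper functions are hypothetical. In a real repository the same question can be asked with `git merge-base --is-ancestor <trunk> <feature>`.

```python
from typing import Dict, List

# Toy commit graph: each commit id maps to the list of its parent ids.
# All commit names here are hypothetical.
History = Dict[str, List[str]]


def is_ancestor(history: History, ancestor: str, descendant: str) -> bool:
    """Return True if `ancestor` is reachable from `descendant` via parent links."""
    stack = [descendant]
    seen = set()
    while stack:
        commit = stack.pop()
        if commit == ancestor:
            return True
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(history.get(commit, []))
    return False


def can_fast_forward(history: History, trunk_tip: str, feature_tip: str) -> bool:
    """The merge is a fast-forward when the trunk tip is in the feature's history."""
    return is_ancestor(history, trunk_tip, feature_tip)


history = {
    "a": [],
    "b": ["a"],    # trunk tip when the feature branch was created
    "f1": ["b"],   # first commit on the feature branch
    "f2": ["f1"],  # second commit on the feature branch
    "c": ["b"],    # a later commit on the trunk
}

print(can_fast_forward(history, "b", "f2"))  # True: trunk has not moved
print(can_fast_forward(history, "c", "f2"))  # False: trunk moved on to c
```

As soon as someone commits `c` on the trunk, the branch stops being a fast-forward and any pipeline run on it no longer reflects what will exist after the merge, which is why the slow tests would have to be repeated.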
Pull requests can easily require 15 days before being ready, for medium-sized stories. Trying to break down stories as much as possible can help. Otherwise, you're continuously deploying every day, but releasing just the small stuff instead of the riskier code that sits in branches.