Trunk-Based Development at Facebook
TechCrunch has an article called The Next 6 Months Worth Of Features Are In Facebook’s Code Right Now, But We Can’t See. That’s a great title. Never mind that the article is nearly two years old. Linked from it is a 52-minute video that should be mandatory viewing for all developers and their management, even if they decide afterwards not to do the same. It’s all about toggles, Trunk Based Development (TBD) and Branch by Abstraction. Facebook’s Chuck Rossi talks us through the contract and understanding between developers and release engineers concerning releases.
Facebook release/change wisdom
All quotes from the video:
“The business Requires Change … yet change is the cause of most problems”
“We have two options:
- we can discourage change in the interests of stability,
- or we can allow change to happen as often as it needs to”
“Lowering the risk of change via Tools and Culture”
How I interpret what they do
Here is a branch diagram for just over a week at Facebook. It’s my guess of course:
Key to the diagram: one symbol marks a branch as it’s made, the other marks a merge (cherry-pick).
Here’s a screen-cap for their description of the pushes:
It is from that, and the accompanying talk, that I’ve made my version of the branch diagram.
The importance of cherry-picks
Four red commits (our numbers are contrived) are merged to the main Tuesday release; these constitute “production hardening”. They were cherry-picked, meaning other commits are left on trunk. Three Blue commits made it to the Wednesday release via a cherry-pick merge. Similarly, two Purple ones are for the Thursday release, and one Brown one for the Friday release. Saturday and Sunday have no planned releases. On Sunday, however, it all starts again as a new release branch is cut. There’s also that Monday release (the last one on the old branch), which has the fewest possible commits cherry-pick merged to it. One presumes that they can skip releases if there’s nothing to go out.
Note the merge directions
Things happen on Trunk and get merged to a release branch, if they get merged at all. Things can be regular enhancements, or can be defect fixes.
One might think that defects should be fixed on the release branch and merged back to the trunk. That should only be the case if it is impossible to reproduce the same bug on the trunk. The reason you want to do it the way shown is that you want zero regressions. A fix made on the trunk and then cherry-picked to a release branch can’t go missing later: every subsequent release branch is cut from the trunk, so it already contains the fix.
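To make the merge direction concrete, here is a toy model in Python. It is only a sketch: the Branch class, the commit names and the release names are all invented for illustration, and real version control tracks far more than a list of ids. It shows a fix landing on trunk, being cherry-picked to the current release branch, and then turning up in the next release branch without anyone doing anything.

```python
from dataclasses import dataclass, field


@dataclass
class Branch:
    """A hypothetical, much simplified branch: an ordered list of commit ids."""
    name: str
    commits: list = field(default_factory=list)

    def commit(self, commit_id):
        self.commits.append(commit_id)

    def cut_release(self, name):
        # A release branch starts as a copy of everything on this branch so far.
        return Branch(name, list(self.commits))

    def cherry_pick(self, source, commit_id):
        # Copy one chosen commit from the source branch, leaving the rest behind.
        if commit_id not in source.commits:
            raise ValueError(f"{commit_id} is not on {source.name}")
        if commit_id not in self.commits:
            self.commits.append(commit_id)


trunk = Branch("trunk")
trunk.commit("feature-a")
release_1 = trunk.cut_release("release-1")

# Work continues on trunk: a defect fix and an unrelated enhancement.
trunk.commit("fix-login-bug")
trunk.commit("feature-b")

# Only the fix is cherry-picked to the current release branch.
release_1.cherry_pick(trunk, "fix-login-bug")

# The next release branch is cut from trunk, so the fix is already there.
release_2 = trunk.cut_release("release-2")

print(release_1.commits)  # ['feature-a', 'fix-login-bug']
print(release_2.commits)  # ['feature-a', 'fix-login-bug', 'feature-b']
```

Had the fix been made on release-1 instead, release-2 would only contain it if somebody remembered to merge it back to trunk.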
Inevitability of release
If you’re a developer and you’ve committed and pushed to the trunk, your stuff is going to go live in at most seven days. You don’t have to nurse things through the merge-to-release branch workflow; you can just wait for it to go live anyway.
Not illustrated in the branch diagram
Culture & Developer responsibility
Developers don’t break the build when they commit to trunk. If they do, their commits are automatically rolled back. They can fix the mess in their own time. To achieve this, Facebook must have a serious Continuous Integration infrastructure.
If you didn’t watch the video, you need to know that developers have to stick around in IRC for the merge moments. That is following their declaration of intent for their commits to be cherry-picked into the existing release branch. This is a Facebook rule, and other companies doing TBD might not do the same. Indeed, for others it might be release engineers doing the merges, based on a known list that will essentially patch a release branch that was not expected to have a point release. It’s that Facebook kinda know in advance they have enough for releases on Wednesday through to the last release of the old branch on Monday.
Facebook calls these Gatekeepers (22 mins into the video)
If you’re working on something and it’s not ready to go out yet, then you’ll be wrapping the UI and logic in a ‘Feature Toggle’. It could be that the release engineers will toggle things on or off themselves, especially if marketing are driving the release moments. As likely is the scenario that you’re going to need another week or two to complete the mini-project.
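A feature toggle can be as small as a named boolean checked at the point where the new UI or logic would kick in. The sketch below is not Facebook’s Gatekeeper, whose internals the talk does not spell out; the Toggles class, the render_profile function and the "new_timeline" name are all made up for illustration.

```python
class Toggles:
    """Hypothetical in-memory toggle store; a real one would sit behind config or a service."""

    def __init__(self, states=None):
        self._states = dict(states or {})

    def is_on(self, name):
        # Unknown toggles default to off, so unfinished work stays hidden.
        return self._states.get(name, False)

    def set(self, name, enabled):
        self._states[name] = bool(enabled)


def render_profile(toggles):
    """Return the page sections to show; the unfinished one is wrapped in a toggle."""
    sections = ["header", "photos"]
    if toggles.is_on("new_timeline"):
        sections.append("new_timeline")
    else:
        sections.append("old_wall")
    return sections


toggles = Toggles()
print(render_profile(toggles))  # ['header', 'photos', 'old_wall']

# A release engineer (or marketing's chosen moment) flips the toggle.
toggles.set("new_timeline", True)
print(render_profile(toggles))  # ['header', 'photos', 'new_timeline']
```

The half-built code ships in every release, but nobody sees it until the toggle is flipped.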
Feature toggles, together with the implicit Branch by Abstraction for code that replaces a previous implementation of something, mean you’ve avoided a feature branch and its nebulous cost.
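For readers who haven’t met Branch by Abstraction, here is a minimal sketch of the idea. Callers depend on an abstraction, the old and new implementations live side by side on trunk, and one switch picks between them. Every name here (MessageStore, LegacyMessageStore, NewMessageStore, make_store) is hypothetical and chosen only for the example.

```python
from abc import ABC, abstractmethod


class MessageStore(ABC):
    """The abstraction that calling code depends on (a hypothetical example)."""

    @abstractmethod
    def save(self, user, text):
        ...

    @abstractmethod
    def messages_for(self, user):
        ...


class LegacyMessageStore(MessageStore):
    """The existing implementation: one flat list of rows."""

    def __init__(self):
        self._rows = []

    def save(self, user, text):
        self._rows.append((user, text))

    def messages_for(self, user):
        return [text for owner, text in self._rows if owner == user]


class NewMessageStore(MessageStore):
    """The replacement, built up on trunk behind the same abstraction."""

    def __init__(self):
        self._by_user = {}

    def save(self, user, text):
        self._by_user.setdefault(user, []).append(text)

    def messages_for(self, user):
        return list(self._by_user.get(user, []))


def make_store(use_new_store):
    # The single place where old and new are chosen; callers never know which.
    return NewMessageStore() if use_new_store else LegacyMessageStore()


def post_and_read(store):
    store.save("alice", "hello")
    store.save("bob", "hi")
    store.save("alice", "again")
    return store.messages_for("alice")


print(post_and_read(make_store(use_new_store=False)))  # ['hello', 'again']
print(post_and_read(make_store(use_new_store=True)))   # ['hello', 'again']
```

Once the new implementation has proved itself, the legacy class and the switch can be deleted, all without a long-lived branch.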
They also have the ability within one released version to set percentages of the public that can see a feature. Toggles are not just two-state for Facebook.
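One common way to get a percentage rollout is to hash each user into a stable bucket and compare the bucket with the configured percentage. The video doesn’t say this is how Facebook does it, so treat the following as one plausible sketch; the bucket and is_enabled functions and the feature name are assumptions made for the example.

```python
import hashlib


def bucket(feature, user_id):
    """Map a (feature, user) pair to a stable number from 0 to 99."""
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def is_enabled(feature, user_id, percent):
    # A user sees the feature if their bucket falls below the rollout percentage.
    return bucket(feature, user_id) < percent


# Roughly a tenth of these made-up user ids should see the feature at 10%.
users = range(10_000)
visible = sum(is_enabled("new_timeline", user, 10) for user in users)
print(visible / len(users))
```

Because the bucket depends only on the feature name and the user id, a given user gets the same answer on every request, and raising the percentage only ever adds users to the group that can see the feature.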
Analytics Facebook style
See the defect analytics piece at 21 mins into the video, and performance analytics at 27 mins in.