For the last three years, I've been working in the OpenStack Telemetry team at eNovance and then at Red Hat. Our mission is to maintain the OpenStack Telemetry stack, both upstream and downstream (i.e., inside Red Hat products). Besides the technical challenges, the organization of the team always have played a major role in our accomplishments.
Here, I'd like to share some of my hindsight with you, faithful readers.
Meet the Team
The team I work in changed a bit during those three years, but the core components always have been the same: a few software engineers, a QE engineer, a Product Owner, and an engineering manager. That meant the team size has been always been between six and eight people.
I cannot emphasize enough how much team size is important. Not having more than eight persons in a team fits with the two pizzas rule from Jeff Bezzos, which turned out to be key in our team composition.
The group dynamic that is applied by teams not bigger than this is excellent. It offers the possibility to know and connect with everyone. Each team member has only up to seven people to talk to on a daily basis, which means only 28 communication axes between people. Having a team, for example, of 16 people means 120 different links in your team. Double your team size, and multiply by 4 your communication overhead. My practice shows that the fewer communication axes you have in a team, the less overhead your team will have and the swifter your team will be.
All team members being remote workers make it even more challenging to build relationship and bound. We had the opportunity to get to know each other during the OpenStack summit twice a year, and doing regular video conferences via Google Hangout or BlueJeans also really helped.
The atmosphere you set up with your team will also forge the outcome of your team work. Run your team with trust, peace, and humor (remember, I'm on the team!) and awesome things will happen. Run your team with fear, pressure, and finger-pointing, and nothing good will happen.
There's little chance that when a team is built, everyone will be on the same level. We were no exception. We had more and less experienced engineers. But the most experienced engineers took the time needed to invest and mentor the less experienced. That also helped to build trust and communication links between members of the team. In the long run, everyone is getting more efficient; the less experienced engineers are getting better and the more experienced can delegate a lot of stuff to their fellows.
Then they can chill or work on bigger stuff. Win-win.
It's actually no more different than that the way you should run an open-source team, as I already claimed in a previous article on FOSS projects management.
I might be bad at practicing agility; contrary to many people, I don't see agility as a set of processes. I see that as a state of mind, as a team organization based on empowerment. No more, no less.
And each time I meet people and explain that our team is "Agile," they start shivering and explaining how much they hate Sprints, daily stand-ups, Scrum, and planning poker, saying that this is all a waste of time and energy.
Well, it turns out that you can be Agile without all of that.
In our team, we tried at first to run two-week Sprints and used planning poker to schedule our user stories from our product backlog (to-do list). It never worked as expected.
First, most people had the feeling to lose their time because they already knew exactly what they were supposed to do. Having any doubt, they would have just gone and talked to the product owner or another fellow engineer.
Secondly, some stories were really specialized and only one of the team member was able to understand it in details and evaluate it. So, most of the time, a lot of the team members playing planning poker would just vote a random number based on the length of the explanation of the story teller. For example, if an engineer said "I just need to change that flag in the configuration file," then everyone would vote 1. If they started rambling for five minutes about "how the configuration option is easy to switch, but that there might be other things to change at the same time, and things to check for impact bigger than expected, and code refactoring to do," then most people would just announce a score of 13 on that story. Just because the guy talked for three minutes straight and everything sounded complicated and out of their scope.
That meant that the poker score had no meaning to us. We never managed to have a number of points that we knew we could accomplish during a Sprint (the team velocity, as they call it).
The only benefit that we identified from planning poker, in our case, is that it forces people to keep sit down and communicate about a user story. However, it turned out that making people communicate was not a problem that we needed to solve in our team, so we decided to stop doing that. It can be a pretty good tool to make people talking to each other.
Therefore, the two-week Sprint never made much sense as we were unable to schedule our work reliably. Furthermore, doing most of our daily job in open-source communities, we were unable to schedule anything. When sending patches to an upstream project, you have no clue when they will be reviewed. What you know for sure is that in order to maximize your code merge throughput with this high latency of code review, you need to parallelize your patch submission a lot. So, as soon as you receive some feedback from your reviewers, you need to (almost) drop everything, rework your code, and resubmit it.
There's no need to explain what this does not absolutely work with a sprint approach. Most of the Scrum framework lays on the fact that you own workflow from top to bottom, which is far from being true when working in open-source communities.
Daily Stand-Up Meetings
We used to run a daily stand-up meeting every day, then every other day. Doing that remotely kills the stand-up part, obviously, so there is less guarantee the meeting will be short. Considering all team members are working remotely in different time zones, with some freedom to organize their schedule, it was very difficult to synchronize those meetings. With member spread from the U.S. to Eastern Europe, the meeting was in the middle of the afternoon for me. I found it frustrating to have to stop my activities in the middle of every afternoon to chat with my team. We all know the cost of context switching to us humans.
So, we drifted from our 10 minutes daily meeting to a one-hour weekly meeting with the whole team. It's way easier to synchronize for a large chunk of time once a week and to have this high-throughput communication channel.
Our (Own) Agile Framework
Drifting from the original Scrum implementation, we ended up running our own agility framework. It turned out to have similarity with Kanban; you don't always have to invent new things!
Our main support is a Trello board that we share with the whole team. It consists of different columns, where we put card representing small user stories or simple to-do items. Each column is the state of the current card, and we move them left to right. The columns are as follows:
- Ideas. Where we put things we'd like to do or dig into, but there's no urgency. It might lead to new, smaller ideas in the To-Do column.
- To-Do. Where we put real things we need to do. We might run a grooming session with our product manager if we need help prioritizing things, but it's usually not necessary.
- Epic. Here, we create a few bigger cards that regroup several To-Do items. We don't move them around, we just archive them when they are fully implemented. There are only five to six big cards here at max, which are the long-term goals we work on.
- Doing. Where we move cards from To-Do when we start doing them. At this stage, we also add people working on the task to the card, so we see a little face of people involved.
- Under review. 90% of our job being done upstream, we usually move cards done and waiting for feedback from the community to this column. When the patches are approved and the card is complete, we move the card to Done. If a patch needs further improvement, we move back the card to Doing and work on it, and then move it back to Under review when resubmitted.
- On hold or blocked. Some of the tasks we work on might be blocked by external factors. We move cards there to keep track of them.
- Done during week #XX. We create a new list every Monday to stack our done cards by week. This is just easier to display and it allows us to see the cards that we complete each week. We archive lists older than a month, from time to time. It gives a great visual feedback and what has been accomplished and merged every week.
We started to automate some of our Trello workflow in a tool called Trelloha. For example, it allows us to track upstream patches sent through Gerrit or GitHub and tick the checkbox items in any card when those are merged.
We actually don't put many efforts on our Trello board. It's just a slightly organized chaos, as are upstream projects. We use it as a lightweight system for taking notes, organizing our thoughts, and letting us know what we're doing and why we're doing it. That's where Trello is wonderful because using it has a very low friction: creating, updating, and moving cards is a one-click operation.
One bias of most engineers is to overthink and overengineer their workflow, trying to rationalize it. Most of the time, they end up automating, which means building processes and bureaucracy. It just slows things down and builds frustration for everyone. Just embrace chaos and spend time on what matters.
Most of the things we do are linked to external Launchpad bugs, Gerrit reviews, or GitHub issues. That means the cards in Trello carry very little information, as everything happens outside, in the wild Internet of open-source communities. This is very important, as we need to avoid any kind of retention of knowledge and information from contributors outside the company. This also makes sure that our internal way of running does not leak outside and (badly) influence outside communities.
We also run a retrospective every two weeks, which might be the only thing we kept from the Scrum practice. It's actually a good opportunity for us to share our feelings, concerns or jokes. We used to do it using the six thinking hats method, but it slowly faded away. We now use a different Trello board with those columns:
- Hopes and wishes
- Puzzles and challenges
- To improve
- Action items
All teammates fill the board with the card they want, and everyone is free to add themselves to any card. We then run through each card and let people who added their name to it talk about it. The Action Items column is usually filled as we speak and discover we should do things. We can then move cards created there to our regular board, in the To-Do column.
Sure, people have different roles in a team, but we dislike bottleneck and single points of failure. Therefore, we are using an internal mailing list where we ask people to send their request and messages to. If people send things related to our team job to one of us personally, we just forward or Cc the list when replying so everyone is aware of what one might be talking about with people external to the team.
This is very important, as it emphasizes that no team member should be considered special. Nobody owns more information and knowledge than others, and anybody can jump into a conversation if it has valuable knowledge to share.
The same applies for our internal IRC channel.
We also make sure that we discuss only company-specific things on this list or on our internal IRC channel. Everything that can be public and is related to upstream is discussed on external communication medium (IRC, upstream mailing list, etc). This is very important to make sure that we are not blocking anybody outside the Red Hat to join us and contribute to the projects or ideas we work on. We also want to make sure that people working in our company are no more special than other contributors.
We're pretty happy with our set-up right now, and the team runs pretty smoothly since a few months. We're still trying to improve, and having a general sense of trust among team members make sure we can openly speak about whatever problem we might have.
Feel free to share your feedback and own experience of running your own teams in the comment section.