Four Points You Should Consider for Scaling DevOps
In this article, we’ll explore four critical components for bringing DevOps practices to your wider organization: customer listening, teams, quality, and deployment.
DevOps is commonly described as the union of people, process, and products that enables continuous delivery of value to customers. Many teams are experimenting with these principles at scale, but how do you go from a handful of teams to the whole company? In this article, we’ll explore four critical components for bringing DevOps practices to your wider organization: customer listening, teams, quality, and deployment.
Scaling Teams
Teams are the biggest asset in a DevOps transformation. Giving a certain amount of freedom to your teams, while ensuring everybody is going in the right direction, is one of the secrets (and challenges) of high-performing teams. Let’s see what we can do to get there.
Being agile requires mixing two notions that may seem contradictory: alignment and autonomy. Autonomy is crucial for a team to be efficient. At the same time, enterprises need alignment across the organization to meet business objectives.
Team structure is key to achieving autonomy and efficiency. Creating “vertical teams,” often called feature teams, is a critical step in your teams gaining autonomy. Of course, the bigger your solution is, the bigger the need for cross-collaboration and visibility, and that’s where tools can help you (like Delivery Plans). There isn’t a silver bullet, and your organization should periodically evaluate where it stands, evolving to suit your context, customer needs, and goals.
Defining the Right Objectives and Key Results
We know that setting goals is vital for success. What is more difficult is knowing how to set and use those goals. Traditionally, leadership teams set the goals, and they’re used almost exclusively to determine “performance” and hand out bonuses. However, these aren’t necessarily shared goals or something every team member can unite around. In my current role, we use Objectives and Key Results (OKRs) at the organizational, team, and individual level.
Below are a few things to keep in mind as you set your objectives or “goals” — regardless of what you call them:
- Focus on a shared objective(s) that clearly articulates your company or departmental mission, making it easy for any individual to understand what you’re trying to accomplish as a company.
- Encourage each team to define additional objectives that contribute to your larger company or departmental objective(s).
- Be flexible. Some objectives will be measurable (e.g., to have fewer bug reports this quarter than the quarter before), but others may not be. Are you planning to refactor a huge part of your stack? “Creating at least six architecture proposals” is a precise objective, but it focuses on a target instead of the real purpose. “Emerge with the best architecture for our needs and context” may be less precise, but it is much more aspirational.
From there, you have a framework and can reevaluate throughout the year to allow for updates and improvements.
Changing the HR Landscape to Better Support DevOps
What happens after you’ve empowered your teams and set great primary objectives and key results? The ripple effect of these changes will span a wide range of your business. You need to make sure your Human Resources department is along for the ride. They can help you adapt the corporate context by changing the performance review cadence to be more flexible, or by introducing new metrics or new tools. They can also share their expertise in, well, human resources! I went through this process in a prior role, and we ended up merging part of HR into the Engineering organization, and even applying Agile principles to their recruitment process.
Scaling Customer Listening
In this context, “customer listening” is establishing an ongoing connection with your customers to ensure you’re delivering value to them. Depending on your product, this can take many forms, but let’s look at a few metrics that can apply regardless of your scale.
What to Measure
As the old adage warns, “you get what you measure.” With that in mind, define metrics that capture what your customers need and how well you’re listening and responding to those needs.
Here are some examples:
- Feature-level churn: Analyze whether customers keep using a feature to learn about that feature’s ongoing value. In other words, what percentage of users activate or adopt a feature, then abandon it?
- Service Level Agreement (SLA) per customer: An SLA is a contract; evaluate your product’s SLA, but also identify the SLA per customer (or the number of customers affected by a live incident). Business continuity is one of the most universal requirements for your customers, and understanding your performance here will also help during customer and sales conversations.
- Percent of features created from customer ideas or requests: Use this as an indicator, not necessarily a target. You likely will not (or will not want to) implement all the features your customers request, but it’s important to understand what percentage of these requests you do end up implementing.
- Time-to-learn: Track the time it takes to collect useful data about how users engage with newly shipped or updated features and products. I remember a feature I ported over from one mobile platform to another. Due to the sprint’s duration, we shipped half of the feature implementation. One week later, we discovered user engagement was better with half of the feature than the complete feature! Due to our traffic on this feature and our effective telemetry pipeline, we only needed a week before we knew where to stop to get the best effort/user value ratio.
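As a rough illustration of the first metric, feature-level churn can be computed from usage telemetry. The sketch below is only one possible definition: the `UsageEvent` shape, the `feature_churn` function, and the 30-day inactivity window are all assumptions made for this example, not something prescribed by any particular telemetry tool.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class UsageEvent:
    """One recorded use of a feature by a user (hypothetical telemetry shape)."""
    user_id: str
    feature: str
    day: date


def feature_churn(events, feature, as_of, inactive_after=timedelta(days=30)):
    """Return the share of adopters who have stopped using `feature`.

    An adopter is any user with at least one event for the feature on or
    before `as_of`. An adopter counts as churned when their most recent
    event is older than `inactive_after` (an assumed threshold).
    """
    last_seen = {}
    for event in events:
        if event.feature != feature or event.day > as_of:
            continue
        previous = last_seen.get(event.user_id)
        if previous is None or event.day > previous:
            last_seen[event.user_id] = event.day

    if not last_seen:
        return None  # nobody adopted the feature, so churn is undefined

    churned = sum(1 for day in last_seen.values() if as_of - day > inactive_after)
    return churned / len(last_seen)
```

The right inactivity window depends on how often a feature is naturally used; a monthly report and a daily dashboard would need very different thresholds.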
Get These Measurements Into the Development Process
When you’ve managed to get your customer listening metrics in place, the next step is to incorporate them into your development process. For example, if a user story originates from a customer discussion, customize your work item tool to track the feedback source: customer name, feedback medium (such as customer support, public dedicated website, or in-product feedback).
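To make the idea of tracking the feedback source concrete, here is a minimal sketch of a work item that carries that information, along with the "percent of features from customer requests" indicator derived from it. The `WorkItem` and `FeedbackSource` types and their field names are hypothetical; a real work item tool would expose its own custom fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FeedbackSource:
    """Where a request came from (hypothetical fields)."""
    customer_name: str
    medium: str  # e.g. "customer support", "public website", "in-product"


@dataclass(frozen=True)
class WorkItem:
    title: str
    shipped: bool
    source: Optional[FeedbackSource] = None  # None means the idea was internal


def percent_from_customers(items):
    """Percentage of shipped work items that originated from customer feedback."""
    shipped = [item for item in items if item.shipped]
    if not shipped:
        return 0.0
    from_customers = sum(1 for item in shipped if item.source is not None)
    return 100.0 * from_customers / len(shipped)
```

Because the source is stored on the item itself, the indicator can be recomputed for any sprint or quarter without a separate tracking spreadsheet.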
You can extend this to other development techniques, like feature flagging or A/B testing. If you want to change a user flow in your app, start by rolling it out to a small fraction of your users. After a set period, check whether it has affected feature-level churn, and if the effect is positive, expand the rollout. Dividing tasks among multiple sprints is not always easy, but it can be a great opportunity to refine feature development.
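One common way to expose a change to a small fraction of users is deterministic hash bucketing, sketched below. This is an illustration under assumptions, not the API of any specific feature-flag product: the `in_rollout` function and the flag name used with it are made up for this example.

```python
import hashlib


def in_rollout(user_id: str, flag: str, percent: float) -> bool:
    """Decide whether a user falls inside the first `percent` of users for a flag.

    The decision is deterministic: the same user and flag always land in the
    same bucket, so a user does not flip between old and new flows.
    """
    digest = hashlib.sha256(f"{flag}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # bucket in 0..9999
    return bucket < percent * 100  # e.g. 5.0 percent covers buckets 0..499
```

Including the flag name in the hash means each flag selects a different slice of users, and raising `percent` later only adds users: everyone already in the rollout stays in it.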
Ship Features the User Will See, or GIF-Based Development
One of the key aspects of DevOps is to build features for the customer and/or user. However, in larger engineering teams, individual teams may not feel like they have a direct impact on this (or, even worse, they’ll feel that if their team doesn’t deploy customer-facing updates this sprint, another team will, so it’s okay).
Enter “GIF-based development”: at the end of each sprint, each team captures a mini-video (or GIF) of a customer-facing update (something that delivers value to the end user). It may be more difficult for some teams, but the overall goal is to show how you improved something for others. Creative teams may look at process improvements that made other teams faster or more customer-focused, which in turn impacts end users. The “what” isn’t as important; simply introducing the concept leads to teams spending more time in the planning phase to ensure that they’re delivering value to your customers and impacting your overall metrics and goals.
Scaling Quality
Deploying frequently requires confidence in what you’re deploying and in your deployment process. You can’t have a test suite that takes nights (or days!) to run. Rather than focusing on integration testing, which sits at the right, or end, of the development timeline, put more effort into lower-level tests such as unit tests, which sit at the left, or beginning, of the process.
If you handle your test code the same as production code and run these tests at each phase — from pull requests to production systems — you’ll catch errors earlier in the process. Integration testing is still useful, but catching small issues with unit tests early saves time and resources later.
Quality is a team sport: everyone should be involved and committed to delivering quality code. Code reviews can be a great way to work on that, but not in the way you may think. A Microsoft study found that only about 15% of comments indicate a possible bug in code. If catching bugs is not the primary benefit, what is? From the same study, at least 50% of all comments give feedback related to the code’s long-term maintainability. Code reviews are one of the greatest ways to disseminate business and technical expertise to the whole team. Junior developers can learn a lot just by reading them, so who reviews which change can have a tremendous impact.
Scaling Deployments: Understand Your Blast Radius...
Shifting automated deployment from a small team to the whole company also means that if something goes wrong, the impact (or blast radius) becomes significantly larger. For each deployment pipeline, which may deploy work from several teams or microservices, you need to understand your blast radius: How many customers can be impacted? How long would it take you to revert to the prior, stable version?
Testing, user feedback, and telemetry can help frame your view, but this isn’t one-size-fits-all. Each organization needs to define its own risk tolerance, then evaluate each pipeline against it.
...and Control It
Once the blast radius for each pipeline is defined, you can put strategies in place to manage it. One well-known tactic is to build a ring-based deployment pipeline: you deploy new releases in rings, not all at once. Like the target in a game of darts, the outer rings impact a broader range of users, while the inner rings impact a concentrated group. The broader the ring, the greater the blast radius.
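The ring idea can be sketched in a few lines. The example below is a simplified illustration, not a real pipeline: the `Ring` type, the `roll_out` function, and the `deploy` and `healthy` callbacks are hypothetical stand-ins for whatever your deployment tooling and telemetry provide.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Ring:
    name: str
    user_share: float  # fraction of all users reached once this ring is live


def roll_out(
    rings: List[Ring],
    deploy: Callable[[Ring], None],
    healthy: Callable[[Ring], bool],
) -> List[str]:
    """Deploy ring by ring, smallest audience first.

    Stops at the first ring whose health check fails, so broader rings are
    never touched. Returns the names of the rings that passed.
    """
    completed = []
    for ring in sorted(rings, key=lambda r: r.user_share):
        deploy(ring)
        if not healthy(ring):
            break  # contain the blast radius to the rings deployed so far
        completed.append(ring.name)
    return completed
```

In practice the health check would wait for a soak period and compare telemetry against a baseline before the next ring is allowed to start.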
Another way to control blast radius is to require a different quality bar for different types of deployments. Want to deploy a minor feature update? A single-person code review and a successful build are enough. Want to deploy a DNS update or change a path in the authentication flow? Make sure that’s reviewed by several people from different teams.
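A policy like this can be encoded as a simple deployment gate. The sketch below assumes a made-up `Change` record and illustrative thresholds (one approval for ordinary changes, two approvers from two teams for high-risk ones); your own categories and numbers will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    kind: str            # e.g. "feature", "dns", "auth" (hypothetical labels)
    approvals: int       # number of approving reviewers
    reviewer_teams: int  # number of distinct teams among those reviewers
    build_passed: bool


# Assumed set of change types that warrant the stricter review.
HIGH_RISK_KINDS = {"dns", "auth"}


def may_deploy(change: Change) -> bool:
    """Apply a stricter review requirement to changes with a larger blast radius."""
    if not change.build_passed:
        return False
    if change.kind in HIGH_RISK_KINDS:
        return change.approvals >= 2 and change.reviewer_teams >= 2
    return change.approvals >= 1
```

Keeping the rule in code (or in pipeline configuration) makes the risk tolerance explicit and reviewable, rather than leaving it to habit.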
Even if you’re hosted on premises, applying some of the Cloud Design Patterns can help you create more resilient services, and limit the impact of an outage.
In closing, scaling DevOps practices organization-wide is not a project. It’s a never-ending journey. Getting teams aligned, maintaining autonomy, providing goals and metrics, and focusing on customers’ needs are a few challenges that you’ll face — but these best practices and fundamentals will also set you up for scale and continuous, compounding improvements.
This article was originally published in our Scaling DevOps Trend Report.