SRE vs. DevOps
In this article, you will gain an understanding of the distinctions between Site Reliability Engineering (SRE) and DevOps.
This is a question that I hear on a fairly regular basis, not just internally but from external customers as well. So I would like to walk you through it so that you can figure out what makes sense in your organization, and I think the answer is probably going to surprise you a little bit.
I think the most important thing to understand is that this isn’t a versus question. You don’t have to have one or the other. As a matter of fact, I would argue, and I think many people would agree, that SRE is an essential component of DevOps, and that a properly implemented DevOps method leads to the necessity of SRE when it comes to deployment. They are two sides of the same coin, which naturally leads to a little confusion. DevOps is the development methodology; it’s all about integrating your development teams and your operations teams. It’s about knocking down the silos between them. It’s about ensuring that everybody is singing from the same songbook, and that’s very important. SRE is in charge of automating everything it can and making sure that you stay up.
They are really two parts of the same group, so let’s look at the differences, because they do have some. Probably the first and largest one is who does the core development. On the DevOps side, your developers in particular are doing the core development. They are answering the question, “What do we want to do?” They are working with product, with sales, and with marketing to design, develop, and deploy. What is it that we do? They’re working on the core.
On the other hand, SRE is not working on the core development. What they are working on is the implementation of the core: they are working on the deployment, and they are constantly giving feedback to that core development group to say, “Hey, something that you have designed isn’t working exactly the way you think it is.” If you want to think about it this way, DevOps is trying to develop; SRE is figuring out how we deploy, maintain, and run what was developed to solve this problem. It’s theoretical versus practical. Ideally, they’re talking to each other every day, because SRE should be logging defects and filing tickets back with development. Most importantly, though, both need to understand that they have the same goals. These groups should never be aligned against one another, so they do have to have a common understanding.
Now for the most important part: failure. Failure is not necessarily failure; it’s just a way of life. It doesn’t matter what you deploy or how well the deployment goes; failure will happen. That is why there is a failure budget, or error budget: an amount of things going wrong that is accepted up front. When it comes to failure, the SRE team is going to anticipate it, monitor it, log it, and record everything, and ideally they can identify a failure before it happens. They’re going to have predictive analytics that say, “All right, this thing is going to go bad based on what we’ve seen before.” So SRE is responsible for mitigating some of those failures through the preemptive work of monitoring and logging. SRE is also going to lead all of your incident management after an actual failure. They’re going to get you through the incident to begin with, and then they’re going to hot wash it (review it right afterward). When that’s done, you have to get Dev involved, because these are the people who are going to solve the core problem, although some root causes might be fixed by SRE internally. Then the SRE team will integrate the fix into their monitoring and logging efforts to make sure that we don’t end up doing another RCA (root cause analysis) for the same kind of problem.
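To make the error budget idea a little more concrete, here is a minimal sketch of the arithmetic, assuming the budget is counted in failed requests against a success-rate target. The class name `ErrorBudget`, its fields, and the 99.9% target are illustrative assumptions, not something the article or any particular tool prescribes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ErrorBudget:
    """Hypothetical request-based error budget for one reporting window."""

    slo_target: float      # assumed target, e.g. 0.999 = 99.9% of requests succeed
    total_requests: int    # requests served in the window
    failed_requests: int   # requests that failed in the window

    @property
    def allowed_failures(self) -> float:
        # The budget: the share of requests that is allowed to fail.
        return (1.0 - self.slo_target) * self.total_requests

    @property
    def remaining(self) -> float:
        # Negative once more failures have occurred than the budget allows.
        return self.allowed_failures - self.failed_requests

    @property
    def consumed_fraction(self) -> float:
        # Share of the budget already spent (1.0 means fully spent).
        if self.allowed_failures == 0:
            return float("inf") if self.failed_requests else 0.0
        return self.failed_requests / self.allowed_failures


# Example window: one million requests, 250 of them failed.
budget = ErrorBudget(slo_target=0.999, total_requests=1_000_000, failed_requests=250)
print(f"allowed failures: {budget.allowed_failures:.0f}")
print(f"remaining budget: {budget.remaining:.0f}")
print(f"budget consumed:  {budget.consumed_fraction:.0%}")
```

With these made-up numbers, about a quarter of the budget is spent. A team might watch `consumed_fraction` over time and slow down risky releases as it approaches 1.0, but where to draw that line is a policy decision for each organization.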
There are different skill sets. In core development on the DevOps side, these are the people who really love writing software. SRE takes a more investigative mindset, right? You have to be willing to go and do the analysis, figure out what has gone wrong, and automate everything. But there’s a lot that they have in common. Everyone should be writing automation, and everyone should get rid of toil as much as possible, because we just don’t have the time to do manual tasks when we can put the computers in charge of them. Computers are not great at thinking on their own, but if you need the same thing done repeatedly, you can’t beat them. So automation is key for both; each group just has a slightly different focus. DevOps is going to automate deployment, tasks, and features. SRE will automate redundancy, and it will turn manual tasks into programmatic ones to keep the stack up.
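As one small illustration of turning a manual task into a programmatic one, the sketch below automates the "check each replica and pick one that works" step a person might otherwise do by hand during a failover. The function name `first_healthy`, the replica names, and the health probe are all hypothetical; a real setup would plug in whatever health check its own stack provides.

```python
from typing import Callable, Iterable, Optional


def first_healthy(
    replicas: Iterable[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first replica whose health probe passes, or None if none do."""
    for replica in replicas:
        try:
            if is_healthy(replica):
                return replica
        except Exception:
            # A probe that raises is treated the same as an unhealthy replica.
            continue
    return None


# Usage with a stand-in probe; the names and statuses here are made up.
status = {"db-primary": False, "db-replica-1": True, "db-replica-2": True}
target = first_healthy(status.keys(), lambda name: status[name])
print(target)  # db-replica-1
```

Passing the probe in as a function keeps the selection logic testable without a live system, which is usually the first step in replacing a runbook entry with code.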
Published at DZone with permission of Pradeep Gopalgowda. See the original article here.