Building a DevOps Strategy and Environment
Watch or read a discussion with several expert practitioners, discussing DevOps, security, and IT transformations.
The need for accelerated application deployment has given rise to the DevOps movement. However, despite the undoubted benefits that DevOps promises, many IT organizations struggle to understand and get started with a DevOps strategy and project.
In a recent Cloud Luminary Fireside Chat, Bernard Golden explored the topic with two expert practitioners: Mike Kavis, Vice President & Principal Cloud Architect at Cloud Technology Partners, and Matt Stratton, Senior Solutions Architect at Chef.
The webcast focused on the practicalities of implementing a DevOps process, including topics such as: who needs to be involved; identifying what needs to be changed; compliance and security needs; building a DevOps toolchain; and the organizational side of DevOps.
Below is the recording and a transcript of the discussion.
About the Guests
Mike Kavis, VP/Principal Architect at Cloud Technology Partners
Mike has served in numerous technical roles such as CTO, Chief Architect, and VP positions with over 25 years of experience in software development and architecture. Mike is a pioneer in cloud computing, having led a team that built the world’s first high speed transaction network in Amazon’s public cloud, winning the 2010 AWS Global Startup Challenge. He is an analyst and blogger at Forbes, The Virtualization Practice and DevOps.com, and is the author of "Architecting the Cloud: Design Decisions for Cloud Computing Service Models (IaaS, PaaS, SaaS)".
Matt Stratton, Senior Solutions Architect, Chef
Matt Stratton is a solutions architect at Chef, where he demonstrates how Chef’s automation platform provides speed and flexibility to clients’ infrastructure. He is devoted to concepts like Continuous Delivery and Infrastructure as Code, and his license plate actually says “DevOps”. He is the creator and co-host of the popular "Arrested DevOps" podcast.
Matt has over 15 years experience in IT operations, ranging from large financial institutions such as JPMorganChase and internet firms, including Apartments.com. He has given presentations at Microsoft-sponsored events, ChefConf, DevOpsDays, Interop, and various local groups within the Chicagoland area.
He lives in Chicago with his three children, and has an unhealthy obsession with Doctor Who, Firefly, and Game of Thrones.
Who Needs to Be Involved?
Bernard: In previous sessions we’ve talked about enterprise DevOps and the drive for it, some of the challenges. One of the things that we haven’t really talked about is, who does it touch? I recently finished a very well-known book in this area called "The Phoenix Project" by Gene Kim...One of the things that’s interesting is, as they go further and further into this project, they keep finding that there’s more and more groups involved, and there’s more and more things that they hadn’t really been aware of. So let me just kick it off...Who needs to be involved? But also, how do you figure out who needs to be involved?
Matt: ...My one word answer to this one was "everyone", which is actually intended to be glib but it’s somewhat true. The point is, DevOps is an unfortunate name because it seems to imply we’re only talking about those two groups...you said this touches a lot of groups and that’s absolutely true. These lines are blurred. Rather than asking "who needs to be involved" (because I don’t think that it’s the same in every organization)...what’s really important is being able to identify them and bring them in as you can.
There’s a couple of ways you can do that. There’s the ability to do the research of your own to reach out to people, but you need to make your tent big and encourage people to come with you. For example, at GE Capital when they started making this transition, they said, “Okay we’re going to invite people to be part of it." And if security groups, testing groups or different parts of the product groups don’t want to be part of it, that’s okay. But you need to invite people under that...
...Anyone that has a vested interest in the value stream of your product, let’s put it that way. That’s how it is. If you look at the difference between what it takes from a product going from an idea to being in front of a customer, anyone who’s touching it should be part of your DevOps transition because they’re part of that product.
Bernard: ...In the context of DevOps, when you use the term "value stream", what do you mean by that?
Matt: What I mean by this is, I think, very similar to the concept you think about with manufacturing principles. The value stream is, what does it take when I’m creating a thing?...John Willis has talked about this idea, “It’s the time [between] commit to cash”, between when I committed a change into source control and now it’s revenue. That could be a way to describe value stream...
And when you’re doing that kind of mapping...what I’m going to do is think about, “What’s the process in my organization? How do I track that?” In manufacturing, you would track it by saying, “I'm following a car through the plant.” I’m saying, it’s going through each step. That’s its stream before it actually ships.
So how does that happen in your organization when it comes to software? It’s, I’m a product owner, I thought of a feature...What are all the pieces, all the flows that happen before that’s actually in front of a customer of some kind, where I can get feedback as to whether or not my idea was actually valuable?...So everybody who touches it when you track that through, everyone who touches that artifact, they should be involved.
Bernard: Well Mike, let me take the flip side of that...I’ve been involved with organizations that have the challenge of, you invited everyone and there were people who said, “I’m not up for it, I’ve got something more important to do. I’m going to stay home and floss the cat...” Okay, fine you don’t want to be involved. And then when the time comes to really put this into production, they go, “Whoa, wait a minute, you haven’t looked at the security." I’m not going to pick on security, there are other groups...they’ll sort of say, “I’ve got a veto here. I’m not going to accept it.” How do you deal with that?
Mike: Believe me, I’ve seen that a lot and there’s top-down driven initiatives where the C-level people are pushing it down, but most of these are grassroots. Mostly, there's people in the trenches who are tired of "pagers" and cell phones going off and they’re coming from the ground up. They’re the guys and gals fighting that uphill battle because they’re trying to solve their problems, and a lot of other people aren’t really vested in it because the culture is siloed. Basically what you see there is, they’ll create small wins, but to really get to the end-game they have to engage everyone.
When you’re not getting that engagement, you really have to work up the management chain and try to get buy-in, and a lot of times you don’t get it until you see results. So it's this catch-22: you have to do what you can do with what you’ve got, show results, capture your metrics before you implement these methods...and then sell. I mean, you become salespeople.
There’s no easy answer. Every culture is different. I saw one company where, at the beginning, security was saying, “No, no, no”. What they were doing — this was more of a platform team — they were standing up a cloud and they wanted to build guard rails around it and create services for developers. They were using Chef and they automated the whole infrastructure, and the app team that they invited to be their pilot did a great job. They implemented CI and CD. The security team initially was opposed to it, but then they started inviting the security team.
The security team started doing security review at sprint zero, even before the project started...and then they would give them a risk score. And if there was high risk, like if there were mobile payments, they would have to go through a lot of structure but if it was a web page or something, they allowed stuff to skate. They also worked closely with the guys doing Chef, so they put all the security policies in there. And over time, those security guys and girls became the biggest fans of the platform because now they were able to import their policies in an abstracted way and they were getting enforced. So when people came to them and said, “I want to build X, Y, Z” they go, “You can build on that platform. If you're not building on that platform, I don’t want to talk to you.” They turned it around from being the inhibitor to being the biggest fan, but that took a lot of time.
Matt: I just wanted to respond to two things that you said that are really interesting. One is, something that we’ve started to see — it’s absolutely true...a lot of this is groundswell things — but we’re seeing this coming from top down now. The senior IT leadership is saying, "I want this", which is great news. There’s less salesmanship that needs to be done as an internal evangelist or internal stakeholder. The thing that really echoed to me was when you talked about the enablement. Security groups should love automation, and they do. I always like to say, “People can lie; computers don’t lie”. So if an audit is, “Hey Mike, did you do this?" You go, “Sure did!" What good is that? But [automation is] great for that. I’m seeing that in companies when I come in to talk about Chef. More and more of the cyber security or infosec folks are in those conversations.
But what all this comes down to is, what are you implementing? Because you really shouldn’t be implementing DevOps. What you should be doing is solving a problem you have. Your people will be on board and invested if you’re solving a problem for them. A way to do that is, when you bring people under the tent, like Bernard [says], “Hey we’re going to do this DevOps transition. Everybody come and join us” and they’re like, “I’m too busy. This DevOps is a bunch of hooey and hippies and whatever.” If you say, “Hey we’re going to have a brown bag once every week or we’re going to talk about how we can improve our process and we can make things better. Let’s spend the first two brown bags talking about our pain points and figure out the problems to solve", people come along. And it’s interesting because if you look at "The Phoenix Project", the title says something about DevOps, but throughout the whole thing they’re not actually making a DevOps change. They’re trying to solve a problem and they’re implementing it with these things.
Mike: Real quick, I’d like to add two groups that are often forgotten. One is product manager. Typically a product owner is incented to create features and somehow they’re not incented for quality and reliability which are outcomes that we’re trying to improve. So even though we have all these IT people working together collaborating and saying we need to improve these things, if the person prioritizing doesn’t care about that, and they only care about the next feature, then you don’t get the end results. It's too easy to say, "that's QA’s problem", or "that’s Security’s problem". If I’m a product owner building the next BMW, I have to put airbags in there...We should think about IT systems like that too; you own the whole thing.
The second group is Operations: help desk, the service desk. I’ve seen that same company that did a great job automating infrastructure, a great job with CI/CD. They pushed the deploy button and everyone was happy until something broke. They really didn’t think about what happens when something breaks. Now the Dev team is supposed to fix it. Well, they’ve got to create a ticket to go get a log file. They didn’t tie it into Operations. You’ve got to bring those guys and girls in there too.
Matt: Into that part of Operations. That’s actually a really interesting point I haven’t thought about, which is, those grizzled old sys admins like me are always like, “You damn Dev guys are never talking about Ops” and then we come along with DevOps and you’re like, “What about the service desk?” and we’re like, “Those guys? Screw those guys. They’re not real Ops”. I like that.
Bernard: I sometimes think one of the biggest challenges is people’s mindsets...like, it’s always been this way, it’s going to be this way...I was doing a workshop and I start off by talking about what cloud computing is and its characteristics. I started with this definition, “The very first characteristic is self-service...It’s like buying a book on Amazon. You fill out a web form and you hit submit. It goes off and ten minutes later you have resources.”...And the guy says, “Yeah, as soon as the sysadmin reviews it and then executes it.” I said, “Well that’s not really the point of view of this. That gets automated so there’s nobody who needs to be involved. Somebody just fills out a form..." and he says, “Yeah as soon as the sysadmin reviews it and approves it.”...I finally realized this is like Godzilla and Mothra, we’re just going to go on. I said, “Okay let’s go on to characteristic number two” and, thinking about it later, it wasn’t like he was just being difficult. It was like he couldn’t imagine a world where that didn’t have to occur.
Identifying What Needs to Be Changed
Bernard: Within the IT value chain, how do you identify what needs to be changed? They did that in the Phoenix Project. They kept iteratively going through and saying, “This is a bottleneck.” Mike...how do you identify what needs to be changed and how do you identify what should replace it?
Mike: We do exactly what they do in the Phoenix Project. A lot of times when we go into an account, they’re focusing on a single domain...You sit there and you start from requirements to running operations in the back end. You have them walk through the process and you identify bottlenecks. Then the real important part is prioritizing those bottlenecks. The people who have read "The Goal" [know that if] you don’t fix the right bottleneck first, you’re just stacking problems someplace else.
You do those value stream mapping exercises. You interview a lot of people. When we can, we get business people in there as well because they probably have a different story on the bottlenecks than the IT people do and a lot of stuff bubbles up. You prioritize the best you can and you start attacking it.
I wrote a post a while back on the top bottlenecks I’ve seen. The number one I see is inconsistent environments, that’s the top one...The second most popular one I see (and I’m just focusing on the technology ones, there’s definitely organizational ones and process ones) is provisioning time, like a place [where] it takes six months to get a server up. If those are your bottlenecks and you fix those two alone, you can increase productivity so much.
But then there are also often process bottlenecks. Going back to that one client who did all that automation, they still had this legacy ITIL process. ITIL doesn’t need to go away, but they needed to make ITIL agile. So instead of, I automate my deployment but then I’ve got to go to a weekly CAB review to get it approved, hit the button, and there’s 20 other people in line so it takes two weeks...What we did in that case, we automated a lot of the information and approved it automatically — the ones that hit the lower risk profile — and turned a CAB review into a post-mortem review instead of a stop gate. Stuff like that, you look at the process...Why does it take four months to change the firewall, stuff like that.
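The risk-based approval Mike describes can be sketched in a few lines of Python. This is a minimal illustration only: the `ChangeRequest` fields, the scoring weights, and the `LOW_RISK_THRESHOLD` cut-off are all assumptions made up for the example and are not taken from the client he mentions.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """A hypothetical change record; the field names are illustrative."""
    summary: str
    touches_payments: bool
    automated_tests_passed: bool
    has_rollback_plan: bool


def risk_score(change: ChangeRequest) -> int:
    """Assumed scoring scheme: a higher score means a riskier change."""
    score = 0
    if change.touches_payments:
        score += 5
    if not change.automated_tests_passed:
        score += 3
    if not change.has_rollback_plan:
        score += 2
    return score


LOW_RISK_THRESHOLD = 2  # assumed cut-off, not a figure from the discussion


def route(change: ChangeRequest) -> str:
    """Auto-approve low-risk changes; send everything else to manual review."""
    if risk_score(change) <= LOW_RISK_THRESHOLD:
        return "auto-approved, reviewed after the fact"
    return "manual CAB review before deploy"


static_page = ChangeRequest("update help page", False, True, True)
payments = ChangeRequest("new mobile payment flow", True, True, True)
print(route(static_page))
print(route(payments))
```

The point of the design is that the review still happens for every change, but only the high-risk ones wait in the queue before deployment.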
Matt: First of all, I’m going to agree with you 100 percent because people like Eli Goldratt and smarter people than us already figured all this out...this theory of constraints. The good news is this stuff is actually not super intimidating to do. You can do it at this big holistic level, but sometimes it can sound really challenging for an organization. It can be just a very simple exercise which is, “Okay, product owner, you have a feature. Let's map it through. What does it do?” It doesn’t mean you have some kind of fancy software. Just draw it on a piece of paper. So then you understand it.
The problem is there’s very few people, probably, in your organization who understand that whole flow. They understand the input that comes to them and where they kick it out to. There’s an anecdote from Kevin Parker from Serena Software back when I was doing more ITSM stuff. He was talking about improving your change management process. He would tell CTOs or CIOs to do this as an exercise: Put a change into your change request system, which is literally to change nothing, and see how long it takes for that change to go all the way through the process and it would probably take, like, three weeks.
Similarly, just say, “Okay I’m going to go through the exercise. What if I actually had a feature? What would be all the stops?” And you’re going to look and you’re going to be like, “This goes out to three different teams and then storage does this thing...Now it would sit and wait for three weeks. Wow. Here’s a constraint.”
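The paper exercise Matt describes translates directly into a small data structure. The sketch below assumes a made-up value stream; every step name and every number is hypothetical and only there to show how the total lead time and the longest wait fall out once the steps are written down.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One stop in the value stream (names and numbers are hypothetical)."""
    name: str
    work_days: float  # time someone is actively working on the change
    wait_days: float  # time the change sits in a queue before this step


def lead_time(steps):
    """Total elapsed days from idea to customer."""
    return sum(s.work_days + s.wait_days for s in steps)


def biggest_constraint(steps):
    """The step where the change waits the longest."""
    return max(steps, key=lambda s: s.wait_days)


# Made-up numbers for illustration only.
stream = [
    Step("product owner writes feature", 1, 0),
    Step("development", 5, 2),
    Step("storage team provisioning", 1, 21),
    Step("security review", 1, 14),
    Step("deploy", 0.5, 3),
]

total = lead_time(stream)
constraint = biggest_constraint(stream)
print(f"lead time: {total} days")
print(f"largest wait: {constraint.name} ({constraint.wait_days} days)")
```

In this invented example most of the elapsed time is waiting rather than work, which is the kind of finding the mapping exercise is meant to surface.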
The most important part of all of this is, you can’t cherry pick your bottleneck...you identify your bottlenecks. That’s good because you need to know what they all are, but you only get to work on one. It’s not the one that you think is the most important because it seems like it’s important or it’s easy...[Leadership might say] “We can’t change that so go fix something else instead.”...you have to organizationally be buying into, we have to make some tough decisions even if they aren’t fun. Fixing top bottlenecks is usually not a whole lot of fun. It’s not sexy work. It’s usually fixing a lot of crappy tech debt. Really, there’s a lot of, “Pull yourself up by your bootstraps, put on your wading boots and do hard work.”
Ensuring Compliance and Security Needs are Addressed
Bernard: Let’s talk about compliance and security. How can DevOps ensure that those things are addressed? In my experience, those are often pretty significant bottlenecks...because it’s out-of-band. It’s like, “I’ve got to this point and now I’ve got to go make this special request,” and it gets into a loop that’s off by itself and is maybe short staffed. That can be quite challenging.
You need to schedule meetings with security groups to review this. The first opening on this agenda on their calendar is two weeks from now...How do you address those kinds of things? How do you ensure that they are addressed in a DevOps context? DevOps, at least part of it, is the notion that you want to go from the time that someone checks in code to the time it’s running in production as quickly as possible.
Matt: With as much velocity as possible. That’s a real important distinction. I’m jumping in here because this is a core thing we’re doing at Chef right now. It’s a core part of our philosophy, this idea of compliance at velocity. "Move things as quickly as possible" sounds like, "pell-mell, damn the torpedoes, I ain't got time for security. We’ve got to ship some stuff.” That’s not high velocity, that’s just going fast...I think that a lot of these ideas can actually really enforce those things that you’re talking about...Like you said, we touched on it a little bit, that a lot of this tooling is really powerful.
One of the most important lessons of the Phoenix Project is with the security guy — and this is so common for us when we think about audits...It’s always, “We can’t do this because of our audit requirements and because of compliance.”...because what we’re used to doing when we talk about compliance is, like the guy in the Phoenix Project, we have our three-ring binder that has all the checklists that we have to do.
What we care about is the checklist, not what it’s for...Bernard, right back to your point [about] the guy who was saying, “Well we can’t do this. You put it in and then the sysadmin reviews it.”...For that fellow, the important thing was that the sysadmin reviews it, but the real important thing was why was the sysadmin going to review it? Because we needed to make sure that we trusted it. But we could do that in a different way and still accomplish that.
I think a lot of times when we look at things like audit and compliance...we’ve created this sort of ritual around how we do compliance and we care more about the ritual than the reason we’re doing it in the first place. That’s a challenging piece of that. It’s understanding that, again, it’s hard work. It’s much easier just to say, “I have my binder and we have to tick the boxes in the binder.”
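One way to picture the alternative to the binder is a checklist that a machine evaluates. The following Python sketch is not Chef's tooling or any real compliance product; the `POLICY` settings and the `audit` helper are hypothetical names invented to show the idea of expressing each rule as something a computer can verify.

```python
# Hypothetical policy: each rule names a setting and the value an auditor expects.
POLICY = {
    "ssh_root_login": "no",
    "firewall_enabled": True,
    "disk_encryption": True,
}


def audit(server_config, policy=POLICY):
    """Return (setting, expected, actual) for every rule the server fails."""
    failures = []
    for setting, expected in policy.items():
        actual = server_config.get(setting)  # None if the setting is missing
        if actual != expected:
            failures.append((setting, expected, actual))
    return failures


# An invented server description with one violation.
server = {"ssh_root_login": "yes", "firewall_enabled": True, "disk_encryption": True}
failures = audit(server)
for setting, expected, actual in failures:
    print(f"FAIL {setting}: expected {expected!r}, found {actual!r}")
```

The audit answer then comes from what the system actually reports, which is the distinction Matt draws between asking a person and asking a computer.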
What is a Common DevOps Toolchain?
Bernard: Let’s talk about the tools and tool chains. Mike what do you see people do? What is the common tool chain? I’ll start off by saying, I’ve been to a number of DevOps camps and they’re always so inspirational because people get up and tell these incredible stories. But they always seem like their own snowflakes. It’s like, “Well I grabbed this open source project and I did this and I wrote a bunch of glue code...” It’s like, yeah, but now you’ve got a pile of tinker toys that you put together. God forbid you should ever leave the organization because who’s going to know what’s there?
Mike: I’m tool agnostic so I don’t have a cat in this game, but I see exactly what you’ve described: people grabbing a lot of different tools. And it really comes down to what are we defining as the scope of DevOps tools? I see stuff as far as...tools to manage the project from Kanban and stuff like that, all the way to monitoring, logging tools and the stuff in between: testing tools, the scripting tools like Chef and Puppet, containers. And I do see heavy open source use because if you were to pay for all these tools, the ROI of doing this might not look so good.
Matt: To be honest, just because it’s open source it doesn’t mean that it’s free. I have different thoughts on why open source is leveraged so much in DevOps, but part of it is also using free tools as much as you can.
Mike: Yeah. One of the reasons I think people are grabbing a lot of these tools is because they don’t necessarily need to go up the chain and get permission, so they can plug things in and test fast. A lot of the tools are focused on the pipeline: How do we do the build? How do we do the test? How do we run a security scan? How do we run a code scan?...There’s so many tools. I’ve seen a slide before, it probably had 100 logos on it and you go into any shop and they’ve got a good number of them.
Matt: That’s a risk actually, in a lot of ways. One of the things that’s good about this top-down adoption...is because when you’re coming bottom-up, you’re doing what it is that you can do. There’s definitely the challenge of, someone patched together and hacked together this workflow that works at webrespondo.com, this start-up they're working at. I’m not seeing this in the enterprise. Maybe they’re doing it for their POC, but the places that I’m going into and selling Chef...these things are standardized. They are building standards around open source tooling...
...There is a risk to someone doing something just because it’s fun. As we’re getting this top-down adoption of DevOps and things are being implemented as standards within your organization, even if they are...a combination of multiple technologies, which is really robust, that combination is specific to GE, or JP Morgan Chase. We've always had architects in those places...To me, that’s a valuable thing...If I'm the person who just wants to use it, I don’t have to go make those mud pies because, theoretically, they’ve been made for me by my organization.
Bernard: The way I characterize it is...“Don’t build the new legacy. You're trying to extract yourself from legacy systems...You have insight now."
Mike: Can I clarify one thing, though? They are standardizing the scripting tools (they will pick a Chef, a Puppet, an Ansible), but there are so many different features and tools that you need to automate the pipeline. You need code scanning, you need security scanning, you need a test harness. When I say they're cobbling together a bunch of tools, they are picking a standard in each one of those areas, but there's usually about 12 tools it takes to do the whole chain.
Matt: Are you saying within an organization, different groups are using different tools for the same function?
Matt: I just want to know what we are talking about, because instructing the whole industry to standardize on a tool chain is silly. I'm the vendor, and I'm not saying the whole industry should use Chef. Use stuff that’s right for you...What you shouldn’t do is have five different CM (configuration management) tools because we’re going to do a lot of repeated effort...The next evolution is coming together and saying, “...This is the defined pipeline of how we do work,” and now you can just consume it because that's what should be happening. My job as a developer...is not to build CD pipelines. My job is to support my business and do that. So the quicker that stuff can get out of the way, it's an enabling tool… it's a means to an end.
Mike: You get that when it's top-down, but a lot of these companies, they have multiple grassroots.
Matt: Well, I think that's the shift we are seeing now...but it has to go in both directions too, right? You have to have buy-in coming up too...When we had Bill Joy on my show, we talked about the idea of compliance versus commitment. Compliance means I'm doing this thing because my CTO said to, and if I don't do it, I'm going to get fired. Commitment is, I actually do believe this is the right thing to do, and I'm in now. Maybe I was at a point where it was appropriate in our organization to disagree, but now that we've made the decision, we are moving forward. If it's only top-down, you’re not going to get commitment. But by the same token, you’re not going to get support, you're not going to get consistency if it’s all grassroots. You've got to meet in the middle.
Does DevOps Reduce Outages?
Bernard: We had a question submitted that I think would be interesting to address: Is DevOps also going to reduce outages?...Mike, what is your take on that?
Mike: It's not DevOps. It's the people who implement whatever you're implementing. The DevOps mindset is all about continuous improvement, building better quality, reliability. But it's not DevOps that gives you the end-game, it's the people in the organization that put the tools and the processes in place. That's one of the big fallacies is, “Oh, I've got DevOps now. I've solved all these problems.” No, now you got a new problem — you need to learn how to operate that new model.
Matt: It helps you facilitate, for the innovative folks. The response I gave to that question was with a question: Is it a reduction in the number or the impact of the outages? That goes a little bit more towards thinking less about DevOps and more about small batch size and about understanding the change and the impact of the change.
If you're talking of DevOps being the idea of Continuous Delivery, then you're having potentially more outages, but the size of the outage can be much, much smaller. The likelihood of an outage goes up, but the impact goes way down. That's a challenge in a lot of organizations, to understand that that’s actually probably okay. The way I look at it is unless you're running something like air traffic control, nuclear reactors or a hospital, you can take an outage of 30 seconds and impact 10 percent of your traffic and it's okay.
Bernard: We preach to people who use our product to really start reducing the size of releases, do them more frequently and you don't necessarily have outages. What you might have is a feature you release and then pull back as you go; put it out there and, either it doesn’t work properly...or whatever it is, you need to have the ability to back it out quickly.
Matt: The impact of the changes is much smaller....An outage occurs because a change has gone bad of some kind...If you're changing a lot of things and it goes horribly wrong, a lot of things are going to be broken. If you're changing a little thing and it goes horribly wrong, a little thing is horribly broken. It's much easier, like Bernard said, to either back it out or resolve it quickly.
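The trade-off between how often something breaks and how much breaks can be shown with a toy calculation. The model below is an assumption made for illustration, not something stated in the discussion: each change independently goes bad with probability `p_bad`, a release fails if any change in it is bad, and a failed release rolls back the whole batch. The 5 percent figure is invented.

```python
def release_stats(total_changes, batch_size, p_bad):
    """Toy model of release risk under the assumptions stated above.

    Returns (expected number of failed releases, changes rolled back per failure).
    """
    releases = total_changes // batch_size
    p_release_fails = 1 - (1 - p_bad) ** batch_size
    expected_failed_releases = releases * p_release_fails
    return expected_failed_releases, batch_size


# Ship 20 changes either as one big release or as 20 single-change releases.
big = release_stats(total_changes=20, batch_size=20, p_bad=0.05)
small = release_stats(total_changes=20, batch_size=1, p_bad=0.05)

print(f"one big release:    {big[0]:.2f} expected failures, {big[1]} changes rolled back each")
print(f"20 small releases:  {small[0]:.2f} expected failures, {small[1]} change rolled back each")
```

Under these assumptions the small-batch approach is expected to fail more often, but each failure touches a single change instead of all twenty, which matches the shape of the argument made here.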
Bernard: You could also find it more quickly.
Matt: Yeah, exactly. And you don't piss off all the other people who were along for the ride with that change. Their change didn’t break anything but their stuff got rolled back too because it was on a rider bill.
What About NoOps?
Bernard: I've heard people talk about NoOps. By NoOps, I've heard this phrase originally from...chief architect at Netflix who asserted they had no Operations people. Basically, developers put it into production and ran it...What about that concept?
Matt: The first thing is...are we even using this term anymore? I thought this was like two years ago and we stopped talking about it. But it's important because what I'm seeing is a lot of times when people think DevOps, they think NoOps. They think it's full stack engineer, 10x engineer, all this myth crap. My thought on it is that, there is a certain percentage of smart creatives and capable people who can be everything to everyone...you have a certain percentage who can do NoOps. And NoOps doesn’t mean "no ops". It just means everybody does everything. It's full stack engineer. The problem is, everybody thinks that there is this infinite number of full stack engineers, if they even exist in the first place. It certainly doesn’t scale.
I think you can have examples...That could work if you hired those people, those 100 people that can do that work and are willing to do that work — they came to work at Netflix, awesome. But that doesn’t mean that I can go cargo-cult that over to the Board of Trade...or to a different organization, not even because it's "enterprise versus internet." I think if you do NoOps and it works, you got lucky. You got people that are willing to do it.
Mike: I always kind of interpreted NoOps as, the system kind of runs itself, we don't need people running the system. I think the problem was, people assumed that there wasn’t anyone. If you look at the help-desk, service-desk type operations, people think that's gone too. Nothing could be further from the truth. If you go to Netflix and you look at the 100,000 monitors on the wall, there's people keeping a close eye on it. To me, the end-game would be nice if I could run self-healing, self-running systems and have as few people hand-holding stuff as possible. That's a highly automated, high-performing system.
Bernard: LessOps, that’s the term we should use, LessOps.
Mike: Me personally, I would love to get to a place where most everything that makes sense is automated. But to get there, you have to build a system that creates a ton of data...so systems can tell us when things are going wrong. That’s another part people forget is, if you want to have less people touching it, you’ve got to produce more intelligence through your system.
Bernard: Interestingly, at my previous company we booked a day with an analyst firm to review our approach...They said, “You know, we have enterprises marching in here all day, every day, saying how can I be like Netflix?” I think Netflix is a great organization...But to think that Netflix does it that way, that Clorox can do it...just to pick any company at random...that would be like me saying, “I'm going to buy the same running shoes that Usain Bolt buys and I'm going to be as fast as Usain Bolt.” That's just never going to happen...people sort of look at that and go, “Oh, I could do that,” instead of, “What's realistic within the kind of frame I have?”
The Organizational Side of DevOps
Bernard: Does DevOps mean people losing their jobs?...Mike, what about the organizational side?
Mike: I have many answers for this one. My first one is just kind of a snide remark. If you're going to fight DevOps, you might be out of a job some day. Holistically, though, what's really happening here is, once you start automating these things, removing the bottlenecks, becoming a more high-performing company — that's how I describe DevOps — especially on the systems administrator side, you're freeing [developers] up to work on higher value tasks...You're not configuring the same machines over and over. You’re getting the waste out of the system and these people can focus on higher value things.
Where I have seen it have an impact on headcount, we had one company who was very strategic top-down, and they were trying to build, like, "What are we going to be in five years?"...Using the assumptions that they were going to automate a lot of tasks, they said, “We’re not going to need the budget for as many resources going forward.” So they weren’t getting rid of people, but their future state organization is going to be less dependent on people.
Matt: ...I absolutely agree 100% with what Mike said...John Allspaw said this about Etsy...people are like, “You have so much automation at Etsy, why do you need a web operations team?” He's like, “Are you kidding? They have tons of work to do, but what they are not doing is copying files around and doing rote mundane things.”
Now what can happen — this is where I could see it could mean someone losing their jobs — is either A...In an organization where they had, it was some ridiculous number like 1:10 admin-server ratio...so when they did automation, they were clearly over-staffed by a ridiculous amount...Or if you have folks who, their only skill and their only desire to have skill is to copy files and be a click-next-admin, then yeah, then you probably will lose your job, and that might be okay.
This is happening a lot in Microsoft. Microsoft will say the days of the click-next-admin are over, and there's an awful lot of Microsoft admins who are pissed about this. They say, “Oh, this is all baloney. It's all this DevOps crap. There's always going to be the need for this.” The answer is, yeah, there will always be a need for server-janitors, but it's not going to be well-paying work and it's not going to be very interesting. That’s the thing. If you're towards the end of your career...then keep doing what you're doing, that's awesome. But if you want to be around for a little while...you’ve got to keep learning. Things are changing.
Bernard: My response is sort of an extension of what you just said, Matt, which is, I think the challenge for lots of people is that as you move toward automation, obviously you need fewer people who are just...shove the CD in and click next. You need people who can design automated systems and implement them. That, to me, is clearly a next-level-up skill. Automating a system is much more difficult than actually having the skills to do it. I think the challenge for many people is, are they able to up-skill? Are they motivated to up-skill?...It's sort of the difference between being able to work on an assembly line and attach a part, and design an assembly line that incorporates robotics.
Matt: The thing is that most of the people you have doing this rote thing do have a higher level of skill than doing the rote thing because you can't do the rote thing without having the higher level of skill. There's that great quote in the Continuous Delivery book, which is, “The problem is these manual tasks actually can’t be done by someone who has a low level of skill because they have to be able to deal with it when it doesn't work.”
When I was doing consulting around this, I would go and talk to stakeholders about doing a DevOps transition, and the question I asked every stakeholder group was, “I have a magic IT wand and I'm giving you one wish. What is your one wish?” And every group of stakeholders had different answers, except for Ops. Almost without question, every Ops group I ever talked to had the exact same wish. You know what that wish was? More people in Ops. And it's because they don't have enough time to do all the things they want to do because they're doing all the other crap.
So the thing is these teams and these people, they actually do have skills and things that need to be done, but they are not able to do it because there is not enough time. The automation frees that up. You're right Bernard, there are people where that particular skill set of being able to take from being able to do, and learning how to design automation around it...might not be for everyone, but that’s not the only thing for an Ops person to do when they are leveling up. If you're not leveled up, then yeah, maybe that particular type of career doesn’t exist anymore.
Bernard: Retrain as a COBOL programmer.
Matt: Again, there are still people employed as COBOL programmers and there's still people who are employed as Windows 2003 server admins. The jobs will be there. They just aren’t really very fun.
Bernard: Well, we're reaching the top of the hour...I really appreciate both of you participating and contributing and sharing your experience, it's been fantastic. I've really enjoyed it. I'm sure our audience has really enjoyed it and found it educational.
Learn more about the Cloud Luminary Fireside Chats: http://fireside.activestate.com.
Published at DZone with permission of Mike Kanasoot, DZone MVB. See the original article here.