The original post can be found on the Electric Cloud blog.
In a recent Continuous Discussions (#c9d9) video podcast, expert panelists discussed must-haves for Application Release Automation (ARA). Our expert panel included: Carl Caum, senior technical marketing manager at Puppet; Fabian Lim, a DevSecOps engineer; Marc Priolo, global software configuration manager at Urban Science; and our very own Sam Fell. During the episode, the panelists discussed the benefits and challenges of release automation, and some best practices for release modeling, coordination, and ARA at scale. Continue reading for their full insights!
Benefits of Application Release Automation
Automation is about letting humans do what humans are good at, explains Caum: “Humans are very bad at doing the same repetitive task over and over and over again. That’s what automation’s all about – taking the things that humans do over and over again and making it repeatable. So you automate it, and then you let the humans do what they’re good at, which is decision-making. It allows you to be able to say, ‘I understand what’s going on at this moment. I have predictions and expectations, and I can react when things are doing something differently than what I expect.’ Application release automation isn’t just about making the process more predictable, it’s also about changing the way you act as a company, the way you innovate, and the way you deliver to your customers.”
Lim argues that automation increases your speed of delivery – helping cut down your cycle time. “What is also really important I think,” he adds, “is to embrace failure and to experiment. With automation we can actually test out certain things, certain features – is it good, is it bad, does it break? When you’re sure you can support speed – if it does break, you know you can always go back to version X-minus one, so you’re safe.”
Priolo agrees that “one benefit of ARA is the speed. You couldn’t get that type of speed having humans doing the individual steps manually. That was just one of the major benefits.” He shares the impact automation has had on their deployment rate and release cadence: “When we adopted automation, it opened up new avenues for deployment models that just were not possible no matter how many people we hired. And that was a really great win that wasn’t something you can foresee. When we realized we can move faster and deploy faster – we have teams that went from deploying once quarterly to deploying nightly. It also had us switch from the mentality of ‘when we have resources available, we’re going to go ahead and do these automated deployments or release pipeline models,’ to ‘when the code is ready, we’re going to go ahead and deploy.’”
Further describing his use case, Marc shares that they now have over 300 applications using what they established as a “pipeline as a service” – a repository of pre-defined pipelines used across applications to automate their delivery and ensure compliance and proper testing. “We started by offering a limited number of vetted, pre-defined, automated pipelines for applications to use to get to Production. Out of our 40 teams, some teams use a pipeline that automatically deploys to Production once the code is ready. Others use pipelines with more manual gates according to their comfort level and test coverage – so that when the code is ready it is promoted upstream at their pace. Then, depending on the level of automated testing, they could either be using a continuous deployment model, or a Continuous Delivery model where they still have somebody that checks off on it.”
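The “pipeline as a service” idea described above can be sketched as a small registry of vetted pipeline templates that teams select from based on their test coverage. This is an illustrative sketch only – the pipeline names, stages, and coverage threshold are assumptions, not Urban Science’s actual setup:

```python
# Hypothetical "pipeline as a service" registry: teams pick a vetted,
# pre-defined pipeline instead of building their own. All names and the
# 80% coverage threshold are illustrative assumptions.

VETTED_PIPELINES = {
    # Fully automated: code that passes every gate deploys straight to Production.
    "continuous-deployment": ["build", "unit-test", "integration-test", "deploy-prod"],
    # A manual sign-off gate before Production, for teams with less test coverage.
    "continuous-delivery": ["build", "unit-test", "integration-test",
                            "manual-approval", "deploy-prod"],
}

def pipeline_for(team_profile: dict) -> list[str]:
    """Select a vetted pipeline based on a team's automated test coverage."""
    if team_profile.get("automated_coverage", 0.0) >= 0.8:
        return VETTED_PIPELINES["continuous-deployment"]
    return VETTED_PIPELINES["continuous-delivery"]

# A team with 50% automated coverage gets the pipeline with a manual gate.
print(pipeline_for({"automated_coverage": 0.5}))
```

The key design point matches the quote: teams choose among a small set of vetted pipelines rather than defining arbitrary ones, so compliance and testing gates are enforced centrally.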
Even though the benefits are obvious, it can be challenging to get an entire organization on the ARA train, says Fell: “Enterprises are afraid of deployments in production because they’re not sure exactly what’s going to break. A lot of the Continuous Delivery and DevOps conversations that people have had are all about reducing batch size. You can do it more frequently. You know when there’s something wrong because it’s a small batch, and you know what has been deployed that caused that problem. To get to that mindset, technology is part of it, but understanding the cultural shift to that different way of working is one of the things people are really struggling with. They’re struggling to understand, ‘How do I get my whole organization around that?’ The benefits are clearly there – more efficiency, better quality, better predictability.”
Challenges With ARA at Scale
Here’s a trick for safeguarding your pipeline, per Lim: “When we build a pipeline for the whole organization, so everyone goes through the same pipelines, it can be challenging. Think about trying to isolate the security risk you find for each application – with 40 projects, it only takes one compromised project to put the whole pipeline at risk. So while designing the pipeline, you want to think about how that can be contained per app, per project, etc. Think about how to isolate the blast radius of a compromise into its own account. Project X has its own X account, project Y has its own Y account, but they all go through the same pipeline. So when X gets compromised, it’s just X. You close the X account and spin up a new X1 account. That was one thing that I learned – one of my CI/CD pipeline tricks.”
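Lim’s per-project isolation trick can be sketched as a small account registry: every project runs through the same shared pipeline, but under its own account, and a compromised account is closed and re-issued without touching the others. The class and naming scheme below are hypothetical illustrations of the idea, not a specific tool’s API:

```python
# Illustrative sketch of per-project blast-radius isolation: one shared
# pipeline, but each project gets its own account. On compromise, the
# account is rotated (X -> X1) while other projects are unaffected.
# All names here are hypothetical.

class AccountRegistry:
    def __init__(self):
        self._accounts = {}    # project -> active account id
        self._generation = {}  # project -> how many times the account was re-issued

    def account_for(self, project: str) -> str:
        """Return the project's active account, creating one on first use."""
        if project not in self._accounts:
            self._generation[project] = 0
            self._accounts[project] = f"{project}-acct-0"
        return self._accounts[project]

    def quarantine(self, project: str) -> str:
        """Close the compromised account and spin up a fresh one."""
        self._generation[project] += 1
        self._accounts[project] = f"{project}-acct-{self._generation[project]}"
        return self._accounts[project]

reg = AccountRegistry()
reg.account_for("x")        # "x-acct-0"
reg.account_for("y")        # "y-acct-0" -- unaffected by anything below
print(reg.quarantine("x"))  # "x-acct-1": x is rotated, y keeps its account
```

In practice the “accounts” would be, say, separate cloud accounts or credential scopes, but the containment logic is the same: compromise of X only ever costs you X.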
Getting the entire team on board is a major challenge, per Fell: “A challenge is getting all of these different teams to adopt this standardized process – just to understand what the benefits of that standardization are. It comes down to shared learning. You have that shared resource that you provide, Marc, to all your different teams as a pipeline as a service, to allow them to take advantage of the best practices from all the learnings. All the toe stubbing you do when you’re trying to move things from left to right in that pipeline makes it rehearsable and more predictable, to the point where in production it’s been rehearsed thousands of times already, so of course it’s going to work. Because you have fidelity across the environments, because you’re using configuration management; you have fidelity across the binaries, because you have the artifact repository; and fidelity across the process, because everybody’s following that same process.”
As organizations scale, they tend to form specialized teams (a.k.a. silos), says Caum: “The biggest challenge that I’ve seen is as organizations get bigger and bigger, they tend to specialize. So you have people who specialize in Apache, people specialized in SQL, etc. Often that turns into specialized teams which then become silos. When you start to have all of these things come together, that becomes a real challenge. How do you have one team that is responsible for the application build deployment use one tool, have another team that’s responsible for the middleware use another tool, and then have the core operating systems configurations being managed by yet another team? And, how do you understand how all of that work comes together and how or where it conflicts? If you make a change to a system, how do you understand how that affects everything else on this system? So it’s a matter of two things, one is scaling expertise, having these specialized teams and individuals be able to codify how they manage things in a standard way and making that accessible and consumable by other teams, but also understanding how all these teams and all these individuals work comes together. And at scale, that just becomes a tremendously difficult task.”
ARA is an evolutionary process – you are never done improving, says Priolo: “It’s always a work in progress. It’s understanding why you’re doing it and how you’re doing it, and not just doing it for the sake of doing it. So it’s always an evolutionary process. It’s always reevaluating, making sure that you’re going through the right checks and balances, and that the right people are involved where they’re needed. It’s just that when you’re dealing with a larger scale, or you’re dealing with tons of applications or hundreds of applications, sometimes collecting that information and coming to a decision takes a little longer in the process; a little more vetting needs to be done.”
A tip from Priolo: “From an artifact perspective, what we do is we enforce the maturity of an artifact. We have four levels of maturity. If other teams are using artifacts from other teams we say that it has to be at least pre-production ready or staging-ready before another team can even look at the artifact. That way any artifact created by any team can be considered by another team once it’s at a point where they feel it’s comfortable for the other teams to consume. That way we don’t run into a situation where Dev team B is consuming Dev team A’s Dev artifacts. They have to be at the point of being production ready before they can cross borders. And then the environment owners are the ones that are really going to dictate what is meant to match those…that environment based off of artifacts that are mature enough to go to those environments. So we’ve enforced the ownership, so there’s going to be a team that owns the environment in question.”
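Priolo’s artifact-maturity rule can be expressed as a simple gate. The post only says there are four maturity levels and that an artifact must be at least pre-production/staging-ready before another team can consume it, so the four level names below are assumptions for illustration:

```python
# A minimal sketch of an artifact-maturity gate. The four level names are
# assumed; the rule from the quote is that artifacts must reach at least
# staging maturity before they can "cross borders" between teams.

MATURITY = ["dev", "integration", "staging", "production"]  # ascending order

def can_consume(artifact_level: str, consumer_is_other_team: bool) -> bool:
    """Another team may only consume an artifact at staging level or above."""
    if not consumer_is_other_team:
        return True  # a team can always use its own artifacts, at any maturity
    return MATURITY.index(artifact_level) >= MATURITY.index("staging")

# Dev team B cannot consume Dev team A's "dev"-level artifact...
print(can_consume("dev", consumer_is_other_team=True))      # False
# ...but a "staging"-level artifact can cross team borders.
print(can_consume("staging", consumer_is_other_team=True))  # True
```

This keeps one team’s unstable work-in-progress from silently becoming another team’s dependency, which is exactly the “Dev team B consuming Dev team A’s Dev artifacts” situation the rule is designed to prevent.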
Patterns for Success: Model
Modeling ensures your environments are consistent throughout – from Dev all the way to Prod – without having to necessarily invest in the exact same infrastructure scale across all environments. Priolo explains: “One of the challenges I’ve seen in the past is that you don’t want your Dev to be like your production environment. You don’t want to spend the money on getting the same type of load balancers, the same type of configurations that you have in there, because it feels like it’s throwing away money. But if you want to have a successful production environment, you need to downstream it so that you’re not testing in production, you’re testing downstream. Modeling takes the entire environment into account – not just the code portion or just the website portion, but how the entire environment is going to interact within your own infrastructure or within the Cloud, and how it’s going to be seen by the client. That needs to be done in Dev. You have to be able to at least have it at scale in Dev with the same exact type of configurations.”
Fell explains the "Process as Code" model at Electric Cloud: “We talk about Process as Code at Electric Cloud because even if you have all the right ingredients and all the right equipment, if you’re using it all in the wrong order, you’re not going to get the right outcome. It’s about making sure that you’ve modeled that pipeline, that process, and that you have the ability to replay it – in an abstracted way: in general, here’s how things are supposed to go, and it can flow around the different actual use cases or situations that you run yourself into.”
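The Process as Code idea above can be sketched as treating the release process itself as a model – ordered stages of tasks – that replays identically against any environment. This is illustrative Python under assumed names, not Electric Cloud’s actual DSL:

```python
# A hedged sketch of "Process as Code": the release process is data (ordered
# stages of tasks) that can be replayed, in the same order, against any
# environment. Stage/task names are illustrative assumptions.

from dataclasses import dataclass, field

@dataclass
class Stage:
    name: str
    tasks: list = field(default_factory=list)  # callables, run in order

@dataclass
class Pipeline:
    stages: list

    def run(self, env: str) -> list[str]:
        """Replay the same modeled process against the given environment."""
        log = []
        for stage in self.stages:
            for task in stage.tasks:
                log.append(f"{env}:{stage.name}:{task.__name__}")
                task(env)  # the abstract model binds to a concrete environment here
        return log

def build(env): pass   # placeholder tasks for the sketch
def deploy(env): pass

release = Pipeline([Stage("build", [build]), Stage("deploy", [deploy])])
# The same model runs, in the same order, in dev and in prod:
print(release.run("dev"))
print(release.run("prod"))
```

The point of the abstraction is exactly what the quote describes: the model captures “how things are supposed to go” once, and the environment-specific details are bound at replay time rather than hard-coded into each run.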
Don’t just model your applications, model your infrastructure, advises Caum: “Modeling is critical to success, mainly because applications don’t have to live in just a single environment. Services don’t live in a single environment. You tend to work within your Dev environment and then go to a pre-production environment, and then production may actually be several environments. You may have a hybrid model where something’s in the Cloud one minute and then it’s on premise the next. Having the ability to have a model that says, ‘When the world looks like this, this is how you need to deploy it,’ is incredibly important. But I think one thing people often miss is that they think of only modeling the application. Not to sound like a broken record, but the underlying infrastructure is incredibly important too. I’m a big proponent of modeling not just the applications but the underlying infrastructure that is a part of those applications.”
Replicating environments can be costly, warns Lim: “When you replicate environments from production to development – or even to staging – to make sure that things are going to work, it’s expensive. A lot of money is being thrown in, and things like security tools aren’t that cheap – your firewall, your application scanner, and all that stuff. So virtualizing actually helps a lot to cut that cost quite a bit. Also, what if your developers want to model something on their local machines – do they all get their own separate Dev mini-environments just for testing on their local machines? How do you actually define that? What is Dev? Is Dev local, or is it a shared resource? Everyone has their own definition of a development environment. If we could replicate that production environment even at a smaller scale – maybe on our laptops, or just as a simulation – I think that would help speed up pushing to production even more.”
Patterns for Success: Automate
Start small, says Caum: “If you start automating the big painful things, you’re going to get extremely little value in return. You’re still going to be putting out fires all the time, you’re still going to be fixing problems that you shouldn’t be fixing, and you’re going to have less space in your process. If you start with the little things that you do all the time, that’s a great place to start. The other thing I see is that people have a fear of Continuous Delivery. A lot of people I talk to say, ‘Well, we’re not ready for Continuous Delivery because we’re not ready for complete automation of our tests. We don’t have everything automated in our testing.’ That’s a bad mentality to have. What I recommend people get started with is just get your pipeline. Even if you don’t have any automated testing, get a pipeline; even if the testing phase is manual right now, get started there, because once you have a pipeline you have a place to measure.”
Automation is about allowing humans to spend time on tasks they are actually useful for, explains Lim: “Automation is really about trying to make myself useless. The more I make myself useless, the more I’ve automated. So claiming back my busy hours, and trying to do things that I’m passionate or useful for – like decision making. In automation, you can always automate your tools. We see security tools as yet another tool or like a testing tool. So you want to automate all of that and bring out that vulnerability report or some kind of assessment to say, ‘Okay, do I accept all those risks?’ You don’t want to be manually doing all that for each individual application or environment so using tools in automation, it’s quite awesome that you can actually see all that at the beginning even before your release. That way you can actually stop it just in case it gets out the gate.”
“Automate automation” is the end goal for Priolo: “Automation is the bread and butter of what I try and do. My goal is to automate automation. And what I mean by that is, when you talk to almost any team, any company, and you ask them, ‘Hey, what would it take to automate your stuff?’ they say, ‘What are you talking about, it’s already automated. I have 50 batch scripts and I just go ahead and click on one after another and it does it all.’ And then I point out, ‘Well, you kind of updated five other things to be more streamlined, but let’s automate that.’ So I try to coin the term ‘automate automation.’ Automation as a whole eliminates the things that people add no real value doing, or that are a waste of those people’s time, and lets them concentrate on what really matters to the company. It lets us reclaim the busy-work time, move at speeds that were much faster than before, and do things that we were not capable of doing no matter how much manpower we threw at it.”
Patterns for Success: Coordinate
Tie your coordination back to your project management, advises Caum: “The biggest success that I’ve seen with coordination is when people tie it back to their project management system, whether that’s a ticket management system or some other way of tracking the requirements going in. One of the things that I talk a lot about is the importance that Continuous Delivery has for product management. Having the ability to understand exactly when something is going to go out helps with the planning process. If you tie that in at the beginning and understand that this feature needs to go out, that it’s dependent on these different teams and these different pipelines coming together at this point, and how you track that milestone – that is really, really critical for overall success, especially when you get to scale. And scale here I’m talking about organizational scale, not technical scale. One of the big successes that I’ve seen is when people tie their pipeline changes and phases back into the tickets that are tracking that work and tracking the requirements of the application.”
Coordination comes down to culture, according to Priolo: “The culture of DevOps is what coordination really comes down to. When talking about coordination I was thinking more from a perspective of coordination between people and teams. And that’s really what it comes down to, is making sure that the teams are openly communicating with each other. It’s also making sure that they have the information that’s needed. They have the tools to make those informed decisions when they’re coordinating with each other. So right now, you have the formula where you have a bunch of operations teams, development teams, project manager teams, all working together. You’ve got to make sure they’re lined up in the right order. Before when we were doing it more manually, we were a little bit more operations heavy where operations would get involved late in the game. And what we did when we were switching over to more of a Continuous Delivery model is making sure operations gets involved earlier in the process. Realistically, the only time operations should be involved late in the process is if there’s something wrong and we have a five-alarm fire and we need somebody to look at it. That’s the only time we should really have them involved later in the process.”
Communication is key to successful coordination, says Lim: “Coordination really starts with communication. Communicate any changes you want to make in your pipeline. Tell people that you’re going to make this change, so there are no surprises. It’s the same for security. People use bad libraries, or they do things like opening their firewall to the world because they didn’t know they would need it for testing – ‘I don’t know what ports I need to open, so I just open everything to everyone.’ Coordination takes a lot of education.”
Keeping operations in mind from the get-go is key to getting them on board for DevOps, says Fell: “DevOps is really great. Ops people are sometimes a little afraid of it. One of the ways you can make them more comfortable is to start with the end in mind: whatever work we’re doing on the Dev side needs to build up, in a rehearsable, repeatable way, to whatever we actually need to provide in operations. This goes back to the whole idea of modeling and automation. If you’re modeling things that work in Dev but will never work in Ops, then you’ve failed, because now you’re creating something that can’t be rehearsed without some sort of contortion at the last minute when you need to put it in the Ops world. Ideally, you’d have everything set up from a Dev perspective so that you can just kind of punch…hit the button, get a cookie; hit the button, get a cookie, all the way through.”
Watch the full episode here.
Want more Continuous Discussions (#c9d9)? We hold our #c9d9 podcast every other Tuesday at 10 a.m. PST. Each episode features expert panelists talking about DevOps, Continuous Delivery, Agile and more. Check out all past episodes and panelists here.