DevOps Protocol: No Manual Changes
You’ve heard about DevOps and you like the idea. But how can you grow a DevOps culture in your organization? In my series about DevOps Protocols I talk about the fundamental building blocks for growing a DevOps culture.
No Manual Changes refers to the behavioural trait of not
messing with any production systems by hand. Let’s discuss why messing with
production systems is bad and what to do about it.
Manual Changes Lead to Configuration Drift
You know how this goes: your servers are under heavy load and you
just want to kick up the number of worker processes a bit. You SSH into
the first server and just start a bunch of additional workers. Now you
sit back and see whether the box is behaving as expected. You find that
all is fine and make a mental note to persist your change in the
startup script of your service and to roll it out to all other
servers. But your mental note is quickly forgotten
because another urgent issue demands your full attention…
You end up with one server configured differently from the rest of the
bunch. Even worse, your change is not even restart-safe: as soon
as anyone starts an automated configuration run or restarts the box, your
changes are gone. In our example, this might lead to the box
crashing under load without anyone understanding why.
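To make the idea of drift concrete, here is a minimal Python sketch that compares what each server actually runs against the declared configuration. The host names, the `workers` key, and the `find_drift` helper are all hypothetical and only serve the illustration; a real setup would collect the observed values through its configuration management or inventory tooling.

```python
def find_drift(desired, observed):
    """Return, per host, the settings that differ from the desired state.

    desired:  dict of setting name -> declared value (hypothetical example data)
    observed: dict of host name -> dict of setting name -> actual value
    """
    drift = {}
    for host, actual in observed.items():
        # Collect (wanted, found) pairs for every setting that does not match.
        diffs = {
            key: (want, actual.get(key))
            for key, want in desired.items()
            if actual.get(key) != want
        }
        if diffs:
            drift[host] = diffs
    return drift


# Declared state, as it would live in version control (assumed values).
desired = {"workers": 8}

# What is really running; web1 was changed by hand during the load spike.
observed = {
    "web1": {"workers": 16},
    "web2": {"workers": 8},
    "web3": {"workers": 8},
}

drifted = find_drift(desired, observed)
print(drifted)  # {'web1': {'workers': (8, 16)}}
```

A report like this only tells you that drift exists. It does not tell you why the change was made, which is exactly the knowledge that gets lost with a manual fix.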
Automate All Changes
The way to go is obvious. Change the number of worker processes in
your configuration management tool and let it reconfigure the box for
you. This approach makes sure that your changes survive restarts and
configuration runs. And it makes sure that your teammates see what
you’ve done if any problems arise. Even your developers are now able to
see that their code might need optimizations because it requires so many
worker processes to run (way more than they initially expected).
Congratulations, you’ve taken one more step toward DevOps collaboration!
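The property that lets a change survive repeated configuration runs is idempotence: applying the same declared state twice leaves the system unchanged the second time. The following Python sketch illustrates that idea on a plain `key = value` config text. It is not how Puppet or Chef are implemented; the `ensure_setting` helper and the file format are assumptions made for this example.

```python
def ensure_setting(config_text, key, value):
    """Return (new_text, changed) with `key = value` present exactly once.

    Hypothetical helper operating on simple `key = value` lines.
    """
    wanted = f"{key} = {value}"
    out = []
    found = False
    for line in config_text.splitlines():
        name = line.split("=", 1)[0].strip()
        if "=" in line and name == key:
            # Keep a single, corrected entry and drop any duplicates.
            if not found:
                out.append(wanted)
                found = True
        else:
            out.append(line)
    if not found:
        out.append(wanted)
    new_text = "\n".join(out) + "\n"
    return new_text, new_text != config_text


original = "name = app\nworkers = 8\n"

# First run: the declared worker count differs, so the text is rewritten.
first, changed_first = ensure_setting(original, "workers", 16)
print(changed_first)  # True

# Second run: the desired state is already in place, nothing changes.
second, changed_second = ensure_setting(first, "workers", 16)
print(changed_second)  # False
```

Because the second run reports no change, the same definition can be rolled out to every server and re-applied on every configuration run without side effects.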
No Rule Without Any Exception – Really?
You might argue that my example is a bad one. If your production
systems are in trouble, there’s no time to lose in going the extra mile
of automation. Jump in and fight the fire!
While this approach might look reasonable at first glance, it is a very
dangerous one. Especially when in firefighting mode, you might not
remember all the changes made to your production system. You think your
problems are solved, but in reality they’ll come back worse than ever (and
sooner than you expect).
You need to prepare in advance to avoid the need to go in and hotfix.
Set up log monitoring services like Splunk or Graylog2
to be able to analyze what is happening. And you should have your
configuration management tool (like Puppet or Chef) set up so that you
can try out possible solutions without having to go in and do it
manually.
How do you avoid manual changes to your production systems? Do you “electrify the fence” as described in the Visible Ops Handbook? Please let us know in the comments!
Published at DZone with permission of [deleted], DZone MVB. See the original article here.