Configuration (Mis)management, or Why I Hate Puppet, Ansible, Salt, etc.
Why are tools like Puppet, Salt, and Ansible so unintuitive but used by so many companies and developers?
This is going to make a whole bunch of people angry, so before everyone gets their pitchforks ready, here’s some background information to set the stage and convince the more reasonable among you that I’m not just talking out of my butt.
At my very first programming job I was basically tasked with writing tests for power plant control software. The place was a Microsoft shop (Visual Studio, C#/.NET, SQL Server, Virtual Machine Manager, PowerShell, Active Directory, etc.), and it was before Microsoft became cool again by open sourcing all their awesome tools. I went into the job thinking, “Who in their right mind would be using Microsoft tools when the open source world is full of such goodness?” After a few months I realized how wrong I was, and to this day I have not found a more productive stack than what I had at that first job.
It didn’t take me long to figure out that to do my job properly I needed what the professionals called configuration management. Of course, I did not know the proper terms and had no preconceptions about what I needed to do to get everything in working order. I just asked myself: what is the bare minimum I need, and what would an automation framework for it look like?
Well, the first thing I needed was a computer that was in a known good state and had the latest version of the project deployed on it. The second thing I needed was a tool to do that. Fortunately, the folks I worked with were smart enough to generate versioned artifacts and put them somewhere accessible on the internal network. They were also smart enough to version their database schemas. Those two pieces (code and schema versions) were the first half. The second half was a working computer, and since we were using Virtual Machine Manager that was, in theory, taken care of as well. All I had to do was orchestrate the process somehow, and PowerShell with its remote management capabilities was just what the doctor ordered. So I wrote a basic PowerShell script to glue some bits together and set up a pool of VMs running the latest version of the code at the press of a button. It took me about a week to learn PowerShell along with all its associated APIs for talking to Virtual Machine Manager and SQL Server to get them to do what I wanted, and another week to weed out all the bugs and set up proper recovery mechanisms.
Once I had things working, what used to take a day or two now took a few minutes. The feedback loop from cutting a version of the software to having it up and running was shortened, and developers no longer lost the context of what was going on with a specific version of the code. So it was a win all around.
Throughout this entire process I had no clue what Chef, Puppet, Ansible, etc. were all about. I just did what felt like the obvious thing to do and my intuitions were right. Reducing feedback loops in software systems is always a good thing, and I took the pieces I had and worked towards that goal. I was young and had no religious affiliations with any tools. If something was easy to understand and helped tighten feedback loops then it was a tool that I wanted in my tool belt.
Some time later I got a job at another place with a substantial raise. The first place had been shortchanging me by quite a bit, and at the time that seemed OK, but when I found out how much I was gonna get paid in the vicinity of SF I could no longer justify the low salary. So I packed my bags and moved to the technology echo chamber.
The new place had quite a few smart folks and I learned a ton from them, just like at my first job, but this time around things were slightly different. The stuff I had built at my first job taught me that my intuitions about software systems in general were mostly correct, so I was more sure of myself at the second job, but I was still not sure enough to assert myself when it came to design and architecture.
I had no product development background, so at this job I was tasked with stuff related to the build and deployment pipelines. The only things I had on my resume were about building solid pipelines and orchestration mechanisms, so this was a pretty good fit for what I had done at my first job, except this place was not a Microsoft shop. Fortunately, the folks who hired me were smart enough to realize that solid engineers come from all sorts of backgrounds and didn’t hold my Microsoft experience against me.
Their pipeline for building deployable artifacts revolved around Debian packages with basic pre- and post-install scripts. The guy who had built it was pretty smart and had made all the right decisions for that part. Where he messed up a little was with the configuration management. He had looked around to see what everyone else was doing (Puppet) and had just copied that, thinking he could rely on ambient knowledge and experience. He probably should have trusted his intuitions a bit more and not bowed to peer pressure, but by the time I got there the system was in place and it was working, except for one little issue.
Whenever we had to make a change in the production environment related to deploying the application, it was more hassle than felt justifiable to me. At the back of my mind I had a few nagging questions and thoughts: why are build/ops folks making decisions about how the application should be deployed and configured? I don’t tell the developers what to put in their requirements.txt, so why am I the one making decisions about how to deploy and configure their application? Something here didn’t add up. I’m never one to let cognitive dissonance go unresolved, and when the dissonance rose to peak “what the hell” I dropped what I was doing and started prototyping the kind of system I would want to use if I were writing the application.
My idea was very simple. Just like the application had a requirements.txt for building the virtualenv that went into the Debian package, I was going to do exactly the same with the configuration that was managed with Puppet. The problem is that Puppet does not like running in standalone mode (it can be done, but it’s too much hassle, and I’ve since learned that if a tool takes something obvious and intuitive and turns it into a hard problem, then that tool should be replaced as soon as possible). I didn’t have to write anything new because Chef combined with librarian-chef would do exactly what I needed without putting up bullshit barriers. Just like there was a requirements.txt, there would now be a Cheffile that listed all the configuration dependencies of the application. At build time I would pull in all the recipes and hook them up to the post-install script. I forget the actual call syntax, but I basically just added a line to the post-install script that called chef-solo and ran the recipes. After convincing a few folks that this was a good idea, I converted all the application-specific bits in Puppet to Chef recipes, pointed the developers at the recipes, and told them they were now in charge of all configuration related to their application.
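I no longer have the exact files, but the wiring looked roughly like this. The paths, the package name, and the `myapp` cookbook are made up for illustration; `librarian-chef install` and `chef-solo -c` are the real commands, though the exact flags I used back then may have differed:

```shell
# Cheffile (Ruby syntax), checked in next to requirements.txt, lists the
# app's configuration dependencies, e.g.:
#   cookbook 'myapp', path: 'cookbooks/myapp'

# At build time, vendor the resolved recipes into the Debian package:
librarian-chef install    # pulls everything in the Cheffile into ./cookbooks

# One added line in the package's post-install script runs them standalone:
chef-solo -c /opt/myapp/chef/solo.rb -o 'recipe[myapp]'
```

The solo.rb here is just a chef-solo configuration file pointing `cookbook_path` at the vendored cookbooks; no central configuration server is involved at any point.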
At first there was some resistance, but everyone warmed up to the new way of doing things quickly enough. Not only was this a much better way to do things, it was also locally testable. No one had to learn any Puppet DSL BS to figure out if what they were doing made sense. They just had to write some Ruby, run it with chef-solo, and see the results directly. If something didn’t work, the error would tell them exactly where things went wrong: instead of trying to make sense of Puppet DSL errors, they would get a call stack and the exact line where things were failing. Remember the bit about feedback loops? This shortened the feedback loop and put the people who should have been in charge of configuration in charge of configuration. It was again a win all around.
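The local test loop was nothing more exotic than running the same command by hand (again a sketch, with assumed paths and an assumed `myapp` recipe):

```shell
# Run the application's recipes against a scratch VM or container and
# watch the output directly:
chef-solo -c solo.rb -o 'recipe[myapp]'
# On failure, chef-solo prints a Ruby stack trace naming the recipe file
# and line number; no Puppet DSL error archaeology required.
```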
Since those two experiences I’ve worked at a few more places and I’ve seen the same mistakes repeated over and over again: centralized logic where none is required, weird DSLs and templating languages with convoluted error messages, deployment and configuration logic disembodied from the applications that required them and written by people who have no idea what the application requires, weird configuration dependencies that are completely untestable in a development environment, broken secrets/token management and the heroic workarounds, divergent and separate pipelines for development and production environments (even though the whole point of these tools is to make things re-usable), and so on and so forth.
I sometimes wonder if I’m the only sane person left in the world. How is it that these things are obvious to me but not obvious to others? I don’t think I’m that much smarter than the people that built these systems and tools, but how the hell did they screw up so badly? Why do I need a centralized store for configuring an application server? Why isn’t the default mode to run everything standalone and then throw up huge warnings telling people they should think long and hard about using a centralized configuration server? Why don’t any of these tools have an obvious local testing story? Why do I need to go through heroic efforts to set up a testing environment to verify that the changes I’m making are not gonna choke on a syntax error and then give me some convoluted error message? Why aren’t these tools built for developers first and operations folks second? What is the point of shrouding the obvious in YAML or some other weird DSL? Why can’t I package and version the configuration the same way I can package and version an application, and then deploy it with apt-get, yum, etc., and why isn’t this part of the toolset? Finally, and most importantly, why am I the only one that seems to care about these things?
So that, ladies and gentlemen, is why I hate anything and everything that uses YAML or some other weird custom DSL to do the obvious. The current top offenders are Salt, Puppet, and Ansible and they’re gaining more followers by the day. Everything that seems intuitive and obvious to me is either an anti-pattern or straight-up impossible to do with any of those tools. Somehow, I’m the only one that ended up with the right set of experiences that taught me to avoid all the things these tools champion. I’m sure the madness will stop at some point but not before a whole bunch of hair is lost trying to figure out why some snippet of YAML is not creating the right Unix user.
Published at DZone with permission of David Karapetyan, DZone MVB.