It wasn’t that long ago that networking talk was all about the cost of equipment. With CapEx as the primary pain point, everyone was talking about merchant vs. custom silicon, with the primary argument being that a move to common components would provide margin relief in what has to be the most margin-sensitive industry in tech.
Now, with the whole world seemingly converged on a narrow set of silicon (congratulations, Broadcom), the conversation has been shifting. It started with a subtle expansion of the cost argument to include more than just CapEx. OpEx has always been around, but it is getting more play of late in marketing circles. And the OpEx argument itself is becoming more fully fleshed out. Where companies used to tout the easily measured stuff like rack space, power, and cooling, increasingly the discussion floats over into the more operational aspects of managing a network.
We are at the point now that automation is the new god to which we all must pay homage. But are we tossing around the automation word a little loosely?
First off, we should all be clear about something: automation is not about saving keystrokes. Sure, as a result of a highly automated, well-orchestrated infrastructure, you might in fact put fingers to keyboard a little less frequently. But automation ought not be done with the sole objective of typing less.
The problem here is that the things that people best understand how to automate are relatively simple tasks that are annoying to execute. Typing in the same command 27 thousand times appears to be the ideal candidate for automation. And in response, some hacker in the organization figures out how to replace the command with a small shell script or equivalent. What used to take 13 minutes to execute now takes somewhere on the order of 7 seconds. Multiplied by the 27 thousand instances in a typical year, the time savings are both quantifiable and quite attractive. “We should do more of that!” proclaims the CIO.
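To see why the CIO gets excited, the arithmetic can be sketched in a few lines of Python. The figures are the ones quoted above; the function and constant names are made up for illustration.

```python
# Figures from the example above; the names are illustrative only.
MANUAL_SECONDS = 13 * 60   # 13 minutes per manual run
SCRIPTED_SECONDS = 7       # roughly 7 seconds once scripted
RUNS_PER_YEAR = 27_000     # 27 thousand instances in a year


def annual_hours_saved(manual_s, scripted_s, runs):
    """Return the hours saved per year by replacing a manual task with a script."""
    return (manual_s - scripted_s) * runs / 3600


hours = annual_hours_saved(MANUAL_SECONDS, SCRIPTED_SECONDS, RUNS_PER_YEAR)
print(f"{hours:,.1f} hours saved per year")  # prints "5,797.5 hours saved per year"
```

That is nearly 5,800 hours a year from a single command, which is exactly why this kind of win is so easy to sell and so hard to repeat.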
And off the team goes to identify more of these commands.
But there aren’t that many commands that are repeated ubiquitously, uniformly, and in enough volume to really make a difference to the bottom line. Once you retire a couple of heavy hitters, eking out continual OpEx savings by “automating all the things” becomes harder and harder. Why?
This form of automation preys on the repeatably identical task. When something is done the exact same way every single time, regardless of context (either situation or environment), it is well-suited for being replaced with an easier-to-execute task. But as soon as the task requires some cognitive input from the operator (knowing when, where, or how to do something), this type of automation is far less powerful.
It is tempting to attribute this only to things that are shell scriptable, but the world of automation includes way more. We all know companies that are still managing infrastructure with expect scripts. When we bring up this type of automation, whoever is speaking almost always oozes a little bit of derision, because we all know that this type of thing is primitive.
But is combing output for fields really that different from applying templates to configuration?
When you provision devices based on some template, you are really just pattern matching (isn’t that what expect does?) and then applying some formulaic logic. But somehow if you can sprinkle in the phrase DevOps or toss around one of the sexy provisioning tools (Chef, Puppet, Salt, or if you are particularly in the know, maybe Ansible), it seems a whole lot more substantial, doesn’t it?
My point here is not to put down the DevOps tools. Instead, I want to point out that how these tools are used is important. If you view tools like Chef or Ansible as a means of cutting out keystrokes (read: pushing config), then you are likely missing the point of automation.
What these types of tools are really trying to do is much more profound. The power goes well beyond putting an agent on a device and then pumping that device full of config. What these tools are doing is allowing you to create logic (some of it more sophisticated than others) to make intelligent decisions about how to provision a device.
For example, a switch might behave differently depending on what is attached to it. We all know about the role of edge policy (VLANs, ACLs, QoS, and so on) as it relates to managing traffic on the network. So if a top-of-rack switch is attached to one type of device (or VM or application), you might want one behavior, and if it is something else, maybe you want a different type of behavior. It is not just the configuration; it is the right configuration based on the particular context.
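A minimal sketch of what that might look like, in Python rather than in any particular provisioning tool. Everything here is hypothetical: the attachment types, the VLAN numbers, the ACL and QoS names, and the config syntax are invented for illustration and do not correspond to any vendor's CLI or to how Chef, Puppet, Salt, or Ansible model things.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgePolicy:
    """The edge policy applied to a switch port (all values hypothetical)."""
    vlan: int
    acl: str
    qos_class: str


# Hypothetical mapping from "what is attached" to "how the port should behave".
POLICY_BY_ATTACHMENT = {
    "storage": EdgePolicy(vlan=100, acl="storage-only", qos_class="lossless"),
    "web": EdgePolicy(vlan=200, acl="http-https", qos_class="best-effort"),
}

# Unknown attachments get a locked-down default instead of a guess.
DEFAULT_POLICY = EdgePolicy(vlan=999, acl="deny-all", qos_class="best-effort")


def policy_for(attachment_type):
    """Pick the policy from context: the type of thing attached to the port."""
    return POLICY_BY_ATTACHMENT.get(attachment_type, DEFAULT_POLICY)


def render_port_config(port, attachment_type):
    """Render config lines for a port in a made-up, vendor-neutral syntax."""
    policy = policy_for(attachment_type)
    return [
        f"port {port}",
        f"  vlan {policy.vlan}",
        f"  acl {policy.acl}",
        f"  qos {policy.qos_class}",
    ]


if __name__ == "__main__":
    # Same template, different result, depending on what is plugged in.
    for port, attached in [("eth1", "storage"), ("eth2", "web"), ("eth3", "printer")]:
        print("\n".join(render_port_config(port, attached)))
```

The rendering step is the part that saves keystrokes. The lookup in policy_for is the part that matters, because that is where context changes what gets pushed.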
This combination of context and intelligence is what makes automation powerful. And the more context that is available and actionable, the fancier you can get with the automation.
This means that whatever automation framework you are using (anything from shell script to full DevOps environment) must be capable of both performing an action and pulling in information to establish context for that action. We are quite focused on the first part, but the context is what will make automation more or less powerful. The act of executing a sequence of activities is interesting, but having logic to determine what to do is paradigm-changing.
Put differently, if your automation infrastructure is only capable of making left turns regardless of what is happening on the roads, no matter how elegant or fast the turns, you will still end up going in circles.