Networks are notoriously fragile. There is a reason that the primary means of combatting downtime in many enterprises is a strict set of change controls designed to throttle any changes that might pose a risk to the network during critical times. And the more complex the network, the more draconian the controls.
But if we are to collectively get more from our networks over time, we will first have to give up some of the control to which we have become accustomed.
Hundreds of Knobs, Thousands of Devices
It's a minor miracle that networking works in some cases. Consider what it takes to get a network of even moderate sophistication up and running. You have to configure hundreds of knobs across thousands of devices just so. It's no surprise that once everything is up and running, the most common strategy is to step slowly away and try not to touch anything for as long as possible.
The problem is bad enough in the general case, but it's even worse for large enterprises. While many people probably think that the cloud providers operate the most complex environments on the planet, it's actually enterprises that wear that crown. With decades' worth of applications distributed across diverse environments and supporting multiple lines of business, enterprise IT represents the most complex environment imaginable. The major cloud players are not burdened with as much legacy infrastructure. They were allowed to start clean and with total control over the application ecosystem they served. And that has allowed them to maintain greater architectural control.
It's instructive that with that control, they have opted to keep it simple. They have learned a lesson from the bumps and bruises that architects, engineers, and operators have had to endure. The only way to play for the future is to make complexity Public Enemy #1, and to fight it diligently at every turn.
While the primary weapon against complexity is removal (fewer devices, fewer applications, fewer protocols, fewer everything), not everything can be subtracted. So once you remove anything unnecessary, you have to do something else. That something else? Abstraction.
A simple working definition for abstraction is to make many things look like a smaller number of things. In the classic data center sense, this is why fabrics are so important. If you can make 128 devices function as a single device, then you have a single place to go to make changes, or to poll for information, or to handle integration. By reducing the number of work surfaces, you simplify the management problem.
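As a rough illustration of the "128 devices as a single device" idea, here is a minimal Python sketch. The Switch and Fabric classes, the leaf-N names, and the mtu setting are all hypothetical stand-ins invented for this example; no real fabric product exposes this exact interface.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical stand-in for one physical device."""
    name: str
    config: dict = field(default_factory=dict)

    def apply(self, key: str, value: str) -> None:
        self.config[key] = value


class Fabric:
    """Presents many switches as a single work surface."""

    def __init__(self, switches):
        self.switches = list(switches)

    def apply(self, key: str, value: str) -> None:
        # One call from the operator fans out to every member device.
        for switch in self.switches:
            switch.apply(key, value)

    def poll(self, key: str) -> dict:
        # One query returns the setting as seen on every member device.
        return {s.name: s.config.get(key) for s in self.switches}


# 128 devices, one place to make the change and one place to check it.
fabric = Fabric(Switch(f"leaf-{i}") for i in range(128))
fabric.apply("mtu", "9000")
mtu_by_device = fabric.poll("mtu")
```

The operator touches one object; the fan-out to 128 members is the fabric's problem rather than the operator's.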
One of the major premises behind SDN was to provide an even greater degree of abstraction, extending the boundaries of the abstracted interfaces across multiple domains. Indeed, centralized policy management extends the benefits of a fabric across a broader section of the infrastructure.
But abstraction is not only about controllers and fabrics. I would argue that what most companies consider automation is actually abstraction. Writing a script to automatically execute a sequence of 34 commands is taking 34 things and making them look like a single thing. That is the power of abstraction.
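To make the point concrete, here is a small Python sketch in which a sequence of commands is wrapped in a single function call. The provision_vlan function, the send callable, and the command strings are assumptions made up for illustration; the commands are only loosely modeled on a generic switch CLI and are not tied to any specific platform.

```python
def provision_vlan(send, vlan_id: int, name: str, ports: list) -> int:
    """Run the whole command sequence as one operation.

    `send` is any callable that delivers one command string to a device
    (hypothetical; in practice it might wrap an SSH or API session).
    Returns the number of commands sent.
    """
    commands = [
        "configure terminal",
        f"vlan {vlan_id}",
        f"name {name}",
        "exit",
    ]
    for port in ports:
        commands += [
            f"interface {port}",
            "switchport mode access",
            f"switchport access vlan {vlan_id}",
            "exit",
        ]
    commands.append("end")

    for command in commands:
        send(command)
    return len(commands)


# Collect the commands in a list instead of sending them to a real device.
sent = []
count = provision_vlan(sent.append, 42, "payments", ["Eth1/1", "Eth1/2"])
```

The caller sees one operation with three parameters; the individual commands behind it are no longer something the caller has to think about.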
In networking, the issue with abstraction is that we have built up most of our discipline around pinpoint control over our operating environments. We have grown accustomed to tweaking behavior to handle snowflake workloads. In our desire to mitigate the frailty of networks, we have made point changes that get rolled out to the smallest subsection of the network possible (just this site, or just that application). And in doing so, we have actually made things worse.
While any one of these changes can be justified, in aggregate, we have segmented infrastructure so that there is no common expression of behavior that applies to broad swaths of the network (or networks). If everything is unique, how do you abstract anything at all?
The answer for most people is that you don't. If you cannot live with trading off control, then abstraction is simply a bridge too far. You might try to find common elements across domains that can be abstracted, but they will tend to align to tiny, uninteresting bits of configuration: things like gateway addresses, global filters, and basic protocols. These things tend to be common because they seldom change, so even though it is good to abstract them, doing so does not make much of a difference.
The Lowest Common Denominator Problem
People who are skeptical of abstraction tend to point to the lowest common denominator problem. They will call out that reducing control to the baseline capabilities that exist everywhere means removing a lot of functionality from the network. On this point, they are right.
This raises an existential question for people architecting environments: To what extent are you responsible for each point element of the infrastructure, as opposed to the behavior of the infrastructure in aggregate? There are no absolutes here. But which side should you favor?
If you view IT as fundamentally subordinate to everyone else, then you probably come down on the side of supporting every request. Enterprises that fund IT using a tax model (allocating funding as a percentage contribution from individual lines of business) will tend towards this model. There is nothing inherently wrong with this model, but it does mean that IT will have a necessarily difficult time driving unique strategic value within the company.
If, however, you view IT as a strategic enabler for the business, then you probably subscribe to the belief that IT needs to have a say in helping shape the direction for the company. In this case, IT might work with the lines of business to retire aging applications and upgrade sites so that the whole of the business can move with greater efficiency (either lower cost per whatever unit you are measuring, or faster time to deploy for new infrastructure components).
Boiling this down a bit, it could be that the real objective in the lowest common denominator problem ought to be to raise the lowest common denominator so that it represents something more universally useful.
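One way to picture this is to treat the lowest common denominator as a set intersection: the features available at every site. The Python sketch below uses made-up site names and feature labels purely for illustration.

```python
def common_denominator(sites: dict) -> set:
    """Return the features available at every site (the intersection)."""
    feature_sets = list(sites.values())
    return set.intersection(*feature_sets) if feature_sets else set()


# Hypothetical sites and the features each one supports.
sites = {
    "hq": {"vlan", "acl", "evpn", "telemetry"},
    "branch": {"vlan", "acl", "evpn"},
    "legacy-plant": {"vlan", "acl"},
}

before = common_denominator(sites)

# Upgrading the weakest site raises the baseline for everyone.
upgraded = {**sites, "legacy-plant": {"vlan", "acl", "evpn"}}
after = common_denominator(upgraded)
```

In this toy data, a single upgrade at the weakest site adds a feature to the baseline that every abstraction can then rely on, without touching the other sites.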
Beware of Leaky Abstractions
While it seems obvious to me that companies ought to favor agility over some of the point applications that currently drag everything else down with them, I do worry about the consequences of swinging the abstraction pendulum too far.
Joel Spolsky wrote a while back about the concept of leaky abstractions. His example is TCP, which guarantees reliability despite running over IP, which does not. You should read his blog post on the topic. But his central thesis is that every abstraction will have some amount of leakage, meaning there will be an eventual need to understand the underlying working dynamics. Relying on TCP's reliability is fine, until there are issues with it. At that point, you have to dig into the IP side of the world.
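A toy Python sketch can show the shape of the leak. This is not TCP; LossyLink and reliable_send are hypothetical names for a deliberately simplified retransmission loop over a channel that drops packets.

```python
class LossyLink:
    """Toy unreliable channel: drops the first `drops` packets it is given."""

    def __init__(self, drops: int):
        self.drops = drops
        self.delivered = []

    def send(self, packet: str) -> bool:
        if self.drops > 0:
            self.drops -= 1
            return False  # packet lost, so no acknowledgement comes back
        self.delivered.append(packet)
        return True


def reliable_send(link: LossyLink, packet: str, max_attempts: int = 5) -> int:
    """Retransmit until acknowledged; return the number of attempts used."""
    for attempt in range(1, max_attempts + 1):
        if link.send(packet):
            return attempt
    raise ConnectionError(f"gave up after {max_attempts} attempts")


# The caller gets "reliable" delivery in both cases...
clean = reliable_send(LossyLink(drops=0), "hello")
congested = reliable_send(LossyLink(drops=3), "hello")
# ...but the attempt count differs, and a bad enough link still fails.
```

The interface promises delivery, yet the cost of delivery and the failure case both depend on the layer underneath. Explaining either one requires knowing how that layer behaves.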
Oddly enough, this means that while we will naturally have to give up control to broaden the usefulness of abstractions in the pursuit of agility, we will find it is dangerous to give up knowledge.
Enterprise IT might be expected to deliver on the support of point applications, but broader success will ultimately be measured by IT's ability to keep pace with the company's needs. The reason Shadow IT is even a thing is that IT has historically been ill-equipped to handle the pace of change, due in part to outside forces but also to things fully within IT's power to control.
If agility is indeed the new TCO, then IT will have to adopt abstraction as a tool to get faster. And that means putting to rest some of the snowflake deployments that weigh down the infrastructure team over time. While any one of these might be important, in aggregate, they are sapping life from the company. Left unaddressed, they will render the infrastructure incapable of supporting the business.
Abstraction doesn't mean going straight to controller-based architectures (though that is an option). Minimally, enterprises need to be looking at fabrics in the data center and cloud-managed CPE across the WAN. Both offer useful steps towards a more abstracted operating environment. And they will seed the cultural change required for IT to move from obsessive control to something more useful for the company.