For folks accustomed to the original posts we publish daily: our industry is in the throes of VMworld bliss this week, so we will be running the Best of Plexxi series instead. Of course, if you haven't read this one yet, it's new to you! If you are in San Francisco this week, stop by Booth 747 and say hello!
The SDN dialogue remains fairly focused on automation. This is actually a good thing; the manual nature of networking combined with an increasingly complex solution space means that networking is in dire need of some help on the automation front. There has been a meteoric rise in the number of companies building tools for this very purpose. The increased attention gives me hope that this problem is on its way to eradication (though I admit it will take years to be fully addressed).
But what comes after automation?
There is no doubt that there are increasing demands on networks. Not surprisingly, this means that networks are adding capacity at breakneck speed. Many people are pinning their hopes on an industry trend towards cheaper port prices on higher-bandwidth Ethernet. The school of thought seems to be that if pricing can come down as aggressively as demand goes up, the whole solution ought to be affordable.
But demand has no upper limit, and pricing ultimately has a floor based on component costs and the margins that the market will tolerate. Scarily, the company pushing pricing the most aggressively (at least in the switching space) appears poised to file for an IPO, which might increase external pressure to maintain (or even grow) margins. The point is that if we think cheap bandwidth is the answer to all the capacity problems, I think we might be in for a collective disappointment.
At some point, we will want to move more toward utilization, getting more out of existing capacity and curbing the need for new capacity to some extent. In fact, Google’s much talked about SDN deployment was aimed precisely at driving utilization up between their data centers.
So the question we need to be asking is: how do we improve utilization?
People have tried to crack this nut more than a few times. Most of us in the networking biz will immediately look at traffic engineering, but we all know that it is complex and difficult to manage. There has been success with MPLS, but that has been more about expediting critical flows and less about improving capacity utilization. MPLS has been tremendous, but more is needed.
Ultimately, the key to getting more out of the network is going to be more intelligent path decisions. That’s not to say that all traffic will have micro-optimized paths – that is an application experience issue (and one that also needs to be solved, but not for all traffic). It seems more likely that we will see a move away from the shortest path algorithms that have been around for more than 50 years (Dijkstra dates back to 1959 for those who didn’t know). I believe we will need to move to more sophisticated algorithms capable of using more than just a tiny subset of possible paths in the network.
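To make the "tiny subset" point concrete, here is a minimal sketch in Python. The topology, node names, and link costs are all invented for illustration; the first function is classic Dijkstra, and the second enumerates every loop-free path between two nodes, which is the pool of capacity a shortest-path-only network never touches.

```python
import heapq

# Hypothetical 5-node topology: adjacency map with invented link costs.
graph = {
    "A": {"B": 1, "C": 2},
    "B": {"A": 1, "D": 1},
    "C": {"A": 2, "D": 2, "E": 1},
    "D": {"B": 1, "C": 2, "E": 1},
    "E": {"C": 1, "D": 1},
}

def dijkstra(graph, src, dst):
    """Classic shortest-path search (Dijkstra, 1959): returns (cost, path)."""
    pq = [(0, [src])]
    seen = set()
    while pq:
        cost, path = heapq.heappop(pq)
        node = path[-1]
        if node == dst:
            return cost, path
        if node in seen:
            continue
        seen.add(node)
        for nbr, weight in graph[node].items():
            if nbr not in seen:
                heapq.heappush(pq, (cost + weight, path + [nbr]))
    return float("inf"), []

def all_simple_paths(graph, src, dst, path=None):
    """Every loop-free path from src to dst -- the pool of routes
    that shortest-path forwarding leaves unused."""
    path = path or [src]
    if src == dst:
        yield path
        return
    for nbr in graph[src]:
        if nbr not in path:
            yield from all_simple_paths(graph, nbr, dst, path + [nbr])
```

On this toy graph, Dijkstra picks a single cost-3 path from A to E, while four loop-free paths actually exist; a more sophisticated algorithm has three more routes to work with than shortest-path forwarding ever considers.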
Today, you have to build out capacity that basically accounts for an any-to-any connection. That capacity is uniformly distributed because traffic is assumed to be random and, in aggregate, uniform. I don’t believe that assumption is true, but that’s a different post. So you build all this capacity, but then you use only the shortest paths between points, relying on ECMP to load balance when multiple equal-cost paths exist. But what about all that other capacity?
If pathing decisions considered all available capacity rather than shunting everything through the same subset of paths, you could see a more even distribution of traffic. Imagine knowing all possible routes through the network and then intelligently distributing traffic across those routes, allowing critical traffic to prefer the shortest routes and offloading residual traffic to any other available path. What you would see is a more even distribution of capacity.
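One way to sketch that policy is below. This is a hypothetical flow-placement function, not any vendor's implementation: critical flows hash across only the equal-cost shortest paths (ECMP-style), while residual flows hash across every available path, soaking up the capacity that shortest-path forwarding leaves idle. The path set and costs are invented.

```python
import hashlib

# Hypothetical A-to-E path set with end-to-end costs (all values invented).
paths = [
    (("A", "B", "D", "E"), 3),
    (("A", "C", "E"), 3),
    (("A", "C", "D", "E"), 5),
    (("A", "B", "D", "C", "E"), 5),
]

def place_flow(flow_id, critical, paths):
    """Pin critical flows to the equal-cost shortest paths (ECMP-style);
    spread residual flows across *every* available path."""
    best = min(cost for _, cost in paths)
    if critical:
        candidates = [p for p, cost in paths if cost == best]
    else:
        candidates = [p for p, _ in paths]
    # Stable hash so a given flow always takes the same path (no reordering).
    digest = int(hashlib.sha256(flow_id.encode()).hexdigest(), 16)
    return candidates[digest % len(candidates)]
```

The design choice worth noticing: hashing on a stable flow identifier keeps packets of one flow on one path (avoiding reordering), while the candidate set, not the hash, encodes the policy of who gets the short routes.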
This isn’t a completely new idea, by the way. If you ask Google what they are most interested in from an SDN perspective, I suspect they might mention the Path Computation Element (PCE) architecture and its protocol, PCEP. In the PCE model, a central server computes paths and pushes them down to a client on the router. I bet what Google would like is to own the algorithms that drive path selection, and then plug into an open source PCE server/client implementation. Why? Because if the network is your business, those algorithms provide tremendous competitive advantage. If you can drive costs down by driving utilization up, you get a sustainable cost advantage over everyone else.
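The split described above can be sketched in a few lines. To be clear, the class and method names here are illustrative only, and this is nowhere near the real PCEP wire protocol (RFC 5440); the point is simply that the server owns the topology and a *pluggable* algorithm, which is exactly the piece an operator would swap for its own proprietary math.

```python
from collections import deque

class PathComputationServer:
    """Toy stand-in for a PCE server: owns the topology and a swappable
    path-selection algorithm. Names are illustrative, not real PCEP."""
    def __init__(self, topology, algorithm):
        self.topology = topology      # adjacency map, e.g. {"A": ["B"], ...}
        self.algorithm = algorithm    # the swappable "secret sauce"

    def compute(self, src, dst):
        return self.algorithm(self.topology, src, dst)

class RouterClient:
    """Toy stand-in for the router-side client: asks the server for paths
    and installs whatever comes back."""
    def __init__(self, server):
        self.server = server
        self.forwarding = {}          # dst -> path pushed down by the server

    def request_path(self, src, dst):
        self.forwarding[dst] = self.server.compute(src, dst)
        return self.forwarding[dst]

def fewest_hops(topology, src, dst):
    """Placeholder algorithm (BFS by hop count) -- the component an
    operator like Google would replace with its own algorithms."""
    queue = deque([[src]])
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nbr in topology[path[-1]]:
            if nbr not in path:
                queue.append(path + [nbr])
    return None
```

Swapping `fewest_hops` for a utilization-aware algorithm changes every forwarding decision in the network without touching a single router, which is the whole appeal of the split.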
And if it is all about algorithms, guess where the future lies? Not in protocol savants but in mathematicians. Networking is still going to rely on three-letter acronyms, but my guess is that the most important one won’t be BGP but rather MIT. Or Stanford, Harvard, Princeton, Berkeley, and Cambridge.
By the way, I will point out that Plexxi’s CEO lives in New Hampshire. And most of the software developers work in Nashua. So why would we bother opening an office in Cambridge? It’s a mystery.