The white box switching movement appears to be gaining some momentum. For some, it is a fait accompli that hardware and software will be meaningfully separated, allowing users to procure each independently in a model that more closely resembles what happens on the server side. The fires are fueled by projects like Facebook’s Open Compute Project, an effort to open source compute reference platforms.
I don’t actually want to debate the merits of a decoupled hardware/software model. The strategic implications are far more interesting, and how that might shape vendor activity over the next couple of years is downright fascinating.
The major impetus behind the move towards white box is cost. And the primary cost driver is CapEx. By separating the intelligence from the raw switching capabilities, switches can be relegated to relatively dumb boxes, optimized for price and performance. Certainly the industry's consolidation on Broadcom silicon has helped this trend along. At a minimum, it has primed the pump to help streamline commodity purchases and deployments.
But beyond pure cost, there are a couple of other reasons this trend is important. The one touted most frequently is that the decomposition of the networking stack into its constituent elements allows for more choice and flexibility for customers. Theoretically, customers who are more interested in optimizing the stack for some outcome other than lowest cost could cobble together some combination of hardware and software well-suited for their specific needs.
The choice and flexibility argument, at least for now, is specious. Flexibility only matters when you can swap in two things that are functionally equivalent but meaningfully different. The fact that Cumulus software runs on a reference architecture manufactured by Quanta or Accton is not very meaningful if both are building essentially the same box. The only choice in this case is between manufacturers, but that doesn’t do much toward providing optionality. Sure, it will keep both vendors honest when it comes to cost, but customers are still ultimately locked in to that reference architecture.
That’s not to say that the Cumulus approach is without merit. Credit Suisse (see page 88) estimates that a Cumulus solution, when measured on a per-device basis, saves a whopping 70% in total costs (accounting for capital and support costs). This alone is going to be enough to win eyeballs. But if the long-term game is optionality, we should expect Cumulus to bob and weave a bit.
What might they do?
I have no real knowledge of what they will do, but in my opinion they need to add more horses to their stable. They need to find differentiated hardware solutions that run their software. If they can find even two or three hardware platforms, ideally with different value propositions (not just the cheapest), they will be able to promise customers the flexibility they want in their underlying hardware while preserving a cost structure that could be significantly different from legacy offerings.
The challenge in executing against this type of strategy is that you have to first make sure the base reference platform (and derivatives) is solid. There is a ton of work in dealing with all the interface support, for example, required to satisfy all of the commercial variations of the first reference platform. Once Cumulus burst on the scene, there was going to be a bunch of requests for very specific configurations, which requires all kinds of driver support. Not very sexy work, but building a business is ugly at times.
But what happens to the TCO advantage if Cisco lowers their prices? If Cumulus has a 70% TCO advantage over Cisco, won’t they just take over a bunch of accounts?
Setting aside for a moment that not everyone wants to (or can) roll their own deployments, there is an interesting point in that Credit Suisse report. If you look closely, they estimate Cisco support costs at $6k per year, compared with Cumulus’ combined license/support costs of $1.5k per year. It’s not just a CapEx play here. Cumulus is cutting prices on support. Essentially, they are competing on OpEx more than CapEx. In the Credit Suisse model, the hardware difference is $14k total but basically only $2k per year when you consider the life cycles they use. Compare that to the $4.4k difference in software licensing and support costs per year.
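To make the OpEx-versus-CapEx comparison concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the rounded figures quoted above; the function and variable names are my own, not anything from the Credit Suisse model. The implied life cycle is simply the total hardware difference divided by the per-year figure, which is an inference from the quoted numbers rather than a value taken from the report. Note that the rounded support figures give a $4.5k annual gap rather than the $4.4k quoted, a small difference that may come from rounding.

```python
# Rounded figures quoted in the text above (USD). Names are illustrative only.
CISCO_SUPPORT_PER_YEAR = 6_000
CUMULUS_LICENSE_SUPPORT_PER_YEAR = 1_500
HARDWARE_DIFF_TOTAL = 14_000
HARDWARE_DIFF_PER_YEAR = 2_000


def implied_lifecycle_years(total_diff, per_year_diff):
    """Life cycle implied by spreading the total hardware gap over yearly amounts."""
    return total_diff / per_year_diff


def annual_savings_breakdown():
    """Split the yearly per-device savings into OpEx and CapEx portions."""
    opex = CISCO_SUPPORT_PER_YEAR - CUMULUS_LICENSE_SUPPORT_PER_YEAR
    capex = HARDWARE_DIFF_PER_YEAR
    total = opex + capex
    return {
        "opex_per_year": opex,
        "capex_per_year": capex,
        "total_per_year": total,
        "opex_share": opex / total,
    }


if __name__ == "__main__":
    years = implied_lifecycle_years(HARDWARE_DIFF_TOTAL, HARDWARE_DIFF_PER_YEAR)
    breakdown = annual_savings_breakdown()
    print(f"Implied hardware life cycle: {years:.0f} years")
    print(f"OpEx savings per year:  ${breakdown['opex_per_year']:,}")
    print(f"CapEx savings per year: ${breakdown['capex_per_year']:,}")
    print(f"OpEx share of yearly savings: {breakdown['opex_share']:.0%}")
```

On these rounded inputs, support and licensing account for a bit more than two thirds of the yearly per-device savings, which is the point: the headline is hardware, but most of the gap is OpEx.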
Cumulus has very cleverly allowed the industry to talk about the hardware costs while they have built a business model around OpEx. The question is going to be what happens to TCO when you include management and staffing, integration with other systems, and professional-services engagements. I don’t actually know the answer, but I love the direction, because it points the entire industry the right way. The pure CapEx focus that has dominated launches this year is nuts.
There will be an interesting phenomenon as the white box trend gets more attention. Other vendors will begin to paint their solutions in a similar light (not unlike the SDN washing that happened earlier this year). Companies that offer vertically-integrated products will likely claim their software is separate from their hardware. If they can create a plausible scenario where hardware and software are separated, they can seize on industry momentum.
But be careful here. Saying that software could run on other hardware is one thing. Actually delivering against that vision is another entirely.
For a network operating system to run on other hardware, it is about more than abstracting out the hardware. The hard bits to solve go well beyond modular software architectures. You have to build a release process that delivers a binary to your partners. If you claim that your partners can innovate on the hardware side independently, then you have to give them the ability to modify and extend the software. This is not just APIs by the way. How does the partner do things like debugging if they don’t have access to the source code? Is the code sufficiently instrumented? And then how is this software released? How are bug fixes administered? What do you do with package signing and builds? How are issues handled by support? It is not impossible, but it certainly is harder.
My point is not that this is impossible, but rather that it is much harder than just claiming that a network operating system was designed with this end in mind. The legal issues around IP protection, the licensing issues around open source or even purchased code, security issues around secure software life cycles, developer support challenges around maintaining a proper developer tool kit… the list goes on. Customers who are interested in a decoupled hardware/software offering ought to be asking a lot of questions.
So what does all of this mean?
I think the industry is poised to battle more openly on OpEx. The CapEx arguments are real, but the convergence on a small set of similar hardware is going to make the CapEx differences between platforms far less interesting than the OpEx considerations. And once we transition to OpEx, you will see a whole bunch of jockeying for position as people test out what resonates in the marketplace. Cumulus has seized on support costs. Arista and Juniper have touted provisioning automation.
Customers can get ahead of this if they instrument their own environments and find out where the real OpEx costs are. Knowing your own environment would seem to be the best way to not be unduly persuaded by what will certainly be a strong show of marketing force in 2014.
[Today's fun fact: Every year about 98% of the atoms in your body are replaced. So when people tell me I am half the man I used to be, they are really overestimating.]