This article was originally written by Marten Terpstra at the Plexxi Pulse blog.
Much has been published about the Open Compute Project. Initiated by Facebook, it has become an industry effort focused on standardization of many parts and components in the datacenter. Initially focused on racks, power and server design, it has also added storage and now networking to its fold. Its goal is fairly straightforward: “how can we design the most efficient compute infrastructure possible”, a direct quote from its web site.
The focus of OCP has been mostly around hardware designs and specifications. If you look at the networking arm of OCP, you find several Top of Rack (ToR) Ethernet switch hardware designs donated by the likes of Broadcom, Mellanox and Intel. By creating open specifications of hardware designs for fairly standard Ethernet switches, the industry can standardize on these designs, and economies of scale would drive down the cost to create and maintain this hardware. It is a noble goal, and there are many opinions on both sides of this effort. These switches are mostly referred to as “bare metal” and “commodity”, and you can easily spend days reading up on those opinions. Mike Bushong yesterday discussed pricing implications for resellers in this blog post.
An interesting development, however, is that the network group within OCP has added a few software projects to its scope. Mellanox has initiated an Open Switch API specification, which attempts to standardize the lowest layer of interaction with the actual Ethernet switching hardware. For a hardware vendor today, the choice of switching chipset is an important one beyond just the raw capabilities of the chipset. Each chip vendor that has a portfolio of chipsets also has a Software Development Kit, or SDK, that provides the initial layer of software to instruct the chip what to do. Ultimately every chip is manipulated through a (rather large) set of registers that need to be set, but that complexity is hidden in this SDK. The SDK presents a set of higher level APIs to program against that control the functioning of the chipset. This is what we (Plexxi) and all other hardware vendors write our software against; it is how we glue our software to the Ethernet switching hardware.
Different chipset vendors of course have different SDKs, which makes changing chipsets a non-trivial step to take: your software has to adjust to a different SDK with all its functional and sometimes architectural differences. A common switch API would make life better for us. We would have a single API to develop against and could easily support multiple chipsets, since the differences would be hidden behind this API. With so much functional difference between the chipset vendors, even these APIs will have lots of vendor-specific extensions in them, but as recipients of such standardization, we can only welcome the effort.
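To make the idea concrete, here is a minimal sketch of what "hiding the SDK differences behind one API" looks like. Everything in it is made up for illustration: the `SwitchApi` class, the two vendor adapters and the SDK call names they record are hypothetical, and none of it comes from the Open Switch API specification or any real chipset SDK.

```python
from abc import ABC, abstractmethod


class SwitchApi(ABC):
    """Hypothetical common API that the switch software is written against."""

    @abstractmethod
    def create_vlan(self, vlan_id):
        ...

    @abstractmethod
    def add_port_to_vlan(self, port, vlan_id):
        ...


class VendorASwitch(SwitchApi):
    """Adapter for an imaginary SDK that exposes one function per operation."""

    def __init__(self):
        self.sdk_calls = []  # stands in for real calls into vendor A's SDK

    def create_vlan(self, vlan_id):
        self.sdk_calls.append(f"a_vlan_create({vlan_id})")

    def add_port_to_vlan(self, port, vlan_id):
        self.sdk_calls.append(f"a_vlan_port_add({vlan_id}, {port})")


class VendorBSwitch(SwitchApi):
    """Adapter for an imaginary SDK organized around table entries instead."""

    def __init__(self):
        self.sdk_calls = []  # stands in for real calls into vendor B's SDK

    def create_vlan(self, vlan_id):
        self.sdk_calls.append(f"b_table_insert('vlan', key={vlan_id})")

    def add_port_to_vlan(self, port, vlan_id):
        self.sdk_calls.append(
            f"b_table_update('vlan', key={vlan_id}, member={port})"
        )


def provision_access_port(switch, port, vlan_id):
    """Vendor-neutral logic: it only ever sees the common API."""
    switch.create_vlan(vlan_id)
    switch.add_port_to_vlan(port, vlan_id)
```

The point is `provision_access_port`: it runs unchanged against either adapter, and only the adapters know how their SDK is shaped. The vendor-specific extensions mentioned above would show up as extra methods on individual adapters that the common base class does not promise.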
The second software component the networking arm of OCP has taken on is the Open Network Install Environment (ONIE). ONIE is a boot loader and installer based on a complete, but narrow, Linux distribution. ONIE was created as an open source, vendor-agnostic boot loader, with the intent that these commodity switches leave the factory with only ONIE installed. Driven by Cumulus, it creates a convenient and standard method to get software installed and loaded onto bare metal switches from commodity hardware providers. It’s a necessary piece for Cumulus’ business to ensure easy installation of their software onto third party hardware.
ONIE is pretty straightforward, and those who already net boot using DHCP, TFTP and a variety of other means will probably say “what’s the big deal?” At a high level it really provides the same functionality, but with some small yet very relevant differences.
ONIE is built on a full Linux distribution. So out of the gate, switches with ONIE installed boot to a complete Linux before they even attempt to find the switching software image they will ultimately run. Having a full Linux means you have full networking support: you can access the device using telnet, ssh, you name it. Once booted, ONIE follows a set of search rules to find an image to download and install. There is no preconceived notion of what that image is or what needs to be done to install it. The image itself contains the instructions for how it needs to be installed, how the internal disk should be formatted, which partition to install into, and so on. ONIE has none of that knowledge; it just has the Linux tools to do it. Once a switching software image has been installed, ONIE reboots the switch, which now boots right through to the installed software image (with, of course, a chance to interrupt that).
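The boot flow described above can be sketched in a few lines. This is an illustration of the logic only, not ONIE's actual code or search rules: the function names, the `state` dictionary, the source names and the returned action strings are all hypothetical.

```python
def discover_image(sources):
    """Try each (name, fetch) source in order and return the first image found.

    `fetch` is any callable returning installer bytes, or None when that
    source has nothing to offer. The ordering stands in for the search rules.
    """
    for name, fetch in sources:
        image = fetch()
        if image is not None:
            return name, image
    return None


def boot(state, sources, run_installer):
    """Simulate one boot cycle and return the action taken."""
    if state.get("installed_os"):
        # Software is already installed: boot straight through to it.
        return "boot-installed-os"

    found = discover_image(sources)
    if found is None:
        # Nothing found yet; the Linux environment stays up and reachable.
        return "stay-in-install-environment"

    _name, image = found
    # The install environment knows nothing about the image. The image's own
    # installer decides about disk layout, partitions and so on.
    run_installer(image, state)
    return "reboot"
```

A quick walk through two boot cycles, using a stand-in installer that just records what it installed:

```python
def demo_installer(image, state):
    state["installed_os"] = image.decode()

state = {}
sources = [("usb", lambda: None), ("http", lambda: b"demo-nos")]
print(boot(state, sources, demo_installer))  # first boot installs, then reboots
print(boot(state, sources, demo_installer))  # second boot runs the installed OS
```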
ONIE is a bit of boot loader, a bit of installer, and some may even want to call it a BIOS, all wrapped together in a Linux distribution. It is not magic, and it’s not radically innovative. But it is a very convenient and open source way to get switches to boot, find, install and run software, all with the safety net of Linux to get into the switch remotely if the software loading, installation or execution really goes bad. I bet most of you have had switches in crash and boot loops before, and they are hard to recover from without physical intervention.
At Plexxi we do not create standard switching hardware; that should be clear. We also don’t create standard switching software for commodity switches. But we are implementing ONIE on our switches and packaging our software to be ONIE installable, with availability a bit later this year.
For us this is a very convenient and (on its way to being industry) standardized way to implement a boot loading and installation environment that fits well with our open source and Linux roots and beliefs. It is about providing convenience to our customers and logistics teams. We will have the option to ship our switches with ONIE so that the customer can simply power the switch on, plug in the management Ethernet connection and connect our LightRail optical cables, and the switch becomes part of a Plexxi ring, loaded with the right software and visible to our controller.
It’s not a huge innovative leap, but it’s a significant convenience. And there is lots of value in creating convenience and simplicity; they are significant drivers in our overall solution. That makes ONIE a nice fit.