This article was originally written by Marten Terpstra at the Plexxi blog.
In the past few weeks you may have seen several press releases and articles talking about 25 Gbit Ethernet. Just when you got used to Ethernet speeds being a nice decimal-based system where we simply add a zero every few years, someone threw in 40GbE a few years ago. And that’s OK, multiples of four we can deal with, but 25? That just does not fit in our mental model of Ethernet.
The driving force behind 25GbE is actually fairly simple and straightforward. If you open up an Ethernet switch (small, large, it does not really matter), you will find that all the high speed components are connected using serial links called SerDes, the rather boring concatenation of SERializer and DESerializer. The serializer is a piece of logic that takes data to be transferred and turns it into a serial stream of bits; the deserializer sits on the receiving side and reconstructs that stream back into data for the ultimate receiver. Between the two, there are some basic encoding mechanisms to keep their clocks synchronized, some basic framing and a few other things. Search for 64B/66B encoding if you really want to understand the gory details.
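To make the serializer and deserializer roles concrete, here is a toy sketch in Python. It is only an illustration: the function names are made up for this example, the most-significant-bit-first ordering is an arbitrary choice, and it leaves out everything that makes a real SerDes hard (clock recovery, framing, 64B/66B encoding, signal integrity).

```python
def serialize(data: bytes) -> list[int]:
    """Turn bytes into a flat stream of bits, most significant bit first."""
    bits = []
    for byte in data:
        for shift in range(7, -1, -1):
            bits.append((byte >> shift) & 1)
    return bits


def deserialize(bits: list[int]) -> bytes:
    """Rebuild the original bytes from a bit stream produced by serialize()."""
    if len(bits) % 8 != 0:
        raise ValueError("bit stream length must be a multiple of 8")
    out = bytearray()
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# The receiver ends up with exactly what the sender started with.
assert deserialize(serialize(b"25GbE")) == b"25GbE"
```

The round trip at the end is the whole point: whatever goes into the serializer on one component comes out of the deserializer on the other, one bit at a time over a single link.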
Gigabit and 10 Gigabit Ethernet run over these SerDes connections between components. In a typical 10GbE Top of Rack switch, the actual Ethernet ports are SerDes connections coming from the Ethernet switching chip (everyone has heard of Trident2 as the market leading chipset in use today; it has 128 of them, each representing a 10GbE equivalent port). These connections are then used to connect to other Ethernet or fabric chips (in the case of chassis based systems), or directly to the cages that SFP+ and QSFP optics plug into. Communication between an SFP+ in the front of the switch and the switching chip runs on top of one of these SerDes connections.
As you probably figured out, the components used in today’s switches all run SerDes with a clock rate around 12.5GHz, providing that 10Gbit transfer rate between the components across each link (allowing for the encoding overhead). Until recently, that speed was about the state of the art for running these serial links across short distances (this is all inside of a single device) within acceptable signal loss and crosstalk ranges. Signal integrity is not one of my strong points, so that’s about the best explanation I will give you.
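The encoding overhead is easy to put a number on. With the 64B/66B encoding mentioned earlier, every 64 bits of data travel as 66 bits on the link, so the link has to signal slightly faster than the data rate it delivers. The sketch below is just that arithmetic; the function name is made up for this example, and it assumes a faster lane uses the same 64B/66B encoding. It describes the bit rate actually on the wire, not the maximum rate a given SerDes is rated for.

```python
# Sketch: signalling rate of one serial lane carrying 64B/66B-encoded data.
PAYLOAD_BITS = 64  # data bits per block
CODED_BITS = 66    # bits actually sent per block (2 bits of overhead)


def line_rate_gbaud(data_rate_gbit: float) -> float:
    """Signalling rate needed to deliver data_rate_gbit after 64B/66B overhead."""
    return data_rate_gbit * CODED_BITS / PAYLOAD_BITS


for rate in (10, 25):
    print(f"{rate} Gbit/s of data needs {line_rate_gbaud(rate)} Gbaud on the wire")
```

For 10 Gbit/s of data this works out to 10.3125 Gbaud, an overhead of roughly 3 percent.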
10 to 40 to 100
With that 10Gbit building block we have created higher speed interfaces. When you look at a 40GbE interface, it is constructed out of 4 parallel SerDes links between the Ethernet chip and the QSFP pluggable. Even when leaving the QSFP onto fiber, it takes 4 parallel 10Gbit streams to transport this to the receiving QSFP. The short reach QSFP interfaces use 4 pairs of fiber between them, and their copper Direct Attach Cable (DAC) equivalent carries the same on several copper cables inside the big cable. Longer reach QSFP interfaces put the 4 10Gbit streams onto separate Wavelength Division Multiplexing (WDM) wavelengths, which can be carried over a single pair of fiber. This is part of the reason why QSFP optics are still fairly expensive, especially for longer distances. Distribution of the bits across these parallel paths is done on almost a bit by bit basis by the hardware and has nothing to do with the packet based distribution we know in Ethernet.
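A rough way to picture that distribution is round-robin striping: the sender deals small units of the stream out to the lanes in turn, and the receiver picks them back up in the same order. The Python sketch below is a simplification under stated assumptions: the function names are made up, the "blocks" stand in for whatever small unit the hardware really uses, and the lanes are assumed to arrive perfectly aligned, which real hardware has to work to guarantee.

```python
def stripe(blocks: list, lanes: int = 4) -> list[list]:
    """Deal blocks out to the lanes in round-robin order."""
    per_lane = [[] for _ in range(lanes)]
    for index, block in enumerate(blocks):
        per_lane[index % lanes].append(block)
    return per_lane


def merge(per_lane: list[list]) -> list:
    """Reassemble the original order from lanes filled by stripe()."""
    lanes = len(per_lane)
    total = sum(len(lane) for lane in per_lane)
    return [per_lane[index % lanes][index // lanes] for index in range(total)]


stream = list(range(8))          # eight blocks of one 40GbE stream
lanes = stripe(stream, lanes=4)  # [[0, 4], [1, 5], [2, 6], [3, 7]]
assert merge(lanes) == stream
```

Note that a single packet ends up spread across all 4 lanes, which is why this is nothing like the per-packet or per-flow distribution used across a group of separate Ethernet links.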
The 100GbE interfaces that are available today are similarly constructed out of 10 parallel paths of 10Gbit streams. As in the 40GbE example above, these are carried across 10 pairs of fiber, or multiplexed together onto a single pair of fiber. Of course that also comes at a cost.
In the past 2-3 years, technology has advanced to the point that 25GHz SerDes have become economically viable, and all of the usual physics problems in signal integrity have found solutions. This now means that we can push data 2.5 times as fast across those serial links, and Ethernet chipsets due in the next year will start to have 25GHz SerDes ports on them rather than 12.5GHz ports. Once you have these ports, you can of course still run 10GbE across them, but you would not use all the capacity of that connection. 40GbE will then have the option to run across 2 parallel 25GHz SerDes, rather than the 4 required today. And that translates into less cabling between devices. Similarly, 100GbE will move away from the current 10×10 implementations rather quickly to 4×25, for the very same reasons. Fewer parallel paths, fewer fibers, fewer optics, less of everything.
Which then leaves the question: if there is this basic 25GHz building block that we intend to use for 40GbE and 100GbE, why would we not want to use it by itself for 25GbE? As a single signal, it would provide a 2.5x performance boost in an SFP+ form factor without doing anything complicated. It’s like taking 10GbE and simply running it faster, once the hardest part has been solved: running a serial signal that fast.
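The lane counts in this section all come from the same bit of arithmetic: divide the interface speed by what one lane can carry and round up. The sketch below spells that out in Python. The function name is made up for this example, and the 10 and 25 are the nominal per-lane data rates in Gbit/s used throughout this article, not exact signalling rates.

```python
def lanes_needed(interface_gbit: int, lane_gbit: int) -> int:
    """Number of parallel serial lanes needed to carry interface_gbit."""
    return -(-interface_gbit // lane_gbit)  # integer division, rounded up


for speed in (10, 25, 40, 50, 100):
    old = lanes_needed(speed, 10)
    new = lanes_needed(speed, 25)
    print(f"{speed:>3}GbE: {old:>2} lanes at 10Gbit, {new} lanes at 25Gbit")
```

The rounding up is where the waste shows: 10GbE still occupies a whole 25Gbit lane, and 40GbE over 2 lanes leaves 10Gbit unused, while 25GbE, 50GbE and 100GbE fill their lanes exactly.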
And then there is 25
And there is your long-winded but rather straightforward reason for 25Gbit Ethernet. Independent of Ethernet, serial I/O technology has created an extremely useful building block that runs much faster than its predecessor. IEEE in its standardization of 100GbE already assumed 25GHz serial I/O capabilities and has layered its definition of 100GbE on top of it (the 10×10 available today is mostly a placeholder, so make sure you ask your vendor what flavor of 100GbE they provide). But that same IEEE never went back to re-apply that 25GHz technology to 10GbE and 40GbE and turn them into 25GbE and 50GbE. With lots of the foundational work done as part of the 100GbE specifications, this is not the tremendous 4-5 year effort that most IEEE standards efforts take.
The vendor industry has taken it on itself to move this along outside of IEEE with a 25GbE consortium. There are several parts and components required to create complete 25GbE solutions. The Ethernet chips will start to have the 25GHz SerDes within a year; we then also need pluggable optics and perhaps even Direct Attach Cables to support native 25GbE and its 50GbE sister, and of course server NICs need to support this as well. This is one of those efforts that requires a relatively small development across all of these components (emphasis on relatively) with a fairly quick 2.5x performance payoff at the end. As a consumer, 25GbE and 50GbE will provide you with a substantial performance boost in your datacenter server and storage environment with less cabling, at a cost that in my opinion will come down to a small premium over 10GbE fairly quickly over the next few years.
At Plexxi we fully support the 25GbE efforts. There is very little, if anything, negative associated with the push to productization. We will quickly embrace Ethernet chipsets that support 25GHz SerDes and the optical components that help us drive our optical fabric to higher capacities. The IEEE has always been the one and only standardization body for anything Ethernet, but it has been sent a clear message by the industry to move a lot faster. I have no doubt that that same industry will drive 25GbE to commercial success because it just makes sense.