Originally authored by Marten Terpstra
With VMworld 2013 in the history books, we would not want to be left out of the conversation about overlay networks. VMware officially announced NSX, with few surprises given all the bits and pieces of information provided over the past few months. The past week has seen a flurry of articles on NSX and its capabilities, and on overlay networks in general. A selection of links can be found below.
I want to start by being very clear on Plexxi's position with respect to overlay networks: we are all for them and believe we have a superior physical network to complement an overlay network. We absolutely believe that pushing certain responsibilities of network functionality into the vSwitch makes perfect sense. If you look at the elements that have typically created the most complexity in networks, or simply the most bulk in configuration (and are therefore the most prone to mistakes), having them automated and orchestrated by a provisioning system that holds most of the details makes sense. The most common examples are basic packet filtering for basic security and packet markings for QoS. These, along with the far more complex virtual L2 and L3 domains, can, and in many cases should, be handled in a place where CPU and memory resources are plentiful.
Having worked for switch hardware vendors for almost 20 years (most of them in product development, support and delivery), I have seen that the challenge for a switch hardware vendor is always one of cost vs. performance. Non-modular Ethernet switches have two main cost drivers: the Ethernet switch ASIC (or ASICs) and the CPU complex with its memory. Every vendor makes the same trade-offs each time they architect a switch. A larger CPU and more memory give the switch more capabilities: more options to maintain large routing tables, multiple L2 and L3 domains, more interfaces, more filter rules, more of everything and at a better processing rate. A larger or more flexible Ethernet ASIC also provides more feeds and speeds, larger tables and more packet transformation capabilities. Customers always expect more for less, and that, combined with a shift in need, sometimes brings you to an inflection point where the way things are done must change drastically to meet those needs. Server virtualization provides that inflection point for networking.
To be clear, ToR switches could provide the filtering, firewall, load balancing and virtualized L2 and L3 services on behalf of virtualized servers. All they need is more CPU, more memory and perhaps a different ASIC, plus some added software to talk to a controller. That is exactly what hardware gateways will do for overlay networks, and I am pretty sure those could be made to scale fairly well if you are willing to add the additional resources. But the choice to push the network edge functionality into the hypervisor is less about where it could be done and more about where it should be done. Knowledge of VMs, their IP addresses, the applications and port numbers in use, who they should and should not talk to, and everything else exists on the virtualized host and its provisioning system. It just makes sense to have the hardware switches do what they do best, which is to transport packets in the most optimal way from source to destination, and let the virtual switches deal with the complexities of forwarding domains, filtering, firewalling and load balancing.
And like @ioshints in his recent blog post, I believe that whatever startup scaling issues may exist will be solved. Having a few thousand encapsulation contexts is not an unsolvable problem. We should at some point do the math to see whether the resources consumed across the servers to support this functionality add up to the same as, or more than, what it would take to put it in a physical switch, but that's for another day.
At the risk of repeating what I described in previous posts, we believe that how tunneled traffic is transported matters. We believe a network should never be agnostic to what it transports. It must be aware of the load it is carrying, and it should adapt to that load. Traffic patterns don't change simply by putting VM packets into a tunnel. There will still be natural relations between applications and between VMs.
Physical switches with recent Ethernet ASICs will be very aware of tunneled traffic. They can (and should) participate as gateways to the many devices that will not have vSwitch capabilities, and these same ASICs also provide the switch with insights about the tunnel, its endpoints, which virtual network the traffic belongs to and possibly more. In the most simplistic case this information will be used to create entropy for ECMP hashing. It can also be used as input to more complex forwarding calculations to satisfy the needs of the vSwitch and the traffic it is responsible for. We could physically separate all traffic for one virtual network from that of another. We can ensure that particular tunnel endpoints (perhaps those attached to storage arrays) have the lowest latency path available to them. Or we can find combinations of tunnel endpoints that consume far more bandwidth than others and provide them with more physical bandwidth. All of this can happen in ways that do not require complex configuration or explicit knowledge. Let the tools do what they do best; this is DevOps at its finest. Combine that with the optical capabilities of the Plexxi switches and there truly will be light at the end of the tunnel.
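To make the "entropy for ECMP hashing" point a little more concrete, here is a minimal Python sketch of the idea. It assumes a VXLAN-style encapsulation in which the tunnel endpoint derives the outer UDP source port from a hash of the inner flow, so that a switch hashing on outer headers still sees variation between flows that share the same pair of tunnel endpoints. The function names, the choice of SHA-256 and the sample addresses are all illustrative assumptions; this is not the hash any particular ASIC, vSwitch or Plexxi switch actually uses.

```python
import hashlib
from collections import Counter

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN destination port


def _hash32(text):
    """Illustrative 32-bit hash; real hardware typically uses a CRC or XOR fold."""
    return int.from_bytes(hashlib.sha256(text.encode()).digest()[:4], "big")


def inner_flow_source_port(inner_flow):
    """Map an inner flow tuple onto an outer UDP source port in 49152-65535."""
    return 49152 + _hash32(repr(inner_flow)) % 16384


def ecmp_path(outer_src, outer_dst, udp_src_port, vni, n_paths):
    """Pick one of n_paths equal-cost paths from outer header fields and the VNI."""
    key = f"{outer_src}|{outer_dst}|{udp_src_port}|{VXLAN_UDP_PORT}|{vni}"
    return _hash32(key) % n_paths


def spread(flows, outer_src, outer_dst, vni, n_paths):
    """Count how many inner flows land on each path for one tunnel endpoint pair."""
    counts = Counter()
    for flow in flows:
        port = inner_flow_source_port(flow)
        counts[ecmp_path(outer_src, outer_dst, port, vni, n_paths)] += 1
    return counts


if __name__ == "__main__":
    # 200 hypothetical inner TCP flows between two VMs, differing only in source port.
    flows = [("10.0.0.1", "10.0.0.2", 6, 1000 + i, 80) for i in range(200)]
    # Both VMs sit behind the same two tunnel endpoints (documentation addresses).
    print(spread(flows, "192.0.2.1", "192.0.2.2", vni=5001, n_paths=4))
```

Without the per-flow source port, every packet between those two tunnel endpoints would carry identical outer addresses and hash onto a single path. The same fields (endpoints and virtual network identifier) are what a more involved forwarding calculation could key on to separate or prioritize traffic, as described above.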
Some relevant recent blogs and articles: