Originally written by Marten Terpstraat the Plexxi blog.
Over the past few weeks I have had several conversations related to calculating network topologies and how packet forwarding is done based on those topologies. I wrote this post about a year ago explaining some of these details, but after a conversation with a customer earlier this week, I wanted to explain in a little more detail and relate it not only to Shortest Path First methods, but also to more traditional L1 traffic engineering and path creation.
Any network can be represented as a graph. The switches in the network are the vertices or nodes in the graph, the links between them the edges. The graph now represents the network and graph theory can be applied to find paths between any two nodes in the graph. Edges can have weights or metrics associated with them, or any other set of constraints that you may want to articulate for that edge (latency, bandwidth, cost, etc). The goal of any path algorithm is to find the best path or paths from a node to all other nodes.
The most common algorithms in use in networking today (OSPF, ISIS) are variations on a Shortest Path Forwarding algorithm or SPF. Published by Dutch computer scientist Dijkstra, the most popular algorithms simply find all possible ways to get from one node to all other nodes, where each edge has a metric value. The shortest path is the path for which the sum of all edge metrics is the least. That path is now the preferred path and will be used for packet forwarding. The algorithm itself guarantees that no loops in the network are created in this path selection and when all switches run this algorithm independently, they each now have a view of the shortest paths to all other switches.
Dense versus Sparse
In networks that are relatively sparse in connectivity (i.e. there are not a lot of links connecting switches together), the amount of possible paths is relatively small and determining which path is the likely one to be picked is fairly straightforward, and you can create a metric map for all the edges that will give you the results you expect.
In networks with much more dense connectivity, the amount of paths starts to increase almost exponentially to the point where no pen, paper or human brain can determine all possibilities, let alone the right ones. Even in relatively small mesh networks (which is what a Plexxi network is), there are easily 100s of potential paths between any two switches, 1000s even. Some are direct, some may be indirect. Trying to assign metrics to links to drive a specific path selection becomes pretty much impossible and you need higher level descriptions of desired paths.
In traditional long haul transport networks where the amount of links is fairly limited due to cost, you will find fairly sparse connectivity between network elements, and with it a desire to be very explicit in creating paths between these endpoints. In metro networks, connectivity is more available and the connectivity becomes denser, and the desire for hand selecting paths becomes less. Move this into a local environment like a data center, and connectivity density will drive path selection to pure math based selection.
Link diversity and characteristics
Not all links are created equal. As I mentioned above, most algorithms allow some metric to be associated with an edge or link in the graph, and the best paths are the ones with the lowest summed metric. But what if you want to articulate more than just some metric? What if I want a path that has the most bandwidth, regardless of the amount of nodes and links? What if I want to set aside certain paths for certain network flows and make sure no other traffic follows those same paths? How do you combine these metrics? The definition of “best” all of sudden becomes less obvious. This is where more complex algorithms come in, those that take other constraints into account and not just calculate the shortest path from any node to any other node.
Plexxi Network Topologies
In a Plexxi network, Control collects the graph that makes up the network fabric. Control discovers all the switches, and the switches discover all links between them, there is a simple peering mechanism that runs across links that connect Plexxi switches. The combination of switches and links is turned into a graph very similar to any other link state protocol out there. But then we take a different approach.
In traditional shortest path selections, there is no notion of the amount of traffic that is expected to flow across the paths selected. Which means that paths that flow through multiple nodes will impact each other because each node may contribute to the load on that link. The path selection has no notion of traffic or expected traffic and simply creates that path, or several parallel paths of the same cost in the case of ECMP.
The Plexxi path selection is based on traffic modeling and calculates paths based on how much traffic is expected to travel across each edge between nodes, on each link between switches. By incrementally satisfying the need from each node to each other node, the algorithms will find that one edge that is shared between multiple end to end paths may be overloaded, and will find different edges for some of these paths. End to end paths between switches in a Plexxi network may be direct, through an intermediate hop, or multiple intermediate hops. When calculating what traffic needs to go where across all these paths, paths may be claimed by a specific flow or set of flows, and removed from consideration for any other traffic. Or when you want to avoid certain links except for failure scenarios, you can provide a preference that may not entirely avoid the use of a link, but only use a specific portion of it. Except in failure conditions.
Our approach to calculating the best paths from any switch to any other switch is still graph theory. But is has evolved far beyond the basics of shortest path first methodologies. Of course we tend to use short paths over long paths, but the calculations do not end there. If the modeled traffic needs more than what that short paths provide, the answer is not “too bad, add more links along that path”, we simply start to add indirect or non shortest path options to the total set of end to end paths used to get from A to B.
It is not always easy to explain. Having a centralized view of the all switches and their links between them, allows for non traditional graph algorithms to create paths across the network, with constraints that extend well beyond a basic metric. We will happily create 16 or more paths between any two switches, direct and indirect. And use all of them, independently load balanced according to weights calculated by those same algorithms. This is not regular shortest path graph theory anymore.- See more at: http://www.plexxi.com/2014/08/graph-theory-calculating-network-topologies/?utm_source=feedly&utm_reader=feedly&utm_medium=rss&utm_campaign=graph-theory-calculating-network-topologies#sthash.m25lIYJM.dpuf