We hope you enjoyed the November 22 DemoFriday™ on multilayer provisioning featuring Brocade, Infinera, and the U.S. Department of Energy’s Energy Sciences Network (ESnet). The demonstration used ESnet’s On-Demand Secure Circuits and Advance Reservation System (Oscars) application to show how SDN can be used to provision services and automatically optimize resources across a multilayer network. If you missed it, you can watch the full presentation here.
As always, panelists heard insightful questions from participants following their presentation. We’ve expanded that Q&A below to include answers we didn’t have time for during the session. The panelists were Daniel Williams, Brocade director of product marketing, data center and service provider routing; Chris Liou, Infinera vice president of network strategy; and Inder Monga, ESnet chief technologist and area lead.
You also can check out the teaser video and other resources listed below.
The OpenFlow standard does not yet have support for optical transport. What mechanism was used to set up the transport services?
Infinera: It is true that OpenFlow extensions to address the needs of optical transport have not yet been defined, but there are standardization efforts currently underway within organizations such as ONF and OIF to extend the notions of SDN programmability and OpenFlow support to optical transport.
In order to facilitate a unified control plane and show how a third-party controller could be used to manage a multilayer network, we leveraged the logical port concept within the OpenFlow “wire” protocol to enable provisioning of nodal and network flows, without adding any proprietary extensions. These logical ports were defined through a REST interface. During topology discovery, Oscars discovered these logical ports, including the information it needed to formulate the appropriate ofp_flow_mod requests. Oscars was then able to send these requests down to the transport layer through Floodlight, without modification to the protocol.
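As a rough illustration of the idea, a transport cross-connect between two logical ports can be expressed as an ordinary flow entry: match on an ingress port, output to an egress port. The sketch below builds such a request as a plain Python dictionary. The field names, port numbers, switch ID, and the `build_transport_flow` helper are all hypothetical; they do not reflect the actual Oscars, Floodlight, or OTS interfaces.

```python
import json


def build_transport_flow(switch_id, in_port, out_port, name):
    """Describe a port-to-port flow entry (hypothetical structure)."""
    return {
        "switch": switch_id,          # ID of the virtual transport switch (made up)
        "name": name,                 # label for this flow entry
        "command": "add",             # conceptually an OFPFC_ADD flow-mod
        "match": {"in_port": in_port},
        "actions": [{"type": "output", "port": out_port}],
    }


def build_bidirectional_circuit(switch_id, port_a, port_b, name):
    """A bidirectional circuit needs one flow entry per direction."""
    return [
        build_transport_flow(switch_id, port_a, port_b, name + "-fwd"),
        build_transport_flow(switch_id, port_b, port_a, name + "-rev"),
    ]


flows = build_bidirectional_circuit("00:00:00:00:00:00:00:01", 1, 2, "demo-circuit")
print(json.dumps(flows, indent=2))
```

Because only standard match and output fields are involved, nothing in a request of this shape would need a protocol extension, which is the point the answer above makes.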
What is the benefit of using SDN/OpenFlow to provision the Brocade MLXe compared to the IP/MPLS services in this type of deployment?
Brocade: First, the services in the control plane provided by the router layer are different from the services provided by the optical transport layer. Thus, in the production network today, network operations personnel must be familiar with the different service technologies provided by each layer in order to provision end-to-end. Additionally, different management applications are required to manage and provision both types of network devices. Using SDN/OpenFlow provides an open standard common control plane and management plane — i.e., control plane unification — which greatly simplifies the operations.
Second, in the current IP/MPLS solution, the circuit management relies on MPLS traffic engineering, which uses either a primary/secondary circuit mechanism or fast reroute. Regardless of which mechanism is used, this is a narrow view with coarse recovery, which only provides a secondary path in the bandwidth change or failover scenario for the flows running on primary circuits. This requires the network infrastructure to reserve extra idle bandwidth in order to fail over the path. By using this SDN solution, which provides a global perspective and central control of the network, these resources can be optimized network-wide, helping reduce both opex and capex. The network operations center can have better monitoring and higher utilization of network resources, while the need to deploy idle bandwidth and infrastructure is minimized.
How do you model the different layers of the topology?
Infinera: Oscars maintains an integrated flat topology of both the packet and optical transport layers. In order to facilitate a vendor-agnostic view of the topology, a generic topology representation was required. Oscars currently utilizes a topology representation called NMWG. In order for Oscars to discover the transport layer topology, Infinera built a configuration manager which discovered the topology of the virtual transport network, as presented by the Open Transport Switch (OTS) instances, and transformed it into NMWG format. In this format, Oscars was able to import the Layer 0/1 topology.
At the beginning of the demonstration, no bandwidth existed yet between the routers and switches – thus, there was no Layer 2/Layer 3 topology to discover. The information that Oscars required was the cross-layer topology – which OpenFlow-enabled routers and switches were connected to which virtual transport switches. As no standardized approach to auto-discovering the cross-layer topology exists today, this information was manually incorporated so that Oscars could construct an integrated, flat multilayer topology view.
ESnet: In our topology schema, we “flatten” the layers out, and use adaptation/de-adaptation points to “stitch” the layers together.
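A minimal sketch of what a flattened multilayer topology can look like: packet and transport nodes live in one graph, and manually recorded adaptation links stitch the layers together, so a single path search can cross layers. The node names and the graph structure are invented for illustration; this is not the NMWG format or the Oscars data model.

```python
from collections import deque

# Each node is tagged with its layer; names are made up for illustration.
LAYER = {
    "router-A": "packet", "router-B": "packet",
    "ots-1": "transport", "ots-2": "transport", "ots-3": "transport",
}

# Links within the transport layer, plus manually recorded adaptation links
# that stitch each router to the transport switch it is attached to.
LINKS = [
    ("ots-1", "ots-2"), ("ots-2", "ots-3"),        # transport layer
    ("router-A", "ots-1"), ("router-B", "ots-3"),  # adaptation points
]


def build_graph(links):
    """Build an undirected adjacency list from a list of links."""
    graph = {}
    for a, b in links:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, []).append(a)
    return graph


def shortest_path(graph, src, dst):
    """Breadth-first search over the single flattened graph."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


path = shortest_path(build_graph(LINKS), "router-A", "router-B")
print([(node, LAYER[node]) for node in path])
```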
Is there a delay when Oscars updates the circuit status?
Brocade: The update status from Oscars visible in the GUI depends on several criteria, including the relative location of the front end (web browser) and the back end (application and database). There can be delays in communication between the two due to distance. Connection speed therefore plays a key role in how quickly the status updates in the GUI. In our demo, the live equipment was accessed remotely over a standard Internet connection. However, while the status update to the GUI may show a delay, the provisioning occurs in real time.
You mentioned the option of using GMPLS within OTS — isn’t this counter to the SDN principle of centralizing the control plane?
Infinera: What we are discovering through our discussions with service providers is that they are looking for programmatic control over traffic flows at different levels of the network as well as ways to help them migrate their existing networks toward one that is SDN-enabled.
In the optical transport world, programmability of bandwidth services has always been available through various management protocols and through the creation of static cross-connections – this is in essence the TDM equivalent of OpenFlow for packet systems. For decades, carriers have been centrally computing paths at the transport layer and provisioning the data paths via requests to individual network elements.
Only in the recent decade or so has intelligence in the form of IP-centric routing protocols been added to the transport layer in order to enable operational savings benefits, such as automated topology discovery, dynamic route computation for circuits, and automated setup of circuits using robust, fault-tolerant signaling.
These advanced technologies have substantially simplified and improved operations. They maintain more consistent views of inventory resources, have enabled carriers to deploy larger scale networks, and are widely deployed today in many of the world’s largest optical transport networks.
The design of Infinera’s Open Transport Switch (OTS) includes multiple abstraction capabilities, including one which enables the SDN control layer to program a transport flow at the nodal level, as well as the ability to abstract the network domain and enable the SDN control layer to program a bandwidth service across the domain, leveraging some or all of the transport layer’s intelligent GMPLS capabilities. This broad set of abstractions is designed to meet the broad set of requirements across network and service providers and helps facilitate a hybrid strategy towards SDN while still enabling us to leverage deployed technologies.
How do you determine if you should maintain the same path but bypass upper-layer devices, or build a new optimized Layer 1 end-to-end circuit instead?
Brocade: This would be a policy issue enforced by the Oscars trigger and PCE mechanisms. The organization will create their own policies and optimization algorithms to apply to Oscars. The criteria applied can be made as sophisticated and dynamic as the policy engine that works with Oscars to trigger the circuit reprovisioning. For example, these trigger criteria can be preset thresholds based on specific bandwidth of the flow or more dynamic such as pulling in real-time analytics from the network for things such as packet loss, types of traffic, source/destination, application, etc.
What is an example of a use case for a Layer 1 virtual transport network overlay?
Infinera: Service providers are increasingly interested in the concept of creating dedicated, isolated virtual transport network overlays over a common shared physical infrastructure. One common use case is in supporting Optical VPNs (O-VPNs), which enable a network provider to create multiple dedicated network partitions at Layer 0/1, each isolated from the other both from a data plane perspective as well as an operational perspective. This could be used as an internal network partitioning scheme, and/or potentially as a new service offering, giving end users such as other carriers or large enterprises their own virtual private optical transport networks, interconnecting specific points-of-presence with dedicated transport capacity, complete with visibility and provisioning control over bandwidth services that are self-contained within the O-VPN. The isolation at this layer ensures their packet-layer traffic is unaffected by any other packet traffic that might be riding over the same physical network.
Is the OpenFlow communication from ESnet Oscars to the Brocade MLXe a secure channel?
Brocade: In this demo, we did not use a secure channel for control-plane communication between Oscars and the underlying packet-optical infrastructure. Oscars does support a secure connection to the networking infrastructure and to the user front-end application using HTTPS. The Brocade MLXe supports secure connections as well, offering both TCP and SSL connections. The Brocade MLXe SSL connection currently follows the TLS 1.1 standard, with plans to move to TLS 1.2 to support Federal Information Processing Standard (FIPS) 140-2.
What is the scalability and performance of this SDN/OpenFlow solution? Does it address requirements for production networks?
Brocade: The Brocade MLXe supports up to 128,000 OpenFlow entries in hardware with wire-speed performance. For an MPLS network, tier-1 providers typically require sub-50 ms circuit provisioning and optimization. By using SDN/OpenFlow as shown in this demo, a single circuit can converge in 2 ms, within the typical carrier requirements.
Infinera: The SDN approach taken for the transport layer is designed for carrier scale and performance. While the volume and frequency of transport flows is significantly less than that found in the packet layer, it is still important to support rapid and deterministic path-setup functions for on-demand bandwidth elasticity. With the emergence of converged packet-optical transport network (P-OTN) solutions such as Infinera’s DTN-X, deterministic digital switching becomes available at the transport layer, enabling new features such as sub-50 ms shared mesh protection and new applications such as multilayer restoration.
How is the topology information learned from the Layer 1 and 2 devices?
ESnet and Infinera: The physical Layer 1 topology is discovered via Infinera’s GMPLS mechanism. As Layer 1 virtual transport network overlays are created, the topology is learned using the topology discovery capabilities of the REST API supported by OTS. The Layer 2 topology can be learned through the OpenFlow controller (capability requests). The stitching points between Layer 1 and Layer 2 (since we flattened the topology), however, are recorded manually, as there is no dynamic topology discovery between layers.
Do you have to take into consideration layer dependencies when you build the multilayer circuit?
ESnet: The primary consideration is to ensure that the head-end circuit device is the last to have its flow rules pushed down. This is to ensure that the rest of the circuit path is instantiated before traffic is injected into the head-end. (This applies to bi-directional circuits as well.)
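The ordering constraint can be sketched in a few lines. The only rule taken from the answer above is that the head-end comes last; pushing the remaining devices from the tail back toward the head is an assumption made here for illustration, as are the device names and the `push_flow` callback. The sketch handles one direction of a circuit only.

```python
def provisioning_order(path):
    """Return devices in push order: the rest of the path first, head-end last."""
    head_end, rest = path[0], path[1:]
    return list(reversed(rest)) + [head_end]


def provision(path, push_flow):
    """Call push_flow once per device, in an order that ends with the head-end."""
    for device in provisioning_order(path):
        push_flow(device)


pushed = []
provision(["head-end", "ots-1", "ots-2", "tail-end"], pushed.append)
print(pushed)
```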
Is the reduction in throughput illustrated related to something like the slow-start algorithm, or is it something else entirely?
ESnet: I assume this question is referring to the TCP throughput chart we put up in the slides. The reduction in throughput is due to packet loss. The dramatic reduction in throughput happens right after around an 11-ms round-trip time and is more evident on paths with high bandwidth-delay products. The theoretical effect of loss on throughput can be calculated using the Mathis equation, where MSS is the maximum segment size, RTT is round-trip time, and Ploss is the probability of packet loss.
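The Mathis bound is commonly written as throughput ≤ (MSS / RTT) × (1 / √Ploss). As a rough worked example under that assumed form, with the constant factor taken as 1 and with illustrative inputs rather than measurements from the demo:

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_seconds, p_loss):
    """Upper bound on TCP throughput in bits per second (constant factor taken as 1)."""
    if not 0 < p_loss <= 1:
        raise ValueError("p_loss must be in (0, 1]")
    return (mss_bytes * 8 / rtt_seconds) * (1 / math.sqrt(p_loss))


# Same loss rate, two round-trip times: the bound scales with 1/RTT.
short_rtt = mathis_throughput_bps(1460, 0.010, 1e-4)
long_rtt = mathis_throughput_bps(1460, 0.100, 1e-4)
print(f"{short_rtt / 1e6:.1f} Mbps at 10 ms, {long_rtt / 1e6:.1f} Mbps at 100 ms")
```

For a fixed loss probability, a tenfold increase in round-trip time lowers the bound tenfold, which is why even small loss rates are so costly on long paths.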
More detailed discussion can be found here.
Can the path switching occur “proactively” based on reaching certain volume thresholds (e.g., not waiting for loss levels to be achieved)?
ESnet: The path switching occurred proactively in the demonstration, when the traffic level reached a certain trigger. Even though you can trigger on loss, we did it based on aggregate throughput. This is just an illustration – most network providers will choose triggers based on their own policies and optimization algorithms.
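A throughput-based trigger of this kind can be as simple as comparing the aggregate rate on a path against a policy threshold. The threshold value and the function name below are hypothetical and are not the settings used in the demonstration.

```python
def should_reprovision(flow_rates_bps, threshold_bps):
    """Trigger when the aggregate rate of all flows on the path reaches the threshold."""
    return sum(flow_rates_bps) >= threshold_bps


THRESHOLD_BPS = 5e9  # hypothetical policy value

below = should_reprovision([1e9, 2e9], THRESHOLD_BPS)
above = should_reprovision([3e9, 2.5e9], THRESHOLD_BPS)
print(below, above)
```

A production policy engine would likely add more inputs (loss, traffic type, source/destination) and some damping to avoid repeated switching near the threshold.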
What’s the setup time of Oscars? Can that time be optimized?
ESnet: There is no pre-set setup time with Oscars. The two steps that Oscars takes are path calculation and provisioning the devices on the chosen path. Path calculation depends on the complexity of the topology and the number of criteria being optimized, but with modern CPUs this should be in the millisecond range. The time to set up the devices depends on the number of devices being configured, the approach taken (parallel or serial), and the hardware implementation of OpenFlow/speed of provisioning on the switches.
In developing the application, we noticed significant opportunities for improvement in all these areas, including interaction with the Floodlight controller, where a lot of provisioning can be parallelized.
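To illustrate why the serial-versus-parallel choice matters, the sketch below simulates per-device configuration latency with a sleep and compares the two approaches. The `configure_device` function is a stand-in, not a real controller call, and the 50 ms delay is an arbitrary assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def configure_device(name, delay=0.05):
    """Stand-in for pushing configuration to one device; sleeps to mimic latency."""
    time.sleep(delay)
    return name


def provision_serial(devices):
    """Configure devices one after another."""
    return [configure_device(d) for d in devices]


def provision_parallel(devices):
    """Configure all devices concurrently, one worker thread per device."""
    with ThreadPoolExecutor(max_workers=len(devices)) as pool:
        return list(pool.map(configure_device, devices))


devices = ["ots-1", "ots-2", "ots-3", "ots-4"]

start = time.perf_counter()
provision_serial(devices)
serial_s = time.perf_counter() - start

start = time.perf_counter()
provision_parallel(devices)
parallel_s = time.perf_counter() - start

print(f"serial: {serial_s:.2f}s, parallel: {parallel_s:.2f}s")
```

Serial time grows with the number of devices, while parallel time stays close to that of the slowest device. In a real deployment the head-end would still need to be configured after the other devices complete.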
Brocade, Infinera, ESnet: We are not qualified to speak about other vendors’ product plans. They are involved in the ONF, where a lot of optical transport-related standards are being discussed.
Will these become carrier-grade and be deployed in large carrier networks? Carrier-class solutions typically involve a slew of features which may not currently exist in the SDN world. What are the Brocade/Infinera strategies towards productization?
Infinera: There are a number of new initiatives underway within standards organizations such as ONF and OIF to examine carrier SDN and what capabilities are needed to support the scale, availability, resilience, and security needs of carriers, to name a few. Infinera and Brocade are participating in such related efforts and working with our customers to better understand and help foster the types of features and capabilities that will help promote adoption of SDN by the carriers.
How much effective throughput are you able to optimize? Can you please give us a rough percentage or stats?
ESnet: We hope to provide fast lanes for our “elephant” flows — the extreme-scale data movement by scientific applications — at the most appropriate layer in the network for that traffic and at the time requested. As you noticed in the TCP picture shown, if we can provide loss-free transport to these large flows, we can improve the throughput around 80x over long round-trip times. You can view various tuning and optimization steps for high-speed data transfers on the Fasterdata Knowledge Base.
How do you integrate with existing IT systems in an enterprise or telecom full-service provider’s environment?
Infinera: There is a lot of ongoing discussion on the Northbound API within forums such as the ONF – this newly formed working group is looking at numerous technologies and use cases for enabling a standardized approach for integrating business applications with the network services provided by the SDN control layer. How this all integrates into existing telecom OSS/BSS environments is one of the areas for further investigation.
How does Oscars allocate scarce bandwidth among competing users?
ESnet: Unlike connections on commercial Internet providers’ networks, the traffic-engineered circuits created by Oscars are time-based, i.e., they have an end time. Network resources are therefore released once the application is done using the bandwidth. This lets multiple applications use the network bandwidth when they need it, and is especially useful if the bandwidth is scarce.
The method used right now to allocate bandwidth in Oscars is first-come, first-served. We do not implement pre-emption of resources because the relative priorities between applications cannot be judged globally very easily. But when Oscars is used for applications that you control, then pre-emption can be implemented easily from a policy perspective.
One way we get beyond the above restriction is that we do not do hard policing: if an application has to send a burst of traffic that exceeds its base guaranteed allocation, we mark the excess as scavenger and send it on the link. If there is extra bandwidth on the path, the traffic goes through fine. But if there is congestion or scarce bandwidth, the guaranteed portion of all traffic is not affected, and only the extra burst of data is dropped.
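The allocation scheme described in this answer can be sketched as follows: time-bounded reservations admitted first-come, first-served with no pre-emption, plus a split of each flow into a guaranteed portion and a scavenger-marked excess. The `Link` class, its methods, and all numbers are hypothetical and are not the Oscars implementation.

```python
class Link:
    """Time-based, first-come, first-served bandwidth reservations on one link."""

    def __init__(self, capacity_mbps):
        self.capacity_mbps = capacity_mbps
        self.reservations = []  # list of (start, end, mbps)

    def reserved_at(self, t):
        """Total bandwidth reserved at time t (end times are exclusive)."""
        return sum(mbps for start, end, mbps in self.reservations if start <= t < end)

    def reserve(self, start, end, mbps):
        """Accept the request only if capacity holds for the whole interval."""
        # Usage only rises at reservation start times, so checking those suffices.
        checkpoints = {start} | {s for s, e, _ in self.reservations if start < s < end}
        if any(self.reserved_at(t) + mbps > self.capacity_mbps for t in checkpoints):
            return False  # no pre-emption: earlier reservations win
        self.reservations.append((start, end, mbps))
        return True


def classify(rate_mbps, guaranteed_mbps):
    """Split a flow's rate into guaranteed and scavenger-marked portions."""
    guaranteed = min(rate_mbps, guaranteed_mbps)
    return guaranteed, rate_mbps - guaranteed


link = Link(capacity_mbps=1000)
first = link.reserve(0, 10, 600)    # accepted
second = link.reserve(5, 15, 600)   # rejected: overlaps and exceeds capacity
third = link.reserve(10, 20, 600)   # accepted: first reservation has ended
print(first, second, third, classify(800, 600))
```

Because every reservation carries an end time, the third request succeeds without any explicit release step.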
When the circuit path is reprovisioned, are the criteria for doing so statically defined? Or is the reprovisioning performed with respect to available resources?
ESnet: For this demonstration, we showcased a simple use case. The criteria can be made as sophisticated and dynamic as the policy engine that works with Oscars to trigger the circuit re-provisioning.
Could a DOE site provision permanent Oscars circuits to multiple sites at low bandwidth, so that additional bandwidth could be dynamically allocated as needed?
ESnet: That is a great question and very good idea. Oscars, through use of the IDCP and NSI protocols, supports the concept of “modify,” where a user can modify the bandwidth of the circuit, dynamically, based on changing traffic profiles.
Did OTS provide the functionality you needed for this demo, or did you use vendor control interfaces to establish the connections?
Infinera: The demo was achieved using a subset of the functionality that OTS provides. OTS enables multiple levels of network and nodal abstraction, as well as multiple provisioning modes ranging from direct nodal hop-by-hop bandwidth provisioning to one that leverages the inherent GMPLS capabilities of the underlying platform for performing functions such as route computation and robust path set-up via in-band signaling. For the demonstrations shown, OTS provided all the functionality needed by Oscars.