With rare exceptions, network operators have the challenging task of managing a brownfield hodgepodge of old and new equipment. The level of complexity that wide area networks (WANs) have reached makes the challenge that much greater. Business users are asking for specific network throughput guarantees for their critical applications, legal departments require compliance with mandated regulatory frameworks, and operations teams are asked to do more with shrinking budgets. These requirements do not align easily with existing network architectures; hence, network operators continuously face a slew of granular parameter change requests and must try to keep up with changing network requirements without having the proper tools in place.
What has worked before often will not work anymore. Traditionally, network engineers have managed configurations hop by hop or router by router, a practice that is no longer feasible in light of rapid changes to hardware and software. Network complexity today requires control of the entire environment from end to end, and the ability to apply policy changes with the utmost precision across a whole network. Consider the questions that must be answered before making a change for a user:
- Will changes to the device involve other operations teams (think telco and VoIP)?
- Do they need access to public or private cloud (or both)?
- What is the location of the end user’s device, be it a laptop or a PC?
- Is the device hard-wired or connecting wirelessly to the network?
- Is the device on a corporate LAN or coming from a remote site?
After these questions have all been answered, the change requests must be completed to meet the user’s request and satisfy the network performance requirement of the application while having minimal effect on the other services running on the network.
NetOps then has to stay abreast of all those changes. It’s commonplace for teams to fall behind as the changes roll in, but when operations is given a complex request only to realize there is no current network documentation, the results can be disastrous. Often, senior network architects, designers, and engineers must collaborate to minimize potential side effects, which slows the process even more.
Creating on the Fly
In corporate networks, disruptions can be catastrophic to the business. Hence, all changes must undergo stringent verification and approval processes. Add in changes that must be made across domains — for example, security, disaster recovery, video, and VoIP — and a disparate knowledge base between the different departments involved often leads to conflicting or incomplete change requests. Thus, the job of an engineer is akin to rebuilding a jet engine while in flight.
To further complicate matters, there is a lack of highly skilled “full-stack engineers,” professionals who can both program software and make network configuration changes on the fly, regardless of the application or the equipment. Operations personnel tend to be generalists, and without deep expertise in a given technology, they must rely on subject matter experts who are well-versed in various subsets of technology. Alternatively, they must wait for configuration changes to be released by the hardware or software vendor or engage a third-party contractor, a slow and painful process at best.
The average equipment replacement cycle is usually dictated by the manufacturer’s support cycle, typically five to 10 years. Such a long cycle can create a significant mismatch in supported feature sets, since new firmware is issued every six to 12 months on average, and existing devices are only updated when necessary. Today, engineers can’t treat every router or device as identical; instead, they must consider which firmware version is installed and which hardware plug-in extensions have come and gone, and then mix and match configurations that work from end to end.
Easier Said Than Done
Today’s critical apps require high performance; there is no room for error, failure or downtime, bringing us back to the “jet in flight” scenario. In fact, it’s not just a single jet; it’s more like fixing multiple jet engines from multiple manufacturers while in flight.
This scenario is impossible even if enterprises try to standardize deployed hardware. With nearly 30 device types per vendor on the market, all featuring a variety of firmware, and two or three vendors in deployment, the number of combinations adds up quickly. Given that each piece of equipment comes with its own command line or user interface for configuration, it becomes a nightmare for even the savviest network engineer.
Security must be addressed, but the “how” is trickier than the “why.” High-profile breaches, cyber attacks, and the liabilities organizations face have made the security policy of a “vanilla” access list on a router a thing of the past. Simple firewalling and access control lists are no longer sufficient; they must be extended with more sophisticated intrusion detection and prevention systems. And using public internet transport for low-cost bandwidth requires additional layers of secure virtual networks.
Many organizations are now leveraging Software as a Service (SaaS) and Infrastructure as a Service (IaaS), such as AWS and Azure, to address capacity planning. Network bandwidth usage and flow patterns have significantly changed and are evolving rapidly with the introduction of additional services such as rollouts of Salesforce and Office 365, creating unique demands on networks that were originally designed for “internal use only” usage and security concepts.
Rome wasn’t built in a day, and adding new equipment to today’s complex networks can’t be done overnight.
Herding the Network Cats
To offer help desk functionality and auto-ticketing, some organizations have moved towards a self-service IT portal. However, these services are typically reserved for simpler, specific tasks, such as accessing a network drive or adding access to a printer, that don’t require a change to network functions.
The purpose of these portals is to assign incoming change requests to the proper engineer. Even so, requests that are handled manually can take three to five days to implement when no new equipment is involved. When new equipment such as a circuit or router is required, turnaround can take 30 to 60 days or more and may require senior architects and networking engineers to implement.
As companies move toward automation, there are a few guidelines that break the huge elephant into manageable parts:
Go for the Quick Win
If you plan ahead and find the right tools, automation does not need to be complicated. Start by automating a specific pain point, such as a QoS policy. A small success will build momentum for your automation strategy as you move on to the next most painful problem.
Create Standard Policies Based on Strategy
Leadership usually sets the strategy, which typically relates to network characteristics such as scalability, reliability, and security. Rolling out standard policies across your enterprise, starting with simple enforcement of standards for DNS and NTP and progressing to routing and tunneling, will go a long way. These policies define the rules for each feature so that centralized control and policy management can be enforced. This is the start of taking control of your network.
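As a minimal sketch of what a standard policy could look like in practice, the rules for DNS and NTP can be written down as data and checked against what each device reports. Everything named here (`StandardPolicy`, `DeviceSettings`, `check_compliance`, the device name, and the addresses, which come from the reserved documentation ranges) is a hypothetical illustration, not part of any particular product.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class StandardPolicy:
    """Enterprise-wide standard for two basic services (illustrative)."""
    dns_servers: tuple
    ntp_servers: tuple


@dataclass
class DeviceSettings:
    """Settings as reported by one device (hypothetical inventory record)."""
    name: str
    dns_servers: list = field(default_factory=list)
    ntp_servers: list = field(default_factory=list)


def check_compliance(policy, device):
    """Return a list of readable violations; an empty list means compliant."""
    violations = []
    for service in ("dns_servers", "ntp_servers"):
        expected = set(getattr(policy, service))
        actual = set(getattr(device, service))
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        if missing:
            violations.append(f"{device.name}: {service} missing {missing}")
        if extra:
            violations.append(f"{device.name}: {service} has unapproved {extra}")
    return violations


# Assumed example values; the addresses are documentation-only ranges.
POLICY = StandardPolicy(
    dns_servers=("192.0.2.53", "198.51.100.53"),
    ntp_servers=("192.0.2.123",),
)

branch_router = DeviceSettings(
    name="branch-rtr-01",
    dns_servers=["192.0.2.53"],
    ntp_servers=["192.0.2.123", "203.0.113.7"],
)

for violation in check_compliance(POLICY, branch_router):
    print(violation)
```

Keeping the policy as plain data means the same check can run against every device from one central place, which is the kind of centralized enforcement described above.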
Determine What Each Device Will Do
Abstraction and modeling of standard features and of node and site configurations are vendor-agnostic and can be implemented no matter what device you are working on. Consider what you want the network to do, then build it into the rule set to develop your model. Having models for network features, node types, site types, and more will help tremendously when implementing the models in an automation or orchestration platform.
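One way to picture a vendor-agnostic model is a small data structure that states intent once and is then rendered into each platform's syntax. The sketch below is an assumption-laden illustration: `NodeModel`, the renderer functions, and the platform keys are invented for this example, and the command lines are simplified IOS-style and Junos-style forms that should be verified against vendor documentation before use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeModel:
    """Vendor-agnostic intent for one node (field names are illustrative)."""
    hostname: str
    dns_servers: tuple
    ntp_servers: tuple


def render_ios_style(model):
    """Render the model as simplified IOS-style CLI lines."""
    lines = [f"hostname {model.hostname}"]
    lines += [f"ip name-server {s}" for s in model.dns_servers]
    lines += [f"ntp server {s}" for s in model.ntp_servers]
    return lines


def render_junos_style(model):
    """Render the same model as simplified Junos-style 'set' commands."""
    lines = [f"set system host-name {model.hostname}"]
    lines += [f"set system name-server {s}" for s in model.dns_servers]
    lines += [f"set system ntp server {s}" for s in model.ntp_servers]
    return lines


# Hypothetical platform keys mapped to their renderers.
RENDERERS = {"ios": render_ios_style, "junos": render_junos_style}


def render(model, platform):
    """Pick the renderer for a platform; unknown platforms are an error."""
    if platform not in RENDERERS:
        raise ValueError(f"no renderer for platform {platform!r}")
    return RENDERERS[platform](model)


branch = NodeModel(
    hostname="branch-rtr-01",
    dns_servers=("192.0.2.53",),
    ntp_servers=("192.0.2.123",),
)

for platform in ("ios", "junos"):
    print(f"--- {platform} ---")
    print("\n".join(render(branch, platform)))
```

The model itself never mentions a vendor; supporting a new device type means adding one renderer rather than rewriting the rule set.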
Know the Actual State of Your Network
The challenging part here is getting started early. Conducting a discovery exercise on an existing network can be complex. Reconciling the configuration state recorded in an internal configuration management database or existing documentation with what is really deployed, down to the actual devices and firmware, adds to this complexity. Once the network is known, it is possible to deploy changes against your modeled functionality and migrate the network to the desired state. After this is complete, it is essential to immediately protect the network against unauthorized changes by automatically monitoring the configuration state to ensure no other changes are applied.
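The two comparisons described here, documentation versus reality and approved configuration versus running configuration, can be sketched in a few lines. This is only an illustration under stated assumptions: the device names, firmware strings, and configuration text are made up, and a real discovery exercise would gather them from the network rather than from literals.

```python
import difflib


def reconcile_inventory(cmdb, discovered):
    """Compare documented firmware per device with what discovery found.

    Both arguments map a device name to a firmware version string.
    """
    report = {"undocumented": [], "missing": [], "firmware_mismatch": []}
    for name in sorted(set(cmdb) | set(discovered)):
        if name not in cmdb:
            report["undocumented"].append(name)
        elif name not in discovered:
            report["missing"].append(name)
        elif cmdb[name] != discovered[name]:
            report["firmware_mismatch"].append(
                (name, cmdb[name], discovered[name])
            )
    return report


def config_drift(approved, running):
    """Return unified-diff lines between approved and running config text."""
    return list(
        difflib.unified_diff(
            approved.splitlines(),
            running.splitlines(),
            fromfile="approved",
            tofile="running",
            lineterm="",
        )
    )


# Hypothetical data standing in for a CMDB export and a discovery scan.
cmdb = {"core-sw-01": "15.2", "branch-rtr-01": "16.9", "branch-rtr-02": "16.9"}
discovered = {"core-sw-01": "15.2", "branch-rtr-01": "17.3", "lab-rtr-09": "16.9"}
print(reconcile_inventory(cmdb, discovered))

approved = "hostname branch-rtr-01\nntp server 192.0.2.123"
running = approved + "\nntp server 203.0.113.7"
drift = config_drift(approved, running)
print("\n".join(drift) if drift else "no drift")
```

Running the drift check on a schedule and alerting on any non-empty result is one simple way to monitor the configuration state once the network has been brought to its desired state.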
Manage Across the Whole Lifecycle
The network is an ever-changing entity. New sites are being deployed. Devices are being upgraded. User requirements and the applications using the network are ever-evolving. To gain and maintain the agility to service these requests, centralized model-driven automation and orchestration are necessary. This type of control ensures that new devices are provisioned correctly and that policies and best practices are maintained throughout the network. For this to work, automation and orchestration tools must not slow down engineering and operations; they must enable DevOps to quickly develop, test, and deploy new features into the production network.
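A develop, test, deploy flow can be approximated as a gate: a proposed change is only pushed if it passes a set of automated checks. The sketch below assumes a hypothetical `Change` record, two example checks, and a stand-in `deploy` callback; the configuration lines are illustrative and not tied to any specific orchestration platform.

```python
from typing import NamedTuple


class Change(NamedTuple):
    """A proposed configuration for one device (hypothetical structure)."""
    device: str
    config_lines: tuple


def has_ntp(change):
    """Fail when the proposed config defines no NTP server."""
    if any(line.startswith("ntp server ") for line in change.config_lines):
        return None
    return "no NTP server configured"


def no_telnet(change):
    """Fail when the proposed config enables telnet access."""
    if any("transport input telnet" in line for line in change.config_lines):
        return "telnet access is not allowed"
    return None


def validate(change, checks):
    """Run every check and collect the failure reasons, in check order."""
    failures = []
    for check in checks:
        reason = check(change)
        if reason is not None:
            failures.append(reason)
    return failures


def deploy_if_valid(change, checks, deploy):
    """Call deploy(change) only when all checks pass; return the failures."""
    failures = validate(change, checks)
    if not failures:
        deploy(change)
    return failures


CHECKS = [has_ntp, no_telnet]
deployed = []  # stand-in for a real push to the device

good = Change("edge-01", ("hostname edge-01", "ntp server 192.0.2.123"))
bad = Change("edge-02", ("hostname edge-02", "line vty 0 4", " transport input telnet"))

print(deploy_if_valid(good, CHECKS, deployed.append))
print(deploy_if_valid(bad, CHECKS, deployed.append))
print([change.device for change in deployed])
```

Because the checks are ordinary functions, new policies can be added to the gate without changing how changes are proposed or deployed.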
Finding What Works
A network may be a beautiful, pristine greenfield of shiny new parts or a mishmash of old and new trying to work in harmony. Each network has its particular configurations and attendant challenges, so no one method will work for every organization. Reduce complexity by deploying an automated orchestration strategy that is model-driven but still provides the freedom and flexibility to handle customization. Real-world experience suggests that all network nodes should be kept in a config-synchronized state that aligns with the approved network model. This will reduce NetOps headaches and create a more perfect network union.