Why Your Performance Monitoring Isn't Working for Remote User Locations
Traditional user monitoring tools may not be as effective with remote workers because of their isolation from the office network.
Many businesses support remote locations: any office or site that doesn't have its own data center and connects to a company data center over a network, whether LAN, WAN or public internet. Scaling to these locations means that network types are expanding, too. Plus, application infrastructure is becoming distributed. SaaS and cloud providers host their apps from a huge range of locations, and enterprise users now connect to multiple off-site applications.
Adding to this complexity, performance-hungry collaboration apps like VoIP and video are even more strongly affected by recreational app usage or other bandwidth hogs in the office.
IT still needs to see and support users at these locations. But traditional network monitoring tools rely on access to routers and switches, and can't get the data needed from cloud and SaaS application deployments, or from the public internet. When a remote user has an issue, they aren't calling the app or cloud provider—they're calling their in-house IT team. That's not sustainable at scale as remote offices and locations increase.
Why Performance Monitoring Often Doesn't Work
The issue of monitoring is often an issue of ownership. Which networks are in your IT domain? They now likely include the local networks at each office, including the ever-important WiFi network, as well as the wide area network that connects them. That WAN increasingly has optimization tools like SD-WAN layered on top, security through VPN tunnels, or delivery via MPLS.
Each network component will also have hooks to the outside world that affect the performance on the network—usually this involves ISPs or an SD-WAN vendor. Most companies have some monitoring capabilities in remote offices, even if it's just SNMP or NetFlow data, but in order to troubleshoot user issues, you need end-user experience monitoring.
That requires isolating issues that come up. Also, setting up active tests of user experience at various remote locations (even just several geographically distributed monitoring points) will give you a ton of data to understand what users are seeing. Then you can start to see trends over time and develop a baseline for performance to use at other locations, especially when you're setting up new ones.
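As a rough illustration of turning active test results into a baseline, here is a minimal Python sketch. It assumes you already collect response-time samples (in milliseconds) from a monitoring point; the function names, the median-plus-deviation approach and the thresholds are illustrative assumptions, not something prescribed by any particular monitoring tool.

```python
from statistics import median

def build_baseline(samples_ms):
    """Summarize response-time samples (ms) as (median, median absolute deviation)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    med = median(samples_ms)
    mad = median([abs(s - med) for s in samples_ms])
    return med, mad

def is_degraded(sample_ms, baseline, tolerance=3.0, floor_ms=5.0):
    """Flag a new sample that sits well above the baseline.

    tolerance and floor_ms are assumed tuning knobs; floor_ms keeps the
    band from collapsing to zero when historical samples are nearly identical.
    """
    med, mad = baseline
    return sample_ms > med + tolerance * max(mad, floor_ms)

# Example: a baseline built at one office, reused to judge a new sample.
baseline = build_baseline([100, 102, 98, 101, 99])
print(is_degraded(110, baseline), is_degraded(130, baseline))
```

A median-based baseline is used here because a single slow test run shifts it far less than it would shift a mean, which matters when samples come from links you don't control.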
Best Practices for Network Monitoring
Here are some tips we recommend.
1. Plan for scale. We often see that the processes around new employees or new offices don't take new applications into account. New employees need app provisioning for core applications, but we recommend building a capacity estimation for each employee, too. You should also include a cost-per-head estimation on subscription costs.
For new offices, bandwidth and network topology are important, but if you have capacity estimation for employees, you can be more precise in your estimation based on the employee departments that will inhabit the new office. Benchmarking is also a big part of managing remote offices because it allows comparison. We recommend continuous monitoring across the enterprise so that you can identify location-specific issues faster and save troubleshooting time when WiFi or a regional ISP is the culprit. Setting up alerting and packet capture scheduling can help troubleshooting if these new offices have limited or no IT on-site.
Finally, when planning for scale, consider any new application traffic that might arise in the future. The main reason we recommend this is so IT is involved or at least aware of purchasing. Large changes are often broadcast widely, but when the marketing team adds 10 new apps over the course of a few quarters, the traffic can build up. Being aware of apps also allows you to be on top of performance monitoring and alerting. This often includes Shadow IT apps as well.
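The per-employee capacity estimate described in this tip can be sketched in a few lines of Python. The department names, per-employee bandwidth figures and the headroom factor below are made-up placeholders; you would substitute numbers measured on your own network.

```python
def estimate_office_bandwidth_mbps(headcount_by_dept, mbps_per_employee, headroom=0.3):
    """Estimate the bandwidth a new office needs from its planned headcount.

    headcount_by_dept: e.g. {"sales": 10}
    mbps_per_employee: assumed per-head estimate for each department
    headroom: extra fraction reserved for growth and new applications
    """
    total = 0.0
    for dept, headcount in headcount_by_dept.items():
        if dept not in mbps_per_employee:
            raise KeyError(f"no capacity estimate for department {dept!r}")
        total += headcount * mbps_per_employee[dept]
    return total * (1 + headroom)

# Hypothetical new office: 10 sales staff and 5 engineers.
needed = estimate_office_bandwidth_mbps(
    {"sales": 10, "engineering": 5},
    {"sales": 2.0, "engineering": 4.0},
    headroom=0.25,
)
print(f"{needed:.1f} Mbps")
```

The same structure works for the cost-per-head estimate: swap the per-employee bandwidth figures for per-employee subscription costs.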
2. See the applications in use. A large portion of the job of IT is being the expert on the traffic that is sent and received on your network. Identifying what applications are causing congestion or retroactively discovering the root cause of poor app performance is crucial.
The traditional solution to this is looking at NetFlow records, or J-Flow or sFlow, but there are a few questions to consider:
- If you're collecting NetFlow, are you collecting just corporate data or all internet traffic?
- Are you capturing at all of your locations?
- How long is that data kept and are you responsible for the storage as well?
A failure to address any of these will jeopardize your visibility into the historical performance of the network.
On top of that, NetFlow adds noticeable overhead to the network, often around 5%, due to the movement of NetFlow records over the WAN from devices to collection to storage. Look at how much NetFlow traffic you have today and consider how that's going to scale with new locations.
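To see how that overhead scales with new locations, a back-of-the-envelope calculation like the following can help. It simply applies the rough 5% figure mentioned above to each site's WAN traffic; the site names and traffic volumes are hypothetical, and real overhead depends on your own export and sampling configuration.

```python
def netflow_overhead_mbps(wan_traffic_mbps_by_site, overhead_fraction=0.05):
    """Return (per-site overhead, total overhead) in Mbps.

    overhead_fraction defaults to the rough 5% figure; treat it as an
    assumption to replace with a value measured on your network.
    """
    per_site = {
        site: mbps * overhead_fraction
        for site, mbps in wan_traffic_mbps_by_site.items()
    }
    return per_site, sum(per_site.values())

# Hypothetical sites today, then the same estate with one new branch added.
today = {"hq": 200.0, "branch-1": 40.0}
planned = dict(today, **{"branch-2": 60.0})

_, total_today = netflow_overhead_mbps(today)
_, total_planned = netflow_overhead_mbps(planned)
print(f"today: {total_today:.1f} Mbps, planned: {total_planned:.1f} Mbps")
```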
3. Separate network and app issues. With remote locations you often lack on-site IT resources, but the main problem with answering complaints and closing tickets is quickly identifying the root cause of the issue. App identification is key, but what you really need is a holistic picture of the network. You need visibility into the application delivery path with an emphasis on seeing the handoff points between your app, the network and the client. Relying on just SNMP data for local networks or BGP data for the internet doesn't provide this value because you're looking at the whole network or the whole internet when you're trying to troubleshoot one connection.
So you should monitor the network path between users and apps and then monitor the apps themselves. AppNeta's monitoring technology covers all these areas, and allows our users to narrow their focus when finding the cause of issues. For example, if Salesforce is slow, IT may check local and WiFi networks at the user location to see if too many employees are using bandwidth on another app. From there, IT can isolate the network path hops that are within Salesforce's domain, then check synthetic performance against that app to see if it's an app or network problem. Big SaaS apps do load balancing of requests behind their firewall, but if you can see the hops then you can identify whether the problem is in their infrastructure.
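The triage order in the Salesforce example (local network first, then the path hops, then a synthetic check against the app) can be expressed as a small decision function. This is only a sketch of that reasoning, not how AppNeta's product works; the `Hop` structure, the owner labels and every threshold are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Hop:
    owner: str       # hypothetical label: "local", "isp", or "provider"
    loss_pct: float  # packet loss observed at this hop, in percent

def classify_issue(local_utilization: float,
                   hops: List[Hop],
                   synthetic_ms: float,
                   baseline_ms: float,
                   max_utilization: float = 0.9,
                   max_loss_pct: float = 2.0,
                   slow_factor: float = 2.0) -> str:
    """Return a coarse label for where a slow-app complaint likely originates."""
    # Step 1: is the office LAN/WiFi saturated by other traffic?
    if local_utilization >= max_utilization:
        return "local network congestion"
    # Step 2: walk the path in order and report the first lossy hop's owner.
    for hop in hops:
        if hop.loss_pct > max_loss_pct:
            return f"network path ({hop.owner})"
    # Step 3: the path looks clean, so compare the synthetic app check to baseline.
    if synthetic_ms > slow_factor * baseline_ms:
        return "application"
    return "no fault found"

path = [Hop("local", 0.0), Hop("isp", 0.1), Hop("provider", 0.0)]
print(classify_issue(0.4, path, synthetic_ms=900, baseline_ms=250))
```

Checking the local network first mirrors the order in the example above: it is the part IT owns and can fix directly, so ruling it in or out early narrows the rest of the investigation.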
All of these tips or best practices are best done in concert with each other. Planning for scale, identifying and prioritizing apps and ensuring fast troubleshooting with visibility into both the network and app performance are key for modern monitoring.
Published at DZone with permission of Christine Cignoli, DZone MVB. See the original article here.