Data Triad: The Power of Three in Global Traffic Management
Server health, experience health, and business health are the three key measurements for managing traffic in a way that provides a better user experience.
From the legendary tridents of Poseidon and Shiva to the sacred trinities of Christian and Taoist tradition, the power of three is a story from the beginning of time. We see it reflected in our symbolic systems (triangles, Celtic knots, Borromean rings), in our government institutions (Executive, Legislative, Judiciary branches), and in stories (The Three Musketeers, A Christmas Carol, and the Three Fates of mythology). Triangulation, a method for determining a location by forming triangles to it from known points, is a central concept in GPS, surveying, 3D optics, astronomy, and much more.
Likewise, we can form more precise and comprehensive network intelligence with data from multiple points. In global traffic management, we need to synthesize multiple data sources to create intelligent routing methods. It is not sufficient to know that systems are available — we need to know what the end user is experiencing. But even these two points are not the whole picture. Is it clear, for example, what resources are being used to provide that high-quality experience? How much are those resources costing? Is the current use of CapEx resources being optimized?
Internet traffic continues to grow in a relentless surge. Given the complexities of hybrid IT and multi-cloud app and media delivery, it is now impossible to monitor and control for all potential complications (not to mention optimizations) without automated intelligence. And that intelligence should be based on a triad of data perspectives.
Perspective 1: Server Health
Real-time system health checks are the first point of the data triad for intelligent traffic routing. They begin with accurate, low-latency, geographically dispersed synthetic monitoring, which reliably answers, in real time, the question: is the server up and available? Avoiding high-latency monitoring options, which can take precious minutes to notice that a system is down, takes one of the biggest headaches for any DevOps team off the table.
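As a rough illustration of the availability question, the sketch below probes a server over TCP and then combines the results reported by several vantage points. The function names, the one-second timeout, and the majority rule are assumptions made for this example, not a description of any particular monitoring product.

```python
import socket


def tcp_probe(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def is_available(probe_results, quorum=0.5):
    """Decide availability from several vantage points.

    `probe_results` maps a (hypothetical) vantage point name to the boolean
    result of its probe. The server counts as available only if more than
    `quorum` of the vantage points reached it, so one flaky probe location
    does not flip the verdict on its own.
    """
    if not probe_results:
        return False
    successes = sum(1 for ok in probe_results.values() if ok)
    return successes / len(probe_results) > quorum
```

In practice each vantage point would run `tcp_probe` (or an HTTP-level check) every few seconds and report its result to a central decision point, which is what keeps detection time short.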
"On/Off" confidence, however, is necessary but insufficient: to route traffic effectively, one must also know the current health of the servers that are available. A system that is working fine may be approaching its resource limits, and a simple On/Off measurement won’t know this. Without this key piece of information, a traffic manager can send so much traffic to a near-capacity resource that it goes down, potentially setting off a domino effect as traffic floods other working resources. And where local load balancers (LLBs) and Application Delivery Controllers (ADCs) were once able to handle incoming requests on their own, modern infrastructure and delivery require smarter, more distributed, and non-proprietary solutions. When LLBs are not dynamically and centrally controlled, it can take an unacceptable amount of time to switch over when a cluster goes down.
LLBs running without real-time intelligence, then, are susceptible to slowdowns, micro-outages, and cascading failures, especially if hit with a DDoS attack or an unexpected surge. And indeed, there are times when it’s necessary to make changes to your standard resource model: updates, repairs, natural disasters, and app or service launches. Without scriptable load balancing, you have to dedicate significant time to shifting resources around, and problems mount quickly if someone takes down a resource but forgets to make the proper notifications and preparations ahead of time. Dynamic global load balancers (GLBs) use real-time system health checks to detect potential traffic or resource problems, route around them, and send an alert before failure occurs so that you can address the root cause before it becomes a fire drill.
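One way to picture the difference between an on/off check and a health-aware decision is the minimal sketch below. It assumes each server reports a utilization figure between 0 and 1; the `ServerHealth` type, the 0.8 soft limit, and the headroom-proportional weighting are illustrative choices, not a prescribed algorithm.

```python
from dataclasses import dataclass


@dataclass
class ServerHealth:
    name: str
    is_up: bool          # result of the on/off availability check
    utilization: float   # 0.0 (idle) to 1.0 (saturated); hypothetical metric


def routing_weights(servers, soft_limit=0.8):
    """Return (weights, alerts) for a list of ServerHealth records.

    Servers that are down get no traffic. Servers at or above `soft_limit`
    are reported in `alerts` and routed around while healthier servers
    exist. The remaining servers share traffic in proportion to headroom.
    """
    up = [s for s in servers if s.is_up]
    alerts = [s.name for s in up if s.utilization >= soft_limit]
    healthy = [s for s in up if s.utilization < soft_limit]
    # If every live server is near capacity, keep using them rather than
    # dropping traffic entirely.
    pool = healthy or up
    headroom = {s.name: max(1.0 - s.utilization, 0.01) for s in pool}
    total = sum(headroom.values())
    weights = {name: h / total for name, h in headroom.items()}
    return weights, alerts
```

With servers A (50% utilized), B (90% utilized), and C (down), all traffic goes to A and an alert is raised for B before it fails, which is the behavior an on/off check alone cannot produce.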
Perspective 2: Experience Health
The second point of the data triad is Real User Measurements (RUM), which provide information about Internet performance at every step between the client and the clouds, data centers, or CDNs hosting your application. This data should be crowd-sourced by collecting metrics from thousands of autonomous systems (the ISP and other networks identified by Autonomous System Numbers, or ASNs), delivering billions of RUM data points each day. This kind of traffic intelligence can’t be gathered from your own system (unless you’re Google-sized). Even if you have millions of users each day, you will have reasonably deep measurements from only a few hundred ASNs. Community-sourced intelligence is necessary to see what’s really going on in the far-flung reaches of your growing application universe.
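To make the point about measurement depth concrete, here is a small sketch that groups RUM samples by network and delivery platform and reports a median only where enough samples exist. The sample format, the `min_samples` threshold, and the platform names are assumptions for illustration; the ASNs used in the usage note come from the range reserved for documentation.

```python
from collections import defaultdict
from statistics import median


def median_latency_by_asn(samples, min_samples=3):
    """Aggregate RUM samples into a median latency per (ASN, platform).

    `samples` is an iterable of (asn, platform, latency_ms) tuples.
    Pairs with fewer than `min_samples` measurements are left out, since a
    handful of data points says little about a whole network.
    """
    buckets = defaultdict(list)
    for asn, platform, latency_ms in samples:
        buckets[(asn, platform)].append(latency_ms)
    return {
        key: median(values)
        for key, values in buckets.items()
        if len(values) >= min_samples
    }
```

For example, three samples of 40, 50, and 300 ms from AS64500 to a platform called "cdn-a" yield a median of 50 ms, while a network with only one or two samples is simply absent from the result. That absence is the gap community-sourced data is meant to fill.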
Community intelligence is just as important for monitoring the experience of big, messy pools of users as it is for the mysterious pockets of users on the edges of your network. Many countries (e.g., Brazil, Russia, Canada, Australia) have thousands of ISPs. Most likely, these areas are important to your global delivery needs and business success. Excellent user experience data is particularly important where there are so many individual peering agreements and technical relationships, each of which can cause performance to vary.
Combined with Server Health intelligence, RUM intelligence ensures we route traffic to servers that are up and running, not about to fall over, and are demonstrably providing great service to end users.
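A minimal sketch of that combination follows. It assumes two inputs that would come from the monitoring layers described above: a `health` mapping of platform name to `(is_up, utilization)` and a `rum_latency_ms` mapping of `(asn, platform)` to median latency. All names and the 0.8 soft limit are hypothetical.

```python
def pick_platform(health, rum_latency_ms, asn, soft_limit=0.8):
    """Pick a platform for a user on network `asn`.

    1. Keep only platforms that are up and below `soft_limit` utilization.
    2. Among those, prefer the lowest measured RUM latency for this ASN.
    3. If no RUM data exists for this ASN, fall back to the least loaded
       eligible platform.
    Returns None when nothing is eligible.
    """
    eligible = [
        name for name, (is_up, utilization) in health.items()
        if is_up and utilization < soft_limit
    ]
    if not eligible:
        return None
    measured = [
        (rum_latency_ms[(asn, name)], name)
        for name in eligible
        if (asn, name) in rum_latency_ms
    ]
    if measured:
        return min(measured)[1]
    return min(eligible, key=lambda name: health[name][1])
```

Note that the fastest platform on paper is skipped if it is down or near capacity, which is the point of layering the two perspectives rather than using either alone.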
Perspective 3: Business Health
Which brings us to the third point of the data triad. What more could there be if you are able to dynamically control systems and user experience from global to granular levels? As long as everything is up and running and users are happy, what more is there to worry about? Quite a bit, actually. As in, quite a bit of money. Along with systems and user experience, optimizing spend is fundamental to business outcomes. Cloud overflow expenses can mount quickly. If you can’t feed cost and resource usage data into your global load balancer and automated application delivery, you won’t get traffic routing decisions that are as good for the bottom line as they are for QoE.
DevOps is increasingly responsible for business decisions in areas like cost control, product lifecycle optimization, resource planning, responsible energy use, and cloud vendor management. It’s time to put all your Big Data streams (e.g., software platforms, APM, NGINX, cloud monitoring, SLAs, and CDN APIs) to work producing stronger business results. By combining third-party data with real-time systems and user measurements, you can define your application delivery rules to prioritize datacenter utilization before expensive cloud bursting or to track green energy usage by your cloud, hosting, and CDN providers.
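A rule such as "use the datacenter before bursting to the cloud" can be sketched as a cost-aware selection over platforms that already pass the health and experience checks. The `Platform` fields, the cost figures, and the thresholds below are illustrative assumptions; real cost data would come from provider billing or contract terms.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    name: str
    cost_per_gb: float   # marginal delivery cost; 0.0 for capacity already paid for
    utilization: float   # 0.0 to 1.0
    latency_ms: float    # median RUM latency for the user's network


def choose_platform(platforms, max_latency_ms=150.0, soft_limit=0.8):
    """Return the name of the cheapest platform that is good enough.

    A platform is good enough if it is below `soft_limit` utilization and
    within `max_latency_ms`. Ties on cost are broken by lower latency.
    Returns None if no platform qualifies.
    """
    acceptable = [
        p for p in platforms
        if p.utilization < soft_limit and p.latency_ms <= max_latency_ms
    ]
    if not acceptable:
        return None
    return min(acceptable, key=lambda p: (p.cost_per_gb, p.latency_ms)).name
```

With a half-utilized datacenter at no marginal cost and a faster but metered cloud region, the datacenter is chosen; once the datacenter passes the soft limit, traffic bursts to the cloud.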
Every company has its own contextual business and performance priorities. Automating app delivery from the cloud or datacenter that makes the most sense (based on user experience, cloud cost/SLA, APM data, and more) is especially vital for the delivery of modern applications in a multi-cloud world. The added control layer provides the comprehensive visibility and application delivery control required to achieve cloud agility, performance, and scale while staying in line with business objectives and budget constraints.
Intelligent global traffic management isn’t just a “nice to have” feature. Increasingly, it is your protection against business disaster. Organizations of all types are under pressure to avoid the slow-drip service erosion of micro-outages, not to mention the public shaming spectacle of major outages or video streaming failures, by optimizing delivery performance. As always, these same organizations also have to make more efficient use of resources, comply with regulations, and cope with a chronic shortage of talent. It’s time to put the promise of Big Data to the test by harnessing the power of three—server, user experience, and business health data streams—into an intelligent, global load balancing platform.