How Big Data Causes Big Problems for the WAN
Generally, the more data you collect, the better. But when you need to deal with WAN-connected sites, lots of new data is moving over the same old networks, which slows things down considerably.
Who doesn't love data these days? For business leaders, the more data you can collect, the better. Some might resist certain technologies in digital transformation, but it seems everyone can agree that big data analytics makes business life easier.
Everyone except IT leaders, that is. While business leaders add ever-increasing data inputs with IoT sensors, IT teams are left to find new ways of supporting complicated big data infrastructures without much guidance or extra budget. That, coupled with increasing WAN-connected sites, means a lot of new data is moving over the same old networks.
To say that big data is stressing your WAN would be an understatement. Networking for big data is a new consideration for a lot of IT teams. However, it's not enough to just implement more WAN optimization tools to solve the problem.
Growing big data efforts are causing big problems. The answer isn't a simple appliance; it's an overhaul of your approach to WAN management and network monitoring in general.
The Problem: Exponential Growth of East-West Traffic
Your typical WAN optimization solutions were built with north-south network traffic in mind. Remote and branch-office employees needed seamless application performance with direct, uninterrupted access to central servers.
This north-south optimization is still important when it comes to networking for big data, as end users rely on applications to derive insights. But when it comes to your WAN performance, the real problems are rooted in the east-west network traffic that supports big data collection and analysis.
Unlike north-south traffic, which can tolerate a degree of latency, east-west big data traffic must move at high network speeds. The value of big data lies in real-time processing and data delivery. If you don't have real-time access to data, the insights you derive will be outdated by the time you can make use of them. Real-time data delivery will become increasingly important to enable agile decision making and automation, especially as the Internet of Things takes hold in your business.
The machine-to-machine communication required for big data will impact your WAN planning on multiple fronts:
- Bandwidth management: Regardless of the source, collecting ever-increasing amounts of data eats up network bandwidth. If bandwidth is strained and you can't pinpoint the precise reason, end-user experience will suffer and big data's value will disappear.
- Data replication: All the big data in the world won't help you if disaster strikes and the data is unprotected. Data replication is critical for analytics effectiveness. But at the same time, the more resources you devote to big data replication, the more bandwidth you consume. Is your WAN capable of handling both big data collection and replication?
- Data accuracy: Data corruption isn't a new challenge for IT leaders. However, it may have been easier to identify in the past, when there weren't so many servers supporting east-west traffic. Big data will exacerbate the problem, and if WAN managers can't identify corrupt data (and its source), the business will suffer.
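To make the collection-plus-replication question concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes, purely for illustration, that collected data crosses the WAN once and that each replica copy crosses it one more time; the function names, the replica count, and the link size are hypothetical and not taken from any particular product.

```python
def required_mbps(daily_gb: float, window_hours: float) -> float:
    """Average throughput (Mbit/s) needed to move daily_gb within window_hours."""
    bits = daily_gb * 8 * 1000**3  # decimal gigabytes to bits
    return bits / (window_hours * 3600) / 1e6


def link_utilization(collect_gb: float, replica_copies: int,
                     link_mbps: float, window_hours: float = 24.0) -> float:
    """Fraction of the link consumed by collection plus replication.

    Assumption: every replica copy adds one more full transfer over the WAN.
    """
    total_gb = collect_gb * (1 + replica_copies)
    return required_mbps(total_gb, window_hours) / link_mbps


if __name__ == "__main__":
    # Hypothetical site: 1,080 GB collected per day, one replica, 400 Mbit/s link.
    share = link_utilization(1080, replica_copies=1, link_mbps=400)
    print(f"Big data traffic would use {share:.0%} of the link on average")
```

An average like this hides bursts, so it is best read as a lower bound on what the link needs before end-user traffic is even considered.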
The challenge that underlies each of these management pain points is that application usage must be tied to big data demands. A simple WAN optimization tool won't cut it; a broader monitoring mindset is essential.
Networking for Big Data: The Value of Visibility
One of the best things WAN managers can do as big data stresses their networks is to double down on QoS enforcement. You want complete control over traffic prioritization to ensure end users don't feel the pain of any potential big data-related problems on the back end.
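As a rough illustration of what traffic prioritization means, the sketch below models strict-priority queuing in Python. Real QoS enforcement happens on routers and WAN appliances rather than in application code, and the traffic class names and their ordering here are assumptions made for the example.

```python
import heapq
from itertools import count

# Hypothetical traffic classes; a lower number is served first.
PRIORITY = {"interactive": 0, "replication": 1, "bulk_ingest": 2}


class StrictPriorityQueue:
    """Always releases the highest-priority packet that is waiting."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class

    def enqueue(self, traffic_class: str, packet: str) -> None:
        entry = (PRIORITY[traffic_class], next(self._seq), packet)
        heapq.heappush(self._heap, entry)

    def dequeue(self) -> str:
        return heapq.heappop(self._heap)[2]

    def __len__(self) -> int:
        return len(self._heap)


if __name__ == "__main__":
    q = StrictPriorityQueue()
    q.enqueue("bulk_ingest", "sensor-batch")
    q.enqueue("interactive", "dashboard-query")
    while q:
        print(q.dequeue())
```

Strict priority is the simplest policy to reason about, but it can starve low-priority big data transfers under sustained load, which is one reason production QoS policies often use weighted schemes instead.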
But even more important is the ability to pinpoint which big data processes and applications are causing bandwidth and integrity problems. No network is perfect, so it would be unreasonable to expect big data to never cause a network problem. However, you can still take the steps necessary to guarantee visibility throughout your application portfolio.
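One way to picture this kind of visibility is to roll flow records up by application and rank the heaviest consumers. The Python sketch below assumes flow data has already been exported into simple records; the `FlowRecord` fields, application names, and addresses are hypothetical and do not mirror any specific monitoring tool.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple, Tuple


class FlowRecord(NamedTuple):
    app: str         # application or big data process that owns the flow
    src: str         # source host
    dst: str         # destination host
    bytes_sent: int  # bytes transferred in the sampling interval


def top_talkers(flows: Iterable[FlowRecord], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n applications that moved the most bytes, largest first."""
    totals = Counter()
    for flow in flows:
        totals[flow.app] += flow.bytes_sent
    return totals.most_common(n)


if __name__ == "__main__":
    sample = [
        FlowRecord("replication", "10.0.1.5", "10.0.2.5", 700_000),
        FlowRecord("crm", "10.0.3.9", "10.0.1.2", 40_000),
        FlowRecord("replication", "10.0.1.6", "10.0.2.6", 500_000),
        FlowRecord("analytics-ingest", "10.0.4.1", "10.0.1.5", 300_000),
    ]
    for app, total in top_talkers(sample):
        print(f"{app}: {total} bytes")
```

Grouping by `src` or `dst` instead of `app` would answer the related question of which hosts are responsible for the traffic.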
Published at DZone with permission of Joe Michalowski , DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.