The Data Tipping Point
We all hear about how much has changed in the world of data and analytics over the last decade. All of this is driven by the explosion of data… and I don’t think we’ve seen anything yet. The real impact of the Internet of All Things (IOaT) will be bigger than anything we’ve seen, and it’s only the beginning. The sheer volume of data, driven by data-in-motion, is predicted to rise from four to 44 zettabytes in this decade alone.
While the growth is important, something else is happening too, something quite simple but perhaps more important than burgeoning data volumes: we’ve reached a data tipping point. In front of us is a whole new world, created by this volume but shaped more dramatically by the variety and complexity of new data types and by how we need to capture analytic value.
Past the tipping point, the variety of data from sources such as infotainment, wearables, cybersecurity, sensors, vehicles and machines will continue to accelerate changes in the analytics industry. It is no longer good enough to think only about how we capture and store data. We also have to think about how we can create schemas in real time, and how we can use the data itself as the roadmap to analytics that differentiate our decisions and to the business requirements and reports we need to run the business.
With this as the backdrop, the implication is that incremental technology changes won’t be enough. Scalability is just table stakes. We also need technologies that can create the reverse funnel: one that lets us reach into petabytes of data and turn them into the three megabytes of useful, actionable information that truly matters, in a consumable form and available when we need it. Incremental change isn’t enough because of the requirements of this new world: the need to build ‘living’ schemas on demand, the need for a larger variety of new analytics, and a center of gravity that keeps shifting beyond the tipping point.
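To make the idea of a ‘living’ schema a little more concrete, here is a minimal Python sketch of one possible reading: rather than declaring a schema up front, the schema is derived from the records themselves and grows as new fields and types arrive. The function names (`evolve_schema`, `infer_schema`) and the sample events are hypothetical illustrations, not part of any Hortonworks product or API.

```python
from typing import Any, Dict, Iterable, Set

# A schema here is simply: field name -> set of type names observed so far.
Schema = Dict[str, Set[str]]


def evolve_schema(schema: Schema, record: Dict[str, Any]) -> Schema:
    """Merge the fields of one incoming record into the schema, in place."""
    for field, value in record.items():
        schema.setdefault(field, set()).add(type(value).__name__)
    return schema


def infer_schema(records: Iterable[Dict[str, Any]]) -> Schema:
    """Build a schema from a stream of records, one record at a time."""
    schema: Schema = {}
    for record in records:
        evolve_schema(schema, record)
    return schema


# Hypothetical events from a wearable and a vehicle (illustrative data only).
events = [
    {"device_id": "w-17", "heart_rate": 72},
    {"device_id": "v-03", "speed_kmh": 88.5, "engine_temp_c": 92},
    {"device_id": 41, "heart_rate": 75},  # same field, different type
]

schema = infer_schema(events)
for field in sorted(schema):
    print(field, sorted(schema[field]))
```

The point of the sketch is only that the data acts as the roadmap: new fields and conflicting types show up in the schema as soon as they show up in the stream, without anyone redefining a table first.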
Driven by these accelerating requirements, Hadoop grew up and continues to mature, and new technologies like Spark have entered the scene. But the tipping point also implies an entirely new set of tools: a platform that is future-proof because it is built to deal with change. One that can connect existing RDBMS, EDW, CRM and ERP systems with new data-in-motion systems, while still providing enterprise-class security and availability. And one that is built in an open way, to keep up with the speed of change.
That is why these attributes are so key to our technology strategy at Hortonworks. Our connected data platforms, HDP and HDF, allow us to bring agile new technologies like Spark to market quickly. That means being able to capture and manage data in a connected way, with tools that can create and disseminate analytics in real time, at the speed of thought. Going forward, this kind of flexibility is going to mean the difference between successful and unsuccessful data architecture strategies.
Published at DZone with permission of Scott Gnau. See the original article here.
Opinions expressed by DZone contributors are their own.