The content of this article was originally written by Scott Jarr on the VoltDB blog.
Recall that in the world of Big Data, our fundamental assumption is that data no longer resides in a static database for its entire life. Big data demands that we squeeze out the most value from the data we have at every stage of its lifecycle. And, oh yeah, we're collecting way more today than we did yesterday, so get ready for that challenge, too.
Part 2: Putting the Pieces Together
Let’s build on the concepts we developed in Part 1 so we have a foundation on which to map different kinds of solutions onto the data value continuum.
Application Complexity Changes the Rules

To begin with, let’s add a new (vertical) axis to our data value continuum graph called Application Complexity. I'm taking a few liberties here by combining speed, analytic depth and data size into this single Application Complexity axis (the right axis).
Traditional Databases Work - If Slow, Simple and Small Is Enough
In the old days, when slow/simple/small was the status quo for most database applications, the traditional database performed just fine. It was capable of meeting the needs of interactive applications through to analytic processing applications - as long as those needs were characterized as slow, simple and/or small. 20+ years ago, one-size really did fit all in the database world.
Not today! There are endless stats about the growth of data, but I'm sure no one reading this blog needs to be sold on the changes we are all experiencing in the speed, volume and complexity of the data. And it’s only getting faster, bigger and more complex. In this new world, the traditional one-size-fits-all database just doesn't stand a chance. (This is Mike Stonebraker’s original thesis on the need for database specialization.)
Startups Filled the Gaps

Naturally, as the problems of data have gotten bigger, we have seen startups do exactly what they are good at - finding areas of high value that are under-served because of major technology shifts. Not surprisingly, these areas of high value are neatly represented in the value lines from our chart.
Starting at the right, the value of data is predominantly extracted by looking at the aggregate of data points. Technologies like Hadoop, R and other deep exploratory analytics tools are taking hold here.
Moving toward the left of the graph, data warehouses continue to serve a significant need in data analysis and reporting. When using these systems, we typically know in advance what we are looking for, and we can express our queries in reasonably specific ways.
NoSQL alternatives are producing some compelling capabilities in a growing number of use cases where lookups of large data sets are the driving functionality.
Continuing to the left, we have recently seen NewSQL alternatives addressing the remaining white space in the value graph. This is the realm of interactive, operational workloads. Increasingly, real-time analytics are playing a critical part in making the best possible decisions in an operational environment. (I'll talk more about this in an upcoming post.) This market is a little younger than the others, but it is solving a very important piece for the Big Data Enterprise - quite frankly, one that Oracle has dominated for far too long.
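To make the idea of real-time analytics in an operational workload concrete, here is a minimal sketch of one common pattern: maintaining a sliding-window count over a stream of events so a decision (throttle, alert, recommend) can be made at the moment a transaction arrives. This is purely illustrative - it is not VoltDB's API, and the class and names are my own; systems in this space typically push this kind of aggregation into the database itself, next to the transactions.

```python
from collections import deque


class SlidingWindowCounter:
    """Count events per key over a fixed time window.

    Illustrative sketch only - a stand-in for the windowed
    aggregation a real-time analytic store would maintain.
    """

    def __init__(self, window_seconds):
        self.window = window_seconds
        self.events = deque()   # (timestamp, key) pairs, oldest first
        self.counts = {}        # key -> count of events inside the window

    def record(self, timestamp, key):
        """Ingest one event as it arrives (the operational write path)."""
        self._evict(timestamp)
        self.events.append((timestamp, key))
        self.counts[key] = self.counts.get(key, 0) + 1

    def count(self, timestamp, key):
        """Answer a real-time analytic query: events for key in the last window."""
        self._evict(timestamp)
        return self.counts.get(key, 0)

    def _evict(self, now):
        # Drop events that have aged out of the window.
        while self.events and self.events[0][0] <= now - self.window:
            _, key = self.events.popleft()
            self.counts[key] -= 1
            if self.counts[key] == 0:
                del self.counts[key]


counter = SlidingWindowCounter(window_seconds=60)
counter.record(0, "user1")
counter.record(10, "user1")
counter.record(30, "user2")
print(counter.count(30, "user1"))   # both user1 events still in the window
print(counter.count(65, "user1"))   # the event at t=0 has aged out
```

The point of the sketch is the coupling: the same data structure serves both the high-velocity write path and the analytic read path, which is exactly the workload shape that sits in this part of the value graph.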
If you haven’t figured it out yet, VoltDB is a NewSQL product focused on solving high performance requirements within the interactive and real-time analytic market. We strongly believe that, in the coming months and years, the deluge of data will cut across every industry, and real-time operations (both transactions and analytic queries) will become a competitive advantage for technology-savvy organizations. The value of individual data items and a company’s ability to make the best decisions on real-time data will separate the winners from the also-rans.
I'd like to leave you with a quote that I believe best signals the role velocity will play in the future of Big Data. Perhaps most telling is the source of the quote:
"The leading edge of Big Data is streaming data"