The history of computing is punctuated by periods of disruptive innovation that play havoc with the technology landscape.
For a while, database technology seemed immune to this kind of disruption, but by about 2004 it began to stumble – long before the explosion of smartphones and tablets. CPUs became multicore, enabling new approaches to scaling out databases across large grids of servers. At the same time, the Internet was producing larger collections of data than had ever been amassed before. It began, of course, with Yahoo! and Google but quickly spread to what we now call social media (LinkedIn, MySpace, Facebook, etc.) and to the rapidly expanding multiplayer games industry, where data on the activity of millions of individual players was being collected and analyzed. The traditional databases were no longer so widely applicable, and in some areas they were utterly inadequate.
The above table provides a summary of the database landscape that has emerged. The traditional databases, including open source ones like MySQL and PostgreSQL, are, of course, suited to the traditional OLTP, data mart and data warehouse workloads. There are also databases like Aerospike and VoltDB that specialize in extremely high volumes of OLTP transactions. This category is very close, but not identical, to the in-memory databases like SAP’s HANA or Kognitio, which simply focus on speed and response time.
The final category consists of databases whose common characteristic is that they are built to run on Hadoop’s HDFS file system. Currently, all of them seem to target the workloads of the traditional RDBMS, with the difference that they scale out better. They are unlikely ever to challenge either the analytical or the high-volume OLTP databases with respect to scale and capability, but as they mature, they may become attractive alternatives to the traditional RDBMS.
For businesses that are selecting database products for a specific type of application, our advice is to determine which category of database they need before deciding which products to investigate. While we can expect some rationalization among these database categories as time passes, we expect most of them to persist, with two or three products dominating each category. This is because the categories are derived from different types of workload, and we do not expect a database engine that excels in one of these categories to perform particularly well in the others.