Originally Written by Geert Meulenbelt
Remember how BI companies used to market big data a few years ago? Most of them just connected to a Hadoop cluster, made a query, visualized a report or dashboard and then hit the “Big Data Press Release” button.
This practice raises a fundamental question: Where is the data you visualize? The choices lie between two extremes:
- At one end of the spectrum you query a data source and generate visualizations directly from the data it returns.
- At the other end of the spectrum you query a pre-generated and pre-aggregated data source that is dedicated to analytics and visualization. This data source used to be an OLAP cube, but enhanced in-memory techniques are now emerging, many of which use columnar database technology.
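To make the two extremes concrete, here is a minimal sketch in Python that uses the standard library's sqlite3 module as a stand-in for both the source system and the analytics store. The `sales` and `sales_by_region` tables, the sample rows and the function names are all hypothetical, chosen only for illustration. A real deployment would more likely pair a Hadoop cluster or operational database with an OLAP cube or columnar in-memory engine.

```python
import sqlite3

# Hypothetical source data; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("EMEA", 120.0), ("EMEA", 80.0), ("APAC", 50.0), ("APAC", 150.0)],
)


def totals_direct(conn):
    """Extreme 1: aggregate the source rows at the moment the chart is drawn."""
    cursor = conn.execute(
        "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
    )
    return dict(cursor.fetchall())


def build_summary(conn):
    """Extreme 2, step 1: pre-aggregate into a table dedicated to analytics."""
    conn.execute("DROP TABLE IF EXISTS sales_by_region")
    conn.execute(
        "CREATE TABLE sales_by_region AS "
        "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"
    )


def totals_preaggregated(conn):
    """Extreme 2, step 2: the chart reads only the pre-built summary."""
    cursor = conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region"
    )
    return dict(cursor.fetchall())


build_summary(conn)
direct_before = totals_direct(conn)
pre_before = totals_preaggregated(conn)

# A new sale arrives after the summary was built.
conn.execute("INSERT INTO sales VALUES ('EMEA', 100.0)")
direct_after = totals_direct(conn)
pre_after = totals_preaggregated(conn)

print("direct:        ", direct_after)
print("pre-aggregated:", pre_after)
```

In this toy setup the direct query picks up the new sale immediately, while the pre-aggregated table keeps serving the old total until `build_summary` runs again. That is the trade-off between freshness and query cost in miniature.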
The choice you make depends on a number of factors, including the query speed you require, the volume and variety of your data sources, and your need to display information in real time. It also depends on the size of your wallet.
Although the two techniques are often used simultaneously, it is increasingly clear that the industry is moving towards real-time visualizations based upon an analytical database. This creates economies of scale, reduces errors and speeds up implementation. Close your eyes for a few seconds and imagine a BI project without data modeling.
Analytics, visualizations and databases are slowly merging together. Examples of technology in this area include SAP HANA, “software on silicon” database enhancements, and open source projects like Apache Spark.
While it is important to keep in mind where the industry is heading, in-memory techniques will not disappear overnight, for obvious reasons that include cost, security and performance.
Now let’s think about different user categories (as we defined them here) and their need for in-memory data queries. I would chart them like this:
I believe Data Discoverers are the users whose work is most vulnerable to the industry move away from in-memory data queries. And this leads me to the following conclusion:
BI companies that have focused on smart and good-looking data discovery tools have grown tremendously during the last couple of years. But that growth is not here to stay because the underlying technology simply relies too much on in-memory data queries.
Read Geert Meulenbelt’s previous posts:
- Do You Deliver Customer Branding?
- Understanding 4 Types of Data Users
- 3 Trends in Embedded Analytics
- Moving From Big Data to Data Driven Applications