Buyers Beware: Data Visualization Is Not Data Analytics
If all you’re looking for is a beautiful report, then data visualization tools might work for you. When it comes down to the nitty gritty of data analysis, they are definitely not enough.
The term "business intelligence solution" can be deceiving. Many software solutions that call themselves BI can actually only offer you half of what you need.
Here it’s important to make the distinction between two types of business analysis and intelligence tools: end-to-end solutions and ones that are merely front-end. An end-to-end solution is made up of a platform backend (basically the tools and algorithms that handle preparing all the data), and a frontend that creates data visualizations and dashboard reporting.
While we like to see our data in easy-to-handle visualizations, platforms that only give you this are not enough to get real insights from your company’s data. With data visualization tools, as you can imagine from their name, you don’t get the initial, background stages of preparing and joining the data. This means that users first need data that can be fed into the software, i.e., a pre-made central database.
When it comes to enterprise needs, the difference between these two types of software is strikingly clear. It’s also clear that visualizations, though important, cannot be the sole component of powerful business intelligence software.
Get to Know the Back Story
Dashboards are deceptively simple, and most users take for granted all the work that goes on behind the scenes to clean and link up the normally vast amounts of data that go into business reports. With lower-quality data, or data that is spread out over many disparate platforms and databases, even more work must be done to create a base from which to start analyzing. At the end of the day, preparing data for analysis can take up to 80% of the time devoted to a typical project.
Effective analysis starts with having all your data in one central place, so that you have a single version of the truth to work from. You also want to be able to update and change that data while still working from the same source. Unfortunately, creating a data repository for a business today isn’t so simple.
The sheer number of platforms and software tools that companies use to collect data, from Excel to Salesforce and from Google Analytics to CRM software, makes it almost impossible to manually go through and create one database. In addition, with all these disparate sources and users, misnamed, outdated, and messy data are unavoidable.
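To make the cleaning-and-joining problem concrete, here is a minimal Python sketch of what a backend has to do before any chart can be drawn. Everything in it is hypothetical and chosen only for illustration: the field names (`customer`, `region`, `owner`), the sample rows, and the matching rule (ignore case and extra whitespace). Real tools apply far more elaborate rules.

```python
def normalize(name: str) -> str:
    """Lowercase a name and collapse whitespace so spelling variants match."""
    return " ".join(name.lower().split())


def consolidate(*sources):
    """Merge records from several sources into one dict keyed by normalized name.

    Assumption: the first source to supply a non-empty field wins;
    later sources only fill in fields that are still missing.
    """
    merged = {}
    for source in sources:
        for record in source:
            key = normalize(record["customer"])
            entry = merged.setdefault(key, {})
            for field, value in record.items():
                if field != "customer" and value is not None:
                    entry.setdefault(field, value)
    return merged


# Hypothetical rows: the same customer, spelled differently in two systems.
spreadsheet_rows = [{"customer": "Acme Corp ", "region": "EMEA"}]
crm_rows = [{"customer": "acme  corp", "region": None, "owner": "dana"}]

central = consolidate(spreadsheet_rows, crm_rows)
print(central)
```

Even this toy version has to decide how names are matched and which source wins on conflict. Those are the decisions that a frontend-only tool leaves for you to make by hand, every time new data arrives.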
With tools that lack the built-in backend components to automatically do the syncing and cleaning process, you can be sure you’ll be spending ages just trying to figure out what’s going on with any report. You’ll end up either having to repeat the same work every time you add new data, or even investing in other software to do it for you. A lot of the time you just won’t be able to get into the really interesting insights.
Updating and Collaborating in Real Time
For an analysis tool to be truly useful to an organization, it must be updated constantly to account for changes. However, this can easily lead to bottlenecks when updates are left to a single person or department, such as IT.
Visualization tools that don’t have preparation capabilities pull their data from decentralized sources, which can easily fall out of sync when a number of collaborators are accessing them. Then you get a big mess of different data with unreliable dashboards and reports, because it becomes extremely hard to keep on top of who has the latest numbers. The more users you have accessing the data sources and changing or updating them, the more errors you get, and the harder it becomes to use the system.
BI software should allow a number of people to collaborate and change existing data sets. With an end-to-end solution you get the benefits of working with a centralized data repository that combines data any way you need it. Any query run on the server, by any user, relies on one version of the truth, which puts an end to contradictory reports.
Putting the Intelligence in Business Intelligence
Once you’ve got all your data in one place, analysis comes down to solving complex calculations that involve a few sets of numbers. This can be done to a limited extent by programs such as Excel. The problem is that you have to do a lot of manual work for each calculation to take place. For deeper analysis, you have to create multi-stage formulas that perform a number of calculations at once. For example, to calculate average total sales per month, you need to calculate both a sum and an average: first the total of the items sold in each month, then the average of those monthly totals.
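The average-sales example can be sketched in a few lines of Python. This is only an illustration of the two-stage calculation, not how any particular BI product implements it; the function name, the `(month, amount)` record shape, and the sample figures are all made up for the example.

```python
from collections import defaultdict


def average_monthly_sales(sales):
    """Average of monthly totals: sum per month first, then average the sums."""
    totals = defaultdict(float)
    for month, amount in sales:
        totals[month] += amount  # stage 1: total sales for each month
    if not totals:
        raise ValueError("no sales records to average")
    return sum(totals.values()) / len(totals)  # stage 2: average the totals


# Hypothetical records: two sales in one month, one sale in the next.
sales = [("month-1", 100.0), ("month-1", 50.0), ("month-2", 250.0)]

print(average_monthly_sales(sales))  # monthly totals are 150 and 250 -> 200.0
```

Note that the result (200.0) is not the same as the plain average of the three sale amounts (about 133.3). The order of the aggregations matters, which is why the sum and the average cannot be collapsed into a single step.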
Visualization tools focus on reporting data rather than analyzing it, so they tend to run on a restrictive platform that limits the number of aggregations you can use per formula. To make it work, you have to summarize data before you make calculations. In other words, instead of calculating sum and average at the same time, each step would have to be done separately, saved, and then calculated together.
This cumbersome process can be avoided with an end-to-end solution, as these enable users to create complex formulas that work across separate sources. The software automatically does all the necessary pre-calculations, allowing you to skip straight ahead to the information you were after.
Not Just a Pretty Face
If all you’re looking for is a beautiful report, then data visualization tools might work for you. When it comes down to the nitty gritty of data analysis, they are definitely not enough. BI software that is end-to-end and incorporates a robust backend that can handle huge amounts of messy data is essential to most businesses these days. Just don’t fall for the pretty face of tools that don’t have the necessary backend support.
Published at DZone with permission of Aya Ephrati, DZone MVB. See the original article here.