BI vs. Big Data vs. Data Analytics by Example
I know that not everyone will agree with my definition of Business Intelligence, but my objective is to simplify things; there is enough confusion out there. Besides, who is the authority on an outdated terminology that doesn't cover the entire spectrum of value that intelligent data can bring to businesses today?
Business Intelligence (BI) encompasses a variety of tools and methods that can help organizations make better decisions by analyzing “their” data. Therefore, Data Analytics falls under BI. Big Data, if used for the purpose of Analytics, falls under BI as well.
Let’s say I work for the Centers for Disease Control and Prevention (CDC) and my job is to analyze the data gathered from around the country to improve our response time during flu season. Suppose we want to know about the geographical spread of flu for the last winter (2012). We run some BI reports and they tell us that the state of New York had the most outbreaks. Knowing that, we might want to better prepare the state for the next winter. These types of queries examine past events, are the most widely used, and fall under the Descriptive Analytics category.
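As a rough sketch of what such a descriptive query might look like in Python: the outbreak records and the function name `outbreaks_by_state` below are invented for illustration and are not real CDC data. The code only summarizes what already happened.

```python
from collections import Counter

# Hypothetical outbreak reports as (state, season) pairs.
# These values are made up for illustration only.
reports = [
    ("NY", 2012), ("NY", 2012), ("CA", 2012),
    ("NY", 2012), ("TX", 2012), ("CA", 2012),
    ("TX", 2011),
]


def outbreaks_by_state(records, season):
    """Count reported outbreaks per state for a single season."""
    return Counter(state for state, year in records if year == season)


counts = outbreaks_by_state(reports, 2012)
top_state, top_count = counts.most_common(1)[0]
print(f"Most outbreaks in {top_state}: {top_count}")
```

The point is that a descriptive report is an aggregation over past records; nothing here estimates or predicts anything.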
Now, we have just purchased an interactive visualization tool and I am looking at a map of the United States depicting the concentration of flu in different states for the last winter. I click on a button to display the vaccine distribution. There it is: I can see a direct correlation between the intensity of the flu outbreak and the late shipment of vaccines. I notice that the vaccine shipments for the state of New York were delayed last year. This gives me a clue to investigate further and determine whether the correlation is causal. This type of analysis falls under Diagnostic Analytics (discovery).
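What the eye picks up on the map can also be checked numerically. The following is a minimal sketch, assuming we have one shipment delay and one outbreak rate per state; the numbers and variable names are hypothetical. It computes a Pearson correlation coefficient, which measures association only and says nothing about cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-state values, invented for illustration.
delay_days = [0, 2, 5, 9, 14]                    # vaccine shipment delay
outbreaks_per_100k = [3.0, 4.0, 6.5, 9.0, 13.0]  # outbreak intensity

r = pearson(delay_days, outbreaks_per_100k)
print(f"correlation between delay and outbreak intensity: {r:.3f}")
```

A value close to 1 would support the visual impression, but it is still only a clue: a confounding factor could drive both the delays and the outbreaks.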
We go to the next phase, which is Predictive Analytics (PA). PA is what most people in the industry refer to as Data Analytics. It gives us the probability of different outcomes and it is future-oriented. US banks have been using it for things like fraud detection. The process of distilling intelligence is more complex and requires techniques like Statistical Modeling. Back to our example: I hire a Data Scientist to help me create a model and apply the data to the model in order to identify causal relationships and correlations as they relate to the spread of flu for the winter of 2013. Note that we are now talking about the future. I can use my visualization tool to play around with some variables, such as demand, vaccine production rate and quantity, to weigh the pluses and minuses of different decisions about how to prepare for and tackle the potential problems in the coming months.
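To make the idea of "applying data to a model" concrete, here is a deliberately tiny sketch: an ordinary least-squares line fitted to hypothetical past observations, then used to ask a what-if question about next winter. A real statistical model would be far richer and would report uncertainty; the data, the names `fit_line` and `predict`, and the choice of a straight line are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Evaluate the fitted line at a new input value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: shipment delay (days) vs. outbreaks per 100k people.
past_delay_days = [0, 2, 4, 6]
past_outbreaks = [2.0, 3.0, 4.0, 5.0]

model = fit_line(past_delay_days, past_outbreaks)

# What-if: compare two possible shipping scenarios for the coming winter.
for scenario_delay in (1, 10):
    estimate = predict(model, scenario_delay)
    print(f"delay={scenario_delay} days -> estimated outbreaks: {estimate:.1f}")
```

Changing `scenario_delay` is the code equivalent of moving a slider in the visualization tool. Note that a fitted line like this estimates association; establishing a causal relationship takes more than a regression.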
The last phase is Prescriptive Analytics, which involves integrating our tried-and-true predictive models into our repeatable processes to yield desired outcomes. An automated risk reduction system based on real-time data received from the sensors in a factory would be a good example of its use case.
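The factory example could be sketched along these lines. Everything here is hypothetical: `risk_score` is a stand-in for a validated predictive model, and the weights, sensor limits, thresholds, and action names are invented. What matters is the shape of the loop: a prediction feeds directly into an automated decision.

```python
def clamp(value, low=0.0, high=1.0):
    """Restrict value to the range [low, high]."""
    return max(low, min(high, value))


def risk_score(temperature_c, vibration_mm_s):
    """Stand-in for a trained model: returns a failure risk in [0, 1].

    The weights and limits are hypothetical, chosen only for illustration.
    """
    temp_part = clamp((temperature_c - 60) / 40)  # 60 C -> 0, 100 C -> 1
    vib_part = clamp(vibration_mm_s / 10)         # 0 mm/s -> 0, 10 mm/s -> 1
    return 0.6 * temp_part + 0.4 * vib_part


def prescribe(risk):
    """Map a predicted risk to an action (hypothetical thresholds)."""
    if risk >= 0.8:
        return "shut_down_line"
    if risk >= 0.5:
        return "schedule_inspection"
    return "continue"


# Hypothetical sensor readings: (temperature in C, vibration in mm/s).
readings = [(60, 0), (90, 5), (100, 10)]
for temperature_c, vibration_mm_s in readings:
    action = prescribe(risk_score(temperature_c, vibration_mm_s))
    print(temperature_c, vibration_mm_s, action)
```

In practice the scoring function would be the tested predictive model from the previous phase, and the actions would be wired into the plant's control or ticketing systems.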
Finally, here is an example for Big Data. Suppose it’s December 2013 and it happens to be a bad year for the flu epidemic. A new strain of the virus is wreaking havoc, and a drug company has produced a vaccine that is effective in combating the virus. The problem is that the company can’t produce the vaccine fast enough to meet the demand. Therefore, the Government has to prioritize its shipments. Currently, the Government has to wait a considerable amount of time to gather the data from around the country, analyze it and take action. The process is slow and inefficient. There are three contributing factors: computer systems that are not fast enough to gather and store the data as it arrives (velocity), computer systems that cannot accommodate the volume of data pouring in from all of the medical centers in the country (volume), and computer systems that cannot process varied kinds of data, such as X-ray images (variety).
Big Data technology changed all of that. It addresses the velocity-volume-variety problem. We now have computer systems that can handle "Big Data." The Centers for Disease Control and Prevention may receive data from hospitals and doctors' offices in real time, and Data Analytics software that sits on top of Big Data computer systems could generate actionable items that give the Government the agility it needs in times of crisis.
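As a toy illustration of turning incoming reports into an "actionable item," the sketch below keeps a running case tally as reports arrive and splits a limited vaccine supply across states in proportion to reported cases. It runs on one machine with a handful of made-up records, so it shows only the logic, not the scale a Big Data platform would handle. The report format, the function names, and the proportional rule are assumptions, not an actual government policy.

```python
from collections import Counter

running_cases = Counter()


def ingest(report):
    """Update the running tally with one incoming report."""
    running_cases[report["state"]] += report["new_cases"]


def allocate_doses(supply, cases_by_state):
    """Split supply across states in proportion to reported cases."""
    total = sum(cases_by_state.values())
    if total == 0:
        return {state: 0 for state in cases_by_state}
    # Start with the rounded-down proportional share for each state.
    allocation = {s: supply * c // total for s, c in cases_by_state.items()}
    leftover = supply - sum(allocation.values())
    # Hand the remaining doses, one each, to the states with the most cases.
    by_cases = sorted(cases_by_state, key=cases_by_state.get, reverse=True)
    for state in by_cases[:leftover]:
        allocation[state] += 1
    return allocation


# Hypothetical stream of reports from medical centers.
incoming = [
    {"state": "NY", "new_cases": 300},
    {"state": "CA", "new_cases": 300},
    {"state": "NY", "new_cases": 200},
    {"state": "TX", "new_cases": 200},
]
for report in incoming:
    ingest(report)

plan = allocate_doses(1000, running_cases)
print(plan)
```

Because the tally is updated per report, the allocation can be recomputed whenever new data arrives instead of waiting for a periodic nationwide collection.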