Addressing Big Data Failures in Predictive Analytics
This article covers how to prevent big data failures in predictive analytics by diving into strategies for proper implementation, common mistakes, proper big data structuring, and more.
The role of analytics has evolved considerably over the last few years. Big data has predictive capabilities that project managers could never have imagined in the early 2000s. However, implementing predictive analytics is a time- and resource-intensive process.
Big Data Strategies for Predictive Analytics Models
Helena Schwenk, Principal Analyst for MWD Advisors, has noted that most brands are aware of the benefits of using big data for predictive analytics. Predictive models are invaluable for applications in healthcare, marketing, just-in-time inventory management, personnel staffing, and countless other purposes.
However, recent studies have shown that many brands struggle with implementation. The problems they encounter stem from choosing the wrong big data framework, failing to conceptualize goals, and a lack of consistent data management.
Reputation Management and Other Predictive Analytics Applications
Developing a predictive analytics and big data management campaign is time-intensive and requires a lot of resources. It’s easier to make the investment if you understand the applications and their benefits. One of the most important applications is reputation management.
Many reputation management companies have highlighted the importance of predictive analytics, including Cognizant and RenegadeWorks. Israel Kloss, the founding principal of Findable Consulting, has even gone so far as to say that predictive analytics is the future of reputation management.
Big Data Mistakes in Predictive Analytics
There are a variety of reasons predictive analytics campaigns fail to meet their goals:
- Lack of realistic, measurable goals
- Inadequate infrastructure
- Inconsistent data preservation, usually caused by poor communication
- Failure to maintain strict user permissions
These problems can lead to expensive, even disastrous, mistakes. The good news is that they can be avoided if you establish the right data protocols for your predictive analytics campaigns. According to Computerworld, many projects have been turned around after these issues were addressed.
“This time, the business moved forward, and eventually the data scientists found a way to derive the missing target values from other data,” noted John Ainsworth, Data Scientist at Elder Research.
The project is now on track to deliver major cost savings by accurately predicting failures, avoiding costly shutdowns and identifying exactly where to apply expensive preventive maintenance procedures. Had they waited for perfect data, however, it never would have happened, Deal says, “because priorities change and the data never gets fixed.”
Structuring Big Data Properly
There are a number of steps that you can take to improve the execution of your predictive analytics campaign. Follow these guidelines to avoid a predictive analytics meltdown.
More than anything else, the success of your predictive analytics projects hinges on clear, measurable goals. Too many brands establish vague goals, such as improving decision-making or reducing inefficiencies.
The problem with these goals is that they don’t specify which variables need to be measured or which trends need to be studied. This leads to a problem that Koen Havlik, Data Scientist and Partner at Algoritmica, refers to as “treasure hunting” in a post for Datafloq.
Havlik stated, “Treasure hunting is rarely useful for predictive analytics. Your company has to identify, with some help, which processes are worth optimizing. Once the relevant data sets and process are identified, the business opportunity can be reduced to a data problem. Handing over a data set in the hopes that someone can find a pot of gold is not the way to go.”
Setting clearly defined goals helps you identify the variables that must be tracked. This allows you to create a clear hierarchical data structure that can:
- Store this data
- Access it in real time
- Organize and present it in a format that makes predictive analytics possible
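One way to make a goal concrete is to write it down as a metric, a baseline, a target, and the variables you intend to track. The sketch below is only an illustration: the `Goal` class, the metric name, the numbers, and the variable names are all hypothetical and do not come from any project mentioned in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Goal:
    """A measurable goal: one metric, a baseline, a target, and the inputs to track."""
    metric: str
    baseline: float
    target: float
    tracked_variables: tuple

    def is_met(self, observed: float) -> bool:
        # The direction of improvement is inferred from baseline vs. target.
        if self.target < self.baseline:
            return observed <= self.target
        return observed >= self.target


# Hypothetical example values, chosen only for illustration.
goal = Goal(
    metric="unplanned_shutdowns_per_quarter",
    baseline=12.0,
    target=8.0,
    tracked_variables=(
        "sensor_temperature",
        "vibration_level",
        "hours_since_maintenance",
    ),
)

print(goal.is_met(7.0))
print(goal.is_met(10.0))
```

Compared with a vague goal such as "reduce inefficiencies," a definition like this states up front which variables the data structure has to store and expose.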
Once the goals are established, it’s necessary to structure the data hierarchy. There are a few things that you need to keep in mind at this point:
- You must rank data according to its importance. Praveenkumar Hosangadi, Product Marketing Manager for IBM, said that the data hierarchy of importance is crucial in every IBM project. He cites “A Framework to Map and Grow Data Strategy,” a study by Theresa Kushner and Maria C. Villar, which set the new standard for the data hierarchy of needs.
- Data needs to be preserved in a structure that allows for scalability. Among other things, this means that big data engineers need to prioritize data independence. Unfortunately, most big data projects fail in this regard. Volker Markl, Professor and Chair of the Database Systems and Information Management (DIMA) group at the Technische Universität Berlin (TU Berlin), claims that big data projects will continue to fall short of their goals until this is resolved.
- Data needs to be easily extracted in real time. Organizations should use BigQuery or similar tools that enable data engineers to execute SQL queries quickly.
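To illustrate the last point, here is a minimal sketch of the kind of aggregate SQL query a data engineer might run. It does not use BigQuery or its client library; it uses Python's built-in `sqlite3` module with an in-memory database as a stand-in so the example is self-contained. The `events` table, its columns, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (region TEXT, calls INTEGER)")

# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("east", 120), ("west", 80), ("east", 30)],
)

# Aggregate the tracked variable per region, the typical shape of a
# query that feeds a predictive model.
rows = conn.execute(
    "SELECT region, SUM(calls) AS total_calls "
    "FROM events GROUP BY region ORDER BY region"
).fetchall()

for region, total_calls in rows:
    print(region, total_calls)

conn.close()
```

The SQL itself is standard, so a similar query should carry over to a hosted warehouse with little change, though the connection and client code would differ.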
By this stage, you should have a clear idea of your big data needs. You can invest in the right infrastructure to begin implementing your predictive analytics projects. However, you still have a long way to go before you successfully execute your predictive analytics system.
Communicating Data Strategy to Your Team
The final step in setting up a predictive analytics system is getting everyone on board. The truth is that data is structured around human users. It must be presented in a format that everyone can understand and use.
This means that a lot of problems can arise if everyone isn’t on the same page. You need to communicate your predictive analytics approach to everyone on your team. They need to understand the need to consistently preserve data in the same format. If anyone feels that the structure needs to change, they must communicate it to the rest of the organization, because serious problems can occur if changes are made without unanimous approval.
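One way to back up that agreement is a shared schema that every record is checked against before it is stored. The following is a minimal sketch under stated assumptions: the `EXPECTED_SCHEMA` fields and the `validate_record` helper are hypothetical names invented for this example, not part of any particular tool.

```python
# Hypothetical agreed-upon record format: field name -> expected Python type.
EXPECTED_SCHEMA = {
    "customer_id": int,
    "call_date": str,
    "duration_seconds": float,
}


def validate_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, expected_type in EXPECTED_SCHEMA.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            problems.append(
                f"wrong type for {field}: expected {expected_type.__name__}"
            )
    # Fields nobody agreed on are flagged rather than silently accepted.
    for field in record:
        if field not in EXPECTED_SCHEMA:
            problems.append(f"unexpected field: {field}")
    return problems


good = {"customer_id": 42, "call_date": "2017-11-24", "duration_seconds": 310.5}
bad = {"customer_id": "42", "call_date": "2017-11-24", "notes": "VIP"}

print(validate_record(good))
print(validate_record(bad))
```

Keeping the schema in one shared place means a proposed change is visible to the whole team before any data is written in the new format.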
A number of organizations have found that predictive analytics is invaluable for applications in healthcare, marketing, just-in-time inventory management, personnel staffing, and countless other purposes.
Call centers often use predictive analytics to determine their capital needs. The number of headsets they need is directly tied to the number of call center employees they have on hand. They often have to hire temporary employees to handle demand during the holidays and other peak seasons.
American Express, Citibank and other companies with large call centers need to use predictive analytics to forecast demand. They can then estimate the number of headsets they must order from a company like HeadSetPlus to accommodate their temp workers. This helps them operate their call centers much more efficiently.
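The arithmetic behind such an estimate can be sketched in a few lines. This is a deliberately naive, growth-adjusted calculation, not a real forecasting model and not how any company named above does it. The function name, every parameter, and all the numbers are hypothetical.

```python
import math


def headsets_to_order(last_year_peak_calls, growth_pct, calls_per_agent,
                      spare_pct, headsets_on_hand):
    """Estimate how many headsets to order for the peak season.

    All inputs are assumptions supplied by the caller:
    - last_year_peak_calls: peak daily call volume observed last year
    - growth_pct: expected year-over-year growth, in percent
    - calls_per_agent: calls one agent can handle per day
    - spare_pct: extra headsets to keep as spares, in percent
    - headsets_on_hand: headsets already owned
    """
    # Naive forecast: last year's peak scaled by expected growth.
    forecast_calls = last_year_peak_calls * (100 + growth_pct) / 100
    # Round up, since a fraction of an agent still needs a headset.
    agents_needed = math.ceil(forecast_calls / calls_per_agent)
    headsets_needed = math.ceil(agents_needed * (100 + spare_pct) / 100)
    # Never return a negative order.
    return max(0, headsets_needed - headsets_on_hand)


print(headsets_to_order(9000, 10, 70, 5, 100))
```

A production forecast would replace the single growth percentage with a model fitted to historical call volumes, but the conversion from forecast demand to equipment stays the same.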