
8 Reasons Why So Many Analytics Projects Fail

To better understand why analytics projects fail, we break some major challenges down and propose best practices to ensure the success of your next analytics project.

What does data analytics really mean and how can it help you today? Search Google for the term “data analytics” and you’ll be confronted with more than 39 million results. Good luck with that.

To put it simply, data analytics can help you predict the future and make decisions with relative certainty. Though it may provide a competitive advantage today, analytics-based decisions are increasingly becoming the norm across industries. Whether in the private sector, a government agency, or a nonprofit, everyone is jumping on the bandwagon. Yet analytics has its challenges and, done wrong, can quickly drain your resources without leading to actionable results.

Data analytics and predictive modeling are notoriously hard. Just look at the obstacles:

  • Too many data systems and tools. In the business intelligence (BI) arena there are dozens of comparable yet distinct tools to evaluate. How do you know which are right for your industry or business?
  • User competence. Any team embarking on data mining, predictive analytics, and predictive modeling must be very well versed in math and statistics.
  • Dirty data. Getting to a point where you have data that you can work with can suck the life out of your analytics project. Many organizations spend 80% of their time cleaning up data.
  • Misplaced priorities. It’s imperative, but often overlooked, that analysis is done early on to reconcile what business leaders think they want, versus what they need, versus what they can achieve. Typically, this happens too late during the discovery and project phase.
  • Analytics is boring. The C-suite has a short attention span. To keep them engaged, you must continuously provide them with insights to ensure their full support throughout the project.

To better understand why analytics projects fail, we’ve broken these challenges down and proposed some best practices to ensure the success of your next analytics project.

Data Analytics: Challenges and Misconceptions

There are several challenges and misconceptions that lead data analytics projects to fail. Here are eight of the most common ones we encounter.

1. Data Silos

Any vendor who’s trying to sell you a BI tool will invariably start their pitch talking about your in-house data challenges, most of which have their genesis in disparate data sources and data silos.

Here at EastBanc Technologies, we see customers with tens of thousands of internal databases that don’t talk to each other. Many lack the robust API layer needed to get programmatic access to data. Instead, they gravitate towards the concept of the data warehouse, where big data is collected and stored in one place before it’s used for analysis. Although they have a role to play, data warehouses are overrated. Not only do they take years and cost millions to implement (something your BI service provider won’t tell you at the outset), they also add to your data glut and delay actionable insights.

Data warehouses can also raise your storage costs and make the data analytics process more laborious. Because warehouses extract data into data cubes each time you need to use it, duplicate data is created (one terabyte can quickly become 50 terabytes). Plus, the data in a cube isn’t updated automatically, potentially leaving you with stale data that creates noise in your system and makes the data cleaning process even more laborious.

Tip: An API eliminates this problem by automating data sharing so that you can start showing results with the data sampling that you have, rather than wait for all your data to become available.

2. Dirty Data

The second data analytics challenge facing organizations is dirty or inconsistent data. Common examples of dirty data include basic hygiene issues like misspellings and empty data fields, which can cloud your insights. But the problems can be more systemic. For example, a company’s CRM system may define the northeast U.S. in a way that includes Washington, D.C., while the HR department does not consider Washington, D.C. to be part of the northeast. If you merge CRM data with HR data, the same region label now means two different things. While tools exist that can pull data out of disparate systems, they don’t always flag data conflicts.
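
A check for this kind of conflict can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular tool: the region tables, the state lists, and the function name `find_region_conflicts` are all hypothetical and exist only to mirror the CRM versus HR example above.

```python
# Hypothetical region definitions from two systems; the member lists are
# illustrative only and do not reflect any real CRM or HR configuration.
CRM_REGIONS = {"northeast": {"NY", "NJ", "PA", "MA", "DC"}}
HR_REGIONS = {"northeast": {"NY", "NJ", "PA", "MA"}}


def find_region_conflicts(first, second):
    """For each region name both systems use, list members only one includes."""
    conflicts = {}
    for region in sorted(first.keys() & second.keys()):
        only_first = first[region] - second[region]
        only_second = second[region] - first[region]
        if only_first or only_second:
            conflicts[region] = {
                "only_in_first": sorted(only_first),
                "only_in_second": sorted(only_second),
            }
    return conflicts


conflicts = find_region_conflicts(CRM_REGIONS, HR_REGIONS)
print(conflicts)
```

Running a comparison like this before merging surfaces the definitions that disagree, so someone can decide which one wins instead of discovering the mismatch in a report.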

3. Lack of a Plan

To get to predictive analytics nirvana and rein in data sprawl, you need a plan: an analytics roadmap. Since internal teams rarely have the expertise for this task, organizations, quite rightly, look to outside experts. It’s very hard to build in-house the breadth of experience (in both the number and the diversity of projects) that outside experts can contribute.

4. Fear and Fiefdom

Which brings us to the next challenge: fear and fiefdom. Data owners may fear they’ll lose relevance or control of their data if they are forced to share data sets with other departments or agencies, or if external expertise is brought in.

Tip: Counteract fiefdom through evangelism. It behooves your organization to prove how sharing data creates value across the board.

5. The Old School Mindset

Moving up the internal organizational hierarchy, the leadership team also has a role to play in why analytics projects fail. Many business leaders have a “green bar report” approach, meaning they trust the old way of doing things. Instead of basing their decisions on actionable results, they want to review each bar, a time-consuming process.

In addition, there’s a lack of continuous involvement or sponsorship by executives in the analytics process. It’s the very antithesis of Agile IT.

Consider a typical analytics project. During the kickoff meeting, everyone is in the room. Yet leadership may not rejoin the team for another three months. When they finally ask for an update, they find that the outcomes are not what they were looking for, or that priorities have shifted and no one bothered to tell the analytics team.

Tip: The solution to this problem is to adopt a lightweight, agile approach. Your business sponsor doesn’t need to be involved all the time, but you do need periodic, short feedback loops. Feed your sponsor minuscule pieces of fast progress. This will ensure they see immediate, incremental results, get exactly what they need, and drive greater engagement.

6. The Big Bang Approach

If you lack the internal resources to support your data analytics project and decide to bring in an external firm, that’s fine. No one ever got fired for choosing Accenture or Deloitte. But, these industry leaders often promote the concept of an all-in, big bang, multi-million-dollar approach. The approach may work, but it’s not the only way to go.

There is another route, lower risk and faster to results, that embraces agile and focuses on achieving Minimal Viable Predictions. After all, an analytics tool, like a piece of software, is a product, and product and software development best practices still apply. Most companies that provide analytics services don’t have a product mentality. Instead, they focus on maximizing billable hours, which stymies the transfer of knowledge and intellectual property to the product owner who is going to run the tool, aka the customer.

The big bang approach also plays on the common misconception that analytics is a one-time project: that you set everything up once and reap the benefits. It doesn’t work that way. It’s also not a black box effort. Everything must be transparent with immediate, disciplined, and regular feedback loops.

7. Focusing on Big Data Rather Than the Right Data

Another misconception is that you need to start big. Starting small is a much more fruitful (and lower cost) exercise than bringing in heavyweight tools and consultants. Time and again we have seen some of the most valuable business insights derived from surprisingly small data sets. It’s time to focus not on big data but on the right data at the right time and find a way to ask the right questions of that data. Starting small also minimizes risk.

Is there a tool for this? Again, that’s another misconception. You can’t just buy a tool, install it, and hope it will magically predict the future. As your business needs evolve and your digital transformation matures, your choice of tools will also change.

8. Expecting Certainty

Finally, don’t expect 100% certainty in your predictive analytics. It’s simply not possible. In some cases, we’ve seen margins of error of 30-40%, and that’s OK.
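
The article does not say how these margins of error are measured. As one possible reading, the sketch below computes the mean absolute percentage error (MAPE) of a set of predictions against what actually happened. The choice of MAPE, the function name, and the sample numbers are assumptions made for illustration.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed as a percentage."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return total / len(actual) * 100


# Made-up monthly figures: each prediction is off by 30% of the actual value.
actual = [100.0, 200.0, 400.0]
predicted = [130.0, 140.0, 520.0]

error_pct = mean_absolute_percentage_error(actual, predicted)
print(f"Mean absolute percentage error: {error_pct:.1f}%")
```

Tracking a number like this from one iteration to the next shows whether the predictions are improving, even when they are never exact.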

Predictive Analytics Done Right: Breaking Through the Obstacles

How can you overcome these obstacles and misconceptions and ensure data analytics success?

Start by addressing the number one challenge facing internal teams – the data silo problem. How do you start analyzing data when you can’t even integrate it cleanly?

We came up with a concept to break out of these data silos without data owners fearing they’ll lose control of their data. All the relevant technology is packed into a container, and each department gets access, via an API, to the data that’s relevant to it (or that the data owner feels comfortable sharing). We call it an “API-in-a-box” and it can be spun up in minutes, eliminating the time-consuming data integration problem. Plus, after data errors are found and one department’s data is merged with another’s, actionable insights start to emerge and the barriers of fear and fiefdom start to break down.
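
The idea that a data owner decides what an API exposes can be illustrated with a small sketch. This is not EastBanc Technologies’ “API-in-a-box” implementation, which the article does not describe in detail; the field names, the `SHARED_FIELDS` allowlist, and the `to_shared_view` function are hypothetical.

```python
import json

# Hypothetical allowlist chosen by the data owner: only these fields are shared.
SHARED_FIELDS = {"customer_id", "region", "lifetime_value"}


def to_shared_view(records, shared_fields):
    """Return copies of the records containing only the allowlisted fields."""
    return [
        {key: value for key, value in record.items() if key in shared_fields}
        for record in records
    ]


# Made-up CRM rows; the notes field stays private to the owning department.
crm_records = [
    {
        "customer_id": 1,
        "region": "northeast",
        "lifetime_value": 1200.0,
        "account_manager_notes": "prefers phone calls",
    },
]

# This JSON string is what an API endpoint would hand to another department.
payload = json.dumps(to_shared_view(crm_records, SHARED_FIELDS), sort_keys=True)
print(payload)
```

Because the allowlist lives with the data owner, other departments get usable data while the owner keeps control over what leaves the silo.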

But a best practice approach to predictive analytics done right doesn’t just hinge on technology. How do you focus on the right data? How do you start small and see actionable results in weeks or months not years?

Part of the reason that many projects fail is that organizations rush to accumulate and analyze as much data as possible, all at once, and without much of a plan. This usually ends terribly. It’s not unusual anymore for organizations to house multiple petabytes of data. But starting big can lead to an overwhelmed team, not to mention huge costs. These organizations won’t be able to see the insights for the data, so to speak. With more data and more capabilities to rationalize data across silos, it can feel like there are more opportunities than there are resources to exploit them.

Our best practices approach to this challenge borrows from agile software development. We call it the Minimal Viable Prediction (MVP) approach.

Rather than trying to eat the entire elephant at once (your mouth is too small and you’ll choke), take the low-risk approach to data analytics that a successful start-up might take: the MVP. Disregard the noise and assemble only the data that correlates with your number one problem, as fast as you can, and iterate from there. This also ensures your business sponsor will see the kind of actionable results that drive greater engagement and, ultimately, the long-term success of your predictive analytics projects. Read more about how to implement MVP here.
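
One simple way to decide which data “correlates with your number one problem” is to rank candidate columns by the strength of their Pearson correlation with the outcome you care about. The sketch below is only an assumption about how a first pass might look. The column names and numbers are made up, and correlation is a rough screening signal, not proof that a column is useful.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_by_correlation(columns, target):
    """Sort column names by absolute correlation with the target, strongest first."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up data: did the customer churn (1) or stay (0)?
churned = [0, 0, 1, 1, 1, 0]
columns = {
    "office_floor": [3, 1, 2, 3, 1, 2],
    "support_tickets": [1, 0, 5, 4, 6, 1],
}

ranked = rank_by_correlation(columns, churned)
for name, score in ranked:
    print(f"{name}: {score:.2f}")
```

Starting with the top-ranked columns keeps the first iteration small, and the ranking can be recomputed as more data sources are added.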

Another best practice approach to better data management is to rely less on your own internal data. There is a tendency to want to own everything in-house (no matter how dirty). Instead, try to create and maintain as little data as is necessary and always look for external sources. Data can be readily leased or purchased. For example, weather data is readily available for purchase and is much cheaper than collecting your own.

This monetization of data is increasingly commonplace. Another example is an EastBanc Technologies client who assembled sports data from national leagues, packaged it into an API, and licensed it. All the data was created and maintained by the leagues; the company simply used that data and provided it to others for a fee.

Topics:
data analytics, big data, dirty data, data silos, predictive analytics

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
