Top 5 Business Intelligence Mistakes
Learn about the top five business intelligence mistakes, from using incorrect data or deleting it prematurely to asking the wrong questions.
These days, companies large and small have an insane amount of data to help with decision making.
Even a small mom-and-pop restaurant with a cloud-based reservation system can forecast how much of each ingredient to order for the week. Yet we all still make bad decisions. Why?
First of all, let’s not blame the data. By itself, data can’t do anything.
If there’s anyone to blame, it’s us. That’s right: The human beings behind the data.
We are the ones who decide what data to record, how to record it, how to analyze it, and how to look at it. Between the moment we have a question and the moment we make a decision, there are numerous chances to misuse data and arrive at the wrong conclusion. It’s like walking through a minefield.
Working in the analytics field, I’ve seen hundreds of data analyses go nowhere, wasting thousands of hours of effort. So I’m going to share five of the most prevalent mistakes I’ve seen.
What’s the Actual Problem?
I once helped an e-commerce company analyze their top 10 sources of new visitors. After seeing the results, they were ecstatic to find that both their paid campaigns and their blog were on the list. These were sources that they could actively control and scale. So they did just that: They invested more money in their paid campaigns and kept their blog active.
Yet a few weeks in, they started to wonder why their effort didn’t translate into higher revenue. A lot of new people were visiting the site, but not buying.
The simple answer is that the analysis they asked for answered one specific question: Which sources brought the highest number of new visitors? It did not show which sources brought the most paying customers, or the customers with the highest lifetime revenue, either of which would have been more relevant to their actual problem of growing revenue. So to avoid wasting time, effort, and money, I recommend making sure the question you ask matches the problem you are trying to solve.
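To make the difference between the two questions concrete, here is a minimal sketch in Python. The source names and all numbers are hypothetical and chosen only for illustration; they are not the company's data. It ranks the same traffic sources twice, once by new visitors and once by paying customers.

```python
# Hypothetical figures for illustration only; not taken from the company in the story.
sources = {
    "paid_campaigns": {"new_visitors": 12000, "paying_customers": 60},
    "blog": {"new_visitors": 8000, "paying_customers": 40},
    "email_referrals": {"new_visitors": 1500, "paying_customers": 90},
    "partner_sites": {"new_visitors": 900, "paying_customers": 45},
}


def rank_sources(data, metric):
    """Return source names sorted from highest to lowest on the given metric."""
    return sorted(data, key=lambda name: data[name][metric], reverse=True)


# Same data, two different questions, two different rankings.
by_visitors = rank_sources(sources, "new_visitors")
by_customers = rank_sources(sources, "paying_customers")

print("Ranked by new visitors:    ", by_visitors)
print("Ranked by paying customers:", by_customers)
```

With these made-up numbers, the paid campaigns top the visitor ranking while email referrals top the customer ranking, which is the kind of gap the original analysis could not reveal.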
Is the Sample Statistically Significant?
I once observed a sales team cancel a process change after 10 prospects failed to convert under the new process (they handled about 200 prospects a month on average). Ten prospects is far too small a sample to support any conclusion. It was not a data-driven decision. It was an emotional decision.
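A quick calculation shows why 10 failures in a row prove so little. The sketch below assumes each prospect converts independently at a fixed baseline rate; the rates used are assumptions, since the team's real conversion rate is not stated here.

```python
def prob_all_fail(conversion_rate, n_prospects):
    """Probability that n independent prospects all fail to convert."""
    if not 0.0 <= conversion_rate <= 1.0:
        raise ValueError("conversion_rate must be between 0 and 1")
    return (1.0 - conversion_rate) ** n_prospects


# Assumed baseline rates; the team's actual conversion rate is unknown.
for rate in (0.05, 0.10, 0.20):
    p = prob_all_fail(rate, 10)
    print(f"baseline {rate:.0%}: chance of 10 straight misses = {p:.1%}")
```

Under these assumptions, 10 straight misses would happen by chance roughly 60% of the time at a 5% baseline, 35% of the time at 10%, and 11% of the time at 20%, even if the new process changed nothing at all.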
I’ve also witnessed a case where a company made product decisions based on half a dozen phone interviews with select clients that they had good relationships with. This particular company had 500+ clients. Half a dozen people out of a population of 500+ clients cannot give an accurate view of growth opportunities. The quality of the sample was also questionable: all the clients interviewed had a good relationship with the company, which means that the opinions of unhappy and churned customers were not represented.
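As a rough illustration of how little six interviews can say about 500 clients, the sketch below computes the textbook margin of error for a proportion, with a finite population correction. It assumes a simple random sample and a normal approximation, which is generous here: the interviews were hand-picked, and the approximation is weak for a sample this small.

```python
import math


def margin_of_error(sample_size, population_size, proportion=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample and applies the finite population correction.
    proportion=0.5 is the worst case; z=1.96 corresponds to 95% confidence.
    """
    if population_size < 2 or not 0 < sample_size <= population_size:
        raise ValueError("need population_size >= 2 and 0 < sample_size <= population_size")
    standard_error = math.sqrt(proportion * (1 - proportion) / sample_size)
    correction = math.sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * correction


print(f"6 of 500 clients:   +/- {margin_of_error(6, 500):.0%}")
print(f"100 of 500 clients: +/- {margin_of_error(100, 500):.0%}")
```

Six interviews give a margin of about 40 percentage points either way, so "most clients want feature X" could mean almost anything. A random sample of 100 clients would narrow that to roughly 9 points, and no sample size fixes the selection bias of talking only to happy customers.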
Sampling problems, including selection bias and undersized samples, abound in business intelligence. Startups are especially prone to taking shortcuts and using poor samples. Sometimes, it’s because there is simply not enough data; for instance, if a company has just started acquiring customers, there may not be enough of them to make the analysis statistically significant. Other times, it’s pure impatience: Teams want to make decisions now, not in two weeks, so they often don’t wait for their experiments to finish.
The result is a decision based on bad data.
Are the Numbers Relevant?
I’ve also witnessed many companies set their sales goals based on historical results, but then change their entire sales process and expect the same goals to be hit.
How can one expect the outcome to be the same when all the input variables have changed? It’s like planning to fly from New York to Los Angeles in six hours, then swapping the plane for a car and still expecting to get there in six hours.
Let’s recognize that the analysis or forecast that we do is only good for the scenario that we considered. Should we decide to tweak or change our scenario, a new analysis needs to be performed.
Are You Sure the Numbers Are Right?
NASA once lost a $328 million satellite in space because one of its components did not use the same measurement units as the rest of the system. Target lost $5.4 billion in Canada partly because its inventory system had incorrect data.
Time and again, huge mistakes have been made because the underlying data fueling a project was bad to begin with.
So to make sure that my analysis is accurate, I often ask a second party to check the numbers. No one should be the only reviewer of their own essay, and the same rule applies to analyses.
What Does This Mean?
Having access to information doesn’t mean that we know what to do with it. I’ve seen many people confused by data reports and unsure of what decision to make.
I once helped a B2B company evaluate which customer group to target for an advertising campaign. Their product was used by customers from three different industries, but they didn’t have the resources to tailor their sales processes and marketing content to all three groups yet.
So they began by looking at revenue generated by the three industries. Then they looked at revenue growth over time, profitability, and lifetime revenue. The results showed that 50% of their revenue came consistently from one industry, but that another industry was the fastest growing, going from 10% to 35% of their revenue over the past year. Both were potentially good choices to target and they didn’t know which one to pick.
So I asked them to divide the total revenue by the number of clients in each industry, giving us the average revenue per client. My logic was that their sales and marketing efforts were going to be spent on a limited number of prospects, so targeting prospects with higher individual revenue should yield a better ROI (e.g. between a $500/year client and a $5,000/year client, I’d advise choosing the $5,000/year client, assuming that the cost of support is similar). The analysis showed that the fastest-growing industry was also the one with the highest-paying clients, which made the decision a little easier.
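The calculation itself is simple division, as the sketch below shows. The industry names, revenue totals, and client counts are hypothetical; only the 50% and 35% revenue shares echo the story above.

```python
# Hypothetical figures; the company's real revenue and client counts are not given.
industries = {
    "industry_a": {"annual_revenue": 500_000, "clients": 250},  # 50% of revenue
    "industry_b": {"annual_revenue": 350_000, "clients": 70},   # 35%, fastest growing
    "industry_c": {"annual_revenue": 150_000, "clients": 100},  # 15%
}


def average_revenue_per_client(segments):
    """Divide each segment's total revenue by its number of clients."""
    return {
        name: seg["annual_revenue"] / seg["clients"]
        for name, seg in segments.items()
        if seg["clients"] > 0  # skip empty segments to avoid dividing by zero
    }


averages = average_revenue_per_client(industries)
best = max(averages, key=averages.get)

for name, value in averages.items():
    print(f"{name}: ${value:,.0f} per client per year")
print("Highest average revenue per client:", best)
```

In this made-up example the largest industry by total revenue earns $2,000 per client, while the fastest-growing one earns $5,000 per client, so the per-client view breaks the tie that the totals could not.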
The point is that what matters is looking at the right information, not just any information. This requires people who can interpret data, explain caveats, and tell a story. I highly recommend that all managers, data analysts, and data scientists read Cole Nussbaumer’s book Storytelling with Data.
We Deleted What?
I once tried to help a SaaS company understand their user churn trends, only to discover that they deleted customer account information three months after deactivation. This meant that there was data only on recently churned clients. The sample proved too biased to draw any useful conclusions.
Developers may delete data because they are running out of room on their hard disk, or because they think that a certain piece of data is unimportant. Regardless of what developers think, from an analytical perspective, we should never ever ever delete data.
Published at DZone with permission of Blake S., DZone MVB.