Issues Affecting Big Data Projects

“Lack of vision” was expressed in several ways by respondents. Lack of talent and security were also mentioned by multiple respondents.

To gather insights for DZone's Big Data Research Guide, scheduled for release in August 2016, we spoke to 15 executives who have created big data solutions for their clients.

Here's who we talked to:

Uri Maoz, Head of U.S. Sales and Marketing, Anodot | Dave McCrory, CTO, Basho | Carl Tsukahara, CMO, Birst | Bob Vaillancourt, Vice President, CFB Strategies | Mikko Jarva, CTO Intelligent Data, Comptel | Sham Mustafa, Co-Founder and CEO, Correlation One | Andrew Brust, Senior Director Marketing Strategy, Datameer | Tarun Thakur, CEO/Co-Founder, Datos IO | Guy Yehiav, CEO, Profitect | Hjalmar Gislason, Vice President of Data, Qlik | Guy Levy-Yurista, Head of Product, Sisense | Girish Pancha, CEO, StreamSets | Ciaran Dynes, Vice President of Products, Talend | Kim Hanmark, Director, Professional Services, TARGIT | Dennis Duckworth, Director of Product Marketing, VoltDB.

We asked these executives, "What are the most common issues you see affecting big data projects?"

Here's what they told us:

  • Big data is moving quickly but the lack of tools and enterprise solutions is limiting use. Oracle is the stickiest platform because they provided open Tableau for visualization. We’re building out our solution for the next 10 to 15 years.
  • The lack of qualified talent. Companies don’t know what they don’t know. There’s limited management understanding of how to use data to manage decisions.
  • Companies tend to be vague about what they are trying to accomplish. Big data is a common buzzword, and executives are giving CIOs money to solve big data problems. There’s a need to define where the business value will come from. Companies are confusing operational and non-operational problems and situations. Clients need to determine what’s affecting the bottom line.
  • Not knowing the right solution to use to solve the problem at hand.
  • 1) Doing big data for big data’s sake. Companies don’t know how to make it actionable. Start with a pragmatic use case. Set expectations of what you want to achieve. Have a solid value plan for the business, with an agreed-upon set of use cases and metrics. Decide how to ingest the data. 2) Understand what big data and Hadoop can and cannot do. Hadoop is not built for a large user-seat environment. Think about how to support use cases. Help customers understand what to surround Hadoop with to solve business problems.
  • Data drift is a big source of headaches. A common issue is delayed or false insights, and uncertainty about whether the insights can be trusted. Companies are using old algorithms on new data. If they don’t trust what the app is telling them, they need to run a forensics exercise to determine if the data can be trusted. There are a lot of false positives and false negatives with no governance processes.
  • 1) The skillset shortage is not going away. We could use 10,000 more Java developers today. 2) Need for project governance around data management, with the tools and services to manage it. 3) Isolation – consolidating nodes and clusters into single clusters leads to questions of multi-tenancy – resource use in isolation. 4) Security – who can use which data. Need metadata with data stored in Hadoop. No one has addressed security yet. Need to be able to read and cross-check against the user profile before granting access.
  • 1) Security – moving data around is not safe. Need someone who understands the infrastructure and the security protocols. Microsoft is making a significant effort to store information in the cloud securely. 2) Companies are slow to see how big data can provide them with a business advantage. We’re still a couple of years away, especially for mid-sized companies. The price of infrastructure is going down. Start with small goals to see results and prototype with fast iterations. 3) Need developers in the business intelligence community using the scrum methodology. Too many structured patterns in the B.I. community.
  • Getting adoption of big data within an organization. Every big data project is fighting the hype and needs to earn credibility. Our reports show how actions are aligned with the P&L, revenue, profits, or bonuses – the KPIs of the company.
  • Goals and expectations of data quality are unrealistic given the inconsistency of the data and the preparation required for analysis. Capturing more (or all) of the data and getting value from it is important, as is the ability to store data and then go back and look at it.
  • How to get 100% coverage of the data with granularity. How to scale to all business and IoT data. It used to take a company several days to identify a problem and now they’re able to identify the problem in real time. Ecommerce has the ability to see the granularity of an incorrect price and correct it before it does significant damage to the brand.
  • It depends. We have sophisticated clients that know exactly what they want. Others have been collecting large amounts of data and need guidance on what to use to ingest into a cluster and how to do computations, cleaning, and transformation of high volume formats. Need help figuring out which paths to take to make calculations or ask questions.
  • Some companies are still embarking on new big data technology investments without a business or use case. Companies need to understand the customer journey and be thinking about how to use data to improve customer engagement and customer experience.
  • Now we have actual problems to solve. How to make things happen – customer engagement, customer experience, and customer journey. Get the organization to work together to provide the best customer experience.
  • Displaying meaningful results to a client within minutes, if not seconds. Clients want things to work and to work fast. It’s our job to make sure that happens with just the click of a button.
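
One respondent's point about security, storing metadata with the data and cross-checking it against the user profile before granting access, can be illustrated with a small sketch. This is only an illustration of the idea under stated assumptions, not how Hadoop or any vendor mentioned here implements access control. The names `DatasetMetadata`, `UserProfile`, `can_access`, and the tag values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetMetadata:
    """Hypothetical metadata record stored alongside a dataset."""
    name: str
    required_tags: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class UserProfile:
    """Hypothetical user profile listing the tags a user is cleared for."""
    username: str
    granted_tags: frozenset = field(default_factory=frozenset)


def can_access(user: UserProfile, dataset: DatasetMetadata) -> bool:
    """Grant access only if the user holds every tag the dataset requires."""
    return dataset.required_tags <= user.granted_tags


# Illustrative data only; these datasets and tags are assumptions.
sales = DatasetMetadata("sales_2016", frozenset({"finance"}))
patients = DatasetMetadata("patient_records", frozenset({"pii", "health"}))
analyst = UserProfile("analyst", frozenset({"finance", "pii"}))

print(can_access(analyst, sales))     # True: analyst holds "finance"
print(can_access(analyst, patients))  # False: analyst lacks "health"
```

The check is a plain subset test, so a dataset with no required tags is readable by everyone, and a user must hold all of a dataset's tags, not just one of them.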

What issues do you see affecting the big data projects you're working on?


Opinions expressed by DZone contributors are their own.
