Concerns About the State of Big Data Today
Three primary concerns: 1) privacy and security; 2) lack of collaboration and an ecosystem mentality; and 3) the need to deliver business value and solve problems.
Here's who we talked to:
Uri Maoz, Head of U.S. Sales and Marketing, Anodot | Dave McCrory, CTO, Basho | Carl Tsukahara, CMO, Birst | Bob Vaillancourt, Vice President, CFB Strategies | Mikko Jarva, CTO Intelligent Data, Comptel | Sham Mustafa, Co-Founder and CEO, Correlation One | Andrew Brust, Senior Director Marketing Strategy, Datameer | Tarun Thakur, CEO/Co-Founder, Datos IO | Guy Yehiav, CEO, Profitect | Hjalmar Gislason, Vice President of Data, Qlik | Guy Levy-Yurista, Head of Product, Sisense | Girish Pancha, CEO, StreamSets | Ciaran Dynes, Vice President of Products, Talend | Kim Hanmark, Director, Professional Services, TARGIT | Dennis Duckworth, Director of Product Marketing, VoltDB.
We asked these executives, "What are your biggest concerns around the state of big data today?"
Here's what they told us:
- Lack of tools. Lack of an ecosystem. Lack of Hadoop engineers and marketing executives who understand big data. Smarter people can solve smarter problems, but they need the right tools to do so. We need to advance education around Cassandra and MongoDB for big data and cloud administration. The knowledge gap is disconcerting.
- Privacy concerns and bypassing legal mechanisms. Big data is a great tool for making a positive impact; however, it can also be used for the negative. Mapping the genome is ultimately for the good but could be used for the bad.
- The notion of moving from business intelligence software to big data engines is not realistic.
- Big data is too developer- and data-scientist-centric and not focused enough on operations, so it is unable to deliver continuous business value. We need a data operations center with well-understood roles, responsibilities, and skills that will promote collaboration among data engineers, data scientists, consumers, and operations.
- We need more agreement to work together on core platforms that are not significantly different. Competition is great, but there needs to be collaboration. Work as a collective and set standards so everyone is part of a big data ecosystem.
- Big data is not a sufficiently high priority. The org chart needs to adapt to the way we work. B.I. needs to collaborate with IT. Think Agile, not a rigid B.I. approach.
- 1) Operational elements – clashes between protocols, transfer of data, disparate data sources. Need to connect but you get stuck with a middleware layer. 2) Cybersecurity is a huge problem and its impact is yet to be realized. Someone will colonize a big data structure and hop between instances of different customers. 3) Traffic and bandwidth as a function of cheap storage. While we’ll have algorithms to handle data more efficiently, it will be difficult to transfer data to storage or to analysis. We need to push processing into the data stack and analyze on the edge.
- Hype. That’s why we wrote “Big Data in Small Bits.” The term “big data” is overused and under-defined. We ingest big data and provide associates with the information they need to do their jobs rather than reams of reports to go through.
- Expectations. Big data systems capture data but don’t solve problems on their own. The data is complicated to use. People think it’s a magic bullet, but it really requires a lot of knowledge and work.
- Immaturity. Organizations haven’t thought through the implications of the amount of data being generated. They put data in data lakes but are unable to do anything with it, and they end up wasting it: deleting, eliminating, or losing what was valuable data. We’re three years away from having the level of education needed to work with high volumes of data. Distributed systems will result in an order-of-magnitude increase in data.
- Security and privacy of the data and the transaction continue to be a concern. When we’re connecting data across different entities, we need to ensure the connections are secure on both ends.
- Privacy, security and sharing models. We are fortunate to partner with a company like Salesforce that takes such great pains to ensure security. Privacy and security of personal and corporate data will be some of the biggest political battles that people all over the world will wrestle with for the foreseeable future. Where is the line and where is the acceptable place to move it to?
Do you have any concerns around the state of big data?
Opinions expressed by DZone contributors are their own.