What Scares You About Big Data?
Security is becoming more important as we become more reliant on data and the data becomes more distributed.
To gather insights on the state of big data in 2018, we talked to 22 executives from 21 companies who are helping clients manage and optimize their data to drive business value. We asked them, "What are your biggest concerns regarding the state of big data today?" Here's what they told us.
- Security becomes more important as we grow more reliant on big data and as that data becomes more distributed. A higher percentage of the data is now critical.
- We haven’t seen value materialize at the speed that was promised.
- The ecosystem is complex and messy, and therefore hard to learn — a challenge Alluxio and its partners see firsthand. Data is stored in a variety of storage systems, which makes it hard to manage and retrieve, and applications are hard to replicate. With so many types of environments, application vendors and cloud vendors struggle to make their software run smoothly across all of them.
- Not realizing the possibilities to add value to your business with cognitive and AI/ML. Companies relying on current information governance and intelligence archives rather than being flexible. AI/ML will make your current structure obsolete. Retrieval of data is painful — AI/ML makes it easy.
- The technology has matured, and there are many viable options, both open source and proprietary. It’s easy to fall into the trap of cluster sprawl and building another monolith. You are no longer limited by design, but you will drift back toward Conway’s Law if you’re not careful.
- Nothing is going to stop data. It’s the primary asset and it will continue to grow. The tech stack will continue to innovate to keep up with the growth of data.
- Developers are building applications within self-contained teams and need infrastructure that supports this model to make the process as efficient as possible. Orchestration platforms like Kubernetes along with container technologies help to enable this.
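To make that model concrete, a minimal Kubernetes Deployment manifest is sketched below. It shows how a self-contained team can declare its own service and let the orchestration platform handle scheduling and scaling; the service name, image, labels, and replica count are hypothetical placeholders, not details from the article.

```yaml
# Hypothetical manifest: one self-contained team owns and ships this service.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api            # assumed service name, for illustration only
  labels:
    team: orders              # label ties the workload to the owning team
spec:
  replicas: 3                 # the platform, not the team, handles placement
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
        team: orders
    spec:
      containers:
        - name: orders-api
          image: registry.example.com/orders-api:1.4.2  # team-built container image
          ports:
            - containerPort: 8080
```

Because the manifest is declarative, each team can check one of these into its own repository and deploy independently, which is the efficiency the self-contained-team model is after.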
- Customer experience is now the most important digital initiative, followed by building a single customer data view and customer journey management. Enterprises are indeed switching their focus from internal resource management to external customer experience. They now perceive their customers and their ecosystem, rather than their traditional internally focused R&D organizations, as a key source of co-innovation. This requires big data to be more collaborative than ever, encouraging participation, data sharing, and co-innovation. But too often, businesses get in their own way by refusing to create a culture around data and by failing to prioritize proper funding and staffing for data management. It also remains a real challenge to create a trusted environment with the enterprise's ecosystem in order to capture valuable data from partners, customers, or other stakeholders to improve the customer journey. In a customer experience network, the synergistic value created by the network is greater than the sum of its parts, and it gives enterprises the means to “crack the code” of delivering superior customer experiences.
- Slow, manual, one-off efforts that are discarded and require rework: an AI-driven data platform that lets you reuse the rules, policies, and business logic that cleanse, master, govern, and secure data ensures that each iteration of your big data effort leverages the learnings and investments of the prior one. Too much time spent finding data: an AI-driven enterprise data catalog automatically discovers data assets, minimizing this challenge. No common authoritative set of data assets for everyone to use: ensuring the data in your data lake is cleansed, mastered, and governed confirms it is certified “fit for use” across the organization. Preparing and cleaning data takes weeks, leaving insufficient time for analytics: AI-driven data prep capabilities fast-track analysts’ ability to get insights from raw data. Data lakes become data swamps full of data that is inaccurate, incomplete, and without context: a governed data lake ensures the data remains fit for use.
Here’s who we spoke to:
Opinions expressed by DZone contributors are their own.