Executive Insights on the State of Big Data
Want to make your Big Data strategy successful? Know your use case, hammer out the operations, plan your strategy, and focus on talent.
To gather insights on the state of Big Data in 2018, we talked to 22 executives from 21 companies who are helping clients manage and optimize their data to drive business value.
There are several keys to having a successful big data strategy:
• Know the business problem you are trying to solve
• Governance and operations
• Strategy and structure
• Speed of delivery
Start with key use cases that can benefit from big data technology before moving to broad, organization-wide projects. Take a pragmatic approach: look at the requirements and the problems you want to solve. Focus on outcomes and results.
Data operations must ensure that data is moving across the enterprise securely and transparently. Backup, recovery, and protection are critical given the growth of ransomware and the importance of data to the business. Data governance becomes extremely important since data is sensitive and must be guarded appropriately.
Identify the data fabric strategy you are going to use to pursue new technologies: multi-cloud, hybrid processes, and microservices. Build common practices and a common architecture framework around the concept of a data lake. Provide fast-data processing and real-time analytics. Empower management to make informed decisions in real time.
There is a talent factor that must be considered to make any initiative successful. Have the right people in place who understand both the technology and the business goals. Provide resources for non-technical people to clean, work with, and garner insights from data.
We asked how companies can get more out of big data, since it seems like they are not seeing the return or business value that was initially anticipated. Several responses spoke to the difficulty, complexity, and time required to implement a big data initiative. This is consistent with the wisdom expressed by the late economist Rudi Dornbusch: things take longer to happen than you think they will, but then they happen much faster than you thought they ever could.
The keys for companies to get more out of big data more quickly are to:
• Focus on delivering value with data quickly.
• Use the cloud and new toolsets to accelerate the process.
• Identify specific business problems to solve.
Unify and process data in real time from disparate data stores to provide unique insights. Place greater near-term priority on carrying out investigations with business models that deepen insights into market segments and geographies.
Recognize data resources for their highest uses and ingest them into business operations to take more intelligent action and drive topline revenue growth. This requires real-time operations, analytics, and transactional integration.
While the uptake of big data projects may be slow, the case studies shared by our respondents show big data being successfully implemented in at least 10 vertical industries, with financial services, healthcare, and retail the most prevalent. There are myriad applications, including:
• Reduced fraud through more precise detection.
• More targeted, relevant sales and marketing activities.
• Improved customer experience and proactive churn detection.
• Analysis of IoT data using machine learning.
• Improved compliance with regulations through proactive identification of non-compliant activity.
The most common issues preventing companies from realizing the benefits of big data are:
• Inability to evolve from legacy technology.
• Insufficient knowledge of big data and skillset.
• Failure to define the business problem to solve.
• Data quality and management.
A major issue is unwillingness to embrace the cloud, along with not realizing that it is not feasible to keep supporting legacy enterprise systems. Some businesses want to use existing infrastructure rather than setting up the right backbone infrastructure with storage, transport, compute, and failover capabilities.
The people aspect is real, as expertise is necessary to understand the best technology needed to get the most from data. Companies don’t know where to start, and they need someone on board who knows how to approach big data projects.
You need to start with a defined purpose in mind. Identify the application and the use cases and work from there. Understand that big data analytics are sets of tools and technologies that must be selected and applied for measurable outcomes.
The last issue is an inability to get a handle on the data and on the need to move it from storage to compute and back again as needed. Preparing and cleaning the data can take weeks, which leaves insufficient time for analytics. Data lakes become data swamps thanks to data that is inaccurate, incomplete, and without context.
The biggest opportunities in the continued evolution of big data are using artificial intelligence (AI), machine learning (ML), and cognitive learning (CL) to provide higher-level services that drive business value. Healthcare will use AI/ML for disease diagnosis, detecting patterns humans never could. There will be more opportunities to use AI/ML to augment business resources while providing self-service to more users. Businesses will succeed by using AI/ML to make customers’ lives simpler and easier. There will be a natural evolution in AI/ML voice interfaces that reduces the friction with which people interact with machines.
Companies will be more responsive to customers in real time. All data types and sources will be integrated to provide real-time intelligence, and that data will be the number one business driver in every industry. Data protection and privacy by design will be fully integrated. There will be greater speed and reuse of data management processes, with greater trust in the data and less manpower required to manage it.
The only recurring concern regarding the state of big data was security, and it was mentioned by just four of 21 respondents. Security becomes more important as we become more reliant on data and as that data becomes more distributed. Security lapses are a function of human failure to follow best practices. Ultimately, this will be automated to reduce threats. While security is a big deal, some feel it cannot be regulated. Consumers may stop doing business with companies that are deemed to be insecure or unethical.
As usual, our respondents suggested a breadth of topics about which developers must be knowledgeable. Understanding the business problem you are working to solve was most frequently mentioned, followed by an understanding of deployment architectures and data.
Understand the use case and figure out the best solution stack to achieve your goals. Have a clear understanding of the range of business objectives within the company and how those align with the capabilities of various technologies, as well as the business value of the datasets you are working with.
Be cognizant of the cloud, microservices, geographic distribution, and security. Leverage the data fabric to simplify processes, and learn about open source options for AI/ML, such as Apache Spark.
Understand the basic data vocabulary of structure, dimension, and variables. Know that data is decentralized and distributed by nature. Know how to work with data at scale and how to handle the concurrency of multiple users. Understand how the data ecosystem works. While it’s rare to find individuals with a perfect combination of skills, certain toolsets and systems can alleviate the need for serious programming experience, help with the data modeling part, and even reduce the reliance on deep understanding of the mathematical models behind the predictions.
Here is who we talked to:
• Emma McGrattan, S.V.P. of Engineering, Actian
• Neena Pemmaraju, VP, Products, Alluxio Inc.
• Tibi Popp, Co-founder & CTO, Archive360
• Laura Pressman, Marketing Manager, Automated Insights
• Sébastien Vugier, SVP, Ecosystem Engagement & Vertical Solutions, Axway
• Kostas Tzoumas, Co-founder and CEO, Data Artisans
• Shehan Akmeemana, CTO, Data Dynamics
• Peter Smails, V.P. of Marketing & Business Development, Datos IO
• Tomer Shiran, Founder and CEO, Dremio
• Kelly Stirman, CMO, Dremio
• Ali Hodroj, V.P. Products & Strategy, GigaSpaces
• Flavio Villanustre, CISO & V.P. of Technology, HPCC Systems
• Fangjin Yang, Co-founder and CEO, Imply
• Murthy Mathiprakasam, Dir. of Product Marketing, Informatica
• Iran Hutchinson, Product Mgr. & Big Data Analytics Software/Systems Architect, InterSystems
• Dipti Borkar, V.P. of Products, Kinetica
• Adnan Mahmud, Founder & CEO, LiveStories
• Jack Norris, S.V.P. Data & Applications, MapR
• Derek Smith, Co-founder & CEO, Naveego
• Ken Tsai, Global V.P., Head of Cloud Platform & Data Mgmt., SAP
• Clarke Patterson, Head of Product Marketing, StreamSets
• Seeta Somagani, Solutions Architect, VoltDB
Published at DZone with permission of Tom Smith.