Big Data 2019 Predictions (Part 2)
Data moves to the cloud and Data-as-a-Service gains traction.
Given the speed with which technology is evolving, we thought it would be interesting to ask IT executives to share their predictions for 2019. Here's what they told us about big data:
I really think streaming is going to be the big thing in big data for 2019. Companies are rapidly realizing that processing and analyzing data as it arrives — rather than churning through it after it has been saved somewhere — provides huge advantages, both in time-to-answer and in the agility of the analysis they can perform.
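The core of the streaming idea can be shown in a few lines. This is a minimal sketch, not a real stream processor: a plain Python iterable stands in for the event source, and the point is that each answer is available the moment its event arrives rather than after a batch load.

```python
def running_average(events):
    """Yield the mean of all values seen so far, one result per event."""
    total = 0.0
    for count, value in enumerate(events, start=1):
        total += value
        yield total / count

# Each intermediate answer is produced as the event arrives,
# not after the whole dataset has been stored and re-read.
latencies_ms = [120, 80, 100, 140]
print(list(running_average(latencies_ms)))  # [120.0, 100.0, 100.0, 110.0]
```

A batch job would give only the final value (110.0) after the fact; the streaming version surfaces every intermediate answer with no storage step in between.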
Reducing the final 1% of IT operational noise. When we apply automation technologies to IT operations today, we're able to see up to a 99 percent reduction in noise. Over the next decade, experts and innovators will continually push to eliminate that last 1 percent. While that might at first sound like a minimal accomplishment, the exponential increase in data and information makes it a monumental task.
Other digital spaces have been wildly successful leveraging predictive analytics: retail, for example, has used targeted advertisements on websites and social media to increase click rates and drive sales. With the massive amounts of data generated daily, it's time for healthcare to adopt the same practices. Healthcare providers who adopt a data-driven culture can quickly learn to analyze data in ways that improve operational efficiency.
Blockchain Will Become a Commodity. Vendors are fighting for a share of a rapidly increasing market for blockchain applications, but the reality is it's a race to the bottom. As standardization continues, there will be little differentiation, and blockchain will slip into the background of applications, operating behind the scenes. Industries like data management will begin adopting this technology as well, since it offers a way to validate and trust the data as records are pulled into other resources.
The Autonomous Car Creates Data Center Chaos. There is a massive investment right now in autonomous and connected cars, and soon this investment will need to cascade to the data center. The success of autonomous cars relies on telemetry data from vehicles to inform driving decisions, but how do you properly archive this data for compliance? With so many data points being created every minute, how do you isolate the necessary data, such as data from accidents or incidents, and retain it for the multiple years required? Proper data management architectures will be key to ensuring success.
“Data Protection” is Giving Way to Data Management. The ability to protect data and restore backed-up files is no longer sufficient for modern business. Data has become the fuel for company success, driving insights, customer targeting, business planning, and even the training of AI and machine learning models. Any way to extract additional value from it is critical to business success, and the shift to data management is key: data is not only protected, but properly archived, easily searchable, available for analytics, and compliant the entire time.
Unrecovered Data Loss on the Rise. 90 percent of respondents to Druva's 2018 State of Virtualization in the Cloud survey noted they will be using public cloud in 2019; however, many companies are still backing up their IaaS/PaaS/SaaS with manual processes. Even more concerning, some are not backing up their IaaS/PaaS/SaaS environments at all, based on the assumption that protections offered within the service itself are “good enough.” These protections — in Office 365, for example — do not mitigate risks associated with hackers, ransomware, malicious users, or, typically, anything deleted more than roughly 60 days ago.
Over the next year, we’ll see a fundamental shift in exactly “who” is the primary consumer of enterprise data. While the primary focus of data handling to date has been getting useful and timely insights into the hands of people – business analysts, data scientists, decision-makers, etc. – we will see a rapid shift in which intelligent applications will become the ultimate consumers of data. That shift will accelerate adoption of microservices and data-driven architectures in order to meet the demands of these applications.
Analytics at the edge (versus just capturing and relaying data) grows exponentially as the tools and applications to perform analytics in place mature.
1. The CIO strikes back. The days of forgetting that the "I" in CIO stands for "information" are over. The CIO role will become more identified with leading a company’s data and information strategy rather than infrastructure and security. Much like digitization and data have transformed the CMO role, the CIO role will be unrecognizable from its current form in a few years. We can expect this process to pick up steam in 2019.
2. Data science claims its place as Strategic Data Command. Traditional intelligence tools and platforms do a good job of providing operational insight and reporting. Data science has an opportunity to help with weak signal processing — information that comes from the market, from field personnel, and from customer support, and which can guide company strategy towards new opportunities or avoid potential disasters.
3. The Cloud is coming for your data and analytics. First it was your applications moving, but the cloud isn't satisfied. It's now coming for your data and analytics — and you'll be glad it did. Metered, scale-out solutions and proximity to application data that's now based in the cloud make this an easy pill to swallow. Universal access to data, complete browser-based analytics, and high-performance, flexible architectures will make us wonder how we ever functioned before.
Data consumers take the spotlight. As companies evolve their business through digital transformation, data becomes ever more important, and the companies that do the best job of extracting value from data will outpace their competition. It is data consumers—data scientists, analysts, BI users, statisticians—who are in the trenches, finding this value, and making discoveries that advance strategic interests. Globally there are approximately 200 million data consumers, and in 2019 companies will start to recognize that improving the productivity of this worker class will drive massive value to their bottom lines. Expect to see significant efforts to study the daily workflow of these individuals, and investments to improve productivity, increase training, and provide greater retention offerings.
Data-as-a-Service is the next evolution in analytics. We are now 10 years into the AWS era, which began with on-demand infrastructure billed by the hour and has now moved up through the entire stack to include full applications and every building block in between. Now companies want the same kind of “on-demand” experience for their data, provisioned for the specific needs of an individual user, instantly, with great performance, ease of use, compatibility with their favorite tools, and without waiting months for IT. Using open source projects, open standards, and cloud services, companies will deliver their first iterations of Data-as-a-Service to data consumers across critical lines of business.
More big data deployments will move to the cloud as organizations adopt a hybrid cloud strategy. Organizations will continue to realize big data is better suited for the cloud due to its elastic compute requirements.
Stream processing will be adopted for complex data management. Rather than using relational databases, stream processors will be able to process ACID transactions directly across streams and states. Just as event streams hold the source of truth for changes in the world, ACID-compliant stream processing can resolve many streams with overlapping and conflicting changes into a consistent state of the world, at a fraction of the cost and with significantly greater flexibility and ease of deployment.
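As a rough illustration of the idea — and only an illustration, not a real ACID stream processor — the sketch below folds two overlapping change streams into one consistent state. A last-writer-wins rule per key stands in for full transactional conflict resolution; the stream names and change tuples are invented for the example.

```python
import heapq

def resolve(streams):
    """Fold overlapping change streams into one consistent state.

    Each change is (timestamp, key, value). The latest timestamp per key
    wins, regardless of which stream it arrived on (last-writer-wins).
    Assumes each input stream is already sorted by timestamp.
    """
    state, seen_at = {}, {}
    for ts, key, value in heapq.merge(*streams):
        if key not in seen_at or ts >= seen_at[key]:
            state[key] = value
            seen_at[key] = ts
    return state

# Two streams touch the same order "o1" with conflicting updates.
orders = [(1, "o1", "created"), (3, "o1", "shipped")]
refunds = [(2, "o1", "on-hold"), (4, "o2", "refunded")]
print(resolve([orders, refunds]))  # {'o1': 'shipped', 'o2': 'refunded'}
```

A production system (e.g., a stream processor with exactly-once state) would also handle out-of-order arrival, retractions, and durability, but the merge-into-consistent-state shape is the same.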
The explosion of AI applications will make distributed stream processing a necessity. Aside from pure streaming ML techniques, stream processing will become a central piece for assembling the complex feature vectors that feed sophisticated ML predictors. Distributed, high-performance stream processing frameworks will become a necessity to efficiently model and preprocess increasingly complex real-time data at scale for ML models and algorithms.
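To make "assembling feature vectors from a stream" concrete, here is a minimal sketch under invented assumptions (user click events with an amount field, and a three-element feature vector): per-key aggregates are updated incrementally as each event arrives, and the current vector is emitted for the downstream predictor.

```python
from collections import defaultdict

def feature_assembler():
    """Incrementally build per-user feature vectors from a stream of events."""
    counts = defaultdict(int)
    totals = defaultdict(float)

    def update(user, amount):
        counts[user] += 1
        totals[user] += amount
        # Hypothetical feature vector: (event count, running mean, latest amount).
        return (counts[user], totals[user] / counts[user], amount)

    return update

assemble = feature_assembler()
print(assemble("u1", 10.0))  # (1, 10.0, 10.0)
print(assemble("u1", 30.0))  # (2, 20.0, 30.0)
print(assemble("u2", 5.0))   # (1, 5.0, 5.0)
```

In a distributed framework the per-key state would be partitioned across workers and checkpointed, but each event still triggers the same incremental update-and-emit step shown here.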
5G and the proliferation of sensors and IoT devices will create more real-time streaming data and more use cases that need instant reaction to events. Stream processing will be used as an efficient way to realize "edge computing." Stream processing is a great match both for pre-processing data on devices or gateways and for running event-driven logic on the edge.
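The two edge roles mentioned above — pre-processing on a gateway and event-driven logic — can be sketched together. This is a toy stand-in (threshold, window size, and sensor values are all invented): raw readings are reduced to per-window summaries, and only threshold breaches are forwarded immediately as events.

```python
def gateway_filter(readings, threshold=75, window=4):
    """Pre-process raw sensor readings at the edge: forward one summary
    per window, plus any reading that breaches the alert threshold."""
    forwarded = []
    window_buf = []
    for r in readings:
        if r > threshold:
            forwarded.append(("alert", r))  # event-driven logic at the edge
        window_buf.append(r)
        if len(window_buf) == window:
            forwarded.append(("avg", sum(window_buf) / window))
            window_buf.clear()
    return forwarded

readings = [70, 71, 82, 69, 70, 70, 71, 69]
print(gateway_filter(readings))  # [('alert', 82), ('avg', 73.0), ('avg', 70.0)]
```

Eight raw readings become three forwarded messages: the bandwidth saving and the instant reaction are exactly why stream processing fits edge computing.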
More companies will realize stream processing is an easy way to build a GDPR compliant data infrastructure. Classical "data at rest" architectures make it extremely complex to reason about where sensitive data exists. Streaming data architectures work on the data in motion directly (and do not require long-term storage or result in data replication) and make it easy to keep sensitive information isolated in application state for a limited time, making them naturally compliant.
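The "sensitive data lives in state for a limited time" idea can be sketched with a TTL-bound key-value store. Real stream processors offer this as managed state with a time-to-live; the dict-based class below is only a toy stand-in, and the key names are invented.

```python
import time

class TtlState:
    """Application state that forgets entries after `ttl` seconds,
    so sensitive data never outlives its retention window."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._data = {}

    def put(self, key, value, now=None):
        self._data[key] = (value, now if now is not None else time.time())

    def get(self, key, now=None):
        now = now if now is not None else time.time()
        entry = self._data.get(key)
        if entry is None or now - entry[1] > self.ttl:
            self._data.pop(key, None)  # lazily evict expired entries
            return None
        return entry[0]

state = TtlState(ttl=60)
state.put("user:42:email", "a@example.com", now=0)
print(state.get("user:42:email", now=30))   # a@example.com
print(state.get("user:42:email", now=120))  # None (expired)
```

Because the only copy of the sensitive value lives in this bounded state, "where does personal data exist and for how long" has a one-line answer — the property the prediction is pointing at.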
Cybersecurity will continue its rise as one of the most important topics in information technology. To detect and prevent security breaches, cyber security solutions need to look for anomalies in the metrics and usage patterns across network infrastructure, applications, services, etc. Use of stream processing will continue to expand in cybersecurity because the technology is a great match for these requirements: real-time gathering and aggregating of events, tracking complex patterns, evaluating and adjusting ML models over the real-time data.
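A minimal version of "look for anomalies in streaming metrics" is a rolling z-score over a recent window. This sketch (window size, threshold, and the traffic numbers are all invented) flags any value that deviates sharply from the recent baseline, which is the shape of the real-time tracking the prediction describes.

```python
from collections import deque
import math

def zscore_anomalies(metrics, window=5, threshold=3.0):
    """Flag metric values that deviate sharply from the recent window."""
    recent = deque(maxlen=window)
    anomalies = []
    for value in metrics:
        if len(recent) == window:
            mean = sum(recent) / window
            var = sum((x - mean) ** 2 for x in recent) / window
            std = math.sqrt(var) or 1.0  # avoid division by zero on flat windows
            if abs(value - mean) / std > threshold:
                anomalies.append(value)
        recent.append(value)
    return anomalies

requests_per_sec = [100, 102, 98, 101, 99, 500, 100, 101]
print(zscore_anomalies(requests_per_sec))  # [500]
```

Production cybersecurity pipelines replace the z-score with trained ML models and correlate across many metrics, but the streaming structure — maintain a window, score each arriving event, emit alerts — is the same.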
Kubernetes (plus Amazon's S3 and open source Apache Kafka) is going to be recognized as the default platform for AI workloads. On the platform side, the expansion of intelligent applications is driving a shift from the traditional model of big data as a siloed system to a converged data-centric application platform. Linux and Kubernetes are emerging as the common core technologies in this shift, along with S3 as the interface for data at rest and Kafka for data in motion.
The data skills gap will increase — but so will data literacy: Data is both the problem and the answer for businesses. It's a problem because businesses collect more data than they know how to use, yet it's the answer because it can power forecasts and offer insight into how the business should run. The next year will see the data skills gap continue to widen — users need to be able to properly assess where data comes from and how to use it, and it only gets more complicated as more data becomes available and as algorithms enter the fray. But at the same time, business users will also grow more data literate as they approach data as a team and help one another get what they need from their data.
Opinions expressed by DZone contributors are their own.