My 2019 Predictions for Big Data in the Enterprise

An expert in all things data discusses what he thinks lies ahead for the field in the coming year.

If 2018 was about anything, it was about preparing big data for its next big phase. After several years of being stalled, stymied, and even charged with being one big bust, big data is about to live up to its hype. Many of the CDOs and other data professionals I’ve spoken with in recent months agree we’re on the cusp of something truly “big.” These are the same organizations that have continued to make big investments in data, learning from their mistakes, applying those lessons, and adopting the technologies that are allowing them to blow past many of their remaining obstacles. It’s this new optimism, investment, and innovation that’s bringing about the actual execution everyone’s been expecting for some time now, and it’s the same trifecta of opportunity that’s behind my top five big data predictions for 2019.

#5: Now Proven, AI and ML Will Dig Deeper Into Enterprise

In 2018, we watched as the time-consuming, costly, and labor-intensive manual processes that have been holding up big data initiatives within organizations began to melt away. Automation, AI, and ML, now proven in terms of both speed and accuracy, are being applied to more and more business functions. This fits a general trend: moving away from hard-coding business processes and operations into software (and then adjusting people and physical operations to match those predefined, rigid processes) and toward dynamically adapting business processes and operations to physical realities and historical learnings. For example, universities are analyzing historical admission and acceptance trends to determine who is likely to accept an offer of admission and how much a scholarship would affect that decision. Alternative credit risk analysis is being performed to determine the creditworthiness of first-time or low-income borrowers. Customer churn predictions are being gleaned from sentiment analysis of social media. Key to all these applications is the ability to create good, stable models, and the key to building good, stable models is being able to find the right data and create the right features. In 2019, AI and ML will play a big role in finding and understanding the data needed to build those models.

#4: Say Hello to Hybrid Environments

Last year, I predicted that broad adoption of the cloud would finally force object stores to be hardened and properly governed, and that the new standards would require data governance that’s cloud, location, and platform agnostic. In 2019, you will see more organizations that are now comfortable with the cloud growing a hybrid, heterogeneous data estate that includes multiple fit-for-purpose big data, relational, and NoSQL data stores, both on-premises and in the cloud. With a hybrid model in place, applications that work best on the public cloud can reside there, and those that need to remain on-premises can do so. While this seems like it would create greater complexity, in 2019 you will see more and more solutions that abstract this complexity away through location and compute transparency. From file systems like MapR’s data fabric, which creates a single namespace, to AIOps, which addresses complexity in virtual data centers, end users will be increasingly shielded from the complexity of hybrid architectures while getting the full benefits of the fit-for-purpose, elastic solutions those architectures offer.

#3: It’s the Data Lake’s Great Return

While organizations have traditionally focused on the mechanics of creating and hydrating data lakes, frequently creating data swamps instead, 2019 will see a renewed focus on data lake adoption. This is very similar to what we experienced with data warehousing, where the initial generation of data warehouses was often misguided and lacked adoption, but taught organizations what was really required to create value and achieve broad adoption. I believe we are at the same stage with data lakes, and in 2019 the focus will turn from the mechanics of the data lake to making the data in the lakes findable, usable, and governed at scale and in an automated manner, powered by the new spate of AI-driven data catalogs and governance solutions. Even new data lakes will get rolled out in a much more deliberate manner, with clear initial use cases, usage, and governance policies. We will also see more data lakes being built in or migrated to the cloud to take advantage of managed infrastructure, elastic storage and compute, and rich ecosystems as more organizations begin adopting Virtual Data Lakes that span multiple systems.

#2: Big Data Becomes Little Data

No, organizations won’t be dumping their stockpiles of data wholesale, but they will trim them in a limited way. Greater visibility into the data they have will bring opportunities to rationalize and consolidate, yielding significant savings in storage costs and even more accurate analytics now that organizations know which data is corrupted and can be jettisoned. But “becoming little” also speaks to large volumes of data that used to choke the organization now becoming manageable enough to put to use, thanks to the automation of key processes like cataloging.

#1: Explainability Will Emerge as a Key AI Requirement

As more and more business (and government) is run using AI and ML algorithms, there will be more focus on transparency and explainability. Why was a mortgage denied? Can a bank prove that none of the demographic attributes it is legally barred from considering (like race, gender, and so forth) were used to make the decision or to train the model that made the decision? Finding the appropriate data sets and documenting their lineage and quality is the first step toward such transparency and explainability. If we do not know where data came from or what it means, we will not be able to explain the model or ensure its proper and legal operation.
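To make the lineage point concrete, here is a minimal Python sketch of the kind of check documented lineage makes possible. Everything in it is an assumption for illustration: the `FeatureLineage` record, the `PROTECTED_ATTRIBUTES` set, the `find_protected_usage` helper, and the sample feature and dataset names are hypothetical and do not come from any particular data catalog or governance product.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical set of attributes a lender must not use in a decision.
PROTECTED_ATTRIBUTES = {"race", "gender"}


@dataclass
class FeatureLineage:
    """Hypothetical catalog record describing where a model feature came from."""
    name: str
    source_dataset: str
    derived_from: List[str] = field(default_factory=list)


def find_protected_usage(features: List[FeatureLineage]) -> List[str]:
    """Return names of features that are, or are directly derived from,
    a protected attribute according to the recorded lineage."""
    flagged = []
    for feature in features:
        inputs = {feature.name, *feature.derived_from}
        if inputs & PROTECTED_ATTRIBUTES:
            flagged.append(feature.name)
    return flagged


# Illustrative feature list for a made-up mortgage model.
features = [
    FeatureLineage("income", "loan_applications"),
    FeatureLineage("debt_to_income", "loan_applications", ["income", "debt"]),
    FeatureLineage("title_score", "crm_export", ["gender", "salutation"]),
]

flagged = find_protected_usage(features)
print(flagged)  # ['title_score']
```

A check like this only catches what the lineage records. If a feature's origin was never documented, or if it acts as an indirect proxy for a protected attribute, the check has nothing to flag, which is why knowing where the data came from has to come first.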

Topics:
big data, machine learning, data lake, hybrid data management, ai ml

Published at DZone with permission of

Opinions expressed by DZone contributors are their own.
