Data Fabric vs. Data Lake: Operational Comparison

In this article, we will focus on which is the most appropriate big data store for high-scale, real-time, operational use cases – data fabric vs data lake.

By Ian Tick · Oct. 21, 21
This article compares data fabric and data lake as big data stores for high-scale, real-time, operational use cases. It also covers data warehouses, along with relational and non-relational databases.

What Are Operational Use Cases?

Data-intensive enterprises are driven by a broad array of real-time use cases requiring a high-scale, high-speed data architecture that can support millions of concurrent transactions. Examples include:

  • A 360-degree customer view assembled from many different legacy systems (and delivered to a self-service IVR or mobile/web portal, customer service reps, chat agents/bots, and field technicians).
  • Churn prediction.
  • Credit scoring.
  • Fraud prevention.
  • Payment card transaction security, and more.

Operational Use Case Requirements

Operational use cases need a big data platform capable of performing complex data queries in milliseconds while dealing with:

  • Live data, which is continually being updated from operational systems (with millions to billions of updates each day).
  • Terabytes of fragmented data, spanning many different databases or tables, typically in different formats and technologies.
  • A specific instance of a business entity, such as a single customer, product, location, etc.
  • High concurrency, representing thousands of requests every second.
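
To make the single-entity requirement above more concrete, here is a minimal Python sketch. It assumes two hypothetical source systems (`billing_rows` and `ticket_rows`) and a hypothetical `build_customer_view` helper; none of these names come from a real product. It only illustrates that data grouped per business entity lets a request for one customer touch that customer's records rather than scanning everything.

```python
from collections import defaultdict

# Hypothetical rows from two separate source systems (illustrative data only).
billing_rows = [
    {"customer_id": 1, "invoice": "INV-100", "amount": 40.0},
    {"customer_id": 2, "invoice": "INV-101", "amount": 15.5},
    {"customer_id": 1, "invoice": "INV-102", "amount": 9.5},
]
ticket_rows = [
    {"cust": 1, "ticket": "T-7", "status": "open"},
    {"cust": 2, "ticket": "T-8", "status": "closed"},
]


def index_by_entity(rows, key):
    """Group rows by entity ID once, so later lookups avoid a full scan."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(row)
    return grouped


# Each source uses a different column name for the same entity ID.
billing_by_customer = index_by_entity(billing_rows, "customer_id")
tickets_by_customer = index_by_entity(ticket_rows, "cust")


def build_customer_view(customer_id):
    """Assemble one customer's view from the per-entity groupings."""
    invoices = billing_by_customer.get(customer_id, [])
    tickets = tickets_by_customer.get(customer_id, [])
    return {
        "customer_id": customer_id,
        "total_billed": sum(r["amount"] for r in invoices),
        "open_tickets": [t["ticket"] for t in tickets if t["status"] == "open"],
    }


view = build_customer_view(1)
print(view)
```

A production system would also have to keep these groupings current as the source systems change, which the sketch leaves out.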

Big Data Storage Options

Today, the storage options that data teams most commonly rely on include:

  1. Data Lake

According to an analyst at Gartner, a data lake is a collection of storage instances of various data assets. These assets are stored and maintained as an exact, or near-exact, replica of the structured or unstructured source format, in addition to the original data stores. Examples of data lake providers include Amazon S3, Apache Hadoop, and Azure Data Lake.

  2. Data Warehouse (DWH)

A data warehouse refers to a storage architecture designed to persist data extracted from operational data stores, transaction systems, and external sources. It combines the data in an aggregated form appropriate for enterprise-wide data analysis and reporting. Examples of DWH providers include Amazon Redshift, Google BigQuery, and Snowflake.

  3. Database Management System (DBMS)

A database management system stores and organizes data with defined formats and structures. A DBMS is categorized by its basic structure and by its use or deployment. 

  • A relational DBMS, which usually includes a Structured Query Language (SQL) API, is organized and accessed via the relationships between the data entities. Examples of relational DBMS providers include Microsoft SQL Server, Oracle, and PostgreSQL.

  • A non-relational (NoSQL) DBMS is often used in big data and real-time web applications. Although optimized for high-scale use, a non-relational database can’t enforce relationships between data entities. Examples of non-relational DBMS providers include Cassandra, MongoDB, and Redis.
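
The difference in relationship enforcement can be sketched with Python's built-in `sqlite3` module. The table names and data are hypothetical, and a plain dict stands in for a key-value store, so this is an illustration of the idea rather than the behavior of any particular NoSQL product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customers(id),"
    " total REAL NOT NULL)"
)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (10, 1, 25.0)")

# Relational access: follow the relationship between entities with a JOIN.
joined = conn.execute(
    "SELECT c.name, o.total FROM orders o JOIN customers c ON c.id = o.customer_id"
).fetchall()

# The database rejects an order that points at a customer that does not exist.
try:
    conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (11, 99, 5.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# A schemaless stand-in (a plain dict) accepts the same dangling reference.
documents = {}
documents["order:11"] = {"customer_id": 99, "total": 5.0}
dangling = f"customer:{documents['order:11']['customer_id']}" not in documents

print(joined, rejected, dangling)
```

In the dict-based version, any check that the referenced customer exists would have to live in application code.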

  4. Data Fabric

A data fabric can be defined as an integrated layer of connected data that is ingested and normalized from an enterprise's data sources, regardless of the data’s format, technology, or source system. It holds the processed data in its own data store and delivers it on demand to big data stores, consuming applications, and AI/ML/real-time decision-making engines. Examples of data fabric providers include IBM Cloud Pak, K2View, Denodo, Talend, and Informatica.
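
The ingest-and-normalize idea can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the `FabricStore` class, the `CustomerRecord` schema, and the two source formats are hypothetical and do not reflect how any of the vendors above implement their products.

```python
from dataclasses import dataclass, field


@dataclass
class CustomerRecord:
    """Normalized shape shared by all sources (hypothetical schema)."""
    customer_id: str
    email: str
    sources: list = field(default_factory=list)


def from_crm(row):
    # Hypothetical CRM export format: numeric ID, mixed-case email.
    return str(row["CustID"]), row["Email"].strip().lower()


def from_billing(row):
    # Hypothetical billing export format: string ID, different field names.
    return row["customer"], row["contact_email"].strip().lower()


class FabricStore:
    """Holds normalized records in its own store and serves them on demand."""

    def __init__(self):
        self._records = {}

    def ingest(self, source_name, rows, normalize):
        for row in rows:
            customer_id, email = normalize(row)
            record = self._records.setdefault(
                customer_id, CustomerRecord(customer_id, email)
            )
            record.sources.append(source_name)

    def get(self, customer_id):
        return self._records.get(customer_id)


store = FabricStore()
store.ingest("crm", [{"CustID": 42, "Email": " A@Example.com "}], from_crm)
store.ingest("billing", [{"customer": "42", "contact_email": "a@example.com"}], from_billing)
record = store.get("42")
print(record)
```

Both source rows end up under the same normalized customer ID, so a consumer asks for customer "42" without knowing which system the data came from.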

Storage Options – Pros and Cons

The following summarizes the strengths and weaknesses of data fabric vs data lake/DWH, as well as relational and non-relational databases.

  1. Data Lake/DWH

Strengths 

  • Support for complex data queries, across structured and unstructured data.

Weaknesses 

  • No support for single-entity queries, which results in slow response times.
  • No support for live data, so data that needs to be constantly updated is either unreliable or delivered too slowly.

  2. Relational Database

Strengths 

  • Support for SQL, broad adoption, and ease of use.

Weaknesses 

  • Non-linear scalability, needing expensive hardware to perform complex queries on terabytes of data in near real-time.
  • Poor handling of high concurrency, resulting in unacceptably slow response times.

  3. NoSQL Database

Strengths 

  • Distributed data store architecture, with support for linear scalability.

Weaknesses 

  • No support for SQL, so querying requires specialized skills.
  • To support data queries, indexes must be predefined, or complex application logic must be embedded, which hampers development agility and time to market.
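
The index limitation can be illustrated with a small Python sketch in which a plain dict stands in for a key-value store. The data and function names are hypothetical. Lookups by primary key are direct, but a query on any other attribute needs either a full scan or an index that was defined ahead of time.

```python
# A plain dict stands in for a key-value store: lookups by primary key are direct.
customers = {
    "c1": {"name": "Ada", "city": "Paris"},
    "c2": {"name": "Lin", "city": "Oslo"},
    "c3": {"name": "Sam", "city": "Paris"},
}


def find_by_city_scan(city):
    """Without an index, a query on a non-key attribute visits every record."""
    return sorted(key for key, doc in customers.items() if doc["city"] == city)


def build_city_index():
    """A predefined secondary index: city -> set of primary keys."""
    index = {}
    for key, doc in customers.items():
        index.setdefault(doc["city"], set()).add(key)
    return index


city_index = build_city_index()


def find_by_city_indexed(city):
    """With the index in place, the same query is a single lookup."""
    return sorted(city_index.get(city, set()))


print(find_by_city_scan("Paris"), find_by_city_indexed("Paris"))
```

The index only helps for the attribute it was built on, and it has to be kept in step with every write, which is the kind of application logic the bullet above refers to.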

  4. Operational Data Fabric

Strengths 

  • Full support for SQL.
  • Distributed data store architecture, with support for linear scalability.
  • Support for high concurrency, with high performance.
  • Support for complex queries for single business entities.

Weaknesses

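  • No inherent support for querying across multiple Micro-Databases (the per-entity data stores used by some data fabric products), although pairing the fabric with a search engine such as Elasticsearch can cover this gap.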

Conclusion

In the data fabric vs data lake comparison, data fabric is the better architectural fit for real-time operational use cases. The two are nonetheless complementary: a data fabric can prepare trusted data for data lakes, while data lakes can provide operational intelligence to the data fabric for immediate use.
