DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • NoSQL for Relational Minds
  • Unveiling the Clever Way: Converting XML to Relational Data
  • Keep Calm and Column Wise
  • SQL Data Manipulation Language (DML) Operations: Insert, Update, Delete

Trending

  • Accelerating AI Inference With TensorRT
  • Why Documentation Matters More Than You Think
  • Zero Trust for AWS NLBs: Why It Matters and How to Do It
  • Performance Optimization Techniques for Snowflake on AWS
  1. DZone
  2. Data Engineering
  3. Databases
  4. Row vs. Columnar Storage for Cloud-Based Data

Row vs. Columnar Storage for Cloud-Based Data

Choosing the right type of database is essential for businesses. In this post, we'll do an in-depth analysis of row vs. columnar storage for cloud-based data.

By 
Hitesh Jethva user avatar
Hitesh Jethva
·
Feb. 09, 22 · Analysis
Likes (3)
Comment
Save
Tweet
Share
4.0K Views

Join the DZone community and get the full member experience.

Join For Free

Data warehouses or enterprise data warehouses are databases optimized for OLAP or online analytical processing, offline transaction processing, and business intelligence. Most data warehouses have a common database, mainly for analytical uses. 

One of the most common types of data warehouses is the cloud data warehouse. All data warehouses-as-a-service are referred to as cloud data warehouses. The purpose of data warehouses is to assist traditional relational databases offload analytics as they are already overburdened. 

In technical terms, online transactional (OLTP) databases, such as Oracle and MySQL, are row-based storage systems. However, when general analytics is in the picture, columnar storage is usually preferred. We shall be discussing in detail row vs. columnar storage for cloud-based data. Let’s get started.

Choosing the right type of database is essential for businesses as they process tonnes of data every day. Certain patterns require row-based storage, whereas columnar storage is perfect for the rest.

Row Storage for Cloud-Based Data

Row storage is a traditional form of storage only because they have been around longer than their counterparts. The majority of row-based storage databases are known for online transaction processing. Row storage works perfectly for simple tasks such as adding, deleting, or updating tiny bits of data.

Row databases are partitioned horizontally. Hence, only one row is written at a time. This is perfect for OLTP users since one row takes up the equivalent of one chunk or more. Thus, when multiple rows are accessed simultaneously, it takes more time to transfer the data than it takes to seek it. This is when columnar storage is given more preference.

Row storage is beneficial when entire rows need to be accessed. Not only are they perfect for indexing purposes, but they are also useful for point lookups. Indexing optimizes queries so that an entire table is not analyzed. If the value you require is in the index, you can get it right away.

While columnar storage includes indexing algorithms to optimize scans of the entire table, an index on the proper columns is more efficient at lowering seek time for individual record retrieval. When lots of indices are constructed, it results in a lot of duplication of data. Usually, using row storage turns out to be an expensive affair when only one field from an entire database of records needs to be accessed. What’s more, the risk of irrelevant data being read increases as well. This is why columnar storage works best for cloud-based data.

The best example of row storage is a relational database. This database serves as a refined query engine that is structured in a fine manner.

Columnar Storage for Cloud-Based Data

Columnar storage is usually better for OLAP uses. Data can be aggregated with the help of analytical apps since only a subset is usually needed. Data in columnar storage is always positioned vertically. This leads to faster transactions as a lesser volume of data needs to be loaded since only individual values are read. How does this work? Columnar storage allows users to ignore irrelevant data that does not pertain to a query. This way, users can access specific data. 

Data gets compressed much better in columnar storage than in row storage since the columns are uniform. This makes columnar storage a good option when you expect unpredictable queries.

While columnar storage may have a lot of advantages for being used in cloud-based data, it is not always suitable. While it does increase the reading performance, its efficiency in writing data is low. For this, row-based storage works best when single records need to be inserted. If all the data needs to be written in a columnar fashion, it will use up way more computing resources since there is no room for error. HBase is the best example of a columnar storage database. It allows for scalability, easy partitions, data retrieval, etc. 

Which Is Better?

Both row storage and columnar storage need to work together for different sets of actions. Row-based storage databases are essential for OLTP, whereas columnar storage databases are essential for OLAP. For a handful of fields, columnar might still be okay, but if many fields need to be accessed, then row storage works best. 

Frequently Asked Questions (FAQs)

How Is the Data Stored?

Row: The data is stored in rows. One row is retrieved at a time. This can lead to unimportant data being read if multiple rows are selected at once. The records in row storage are comparatively easier to read and write than in columnar storage.

Columnar: In columnar storage, the data is stored in columns. This way, only relevant data is read during retrievals. The read and write actions happen much slower as compared to row storage.

What Are They Best Suited For?

Row: Row-based storage works best for online transactional systems.

Columnar: Columnar storage works best for online analytical processes.

When Does It Work Best?

Row: Row-based storage works best for one dataset at a time. It is inefficient in managing operations for entire datasets. Thus, it is an expensive affair. Due to its fewer compression mechanisms, the results are less efficient as compared to columnar data stores. 

Columnar: Columnar storage systems can perform operations on the entire dataset. Hence, aggregation is enabled over all rows and columns. Columnar data stores allow for high compression rates.

Conclusion 

Both row storage and columnar storage have their pros and cons. It is not possible to use only one type of storage for cloud data warehouses. It is recommended that a perfect blend of both types of storage be used, depending on the situation.

Database Relational database Data (computing)

Opinions expressed by DZone contributors are their own.

Related

  • NoSQL for Relational Minds
  • Unveiling the Clever Way: Converting XML to Relational Data
  • Keep Calm and Column Wise
  • SQL Data Manipulation Language (DML) Operations: Insert, Update, Delete

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!