NoSQL Database Types

In this article, excerpted from Introducing Data Science, we will introduce you to the four big NoSQL database types. Read on for more info.

By Arno Meysman · Mar. 05, 16 · Analysis

By Davy Cielen, Arno D. B. Meysman, and Mohamed Ali

There are four big NoSQL types: key-value store, document store, column-oriented database, and graph database. Each type targets a problem that relational databases handle poorly. Actual implementations are often combinations of these. OrientDB, for example, is a multi-model database that combines NoSQL types: it is a graph database in which each node is a document.

Before going into the different NoSQL databases, let’s look at relational databases so you have something to compare them to. In data modelling, many approaches are possible. Relational databases generally strive toward normalization: making sure every piece of data is stored only once. Normalization marks their structural setup. If, for instance, you want to store data about a person and their hobbies, you can do so with two tables: one about the person and one about their hobbies. As you can see in figure 1, an additional table is necessary to link hobbies to persons because of their many-to-many relationship: a person can have multiple hobbies and a hobby can have many persons practicing it.

Figure 1 Relational databases strive toward normalization (making sure every piece of data is stored only once). Each table has unique identifiers (primary keys) that are used to model the relation between the entities (tables), hence the term relational.
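
The person-and-hobby layout described above can be sketched with Python's built-in sqlite3 module. This is only a minimal illustration: the table names, column names, and sample rows are assumptions made for the example and are not taken from the figure.

```python
import sqlite3

# In-memory database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE hobby (
    hobby_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL UNIQUE
);
-- Linking table for the many-to-many relationship.
CREATE TABLE person_hobby (
    person_id INTEGER REFERENCES person(person_id),
    hobby_id  INTEGER REFERENCES hobby(hobby_id),
    PRIMARY KEY (person_id, hobby_id)
);
""")

# Each person and each hobby is stored exactly once.
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Jos"), (2, "Fritz")])
conn.executemany("INSERT INTO hobby VALUES (?, ?)", [(1, "archery"), (2, "surfing")])
conn.executemany("INSERT INTO person_hobby VALUES (?, ?)", [(1, 1), (1, 2), (2, 1)])


def hobbies_of(person_name):
    """Return the hobbies of one person by joining through the linking table."""
    rows = conn.execute(
        """
        SELECT h.name
        FROM person p
        JOIN person_hobby ph ON ph.person_id = p.person_id
        JOIN hobby h ON h.hobby_id = ph.hobby_id
        WHERE p.name = ?
        ORDER BY h.name
        """,
        (person_name,),
    ).fetchall()
    return [row[0] for row in rows]


print(hobbies_of("Jos"))
```

Note that "archery" appears only once in the hobby table even though two people practice it; the linking table carries the relationship.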


A full-scale relational database can be made up of many entity and linking tables. Now that you have something to compare NoSQL to, let’s look at the different types.

COLUMN-ORIENTED DATABASE

Traditional relational databases are row-oriented, with each row having a row-id and each field within the row stored together in a table. Let’s say, for example’s sake, that no extra data about hobbies is stored and you have only a single table to describe people, as shown in figure 2. Notice how in this scenario you have slight denormalization because hobbies could be repeated. If the hobby information is a nice extra but not essential to your use case, adding it as a list within the Hobbies column is an acceptable approach. But if the information isn’t important enough for a separate table, should it be stored at all?

Figure 2 Row-oriented database layout. Every entity (person) is represented by a single row, spread over multiple columns.


Without an index, every lookup in a row-oriented database scans every row and reads all of its columns, regardless of which columns you require. Let’s say you only want a list of birthdays in September. The database will scan the table from top to bottom and left to right, as shown in figure 3, eventually returning the list of birthdays.

Figure 3 Row-oriented lookup: from top to bottom and for every entry, all columns are taken into memory.


Indexing the data on certain columns can significantly improve lookup speed, but indexing every column brings extra overhead and the database is still scanning all the columns.

Column databases store each column separately, allowing for quicker scans when only a small number of columns are involved; see figure 4.

Figure 4 Column-oriented databases store each column separately with the related row numbers. Every entity (person) is divided over multiple tables.
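
The difference between the two layouts can be sketched in plain Python. This is a toy in-memory model rather than how any particular database engine stores data, and the sample people, field names, and function names are assumptions made for the example.

```python
from datetime import date

# Row-oriented layout: one complete record per person (hypothetical data).
rows = [
    {"row_id": 1, "name": "Ann", "birthday": date(1980, 9, 11), "hobbies": ["archery"]},
    {"row_id": 2, "name": "Bob", "birthday": date(1975, 3, 2), "hobbies": ["surfing"]},
    {"row_id": 3, "name": "Cy", "birthday": date(1990, 9, 30), "hobbies": ["archery", "chess"]},
]


def september_birthdays_row_layout(rows):
    """Visit every full record, even though only one field is needed."""
    return [record["birthday"] for record in rows if record["birthday"].month == 9]


def to_columns(rows):
    """Regroup the same data so that each field is stored as its own list."""
    columns = {}
    for record in rows:
        for field, value in record.items():
            columns.setdefault(field, []).append(value)
    return columns


def september_birthdays_column_layout(columns):
    """Touch only the birthday column; names and hobbies are never read."""
    return [birthday for birthday in columns["birthday"] if birthday.month == 9]


columns = to_columns(rows)
print(september_birthdays_row_layout(rows))
print(september_birthdays_column_layout(columns))
```

Both functions return the same birthdays; the column version simply reads less data to get there.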


This layout looks an awful lot like a row-oriented database with an index on every column. A database index is a data structure that allows for quick lookups on data at the cost of storage space and additional writes (index update). An index maps the row number to the data, whereas a column database maps the data to the row numbers; in that way counting becomes quicker, so it’s easy to see how many people like archery, for instance. Storing the columns separately also allows for optimized compression because there’s only one data type per table.
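
Why counting becomes quick can be illustrated with a small sketch in which a hobby column maps each value to the row numbers that contain it. The function names and sample data are hypothetical, and a real column store would add compression and on-disk structures on top of this idea.

```python
from collections import defaultdict


def build_hobby_column(hobbies_by_row):
    """Map each hobby value to the list of row numbers that contain it."""
    column = defaultdict(list)
    for row_number, hobbies in hobbies_by_row.items():
        for hobby in hobbies:
            column[hobby].append(row_number)
    return dict(column)


def count_fans(column, hobby):
    """Counting is a length lookup on one entry, not a scan over all rows."""
    return len(column.get(hobby, []))


# Hypothetical input: row number -> hobbies of the person in that row.
hobbies_by_row = {1: ["archery"], 2: ["archery", "surfing"], 3: ["chess"]}
hobby_column = build_hobby_column(hobbies_by_row)
print(hobby_column["archery"], count_fans(hobby_column, "archery"))
```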

When should you use a row-oriented database and when should you use a column-oriented database? In a column-oriented database it’s easy to add another column because none of the existing columns are affected by it. But adding an entire record requires adapting all tables. This makes the row-oriented database preferable over the column-oriented database for online transaction processing (OLTP) because this implies adding or changing records constantly.

The column-oriented database shines when performing analytics and reporting: summing values and counting entries. A row-oriented database is often the operational database of choice for actual transactions (such as sales). Overnight batch jobs bring the column-oriented database up to date, supporting lightning-speed lookups and aggregations using MapReduce algorithms for reports. Examples of column-family stores are Apache HBase, Apache Cassandra (originally developed at Facebook), Hypertable, and the grandfather of wide-column stores, Google Bigtable.

KEY-VALUE STORES

Key-value stores are the least complex of the NoSQL databases. They are, as the name suggests, a collection of key-value pairs, as shown in figure 5, and this simplicity makes them the most scalable of the NoSQL database types, capable of storing huge amounts of data.

Figure 5 Key-value stores store everything as a key and a value.


The value in a key-value store can be anything: a string, a number, but also an entire new set of key-value pairs encapsulated in an object. Figure 6 shows a slightly more complex key-value structure. Examples of key-value stores are Redis, Voldemort, Riak, and Amazon’s Dynamo.

Figure 6 Key-value nested structure.
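
A key-value store can be approximated in a few lines of Python. The class below is a toy in-memory sketch, not the API of Redis, Riak, or any other product; the class name, method names, and keys are assumptions made for the example.

```python
class KeyValueStore:
    """Toy in-memory key-value store: every operation goes through a key."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()

# A simple value: a string stored under a key.
store.put("name", "Jos")

# A nested value: a whole set of key-value pairs stored under one key.
store.put("person:1", {"name": "Jos", "hobbies": ["archery", "surfing"]})

print(store.get("name"))
print(store.get("person:1")["hobbies"])
```

The store itself knows nothing about what is inside a value, which is part of what keeps this model so simple.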


DOCUMENT STORES

Document stores are one step up in complexity from key-value stores: a document store does assume a certain document structure that can be specified with a schema. Document stores appear the most natural among the NoSQL database types because they’re designed to store everyday documents as is, and they allow for complex querying and calculations on this often already aggregated form of data. The way things are stored in a relational database makes sense from a normalization point of view: everything should be stored only once and connected via foreign keys. Document stores care little about normalization as long as the data is in a structure that makes sense. A relational data model doesn’t always fit well with certain business cases.

Newspapers or magazines, for example, contain articles. To store these in a relational database, you need to chop them up first: the article text goes in one table, the author and all the information about the author in another, and comments on the article when published on a website go in yet another. As shown in figure 7, a newspaper article can also be stored as a single entity; this lowers the cognitive burden of working with the data for those used to seeing articles all the time. Examples of document stores are MongoDB and CouchDB.

Figure 7 Document stores save documents as a whole, whereas an RDBMS cuts up the article and saves it in several tables. The example was taken from the Guardian website.
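
Storing an article as a single entity can be sketched with a nested Python structure serialized as JSON. The field names and content below are invented for the example, and the helper function is hypothetical; it stands in for the querying a real document store such as MongoDB or CouchDB would provide.

```python
import json

# One article kept as a single document: text, author, and comments together.
# All field names and values are hypothetical.
article = {
    "title": "An example headline",
    "body": "Full article text goes here.",
    "author": {"name": "A. Writer", "email": "writer@example.com"},
    "comments": [
        {"user": "reader1", "text": "Interesting read."},
        {"user": "reader2", "text": "Thanks for sharing."},
    ],
}


def find_by_author(documents, author_name):
    """Return every document whose embedded author has the given name."""
    return [doc for doc in documents if doc["author"]["name"] == author_name]


# The whole document round-trips as one unit; no joins are needed to rebuild it.
serialized = json.dumps(article)
restored = json.loads(serialized)

print(restored["author"]["name"], len(restored["comments"]))
print([doc["title"] for doc in find_by_author([restored], "A. Writer")])
```

In a relational design the same data would be spread over an article table, an author table, and a comment table, and reassembled with joins.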


GRAPH DATABASES

The last big NoSQL database type is the most complex one, geared toward storing relations between entities in an efficient manner. When the data is highly interconnected, such as for social networks, scientific paper citations, or capital asset clusters, graph databases are the answer. Graph or network data has two main components:

Node: The entities themselves. In a social network this could be people.

Edge: The relationship between two entities. This relationship is represented by a line and has its own properties. An edge can have a direction, for example, if the arrow indicates who is whose boss.
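
The two components can be sketched with plain Python data structures. This is a minimal in-memory illustration and not how Neo4j or any other graph database stores data; the node identifiers, relation names, and properties are assumptions made for the example.

```python
# Nodes: the entities, each with its own properties (hypothetical data).
nodes = {
    "ann": {"type": "person"},
    "bob": {"type": "person"},
    "archery": {"type": "hobby"},
}

# Edges: (source, target, properties). The order source -> target gives
# the edge its direction; the dict holds the edge's own properties.
edges = [
    ("ann", "bob", {"relation": "boss_of", "since": 2012}),
    ("ann", "archery", {"relation": "likes"}),
    ("bob", "archery", {"relation": "likes"}),
]


def outgoing(node_id, relation=None):
    """Targets of edges leaving node_id, optionally filtered by relation."""
    return [
        target
        for source, target, props in edges
        if source == node_id and (relation is None or props["relation"] == relation)
    ]


def incoming(node_id, relation=None):
    """Sources of edges arriving at node_id, optionally filtered by relation."""
    return [
        source
        for source, target, props in edges
        if target == node_id and (relation is None or props["relation"] == relation)
    ]


print(outgoing("ann", "boss_of"))   # who Ann is the boss of
print(incoming("archery", "likes")) # who likes archery
```

Because direction matters, asking who Bob is the boss of returns nothing, even though an edge connects Ann and Bob.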

Graphs can become incredibly complex given enough relation and entity types. Figure 8 already shows that complexity with only a limited number of entities. Graph databases like Neo4j also claim to uphold ACID, whereas document stores and key-value stores typically adhere to BASE.

Figure 8 Graph data example with four entity types (person, hobby, company, and furniture) and their relations without extra edge or node information.


The possibilities are endless, and because the world is becoming increasingly interconnected, graph databases are likely to gain ground on the other types, including the still-dominant relational database. A ranking of the most popular databases and how they’re progressing can be found at http://db-engines.com/en/ranking.

Figure 9 Top 15 databases ranked by popularity according to DB-Engines.com in March 2015.


Figure 9 shows that with nine entries, relational databases still dominate the top 15 at the time this book was written, and with the coming of NewSQL we can’t count them out yet. Neo4j, the most popular graph database, can be found at position 23 at the time of writing, with Titan at position 53.
