DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • Using JSON in MariaDB
  • Why Should Databases Go Natural?
  • SQL Interview Preparation Series: Mastering Questions and Answers Quickly

Trending

  • Blue Skies Ahead: An AI Case Study on LLM Use for a Graph Theory Related Application
  • How to Practice TDD With Kotlin
  • Immutable Secrets Management: A Zero-Trust Approach to Sensitive Data in Containers
  • AI, ML, and Data Science: Shaping the Future of Automation
  1. DZone
  2. Data Engineering
  3. Databases
  4. Database Architectures and Use Cases - Explained

Database Architectures and Use Cases - Explained

By 
Margo McCabe user avatar
Margo McCabe
·
Jul. 28, 20 · Analysis
Likes (34)
Comment
Save
Tweet
Share
29.8K Views

Join the DZone community and get the full member experience.

Join For Free

With over 300 databases on the market, how do you determine which is right for your specific use case or skill set?

We continue to see the common debate of SQL vs. NoSQL and other database comparisons all over social media and platforms like dev.to. In most cases, it’s not that one database is better than the other, it’s that one is a better fit for a specific use case due to numerous factors.

A couple of months back, our CTO Kyle Bernhardy, led an awesome talk titled A Deep Dive Into Database Architectures. You can watch this talk at the link, but since this is such a prominent discussion topic we thought it might be helpful to summarize. This article will provide an overview of database architectures, including use cases and pros & cons for each of them.
Let’s start with general considerations when selecting a database. It’s important to understand things such as data type / structure, data volume, consistency, write & read frequency, hosting, cost, security, and integration constraints. The more you know about these factors, the easier it will be to pick the right database for your project.

You may already know that there are generally 3 database hosting options:

  • On-Premises
    Database fully maintained by organization on servers running within their data center(s)
    More control, but usually more expensive and time consuming

  • Cloud Hosted
    Servers are maintained by cloud providers, organizations maintain database software and operating system running on the machine
    Flexible scaling and no server upkeep, but no control over physical server and potential network limitations

  • Database-as-a-Service (DBaaS)
    Database maintained by service provider, organizations only charged for usage of service
    Cost effective and zero upkeep, but data stewardship and potential network limitations

Now, For the Part You’ve Been Waiting For: Database Architectures.
Relational Databases

We’ll start with the most commonly used. Relational (SQL) Databases such as Oracle, MySQL, PostgreSQL, Microsoft SQL Server, and SQLite, organize data into tables with columns, each with a specified name and datatype. Additionally:

  • Rows are identified with a unique attribute, or grouping of attributes, called a primary key (typically a single column)
  • Relationships between tables are defined through foreign keys which reference primary keys
  • Strict schema (data model) enforcement
  • Data accessed via Structured Query Language (SQL)
  • ACID compliance (Atomicity, Consistency, Isolation, and Durability)
  • Extra features like Triggers & Stored Procedures

On the plus side, relational databases use mature technology that is widely understood and well-documented, SQL standards are well-defined, defined constraints enforce data integrity, they avoid data duplication and are highly secure and ACID-compliant. However on the negative side, SQL databases cannot handle unstructured or semi-structured data, their tables don’t necessarily map to objects, they require complicated ETL (Extract, Transform, Load) and maintenance, have row locking, and pricing for some products (Oracle, SAP) are out of reach for developers and some organizations.
Note: While some RDBMS systems can now handle JSON, they are not purpose built to do so.

There are several awesome use cases for relational databases; situations where data integrity is absolutely paramount (financial applications, defense and security, private health information), highly structured data, and automation of internal processes.

Then Comes Everything Else…

Relational databases are the most common database in production today, but they were not designed for the scale and agility of modern applications. About 10 years ago the NoSQL movement caught on to address these concerns and changed the database landscape forever. Note:

  • NoSQL doesn’t mean anything (Non-SQL, Non-relational SQL, Not Only SQL)
  • Designed to address structure, performance, data volume, and scalability
  • Relational databases have since addressed many of these concerns

While databases are typically categorized as SQL or NoSQL, there are many intricacies to NoSQL databases. Let’s get into it.

Key-Value Stores

Key-value stores are often used as the underlying storage for higher level databases. For example, MongoDB uses a key value store called WiredTiger as their default storage engine. Key-Value Stores, such as Redis, DynamoDB, and Cosmos DB, are:

  • Simple, basic, & schemaless
  • Provide basic functionality for retrieving arbitrary data via a specific key
  • Values can be anything: single values, arrays, objects, files, etc.
  • Database does not evaluate the data it is storing
  • Data structure can be referred to as a dictionary or hash table

For pros, key value stores provide fast, low-complexity access to data, are flexible, and can scale quickly and cheaply. However, they have extremely limited functionality, cannot handle complex structures or query or search by anything other than key, do not scale well as data models grow, and they require more programming overhead for complex implementations.

Example use cases for key value stores would be embedded systems, URL shorteners, configuration data, application variables and flags for web applications, state information, and data represented by a dictionary or hash.

Document Stores

Document stores, such as MongoDB, DynamoDB, Couchbase, and Firebase, are similar to key-value stores, but the value is a document.

  • Typically document formats are JSON, BSON, or XML documents
  • Schemaless, no data structure enforcement (documents can be different)
  • Data accessed and modified via NoSQL (or proprietary language)
  • Well-suited for unstructured and semi-structured data
  • Seen as easier for development

The pros of document stores include flexibility and scalability, schemaless, fast writes, ideal for semi-structured and unstructured data, and developers do not need to know data structure ahead of time / it can change overtime without downtime. The cons are that they are not ACID compliant, limited to querying within a document, relationships/cross references are not enforced, slow searching, cannot join documents/collections in a single query, lack of database enforcement requires developer discipline and vigilance for application level enforcement, and they typically result in data duplication.

Great use cases for document stores are unstructured or semi-structured data, content management, rapid prototyping, and collecting of high traffic data.

Graph Databases

Graph databases, such as Neo4j, OrientDB, and TitanDB, are ideal for when relationships or connections are top priority.

  • Based on mathematical graph theory
  • Represent data as a network of related nodes, edges, and properties
  • Database stores data items within nodes and relationships in edges that connect nodes
  • Nodes are connected by relationships and grouped according to labels
  • Facilitate data visualizations and graph analytics
  • Each node contains free-form data

On the plus side, graph databases have advanced features for relationship querying, traversing, and tracking, are optimized for querying related data, and they avoid row locking. As for the negatives, graph databases have a large ramp up time for developers, high overhead for simple use cases, lack of standardization, poor performance of aggregate queries, and devs typically need to learn a custom query language.

Graph databases are great for analysis of heterogeneous data points, fraud prevention, advanced enterprise operations, social networking, payment systems, and GeoSpatial routing/visualization.

Time Series Databases

We’re almost there! Next up is time series databases, such as InfluxDB, Kdb+, and Prometheus, which are:

  • Focused on datasets that change over time
  • Heavily write oriented
  • Designed to handle constant streams of data
  • Typically append-only (no modification after ingestion)
  • Rollup/aggregation/down sampling features to lower archive data footprint

On the positive side, time series databases are designed for dealing with linear data over time, can handle high ingestion rates, have built-in features specifically for dealing with time-based data, a schema optimized for time-series arrays, and batch delete features. As far as negatives, time series databases only deal with time-series data, do not support full SQL, their read speed suffers compared to writes, they have no transaction capability and are append-only (not optimized for updates).

Great use cases for time series databases are managing infrastructure, IoT sensor collection, and log monitoring and alerting.

Search Engines

Last but not least is search engines, such as Elasticsearch, Splunk and Apache Solr, which are:

  • Built for non-relational, document-based data
  • Arranged and optimized for storage and rapid retrieval of data
  • Indexes data across a variety of sources including: file systems, intranets, document -Management systems, e-mail, and databases

Search engine pros include their focus on optimized searching, highly scalable and schemaless, and they have advanced search options like full text search, suggestions, and complex search operations. The cons are that they are expensive, have low durability and poor security, have no transaction support, are not efficient for writing and retrieving data outside of searching, and are difficult to manage.

Search engines are great when search results are top priority, logging, product catalogs, and blogs.

Multi-Model Considerations

It should be noted that many real life implementations choose “polyglot persistence,” which is the concept of using different data storage technologies to handle different data storage needs within a given software application. Many database technologies implement this within a single software, referred to as multi-model or data lake technology. This can be ideal for specific use cases, but poses inherent risk such as instability, data inconsistency and corruption, expensive / resource intensive, and data replication on disk and memory.

A few examples of multi-model technologies are:

  • Hadoop: Software tools running on top of multiple databases
  • MongoDB: Multiple data storage options in a single database software platform
  • PostgreSQL: Row-oriented, column-oriented, key-value, and document-oriented data storage options


Finally, As a Database Company...

It probably makes sense to discuss where HarperDB fits into the mix.
If for no other reason than to highlight that these database “categories” are not all black and white, and some databases take more of a “hybrid” approach by subscribing to several methodologies. Instead of falling into a single bucket, HarperDB can be considered as a structured object store with SQL capabilities. It features:

  • Built-in REST API
  • NoSQL & SQL operations including joins
  • Dynamic schema
  • Advanced publish-subscribe data replication
  • Self-service management studio
  • Traditional drivers/interfaces

We built HarperDB from the ground up to expand and blend the best capabilities of SQL, NewSQL, and NoSQL products because we felt there were certain use cases that could be better served with another solution. Some examples where we feel that HarperDB might be a better fit include cases where you need both NoSQL & SQL, rapid application development, integration, edge computing, distributed computing, and real-time operational analytics.

Hopefully this database architecture overview is helpful in finding the right database for your use case. Please reply below with questions or comments - we would love to discuss!

Database Relational database Data integrity Architecture sql

Published at DZone with permission of Margo McCabe. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • Using JSON in MariaDB
  • Why Should Databases Go Natural?
  • SQL Interview Preparation Series: Mastering Questions and Answers Quickly

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!