Introducing the Database Selection Matrix

By Francesca Krihely · Feb. 09, 15 · Interview

Originally Written by Mat Keep

For the better part of a generation, the database landscape changed very little. No one could say “this is not your father’s database.” Databases had become, in a word, boring.

Then a combination of factors catalyzed an era of innovation in database technologies: cheap storage and compute resources; pervasive connectivity; social networks; smartphones; the proliferation of sensors; open source software. Data volumes grew (and are growing) at exponential rates. Over 80% of today’s data no longer fits neatly into the normalized row and column table formats of the past. And so developers began engineering solutions to a new set of problems with a very different set of resources and assumptions. Today these new options include a variety of database architectures built around diverse data models – from key-value to document to wide-column and graph. And of course you still have the option of the venerable relational database.

For the enterprise these new technologies hold great promise. They open the door to new applications that could not be imagined before, and to more efficient ways of solving existing problems. They attract new technical talent. They facilitate the migration of systems to more cost-effective infrastructure based on commodity hardware and cloud platforms. But at the same time, evaluating these new options requires careful consideration.

Selecting the appropriate database for a new project requires evaluation against multiple criteria, including:

  • Development considerations: the data model, query functionality, available drivers and data consistency. These factors dictate the functionality of your application, and how quickly you can build it.
  • Operational considerations: performance and scalability, high availability, data center awareness, security, management and backups. Over the application’s lifetime, operational costs will contribute a significant percentage of the project’s Total Cost of Ownership (TCO), and so these factors determine your ability to meet SLAs while minimizing administrative overhead.
  • Commercial considerations: licensing, pricing and support. You need to know that the database you choose is available in a way that is aligned with how you do business.

Each of these considerations needs to be evaluated in the context of specific application requirements, as well as internal technology standards, skills availability and integration with your existing enterprise architecture.

So, where to start? The Database Selection Matrix is designed to serve as a decision framework for teams responsible for database selection. It was developed in collaboration with several large enterprises that have the choice of running multiple databases in production, and that wanted to institute a systematic methodology for database evaluation. Responses to the questions in the matrix helped them identify key requirements and guide selection. And it can do the same for you.
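
The matrix itself is a set of questions, not a formula. Still, one way a team might compare shortlisted databases once the questions are answered is a simple weighted score. The sketch below is only an illustration: the three criteria mirror the considerations listed above, but the weights, the 1 to 5 scale and the candidate names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights: how much each group of considerations matters
# to this particular project. They are assumptions, not part of the matrix.
WEIGHTS: Dict[str, float] = {
    "development": 0.40,
    "operational": 0.40,
    "commercial": 0.20,
}


@dataclass
class Candidate:
    name: str
    scores: Dict[str, int]  # 1 (poor fit) to 5 (strong fit), assigned by the team


def weighted_score(candidate: Candidate, weights: Dict[str, float] = WEIGHTS) -> float:
    """Sum of weight * score over every criterion in `weights`."""
    missing = set(weights) - set(candidate.scores)
    if missing:
        raise ValueError(f"{candidate.name} has no score for: {sorted(missing)}")
    return sum(weights[c] * candidate.scores[c] for c in weights)


def rank(candidates: List[Candidate], weights: Dict[str, float] = WEIGHTS) -> List[Candidate]:
    """Candidates ordered from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Placeholder candidates with made-up scores.
candidates = [
    Candidate("database_a", {"development": 4, "operational": 3, "commercial": 5}),
    Candidate("database_b", {"development": 3, "operational": 4, "commercial": 3}),
]

for c in rank(candidates):
    print(f"{c.name}: {weighted_score(c):.2f}")
```

A score like this is only as good as the answers behind it, so it works best as a way to structure the discussion, not to replace it.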

Let’s illustrate how the Database Selection Matrix can be used by working through a practical example.

The Database Selection Matrix in Action!

ACME Retail Corporation runs a large vehicle fleet to distribute produce to its nationwide network of stores. The CEO is intent on improving distribution efficiency and so tasks her enterprise architects to build a new platform that can utilize sensor data generated by the company’s trucks. By capturing and analyzing this data, the organization believes it can optimize route planning, improve delivery times, cut wastage and reduce business interruptions caused by breakdowns.

ACME Retail Corp is typical of many enterprises that see the opportunity to unlock new efficiencies by leveraging the “Internet of Things”. As Morgan Stanley stated in its “Internet of Things is Now” research: “We do not believe traditional data storage architectures are well-suited to accommodate the volume, velocity, and variety of IoT data”. For this reason, enterprises are looking beyond traditional RDBMS technology to the swathe of new database options available to them. Bosch SI did exactly this when it decided to use MongoDB to power the Bosch IoT Suite.

Of course, MongoDB may not be a perfect fit for every IoT project. There are many choices available – as there are for every new type of project – and the ACME architecture group needs a way to navigate the complex landscape of modern databases. Using the Database Selection Matrix, they have the framework to ask the key questions that will guide their technology decisions. So let’s put it into practice.

Development Considerations

In this opening phase, the architects need to evaluate how their shortlisted database options meet the functional requirements of the app that is being built. This is impacted directly by multiple factors – and these are the questions they will need to ask.

The Data Model:

  • Will the application need to handle data of varying structure and types?
  • How large can each data type be – is our data made up of simple integers, strings and timestamps or can it also be large binary files such as images or videos?
  • Can our data just be represented as a set of opaque values, or does it need to be typed so other applications can make sense of it?
  • Do we know the data structure will remain constant, or will it vary as we introduce new sensor data and as the business updates application requirements?
  • Does the application require its data to be strongly consistent (i.e. read our own writes), or can eventually consistent data be tolerated (and do our developers know how to handle the complexity it introduces)? Do we end up trading performance and availability if we configure the database to only return the freshest data?
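
To make the consistency question concrete, here is a toy sketch of the difference between reading your own writes and reading eventually consistent data. It is not how any particular database works: the `ReplicatedStore` class, its explicit `replicate()` step and the key name are all invented for illustration.

```python
class ReplicatedStore:
    """Toy primary/replica pair. Replication only happens when
    replicate() is called, which stands in for replication lag."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []  # writes not yet applied to the replica

    def write(self, key, value):
        # Writes always go to the primary first.
        self.primary[key] = value
        self._pending.append((key, value))

    def replicate(self):
        # Apply outstanding writes to the replica, in order.
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()

    def read(self, key, strong=False):
        # strong=True reads from the primary (read our own writes);
        # otherwise we read from the replica, which may be stale.
        source = self.primary if strong else self.replica
        return source.get(key)


store = ReplicatedStore()
store.write("truck-17:fuel_pct", 62)

stale = store.read("truck-17:fuel_pct")               # replica has not caught up
fresh = store.read("truck-17:fuel_pct", strong=True)  # served by the primary

store.replicate()
converged = store.read("truck-17:fuel_pct")           # replica now agrees
```

In this toy model every strong read lands on the primary, which hints at the trade-off raised in the last question: insisting on the freshest data concentrates load on, and dependence upon, a single node.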

The Query Model:

  • What sort of queries are we going to run against the database? Are they simple key-value lookups that we know in advance, or do we need to execute ad-hoc queries and complex aggregations to support the real-time analytics that the business wants to see?
  • Can we run analytics directly against the database, or do we need to replicate data to dedicated search or analytics engines?
  • Will the application be handling geospatial queries and text search?
  • Does the data need to be integrated with our BI & analytics tools, and what about our new Hadoop cluster, or the data warehouse?
  • Which languages will our engineers be using to develop the application, and does the database have drivers available for them?
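
The gap between a key-value lookup and an ad-hoc aggregation can be shown with a few lines of plain Python. The sensor readings, field names and helper functions below are made up for the ACME example; a real database would express the same ideas in its own query language.

```python
from collections import defaultdict
from statistics import mean

# Invented sample of truck sensor readings.
readings = [
    {"truck_id": "T1", "region": "north", "engine_temp_c": 88.0},
    {"truck_id": "T2", "region": "north", "engine_temp_c": 94.0},
    {"truck_id": "T3", "region": "south", "engine_temp_c": 91.0},
    {"truck_id": "T4", "region": "south", "engine_temp_c": 97.0},
]

# Key-value access: the key is known in advance, so one index is enough.
by_truck = {r["truck_id"]: r for r in readings}


def lookup(truck_id):
    """Fetch a single reading by its key."""
    return by_truck.get(truck_id)


def average_by(field, metric, predicate=lambda r: True):
    """Ad-hoc aggregation: filter on any condition, group by any field,
    then average a metric. Nothing here is known when the data is stored."""
    groups = defaultdict(list)
    for r in readings:
        if predicate(r):
            groups[r[field]].append(r[metric])
    return {key: mean(values) for key, values in groups.items()}


one_truck = lookup("T2")
avg_temp_by_region = average_by("region", "engine_temp_c")
hot_trucks_by_region = average_by(
    "region", "engine_temp_c", predicate=lambda r: r["engine_temp_c"] > 90
)
```

A store that only offers the first kind of access pushes the second kind of logic into application code or a separate analytics system, which is exactly what the questions above are probing.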

Operational Considerations

In this second phase, the ACME Retail architects need to evaluate how each database would run in production. No one wants to hand-feed a custom technology, so they need to understand if the database can meet the availability, scalability and security needs of the business, and interoperate with the existing management frameworks.

Service Availability:

  • What is the application’s availability SLA? What are our recovery time (RTO) and recovery point (RPO) objectives?
  • Will our operations teams manage failure recovery, or is this something that should be fully automated by the database?
  • What capabilities does the database offer to maintain availability during routine maintenance? Are there tools available to manage this or do we need to script something ourselves?
  • Are there specific requirements to replicate data between our data centers to support disaster recovery?
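
It helps to translate an availability SLA into a downtime budget before judging whether failure recovery can be handled manually. The arithmetic is simple; the function name and the choice of a 365-day year below are assumptions of this sketch.

```python
def downtime_budget_minutes(availability_pct, period_days=365):
    """Minutes of downtime allowed over the period at a given availability."""
    if not 0 < availability_pct <= 100:
        raise ValueError("availability_pct must be in the range (0, 100]")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


for sla in (99.9, 99.99, 99.999):
    print(f"{sla}% -> {downtime_budget_minutes(sla):.1f} minutes per year")
```

At 99.9% the budget is roughly 525.6 minutes a year; at 99.99% it falls to under an hour, which leaves little room for recovery steps that depend on someone being paged.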

Scalability:

  • How do we expect this application to grow? Will the database need to scale beyond the limits of just a few servers?
  • If data is to be distributed across multiple nodes, will it be partitioned in such a way that it is still optimized for the application’s query patterns?
  • Do we need to scale this across data centers? Can we write and read data locally to reduce the effects of geographic latency?
  • Can we scale storage capacity and I/O by compressing the data, and are different compression algorithms available to optimize compression ratio to CPU overhead?
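
The partitioning question is easier to reason about with a small model. The sketch below assumes hash partitioning on a `truck_id` key across a hypothetical four-shard cluster; real databases offer other schemes (range-based, for example), and the function names here are invented.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical cluster size


def shard_for(key, num_shards=NUM_SHARDS):
    """Map a partition key to a shard number.
    SHA-256 is used because it is stable across processes, unlike
    Python's built-in hash() for strings."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def shards_for_truck_query(truck_id):
    """Partitioned by truck_id: all of one truck's readings share a shard,
    so a per-truck query touches exactly one node."""
    return {shard_for(truck_id)}


def shards_for_time_range_query():
    """A query by time range cannot be routed by truck_id,
    so it has to fan out to every shard."""
    return set(range(NUM_SHARDS))


print(shards_for_truck_query("truck-17"))
print(shards_for_time_range_query())
```

If most of ACME's queries were by time window instead of by truck, this choice of key would turn every query into a scatter-gather, which is why the partitioning scheme has to be judged against the application's actual query patterns.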

Security:

  • What types of data access control do we need? Can we just use authentication controls within the database or do we need to integrate with our existing LDAP infrastructure?
  • What type of authorization controls are available, and how granular can we get? Do these controls need to extend down to the level of individual attributes within a document?
  • Is encryption needed, and will those pesky compliance officers need to audit every action taken against the database?
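
As an illustration of what attribute-level authorization means, the sketch below filters a document down to the fields a role is allowed to read. The roles, field names and `redact` helper are hypothetical; databases that support this kind of control implement it in their own ways, and the point here is only the granularity being asked about.

```python
# Hypothetical roles and the document fields each one may read.
READABLE_FIELDS = {
    "dispatcher": {"truck_id", "location", "eta"},
    "mechanic": {"truck_id", "engine_temp_c", "fault_codes"},
}


def redact(document, role):
    """Return a copy of the document containing only the fields
    the role may read. Unknown roles see nothing."""
    allowed = READABLE_FIELDS.get(role, set())
    return {field: value for field, value in document.items() if field in allowed}


reading = {
    "truck_id": "T1",
    "location": "depot-4",
    "eta": "14:30",
    "engine_temp_c": 88.0,
    "fault_codes": [],
    "driver_name": "example driver",
}

dispatcher_view = redact(reading, "dispatcher")
mechanic_view = redact(reading, "mechanic")
```

If the database cannot enforce rules like these itself, the filtering has to live in every application that touches the data, which is harder to audit.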

Administration:

  • How are we going to run this thing?
  • Does the database provide tools to automate provisioning and upgrades or do we need to create our own scripts?
  • How about backups? Can we get incremental backups? How about point-in-time backups?
  • And then monitoring. We need to know, for example, if disk utilization is peaking above 60% so we can take action before we hit a problem. Can we add these alerts into our existing operational workflow tools?
  • Can we integrate the database’s management platform into our own operational tooling so we don’t need to leave our single screen?
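
The disk utilization alert mentioned above can be sketched with Python's standard library. This is a minimal check for a single host, not a monitoring system: the 60% threshold comes from the example in the text, while the function names and the path being checked are assumptions.

```python
import shutil

DISK_ALERT_THRESHOLD = 0.60  # the 60% figure used as an example in the text


def disk_utilization(path="."):
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total


def should_alert(utilization, threshold=DISK_ALERT_THRESHOLD):
    """True when utilization is strictly above the threshold."""
    return utilization > threshold


current = disk_utilization(".")
if should_alert(current):
    # In practice this would post to the team's existing alerting tool.
    print(f"Disk utilization at {current:.0%}, above {DISK_ALERT_THRESHOLD:.0%}")
```

The more useful question for the evaluation is whether the database's own management tooling can raise alerts like this and feed them into the workflow tools the operations team already uses.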

Commercial Considerations

Once the ACME architects have profiled their technology requirements, they will need to understand how the database is licensed and priced, before legal and procurement come knocking at the door:

  • Licensing: what license is used, and is this acceptable to our legal team? Are commercial licenses available?
  • Support: What support options are open to me? Can I get support SLAs from my vendor, even if I use a community version of their product?
  • What is the SLA I can expect if I do hit an issue?
  • What sort of training is available? Is my only option to send my engineers to public classes, or can we get trained on-demand, at our own pace?

What’s Next?

The ACME example is designed to illustrate some of the key questions engineering teams need to ask. It is true that the database landscape is more complex than ever. But it needn’t be bewildering – the Database Selection Matrix is designed to help you identify and compare what is most critical as you build your next app, so go ahead and download it now.

Looking for additional information about database selection? Learn why organizations choose MongoDB to deliver applications and outcomes that were never previously possible. Download the white paper below:

THE VALUE OF DATABASE SELECTION



Published at DZone with permission of Francesca Krihely, DZone MVB. See the original article here.
