DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • MongoDB to Couchbase: An Introduction to Developers and Experts
  • Manage Hierarchical Data in MongoDB With Spring

Trending

  • DZone's Article Submission Guidelines
  • How to Submit a Post to DZone
  • A Complete Guide to Modern AI Developer Tools
  • Top Book Picks for Site Reliability Engineers
  1. DZone
  2. Data Engineering
  3. Databases
  4. Why We Moved From NoSQL MongoDB to PostgreSQL

Why We Moved From NoSQL MongoDB to PostgreSQL

It started with small problems... and then came the knockout punch! Finally, Postgres came to the rescue. Read on to live the adventure!

By 
Avi Cavale user avatar
Avi Cavale
·
Nov. 21, 17 · Opinion
Likes (48)
Comment
Save
Tweet
Share
138.8K Views

Join the DZone community and get the full member experience.

Join For Free

A couple of years ago, we moved our code base to a monorepo, which helped us scale tremendously in terms of code reuse and overall speed of development. We are extremely proud of our ability to run a resilient service that has 99.99% availability with zero downtime upgrades.

From the beginning of this journey, I made a decision to go all-in on JavaScript as our default coding language. The most important reason for this was that I wanted to hire full-stack developers who could work on every aspect of the product, so we chose Angular.js for UI, Node.js for API, and a schema-less JSON database (NoSQL MongoDB). We made all technology decisions based on this one philosophy (another blog coming about what I learned and why I am no longer a fan of full-stack development) and it worked beautifully...for a while.

It Started With Small Problems...

Even though we had the ability to add features at a lightning fast pace, we started seeing occasional downtimes which always seemed to come down to MongoDB. For instance:

  • We were very happy to have 24x7 availability with primary and secondary instances of MongoDB. However, our performance suddenly deteriorated one day and retrieval started taking more than a second per document. We tried using many tools and profilers, but could not figure out what was happening. Finally, we rebuilt a new server, switched that over as primary, and rebuilt our secondary. Retrieval times dropped to 150ms again. This is still an unsolved mystery!
  • Our Mongo instance reached 4TB and we were proud of our growing adoption. Due to the lack of tooling around managing large DBs, we relied on indexes to keep the search times low. When NoSQL DBs first became popular, there was no way to create uniqueness, so these features were built as an afterthought. Some of the bloating of our MongoDB was actually due to indexes, but rebuilding them was primitive and the entire DB would lock down.
  • At one point, we needed to reboot our DB server — and it took MongoDB four hours to come back online. This led to an extended downtime for our service, and we had very little visibility into the MongoDB process and status.

And Then Came the Knockout Punch!

The biggest advantage — and disadvantage — of MongoDB is that it has a flexible schema. This means that documents in the same collection (AKA "table" in the old world) do not need to have the same set of fields or structure, and common fields in a collection's documents may hold different types of data. In a nutshell, there are no strict schema rules and this opens it up to a lot of cowboy tinkering.

While many developers love the flexibility, it also puts a very high degree of responsibility on their shoulders to get things right.

For example, let's consider a simple schema that stores information about a Git repository:

Field name added on
provider 12/1/2012
repoOrg 12/1/2012
repoName 12/1/2012
isPrivate 7/17/2014
hasTeams 2/23/2016

As you could guess, the schema is updated over a period of time as fields are added to meet new requirements of an evolving product. This means that depending on when a repository was added to this document, it may or may not have the isPrivate and hasTeams fields. Our backend and frontend services needed to handle both cases gracefully, which led to code like this:

if exists(repo.hasTeams) and repo.hasTeams === true
{
# do something
} else {
# do something
}

In every single place where repo.hasTeams is used, we needed to add this code. With many microservices, many schemas and 40+ developers adding these blocks all over the place, our codebase was starting to look ugly. Also, every time we saw failures across our system, it was always a spectacular crash with no easy way to recover. I would wager that 90% of our crashes were due to the fact that some piece of code expected a field that didn't exist for that document. Internal brainstorming brought up the idea of building a schema validator and all sorts of hacks, but isn't this what a database should provide out-of-the-box?

One big black mark against Mongo (or any equivalent NoSQL database)!

The straw that broke the camel's back was when we introduced a critical field that absolutely needed to be present for each document in our most important collection. To ensure that every document included the field, we had to retrieve every single document one by one, update it, and then put it back. With millions of documents in the collection, this process caused the database performance to degrade to an unacceptable degree and we had to accept 4+ hours of downtime.

And that's when I decided that NoSQL wasn't going to work for us. To each his own, but we were done struggling with it and causing our customers (and ourselves) unnecessary heartache.

Postgres to the Rescue!

After this last incident, which happened about a year ago, we migrated to PostgreSQL. I can explain the step-by-step process of migration in another blog if you're interested. We have no regrets, and the following factors have greatly improved our availability and resiliency:

  • Postgres has a strongly typed schema that leaves very little room for errors. You first create the schema for a table and then add rows to the table. You can also define relationships between different tables with rules so that you can store related data across several tables and avoid data duplication. All this means someone on the team can act as a database architect and control the schema which acts as a standard for everyone else to follow.
  • You can change tables in PostgreSQL without requiring to lock it for every operation. For example, you can add a column and set the value to NULL quickly without locking the entire table.
  • Postgres also supports JSONB, which lets you create unstructured data — but with data constraint and validation functions to help ensure that JSON documents are more meaningful. The folks at Sisense have written a great blog with a detailed comparison of Postgres vs. MongoDB for JSON documents.
  • Our database size reduced by 10x since Postgres stores information more efficiently and data isn't unnecessarily duplicated across tables.
  • As was shown in previous studies, we found that Postgres performed much better for indexes and joins, and our service became faster and snappier as a result.

We have been very happy with Postgres since we migrated and we are no longer struggling with managing our database. As a result, we have seen our NPS go up significantly as customers are happier with a platform that "always works."

PostgreSQL Database MongoDB NoSQL Document Schema

Published at DZone with permission of Avi Cavale, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.

Related

  • How to Store Text in PostgreSQL: Tips, Tricks, and Traps
  • MongoDB to Couchbase for Developers, Part 1: Architecture
  • MongoDB to Couchbase: An Introduction to Developers and Experts
  • Manage Hierarchical Data in MongoDB With Spring

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!