Planning for Chaos with MongoDB Atlas: Using the "Test Failover" Button

From running out of disk space to network connectivity failure, chaos engineering helps you test your application by introducing failure in a controlled way.

By Jay Gordon · Feb. 01, 18 · Tutorial

When building an application, it's smart to consider chaos. Chaos can be introduced into an application in many different ways; some examples are:

  • Running out of disk space
  • Utilizing all connections to the cluster
  • Oversaturating the available IOPS
  • Network connectivity failure

To help you prepare for such an event, MongoDB Atlas has introduced a new feature called "Test Failover" that you can use to introduce some chaos for testing purposes.

Welcome to Chaos Engineering

One of the more popular terms to come out of the open source community has been "Chaos Engineering." On the "Principles of Chaos Engineering" site, you'll find the following definition, which encapsulates why the "Test Failover" feature in MongoDB Atlas exists:

Chaos Engineering is the discipline of experimenting on a distributed system in order to build confidence in the system's capability to withstand turbulent conditions in production...

Chaos Engineering strives to eliminate the pain points in a distributed system by introducing a failure of one of the components in a test environment and reviewing the output. The harder it is to introduce chaos that will cause an application to no longer operate, the more confidence we can place in the infrastructure our app lives on.

The team at Netflix had to ensure that their massive distributed application would still survive if chaos were introduced. Based on the demand of their customer base and the distributed nature of their systems, the engineers at Netflix needed to ensure they could handle a failure of a production system. They created an open source tool called "Chaos Monkey," which you can read more about in this blog post.

The main intent of this chaos is to ensure that if part of production fails, you don't end up with a completely out-of-service application. For this reason, we've nicknamed the "Test Failover" feature the Chaos Button.

Chaos Checklist

One of the more important concepts of pre-production application architecture testing is ensuring that your application will continue to work during an unplanned outage. I like to create pre-deployment checklists to make sure I have considered all the potential ways my app could fail. These checklists typically consist of things like backups, restore testing, and disaster recovery.

Some questions I like to have answered prior to going to production are:

  • Do I know how my app will respond when access to my data is temporarily interrupted?
  • When my database recovers, will the application work as expected?
  • Did I configure my application to utilize the full connection string to ensure failover?
  • If an issue occurs with my data, will I need to do any form of intervention?
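The connection string question in the checklist above can be checked mechanically before deployment. The following is a minimal sketch, assuming the standard `mongodb://` seed list and `mongodb+srv://` URI formats. The function name `failover_ready` and the hostnames in the usage note are hypothetical, and the check only inspects the string; it never connects to a cluster.

```python
from urllib.parse import urlsplit


def failover_ready(uri: str) -> bool:
    """Return True if the URI gives a driver more than one way to reach the cluster.

    Heuristic only: a mongodb+srv:// URI resolves the member list through DNS,
    while a mongodb:// URI should list more than one host in its seed list.
    """
    parts = urlsplit(uri)
    if parts.scheme == "mongodb+srv":
        return True
    if parts.scheme != "mongodb":
        raise ValueError(f"not a MongoDB URI: {uri!r}")
    # Drop any "user:password@" prefix, then split the comma-separated seed list.
    hosts = parts.netloc.rsplit("@", 1)[-1].split(",")
    return len(hosts) > 1
```

For example, `failover_ready("mongodb://a.example.net:27017/test")` returns `False` because only one host is listed, so an application starting up while that host is rebooting would have nowhere else to connect.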

By testing your application before going to production, you're able to review how your app will survive an incident or planned maintenance where a failover may occur. You follow the best practice of verifying that you can survive chaos, much like the team at Netflix did.

How "Test Failover" Works

The "Test Failover" button will reboot the instance your primary lives on. Your cluster will hold an election and select the secondary with the most complete oplog to become your new primary.

Once failover is completed, the former primary instance is placed back into your cluster with the same hostname. Your connection string will not require modification, as MongoDB drivers automatically discover which member of your Atlas cluster is the new primary.
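Drivers rediscover the primary on their own, but an operation that is in flight while the election runs can still fail, so application code often retries briefly. The following is a minimal sketch of that idea, not a driver feature: `retry_during_failover` is a hypothetical helper, and the attempt count and delays are illustrative assumptions. With PyMongo, the exception to pass as `retryable` would typically be `pymongo.errors.AutoReconnect`; the sketch defaults to the built-in `ConnectionError` so it runs without a driver installed.

```python
import time


def retry_during_failover(operation, retryable=(ConnectionError,),
                          attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on `retryable` errors with exponential backoff.

    Re-raises the last error if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except retryable:
            if attempt == attempts - 1:
                raise
            # Wait 0.5s, 1s, 2s, ... to give the election time to finish.
            sleep(base_delay * 2 ** attempt)
```

Only wrap operations that are safe to repeat, since a write that failed mid-flight may or may not have been applied before the primary went away.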

Begin Your Test

Note: In order to test failover, you need to be using a dedicated MongoDB Atlas cluster. This means that clusters on multi-tenant architecture will not have this feature.

To begin adding some "chaos," go to the "Clusters" menu for your organization, then find the project you'd like to work with. In this example, I will use the project "jg-MongoDB-Atlas-2017" to perform the chaos test.

In the main window where your cluster is listed, open the ellipsis menu and select "Test Failover."

Once you select "Test Failover," an information box will explain what actions are about to happen.

Now click "RESTART PRIMARY," which will initiate the failover test as described above. You'll be shown a new window which informs you the test is underway.

You can follow the process as it occurs by clicking on the cluster's name.

You are able to see that the primary is moved to a new node and the failed-over instance is having its data resynced from the new primary. At this time, if you are reviewing an application's stability, you may run some form of Selenium test or a curl script that hits an endpoint to confirm a connection to your database is occurring as expected.
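The curl-style check mentioned above can be scripted as a simple polling loop that runs while the failover test is underway. This is a minimal sketch, assuming your application exposes an HTTP endpoint that touches the database; the function names, the URL in the usage note, and the intervals are all hypothetical.

```python
import time
import urllib.request


def http_probe(url, timeout=2.0):
    """Return True if the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False


def watch(probe, checks=60, interval=1.0, sleep=time.sleep):
    """Run probe() `checks` times and return (failures, longest_failure_streak)."""
    failures = longest = streak = 0
    for _ in range(checks):
        if probe():
            streak = 0
        else:
            failures += 1
            streak += 1
            longest = max(longest, streak)
        sleep(interval)
    return failures, longest
```

Calling `watch(lambda: http_probe("http://localhost:8080/health"))` right after clicking "RESTART PRIMARY" gives a rough picture of how long, if at all, the application was unable to reach its data. The longest failure streak multiplied by the interval approximates the outage your users would have seen.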

When completed, you'll see a new primary selected and the continuation of normal service.

That's it: there's no need to modify connection strings or edit your app. Your cluster's backup, replication, and other services will continue with no required intervention from you.


Published at DZone with permission of Jay Gordon, DZone MVB. See the original article here.
