How to Configure Your MongoDB Replica Set for Analytics

Chris Chang lays out an excellent strategy for running analytics against MongoDB: use analytics-only replicas of your data to keep analytics queries fast and to isolate them from the active nodes.

by Chris Chang · Oct. 31, 16 · Tutorial

MongoDB replica sets make it easy for developers to ensure high availability for their database deployments.

A common replica set configuration is composed of three member nodes: two data-bearing nodes and one arbiter node. With two electable, data-bearing nodes, users are protected from scenarios that cause downtime for single-node deployments, such as maintenance events and hardware failures.

However, it may be tempting to read from the redundant secondary server to scale reads or to run analytics queries. We strongly advise against secondary reads when there are only two electable, data-bearing nodes in the replica set.

The main reason for this recommendation is that relying on secondary reads can compromise the high availability that replica sets are meant to provide. Occasional use of the secondary for non-critical, ad-hoc queries is fine. But if your application needs both the primary and the secondary to shoulder its database load, the system can no longer handle that load when one of the nodes goes down or becomes unavailable.

This is discussed in more depth in the following resources:

  • Can I use more replica nodes to scale?
    • http://www.askasya.com/post/canreplicashelpscaling/
  • Reasons to not use secondary reads to provide extra read capacity
    • https://docs.mongodb.com/manual/core/read-preference/#counter-indications
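
The availability argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the load and per-node capacity figures are made-up numbers, and `survives_single_failure` is a hypothetical helper, not part of any MongoDB tooling.

```python
def survives_single_failure(load_ops, node_capacity_ops, data_nodes=2):
    """Return True if the remaining data-bearing nodes can carry the
    full load after one of them goes down.

    All figures are hypothetical operations per second.
    """
    remaining_capacity = (data_nodes - 1) * node_capacity_ops
    return load_ops <= remaining_capacity


# The load fits on the primary alone: losing the secondary is harmless.
print(survives_single_failure(load_ops=800, node_capacity_ops=1000))   # True

# The load only fits when split across both nodes: a single failure
# leaves 1000 ops/s of capacity for 1500 ops/s of demand.
print(survives_single_failure(load_ops=1500, node_capacity_ops=1000))  # False
```

In the second case the replica set still elects a primary, but the surviving node is overloaded, so the redundancy no longer buys availability.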

Run Analytics Queries Against Hidden Analytics Nodes Instead

If you would like to run more than the occasional ad-hoc or analytics query, we highly recommend that you properly configure your replica set to handle analytics queries. In particular, we recommend adding a node designated for analytics as a hidden, non-electable member of the replica set.

Hidden members have properties that make them great for analytics. A hidden replica set member:

  • Maintains a copy of the primary’s data set: querying a hidden member is nearly identical to querying the primary node (minus some replication delay).
  • Cannot become primary and is invisible to your application: it’s important to isolate analytics traffic from production application traffic. If the analytics node became the replica set primary, it might be unable to handle the combined analytics and production application traffic.
  • Can also be useful for disaster recovery if a slaveDelay is configured: see the advanced configuration considerations below.
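
For readers who manage their own replica set, the properties above map onto two fields of a member's configuration document. The following is a minimal sketch, assuming a self-managed deployment; the helper name, member id, and host name are hypothetical and chosen only for illustration.

```python
def hidden_analytics_member(member_id, host):
    """Build a replica set member document for a hidden analytics node.

    `priority: 0` makes the member non-electable, which MongoDB requires
    for hidden members; `hidden: True` keeps it out of the member list
    that drivers discover, so application traffic does not reach it.
    """
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,
        "hidden": True,
    }


# Hypothetical id and host name, for illustration only.
member = hidden_analytics_member(3, "analytics.example.com:27017")
print(member)
```

In the mongo shell, a document like this could be passed to `rs.add()`, or appended to the `members` array of the current configuration before calling `rs.reconfig()`.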

If you’re interested in adding an analytics node to your mLab deployment:

  1. Email us at support@mlab.com to request that the node be added.
  2. mLab will add the node seamlessly into your replica set as a hidden member and provide you with its address.
  3. You will then be able to start to create single-node connections using that address for your analytics queries.
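
A single-node connection can be sketched as a connection string that names only the analytics node. The host, database name, and credentials below are placeholders, and the `directConnection` option is an assumption about the driver in use: current drivers support it, while many older ones connect directly when a single host is listed without a `replicaSet` option.

```python
from urllib.parse import quote_plus, urlencode


def analytics_uri(host, database, user, password):
    """Build a connection string that targets one node directly.

    `directConnection=true` asks the driver to talk only to the listed
    host instead of discovering the rest of the replica set.
    """
    # Percent-encode credentials so characters like '@' and '/' are safe.
    credentials = f"{quote_plus(user)}:{quote_plus(password)}"
    options = urlencode({"directConnection": "true"})
    return f"mongodb://{credentials}@{host}/{database}?{options}"


# Placeholder host, database, and credentials.
uri = analytics_uri("analytics.example.com:27017", "reports", "analyst", "p@ss/word")
print(uri)
# mongodb://analyst:p%40ss%2Fword@analytics.example.com:27017/reports?directConnection=true
```

The resulting string can then be handed to a driver, for example `pymongo.MongoClient(uri)`, and used only for analytics queries.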

Advanced Configuration Considerations

Enabling SlaveDelay on the Analytics Node for Replica Set Disaster Recovery

MongoDB’s slaveDelay option allows you to configure a replication delay on a hidden replica set member. Configuring a delay is helpful for recovering from disaster scenarios such as accidentally dropping a collection or database.

For example, imagine that you configure a one-hour delay on an analytics node. If a developer accidentally drops or deletes data from the primary node, the change will be applied to the analytics node an hour later (as opposed to immediately). This gives you a one-hour window in which to query the analytics node and retrieve the deleted data.
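
The one-hour example can be sketched as a member document plus a check on the recovery window. This is a simplified illustration: the helper names, host, and timestamps are hypothetical, and the window check assumes the delayed node has no replication lag beyond the configured delay.

```python
from datetime import datetime, timedelta


def delayed_analytics_member(member_id, host, delay_seconds):
    """Member document for a hidden, delayed analytics node.

    `slaveDelay` is the field name used in this article; MongoDB 5.0
    and later call the same setting `secondaryDelaySecs`.
    """
    return {
        "_id": member_id,
        "host": host,
        "priority": 0,   # delayed members must not be electable
        "hidden": True,
        "slaveDelay": delay_seconds,
    }


def still_recoverable(dropped_at, now, delay_seconds):
    """True while the delayed node has not yet replayed the drop."""
    return now < dropped_at + timedelta(seconds=delay_seconds)


member = delayed_analytics_member(3, "analytics.example.com:27017", 3600)
print(member["slaveDelay"])  # 3600

dropped_at = datetime(2016, 10, 31, 12, 0)
print(still_recoverable(dropped_at, datetime(2016, 10, 31, 12, 45), 3600))  # True
print(still_recoverable(dropped_at, datetime(2016, 10, 31, 13, 15), 3600))  # False
```

Keep in mind that the same delay applies to analytics queries: results from a delayed node are always at least that far behind the primary.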

Having Multiple Analytics Nodes for High Availability and/or to Scale Reads

If you would like your analytics queries to be able to withstand one node failure and/or to have more read capacity, it could make sense to have multiple analytics nodes.

In this case, consider a Read Preference with Tag Sets to ensure that analytics queries are directed at analytics nodes only, and that non-analytics queries are directed at electable nodes only.
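
Tag-based routing can be sketched as follows. The tag key `use`, its values, and the host names are hypothetical. The sketch also assumes the analytics nodes are non-hidden, priority-0 members, since drivers do not route replica set reads to hidden members. `matching_hosts` is a simplified stand-in for the tag-matching step of a driver's server selection, which also takes the read mode and latency into account.

```python
from urllib.parse import urlencode

# Hypothetical members: two electable nodes and two analytics nodes.
MEMBERS = [
    {"host": "db-a.example.com:27017", "priority": 1, "tags": {"use": "production"}},
    {"host": "db-b.example.com:27017", "priority": 1, "tags": {"use": "production"}},
    {"host": "an-a.example.com:27017", "priority": 0, "tags": {"use": "analytics"}},
    {"host": "an-b.example.com:27017", "priority": 0, "tags": {"use": "analytics"}},
]


def matching_hosts(members, tag_set):
    """Hosts whose tags contain every key/value pair in `tag_set`."""
    return [
        m["host"]
        for m in members
        if all(m["tags"].get(k) == v for k, v in tag_set.items())
    ]


def read_options(mode, tag_set):
    """Connection-string options for a read preference plus one tag set."""
    tags = ",".join(f"{k}:{v}" for k, v in tag_set.items())
    return urlencode({"readPreference": mode, "readPreferenceTags": tags}, safe=":,")


print(matching_hosts(MEMBERS, {"use": "analytics"}))
# ['an-a.example.com:27017', 'an-b.example.com:27017']
print(read_options("secondary", {"use": "analytics"}))
# readPreference=secondary&readPreferenceTags=use:analytics
```

Analytics clients would append the analytics options to their connection string, while the production application would use its own tag set (here `use:production`) or simply read from the primary.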

Reading From Secondaries in a Sharded Cluster

If you are running a sharded deployment and would like to read from the secondary members of your shards, there are important considerations you should be aware of. We will be publishing a blog post on this advanced topic in the future.

Published at DZone with permission of Chris Chang, DZone MVB. See the original article here.
