DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Because the DevOps movement has redefined engineering responsibilities, SREs now have to become stewards of observability strategy.

Apache Cassandra combines the benefits of major NoSQL databases to support data management needs not covered by traditional RDBMS vendors.

The software you build is only as secure as the code that powers it. Learn how malicious code creeps into your software supply chain.

Generative AI has transformed nearly every industry. How can you leverage GenAI to improve your productivity and efficiency?

Related

  • Tracking Changes in MongoDB With Scala and Akka
  • Data Pipeline Using MongoDB and Kafka Connect on Kubernetes
  • Manual Sharding in PostgreSQL: A Step-by-Step Implementation Guide
  • Avoid Cross-Shard Data Movement in Distributed Databases

Trending

  • Resolving Parameter Sensitivity With Parameter Sensitive Plan Optimization in SQL Server 2022
  • PostgreSQL Performance Tuning
  • Creating a Web Project: Caching for Performance Optimization
  • Distributed Consensus: Paxos vs. Raft and Modern Implementations
  1. DZone
  2. Data Engineering
  3. Databases
  4. Composing a Sharded MongoDB Cluster on Docker Containers

Composing a Sharded MongoDB Cluster on Docker Containers

Learn about writing a Docker file and cluster initiation scripts that will allow us to deploy a sharded MongoDB cluster on Docker containers.

By 
Ayberk Cansever user avatar
Ayberk Cansever
·
Aug. 04, 17 · Tutorial
Likes (6)
Comment
Save
Tweet
Share
55.4K Views

Join the DZone community and get the full member experience.

Join For Free

In this article, we will write a docker-compose.yaml file and a cluster initiation scripts which will deploy a sharded MongoDB cluster on Docker containers.

Initially, let's look what kind of components we are going to need for a sharded MongoDB cluster. If we look at the official documentation, we need three main components which are obviously defined:

  1. Shard: Each shard contains a subset of the sharded data. Each shard can be deployed as a replica set.

  2. Mongos (router): The mongos acts as query routers, providing an interface between client applications and the sharded cluster.

  3. Config servers: Config servers store metadata and configuration settings for the cluster.

In the official documentation, the main architecture of a sharded cluster looks like this:

Image title

Now, we are beginning to build a cluster that consists of a shard that is a replica set (three nodes), config servers (three nodes replica set), and two router nodes. In total, we will have eight Docker containers running for our MongoDB sharded cluster. Of course, we can expand our cluster according to our needs.

Let's start to write our docker-compose.yaml file by defining our shard replica set:

version: '2'
services:
  mongorsn1:
    container_name: mongors1n1
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27017:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/data1:/data/db
  mongors1n2:
    container_name: mongors1n2
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27027:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/data2:/data/db
  mongors1n3:
    container_name: mongors1n3
    image: mongo
    command: mongod --shardsvr --replSet mongors1 --dbpath /data/db --port 27017
    ports:
      - 27037:27017
    expose:
      - "27017"
    environment:
      TERM: xterm
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/data3:/data/db

As you see, we defined our shard nodes by running them with the shardsvr parameter. Also, we mapped the default MongoDB data folder (/data/db) of the container, as you see. We will build a replica set with these three nodes when we finish writing our docker-compose.yaml file.

Now, let's define our three config servers:

mongocfg1:
    container_name: mongocfg1
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017
    environment:
      TERM: xterm
    expose:
      - "27017"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/config1:/data/db
  mongocfg2:
    container_name: mongocfg2
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017
    environment:
      TERM: xterm
    expose:
      - "27017"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/config2:/data/db
  mongocfg3:
    container_name: mongocfg3
    image: mongo
    command: mongod --configsvr --replSet mongors1conf --dbpath /data/db --port 27017
    environment:
      TERM: xterm
    expose:
      - "27017"
    volumes:
      - /etc/localtime:/etc/localtime:ro
      - /mongo_cluster/config3:/data/db

Our config servers are running with the configsvr parameter, as you see. 

Finally, we are going to define our mongos (router) instances:

mongos1:
    container_name: mongos1
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27017,mongocfg2:27017,mongocfg3:27017 --port 27017
    ports:
      - 27019:27017
    expose:
      - "27017"
    volumes:
      - /etc/localtime:/etc/localtime:ro
  mongos2:
    container_name: mongos2
    image: mongo
    depends_on:
      - mongocfg1
      - mongocfg2
    command: mongos --configdb mongors1conf/mongocfg1:27017,mongocfg2:27017,mongocfg3:27017 --port 27017
    ports:
      - 27020:27017
    expose:
      - "27017"
    volumes:
      - /etc/localtime:/etc/localtime:ro

These mongos are dependent on our config servers. They take the configdb parameter to obtain metadata and configuration settings.

At last, we built our docker-compose.yaml file. If we compose it up, we will see eight running docker containers: 3 shard date replicate set + 3 config servers + 2 mongos (routers):

docker-compose up
docker ps

Image title

But we're not finished yet. Our sharding cluster needs to be configured. For this purpose, we will run some commands, which will build our cluster on related nodes.

First, we will configure our config servers replica set:

docker exec -it mongocfg1 bash -c "echo 'rs.initiate({_id: \"mongors1conf\",configsvr: true, members: [{ _id : 0, host : \"mongocfg1\" },{ _id : 1, host : \"mongocfg2\" }, { _id : 2, host : \"mongocfg3\" }]})' | mongo"

We can check our config server replica set status by running the below command on the first config server node:

docker exec -it mongocfg1 bash -c "echo 'rs.status()' | mongo"

We are going to see three replica set members.

Secondly, we are going to build our shard replica set:

docker exec -it mongors1n1 bash -c "echo 'rs.initiate({_id : \"mongors1\", members: [{ _id : 0, host : \"mongors1n1\" },{ _id : 1, host : \"mongors1n2\" },{ _id : 2, host : \"mongors1n3\" }]})' | mongo"

Now, our shard nodes know each other. One of them is primary and two are secondary. We can check the replica set status by running the status check command on the first shard node:

docker exec -it mongors1n1 bash -c "echo 'rs.status()' | mongo"

 Finally, we will introduce our shard to the routers:

 docker exec -it mongos1 bash -c "echo 'sh.addShard(\"mongors1/mongors1n1\")' | mongo "

Now our routers, which are the interfaces of our cluster to the clients, have the knowledge about our shard. We can check the shard status by running the command below on the first router node:

 docker exec -it mongos1 bash -c "echo 'sh.status()' | mongo "

We see the shard status:

Image title

We see that we have a single shard named mongors1 , which has three mongod instances. But we do not have any databases yet, as you see. Let's create a database named testDb:

docker exec -it mongors1n1 bash -c "echo 'use testDb' | mongo"

This is not enough; we should enable sharding on our newly created database:    

docker exec -it mongos1 bash -c "echo 'sh.enableSharding(\"testDb\")' | mongo "

Now, we have a sharding-enabled database on our sharded cluster! It's time to create a collection on our sharded database:

docker exec -it mongors1n1 bash -c "echo 'db.createCollection(\"testDb.testCollection\")' | mongo "

We created a collection named testCollection on our database, but it is not sharded yet again. We must shard our collection by choosing a sharding key. Let's assume that we have decided to shard our collection on a field named shardingField then:

docker exec -it mongos1 bash -c "echo 'sh.shardCollection(\"testDb.testCollection\", {\"shardingField\" : 1})' | mongo "

The sharding key must be chosen very carefully because it is for distributing the documents throughout the cluster. It is a must to read the official documentation about shard keys.

At the end, we have a sharded cluster, a sharded database, and a sharded collection. If we need to expand our cluster architecture in the future, we can add some new nodes as demanded!

Docker (software) Database MongoDB Shard (database architecture)

Opinions expressed by DZone contributors are their own.

Related

  • Tracking Changes in MongoDB With Scala and Akka
  • Data Pipeline Using MongoDB and Kafka Connect on Kubernetes
  • Manual Sharding in PostgreSQL: A Step-by-Step Implementation Guide
  • Avoid Cross-Shard Data Movement in Distributed Databases

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!