Understanding MongoDB Replication: A Step-by-Step Replica Set Creation.

Published in

ITNEXT

6 min read · Jan 25, 2024


In this article, complemented by a companion video, we explore key aspects of MongoDB replication, including high availability and data redundancy. Follow along as we walk step by step through setting up a replica set cluster.

The article is organized as follows:

Understanding Replication
Creating the Replica Set Cluster
Inserting data
Simulating Disasters
Connecting to the Replica with MongoDB Compass
Conclusion
References

Understanding Replication

Guys, replication exists primarily to provide data redundancy and high availability. Data redundancy means that MongoDB keeps copies of the same data on multiple servers, ideally physically isolated from one another, which helps with recovery in disaster scenarios.

High availability means that the database stays accessible even when a node fails or is taken down for planned maintenance, because another node can take over.

Additionally, replication enhances scalability by allowing read operations to be distributed among multiple nodes. This not only improves performance but also supports the growing demands of data-intensive applications.

For more details, please see my video clip "Improving MongoDB Read Operations" (58 seconds, on youtube.com).

Creating the Replica Set Cluster

All right, folks, our replica set will have three nodes, running on ports 27017, 27018, and 27019.

I’m using Windows, and I already have MongoDB installed on my machine. First of all, we need to create three folders, one for each node.

After creating the three folders, let’s set up the first node, 27017, using the following command inside the ‘bin’ folder:

mongod --dbpath ../27017 --port 27017 --replSet "rs0"

We’ll use mongod, specify the folder, set the port, and provide the replica set name (rs0).

Now let’s run the same command twice more, changing only the folder and port for nodes 27018 and 27019 while keeping replSet "rs0". After this, we'll have all three nodes running and waiting for the cluster to be initialized.
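To avoid typos when repeating the command, here is a minimal sketch that prints the three command lines. The helper name mongodCommand is hypothetical, and the ../<port> folder layout is simply the one used in this article:

```javascript
// Hypothetical helper: builds the mongod command line for one node.
// Assumes one data folder per node, named after its port (../27017, ...).
function mongodCommand(port, replSet = "rs0") {
  return `mongod --dbpath ../${port} --port ${port} --replSet "${replSet}"`;
}

const nodePorts = [27017, 27018, 27019];
const mongodCommands = nodePorts.map((port) => mongodCommand(port));

// Print one command per line so each can be pasted into its own prompt.
console.log(mongodCommands.join("\n"));
```

Each command still has to run in its own prompt, since mongod stays in the foreground.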

Great, everyone! Our replica set is almost ready. We just need to initialize it. To do this, let’s connect to node 27017 by running the following command inside the ‘bin’ folder:

mongosh --port 27017

Running rs.status() at this point shows that our replica set has not been configured yet. So, let's do that:

rs.initiate({_id: "rs0", version: 1, members:[{_id:0, host: "localhost:27017"}, {_id: 1, host: "localhost:27018"}, {_id: 2, host: "localhost:27019"}] } )
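The configuration document above is easier to read when built from a list of hosts. The following is only a sketch in plain JavaScript; the function name buildReplicaSetConfig is an assumption for illustration, not part of MongoDB:

```javascript
// Hypothetical helper: builds the same document passed to rs.initiate() above.
// Each host gets a member _id equal to its position in the list.
function buildReplicaSetConfig(name, hosts) {
  return {
    _id: name,
    version: 1,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

const rsConfig = buildReplicaSetConfig("rs0", [
  "localhost:27017",
  "localhost:27018",
  "localhost:27019",
]);

console.log(JSON.stringify(rsConfig, null, 2));
// In mongosh, this object could then be passed as rs.initiate(rsConfig).
```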

Great. Our replica set is initialized. Now, let’s open another prompt and connect to node 27018 using the command mongosh --port 27018 and run rs.status() to analyze some information:

As you can see, node 27017 is marked as primary, while nodes 27018 and 27019 are secondary.
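The rs.status() output is long, so it can help to reduce it to just the roles. Here is a small sketch that does this using the name and stateStr fields of each member. The sample object is hand-written for illustration and contains only those two fields, and summarizeMembers is a hypothetical helper name:

```javascript
// Hypothetical helper: extracts the primary and the secondaries
// from an object shaped like the result of rs.status().
function summarizeMembers(status) {
  const namesInState = (state) =>
    status.members.filter((m) => m.stateStr === state).map((m) => m.name);
  return {
    primary: namesInState("PRIMARY")[0] ?? null,
    secondaries: namesInState("SECONDARY"),
  };
}

// Hand-written sample; real rs.status() output has many more fields.
const sampleStatus = {
  set: "rs0",
  members: [
    { name: "localhost:27017", stateStr: "PRIMARY" },
    { name: "localhost:27018", stateStr: "SECONDARY" },
    { name: "localhost:27019", stateStr: "SECONDARY" },
  ],
};

const memberSummary = summarizeMembers(sampleStatus);
console.log(memberSummary);
```

In mongosh, the same function could be called as summarizeMembers(rs.status()).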

Inserting data

Now, let’s insert some data to see replication in action.

Remember, MongoDB always writes data to the primary node.

So, let’s connect to our 27017 node, which is the primary, and run the following commands:

1. Switch to a database named “new_database” (MongoDB creates it when the first document is inserted):

use new_database

2. Insert a document into the ‘person’ collection:

db.person.insertOne({name: 'Ricardo', age: 34})

Great. Now, to check if replication has worked, let’s go to node 27019 and verify if the ‘person’ collection exists there:

Excellent, our collection has been replicated. Now, if we want to read this data directly from node 27019, we need to tell MongoDB that reads from this secondary node are allowed.

This topic is crucial, and you can find more details about scalable reads in my video.

To achieve this, we need to set the read preference to ‘secondary’:

db.getMongo().setReadPref('secondary')

Simulating Disasters

Now it’s time to simulate a disaster in our cluster. To do this, I’ll stop node 27017, the primary node that receives all write operations:

Great, as you can see, our primary node has been shut down. Now, let’s return to the 27019 terminal, run rs.status(), and observe that MongoDB has automatically elected 27018 to be the new primary node:
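The election works here because a new primary needs votes from a majority of the voting members, and two of our three nodes are still up. The arithmetic can be sketched as follows (the function names are illustrative only):

```javascript
// Votes needed to elect a primary: more than half of the voting members.
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True when enough voting members are reachable to elect a primary.
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

console.log(majority(3));           // votes needed in our 3-node set
console.log(canElectPrimary(3, 2)); // one node down
console.log(canElectPrimary(3, 1)); // two nodes down
```

So a 3-node replica set tolerates the loss of one node. If a second node also went down, the remaining node could not be elected primary and would not accept writes.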

Connecting to the Replica with MongoDB Compass

Great, everyone, we’re nearing the end. Now, let’s connect to this replica set that we’ve created.

To do this, remember to bring up node 27017 again and have all 3 nodes running.

Open MongoDB Compass and connect using the following connection string:

mongodb://localhost:27017,localhost:27018,localhost:27019/?replicaSet=rs0
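In a MongoDB connection string, options such as replicaSet are passed as query parameters after "/?". As a rough sketch, the string can be assembled like this; replicaSetUri is a hypothetical helper, and readPreference is shown only as an example of an extra option:

```javascript
// Hypothetical helper: builds a replica set connection string.
// Options are appended as query parameters after "/?".
function replicaSetUri(hosts, replicaSet, options = {}) {
  const params = new URLSearchParams({ replicaSet, ...options });
  return `mongodb://${hosts.join(",")}/?${params.toString()}`;
}

const rsHosts = ["localhost:27017", "localhost:27018", "localhost:27019"];

const basicUri = replicaSetUri(rsHosts, "rs0");
console.log(basicUri);

// Same set, but asking the driver to read from secondaries.
const secondaryReadsUri = replicaSetUri(rsHosts, "rs0", {
  readPreference: "secondary",
});
console.log(secondaryReadsUri);
```

Listing all three hosts means the client can still discover the replica set when one of the nodes is down.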

Great, as you can see, the ‘person’ collection exists in the ‘new_database’ database. Now, let’s insert some records so we can run an aggregation pipeline:

Sample:


{
  "_id": {
    "$oid": "65b2bff5a85f246b7dc95846"
  },
  "name": "Maria",
  "age": 59
}

Very well, now enter 3 more records: Henrique 20, Jose 62, Maria 59:

Very well, now let’s go to the Aggregation tab and filter only the people who are over 50 years old:

To do that, let’s click on Stage, select Match, and include:

{
   age: { $gt: 50 } 
}
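To make the effect of this stage concrete, here is a plain JavaScript stand-in that applies the same condition to an in-memory array. It assumes the collection holds the four people mentioned in this article (Ricardo, Maria, Henrique, and Jose), and matchAgeGreaterThan is a hypothetical name:

```javascript
// Assumed contents of the 'person' collection after the inserts above.
const people = [
  { name: "Ricardo", age: 34 },
  { name: "Maria", age: 59 },
  { name: "Henrique", age: 20 },
  { name: "Jose", age: 62 },
];

// Plain-JavaScript stand-in for the stage { $match: { age: { $gt: 50 } } }.
function matchAgeGreaterThan(docs, minAge) {
  return docs.filter((doc) => doc.age > minAge);
}

const over50 = matchAgeGreaterThan(people, 50);
console.log(over50.map((doc) => doc.name));

// In mongosh, the equivalent pipeline would be:
// db.person.aggregate([{ $match: { age: { $gt: 50 } } }])
```

Note that $gt is strictly greater than, so a person aged exactly 50 would not match.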

Conclusion

In summary, MongoDB replication offers a robust solution for data redundancy and high availability. We covered the essentials, from creating a replica set to simulating a disaster and observing the automatic failover. For a deeper dive and detailed guidance, check out my video.


I hope this has been informative and helpful. Until next time! 👋