In large-scale deployments, syncing data between MongoDB clusters across regions or environments becomes critical. Whether for disaster recovery, migrations, or high availability, Cluster-to-Cluster Sync ensures data consistency between clusters. In this blog, we will explore how cluster-to-cluster synchronization works in MongoDB, the architecture, and practical steps for setting it up.
Cluster-to-cluster synchronization continuously replicates data from a source MongoDB cluster to a separate destination cluster, keeping the two consistent across geographic locations or cloud providers. Starting with MongoDB 6.0, this is supported through the mongosync tool, which facilitates disaster recovery, hybrid cloud setups, and high-availability architectures.
The architecture for MongoDB cluster-to-cluster synchronization typically involves the following components:
MongoDB Cluster-to-Cluster Sync helps keep two clusters (databases) in sync, making sure both have the same data. Here’s a simple explanation of how it works:
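In this section, we will focus on configuring Cluster-to-Cluster Sync using mongosync. There are three options available for synchronization: connecting two self-managed clusters, connecting two Atlas clusters, and connecting a self-managed cluster to Atlas.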
In this blog, we will concentrate on connecting two self-managed clusters.
Stay tuned for upcoming blogs, where we will cover the other options for cluster synchronization, including connecting two Atlas clusters and connecting a self-managed cluster to Atlas.
You can obtain the Cluster-to-Cluster Sync tool (Mongosync) as a .tgz tarball from the official MongoDB website. Make sure to download the correct version based on your operating system.
Once downloaded, follow the detailed installation steps outlined in the MongoDB Mongosync Installation Guide. It includes platform-specific instructions to ensure a smooth installation process.
There are two common ways to initialize Mongosync, offering flexibility based on your preferences: via command-line or configuration file. Let’s explore both methods.
You can initialize Mongosync directly from the command line using the following command:
mongosync \
  --cluster0 "mongodb://mafadmin:maf313@172.17.0.13:27017,172.17.0.13:27018" \
  --cluster1 "mongodb://mafadmin:maf313@172.17.0.14:27017,172.17.0.14:27018"
Alternatively, you can configure Mongosync using a configuration file for more control. Here is an example configuration file:
cat /etc/mongosync.conf
cluster0: "mongodb://mafadmin:maf313@172.17.0.13:27017,172.17.0.13:27018"
cluster1: "mongodb://mafadmin:maf313@172.17.0.14:27017,172.17.0.14:27018"
logPath: "/var/log/mongosync"
Note that logPath names the log directory; Mongosync writes its mongosync.log file inside it. Once the config file is ready, initialize Mongosync with the following command:
mongosync --config /etc/mongosync.conf
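Because the connection strings embed credentials, you may prefer to generate the file rather than edit it by hand. Here is a minimal sketch, assuming the values are supplied through environment variables. The function name and the SOURCE_URI, DEST_URI, and LOG_PATH variable names are our own illustrative choices, not part of Mongosync:

```shell
# Hypothetical helper: write a mongosync config file from environment variables.
# SOURCE_URI, DEST_URI and LOG_PATH are assumed names, not mongosync settings.
write_mongosync_conf() {
  local conf_path="$1"
  cat > "$conf_path" <<EOF
cluster0: "$SOURCE_URI"
cluster1: "$DEST_URI"
logPath: "$LOG_PATH"
EOF
  # The file contains passwords, so restrict it to the owning user.
  chmod 600 "$conf_path"
}

# Usage:
#   export SOURCE_URI="mongodb://user:pass@172.17.0.13:27017,172.17.0.13:27018"
#   export DEST_URI="mongodb://user:pass@172.17.0.14:27017,172.17.0.14:27018"
#   export LOG_PATH="/var/log/mongosync"   # directory where Mongosync writes its log
#   write_mongosync_conf /etc/mongosync.conf
```

Restricting the file to mode 600 is a precaution on our side, since the URIs are stored in plain text.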
You can verify if Mongosync is running correctly by checking the log file:
tail -f /var/log/mongosync/mongosync.log
Look for entries similar to this to confirm initialization:
{"time":"2024-10-16T05:10:19.115377Z","level":"info","message":"Running webserver."}
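If you want to script this check instead of watching the log, something like the following could work. It is only a sketch that assumes the log message text matches the entry shown above; the function name is our own:

```shell
# Hypothetical helper: return 0 if the log file contains the web server
# startup message, 1 otherwise. The log file path is passed as an argument.
webserver_started() {
  local log_file="$1"
  grep -q 'Running webserver\.' "$log_file"
}

# Usage:
#   webserver_started /var/log/mongosync/mongosync.log && echo "Mongosync API is up"
```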
To start syncing data between the clusters, use the following curl command to initiate the process via Mongosync’s API:
curl localhost:27182/api/v1/start -X POST \
  --data '{ "source": "cluster0", "destination": "cluster1" }'
If successful, you should see:
{"success":true}
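When calling the API from a script, it helps to check the response rather than read it by eye. The helper below is a minimal sketch with a name of our own choosing; it assumes the compact {"success":true} form shown above and treats any other response as a failure:

```shell
# Hypothetical helper: read a mongosync API response on stdin and succeed
# only if it reports success. Any other response is echoed to stderr.
api_succeeded() {
  local response
  response=$(cat)
  case "$response" in
    *'"success":true'*) return 0 ;;
    *) printf 'mongosync request failed: %s\n' "$response" >&2
       return 1 ;;
  esac
}

# Usage:
#   curl -s localhost:27182/api/v1/start -X POST \
#     --data '{ "source": "cluster0", "destination": "cluster1" }' \
#     | api_succeeded && echo "sync started"
```

The same check can be applied to the pause, resume, and commit requests, which return the same response shape.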
You can monitor the sync process by querying the current status using this command:
curl localhost:27182/api/v1/progress -X GET
The response will provide detailed insights, including whether Mongosync is actively syncing:
{"progress":{"state":"RUNNING","canCommit":true,"canWrite":false,"info":"change event application","lagTimeSeconds":0,"collectionCopy":{"estimatedTotalBytes":3953412,"estimatedCopiedBytes":3953412}}}
There might be situations where you need to pause the synchronization process. Use the following command to pause Mongosync:
curl localhost:27182/api/v1/pause -X POST --data '{ }'
If successful, the API returns the following response and Mongosync transitions to the “PAUSED” state:
{"success":true}
Tip: If you plan to pause for an extended period, consider increasing the size of the source cluster’s oplog to prevent issues during resumption.
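As a rough sketch of how the oplog could be resized, MongoDB provides the replSetResizeOplog admin command, which takes the new size in megabytes. The helper name, the host, and the 16000 MB figure below are illustrative assumptions, and to our understanding the command has to be run on each member of the source replica set:

```shell
# Hypothetical helper: build the mongosh expression that resizes the oplog
# to the given size in megabytes.
oplog_resize_expr() {
  printf 'db.adminCommand({ replSetResizeOplog: 1, size: %d })' "$1"
}

# Usage (assumed host and size; add credentials as your deployment requires):
#   mongosh "mongodb://172.17.0.13:27017" --eval "$(oplog_resize_expr 16000)"
```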
To resume the synchronization process after a pause, execute the following command:
curl localhost:27182/api/v1/resume -X POST --data '{ }'
A successful response will indicate that the sync has resumed:
{"success":true}
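Once the initial copy has finished and the destination has caught up with the source, commit the sync to finalize it. Committing ends continuous synchronization, so treat it as the cutover step and run it only when you are ready to stop replicating from the source.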
Before committing, verify that the synchronization is ready for a commit:
curl localhost:27182/api/v1/progress -X GET
Check the canCommit flag in the response:
{"progress":{"canCommit":true,"info":"change event application"}}
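To read a single field such as canCommit, state, or lagTimeSeconds out of the progress response in a script, a small filter is enough. This is a sketch that assumes a top-level scalar field in responses shaped like the ones shown in this post; the function name is ours, and a proper JSON parser such as jq would be more robust if it is available:

```shell
# Hypothetical helper: read a progress response on stdin and print the value
# of one scalar field (string, number, true or false).
progress_field() {
  local field="$1"
  sed -n 's/.*"'"$field"'":[[:space:]]*"\{0,1\}\([^",}]*\).*/\1/p'
}

# Usage:
#   curl -s localhost:27182/api/v1/progress -X GET | progress_field canCommit
#   curl -s localhost:27182/api/v1/progress -X GET | progress_field lagTimeSeconds
```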
Once you’ve verified that the sync is ready, issue the commit request:
curl localhost:27182/api/v1/commit -X POST --data '{ }'
A successful commit will return:
{"success":true}
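The verify-then-commit sequence above can also be combined into one polling loop. The following is a sketch under our own assumptions: the function names, the MONGOSYNC_URL variable, and the default of 30 checks at 10-second intervals are illustrative, while the endpoints are the ones used earlier in this post:

```shell
# Base URL of the mongosync API (assumed variable name; defaults to the port used above).
MONGOSYNC_URL="${MONGOSYNC_URL:-http://localhost:27182}"

fetch_progress() { curl -s "$MONGOSYNC_URL/api/v1/progress" -X GET; }
send_commit() { curl -s "$MONGOSYNC_URL/api/v1/commit" -X POST --data '{ }'; }

# Hypothetical helper: poll progress until canCommit is true, then commit.
# $1 = maximum number of checks (default 30), $2 = seconds between checks (default 10).
commit_when_ready() {
  local max_attempts="${1:-30}"
  local delay="${2:-10}"
  local attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if fetch_progress | grep -q '"canCommit":[[:space:]]*true'; then
      send_commit
      return 0
    fi
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "canCommit was still not true after $max_attempts checks" >&2
  return 1
}

# Usage:
#   commit_when_ready 60 5
```

Giving up after a fixed number of checks is a deliberate choice here, so that a stalled sync surfaces as an error instead of a script that waits forever.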
By following these steps, you can smoothly install, initialize, monitor, and manage Mongosync for your Cluster-to-Cluster synchronization needs, ensuring efficient data replication between MongoDB clusters.
At Mafiree, we simplify MongoDB Cluster-to-Cluster Sync, ensuring smooth, secure, and hassle-free data migration, disaster recovery, and cross-region replication. With our in-depth experience, we offer end-to-end services tailored to your business needs from initial consultation to full implementation and ongoing support.
Ready to elevate your database infrastructure? Let Mafiree handle the complexities so you can focus on growing your business. Contact us today and let’s start building a seamless data future together!