1. Full logical backup/recovery: mongodump/mongorestore
For scenarios with a small amount of data, the official mongodump/mongorestore tools are sufficient for full backup and recovery. mongodump connects to a running mongod node and performs a logical hot backup: it traverses all collections and reads the documents one by one. It can dump multiple collections concurrently, supports archiving and compression, and can write its output to a file or to standard output (if you are interested in the internals, see my two earlier articles "Mongodump (archiving) pattern principle parsing" and "Mongorestore (archiving) pattern recovery principle parsing"). Similarly, mongorestore connects to a running mongod node and performs a logical restore, writing the backed-up documents back to the database one by one.
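As an illustration, a typical full backup and restore might look like the following sketch (the host, database name, and file paths are hypothetical; --archive, --gzip and the -j concurrency option are standard mongodump/mongorestore flags):

# Full logical backup of database "mydb", 4 collections dumped in parallel,
# written as a compressed archive file.
mongodump --host 127.0.0.1:27017 -d mydb -j 4 --gzip --archive=/backup/mydb.gz

# Restore the same archive into a running mongod.
mongorestore --host 127.0.0.1:27017 --gzip --archive=/backup/mydb.gz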
Impact on performance
Because mongodump traverses all of the data, running it affects MongoDB performance. It is best to run it against a secondary node (preferably a hidden member), and to check that the secondary's replication is keeping up.
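For example (the host names below are hypothetical), the dump can be pointed at a secondary via the replica set connection string and --readPreference, or at a hidden member directly:

# Read from a secondary of replica set rs0.
mongodump --host "rs0/mongo1.example.net:27017,mongo2.example.net:27017" --readPreference=secondary --gzip --archive=/backup/full.gz

# Or connect directly to a hidden member (hidden members are never chosen by
# read preference, so they must be addressed explicitly).
mongodump --host mongo-hidden.example.net:27017 --gzip --archive=/backup/full.gz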
Get consistent data snapshots
Because the database keeps receiving writes while mongodump is running, the result of a plain dump is not a consistent snapshot. You need to add the --oplog option so that the oplog generated during the dump is captured as well (and use the --oplogReplay option with mongorestore to replay it during recovery). MongoDB's oplog is a special fixed-size (capped) collection: when it reaches its configured size, the oldest entries are rolled out to make room for new ones. With --oplog, mongodump records the latest oplog timestamp before it starts dumping collection data, and after the collection data has been dumped it checks whether the oplog entry at that timestamp is still present. If the dump takes a long time and the oplog space is insufficient, that entry may have been rolled out and the dump will fail. Therefore, before dumping, it is best to check the configured oplog size and the current oplog growth rate (you can roughly estimate it from the volume of business writes and the average oplog entry size) to make sure the dump will not fail. Our Aliyun MongoDB service has optimized elastic scaling of the oplog to ensure that the oplog is not rolled out during a logical backup, so the backup succeeds.
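A rough way to check the oplog window before dumping, together with the corresponding dump/restore commands (host and paths are hypothetical; db.getReplicationInfo() reports the configured oplog size and the time span it currently covers):

# Inspect the oplog size and the time window it currently holds.
mongo --host mongo-hidden.example.net:27017 --quiet --eval 'printjson(db.getReplicationInfo())'

# Dump all databases plus the oplog generated during the dump.
mongodump --host mongo-hidden.example.net:27017 --oplog -o /backup/full

# Replay the captured oplog at the end of the restore to reach a consistent point.
mongorestore --host 127.0.0.1:27017 --oplogReplay /backup/full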
Backup and recovery of indexes
For the collection data itself, mongodump produces bson files. A collection's indexes are described in an accompanying metadata json file, which also records the options the collection was created with. When restoring with mongorestore, the corresponding indexes are created after the collection data has been restored.
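For a hypothetical database mydb with a collection mycoll, the dump directory therefore typically looks like this:

dump/
  mydb/
    mycoll.bson              <- the documents of the collection
    mycoll.metadata.json     <- collection options and index definitions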
2. Full physical backup/restore
For scenarios with a large amount of data, mongodump/mongorestore may take too long for both backup and recovery. For the backup, the main problem is that the longer it takes, the more likely the oplog is to be rolled out and the backup to fail. For the recovery, the process also has to rebuild indexes, so with a large volume of data and many indexes it takes even longer. For a data disaster like the Hearthstone incident, the shorter the recovery time, the better; after all, every minute of downtime in the game industry costs considerable revenue. This is where physical backup comes in. Physical backup, as the name implies, is achieved by physically copying the data files. During recovery, you can use the copied data files directly and simply start mongod on them. The biggest advantages of physical backup are its speed and the fact that no indexes need to be rebuilt during recovery.
Implementation method
Physical backup copies the data files, and all of the copied files must form a consistent data snapshot. The exact implementation therefore depends on the storage engine MongoDB uses, and the details also differ depending on whether journaling is enabled; refer to the official documentation for specifics. Regardless of the storage engine, since version 3.2 a physical backup can be taken as follows:
In the mongo shell, run the following command to flush all writes to disk and block new writes:
db.fsyncLock()
Use the snapshot feature of the underlying file system or logical volume manager to snapshot MongoDB's data directory, or copy the data directory directly with cp, scp, tar and similar commands (see the sketch after these steps).
In the mongo shell again (make sure it is the same connection as before), run the following command to allow new writes:
db.fsyncUnlock()
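As a minimal sketch of the copy step above, assuming the data directory is /var/lib/mongodb and, for the LVM variant, that it lives on a logical volume vg0/mongo-data (both names are hypothetical):

# Option A: snapshot the volume that holds the data directory.
lvcreate --size 10G --snapshot --name mongo-backup /dev/vg0/mongo-data

# Option B: copy the data directory while writes are locked.
tar czf /backup/mongodb-$(date +%F).tar.gz -C /var/lib/mongodb .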
Because db.fsyncLock() takes a global write lock and leaves the database essentially inaccessible, it is best to perform the physical backup on a secondary node (preferably a hidden member); note that you also need to make sure the node's oplog can still catch up with the primary after the physical backup finishes. Our Aliyun MongoDB team has developed a physical hot-backup method that does not require blocking writes; it should be available to everyone soon, so stay tuned!
Incremental backup
Incremental backup of MongoDB can be implemented by continuously tailing the oplog. There is no ready-made tool for this at present; you need to implement it yourself. The main challenge of tailing the oplog is the same as with a full mongodump backup: making sure the oplog entries you still need have not been rolled out. Our Aliyun MongoDB service implements automatic incremental backup which, combined with a full backup, supports restore to any point in time.
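One simple way to take incremental slices without writing a tailing program is to periodically dump local.oplog.rs with a timestamp filter. A minimal sketch (the host name and the starting timestamp {t: 1700000000, i: 1} are hypothetical; the timestamp of the last entry in each slice becomes the starting point of the next one):

# Dump all oplog entries newer than the last backed-up timestamp.
mongodump --host mongo-hidden.example.net:27017 -d local -c oplog.rs \
  --query '{"ts": {"$gt": {"$timestamp": {"t": 1700000000, "i": 1}}}}' \
  -o /backup/oplog-inc

# One common way to apply a slice on top of a full restore is to rename the
# dumped oplog.rs.bson to oplog.bson in an otherwise empty dump directory
# and replay it:
#   mongorestore --oplogReplay /path/to/dir-containing-oplog.bson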
3. Backup/restore of a Sharding cluster
Hearthstone is run as an independent service, so a distributed database may well sit behind it. For a distributed database, backup and recovery are more complex than for a single node: the database consists of multiple nodes, usually with different roles. Take MongoDB's Sharding cluster as an example: it contains a config server holding the metadata and several shards holding the data, and the most important piece of metadata is the distribution of data across the shards.

For backing up multiple nodes, one difficulty is ensuring that the data backed up on all nodes corresponds to the same point in time. The conventional approach is to stop external writes and then back up, which is obviously unacceptable for an Internet service; instead, you can back up on secondaries that have stopped applying replication, which yields backups taken at roughly the same time. Another difficulty is that data is usually being migrated between data nodes, and a migration modifies data on at least two shards as well as the metadata on the config server. If a migration happens during the backup, it is hard to guarantee that the backed-up data and metadata are in a consistent state, so data migration usually has to be turned off for the duration of the backup. The official MongoDB documentation takes this approach: first stop the balancer responsible for data migration, then back up the config server and each shard in turn on their secondaries. The biggest problem with stopping migration is that the cluster cannot rebalance while it is off, which both hurts access performance and wastes resources; when the data volume is large and the backup takes a long time, the impact can be significant.
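A minimal sketch of that official procedure (all host names are hypothetical; sh.stopBalancer() and sh.startBalancer() are the standard mongo shell balancer helpers):

# 1. Stop the balancer via any mongos.
mongo --host mongos.example.net:27017 --eval 'sh.stopBalancer()'

# 2. Back up the config server (ideally a secondary of the config replica set).
mongodump --host cfgsvr-sec.example.net:27019 --oplog -o /backup/configdb

# 3. Back up each shard, again on a secondary of each shard's replica set.
mongodump --host shard1-sec.example.net:27018 --oplog -o /backup/shard1
mongodump --host shard2-sec.example.net:27018 --oplog -o /backup/shard2

# 4. Re-enable the balancer once all backups have finished.
mongo --host mongos.example.net:27017 --eval 'sh.startBalancer()'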