Summary: data must be backed up. What happens if your database gets deleted?
The code repository for this article: Fundebug/fundebug-mongodb-backup
In August this year, Tencent Cloud lost a customer's data in an accident. Fundebug published a brief technical analysis at the time:
On the one hand, Tencent Cloud bears unshirkable responsibility for the incident. It first blamed a hard disk firmware bug (that statement has since been deleted), but later admitted that the loss was caused by human error.
On the other hand, the affected customer did not back up its own business data, which is also very unprofessional; as a result the business could not be recovered and had to restart from scratch.
Every developer should learn from this incident: do not be lazy, back up business data rigorously, because once the data is gone the loss is irreparable.
Fundebug data backup scheme
We also share Fundebug's data backup scheme for your reference:
Backup scheme (time granularity): details
- MongoDB replica set (real time): a replica set with 3 nodes (1 Primary and 2 Secondary) that synchronizes data in real time.
- Alibaba Cloud disk snapshots (daily): all disks, including system disks and backup data disks, are snapshotted automatically in the early hours of every day.
- mongodump export of core data (daily): MongoDB core data is exported in the early hours of every morning to the disk of a server outside the replica set (that disk is snapshotted daily).
- Alibaba Cloud object storage (daily): the data exported by mongodump is encrypted with gpg asymmetric encryption and uploaded to object storage in Aliyun's Shenzhen data center in the early hours of every morning; cross-region replication automatically synchronizes it to the Hangzhou data center, and each backup is retained for one month.
- Local disk backup (weekly): the encrypted backup data is downloaded from Alibaba Cloud object storage at noon every Saturday and stored on a local disk.
Probably because we had not published the technical details of our backup plan, we were questioned: some even suggested that our multiple backups were fake.
My principle when facing this kind of accusation is to push back. So in this post I will describe our data backup plan in detail. All the source code is in the GitHub repository Fundebug/fundebug-mongodb-backup; stars are welcome.
MongoDB replica set
A production environment should essentially never run a single-node MongoDB database, unless traffic is very low or you do not care about service availability at all. A single node is a single point of failure (SPOF): once it goes down, the whole application goes down with it. Worse, if the data is corrupted, recovery is very troublesome.
There are many ways MongoDB can fail; the most common is a surge in memory usage at peak times that causes Linux's Out of Memory (OOM) killer to kill the mongod process. Fundebug has run into this many times, so how did we get through it safely? The answer is the replica set.
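As an aside, a quick way to confirm that it really was the OOM killer that terminated mongod is to check the kernel log. A minimal example (not part of our backup scripts):

# check the kernel log for OOM killer activity involving mongod
dmesg | grep -iE "out of memory|killed process" | grep -i mongod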
A replica set consists of multiple MongoDB nodes whose data is synchronized in real time, so their data is almost identical. When one node goes down, the application can automatically switch to another node, which keeps the service available.
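For readers who have never set one up, here is a minimal sketch of how a 3-node replica set named rs0 could be initiated from the mongo shell once each mongod is running with --replSet rs0 (as in the Compose file below). The IP addresses mirror the hosts that appear in dump-data.sh later in this article and are purely illustrative:

# initiate a 3-node replica set called rs0 (illustrative hosts)
mongo --host 192.168.59.11:27017 --eval '
rs.initiate({
    _id: "rs0",
    members: [
        { _id: 0, host: "192.168.59.11:27017" },
        { _id: 1, host: "192.168.59.12:27017" },
        { _id: 2, host: "192.168.59.13:27017" }
    ]
})'

# check the state of the replica set
mongo --host 192.168.59.11:27017 --eval 'rs.status()'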
Fundebug's MongoDB runs in Docker containers; the Docker Compose configuration file is as follows:
version: '2.2'
services:
    mongo:
        image: mongo:3.2
        network_mode: "host"
        restart: always
        cpus: 7
        mem_limit: 30g
        command: --replSet rs0 --oplogSize 25600
        volumes:
            - /mongodb/data:/data/db
        logging:
            driver: "json-file"
            options:
                max-size: "5g"

oplog
A very important parameter of a replica set is the oplog size, which can be specified with the --oplogSize option. We set it to 25600MB, i.e. 25GB. The oplog (operation log) is the key to data synchronization between replica set nodes: the Primary records every database write in the oplog, and the Secondary nodes copy the oplog from the Primary and apply it to their local databases. The oplog size therefore determines the maximum replication lag a Secondary can accumulate and still catch up with the Primary. Use rs.printReplicationInfo() to view the oplog information:
rs.printReplicationInfo()
configured oplog size:   25600MB
log length start to end: 11409secs (3.17hrs)
oplog first event time:  Sat Sep 22 2018 12:02:04 GMT+0800 (CST)
oplog last event time:   Sat Sep 22 2018 15:12:13 GMT+0800 (CST)
now:                     Sat Sep 22 2018 15:12:13 GMT+0800 (CST)
As shown, the oplog currently holds only the last 3.17 hours of database writes. If a node in the replica set goes down and misses, say, 4 hours of synchronization, it will not be able to catch up with the other nodes after restarting: it will report a "too stale to catch up -- entering maintenance mode" error, and the data can only be resynchronized manually.
Therefore, we recommend setting the oplog size generously from the start, because changing it later is troublesome. In fact, a 25GB oplog is no longer enough for Fundebug's MongoDB replica set, and we will need to enlarge it.
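As a hedged note on what resizing involves: MongoDB 3.6 and later provide the replSetResizeOplog admin command, which resizes the oplog online on each node, whereas the mongo:3.2 image used above only supports the longer manual procedure described in the MongoDB documentation. A sketch for the newer versions (the 50GB value is illustrative):

# MongoDB 3.6+ only: resize the oplog of the current node to 50GB (size is in MB)
mongo --eval 'db.adminCommand({ replSetResizeOplog: 1, size: 51200 })'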
Fundebug's MongoDB replica set consists of 1 Primary node and 2 Secondary nodes, and it plays a key role in keeping our service available. The backup measures introduced below are all additional layers of redundancy: we have never actually had to restore from those backups, while the replica set has "saved" us many times, so I strongly recommend that everyone configure one.
I will cover more technical details of the MongoDB replica set in a separate post later. You are welcome to follow the Fundebug WeChat official account.
Alibaba Cloud disk snapshot
A snapshot preserves the state of a disk at a point in time, so it can serve as a way to back up data. Using it is simple: just configure an automatic snapshot policy.
I snapshot the system disks so that, in case of data loss such as accidental deletion, the disks can at least be rolled back. Snapshots are taken once a week and kept for 7 days. Because all services run in Docker, the servers themselves carry little configuration and there is little that needs backing up; in fact we have never had to roll back a disk.
In addition, I did not snapshot the MongoDB data disks directly, because I found that data restored from such snapshots could not be recovered correctly (this remains to be confirmed).
I only snapshot the disk that holds the core data exported by mongodump, once a day, kept for two days. This ensures the safety of the core data.
Mongodump exports core data
With the mongodump command, you can export all of the MongoDB data; correspondingly, the backup data can later be imported back into MongoDB with the mongorestore command.
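For illustration only (the host string and path are assumptions that mirror the dump script below), a restore might look like this:

# restore the fundebug-production database from a mongodump output directory
mongorestore --host "rs0/192.168.59.11:27017,192.168.59.12:27017,192.168.59.13:27017" \
    --db fundebug-production \
    /data/mongodb_backup/201809220400/fundebug-production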
The script dump-data.sh for exporting the data is as follows:
#!/bin/sh

# delete the data exported the previous day
rm -rf /data/mongodb_backup

DIR=`date +%Y%m%d%H%M`
OUT=/data/mongodb_backup/$DIR
mkdir -p $OUT

# full export of the MongoDB data (excluding some collections)
mongodump --host "rs0/192.168.59.11:27017,192.168.59.12:27017,192.168.59.13:27017" \
    --db fundebug-production \
    --excludeCollection events \
    --out $OUT
The --excludeCollection option excludes collections that do not need to be backed up. For example, Fundebug has accumulated more than 600 million error events, which are stored in the events collection. Since we have already aggregated them, they do not need to be backed up, and their volume is so large that backing them up would not be practical anyway.
Use crontab to run the dump-data.sh script periodically:
# export data at 4 am every day
0 4 * * * /root/fundebug-mongodb-backup/dump-data.sh

Alibaba Cloud object storage
The data exported by mongodump is stored on the data disk of a test server. Geographically, however, everything is still in one place: Aliyun's Shenzhen data center. To achieve off-site backup, we use the cross-region replication feature of Aliyun's object storage service to automatically synchronize the backup data to the Hangzhou data center.
Before uploading the backup data, we encrypt it asymmetrically with the gpg command to ensure data security. The encryption script encrypt-data.sh is as follows:
#!/bin/bash

DIR=`find /data/mongodb_backup/ -maxdepth 1 -type d ! -path /data/mongodb_backup/`
source=$DIR/fundebug-production
cd $source

# encrypt the exported data files
for file in *; do
    gpg --batch --yes -v -e -r fundebug --output $source/$file.gpg --always-trust $file
done
Besides encrypting, gpg also compresses the data to some extent, which reduces the size of the backups: two birds with one stone. For details on the gpg command, see the reference blogs listed at the end.
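For completeness, here is a hedged sketch of the matching decryption step during a restore (the file name is illustrative; the private key of the fundebug recipient used above must be available in the local keyring):

# decrypt a single encrypted backup file before feeding it to mongorestore
gpg --batch --yes --output dump.bson --decrypt dump.bson.gpg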
Using ali-oss, the Node.js client provided by Aliyun, the encrypted .gpg files can be uploaded to the object storage service with its multipartUpload method. The upload.js code is as follows:
// upload a single file
// (store, logger and fundebug are created elsewhere in upload.js)
async function uploadFile(fileName, filePath) {
    try {
        const result = await store.multipartUpload(fileName, filePath, {
            parallel: 4,
            partSize: 1024 * 1024,
            progress: function(p) {
                logger.info("Progress: " + p);
            }
        });

        if (result.res.statusCode === 200) {
            logger.info(`upload file success! ${fileName}`);
        } else {
            const message = `upload file fail! ${fileName}`;
            logger.error(message);
            logger.error(result);
            fundebug.notifyError(new Error(message), {
                metaData: {
                    message: message,
                    result: result
                }
            });
        }
    } catch (error) {
        const message = `upload file fail! ${fileName}`;
        logger.error(message);
        logger.error(error);
        fundebug.notifyError(error, {
            metaData: {
                message: message,
                error: error
            }
        });
    }
}
The code runs in a Docker container. The upload is triggered by calling the HTTP API /upload with the curl command, which crontab executes periodically:
# back up data at 4 am every day
0 4 * * * /root/mongodb-backup/dump-data.sh && /root/mongodb-backup/encrypt-data.sh && docker restart mongodb-backup && sleep 1m && curl http://127.0.0.1:9160/upload
The backup data is mapped into the container through a data volume (volume), and the container has to be restarted every day so that it picks up the newly exported data.
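A rough sketch of how such a container could be started, with the exported data mounted as a volume and the HTTP API exposed; the image name and paths are assumptions, not the actual deployment (the container name mongodb-backup and port 9160 come from the crontab entry above):

# run the backup/upload service with the host backup directory mounted as a data volume
docker run -d --name mongodb-backup \
    -v /data/mongodb_backup:/data/mongodb_backup \
    -p 9160:9160 \
    mongodb-backup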
Configuring cross-region replication for the bucket that stores the backup data on Alibaba Cloud is very convenient and gives us automatic off-site backup. Other object storage cloud services should support a similar feature.
Local disk backup
In fact, all the backup methods mentioned so far copy data within Aliyun. So the question is: what if Aliyun itself fails? Of course, that is basically impossible; after all, we have multiple backups, even off-site ones.
Since the backup data has already been uploaded to Aliyun object storage, downloading a copy to a local machine is not difficult. This is implemented with the list and get methods of ali-oss. The download.js code is as follows:
// get the list of files uploaded to Aliyun OSS on the same day
// (store, logger and fundebug are created elsewhere in download.js)
async function listFilesToDownload(day) {
    const result = await store.list({ prefix: day });
    return result.objects;
}

// download a file from Aliyun OSS to the local disk
async function downloadFile(fileName, path) {
    try {
        const file = fileName.split("/")[1];
        const filepath = `${path}/${file}`;
        await store.get(fileName, filepath);
    } catch (error) {
        const message = `download file fail! ${fileName}`;
        logger.error(message);
        logger.error(error);
        fundebug.notifyError(error, {
            metaData: {
                error: error,
                message: message
            }
        });
    }
}
The code runs in a Docker container deployed on a local machine. The download is triggered by calling the HTTP API /download with the curl command, which crontab executes periodically:
# download backup data from Aliyun at noon every Saturday
0 12 * * 6 curl http://127.0.0.1:9160/download

Conclusion
All the data backup methods described in this article are fully automated. There is no real technical difficulty and the cost is not high, yet they greatly improve data security.
References
- MongoDB killed by the Linux OOM killer
- Understand and configure OOM Killer under Linux
- MongoDB documentation: Replication
- Alibaba Cloud MongoDB backup and restore: feature description and principles
- MongoDB documentation: mongodump
- GPG Encryption Guide - Part 1
- GPG Encryption Guide - Part 2 (Asymmetric Encryption)

About Fundebug
Fundebug specializes in real-time BUG monitoring for JavaScript, WeChat Mini Programs, WeChat Mini Games, React Native, Node.js and Java. Since its official launch on Singles' Day 2016, Fundebug has handled more than 800 million errors and has been recognized by many well-known users such as Google, 360 and Kingsoft. You are welcome to try it for free!
Copyright notice
When reprinting, please credit the author Fundebug and include the link to this article:
https://blog.fundebug.com/2018/09/27/how-does-fundebug-backup-data/