What are RDB and AOF?

2025-01-18 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains what RDB and AOF are: the two persistence mechanisms Redis provides, their advantages and drawbacks, and how to configure, repair, and back them up in practice.

Redis persistence

Redis provides different levels of persistence options:

RDB (Redis Database Backup) persistence takes point-in-time snapshots of the dataset at configured intervals, for example one snapshot per hour.

AOF (Append Only File) persistence logs every write operation the server receives and replays the log at startup to reconstruct the original dataset. Commands are logged in an append-only fashion, in the same format as the Redis protocol itself. When the log grows too large, Redis can rewrite it in the background to produce a minimal version of the file.

You can also disable persistence completely, for example if the data only needs to exist while the server is running, or if it is cached data that can be regenerated automatically.

You can also combine AOF and RDB on the same Redis instance. Note that in this case, when Redis restarts, the AOF file is used to rebuild the original dataset, because AOF is considered a more complete record than the periodic RDB snapshot: with fsync every second, for example, it loses at most about 1 second of data.

Next, let's compare the advantages and disadvantages of RDB and AOF:

Advantages of RDB

RDB is a compact single-file representation of your Redis data at a point in time, which makes RDB files well suited to backups. For example, you can keep hourly snapshots for the past 24 hours and daily snapshots for the past 30 days. When data is lost, you can restore whichever version of the dataset you need.

RDB is very good for disaster recovery, because a compact single file is easy to transfer to a remote data center or to Amazon S3 (object storage; the file can be encrypted before upload).

RDB maximizes Redis performance, because the Redis parent process only needs to fork a child process to complete the snapshot; the parent process itself never performs the disk I/O of the backup.

Compared with AOF, RDB allows faster restarts when recovering large datasets.

Shortcomings of RDB

If you want to minimize the data lost when Redis stops unexpectedly (for example after a power outage), RDB is not a good solution. You can configure different save points at which an RDB file is produced (for example, one save point after at least 5 minutes and 100 writes to the dataset, and you can configure multiple save points). However, snapshots are usually created every 5 minutes or more, so when Redis stops abnormally you lose the writes made since the last snapshot.

RDB calls the system fork() function to spawn a child process that persists the data to disk. If the dataset is large, fork() can be time-consuming and Redis may stop serving clients for some milliseconds. If the dataset is very large and CPU performance is poor, the pause can reach 1 second or more. AOF also calls fork() when rewriting the log, but you can tune how often the log is rewritten without any trade-off in durability.

Advantages of AOF

AOF is more durable. You can configure different fsync policies:

No fsync at all

fsync every second

fsync on every write query

Note: fsync (https://man7.org/linux/man-pages/man2/fsync.2.html) is a system call that flushes data cached in the kernel to the storage device, for example writing buffered data to a hard disk.

The default policy is fsync every second. Write performance is still very good under this policy, because fsync runs on a background thread and the main thread keeps performing writes while no fsync is in progress. You can therefore lose at most about 1 second of writes.

The AOF log is an append-only file, so there are no seeks and no corruption problems after a power outage. Even if the log ends with a half-written command because the disk filled up or for some other reason, the redis-check-aof tool can fix it easily.

Redis automatically rewrites the log in the background when the AOF file gets too large. The rewrite is completely safe. A child process forked by Redis writes a new file containing the minimal set of operations needed to recreate the current dataset. While the rewrite is running, Redis continues to append incoming writes to the old AOF file and also to the AOF rewrite buffer, aof_rewrite_buf. When the rewrite completes, the buffered writes are appended to the new, smaller AOF file, which is then renamed over the old AOF file to complete the switch. Subsequent writes go to the new AOF file.

The AOF log records every operation, one after the other, in a format that is easy to understand and parse, and an AOF file is easy to export. Even if you run FLUSHALL (https://redis.io/commands/flushall) by mistake, as long as no rewrite has happened in the meantime, you can still recover the data by stopping the server, deleting the erroneous last command from the file, and restarting Redis.

Shortcomings of AOF

For the same dataset, AOF files are generally larger than RDB files.

Depending on the exact fsync policy, AOF can be slower than RDB. Under the default policy of fsync every second, Redis performance is still very high. With fsync disabled, AOF should be as fast as RDB even under high load. However, RDB gives stronger guarantees on maximum latency under a heavy write load.

In the past, Redis had some rare bugs in specific commands (for example the blocking command BRPOPLPUSH: https://redis.io/commands/brpoplpush) that caused the dataset rebuilt from the AOF to differ from the original. These bugs are rare, and the test suite automatically creates random, complex datasets and reloads them to check for this. With RDB persistence this kind of bug is almost impossible. To make this clear: AOF, like MySQL or MongoDB, works by incrementally updating an existing state, whereas an RDB snapshot is created from scratch every time, which is conceptually more robust. Two points are worth noting, however:

Every time Redis rewrites the AOF, it is recreated from scratch from the actual data in the dataset, which makes the new AOF file more resistant to bugs than one that has only ever been appended to.

In practice, we have never received a user report of a corrupted AOF file.

Which one should I use?

In general, if you want a degree of data safety comparable to what PostgreSQL provides, you should use RDB and AOF together.

If you care about your data but can tolerate losing a few minutes of it after a failure, you can use RDB alone.

Many users use AOF alone, but we do not recommend it: RDB point-in-time snapshots are useful for database backups and fast restarts, and they are a safeguard if the AOF engine has a bug.

Note: for these reasons, the long-term plan is to unify AOF and RDB into a single persistence model.

In the following sections, we will give examples to illustrate more details about RDB and AOF.

Snapshot

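By default, Redis saves snapshots of the dataset to a file named dump.rdb on disk. You can configure Redis to take a snapshot when at least M changes to the dataset have occurred within N seconds, or you can trigger one manually with the SAVE command or the background BGSAVE command. For example, the following configuration makes Redis snapshot the dataset every 60 seconds if at least 1000 keys have changed:

save 60 1000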
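
As a rough illustration of how a save point such as save 60 1000 is evaluated, here is a minimal Python sketch. It is a simplified model and not Redis source code: the function name should_snapshot and its parameters are invented for this example, and the exact comparison Redis uses internally may differ.

```python
from typing import Iterable, Tuple

# A save point is (seconds, changes), mirroring "save <seconds> <changes>".
SavePoint = Tuple[int, int]


def should_snapshot(save_points: Iterable[SavePoint],
                    dirty: int,
                    last_save: float,
                    now: float) -> bool:
    """Return True if any save point is satisfied.

    dirty     -- number of changes since the last successful snapshot
    last_save -- timestamp (seconds) of the last successful snapshot
    now       -- current timestamp (seconds)
    """
    elapsed = now - last_save
    return any(dirty >= changes and elapsed >= seconds
               for seconds, changes in save_points)
```

With the single save point (60, 1000), a snapshot is due once 60 seconds have passed and at least 1000 changes have accumulated; 5000 changes after only 30 seconds do not trigger one under this model.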

How does it work?

Whenever Redis needs to save a dataset to disk, it performs the following tasks:

Redis forks. There is now a parent process and a child process.

The child process begins to write the dataset to a temporary RDB file.

When the child process finishes writing the new RDB file, it replaces the old RDB file.

This method lets Redis benefit from the copy-on-write semantics of fork().
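
The steps above can be sketched in Python. This is only an illustration under stated assumptions: it runs on POSIX systems only (it uses os.fork), it writes JSON instead of the real RDB format, and the names snapshot and demo are hypothetical.

```python
import json
import os
import tempfile


def snapshot(dataset: dict, path: str) -> int:
    """Fork; the child dumps `dataset` to a temp file, then renames it to `path`.

    Returns the child's PID to the parent.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: it sees the dataset exactly as it was at fork time.
        status = 1
        try:
            tmp_path = f"{path}.tmp-{os.getpid()}"
            with open(tmp_path, "w") as f:
                json.dump(dataset, f)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp_path, path)  # atomically replace the old snapshot
            status = 0
        finally:
            os._exit(status)  # never fall through into the parent's code
    return pid


def demo() -> dict:
    """Take a snapshot, modify the data in the parent, and read the snapshot back."""
    path = os.path.join(tempfile.mkdtemp(), "dump.json")
    dataset = {"counter": 1}
    pid = snapshot(dataset, path)
    dataset["counter"] = 2   # the parent keeps accepting writes
    os.waitpid(pid, 0)       # wait for the child to finish
    with open(path) as f:
        return json.load(f)
```

Calling demo() returns {"counter": 1}: the write made by the parent after the fork is not visible to the child, which is the point-in-time property the snapshot relies on.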

Append-only file (AOF)

Snapshots are not very durable. If the computer running Redis stops, the power fails, or the Redis process is accidentally killed with kill -9, the most recent writes are lost. This may be a minor problem for some applications, but RDB is not a good choice when full durability is required. The append-only file is the alternative for those cases. You can turn it on in the configuration file:

appendonly yes

From now on, whenever Redis receives a command to change the dataset (such as SET), the operation will be appended to the AOF file, and when you restart Redis, the dataset will be rebuilt based on the AOF file.

Log rewrite

The AOF file grows as write operations are performed. For example, if you increment a counter 100 times, the final dataset contains a single key holding the final value, but the AOF file contains 100 entries, 99 of which are not needed to rebuild the current state.

Redis therefore supports an interesting feature: it can rewrite the AOF file in the background without interrupting service to clients. When you issue BGREWRITEAOF (https://redis.io/commands/bgrewriteaof), Redis writes the shortest sequence of commands needed to rebuild the current in-memory dataset. With Redis 2.2 you need to run BGREWRITEAOF from time to time yourself; starting with Redis 2.4, log rewriting can be triggered automatically (see the 2.4 example configuration file for details; configuration files for the different versions are at https://redis.io/topics/config).
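
To make the counter example concrete, here is a small Python model of log compaction. It is a sketch under simplifying assumptions, not how Redis implements rewriting: the log is a list of tuples rather than a file, only SET, INCR, and DEL on integer values are modelled, and the helper names replay and rewrite are invented for this example.

```python
from typing import Dict, List, Tuple

Command = Tuple[str, ...]


def replay(log: List[Command]) -> Dict[str, int]:
    """Rebuild the dataset by applying every logged command in order."""
    data: Dict[str, int] = {}
    for cmd in log:
        name = cmd[0].upper()
        if name == "SET":
            data[cmd[1]] = int(cmd[2])
        elif name == "INCR":
            data[cmd[1]] = data.get(cmd[1], 0) + 1
        elif name == "DEL":
            data.pop(cmd[1], None)
        else:
            raise ValueError(f"unsupported command: {name}")
    return data


def rewrite(log: List[Command]) -> List[Command]:
    """Produce a minimal log: one SET per key that still exists."""
    return [("SET", key, str(value)) for key, value in replay(log).items()]
```

A log of 100 INCR counter entries is rewritten to the single entry ("SET", "counter", "100"), and replaying either log yields the same dataset.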

How does AOF persist?

You can configure how often Redis fsyncs data to disk. There are three policies:

appendfsync always: fsync every time new commands are appended to the AOF. Very slow, but very safe. Note that commands are appended after a batch of commands from multiple clients or a pipeline has been executed, so the batch counts as a single write and a single fsync, performed before the replies are sent.

appendfsync everysec: fsync every second. Fast enough (in Redis 2.4, about as fast as RDB snapshotting), and you lose at most 1 second of data if there is a failure.

appendfsync no: never fsync; just hand the data to the operating system. This is faster but less safe. With this configuration Linux normally flushes data to disk every 30 seconds, but the exact interval depends on kernel tuning.

The recommended policy, and the default, is fsync every second. It is both fast and safe. The always policy is very slow in practice, but it supports group commit, so multiple parallel writes can be merged into a single fsync.
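
The difference between the three policies can be sketched with a toy append-only logger in Python. This is an illustration only: the class AppendOnlyLog, its clock parameter, and the fsync_calls counter are invented for this example, and unlike Redis (which runs the once-per-second fsync on a background thread) this sketch checks the timer inline on each append.

```python
import os
import time
from typing import Callable


class AppendOnlyLog:
    """Toy append-only log with "always", "everysec", or "no" fsync policy."""

    def __init__(self, path: str, policy: str = "everysec",
                 clock: Callable[[], float] = time.monotonic) -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fsync_calls = 0
        self._clock = clock
        self._last_sync = clock()
        self._file = open(path, "ab")

    def append(self, record: bytes) -> None:
        self._file.write(record)
        self._file.flush()  # hand the data to the operating system
        if self.policy == "always":
            self._sync()
        elif self.policy == "everysec" and self._clock() - self._last_sync >= 1.0:
            self._sync()
        # policy "no": the OS decides when the data reaches the disk

    def _sync(self) -> None:
        os.fsync(self._file.fileno())  # force the data onto the storage device
        self._last_sync = self._clock()
        self.fsync_calls += 1

    def close(self) -> None:
        self._file.close()
```

With policy "always" every append costs one fsync; with "everysec" at most one fsync happens per second regardless of how many appends arrive; with "no" the logger never calls fsync itself.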

What should I do if the AOF file is truncated?

The server may crash while writing the AOF file, or the disk holding the file may fill up. When this happens, the AOF still contains consistent data representing a point-in-time version of the dataset (possibly up to 1 second old with the default fsync policy), but the last command in the file may be truncated. Recent versions of Redis can still load the AOF: they discard the final incomplete command and log a warning such as:

* Reading RDB preamble from AOF file...
* Reading the remaining AOF tail...
# !!! Warning: short read while loading the AOF file !!!
# !!! Truncating the AOF at offset 439 !!!
# AOF loaded anyway because aof-load-truncated is enabled

You can change the default configuration to make Redis refuse to start in this situation, but by default it ignores the final incomplete command so that the service is available after a restart.
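
To show what ignoring the final incomplete command means in practice, here is a minimal Python sketch that scans a log written in the Redis protocol format and reports how many bytes form complete commands. It is not the loader Redis uses: it assumes a plain command log with no RDB preamble, and the function names encode_command and valid_prefix_length are invented for this example.

```python
from typing import Optional, Tuple


def encode_command(*args: str) -> bytes:
    """Encode a command as a Redis protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg.encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


def _read_line(data: bytes, pos: int) -> Optional[Tuple[bytes, int]]:
    """Return (line, offset after CRLF), or None if the line is cut off."""
    end = data.find(b"\r\n", pos)
    if end == -1:
        return None
    return data[pos:end], end + 2


def _command_end(data: bytes, pos: int) -> Optional[int]:
    """Return the offset just past the command at pos, or None if truncated."""
    line = _read_line(data, pos)
    if line is None:
        return None
    header, pos = line
    if not header.startswith(b"*"):
        raise ValueError("bad file format: expected '*'")
    for _ in range(int(header[1:])):
        line = _read_line(data, pos)
        if line is None:
            return None
        header, pos = line
        if not header.startswith(b"$"):
            raise ValueError("bad file format: expected '$'")
        end = pos + int(header[1:]) + 2  # argument bytes plus trailing CRLF
        if end > len(data):
            return None
        if data[end - 2:end] != b"\r\n":
            raise ValueError("bad file format: missing CRLF")
        pos = end
    return pos


def valid_prefix_length(data: bytes) -> int:
    """Number of leading bytes that consist only of complete commands."""
    pos = 0
    while pos < len(data):
        end = _command_end(data, pos)
        if end is None:  # the tail is a half-written command
            break
        pos = end
    return pos
```

For a log holding SET k v followed by a cut-off INCR counter, valid_prefix_length returns the length of the first command, which is the offset at which the file would be truncated. Malformed bytes in the middle raise ValueError instead, matching the distinction between a truncated and a corrupted file.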

Older versions of Redis are not automatically restored and need to be restored by the following steps:

Back up the AOF file.

Repair the AOF file using redis-check-aof, a tool provided by Redis:

$ redis-check-aof --fix <filename>

Optionally, run diff -u to check the differences between the two AOF files and confirm that the problem has been fixed.

Restart the Redis service with the repaired AOF file and rebuild the dataset.

What if the AOF file is corrupted?

If the AOF file is not just truncated but has invalid byte sequences in the middle, things are more complicated. Redis aborts at startup with a message like this:

* Reading the remaining AOF tail...
# Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix <filename>

The best approach is to run redis-check-aof first without the --fix option, understand the problem, jump to the reported offset in the file, and see whether the file can be repaired by hand. The AOF uses the same format as the Redis protocol, so manual repair is usually simple. Otherwise, let the tool repair the file; in that case everything from the invalid position to the end of the file may be discarded, so if the corruption is near the beginning of the file, almost the entire dataset is lost.

How does it work?

Log rewriting uses the same copy-on-write approach as snapshotting. It works as follows:

Redis forks, so there is now a parent process and a child process.

The child process starts writing the new AOF to a temporary file.

The parent process keeps appending new writes to the old AOF file and also accumulates them in the AOF rewrite buffer, aof_rewrite_buf, so the data is safe even if the rewrite fails.

When the child process finishes rewriting the file, the parent process receives a signal and appends the buffered writes to the end of the new AOF file.

Finally, Redis renames the new AOF file over the old one to complete the switch, and subsequent writes go to the new file.
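
The interplay between the old log, the rewrite buffer, and the new log can be modelled in a few lines of Python. This is a single-process sketch under loud assumptions: copying the dictionary stands in for fork(), assigning a list stands in for the rename, only SET is modelled, and the class ToyServer with its methods is invented for this example.

```python
from typing import Dict, List, Optional, Tuple

Command = Tuple[str, str, str]  # ("SET", key, value)


class ToyServer:
    def __init__(self) -> None:
        self.data: Dict[str, str] = {}
        self.aof: List[Command] = []                      # the current AOF
        self.rewrite_buf: Optional[List[Command]] = None  # None: no rewrite running
        self._child_view: Optional[Dict[str, str]] = None

    def write(self, key: str, value: str) -> None:
        cmd = ("SET", key, value)
        self.data[key] = value
        self.aof.append(cmd)              # always append to the old AOF
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)  # and buffer it during a rewrite

    def start_rewrite(self) -> None:
        self._child_view = dict(self.data)  # stands in for fork()
        self.rewrite_buf = []

    def finish_rewrite(self) -> None:
        if self._child_view is None or self.rewrite_buf is None:
            raise RuntimeError("no rewrite in progress")
        new_aof: List[Command] = [("SET", k, v) for k, v in self._child_view.items()]
        new_aof.extend(self.rewrite_buf)  # append writes that arrived meanwhile
        self.aof = new_aof                # stands in for the rename
        self.rewrite_buf = None
        self._child_view = None
```

If five writes to key k happen before the rewrite starts and two more writes arrive while it runs, the old log holds seven entries until the switch; afterwards the new log holds three (one for the snapshot of k plus the two buffered writes), and replaying it still reproduces the live dataset.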

How to switch from dump.rdb Snapshot to AOF

Redis 2.0 and Redis 2.2 use different procedures to switch to AOF. The Redis 2.2 procedure is simpler and does not require a restart.

Redis >= 2.2

Back up the latest dump.rdb files.

Transfer the backup files to a safe place.

Execute the following two commands:

redis-cli config set appendonly yes # enable AOF

redis-cli config set save "" # disable RDB snapshots

Check to make sure that the number of keys in the database is not lost.

Check that the write operations are correctly appended to the AOF file.

The config set appendonly yes command turns on AOF. Redis blocks to generate the initial dump, then opens the file for writing and appends all subsequent write operations to it.

The config set save "" command turns off RDB snapshotting. This step is optional: if you leave snapshotting enabled, both RDB and AOF persistence are used.

Important: remember to edit redis.conf to turn on AOF as well, otherwise the old configuration is used again when the server restarts.

Redis 2.0

Back up the latest dump.rdb files.

Transfer the backup files to a safe place.

Stop all write operations.

Execute the background rewrite AOF command redis-cli BGREWRITEAOF. This operation creates an AOF file.

When the AOF backup is complete, stop the Redis service.

Edit redis.conf to enable AOF function.

Restart the service

Check to make sure that the number of keys in the database is not lost.

Check that the write operations are correctly appended to the AOF file.

Interaction between AOF and RDB

Redis >= 2.4 makes sure that an AOF rewrite is not triggered while an RDB snapshot is in progress, and that a BGSAVE is not allowed while an AOF rewrite is in progress. This prevents the two background processes from doing heavy disk I/O at the same time.

Backing up Redis data

Before starting this section, make sure your database is backed up. Disks fail and cloud instances disappear; without backups, your data is at serious risk of vanishing into /dev/null.

Redis is very friendly to data backups, because you can copy RDB files while the database is running: an RDB file is never modified once it has been produced. While a snapshot is being taken, it is written to a temporary file, and only when the snapshot is complete is the temporary file renamed to replace the previous RDB file.

This means that it is very safe to copy RDB files when the service is running. Here are our recommendations:

Create a cron job on the server that takes hourly snapshots of the RDB file into one directory and daily snapshots into a different directory.

Every time the cron job runs, use the find command to locate the oldest snapshots and delete them. For example, keep hourly snapshots for the last 48 hours and daily snapshots for 1 or 2 months. Make sure the snapshot file names include date and time information.

At least once a day, transfer an RDB snapshot outside your data center, or at least off the physical machine that runs your Redis instance.
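
The retention step can be done with find as described above; for illustration, here is an equivalent minimal sketch in Python. The function name prune_snapshots and the *.rdb naming pattern are assumptions made for this example, and file age is taken from the modification time rather than from the file name.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_snapshots(directory: str,
                    max_age_seconds: float,
                    now: Optional[float] = None) -> List[str]:
    """Delete *.rdb files in `directory` older than `max_age_seconds`.

    Returns the names of the deleted files, sorted alphabetically.
    """
    now = time.time() if now is None else now
    removed = []
    for path in sorted(Path(directory).glob("*.rdb")):
        if now - path.stat().st_mtime > max_age_seconds:
            path.unlink()
            removed.append(path.name)
    return removed
```

A cron job could call prune_snapshots on the hourly directory with 48 * 3600 and on the daily directory with a limit of one or two months.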

If you use AOF persistence, you can still copy the AOF file to create backups. Redis can load the file even if its last part is missing (see the section on truncated AOF files above).

Disaster recovery

Disaster recovery is basically the same as backup, plus the ability to transfer the backups to several different data centers. That way the data is safe and recoverable elsewhere even if the primary data center is affected.

Because there is often not much money for large-scale backup infrastructure at the start, here are some disaster recovery techniques that do not cost much:

Amazon S3 or a similar object storage service is a good way to implement a disaster recovery system. Encrypt your hourly or daily RDB snapshots and transfer them to S3. You can encrypt the data with gpg -c (symmetric encryption mode). Make sure to store the passphrase in several different safe places (for example, give a copy to the most important people in your organization). Using more than one storage service improves data safety.

Transfer snapshots to another server with scp (part of SSH). This is a simple and secure method: get a small VPS in a location far from your Redis server, install ssh on the Redis machine, generate an SSH client key without a passphrase, and add it to the authorized_keys file on the VPS. You can then transfer backups to the VPS automatically without entering a password. For better safety, use VPSs from different providers in different network regions.

File transfers can fail, so after each transfer verify at least the file size, and if you are using a VPS you can also compare SHA1 digests.
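
A size-plus-digest check of this kind can be sketched in Python with the standard hashlib module. The helper names sha1_of_file and same_content are invented for this example; in a real setup the digest of the remote copy would be computed on the remote machine and compared with the local one.

```python
import hashlib
import os


def sha1_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a: str, path_b: str) -> bool:
    """Cheap size check first, then compare SHA1 digests."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return sha1_of_file(path_a) == sha1_of_file(path_b)
```

Reading in chunks keeps memory use constant even for large RDB files, and checking the size first avoids hashing when the transfer was obviously cut short.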

You also need to deploy an independent monitoring alarm system to monitor the backup process so that it can be detected and repaired in time when the backup fails.
