What is the principle of persisting RDB and AOF in Redis

2025-01-16 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >


Shulou(Shulou.com)06/02 Report--

Today we will look at how the two Redis persistence mechanisms, RDB and AOF, work. Many people do not understand them well, so the following summary walks through both. I hope you get something out of this article.

Basic knowledge of RDB

An RDB file is a compressed binary file, generally saved in the Redis installation directory. When the Redis server starts, it calls the rdbLoad function to load the RDB file; the rdbSave function is called to save it.

RDB save

The rdbSave function is responsible for saving the in-memory database data to disk in RDB format. If the RDB file already exists, the new RDB file will replace the existing RDB file, and both the SAVE and BGSAVE commands will call the rdbSave function.

SAVE calls rdbSave directly, blocking the Redis main process until the save is complete. The server cannot process any requests from the client during the blocking of the main process.

BGSAVE starts a child process, which is responsible for calling rdbSave. The Redis server continues to process client requests while BGSAVE runs. When the save is finished, the child process sends a signal to the main process to report that the save is complete.

Note: while an RDB save is in progress, Redis will not run another SAVE, BGSAVE, or BGREWRITEAOF at the same time, because running them concurrently would make them compete for resources and could introduce unexpected bugs into the backup.

Automatic interval saving

At this point you may have realized that we cannot run the save command by hand every time, or rely on an external timer to run it, because that would reduce availability. For this reason Redis provides an automatic save configuration.

# at least 10 modifications to Redis within 100 seconds
save 100 10
# at least 1000 modifications to Redis within one hour
save 3600 1000
# the bgsave command is executed as soon as any of the above conditions is met
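The rule behind these save lines can be sketched in a few lines of Python. This is only an illustration under stated assumptions: SaveParam, should_bgsave, dirty and seconds_since_last_save are names chosen for this sketch, not the Redis implementation.

```python
from dataclasses import dataclass


@dataclass
class SaveParam:
    """One 'save <seconds> <changes>' line (illustrative type)."""
    seconds: int
    changes: int


def should_bgsave(params, dirty, seconds_since_last_save):
    """Return True if any configured save rule is satisfied.

    dirty: number of modifications since the last successful save (assumed counter).
    seconds_since_last_save: time elapsed since the last successful save.
    """
    return any(
        dirty >= p.changes and seconds_since_last_save > p.seconds
        for p in params
    )


rules = [SaveParam(100, 10), SaveParam(3600, 1000)]
```

With these rules, 10 changes after 101 seconds would trigger a background save, while 9 changes or only 50 elapsed seconds would not.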

In the Redis source code, this configuration is stored in a saveparam structure inside redisServer, as follows:

struct saveparam {
    time_t seconds; // number of seconds
    int changes;    // number of modifications
};

RDB snapshot principle

We will not analyze save here, because it simply blocks the main thread while Redis data is written straight to the disk file. Instead we explain bgsave. As mentioned earlier, bgsave starts a child process to write the file while the Redis main process keeps responding to client commands. How, then, does Redis ensure both data integrity and performance?

When Redis persists, it forks a child process and hands the whole persistence job to it, while the parent process keeps responding to client commands. When the child process is created, the data is not copied right away, because that would compete with the parent process for resources. Instead, the child's pointers refer to the parent's memory space, so the two processes share memory and save resources.

From this point on, the parent process reads and writes the data in memory, while the child process only reads it. Redis relies on the operating system's copy-on-write (COW) mechanism for forked processes, which separates the data segment page by page; a page is typically 4 KB. We call the shared pages cold data. When the parent process makes a change, the operating system copies only the affected shared page and the change is applied to the copy, not to the original. The cold data does not change, so the child process can persist it with peace of mind. The content being persisted is frozen as if it had been photographed, which is why the process is called a snapshot.

It is worth noting that as time passes the parent process modifies more and more pages. In theory, the copied memory can grow to the size of the original data, which doubles the memory used. So when designing the cache, it helps to keep modifications to a minimum.
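The snapshot effect of fork can be observed with a small Python sketch. It only demonstrates the operating-system behavior described above on a Unix-like system; snapshot_via_fork is a hypothetical helper written for this illustration, and Python objects stand in for the Redis dataset.

```python
import json
import os


def snapshot_via_fork(data, mutate):
    """Fork, let the parent change `data`, and return (child_view, parent_view)."""
    result_r, result_w = os.pipe()  # child -> parent: the child's view of the data
    go_r, go_w = os.pipe()          # parent -> child: "I have finished mutating"
    pid = os.fork()
    if pid == 0:
        # Child: waits until the parent has modified its own copy, then dumps
        # what it still sees, which is the state at fork time.
        try:
            os.close(result_r)
            os.close(go_w)
            os.read(go_r, 1)
            with os.fdopen(result_w, "w") as out:
                json.dump(data, out)
        finally:
            os._exit(0)
    # Parent: keeps serving "writes" while the child holds the snapshot.
    os.close(result_w)
    os.close(go_r)
    mutate(data)
    os.write(go_w, b"x")
    os.close(go_w)
    with os.fdopen(result_r) as inp:
        child_view = json.load(inp)
    os.waitpid(pid, 0)
    return child_view, data


child_view, parent_view = snapshot_via_fork(
    {"name": "mango"}, lambda d: d.update(name="lisi")
)
```

The child still reports the value from the moment of the fork, while the parent sees its own later modification.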

Principle of AOF log

Redis has a second persistence mechanism, AOF (Append Only File). AOF takes each modification command sent by a client, appends it to the end of the aof_buf buffer in the server state, and finally saves it to the AOF log file. For example:

First, the client's command is converted into RESP, the Redis communication protocol.

> set name mango
# saved to the aof_buf buffer
*3\r\n$3\r\nset\r\n$4\r\nname\r\n$5\r\nmango\r\n

Then we append the protocol text to the end of aof_buf.

Finally, we save these instruction information to the AOF log file.
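The first two steps can be sketched in Python as follows. The function name encode_command and the module-level aof_buf are assumptions for illustration; the sketch only covers commands whose arguments are plain strings.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]          # array header: number of arguments
    for arg in args:
        raw = arg.encode("utf-8")
        # each argument: $<byte length>\r\n<bytes>\r\n
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


# Stand-in for the server's aof_buf: encoded commands are appended to its end.
aof_buf = bytearray()
aof_buf += encode_command("set", "name", "mango")
```

After this runs, aof_buf holds the same protocol text shown in the example above, ready to be written to the AOF file.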

AOF save mode

Whenever the server's regular periodic task function runs, or an event handler runs, the aof.c/flushAppendOnlyFile function is called. It performs the following two tasks:

WRITE: depending on the condition, writes the content buffered in aof_buf to the AOF file.

SAVE: depending on the condition, call the fsync or fdatasync function to save the AOF file to disk.

Redis currently supports three AOF save modes:

1. AOF_FSYNC_NO: do not save, the operating system decides when to save.

2. AOF_FSYNC_EVERYSEC: save every second.

3. AOF_FSYNC_ALWAYS: always save, save every time you execute a command.

From top to bottom, these three modes get slower and safer. With the no mode, a crash can lose everything written since the operating system last flushed the file to disk. With everysec, you lose about one second of data at most. With always, you lose at most one command. Which AOF mode to choose depends on your business needs.

About fsync

The AOF log is stored as a file. When Redis writes from the buffer to the AOF file, the data first lands in the kernel's cache. If the server suddenly goes down before that data is fully written to disk, what should we do?

The fsync function forces the contents of a specified file to be flushed from the kernel cache to disk. As long as the Redis process calls fsync promptly after writing, the AOF log is safe on disk. But fsync is a disk I/O operation, so it is very slow. If Redis called fsync once for every command it executed, its performance would drop considerably.

So on production servers, Redis usually performs an fsync about once per second; this corresponds to the everysec mode, and the mode is configurable. This is a tradeoff between data safety and performance: it keeps performance high while minimizing data loss.

Redis also provides two other strategies. One is to never call fsync and let the operating system decide when to synchronize to disk, which is not safe. The other is to call fsync after every command, which is very slow. These two strategies are rarely used in a production environment.
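The difference between the WRITE step and the SAVE step, and the effect of the three modes, can be sketched in Python. AofWriter and its methods are hypothetical names for this illustration; os.write hands data to the kernel cache and os.fsync forces it to disk. The clock parameter exists only so the one-second rule can be exercised without sleeping.

```python
import os
import time


class AofWriter:
    """Toy append-only writer with 'always', 'everysec' and 'no' fsync policies."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.policy = policy
        self.clock = clock
        self.buf = bytearray()        # plays the role of aof_buf
        self.last_fsync = clock()
        self.fsync_calls = 0

    def feed(self, data: bytes):
        """Append encoded command text to the in-memory buffer."""
        self.buf += data

    def flush(self):
        # WRITE: move the buffer into the file (data may still sit in the kernel cache).
        view = memoryview(bytes(self.buf))
        while view:
            written = os.write(self.fd, view)
            view = view[written:]
        self.buf.clear()
        # SAVE: force the kernel cache to disk, depending on the policy.
        now = self.clock()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self):
        os.close(self.fd)
```

With the no policy the sketch never calls fsync at all, which mirrors leaving the timing to the operating system.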

AOF rewriting

Because the AOF keeps accumulating commands, the file grows larger and larger over time, and can eventually slow down both recovery and normal operation of Redis. For this reason Redis provides a rewriting mechanism. Let's take a look at it.

> set name zhangsan
> set name mango
> set name lisi
...
> set name mango
# after optimization
> set name mango

We can see that a key is often modified many times, so keeping only the last operation on each key is an easy way to "slim down" the AOF. There is also another way: traverse every key in Redis and write out a set-style command for its current value. Just as with a full RDB save, this needs a child process to read the current Redis database.

There is a problem here. While the whole database is being traversed, clients will keep sending commands that change values. How can we make sure those commands are not left out of the rewritten AOF?

Procedure:

Redis creates a child process for the AOF rewrite; the child shares the same memory as the main process.

While the rewrite runs, clients keep executing write commands. The main thread processes each command and appends it to the AOF buffer and also to the AOF rewrite buffer.

When the rewrite is complete, the commands in the AOF rewrite buffer are added to the new file, which then replaces the existing AOF file.

So why append each command to both the AOF buffer and the AOF rewrite buffer? Appending to the AOF buffer keeps the existing AOF file up to date, so if the server suddenly crashes during the rewrite, the command is still saved in the current AOF file. Appending to the AOF rewrite buffer ensures that commands executed during the rewrite also reach the new, rewritten file. The AOF rewrite operation is the bgrewriteaof mentioned earlier.
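The procedure above can be simulated with a short Python sketch. Everything here is a simplified model chosen for illustration: commands are tuples, only set is supported, and replay and rewrite are hypothetical helper names rather than Redis functions.

```python
def replay(commands):
    """Rebuild a database by executing the logged commands in order."""
    db = {}
    for op, key, value in commands:
        if op == "set":
            db[key] = value
    return db


def rewrite(snapshot, rewrite_buffer):
    """Emit one set per key from the snapshot, then the commands buffered meanwhile."""
    new_aof = [("set", key, value) for key, value in snapshot.items()]
    new_aof.extend(rewrite_buffer)
    return new_aof


old_aof = [
    ("set", "name", "zhangsan"),
    ("set", "name", "mango"),
    ("set", "name", "lisi"),
]
snapshot = replay(old_aof)            # what the child process sees when it starts

# A write arrives while the rewrite is running: it goes to both places.
during_rewrite = [("set", "name", "mango")]
old_aof.extend(during_rewrite)        # AOF buffer -> existing AOF file
rewrite_buffer = list(during_rewrite)  # AOF rewrite buffer -> new AOF file

new_aof = rewrite(snapshot, rewrite_buffer)
```

Replaying either file yields the same database, but the new file is shorter; if the rewrite were abandoned halfway, the old file would still contain the late write.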

Redis 4.0 hybrid persistence

When the Redis service restarts, the RDB file can be used to restore the database. However, if AOF is enabled, Redis uses the AOF first. Why? Because the AOF is an incremental record of every write, replaying it recovers more complete data than the last RDB snapshot. RDB recovery, on the other hand, is faster: loading a compressed binary file is quicker than re-executing commands one by one, so replaying a large AOF makes a restart slow.

To solve this problem, Redis 4.0 introduced a new persistence option: hybrid persistence. It stores the RDB content together with an incremental AOF log in the same file. The AOF part is no longer the full log, only the incremental commands that arrived between the start and the end of the persistence run, so it is usually very small.

When Redis restarts, it loads the RDB content first and then replays the incremental AOF log. This completely replaces the earlier replay of the full AOF file, so restarts become much more efficient.
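The restart path can be modeled in Python as "load the snapshot, then replay the small tail". This is a conceptual sketch only: the snapshot is a plain dict, the tail is a list of tuples, and load_hybrid is a hypothetical name, not the on-disk format or loader used by Redis.

```python
def load_hybrid(rdb_snapshot, aof_tail):
    """Restore a database from a snapshot plus the commands logged after it."""
    db = dict(rdb_snapshot)            # fast path: bulk-load the snapshot
    for op, key, value in aof_tail:    # slow path: replay only the incremental part
        if op == "set":
            db[key] = value
        elif op == "del":
            db.pop(key, None)
    return db


restored = load_hybrid(
    {"name": "lisi", "age": "18"},
    [("set", "name", "mango"), ("del", "age", None)],
)
```

Only the two tail commands are executed one by one; everything else comes from the snapshot in a single load.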

To sum up:

Redis persistence operations include RDB and AOF

RDB saves a compressed binary file, while AOF saves an operation instruction file.

Redis gives priority to restoring from AOF because it is more complete, and Redis 4.0 adds hybrid persistence to speed up recovery.

When RDB executes bgsave, like AOF rewriting, it starts a child process whose memory is shared with the parent process.

AOF has three save modes. With no, Redis does not save actively and relies on the operating system's scheduling; with everysec, it saves once a second; with always, it saves after every command.

The AOF save modes mainly control when the fsync function is called; fsync is what ensures that the AOF log has reached the disk and will not be lost.

After reading the above, do you have a better understanding of how RDB and AOF persistence work in Redis?


© 2024 shulou.com SLNews company. All rights reserved.
