Learning Redis persistence

2025-07-11 Update From: SLTechnology News&Howtos

Redis supports two persistence mechanisms: RDB and AOF. Persistence prevents data loss when the process exits, because the previously persisted files can be used to recover the data on the next restart. Understanding the persistence mechanisms is essential for operating and maintaining Redis.

This article first introduces the configuration and workflow of RDB and AOF, along with the commands that control persistence, such as bgsave and bgrewriteaof.

RDB

RDB persistence generates a snapshot of the current process data and saves it to the hard disk. It can be triggered either manually or automatically.

Trigger mechanism

Manual triggering corresponds to the save and bgsave commands:

save command: blocks the current Redis server until the RDB process completes. For instances with a large memory footprint this causes long blocking, so it is not recommended in production. The Redis log for running the save command is as follows:

1283:M 04 Apr 21:39:03.035 * DB saved on disk

bgsave command: the Redis process performs a fork operation to create a child process. The child process is responsible for RDB persistence and exits automatically when it finishes. Blocking occurs only during the fork phase and is usually very short. The Redis log for running the bgsave command is as follows:

1283:M 04 Apr 21:40:33.849 * Background saving started by pid 1429
1429:C 04 Apr 21:40:33.874 * DB saved on disk
1429:C 04 Apr 21:40:33.875 * RDB: 6 MB of memory used by copy-on-write
1283:M 04 Apr 21:40:33.929 * Background saving terminated with success

Clearly, the bgsave command is an optimization that addresses the blocking problem of save. All RDB-related operations inside Redis therefore use the bgsave approach, and the save command is effectively obsolete.

Besides manual triggering through commands, Redis also triggers RDB persistence automatically in the following scenarios:

1) When a save configuration such as "save m n" is used: bgsave is triggered automatically when the dataset has been modified n times within m seconds.

2) When a slave node performs a full replication: the master node automatically executes bgsave to generate an RDB file and sends it to the slave node.

3) When the debug reload command is executed to reload Redis: a save operation is also triggered automatically.

4) When the shutdown command is executed: by default, bgsave is executed automatically if AOF persistence is not enabled.

Note: to turn off automatic RDB persistence, remove the "save m n" entries from the configuration file.
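The "save m n" rule can be sketched in a few lines of Python. This is a simplified model rather than Redis source code: the function name should_trigger_bgsave and its parameters are hypothetical, and the sketch assumes a rule fires once at least n changes have accumulated and more than m seconds have passed since the last save.

```python
from typing import Iterable, Tuple


def should_trigger_bgsave(
    save_points: Iterable[Tuple[int, int]],
    changes_since_last_save: int,
    seconds_since_last_save: float,
) -> bool:
    """Return True if any (seconds, changes) save point is satisfied."""
    for seconds, changes in save_points:
        enough_changes = changes_since_last_save >= changes
        enough_time = seconds_since_last_save > seconds
        if enough_changes and enough_time:
            return True
    return False


# Example save points, equivalent to the configuration lines
# "save 900 1", "save 300 10" and "save 60 10000".
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]

print(should_trigger_bgsave(SAVE_POINTS, 10, 301))
print(should_trigger_bgsave(SAVE_POINTS, 5, 301))
```

With an empty list of save points the function never returns True, which mirrors the note above about removing the save entries to disable automatic RDB persistence.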

Process description

bgsave is the mainstream way to trigger RDB persistence. It works as follows.

1) When the bgsave command is executed, the Redis parent process checks whether a child process is already running, such as an RDB or AOF child process. If one exists, the bgsave command returns immediately.

2) The parent process performs a fork operation to create the child process. The parent process is blocked during the fork. The latest_fork_usec field of the info stats command shows the time taken by the most recent fork operation, in microseconds.

3) Once the fork completes, the bgsave command returns the message "Background saving started". The parent process is no longer blocked and can continue to respond to other commands.

4) The child process creates the RDB file: it generates a temporary snapshot file from the parent process's memory and, when finished, atomically replaces the original file. The lastsave command returns the time the RDB file was last generated, which corresponds to the rdb_last_save_time field of the info statistics.

5) The child process sends a signal to the parent process to indicate completion, and the parent process updates its statistics. See the rdb_* fields under info Persistence for details.
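To watch the fields mentioned in these steps, the text returned by the info command can be parsed into a dictionary. The sketch below works on a pasted sample instead of a live connection; the helper name parse_info is hypothetical and the sample values are made up for illustration.

```python
from typing import Dict


def parse_info(text: str) -> Dict[str, str]:
    """Parse 'key:value' lines as returned by the INFO command."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value
    return fields


# Illustrative sample only; the values are invented for this example.
SAMPLE_INFO = """\
# Persistence
rdb_bgsave_in_progress:0
rdb_last_save_time:1554386780
rdb_last_bgsave_status:ok

# Stats
latest_fork_usec:315
"""

info = parse_info(SAMPLE_INFO)
fork_ms = int(info["latest_fork_usec"]) / 1000  # microseconds -> milliseconds
print(f"last fork took {fork_ms:.3f} ms")
print("bgsave running:", info["rdb_bgsave_in_progress"] == "1")
```

Tracking latest_fork_usec over time is one way to notice when fork pauses start to grow with the size of the instance.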

Processing of RDB files

Save: the RDB file is saved in the directory specified by the dir configuration, with the file name specified by the dbfilename configuration. Both can be changed at runtime with config set dir {newDir} and config set dbfilename {newFileName}; the next RDB run saves the file to the new location.

When a disk fails or fills up, you can change the file path online to a usable disk with config set dir {newDir} and then run bgsave to switch disks. The same approach applies to AOF persistence files.

Compression: by default Redis compresses the generated RDB file with the LZF algorithm, so the file is much smaller than the in-memory data. Compression is enabled by default and can be changed dynamically with config set rdbcompression {yes|no}.

Although compressing the RDB file consumes CPU, it greatly reduces the file size, which makes the file easier to save to disk or send to a slave node over the network. It is therefore recommended to keep compression enabled in production.

Check: if Redis loads a corrupted RDB file, it refuses to start and prints the following log:

2740:M 04 Apr 22:06:42.835 # Short read or OOM loading DB. Unrecoverable error, aborting now.
2740:M 04 Apr 22:06:42.835 # Internal error in RDB reading function at rdb.c:1666 -> Unexpected EOF reading RDB file

You can use the redis-check-rdb tool provided with Redis to check the RDB file and obtain the corresponding error report.

[redis@rhel7 ~]$ redis-check-rdb dump.rdb
[offset 0] Checking RDB file dump.rdb
[offset 27] AUX FIELD redis-ver = '4.0.13'
[offset 41] AUX FIELD redis-bits = '64'
[offset 53] AUX FIELD ctime = '1554386780'
[offset 68] AUX FIELD used-mem = '570072'
[offset 84] AUX FIELD aof-preamble = '0'
[offset 86] Selecting DB ID 0
--- RDB ERROR DETECTED ---
[offset 109] Invalid object type: 209
[additional info] While doing: read-type
[info] 2 keys read
[info] 0 expires
[info] 0 already expired

Advantages and disadvantages of RDB

Advantages: RDB is a compact binary file that represents a snapshot of Redis data at a point in time. It is well suited to backup, full replication, and similar scenarios. For example, run bgsave every 6 hours and copy the RDB file to a remote machine or a file system such as HDFS for disaster recovery. Redis also recovers data much faster from RDB than from AOF.

Disadvantages: RDB cannot provide real-time or second-level persistence. Every bgsave run has to fork a child process, which is a heavyweight operation and too costly to execute frequently. In addition, RDB files are stored in a specific binary format, and the format has gone through several versions as Redis evolved, so an older version of Redis may be unable to read an RDB file produced by a newer version.

To address the fact that RDB is not suitable for real-time persistence, Redis provides AOF persistence.

AOF

AOF (append only file) persistence records each write command in an independent log and re-executes the commands in the AOF file on restart to recover the data. The main purpose of AOF is to provide real-time data persistence, and it is now the mainstream way of persisting Redis data. Understanding the AOF mechanism helps strike a balance between data safety and performance.

Use AOF

To enable AOF, set the parameter appendonly yes; it is disabled by default. The AOF file name is set by the appendfilename parameter and defaults to appendonly.aof. The save path is the same as for RDB persistence and is specified by the dir configuration. The AOF workflow consists of command write (append), file synchronization (sync), file rewrite (rewrite), and restart load (load).

The process is as follows:

1) All write commands are appended to aof_buf (the buffer).

2) The AOF buffer is synchronized to the hard disk according to the configured policy.

3) As the AOF file grows, it needs to be rewritten periodically to reduce its size.

4) When the Redis server restarts, the AOF file can be loaded to recover the data.

With the overall workflow in mind, each step is described in detail below.

Command write

AOF writes commands directly in the text protocol format. For example, the command set hello world appends the following text to the AOF buffer:

*3\r\n$3\r\nset\r\n$5\r\nhello\r\n$5\r\nworld\r\n
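The framing of that text can be reproduced with a short Python function. This is a minimal sketch of the protocol encoding only: the name encode_command is hypothetical, and the function does not write to any file.

```python
def encode_command(*args: str) -> bytes:
    """Encode a command as a protocol array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]          # number of arguments
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n" % len(data))  # byte length of the argument
        parts.append(data + b"\r\n")          # the argument itself
    return b"".join(parts)


print(encode_command("set", "hello", "world"))
```

Each argument is prefixed with its byte length, which is why the format can carry values that themselves contain spaces or line breaks.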

Two questions about AOF:

1) Why does AOF use the text protocol format directly? Possible reasons are as follows:

The text protocol has good compatibility.

When AOF is enabled, every write command involves an append operation. Using the protocol format directly avoids the overhead of a second conversion.

The text protocol is human-readable, which makes direct inspection and modification easy.

2) Why does AOF append commands to aof_buf? Redis handles commands in a single thread. If every write command were appended directly to the file on disk, performance would depend entirely on the current disk load. Writing to the aof_buf buffer first has another benefit: Redis can offer several strategies for synchronizing the buffer to disk, which lets you balance performance against safety.

File synchronization

Redis provides several file synchronization strategies for the AOF buffer, controlled by the appendfsync parameter. The possible values are as follows:

always: after a command is written to aof_buf, the fsync system call is invoked to synchronize it to the AOF file. The thread returns once fsync completes.

everysec: after a command is written to aof_buf, the write system call is invoked, and the thread returns once write completes. A dedicated thread calls fsync to synchronize the file once per second.

no: after a command is written to aof_buf, the write system call is invoked, and no fsync is performed on the AOF file. Synchronization to disk is left to the operating system, usually with a period of up to 30 seconds.

Notes on the write and fsync system calls:

The write operation triggers the delayed write mechanism. Linux provides a page cache in the kernel to improve disk IO performance, and write returns as soon as the data is in that system buffer. Synchronization to disk then depends on the system's scheduling, for example when the buffer pages fill up or a specific time period elapses. If the system crashes before the file is synchronized, the data in the buffer is lost.

fsync forces synchronization to disk for a single file (such as the AOF file). It blocks until the data has been written to the hard disk and only then returns, which guarantees that the data is persisted.

In addition to write and fsync, Linux also provides the sync and fdatasync operations.

With always, the AOF file is synchronized on every write. On a typical SATA disk, Redis can then sustain only a few hundred TPS of writes, which clearly runs counter to its high-performance nature, so this setting is not recommended.

With no, the period at which the operating system synchronizes the AOF file is not controllable, and the amount of data written in each disk synchronization grows. Performance improves, but data safety cannot be guaranteed.

With everysec, the recommended policy and also the default, both performance and data safety are taken into account. In theory, only 1 second of data is lost if the system suddenly goes down. (Strictly speaking, the claim that at most 1 second of data is lost is not accurate.)
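The difference between the three policies comes down to when fsync is called after write. The toy model below illustrates this with Python's os.write and os.fsync on a temporary file. It is not how Redis is implemented: the class AofWriter and the function demo are hypothetical names, and the everysec branch runs inline here, whereas the text above describes a dedicated thread.

```python
import os
import tempfile
import time


class AofWriter:
    """Toy model of the three appendfsync policies (not Redis code)."""

    def __init__(self, path: str, policy: str = "everysec") -> None:
        if policy not in ("always", "everysec", "no"):
            raise ValueError(f"unknown policy: {policy}")
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.buf = bytearray()  # plays the role of aof_buf
        self.fsync_calls = 0
        self.last_fsync = time.monotonic()

    def append(self, command: bytes) -> None:
        self.buf += command

    def flush(self) -> None:
        """write() the buffer, then fsync() according to the policy."""
        if self.buf:
            os.write(self.fd, bytes(self.buf))  # lands in the OS page cache
            self.buf.clear()
        now = time.monotonic()
        if self.policy == "always":
            self._fsync(now)
        elif self.policy == "everysec" and now - self.last_fsync >= 1.0:
            self._fsync(now)  # simplified: done inline, not in a thread
        # policy "no": flushing to disk is left to the operating system

    def _fsync(self, now: float) -> None:
        os.fsync(self.fd)  # blocks until the data reaches the disk
        self.fsync_calls += 1
        self.last_fsync = now

    def close(self) -> None:
        os.close(self.fd)


def demo(policy: str) -> int:
    """Write three commands with the given policy; return the fsync count."""
    with tempfile.TemporaryDirectory() as tmp:
        writer = AofWriter(os.path.join(tmp, "appendonly.aof"), policy)
        for _ in range(3):
            writer.append(b"*1\r\n$4\r\nping\r\n")
            writer.flush()
        writer.close()
        return writer.fsync_calls


for name in ("always", "everysec", "no"):
    print(name, demo(name))
```

Three quick writes cost three fsync calls under always and none under no, which is the trade-off between durability and throughput described above.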

Rewriting mechanism

As commands are continuously written to the AOF, the file grows larger and larger. To address this, Redis introduces an AOF rewrite mechanism that reduces the file size. AOF rewriting is the process of converting the data in the Redis process into write commands and synchronizing them to a new AOF file.

Why can the rewritten AOF file be smaller? There are the following reasons:

1) Data in the process that has already expired is no longer written to the file.

2) The old AOF file contains commands that no longer have any effect, such as del key1, hdel key2, srem keys, set a 111, set a 222, and so on. Rewriting generates commands directly from the in-process data, so the new AOF file keeps only the write commands for the final data.

3) Multiple write commands can be merged into one. For example, lpush list a, lpush list b, lpush list c can be converted into lpush list a b c. To prevent a single oversized command from overflowing the client buffer, operations on list, set, hash, zset, and similar types are split into multiple commands at a boundary of 64 elements.

AOF rewriting reduces the space the file occupies. Another benefit is that a smaller AOF file can be loaded by Redis faster.
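The merging rule in point 3 can be illustrated with a short Python sketch that rebuilds a list from its final in-memory contents, 64 elements per command. The function name rewrite_list is hypothetical, and the sketch assumes the list is emitted head to tail with rpush so that the chunks preserve the element order.

```python
from typing import List, Sequence

ITEMS_PER_COMMAND = 64  # the boundary mentioned in the text


def rewrite_list(key: str, items: Sequence[str],
                 chunk_size: int = ITEMS_PER_COMMAND) -> List[List[str]]:
    """Emit rpush commands that rebuild `items` in their current order."""
    commands = []
    for start in range(0, len(items), chunk_size):
        chunk = items[start:start + chunk_size]
        commands.append(["rpush", key, *chunk])
    return commands


cmds = rewrite_list("list", [str(i) for i in range(130)])
print(len(cmds), [len(c) - 2 for c in cmds])
```

However many individual push commands originally built the list, the rewritten file needs only one command per 64 elements of the final data.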

AOF rewriting can be triggered manually or automatically:

Manual trigger: call the bgrewriteaof command directly.

Automatic trigger: the timing is determined by the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage parameters.

auto-aof-rewrite-min-size: the minimum file size for an AOF rewrite to run; the default is 64MB.

auto-aof-rewrite-percentage: the growth ratio of the current AOF file size (aof_current_size) relative to the AOF file size after the last rewrite (aof_base_size).

Automatic trigger condition = aof_current_size > auto-aof-rewrite-min-size && (aof_current_size - aof_base_size) / aof_base_size >= auto-aof-rewrite-percentage

aof_current_size and aof_base_size can be viewed in the info Persistence statistics.
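The trigger condition can be written as a small Python function. This is a sketch of the formula above, not Redis source code: the function name is hypothetical, the percentage is taken as a whole number such as 100, and a base size of 0 is treated as 1 purely to avoid division by zero.

```python
MB = 1024 * 1024


def should_auto_rewrite(aof_current_size: int, aof_base_size: int,
                        min_size: int = 64 * MB,
                        percentage: int = 100) -> bool:
    """Apply the size and growth checks from the trigger condition."""
    if aof_current_size <= min_size:
        return False  # file is still below auto-aof-rewrite-min-size
    base = aof_base_size or 1  # assumption: guard against division by zero
    growth_percent = (aof_current_size - base) * 100 / base
    return growth_percent >= percentage


print(should_auto_rewrite(128 * MB, 64 * MB))
print(should_auto_rewrite(100 * MB, 64 * MB))
```

With a percentage of 100, a rewrite is due once the file has doubled since the last rewrite and is also larger than the minimum size.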

When an AOF rewrite is triggered automatically, the following log is output:

18827:M 04 Apr 23:30:49.519 * Starting automatic rewriting of AOF on 1054% growth
18827:M 04 Apr 23:30:49.520 * Background append only file rewriting started by pid 21365
18827:M 04 Apr 23:30:50.617 * AOF rewrite child asks to stop sending diffs.
21365:C 04 Apr 23:30:50.618 * Parent agreed to stop sending diffs. Finalizing AOF...
21365:C 04 Apr 23:30:50.618 * Concatenating 0.03 MB of AOF diff received from parent.
21365:C 04 Apr 23:30:50.631 * SYNC append only file rewrite performed
18827:M 04 Apr 23:30:50.641 * Residual parent diff successfully flushed to the rewritten AOF (0.00 MB)
18827:M 04 Apr 23:30:50.641 * Background AOF rewrite finished successfully

What happens internally when an AOF rewrite is triggered? The process runs as follows.

Process description:

1) A request for AOF rewriting is issued.

If the current process is already performing an AOF rewrite, the request is not executed and the following response is returned:

(error) ERR Background append only file rewriting already in progress

If the current process is performing a bgsave operation, the rewrite command is deferred until bgsave completes, and the following response is returned:

Background append only file rewriting scheduled

2) The parent process performs a fork to create a child process. The cost is equivalent to that of the bgsave process.

3.1) After the fork completes, the main process continues to respond to other commands. All modification commands are still written to the AOF buffer and synchronized to disk according to the appendfsync policy, which keeps the original AOF mechanism correct.

3.2) Because fork uses copy-on-write, the child process can only share the memory data as it was at the time of the fork. Since the parent process keeps responding to commands, Redis uses the AOF rewrite buffer to hold this new data so that it is not lost while the new AOF file is being generated.

4) Based on the memory snapshot, the child process writes commands to the new AOF file according to the command merge rules. The amount of data flushed to disk in each batch is controlled by the aof-rewrite-incremental-fsync configuration; by default data is flushed every 32MB, which prevents the disk from blocking because too much data is flushed at once.

5.1) After the new AOF file is written, the child process sends a signal to the parent process, and the parent process updates its statistics. See the aof_* fields under info persistence for details.

5.2) The parent process writes the data in the AOF rewrite buffer to the new AOF file.

5.3) The old file is replaced with the new AOF file, completing the AOF rewrite.
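The role of the AOF rewrite buffer can be illustrated with a toy simulation in Python. Everything here is a simplification under stated assumptions: the dataset is a plain dictionary of strings, only set and del are modelled, fork is imitated by copying the dictionary, and the names apply_command, simulate_rewrite, and replay are hypothetical.

```python
from typing import Dict, List

Command = List[str]


def apply_command(dataset: Dict[str, str], cmd: Command) -> None:
    """Apply a 'set key value' or 'del key' command to the dataset."""
    op = cmd[0].lower()
    if op == "set":
        dataset[cmd[1]] = cmd[2]
    elif op == "del":
        dataset.pop(cmd[1], None)
    else:
        raise ValueError(f"unsupported command: {cmd[0]}")


def simulate_rewrite(dataset: Dict[str, str],
                     writes_during_rewrite: List[Command]) -> List[Command]:
    # Step 2: "fork". The child sees the data as it was at this moment.
    snapshot = dict(dataset)

    # Step 3: the parent keeps serving writes and also buffers them.
    rewrite_buffer: List[Command] = []
    for cmd in writes_during_rewrite:
        apply_command(dataset, cmd)
        rewrite_buffer.append(cmd)

    # Step 4: the child writes one command per key from the snapshot.
    new_aof = [["set", key, value] for key, value in snapshot.items()]

    # Step 5: the parent appends the buffered commands to the new file.
    new_aof.extend(rewrite_buffer)
    return new_aof


def replay(aof: List[Command]) -> Dict[str, str]:
    """Rebuild a dataset by executing every command in order."""
    data: Dict[str, str] = {}
    for cmd in aof:
        apply_command(data, cmd)
    return data


data = {"a": "1", "b": "2"}
writes = [["set", "a", "3"], ["del", "b"], ["set", "c", "4"]]
new_aof = simulate_rewrite(data, writes)
print(replay(new_aof) == data)
```

Replaying the new file yields the same data the parent holds after the rewrite, which is the property the rewrite buffer is there to protect.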

Restart loading

Both AOF and RDB files can be used for data recovery when the server restarts. The Redis persistence file loading process is as follows.

Process description:

1) When AOF persistence is enabled and an AOF file exists, the AOF file is loaded first and the following log is printed:

18827:M 04 Apr 23:21:01.257 * DB loaded from append only file: 1.207 seconds

2) When AOF is disabled or the AOF file does not exist, the RDB file is loaded and the following log is printed:

7792:M 01 Apr ... * DB loaded from disk: ... seconds

3) After the AOF/RDB file is loaded successfully, Redis starts successfully.

4) When the AOF/RDB file contains errors, Redis fails to start and prints an error message.
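The loading priority described in these steps can be summarized as a small decision function in Python. The function name and its boolean parameters are hypothetical; the sketch only captures which file is chosen, not how it is loaded or validated.

```python
from typing import Optional


def choose_persistence_file(appendonly: bool, aof_exists: bool,
                            rdb_exists: bool) -> Optional[str]:
    """Return which file would be loaded on restart, or None if neither."""
    if appendonly and aof_exists:
        return "aof"  # AOF takes priority when enabled and present
    if rdb_exists:
        return "rdb"  # fall back to the RDB snapshot
    return None       # nothing to load


print(choose_persistence_file(True, True, True))
print(choose_persistence_file(False, True, True))
```

Note that an existing AOF file is ignored when AOF is disabled, so switching appendonly off changes which data set is restored.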

File check

When a corrupted AOF file is loaded, Redis refuses to start and prints the following log:

Bad file format reading the append only file: make a backup of your AOF file, then use ./redis-check-aof --fix

For an AOF file with a format error, back it up first and then repair it with the redis-check-aof --fix command. After the repair, use diff -u to compare the data, find the missing entries, and manually fix those that can be restored.

The end of the AOF file may be incomplete, for example when the machine suddenly loses power and the last command is not fully written. Redis provides the aof-load-truncated configuration to tolerate this situation; it is enabled by default. When this problem is encountered while loading the AOF, Redis ignores it, continues to start, and prints the following warning log:

27752:M 04 Apr 23:47:34.655 # !!! Warning: short read while loading the AOF file !!!
27752:M 04 Apr 23:47:34.655 # !!! Truncating the AOF at offset 1377811 !!!
27752:M 04 Apr 23:47:34.655 # AOF loaded anyway because aof-load-truncated is enabled
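The idea of tolerating an incomplete tail can be sketched in Python by scanning the protocol text and remembering the offset of the last complete command. This is an illustration only, not the logic of Redis or redis-check-aof: the names scan_aof and _read_length are hypothetical, the sketch never modifies a file, and it raises ValueError for data that is malformed in the middle.

```python
from typing import List, Tuple


def _read_length(data: bytes, pos: int, prefix: bytes) -> Tuple[int, int]:
    """Read a '<prefix><number>\\r\\n' header; return (number, next offset)."""
    line_end = data.find(b"\r\n", pos)
    if line_end == -1:
        raise EOFError("header cut short")
    if data[pos:pos + 1] != prefix:
        raise ValueError(f"bad format at offset {pos}")
    return int(data[pos + 1:line_end]), line_end + 2


def scan_aof(data: bytes) -> Tuple[List[List[bytes]], int]:
    """Return (complete commands, offset just after the last complete one)."""
    commands: List[List[bytes]] = []
    pos = 0
    valid_up_to = 0
    while pos < len(data):
        try:
            count, pos = _read_length(data, pos, b"*")
            args = []
            for _ in range(count):
                size, pos = _read_length(data, pos, b"$")
                end = pos + size
                if end + 2 > len(data):
                    raise EOFError("argument cut short")
                if data[end:end + 2] != b"\r\n":
                    raise ValueError(f"bad format at offset {end}")
                args.append(data[pos:end])
                pos = end + 2
        except EOFError:
            break  # incomplete tail: keep only fully read commands
        commands.append(args)
        valid_up_to = pos
    return commands, valid_up_to


first = b"*3\r\n$3\r\nset\r\n$5\r\nhello\r\n$5\r\nworld\r\n"
second = b"*2\r\n$3\r\ndel\r\n$5\r\nhello\r\n"
commands, offset = scan_aof(first + second[:-4])  # simulate a cut-off tail
print(len(commands), offset)
```

The returned offset plays the same role as the offset in the warning log: everything before it is a complete command, and everything after it is the part that would be discarded.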

The content of the blog article is extracted from the book "Redis Development and Operation and maintenance".
