How to implement the persistence of Redis's highly available features

2025-07-09 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article introduces how persistence works as one of the high-availability features of Redis. It walks through the mechanisms with practical cases, and I hope it helps you solve related problems.

Overview of Redis High availability

Before introducing Redis high availability, let's explain what high availability means in the context of Redis. For Web servers, high availability refers to the proportion of time during which the server can be accessed normally, measured by the share of time it provides normal service (99.9%, 99.99%, 99.999%, and so on).

However, in the context of Redis, the meaning of high availability is broader. In addition to ensuring normal service (through master-slave separation, fast disaster recovery, and so on), we also need to consider scaling data capacity, protecting data from loss, and so on.

In Redis, the technologies to achieve high availability mainly include persistence, replication, sentinel and clustering. The following describes their role and what kind of problems have been solved:

Persistence: persistence is the simplest highly available method (sometimes not even classified as a highly available means). Its main function is to back up the data, that is, to store the data on the hard disk, ensuring that the data will not be lost due to the exit of the process.

Replication: replication is the basis of highly available Redis, and both Sentinels and clusters achieve high availability on the basis of replication. Replication mainly realizes the multi-machine backup of data, as well as load balancing and simple fault recovery for read operations.

Defects: failure recovery can not be automated, write operations can not be load balanced, and storage capacity is limited by a single machine.

Sentinel: on the basis of replication, Sentinel achieves automated fault recovery.

Defect: the write operation cannot be load balanced; the storage capacity is limited by a single machine.

Cluster: through clustering, Redis solves the problem that write operations cannot be load balanced and storage capacity is limited by a single machine, and achieves a relatively perfect high availability solution.

Overview of Redis persistence

Persistence function: Redis is an in-memory database, and the data is stored in memory.

In order to avoid data loss caused by process exit, you need to save the data in Redis from memory to the hard disk in some form (data or commands) on a regular basis, and use persistent files to recover data when Redis is restarted next time.

In addition, for disaster backup, persistent files can be copied to a remote location.

Redis persistence is divided into RDB persistence and AOF persistence:

The former saves the current data to the hard disk

The latter saves each write command to the hard disk (similar to MySQL's binlog)

Because AOF persistence is closer to real time, that is, less data is lost when the process exits unexpectedly, AOF is currently the mainstream persistence method, but RDB persistence still has its uses.

RDB persistence and AOF persistence are described below in turn. Due to differences between different versions of Redis, unless otherwise specified, Redis 3.0 shall prevail.

RDB persistence

RDB persistence generates a snapshot of the data in the current process and saves it to the hard disk (so it is also called snapshot persistence). The saved file has the suffix rdb. When Redis restarts, it can read the snapshot file to recover the data.

Trigger condition

RDB persistence can be triggered manually or automatically:

Manual trigger

Automatic trigger

Manual trigger: both the save command and the bgsave command can generate RDB files.

The save command blocks the Redis server process until the RDB file is created, and the server cannot process any command requests during the Redis server blocking.

The bgsave command creates a child process, which is responsible for creating the RDB file, and the parent process (that is, the Redis main process) continues to process the request.

At this point, the server execution log is as follows:

During the execution of the bgsave command, only the fork operation that creates the child process blocks the server, whereas the save command blocks the server for its entire duration.

Therefore, save is basically obsolete and should be avoided in online environments; only the bgsave command is discussed below.

In addition, when RDB persistence is triggered automatically, Redis also selects bgsave instead of save for persistence; the conditions for automatically triggering RDB persistence are described below.

Automatic trigger: the most common case is the save m n directive in the configuration file, which specifies that bgsave is triggered when at least n changes occur within m seconds.

For example, if you look at the default configuration file of Redis (redis.conf in the root directory of Redis under Linux), you can see the following configuration information:

save 900 1 means that if the Redis data has changed at least once when 900 seconds have elapsed, bgsave is executed.

save 300 10 and save 60 10000 work the same way; bgsave is called when any one of the three save conditions is met.

The implementation principle of save m n: the save m n of Redis is realized through serverCron function, dirty counter and lastsave timestamp.

serverCron is a periodic function of the Redis server, executed every 100 ms by default; it maintains the server state, and one of its tasks is to check whether any save m n condition is met and, if so, execute bgsave.

The dirty counter is a state maintained by the Redis server that records how many changes (including additions and deletions) have been made to the server state since the last bgsave/save command was executed, and when the save/bgsave execution is complete, the dirty is reset to 0.

For example, if Redis executes set mykey helloworld, dirty is incremented by 1; if sadd myset v1 v2 v3 is executed, dirty is incremented by 3. Note that dirty records how many changes were made to the server state, not how many data-modifying commands the client executed.

The lastsave timestamp is also a state maintained by the Redis server, recording the last time save/bgsave was successfully executed.

The principle of save m n is as follows: the serverCron function runs every 100 ms; in it, the save m n conditions are traversed, and bgsave is performed as soon as one condition is met.

A save m n condition is met only when both of the following hold:

current time - lastsave > m

dirty >= n
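
The check described above can be sketched in a few lines of Python. This is only an illustration, not the Redis source: the function name should_bgsave and the SAVE_RULES list are hypothetical, and the rules are the three default conditions mentioned in the text (900 1, 300 10, 60 10000).

```python
# Hypothetical rule table: (m seconds, n changes), mirroring "save m n".
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(now, lastsave, dirty, rules=SAVE_RULES):
    """Return True when at least one save m n rule is satisfied.

    A rule is satisfied when more than m seconds have passed since the
    last successful save AND at least n changes have accumulated.
    """
    return any(now - lastsave > m and dirty >= n for m, n in rules)
```

For example, should_bgsave(now=1000, lastsave=0, dirty=1) is True because the 900 1 rule matches, while should_bgsave(now=100, lastsave=0, dirty=50) is False because only the 60-second window has elapsed and it requires 10000 changes.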

Save m n execution log: the following figure shows how the server prints the log when save m n triggers bgsave execution.

In addition to save m n, there are other situations that trigger bgsave:

In the master-slave replication scenario, if the slave node performs a full copy operation, the master node executes the bgsave command and sends the RDB file to the slave node.

RDB persistence is automatically performed when the shutdown command is executed, as shown in the following figure:

Execution process

The conditions for triggering bgsave are described earlier, and the execution flow of the bgsave command is described below, as shown in the following figure:

The five steps in the picture are as follows:

The Redis parent process first checks whether save or a bgsave/bgrewriteaof child process (described in more detail later) is currently executing; if so, the bgsave command returns directly.

The child processes of bgsave and bgrewriteaof cannot run at the same time, mainly for performance reasons: two concurrent child processes would both perform a large number of disk writes, which may cause serious performance problems.

The parent process executes the fork operation to create the child process; during the fork the parent process is blocked, and Redis cannot execute any commands from clients.

After the fork, the bgsave command returns the message "Background saving started"; the parent process is no longer blocked and can respond to other commands.

The child process creates the RDB file: it generates a temporary snapshot file from the memory snapshot of the parent process and then atomically replaces the original file.

The child process signals the parent process that it has finished, and the parent process updates the statistics.
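
The fork, temporary file, and atomic replacement steps can be imitated with a small Python sketch. It assumes a Unix-like system (os.fork is not available on Windows), uses JSON instead of the RDB format, and the names background_save and wait_for_save are hypothetical; it only illustrates the flow and is not how Redis itself is implemented.

```python
import json
import os
import tempfile


def background_save(data, path):
    """Fork a child that writes `data` to `path`; return the child's pid."""
    pid = os.fork()          # the parent is blocked only while fork runs
    if pid != 0:
        return pid           # parent: returns at once and keeps serving
    status = 1
    try:
        # Child: it sees the parent's memory as it was at fork time.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        with os.fdopen(fd, "w") as f:
            json.dump(data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)  # atomically replace the previous snapshot
        status = 0
    finally:
        os._exit(status)       # leave the child without running parent cleanup


def wait_for_save(pid):
    """Parent side: collect the child's exit status (True on success)."""
    _, status = os.waitpid(pid, 0)
    return os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0
```

Changes the parent makes to the data after background_save returns do not appear in the file, because the child works on its own view of memory taken at the moment of the fork.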

RDB file

RDB files are compressed binaries. Here are some details about RDB files.

Storage path

The storage path of RDB files can be configured before startup or dynamically set by command.

Configuration: dir configuration specifies the directory, and dbfilename specifies the file name. The default is the dump.rdb file in the Redis root directory.

Dynamic setting: the RDB storage path can also be dynamically modified after Redis startup, which is very useful in the event of disk damage or insufficient space; the execution commands are config set dir {newdir} and config set dbfilename {newFileName}.

As follows (Windows environment):

RDB file format

The RDB file format is shown in the following figure:

The meaning of each field is described as follows:

REDIS: constant that holds five characters of "REDIS".

db_version: the version number of the RDB file. Note that it is not the version number of Redis.

SELECTDB 0 pairs: represents a complete database (database 0). Similarly, SELECTDB 3 pairs represents a complete database 3.

Only when there are key-value pairs in the database will the information of the database be found in the RDB file (only databases 0 and 3 have key-value pairs in the Redis shown above); if all databases in Redis do not have key-value pairs, this part will be omitted directly.

Here, SELECTDB is a constant indicating that a database number follows (0 and 3 are database numbers); pairs stores the key-value pair information, including keys, values, their data types, internal encodings, expiration times, compression information, and so on.

EOF: constant that marks the end of the body content of the RDB file.

check_sum: the checksum of all the preceding content; when Redis loads the RDB file, it computes the checksum of the preceding content and compares it with the check_sum value to determine whether the file is corrupted.
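
To make the layout concrete, here is a rough Python sketch that splits a byte string into the fields listed above. Several details are assumptions not stated in the text: that db_version is stored as four ASCII digits, that SELECTDB and EOF are the single bytes 0xFE and 0xFF, and that check_sum occupies the last 8 bytes. The function name split_rdb is hypothetical, and the sketch does not decode key-value pairs or verify the checksum.

```python
RDB_MAGIC = b"REDIS"
OP_SELECTDB = 0xFE   # assumed opcode value for SELECTDB
OP_EOF = 0xFF        # assumed opcode value for EOF
CHECKSUM_LEN = 8     # assumed checksum width in bytes


def split_rdb(data: bytes):
    """Split raw RDB bytes into (db_version, body, check_sum)."""
    if len(data) < 9 + 1 + CHECKSUM_LEN or data[:5] != RDB_MAGIC:
        raise ValueError("not an RDB file")
    version = data[5:9]
    if not version.isdigit():
        raise ValueError("bad db_version field")
    body, check_sum = data[9:-CHECKSUM_LEN], data[-CHECKSUM_LEN:]
    if body[-1] != OP_EOF:
        raise ValueError("missing EOF marker")
    # body[:-1] holds the SELECTDB / pairs sections, left undecoded here
    return int(version.decode("ascii")), body[:-1], check_sum
```

Called on a synthetic buffer such as b"REDIS0006" + bytes([OP_SELECTDB, 0, OP_EOF]) + bytes(8), it returns version 6, the two-byte database section, and eight zero bytes as the checksum field.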

Compress

Redis uses the LZF algorithm to compress RDB files by default. Although compression takes time, it can greatly reduce the size of RDB files, so it is on by default; it can be turned off through the rdbcompression configuration item.

Note that RDB compression is not applied to the file as a whole but to individual strings in the database, and only to strings that reach a certain length (20 bytes).

Load at startup

The loading of the RDB file is performed automatically when the server starts, and there are no special commands. However, because AOF has a higher priority, when AOF is turned on, Redis will load the AOF file first to recover the data.

Only when AOF is turned off will the RDB file be detected and loaded automatically when the Redis server starts. The server is blocked during the loading of the RDB file until the loading is complete.

You can see the execution of automatic loading in the Redis startup log:

When Redis loads the RDB file, it verifies the RDB file. If the file is corrupted, an error will be printed in the log and Redis startup fails.

Summary of common configurations in RDB

The following are the common configuration items for RDB, as well as the default values, which will not be described in detail here:

save m n: the condition under which bgsave is triggered automatically; if no save m n is configured, automatic RDB persistence is off, but RDB persistence can still be triggered in other ways.

stop-writes-on-bgsave-error yes: whether Redis stops executing write commands when bgsave fails; if set to yes, a hard disk problem is noticed promptly, which avoids losing a large amount of data.

If set to no, Redis continues to execute write commands despite bgsave errors; this option can be set to no when the Redis server's system, especially the hard disk, is monitored.

rdbcompression yes: whether to enable RDB file compression.

rdbchecksum yes: whether to enable the checksum of RDB files, which takes effect when writing and loading the file; turning it off can bring about a 10% performance improvement when writing the file and at startup, but data corruption can no longer be detected.

dbfilename dump.rdb: the RDB file name.

dir ./: the directory where the RDB file and AOF file are located.

AOF persistence

RDB persistence writes process data to a file, while AOF persistence (that is, Append Only File persistence) records each write command executed by Redis in a separate log file (a bit like MySQL's binlog). When Redis restarts, execute the commands in the AOF file again to recover the data.

Compared with RDB, AOF has better real-time performance, so it has become the mainstream persistence scheme.

Turn on AOF

On the Redis server, RDB is enabled by default and AOF is disabled. To enable AOF, set appendonly yes in the configuration file.

Execution process

Since each write command of Redis needs to be recorded, AOF does not need to be triggered, so the execution flow of AOF is described below.

The execution process of AOF includes:

Command append (append): appends Redis's write command to the buffer aof_buf.

File writing (write) and file synchronization (sync): synchronize the contents of aof_buf to the hard disk according to different synchronization strategies.

File rewriting (rewrite): rewrite AOF files regularly to achieve the purpose of compression.

Command append (append)

Redis appends the write command to the buffer first instead of writing directly to the file, mainly to avoid writing the write command directly to the hard disk every time, causing the hard disk IO to become the bottleneck of the Redis load.

The appended command is stored in the Redis command request protocol format, a plain text format with the advantages of good compatibility, readability, ease of processing, simple operation, and no secondary overhead.

In the AOF file, apart from the select command that specifies the database (for example, select 0 selects database 0), which Redis adds itself, all write commands are those sent by clients.
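
As an illustration of that request protocol format, the following Python sketch encodes a command the way it would appear in an AOF file: an argument count line starting with *, then each argument preceded by its byte length on a line starting with $. The helper name encode_command is hypothetical.

```python
def encode_command(*args):
    """Encode one command in the Redis request protocol (array of bulk strings)."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        raw = arg if isinstance(arg, bytes) else str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(raw), raw))
    return b"".join(parts)


# set mykey helloworld becomes:
# b"*3\r\n$3\r\nset\r\n$5\r\nmykey\r\n$10\r\nhelloworld\r\n"
```

Because the format is plain text with explicit lengths, it can be read by eye and parsed without any knowledge of the internal data structures.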

File write (write) and file synchronization (sync)

Redis provides several file synchronization policies for the AOF buffer. They involve the operating system's write and fsync functions, described below:

In order to improve the efficiency of file writing, in modern operating systems, when users call the write function to write data to a file, the operating system usually temporarily stores the data in a memory buffer. When the buffer is filled or exceeds the specified time limit, the buffer data is actually written to the hard disk.

Although this operation improves efficiency, it also brings security problems: if the computer is down, the data in the memory buffer will be lost.

Therefore, the system also provides synchronization functions such as fsync and fdatasync, which can force the operating system to write the data in the buffer to the hard disk immediately, thus ensuring the security of the data.

The file synchronization policy of the AOF buffer is controlled by the appendfsync parameter, with the following values:

always: immediately after a command is written into aof_buf, the system fsync operation is called to synchronize it to the AOF file, and the thread returns after fsync completes.

In this case, every write command must be synchronized to the AOF file, so hard disk IO becomes the performance bottleneck; Redis can sustain only a few hundred write TPS, which seriously reduces its performance.

Even with solid state drives (SSD), only tens of thousands of commands can be processed per second, and the lifespan of the SSD is greatly reduced.

no: after a command is written into aof_buf, the system write operation is called, and no fsync is performed on the AOF file. Synchronization is left to the operating system, and the synchronization period is usually 30 seconds.

In this case, the timing of file synchronization is uncontrollable and a lot of data accumulates in the buffer, so data safety cannot be guaranteed.

everysec: after a command is written into aof_buf, the system write operation is called, and the thread returns after write completes; fsync is called once per second by a dedicated thread.

everysec is a compromise between the two strategies above, balancing performance and data safety, so it is the default configuration of Redis and the one we recommend.
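
The difference between the three policies can be sketched with the operating system's write and fsync calls, which Python exposes as os.write and os.fsync. The class AofWriter and its fsync_calls counter are hypothetical. One deliberate simplification: under everysec this sketch calls fsync inline once a second has passed, whereas the text describes a dedicated thread doing that work.

```python
import os
import time


class AofWriter:
    """Toy append-only writer illustrating always / everysec / no."""

    def __init__(self, path, policy="everysec"):
        if policy not in ("always", "everysec", "no"):
            raise ValueError(policy)
        self.policy = policy
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.buf = bytearray()           # plays the role of aof_buf
        self.last_fsync = time.monotonic()
        self.fsync_calls = 0

    def append(self, command: bytes):
        self.buf += command              # command append: memory only

    def flush(self):
        if self.buf:
            os.write(self.fd, bytes(self.buf))   # may stay in the OS buffer
            self.buf.clear()
        now = time.monotonic()
        due = self.policy == "everysec" and now - self.last_fsync >= 1.0
        if self.policy == "always" or due:
            os.fsync(self.fd)            # force the data onto the disk
            self.last_fsync = now
            self.fsync_calls += 1

    def close(self):
        os.close(self.fd)
```

With policy "always" every flush pays for an fsync, with "no" none does, and with "everysec" at most about one flush per second does, which is where the trade-off between throughput and the amount of data at risk comes from.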

File rewriting (rewrite)

With the passage of time, the Redis server executes more and more write commands, and the AOF file will become larger and larger; too large AOF files will not only affect the normal operation of the server, but also cause the data recovery time to be too long.

File rewriting refers to regularly rewriting AOF files to reduce the size of AOF files. It should be noted that AOF rewriting converts the data in the Redis process into write commands and synchronizes it to the new AOF file; it does not read or write to the old AOF file!

Another thing to note about file rewriting is that file rewriting is highly recommended but not necessary for AOF persistence. Even if there is no file rewriting, the data can be persisted and imported when Redis starts.

So in some implementations, automatic file rewriting is turned off and then executed regularly at a certain time of day through a scheduled task.

The reason why file rewriting can compress AOF files is that:

Expired data is no longer written to the file.

Invalid commands are no longer written to the file: for example, some data is set repeatedly (set mykey v1, set mykey v2), some data is deleted (sadd myset v1, del myset), and so on.

Multiple commands can be merged into one: for example, sadd myset v1, sadd myset v2, sadd myset v3 can be merged into sadd myset v1 v2 v3.

However, to prevent a single oversized command from overflowing the client buffer, a key of list, set, hash, or zset type is not necessarily rewritten as a single command.

Instead, the command is split into several commands, each bounded by a constant number of elements. This constant is defined in redis.h/REDIS_AOF_REWRITE_ITEMS_PER_CMD and cannot be changed. Its value in version 3.0 is 64.

As can be seen from the above, because AOF executes fewer commands after rewriting, file rewriting can not only reduce the space occupied by files, but also speed up recovery.
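
The merge-and-split rule for a set can be sketched as follows. The function name rewrite_set is hypothetical, the limit of 64 is the version 3.0 value quoted above, and sorting the members is done only to make the output deterministic in this illustration.

```python
ITEMS_PER_CMD = 64  # value of REDIS_AOF_REWRITE_ITEMS_PER_CMD quoted in the text


def rewrite_set(key, members, items_per_cmd=ITEMS_PER_CMD):
    """Turn the current members of a set into as few sadd commands as allowed."""
    ordered = sorted(members)
    return [
        ["sadd", key, *ordered[i:i + items_per_cmd]]
        for i in range(0, len(ordered), items_per_cmd)
    ]
```

A set with three members is rewritten as the single command sadd myset v1 v2 v3, regardless of how many sadd or srem commands produced it, while a set with 130 members is split into three commands carrying 64, 64, and 2 members.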

Trigger of file rewrite

The trigger of file rewriting can be divided into manual trigger and automatic trigger:

Manual trigger: call the bgrewriteaof command directly. It executes somewhat like bgsave: both fork a child process to do the actual work, and both block only during the fork.

At this point, the server execution log is as follows:

Automatic trigger: the trigger time is determined by the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage parameters, together with the aof_current_size and aof_base_size status values:

auto-aof-rewrite-min-size: the minimum size the AOF file must reach before an AOF rewrite is performed. The default value is 64MB.

auto-aof-rewrite-percentage: the ratio of the current AOF size (aof_current_size) to the AOF size after the last rewrite (aof_base_size) required before an AOF rewrite is performed.

The parameters can be viewed through the config get command:

The status can be viewed through info persistence:

AOF rewriting, that is, the bgrewriteaof operation, is triggered automatically only when both the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage conditions are met.
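
A minimal sketch of this double condition is shown below. It assumes that the percentage is measured as growth relative to aof_base_size (so 100 means the file has doubled since the last rewrite) and that a base size of 0 is treated as 1 to avoid division by zero; both are assumptions made for illustration, and the function name should_auto_rewrite is hypothetical. The defaults are the 64MB and 100 values from the configuration summary.

```python
def should_auto_rewrite(aof_current_size, aof_base_size,
                        min_size=64 * 1024 * 1024, percentage=100):
    """Return True when both automatic rewrite conditions hold."""
    base = aof_base_size or 1                        # assumed guard for an empty base
    growth = aof_current_size * 100 // base - 100    # growth in percent
    return aof_current_size > min_size and growth >= percentage
```

With these defaults, a 130MB file whose base size is 60MB qualifies, a 100MB file with the same base does not (it has grown by only about 66%), and a 10MB file never qualifies because it is below the minimum size.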

When bgrewriteaof is triggered automatically, you can see that the server log is as follows:

The process of file rewriting

The file rewriting process is shown in the following figure:

With regard to the process of file rewriting, there are two points to pay special attention to:

The rewrite is performed by a child process forked from the parent process.

Write commands executed by Redis during the rewrite must also be appended to the new AOF file, which is why Redis introduces the aof_rewrite_buf buffer.

Compared with the figure above, the process of file rewriting is as follows:

1) The Redis parent process first checks whether a child process is currently executing bgsave or bgrewriteaof. If a bgrewriteaof is in progress, the bgrewriteaof command returns directly; if a bgsave is in progress, the rewrite waits until bgsave completes. This is mainly for performance reasons.

2) The parent process performs the fork operation to create the child process; the parent process is blocked during the fork.

3.1) After the fork, the bgrewriteaof command returns the message "Background append only file rewrite started"; the parent process is no longer blocked and can respond to other commands.

All Redis write commands are still written to the AOF buffer and synchronized to the hard disk according to the appendfsync policy, which keeps the original AOF mechanism correct.

3.2) Because fork uses copy-on-write, the child process shares only the in-memory data as it was at the time of the fork.

Because the parent process is still responding to commands, Redis uses the AOF rewrite buffer (aof_rewrite_buf in the figure) to save this part of the data so that it is not lost while the new AOF file is being generated.

That is, during the execution of bgrewriteaof, Redis write commands are appended to both the aof_buf and aof_rewrite_buf buffers.

4) Based on the memory snapshot, the child process writes to the new AOF file following the command merge rules.

5.1) After the child process finishes writing the new AOF file, it signals the parent process, and the parent process updates the statistics, which can be viewed through info persistence.

5.2) The parent process writes the data in the AOF rewrite buffer to the new AOF file, which ensures that the database state saved in the new AOF file is consistent with the current state of the server.

5.3) Replace the old file with the new AOF file to complete the AOF rewrite.
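
The role of the rewrite buffer is easier to see in a small in-memory simulation. Everything here is a simplification for illustration: the Server class is hypothetical, only set and del are supported, a copied dictionary stands in for the forked child's memory snapshot, and a Python list stands in for the AOF file.

```python
def apply(db, cmd):
    """Apply one ("set", key, value) or ("del", key) command to a dict."""
    op, key, *rest = cmd
    if op == "set":
        db[key] = rest[0]
    elif op == "del":
        db.pop(key, None)
    else:
        raise ValueError(op)


class Server:
    def __init__(self):
        self.db = {}
        self.aof = []             # stands in for the AOF file on disk
        self.rewrite_buf = None   # aof_rewrite_buf, only set during a rewrite
        self.snapshot = None

    def execute(self, cmd):
        apply(self.db, cmd)
        self.aof.append(cmd)                  # normal AOF path keeps working
        if self.rewrite_buf is not None:
            self.rewrite_buf.append(cmd)      # also remembered for the new file

    def start_rewrite(self):
        self.snapshot = dict(self.db)         # what the forked child would see
        self.rewrite_buf = []

    def finish_rewrite(self):
        new_aof = [("set", k, v) for k, v in self.snapshot.items()]
        new_aof += self.rewrite_buf           # parent appends buffered writes
        self.aof = new_aof                    # new file replaces the old one
        self.rewrite_buf = self.snapshot = None


def replay(aof):
    """Rebuild a database by executing every command in an AOF list."""
    db = {}
    for cmd in aof:
        apply(db, cmd)
    return db
```

If commands arrive between start_rewrite and finish_rewrite, replaying the new AOF still reproduces the server's current data, because the snapshot commands are followed by exactly the writes that happened during the rewrite.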

Load at startup

As mentioned earlier, when AOF is turned on, Redis will first load the AOF file to recover the data when it starts; only when AOF is closed will the RDB file be loaded to recover the data.

When AOF is enabled and the AOF file exists, the Redis startup log:

When AOF is enabled, but the AOF file does not exist, it will not be loaded even if the RDB file exists (some earlier versions may be loaded, but 3.0 will not). The Redis startup log is as follows:

File check

Similar to loading a RDB file, when Redis loads an AOF file, it verifies the AOF file. If the file is corrupted, an error will be printed in the log and Redis startup will fail.

However, if the end of the AOF file is incomplete (the tail of the file is easily incomplete due to sudden machine downtime, etc.), and the aof-load-truncated parameter is enabled, a warning will be output in the log. Redis ignores the tail of the AOF file and starts successfully.

The aof-load-truncated parameter is enabled by default:

Pseudo client

Redis commands can only be executed in the context of a client, but when the AOF file is loaded, commands are read directly from the file rather than sent by a client.

So before loading the AOF file, the Redis server creates a client without a network connection and uses it to execute the commands in the AOF file; the effect is exactly the same as with a client that has a network connection.
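
The loading step, including the tolerance for an incomplete tail, can be sketched as a small parser. It assumes the AOF content is a sequence of commands in the request protocol format described earlier; the names load_aof and TruncatedAOF are hypothetical, and the load_truncated flag only imitates the spirit of aof-load-truncated. The returned commands are what a pseudo client would then execute one by one.

```python
class TruncatedAOF(Exception):
    """Raised when the data ends in the middle of a command."""


def _read_line(data, pos):
    end = data.find(b"\r\n", pos)
    if end == -1:
        raise TruncatedAOF
    return data[pos:end], end + 2


def _read_command(data, pos):
    line, pos = _read_line(data, pos)
    if not line.startswith(b"*"):
        raise ValueError("corrupted AOF: expected '*'")
    args = []
    for _ in range(int(line[1:])):
        line, pos = _read_line(data, pos)
        if not line.startswith(b"$"):
            raise ValueError("corrupted AOF: expected '$'")
        n = int(line[1:])
        if pos + n + 2 > len(data):
            raise TruncatedAOF
        args.append(data[pos:pos + n])
        pos += n + 2                      # skip the argument and its CRLF
    return args, pos


def load_aof(data: bytes, load_truncated: bool = True):
    """Return the complete commands; optionally ignore a truncated tail."""
    commands, pos = [], 0
    while pos < len(data):
        try:
            cmd, pos = _read_command(data, pos)
        except TruncatedAOF:
            if load_truncated:
                break                     # keep what was read so far
            raise
        commands.append(cmd)
    return commands
```

A file that ends halfway through its last command yields all earlier commands when load_truncated is True and raises TruncatedAOF otherwise, while a malformed command in the middle always raises ValueError.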

Summary of common configurations in AOF

The following are the common configuration items for AOF, as well as the default values:

appendonly no: whether to enable AOF.

appendfilename "appendonly.aof": the AOF file name.

dir ./: the directory where the RDB file and AOF file are located.

appendfsync everysec: the fsync persistence strategy.

no-appendfsync-on-rewrite no: whether to disable fsync during AOF rewriting; if this option is turned on, it reduces the load on the CPU and hard disk (especially the hard disk) during file rewriting, but data may be lost during the rewrite; a balance must be struck between load and safety.

auto-aof-rewrite-percentage 100: one of the trigger conditions for file rewriting.

auto-aof-rewrite-min-size 64mb: one of the trigger conditions for file rewriting.

aof-load-truncated yes: whether the AOF file is still loaded when Redis starts if the end of the file is corrupted.

Scheme selection and frequently asked questions

Earlier, we introduced the details of RDB and AOF persistence solutions. The following describes the characteristics of RDB and AOF, how to choose the persistence scheme, and the problems often encountered in the persistence process.

Advantages and disadvantages of RDB and AOF

RDB and AOF have their own advantages and disadvantages:

RDB persistence

Advantages: RDB files are compact and small, transfer quickly over the network, and are suitable for full replication; recovery is much faster than with AOF. One of the most important advantages of RDB over AOF is that it has relatively little impact on performance.

Disadvantages: the fatal disadvantage of RDB is that snapshot-based persistence cannot be done in real time. Now that data is increasingly important, losing a large amount of data is often unacceptable, so AOF persistence has become the mainstream.

In addition, RDB files must follow a specific format and have poor compatibility (for example, an old version of Redis cannot read RDB files produced by a new version).

AOF persistence

In contrast to RDB persistence, AOF has the advantages of second-level persistence and good compatibility, while its disadvantages are large files, slow recovery, and a greater impact on performance.

Persistence strategy selection

Before introducing persistence strategies, it is important to understand that whether it is RDB or AOF, there is a performance price to turn on persistence:

For RDB persistence, on the one hand, the Redis main process blocks while bgsave performs the fork; on the other hand, the child process writing data to the hard disk also creates IO pressure.

For AOF persistence, data is written to the hard disk far more frequently (every second under the everysec policy), so the IO pressure is even greater and may even cause the AOF append blocking problem (described in more detail later).

In addition, the rewriting of AOF files is similar to RDB's bgsave, which can cause blocking in fork and IO pressure problems in child processes.

Relatively speaking, because AOF writes data to the hard disk more frequently, it has a greater impact on the performance of the Redis main process.

In the actual production environment, there will be a variety of persistence strategies according to the amount of data, application security requirements for data, budget constraints and other conditions.

Such as not using any persistence at all, using one of RDB or AOF, or turning on RDB and AOF persistence at the same time.

In addition, the choice of persistence must be considered together with the Redis master-slave strategy, because master-slave replication, like persistence, also serves as a data backup, and the master and the slave can choose their persistence schemes independently.

The following scenarios discuss the choice of persistence strategy, and the discussion is for reference only. The actual solution may be more complex and diverse:

If it does not matter whether the data in Redis is lost entirely (for example, Redis is used purely as a cache for DB-tier data), no persistence is needed in either a stand-alone or a master-slave architecture.

In a stand-alone environment (which may be common for individual developers), if losing ten minutes or more of data is acceptable, RDB is better for Redis performance; if only seconds of data loss are acceptable, choose AOF.

However, in most cases we configure a master-slave environment. The slave not only provides a hot backup of the data, but also allows read-write splitting to share Redis read requests, and can continue to provide service after the master goes down.

In this case, a feasible approach is:

Master: turn off persistence completely (both RDB and AOF), which lets master performance reach its best.

Slave: disable RDB and enable AOF (or, if data safety requirements are low, enable RDB and disable AOF), and back up the persistent files regularly (for example, to another folder, marked with the backup time).

Then turn off automatic AOF rewriting and add a scheduled task that calls bgrewriteaof once a day when Redis is idle, such as at midnight.

It is worth explaining why persistence is still needed when master-slave replication is enabled and already provides a hot backup of the data.

Because in some special cases, master-slave replication is still not sufficient to ensure the security of data, such as:

Master and slave processes are stopped at the same time: consider such a scenario: if master and slave are in the same building or in the same data room, a power outage may cause master and slave machines to shut down at the same time, and the Redis process will stop; if there is no persistence, the data will be completely lost.

Incorrect restart of master: consider this scenario: the master service goes down because of a failure, and the system has an automatic pull-up mechanism (that is, it restarts a service once it detects that the service has stopped), so the master is restarted automatically. Since there is no persistent file, the master's data is empty after the restart, and the slave's data becomes empty as well after it synchronizes. If neither master nor slave has persistence, the data is again lost completely.

Note that even if Sentinel (described later in the article) is used for automatic master-slave switching, the master may be restarted by the automatic pull-up mechanism before Sentinel's polling detects the failure. Therefore, combining an automatic pull-up mechanism with no persistence should be avoided as far as possible.

Remote disaster recovery: the persistence strategies discussed above are all aimed at general system failures, such as abnormal process exit, downtime, power outage, etc., which will not damage the hard disk.

However, for some disaster situations that may lead to hard disk damage, such as fire and earthquake, remote disaster preparedness is needed.

For example, in the case of a stand-alone machine, you can regularly copy the RDB file or the rewritten AOF file to the remote machine through scp, such as Aliyun, AWS, etc.

In the master-slave case, you can execute bgsave on master regularly, then copy the RDB file to the remote machine, or copy the AOF file to the remote machine after bgrewriteaof rewriting the AOF file on slave.

Generally speaking, RDB files are commonly used in disaster recovery because of their small size and fast recovery. The frequency of remote backups is determined according to the needs of data security and other conditions, but not less than once a day.

Fork blocking: CPU blocking

In practice, many factors limit how much memory a single Redis instance should use, for example:

When requests surge and more slaves need to be added, an overly large Redis memory makes adding them take too long.

When the master goes down and another node takes over, the slaves must be attached to the new master; with too much Redis memory, attaching them is too slow.

The fork operation during persistence.

First, the fork operation itself: a parent process creates a child process by calling fork. After the child is created, parent and child share the code segment but not the data space; the child gets a copy of the parent's data space.

In practice, operating systems implement fork with copy-on-write: until the parent or the child tries to modify the data space, the two processes actually share it.

When either process attempts to modify the data space, the operating system copies the part being modified (one memory page).

Although the child does not copy the parent's data space during fork, it does copy the memory page table (the page table is the index and directory of memory). The larger the parent's data space, the larger the page table, and the longer fork takes.
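The visible effect of the fork semantics described above can be demonstrated with a short experiment. This is only an illustrative sketch: it assumes a Unix-like system where os.fork is available, and it shows the observable result (a write in the child does not affect the parent) rather than the page-level copy-on-write mechanism itself. The function name is made up for this example.

```python
import os


def fork_sees_private_copy():
    """Fork a child, let it modify its data, and report what each process sees."""
    data = ["original"]
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: the write below touches only the child's copy of the page.
        os.close(read_fd)
        data[0] = "changed by child"
        os.write(write_fd, data[0].encode())
        os._exit(0)  # leave immediately without running the parent's cleanup

    # Parent: read what the child saw, then compare with our own copy.
    os.close(write_fd)
    child_view = os.read(read_fd, 1024).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return child_view, data[0]
```

The parent's value stays unchanged even though the child overwrote its own. This is the property that lets a Redis child process write out a consistent snapshot while the parent keeps serving writes.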

In Redis, both bgsave for RDB persistence and bgrewriteaof for AOF rewriting fork a child process to do the work.

If Redis uses too much memory, copying the memory page table during fork takes a long time. The Redis main process is completely blocked during fork, so it cannot respond to client requests, and request latency becomes excessive.

The time fork takes varies with hardware and operating system. Generally speaking, when Redis memory reaches 10GB, fork may take on the order of 100 milliseconds (on a Xen virtual machine it may reach the order of seconds).

For this reason, the memory of a single Redis instance is usually limited to about 10GB. This figure is not absolute, however, and can be adjusted by observing how long fork takes in the production environment.

To observe it, execute the command info stats and check the value of latest_fork_usec, which is reported in microseconds.
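As a rough illustration of the observation step above, the following sketch extracts latest_fork_usec from the text returned by info stats and converts it to milliseconds. The helper names and the sample numbers are made up for this example. The per-GB figure additionally assumes you read used_memory (in bytes) from info memory.

```python
import re


def latest_fork_ms(info_stats):
    """Return latest_fork_usec from `info stats` output, converted to milliseconds."""
    match = re.search(r"^latest_fork_usec:(\d+)\s*$", info_stats, re.MULTILINE)
    if match is None:
        raise ValueError("latest_fork_usec not found in the given text")
    return int(match.group(1)) / 1000.0


def fork_ms_per_gb(fork_ms, used_memory_bytes):
    """Normalize the fork time by memory size to compare different instances."""
    return fork_ms / (used_memory_bytes / 1024 ** 3)


# Illustrative sample only, not output captured from a real server.
sample = "# Stats\r\ntotal_connections_received:10\r\nlatest_fork_usec:95000\r\n"
fork_ms = latest_fork_ms(sample)
```

Tracking this value over time, rather than reading it once, gives a better basis for deciding whether the memory limit of an instance can be raised or should be lowered.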

To reduce blocking caused by fork, besides controlling the memory size of a single Redis instance, you can relax the trigger conditions for AOF rewriting appropriately, and choose physical machines or virtualization technologies that support fork efficiently, for example VMware or KVM virtual machines instead of Xen virtual machines.
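To see what relaxing the trigger conditions for AOF rewriting means in practice, the following sketch models the automatic trigger check. It is a simplified model written for illustration, not the Redis source: it assumes the trigger depends only on the auto-aof-rewrite-min-size and auto-aof-rewrite-percentage settings (defaults 64mb and 100) and on the current and base AOF sizes reported as aof_current_size and aof_base_size.

```python
MB = 1024 * 1024


def should_rewrite(current_size, base_size, min_size=64 * MB, percentage=100):
    """Simplified model of the automatic AOF rewrite trigger.

    current_size: present AOF size in bytes (aof_current_size)
    base_size:    AOF size after the last rewrite in bytes (aof_base_size)
    min_size:     auto-aof-rewrite-min-size
    percentage:   auto-aof-rewrite-percentage (0 disables automatic rewriting)
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size if base_size else 1  # avoid dividing by zero
    growth = current_size * 100 // base - 100  # growth over the base, in percent
    return growth >= percentage
```

Under this model, raising the percentage or the minimum size makes rewrites (and therefore forks) less frequent, at the cost of a larger AOF file between rewrites.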

AOF append blocking: hard disk blocking

As mentioned earlier, when the file synchronization policy of the AOF buffer is everysec, the main thread calls the system write operation after the command has been written to aof_buf, and returns once the write completes.

The fsync file synchronization operation is called once per second by a dedicated file synchronization thread. The problem with this approach is that if the hard disk is overloaded, a single fsync may take longer than 1 second.

If the Redis main thread keeps writing commands to aof_buf at high speed, the load on the hard disk keeps growing and IO resources are consumed even faster. If the Redis process exits abnormally at that point, more and more data is lost, possibly far more than 1 second's worth.

For this reason, Redis handles the situation as follows: each time the main thread performs an AOF write, it checks the time of the last successful fsync. If less than 2s has passed since then, the main thread returns directly; if more than 2s has passed, the main thread blocks until the fsync completes.

Therefore, if a heavy hard disk load makes fsync too slow, the Redis main thread blocks. In addition, with the everysec configuration, AOF may lose up to 2s of data rather than 1s.
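The 2-second rule described above can be written down as a small decision function. This is a simplified model of the behaviour as this article describes it, not the Redis implementation; the function name, the return values, and the choice to treat exactly 2s as blocking are assumptions made for illustration.

```python
FSYNC_GRACE_SECONDS = 2.0  # the 2s threshold described in the text


def everysec_action(now, last_fsync_ok):
    """Decide what the main thread does on an AOF write under everysec.

    now:           current time in seconds
    last_fsync_ok: time of the last successful fsync in seconds
    Returns "return" when the main thread may return directly,
    or "block" when it must wait for the pending fsync to finish.
    """
    if now - last_fsync_ok < FSYNC_GRACE_SECONDS:
        return "return"
    return "block"
```

The model also makes the data-loss bound visible: because the main thread keeps accepting writes for up to 2s after the last successful fsync, up to 2s of commands may not yet be on disk when the process dies.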

How to locate AOF append blocking:

Monitor aof_delayed_fsync in info Persistence: each time AOF append blocking occurs (that is, the main thread blocks waiting for fsync), this metric is incremented.

Check the Redis log, which contains the following message when AOF append blocking occurs: Asynchronous AOF fsync is taking too long (disk is busy?). Writing the AOF buffer without waiting for fsync to complete, this may slow down Redis.

If AOF append blocking occurs frequently, the hard disk load of the system is too heavy. Consider switching to a hard disk with faster IO, or use IO monitoring and analysis tools such as iostat (system-level IO), iotop (a top-like tool for IO), and pidstat to analyze the IO load of the system.

Info command and persistence

Earlier sections mentioned several ways to check persistence-related state with the info command; they are summarized below.

Info Persistence

The command returns many fields; some of the more important ones are:

rdb_last_bgsave_status: the result of the last bgsave, which can be used to detect bgsave errors.

rdb_last_bgsave_time_sec: how long the last bgsave took (in seconds), which can be used to detect whether bgsave takes too long.

aof_enabled: whether AOF is enabled.

aof_last_rewrite_time_sec: how long the last AOF rewrite took (in seconds), which can be used to detect whether the rewrite takes too long.

aof_last_bgrewrite_status: the result of the last bgrewriteaof, which can be used to detect bgrewriteaof errors.

aof_buffer_length and aof_rewrite_buffer_length: the size of the AOF buffer and of the AOF rewrite buffer.

aof_delayed_fsync: the number of times AOF append blocking has occurred.
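The fields listed above lend themselves to a simple automated check. The sketch below parses the text returned by info Persistence and reports which fields look suspicious. The function names and the 60-second thresholds are hypothetical choices made for illustration, and the sample text is invented rather than captured from a real server.

```python
def parse_info(text):
    """Turn `key:value` lines from an info section into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue  # skip blank lines and section headers such as "# Persistence"
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def suspicious_fields(fields, max_bgsave_s=60, max_rewrite_s=60):
    """Return the names of persistence fields that deserve a closer look."""
    flagged = []
    if fields.get("rdb_last_bgsave_status", "ok") != "ok":
        flagged.append("rdb_last_bgsave_status")
    if int(fields.get("rdb_last_bgsave_time_sec", "-1")) > max_bgsave_s:
        flagged.append("rdb_last_bgsave_time_sec")
    if fields.get("aof_last_bgrewrite_status", "ok") != "ok":
        flagged.append("aof_last_bgrewrite_status")
    if int(fields.get("aof_last_rewrite_time_sec", "-1")) > max_rewrite_s:
        flagged.append("aof_last_rewrite_time_sec")
    if int(fields.get("aof_delayed_fsync", "0")) > 0:
        flagged.append("aof_delayed_fsync")
    return flagged


# Invented sample for illustration only.
sample = (
    "# Persistence\r\n"
    "rdb_last_bgsave_status:ok\r\n"
    "rdb_last_bgsave_time_sec:3\r\n"
    "aof_enabled:1\r\n"
    "aof_last_rewrite_time_sec:120\r\n"
    "aof_last_bgrewrite_status:err\r\n"
    "aof_delayed_fsync:4\r\n"
)
flagged = suspicious_fields(parse_info(sample))
```

Since aof_delayed_fsync is a cumulative counter, a real monitor would more usefully alert on its growth between two samples than on a non-zero value; the sketch keeps the simpler check for brevity.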

Info stats

The field most relevant to persistence here is latest_fork_usec, the time taken by the most recent fork (in microseconds).

