
An Example Analysis of Redis Persistence Principles

2025-01-16 Update From: SLTechnology News&Howtos


This article shares the key points of an example analysis of the Redis persistence principle. The content is detailed and the logic is clear. Many people are probably not very familiar with this topic, so the article is offered for reference. Let's take a look.

Redis is an in-memory database: all data is kept in memory. Compared with traditional relational databases such as MySQL, Oracle and SQL Server, which store data directly on disk, Redis reads and writes very efficiently. Keeping data in memory has a major drawback, however: once the power is cut or the server goes down, everything in the in-memory database is lost. To make up for this, Redis can persist in-memory data to files on disk and restore data from those backup files. This is the Redis persistence mechanism.

Redis supports persistence in two ways: RDB snapshots and AOF.

RDB persistence

According to the official description, the RDB persistence scheme performs point-in-time snapshots of your dataset at specified intervals. It saves a memory snapshot of all data objects in the Redis database at a given moment in a compact binary file, which can be used for backup, transfer and recovery of Redis data. It is still the officially supported default scheme.

How RDB works

Since RDB is a point-in-time snapshot of a dataset in Redis, let's take a brief look at how data objects in Redis are stored and organized in memory.

By default, Redis has 16 databases, numbered 0 to 15. Each database is represented by a redisDb object, and redisDb uses a hash table to store the key-value (K-V) objects. To make this easier to understand, take one db as an example of how data is laid out inside Redis. A point-in-time snapshot is the state of every data object in every db at a given moment. Assuming that no data object changes during that moment, we can read the objects one by one and write them to a file following that storage structure, which achieves persistence. When Redis restarts, it reads the file according to the same rules and writes the contents back into memory, restoring the state at the time of persistence.

Of course, this only works if the assumption above holds; a dataset that keeps changing cannot be captured this way. There are two ways to get a stable view. First, client command processing in Redis is single-threaded, so if persistence is treated as a command, the dataset is guaranteed to be at rest while it runs. Second, a child process created by the operating system's fork() function gets the same memory data as the parent process, which is equivalent to obtaining a copy of the memory data; after the fork completes, the parent carries on with its normal work and persisting the state is handed over to the child.

Obviously, the first approach is not desirable: persistent backups would make the Redis service unavailable for a short time, which is intolerable for high-availability systems. The second approach is therefore the main practice for RDB persistence. Because the parent's data keeps changing after the fork and the child is not synchronized with the parent, RDB persistence cannot guarantee real-time durability: a power outage or crash after an RDB save completes loses the data written since that save. The backup frequency determines how much data can be lost, but increasing it means that fork consumes more CPU resources and causes more disk I/O.
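The fork-based approach can be illustrated with a small sketch. This is not Redis code: the `dataset` variable and the function `snapshot_seen_by_child` are hypothetical names introduced here, and the "dataset" is a single integer. It only shows that a child created by fork() keeps the value that existed at fork time, even though the parent changes its own copy afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the in-memory dataset. */
static int dataset;

/* Sets the dataset to `before`, forks, changes the parent's copy to `after`,
 * and returns the value the child still sees (its snapshot), or -1 on error.
 * The value travels back in the exit status, so it must fit in 0..254. */
int snapshot_seen_by_child(int before, int after) {
    assert(before >= 0 && before < 255);

    int fds[2];
    if (pipe(fds) == -1) return -1;
    dataset = before;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: wait until the parent has modified its own copy. */
        char go;
        close(fds[1]);
        if (read(fds[0], &go, 1) != 1) _exit(255);
        _exit(dataset); /* still the value at fork time */
    }

    /* Parent: keeps "serving writes", then lets the child continue. */
    close(fds[0]);
    dataset = after;
    ssize_t n = write(fds[1], "x", 1);
    close(fds[1]);

    int status;
    if (waitpid(pid, &status, 0) == -1) return -1;
    if (n != 1 || !WIFEXITED(status) || WEXITSTATUS(status) == 255) return -1;
    return WEXITSTATUS(status);
}
```

Calling `snapshot_seen_by_child(1, 2)` returns 1: the parent's write of 2 is invisible to the child, which is the property RDB relies on.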

Persistence process

There are two function methods to accomplish RDB persistence in Redis: rdbSave and rdbSaveBackground (in the source file rdb.c). First, briefly describe the difference between the two:

rdbSave: runs synchronously; persistence starts immediately when the function is called. Because Redis is single-threaded, it blocks during persistence and cannot serve requests.

rdbSaveBackground: runs in the background (asynchronously). This function forks a child process; the actual persistence work runs in the child (which calls rdbSave), and the main process continues to serve requests.

RDB persistence is always triggered through one of these two functions, either manually or automatically. A manual trigger is easy to understand: we send a persistence command to the Redis server through a Redis client, and the server then starts the persistence process. The commands are save and bgsave. An automatic trigger is a persistence run that Redis starts on its own when preset conditions are met. Automatic triggering happens in the following scenarios (excerpted from a reference article):

The save m n rules in the configuration are satisfied, as checked in serverCron.

When a slave node performs a full resynchronization, the master node runs bgsave and sends the resulting rdb file to the slave node to complete the copy.

When the debug reload command is executed to reload Redis.

By default (AOF not enabled), when the shutdown command is executed, an RDB save is performed automatically before the server exits.

Based on the source code and reference articles, I sorted out the RDB persistence process to give an overall picture before going into the details. The main points are:

Automatically triggered RDB persistence is a persistence policy executed as a child process through rdbSaveBackground

Manual triggering is triggered by client commands, including save and bgsave commands, where the save command is done in a blocking manner by calling the rdbSave method in the Redis command processing thread.

The automatic trigger process is a complete link, covering rdbSaveBackground, rdbSave, and so on. Next, I will take serverCron as an example to analyze the whole process.

Save rules and Inspection

serverCron is a periodic function in Redis that runs every 100 ms by default. One of its tasks is to decide, based on the save rules in the configuration file, whether an automatic persistence run is needed, and to try to start one if the conditions are met. Let's look at how this part is implemented.

redisServer has several fields related to RDB persistence. I extracted them from the code, keeping the original English comments and adding short notes:

    struct redisServer {
        /* other fields omitted */

        /* RDB persistence */
        long long dirty;                /* Changes to DB from the last save
                                         * (keys modified since the last persistence) */
        struct saveparam *saveparams;   /* Save points array for RDB
                                         * (the save rules in the configuration file) */
        int saveparamslen;              /* Number of saving points */
        time_t lastsave;                /* Unix time of last successful save */

        /* other fields omitted */
    };

    /* Corresponds to one save rule in redis.conf */
    struct saveparam {
        time_t seconds;   /* statistical time window */
        int changes;      /* number of data modifications */
    };

saveparams corresponds to the save rules in redis.conf. A save rule is the policy that triggers an automatic backup: seconds is the statistical time window (in seconds), and changes is the number of writes that occur within that window. save m n means that if at least n writes occur within m seconds, a snapshot (that is, a backup) is triggered. Multiple save rules can be configured to cover different conditions. To turn off the automatic RDB backup policy, use save "". The following are descriptions of several configurations:

    # Save if the value of at least 1 key changed within 900 seconds (15 minutes)
    save 900 1
    # Save if the value of at least 10 keys changed within 300 seconds (5 minutes)
    save 300 10
    # Save if the value of at least 10000 keys changed within 60 seconds (1 minute)
    save 60 10000
    # This setting turns off persistence in RDB mode
    save ""

The serverCron detection code for the RDB save rule is as follows:

    int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
        /* other logic omitted */

        /* If the user requests an AOF rewrite while RDB persistence is running,
         * Redis schedules the rewrite to run after RDB persistence completes;
         * aof_rewrite_scheduled being true means such a request is pending
         * (that code is omitted here). */

        /* Check if a background saving or AOF rewrite in progress terminated. */
        if (hasActiveChildProcess() || ldbPendingChildren()) {
            run_with_period(1000) receiveChildInfo();
            checkChildrenDone();
        } else {
            /* No background saving/rewrite child process is running:
             * check each save rule in turn. */
            for (j = 0; j < server.saveparamslen; j++) {
                struct saveparam *sp = server.saveparams+j;

                /* A rule matches when the change count is reached, the time
                 * window has elapsed, and either the retry delay has passed
                 * or the last background save succeeded. */
                if (server.dirty >= sp->changes &&
                    server.unixtime-server.lastsave > sp->seconds &&
                    (server.unixtime-server.lastbgsave_try >
                     CONFIG_BGSAVE_RETRY_DELAY ||
                     server.lastbgsave_status == C_OK))
                {
                    serverLog(LL_NOTICE,"%d changes in %d seconds. Saving...",
                        sp->changes, (int)sp->seconds);
                    rdbSaveInfo rsi, *rsiptr;
                    rsiptr = rdbPopulateSaveInfo(&rsi);
                    /* run the bgsave process */
                    rdbSaveBackground(server.rdb_filename,rsiptr);
                    break;
                }
            }
            /* omitted: Trigger an AOF rewrite if needed. */
        }
        /* other logic omitted */
    }

If no RDB persistence or AOF rewrite child process is running in the background, serverCron uses the configuration and state above to decide whether to persist: it checks whether lastsave and dirty satisfy any of the rules in the saveparams array. If a rule matches, it calls rdbSaveBackground to run the asynchronous persistence process.
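The rule check can be isolated into a small sketch. The function name `first_matching_save_rule` is an assumption made for illustration; the struct mirrors the saveparam fields quoted earlier, and the retry-delay condition from serverCron is deliberately left out to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Mirrors the saveparam struct from the Redis source quoted in the text. */
struct saveparam {
    time_t seconds; /* statistical time window */
    int changes;    /* number of modifications required */
};

/* Returns the index of the first rule satisfied by the current state,
 * or -1 if none matches. Simplified: the bgsave retry delay is ignored. */
int first_matching_save_rule(const struct saveparam *rules, size_t len,
                             long long dirty, time_t now, time_t lastsave) {
    assert(rules != NULL || len == 0);
    for (size_t j = 0; j < len; j++) {
        if (dirty >= rules[j].changes && now - lastsave > rules[j].seconds)
            return (int)j;
    }
    return -1;
}
```

With the rules `save 900 1`, `save 300 10` and `save 60 10000`, ten changes made 400 seconds after the last save match the second rule (index 1), while the same ten changes after only 60 seconds match nothing.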

rdbSaveBackground

rdbSaveBackground is a helper for RDB persistence. Its main job is to fork a child process; after the fork, the parent and the child follow two different paths.

In the parent process, it records the child process information and returns immediately.

In the child process, it calls rdbSave to run the RDB persistence logic and exits once persistence is complete.

    int rdbSaveBackground(char *filename, rdbSaveInfo *rsi) {
        pid_t childpid;

        if (hasActiveChildProcess()) return C_ERR;

        server.dirty_before_bgsave = server.dirty;
        server.lastbgsave_try = time(NULL);

        /* fork the child process */
        if ((childpid = redisFork(CHILD_TYPE_RDB)) == 0) {
            int retval;

            /* Child: change the process title */
            redisSetProcTitle("redis-rdb-bgsave");
            redisSetCpuAffinity(server.bgsave_cpulist);
            /* run the RDB persistence */
            retval = rdbSave(filename,rsi);
            if (retval == C_OK) {
                sendChildCOWInfo(CHILD_TYPE_RDB, 1, "RDB");
            }
            /* exit the child process once persistence is complete */
            exitFromChild((retval == C_OK) ? 0 : 1);
        } else {
            /* Parent: record the fork time and other child information */
            if (childpid == -1) {
                server.lastbgsave_status = C_ERR;
                serverLog(LL_WARNING,"Can't save in background: fork: %s",
                    strerror(errno));
                return C_ERR;
            }
            serverLog(LL_NOTICE,"Background saving started by pid %ld",(long)childpid);
            /* record the start time and type of the child process */
            server.rdb_save_time_start = time(NULL);
            server.rdb_child_type = RDB_CHILD_TYPE_DISK;
            return C_OK;
        }
        return C_OK; /* unreached */
    }

rdbSave is the function that actually performs persistence, and it involves many time-consuming, CPU-intensive operations. Under Redis's single-threaded model, running it in the main thread would keep that thread busy and leave Redis unable to serve other requests. To solve this, rdbSaveBackground forks a child process that does the persistence work, so the parent's resources are not tied up.

Note that if the parent process uses a lot of memory, the fork itself is time-consuming, and the parent cannot serve requests while it runs. You also need to consider the machine's memory: in the worst case the forked child can need as much memory again as the parent, so make sure enough memory is available. The latest_fork_usec field in the output of the info stats command shows how long the most recent fork took, in microseconds.

rdbSave

rdbSave is the function that actually performs RDB persistence. It has many steps and details, but the overall process can be summarized as: create and open a temporary file, write the Redis in-memory data to the temporary file, flush the temporary file to disk, rename the temporary file to the official RDB file, and update the persistence state (dirty, lastsave). Writing the in-memory data to the temporary file is the core and most complex step, and it directly reflects the RDB file format. Within it, the step that traverses the key-value pairs of the current database writes each one in a format that depends on the Redis data type and its underlying data structure; that is not expanded on here. An intuitive understanding of the whole process is very helpful for understanding the inner workings of Redis.
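The "write to a temporary file, flush, then rename" sequence can be sketched on its own. This is a simplified illustration, not the rdbSave implementation: the function name `save_atomically` and the file names used with it are assumptions, and the payload is an arbitrary byte buffer rather than a real RDB encoding.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Writes `len` bytes to `tmpname`, forces them to disk, then renames the
 * temporary file over `target`. Readers of `target` therefore see either
 * the old complete file or the new complete file, never a partial one.
 * Returns 0 on success, -1 on error (the temporary file is removed). */
int save_atomically(const char *tmpname, const char *target,
                    const void *payload, size_t len) {
    assert(tmpname != NULL && target != NULL);

    FILE *fp = fopen(tmpname, "wb");
    if (fp == NULL) return -1;

    if (fwrite(payload, 1, len, fp) != len || /* user-space buffer */
        fflush(fp) == EOF ||                  /* hand data to the kernel */
        fsync(fileno(fp)) == -1) {            /* force kernel buffers to disk */
        fclose(fp);
        unlink(tmpname);
        return -1;
    }
    if (fclose(fp) == EOF || rename(tmpname, target) == -1) {
        unlink(tmpname);
        return -1;
    }
    return 0;
}
```

The rename is the step that makes the new snapshot visible; if the process dies earlier, only the temporary file is affected and the previous snapshot stays intact.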

AOF persistence

As the previous section showed, RDB is a point-in-time snapshot, well suited to data backup and disaster recovery. Because of how it works, however, it cannot guarantee real-time persistence, which is a serious weakness for systems with zero tolerance for data loss. That is why AOF exists.

How AOF works

AOF is short for Append Only File. It is Redis's fully durable persistence strategy and has been supported since version 1.1. The file stores the commands that change Redis data (such as set/hset/del), appended in the order in which Redis Server processes them. When Redis restarts, it can read the commands in the AOF from the beginning and replay them, restoring the data state from before the shutdown.

AOF persistence is disabled by default. Modify the following information in redis.conf and restart to enable AOF persistence.

    # no: disabled, yes: enabled. Default is no
    appendonly yes
    appendfilename appendonly.aof

In essence AOF, like RDB, is about persistence: the thing being persisted is the state of every key in Redis, and the goal is to restore the state from before a restart or failure. Unlike RDB, AOF persists each command that can change the state of an object in Redis, in execution order; the commands are ordered and selective. Copy the aof file to any Redis Server and replay the commands in order from beginning to end, and the data is restored. For example:

First execute set number 0, then call incr number and get number five times each in any interleaved order, and finally execute get number once more; the result must be 5.

Because in this process, only instructions of type set/incr can cause changes in the state of number, and the order in which they are executed is known, no matter how many times get is executed will not affect the state of number. So, just keep all set/incr commands and persist to the aof file. According to the design principle of aof, the content in the aof file should be like this (hypothetically, it is actually the RESP protocol):

    set number 0
    incr number
    incr number
    incr number
    incr number
    incr number
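Since the real file content is RESP rather than the plain text above, here is a minimal sketch of how one command is encoded as a RESP array of bulk strings. The function `resp_encode` is a hypothetical helper written for this example, not a Redis function, and it relies on strlen, so unlike the real protocol handling it is not binary safe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Encodes argc arguments as a RESP array of bulk strings into buf:
 *   *<argc>\r\n  followed by  $<len>\r\n<arg>\r\n  for each argument.
 * Returns the number of bytes written (excluding the trailing NUL),
 * or -1 if buf is too small. */
int resp_encode(char *buf, size_t cap, int argc, const char **argv) {
    assert(buf != NULL && argc >= 0);

    size_t used = 0;
    int n = snprintf(buf, cap, "*%d\r\n", argc);
    if (n < 0 || (size_t)n >= cap) return -1;
    used += (size_t)n;

    for (int i = 0; i < argc; i++) {
        n = snprintf(buf + used, cap - used, "$%zu\r\n%s\r\n",
                     strlen(argv[i]), argv[i]);
        if (n < 0 || (size_t)n >= cap - used) return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Encoding the three arguments `set`, `number`, `0` yields `*3\r\n$3\r\nset\r\n$6\r\nnumber\r\n$1\r\n0\r\n`, which is the shape in which such a command is stored in the aof file.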

The core principle can be summed up in two words: command replay. However, given the complexity of real production environments and the limits of the operating system, Redis has far more to deal with than this example:

After Redis Server starts, commands are appended to the aof file continuously and the file keeps growing. The larger the file, the longer recovery takes after a restart and the harder the file is to transfer; left unmanaged, it may fill the disk. Clearly the file needs to be slimmed down at the right time. The five incr commands in the example can obviously be replaced by a single set command, so there is a lot of room for compaction.

As is well known, file I/O is a performance weak point of the operating system, and a complex caching mechanism has been built around it to improve efficiency. In Redis, appending a command only writes the data to a buffer (aof_buf); how and when the buffer reaches the physical file is a trade-off between performance and safety.

Compacting the file means rewriting it. A rewrite can either merge commands from the existing aof file, or first take a snapshot of the current data state in Redis and then append the new commands that arrive while the snapshot is being stored.

The purpose of the aof backup is to recover data. Redis therefore also needs a complete loading scheme that takes the format and integrity of the aof file into account.

Persistence process

From a process point of view, AOF works in several steps: command append (append), file write and synchronization (fsync), file rewrite (rewrite), and loading on restart (load). The following sections go through the details of each step and the design thinking behind it.

Command append

When AOF persistence is enabled, after executing a write command Redis appends it, in protocol format (that is, RESP, the communication protocol between Redis clients and the server), to the end of the AOF buffer maintained by the Redis server. The AOF file only sees single-threaded append operations, with no complex operations such as seek, so a power outage or crash does not corrupt data that has already been written. Using a text protocol also has several benefits:

Text protocols have good compatibility.

The text protocol is the request command of the client, which does not need secondary processing, so it saves the processing overhead of storage and loading.

The text protocol is readable and easy to view, modify and so on.

The AOF buffer is an sds, a data structure designed by Redis itself. Depending on the type of command, Redis uses different functions (catAppendOnlyGenericCommand, catAppendOnlyExpireAtCommand, and so on) to process the command contents before writing them to the buffer.

It is important to note that if AOF rewriting is in progress when commands are appended, these commands are also appended to the rewrite buffer (aof_rewrite_buffer).

File writing and synchronization

Writing and synchronizing the AOF file depends on support from the operating system, so some background on Linux I/O buffering is needed first. Disk performance is poor, and file reads and writes are far slower than the CPU. If every file write waited for the data to reach the disk, the operating system as a whole would slow down. To solve this, the operating system provides a delayed write mechanism.

Traditional UNIX implementations have a buffer cache or page cache in the kernel, and most disk I/O goes through it. When data is written to a file, the kernel usually copies it to a buffer first. If the buffer is not yet full, it is not put on the output queue; the kernel waits until the buffer fills up, or until it needs to reuse the buffer for other disk block data, then puts it on the output queue, and the actual I/O is performed when the buffer reaches the head of the queue. This style of output is called delayed write.

Delayed write reduces the number of disk reads and writes, but it slows down updates to the file contents: data written to a file may not reach the disk for some time. If the system fails, this delay can cause file updates to be lost. To keep the file system on disk consistent with the buffer cache, UNIX systems provide three functions, sync, fsync and fdatasync, that force data to be written to disk.

Before the end of each event loop iteration (in beforeSleep), Redis calls flushAppendOnlyFile. This function writes the data in the AOF buffer (aof_buf) to the kernel buffer and, depending on the appendfsync setting, decides which policy to use for writing the kernel buffer to disk, that is, when to call fsync(). The setting has three possible values, always, no and everysec:

always: call fsync() on every write. This is the safest policy and the slowest.

no: never call fsync(); the operating system decides when to flush. This gives the best performance and the weakest safety.

everysec: call fsync() at most once per second. This is the officially recommended policy and the default, balancing performance and data safety. In theory, only about one second of data is lost if the system goes down suddenly.
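The three policies can be sketched as a single flush function. This is a hypothetical miniature, not flushAppendOnlyFile itself: the names `open_aof`, `flush_aof` and `enum fsync_policy` are assumptions for illustration, the current time is passed in by the caller, and fsync() is called inline for simplicity, whereas real servers may move it to another thread.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

enum fsync_policy { FSYNC_NO, FSYNC_EVERYSEC, FSYNC_ALWAYS };

/* Opens (or creates) a file for appending only. */
int open_aof(const char *path) {
    return open(path, O_WRONLY | O_APPEND | O_CREAT, 0644);
}

/* Writes buf to fd, then calls fsync() according to the policy.
 * *last_fsync holds the time of the previous fsync and is updated.
 * Returns 1 if fsync() was called, 0 if it was skipped, -1 on error. */
int flush_aof(int fd, const char *buf, size_t len,
              enum fsync_policy policy, time_t now, time_t *last_fsync) {
    assert(last_fsync != NULL);

    /* write() only hands the data to the kernel buffer. */
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n <= 0) return -1;
        buf += n;
        len -= (size_t)n;
    }

    int due = policy == FSYNC_ALWAYS ||
              (policy == FSYNC_EVERYSEC && now > *last_fsync);
    if (!due) return 0;

    /* fsync() forces the kernel buffer for this file to disk. */
    if (fsync(fd) == -1) return -1;
    *last_fsync = now;
    return 1;
}
```

With FSYNC_EVERYSEC, two flushes within the same second produce only one fsync() call; the data from the second flush stays in the kernel buffer until a later second, which is where the "about one second of data" figure comes from.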

Note: the policies above are affected by the configuration item no-appendfsync-on-rewrite, which tells Redis whether to skip calls to fsync() while a background save or AOF rewrite is in progress. The default is no.

If appendfsync is set to always or everysec, a BGSAVE or BGREWRITEAOF running in the background can consume a lot of disk I/O, and under some Linux configurations Redis's call to fsync() may then block for a long time. This problem has no fix at present, because a synchronous write can be blocked even when fsync() runs in a different thread.

To mitigate the problem, this option can be used to prevent fsync() from being called in the main process while BGSAVE or BGREWRITEAOF is in progress.

Setting it to yes means that while a child process is running BGSAVE or BGREWRITEAOF, AOF persistence behaves as if appendfsync were set to no. In the worst case, this can lose up to 30 seconds of data (with default Linux settings).

If your system has the latency problem described above, set this option to yes; otherwise leave it as no.
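Put together, the settings discussed in this section might look like the following redis.conf fragment. The values shown are the defaults described above, apart from appendonly, which is switched on; treat it as an illustrative starting point rather than a recommendation for every workload.

```conf
# Enable AOF persistence
appendonly yes
# fsync policy: always | everysec | no
appendfsync everysec
# Keep calling fsync() in the main process even while BGSAVE or
# BGREWRITEAOF is running; set to yes only if fsync latency is a problem
no-appendfsync-on-rewrite no
```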

File rewriting

As mentioned earlier, if Redis runs for a long time and commands keep being appended, the AOF file grows larger and larger, and if left unchecked it can fill the disk and threaten the stability of the host.

To address the size of the AOF file, Redis introduced AOF rewriting, which generates a new AOF file from the latest state of the data objects in Redis. The old and new files represent the same data state, but the new file is smaller. Rewriting reduces the disk space used by the AOF file and also speeds up data recovery when Redis restarts. In the earlier number example, the six commands in the old file are equivalent to a single set number 5 command in the new file, so the compaction is significant.

So how large does the AOF file have to be before a rewrite is triggered, and what triggers it? Like RDB, an AOF rewrite can be triggered manually or automatically. A manual trigger calls the bgrewriteaof command directly: it runs immediately if no child process is running at that time, otherwise it is scheduled to run after the child process ends. An automatic trigger is fired by the periodic function serverCron when certain conditions are met. Two configuration items are involved:

auto-aof-rewrite-percentage: the percentage by which the current AOF file size (aof_current_size) has grown compared with the AOF file size after the last rewrite (aof_base_size). The default is 100.

auto-aof-rewrite-min-size: the minimum size the AOF file must reach before an automatic rewrite runs. The default is 64MB.

Redis initializes aof_base_size to the size of the aof file at that time, and updates it when the AOF file rewrite operation is completed during Redis operation; aof_current_size is the real-time size of the AOF file when serverCron executes. AOF file rewriting is triggered when the following two conditions are met:

Growth ratio: (aof_current_size - aof_base_size) / aof_base_size * 100 >= auto-aof-rewrite-percentage

File size: aof_current_size > auto-aof-rewrite-min-size
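The two conditions can be checked with a few lines of integer arithmetic. The function `aof_rewrite_due` is a hypothetical helper named for this example; its arithmetic follows the growth calculation in the serverCron excerpt in this section, including the fallback to a base of 1 when no base size has been recorded.

```c
#include <assert.h>

/* Returns 1 when an automatic AOF rewrite should start, 0 otherwise.
 * Sizes are in bytes; perc is the configured growth percentage, where
 * 0 disables automatic rewriting. */
int aof_rewrite_due(long long current_size, long long base_size,
                    long long min_size, int perc) {
    assert(current_size >= 0 && base_size >= 0);

    if (perc == 0 || current_size <= min_size) return 0;

    long long base = base_size ? base_size : 1;
    long long growth = (current_size * 100 / base) - 100; /* in percent */
    return growth >= perc;
}
```

For example, with a 64MB base size, a 64MB minimum size and a percentage of 100, a 100MB file is not rewritten (56% growth), while a 128MB file is (100% growth).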

The code for manual trigger and automatic trigger is as follows, also in the periodic method serverCron:

    int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
        /* other logic omitted */

        /* If the user requested an AOF rewrite while RDB persistence was
         * running, Redis schedules the rewrite to run once RDB persistence
         * completes. aof_rewrite_scheduled being true means such a request
         * is pending. */
        if (!hasActiveChildProcess() &&
            server.aof_rewrite_scheduled)
        {
            rewriteAppendOnlyFileBackground();
        }

        /* Check if a background saving or AOF rewrite in progress terminated. */
        if (hasActiveChildProcess() || ldbPendingChildren()) {
            run_with_period(1000) receiveChildInfo();
            checkChildrenDone();
        } else {
            /* RDB persistence condition check omitted */

            /* AOF rewrite condition check: AOF is enabled, no child process
             * is running, a growth percentage is configured, and the current
             * file size exceeds the minimum size. */
            if (server.aof_state == AOF_ON &&
                !hasActiveChildProcess() &&
                server.aof_rewrite_perc &&
                server.aof_current_size > server.aof_rewrite_min_size)
            {
                long long base = server.aof_rewrite_base_size ?
                    server.aof_rewrite_base_size : 1;
                /* compute the growth percentage */
                long long growth = (server.aof_current_size*100/base) - 100;
                if (growth >= server.aof_rewrite_perc) {
                    serverLog(LL_NOTICE,"Starting automatic rewriting of AOF on %lld%% growth",growth);
                    rewriteAppendOnlyFileBackground();
                }
            }
        }
        /* other logic omitted */
    }

What does the AOF rewrite process look like? And since Redis supports mixed persistence, how does that affect the rewrite?

Since version 4.0, Redis has offered a mixed persistence scheme within AOF mode, so there are now two variants: pure AOF and RDB+AOF. The choice is controlled by the configuration parameter aof-use-rdb-preamble (use RDB as the first part of the AOF file): no turns it off and yes turns it on. It was off by default when introduced in 4.0; check the redis.conf shipped with your version, since later releases changed the default. As a result, an AOF rewrite writes the file in one of two ways, depending on the value of aof-use-rdb-preamble:

no: write commands in AOF format, the same as in versions before 4.0.

yes: first write the data state in RDB format, then write the contents of the AOF rewrite buffer accumulated during the rewrite in AOF format. The first part of the file is in RDB format and the second part is in AOF format.
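As a configuration sketch, mixed persistence only takes effect when AOF itself is enabled, so the two settings go together in redis.conf:

```conf
# AOF must be enabled for the preamble setting to matter
appendonly yes
# Write an RDB-format preamble followed by AOF-format commands on rewrite
aof-use-rdb-preamble yes
```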

Based on the source code (version 6.0; there is too much to post here, see aof.c) and reference materials, the AOF rewrite (BGREWRITEAOF) process can be summarized as follows:

rewriteAppendOnlyFileBackground starts by checking whether an AOF rewrite or RDB persistence child process is already running: if so, it exits; if not, it goes on to create the pipes used to transfer data between parent and child. It then calls fork(), and after a successful fork the parent and child follow different paths.

Parent process:

Record child process information (pid), timestamp, etc.

Continue to respond to other client requests

Collect commands during AOF rewriting and append to aof_rewrite_buffer

Wait and synchronize the contents of the aof_rewrite_buffer to the child process

Child processes:

Modify the current process name, create a temporary file needed for rewriting, and call the rewriteAppendOnlyFile function

According to the aof-use-rdb-preamble configuration, write the first half in RDB or AOF mode and synchronize to the hard disk

Receive incremental AOF commands from the parent process, write the second half in AOF, and synchronize to the hard disk

Rename the AOF file and the child process exits.

Data loading

After Redis starts, data is loaded by the loadDataFromDisk function. Note that although persistence can use AOF, RDB or both, only one of them is used when loading data; loading from both would leave the data in a mess.

In theory, AOF persistence is closer to real time than RDB, so when AOF persistence is enabled Redis gives priority to the AOF file when loading data. Since AOF supports mixed persistence from Redis 4.0 onward, version compatibility also has to be considered when loading AOF files. In AOF mode, a file generated with mixed persistence enabled has an RDB header followed by an AOF tail; with it disabled, the whole file is in AOF format. To handle both formats, if Redis finds that the AOF file starts with an RDB header, it reads and restores the first part with the RDB loader and then reads and restores the rest as AOF. Because data in AOF format is stored as RESP protocol commands, Redis replays them through a pseudo-client to recover the data.

If a crash occurs while AOF commands are being appended, the last RESP command in the AOF may be incomplete (truncated) because of delayed write. In this case Redis follows the policy set by the configuration item aof-load-truncated, which tells Redis what to do if it reads the aof file at startup and finds it truncated:

yes: load as much data as possible and notify the user through the log.

no: abort startup with an error; the user must repair the file before restarting.
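Detecting which of the two formats a file uses can be sketched by looking at its first bytes: RDB data begins with the five-character signature REDIS, whereas a pure AOF file begins with a RESP command. The function `aof_has_rdb_preamble` is a hypothetical helper for illustration, not the loader used by Redis.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if the file starts with the RDB signature "REDIS" (mixed
 * persistence), 0 if it does not (pure AOF or too short), and -1 if the
 * file cannot be opened. */
int aof_has_rdb_preamble(const char *path) {
    assert(path != NULL);

    FILE *fp = fopen(path, "rb");
    if (fp == NULL) return -1;

    char sig[5];
    size_t n = fread(sig, 1, sizeof(sig), fp);
    fclose(fp);

    return n == sizeof(sig) && memcmp(sig, "REDIS", sizeof(sig)) == 0;
}
```

A loader built on this check would hand the first part of the file to the RDB reader when the function returns 1, and replay the whole file as commands when it returns 0.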
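In redis.conf this is a single line. The fragment below is only an illustration of the tolerant setting described above:

```conf
# Load as much of a truncated AOF as possible and log a warning,
# instead of refusing to start
aof-load-truncated yes
```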

Summary

Redis provides two persistence options. RDB generates point-in-time snapshots of the dataset at specified time intervals. AOF appends every write command received by Redis Server to a log and restores the data by replaying the commands when Redis restarts. The log is in RESP protocol format and is only ever appended to, so already-written data is not at risk of corruption. When the AOF file grows too large, Redis can rewrite it automatically to compact it.

Of course, you can disable persistence entirely if the data does not need to be persisted, but in most cases that is not what you want. RDB and AOF can also be used together; the most important thing is to understand the differences between the two in order to use them properly.

RDB vs AOF

Advantages of RDB

RDB is a compact binary file that represents a snapshot of Redis data at a certain point in time, which is very suitable for scenarios such as backup and full replication.

RDB is very friendly to disaster recovery and data migration, and RDB files can be moved wherever they are needed and reloaded.

RDB is a memory snapshot of Redis data, which has faster data recovery speed and higher performance than AOF command playback.

Shortcomings of RDB

RDB cannot provide real-time or per-second persistence. The snapshot is written by a child process created with fork, and the child's memory is a snapshot of the parent's data at the moment of the fork. After the fork, the parent keeps serving requests and its data keeps changing, while the child's view is no longer updated. There is therefore always a gap between the snapshot and the live data.

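The fork operation during RDB persistence can significantly increase memory usage. The child shares memory pages with the parent through copy-on-write, so in the worst case, when most pages are modified while the snapshot is being written, usage can approach double. In addition, the more data the parent process holds, the longer the fork itself takes.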

Under highly concurrent write traffic, the save rules may be hit frequently, which makes the frequency of fork operations and snapshot writes hard to control.

RDB files have a versioned format that changes between Redis releases, so older versions of Redis may be unable to load RDB files written by newer versions.
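The point about save rules can be illustrated with a small sketch. It assumes rules of the form "save <seconds> <changes>", meaning a snapshot is due once at least that many keys have changed and that many seconds have passed since the last save. The function name `should_bgsave` and the rule values below are illustrative only, not the defaults of any particular Redis version.

```python
# Illustrative rules in (seconds, min_changes) form, as in "save 900 1".
EXAMPLE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_bgsave(changes: int, seconds_since_last_save: int, rules=EXAMPLE_RULES) -> bool:
    """Return True if any save rule is satisfied (hypothetical helper)."""
    return any(
        seconds_since_last_save >= seconds and changes >= min_changes
        for seconds, min_changes in rules
    )


if __name__ == "__main__":
    # A write-heavy instance satisfies the 60-second rule every minute.
    print(should_bgsave(changes=25000, seconds_since_last_save=61))
    # A quiet instance with a handful of writes does not trigger yet.
    print(should_bgsave(changes=5, seconds_since_last_save=120))
```

Under sustained heavy writes the tightest rule is satisfied almost as soon as its time window elapses, which is why the fork frequency ends up being driven by traffic rather than by the operator.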

Advantages of AOF

AOF persistence is closer to real time. The appendfsync option offers three policies: no, everysec, and always. everysec is the default and gives a good balance between performance and safety; in extreme cases, about one second of data may be lost.

AOF files are append-only, with no complex seek or other file operations, so the risk of corruption is low. Even if the last write is truncated, the file can be repaired easily with the redis-check-aof tool.

When the AOF file grows too large, Redis can rewrite it automatically in the background. Redis keeps appending to the old file during the rewrite, the new file is smaller when the rewrite completes, and the commands received during the rewrite are appended to the new file.

The AOF file records every command that modified the data, in a format that is easy to understand and parse. Even if we accidentally erase all the data (for example with FLUSHALL), as long as the AOF file has not been rewritten in the meantime, we can recover the data by removing the last command from the file and restarting Redis.

AOF supports mixed persistence, which keeps the file size under control and speeds up data loading.
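The automatic rewrite mentioned above is governed by two settings, auto-aof-rewrite-min-size and auto-aof-rewrite-percentage. The following sketch shows the kind of check involved, assuming the commonly documented defaults of 64mb and 100. The function name `should_rewrite_aof` is made up for this example, and the logic is a simplification rather than a copy of the Redis source.

```python
MB = 1024 * 1024


def should_rewrite_aof(
    current_size: int,
    base_size: int,
    min_size: int = 64 * MB,   # assumed default of auto-aof-rewrite-min-size
    percentage: int = 100,     # assumed default of auto-aof-rewrite-percentage
) -> bool:
    """Decide whether a background rewrite is due (hypothetical helper).

    current_size: size of the AOF file now, in bytes.
    base_size: size of the AOF file right after the previous rewrite.
    """
    if percentage == 0:          # a percentage of 0 disables automatic rewrites
        return False
    if current_size <= min_size:  # small files are never rewritten automatically
        return False
    base = base_size if base_size > 0 else 1
    growth = current_size * 100 // base - 100  # growth since last rewrite, in percent
    return growth >= percentage


if __name__ == "__main__":
    print(should_rewrite_aof(130 * MB, 64 * MB))
    print(should_rewrite_aof(100 * MB, 64 * MB))
```

With these values, a file that was 64 MB after the last rewrite becomes eligible once it has roughly doubled in size.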

Shortcomings of AOF

For the same dataset, the AOF file is usually larger than the RDB file.

Depending on the fsync policy, AOF can be slightly slower than RDB. In general, performance with appendfsync everysec is still very high, and with appendfsync no it is comparable to RDB. Under heavy write load, however, RDB gives better guarantees on maximum latency.

AOF has been affected by some rare bugs that are almost impossible to encounter with RDB. Certain commands (such as BRPOPLPUSH) caused the reloaded data to differ from the data that was persisted. The Redis maintainers have tested under the same conditions but could not reproduce the problem.

Usage recommendations

Having covered how RDB and AOF work, how they are implemented, and their advantages and disadvantages, let's consider how to weigh the trade-offs and use the two persistence methods sensibly in practice. If you use Redis only as a cache, and all data can be rebuilt from the backing database, you can turn persistence off and focus on protective measures such as cache warm-up and guarding against cache penetration, cache breakdown, and cache avalanche.

More often, Redis takes on additional duties such as distributed locks, leaderboards, and service registries. In those cases persistence plays an important role in disaster recovery and data migration. Several principles are recommended:

Do not use Redis as the primary database; as far as possible, make sure all data can be rebuilt automatically by application services.

Use Redis 4.0 or later and enable the AOF+RDB mixed persistence feature.

Plan the maximum memory usage of Redis carefully so that AOF rewrites and RDB saves do not run short of resources.

Avoid deploying multiple instances on a single machine.

Most production environments are deployed as clusters. Persistence can be enabled on the slave (replica) nodes so that the master can concentrate on serving writes.

Backup files should be uploaded automatically to a remote data center or cloud storage for disaster recovery.

About fork()

As the analysis above shows, both RDB snapshots and AOF rewrites require a fork, which is a heavyweight operation that can block Redis. To keep the main Redis process responsive, we need to reduce this blocking as much as possible:

Reduce the frequency of fork. For example, trigger RDB snapshots and AOF rewrites manually instead of relying on automatic triggers.

Control the maximum memory usage of Redis to prevent fork from taking too long.

Use higher-performance hardware.

Configure the Linux memory allocation (overcommit) policy properly to avoid fork failures caused by insufficient physical memory.
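As one example of the last point, the Redis documentation recommends setting the Linux kernel parameter vm.overcommit_memory to 1 so that fork is not refused when the system looks short of memory. The sketch below reads the current value from /proc; the function names `overcommit_mode` and `describe` are invented for this example, and the path only exists on Linux.

```python
from pathlib import Path

# Standard Linux location of the overcommit setting.
OVERCOMMIT_PATH = Path("/proc/sys/vm/overcommit_memory")

MEANINGS = {
    0: "heuristic overcommit (kernel may refuse large forks)",
    1: "always overcommit (the setting Redis recommends)",
    2: "strict accounting (fork fails if commit limit is exceeded)",
}


def overcommit_mode(path: Path = OVERCOMMIT_PATH):
    """Return the overcommit mode as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def describe(mode) -> str:
    """Return a human-readable description of an overcommit mode."""
    if mode is None:
        return "unknown (not Linux, or the file is unreadable)"
    return MEANINGS.get(mode, "unrecognized value %r" % mode)


if __name__ == "__main__":
    mode = overcommit_mode()
    print("vm.overcommit_memory:", describe(mode))
```

Changing the value requires administrator privileges (for example through sysctl), so this sketch only reports the current state.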
