What is the persistence and master-slave replication mechanism of Redis?

2025-07-04 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains Redis persistence and the master-slave replication mechanism.

Redis persistence

Redis provides several different levels of persistence:

RDB persistence generates point-in-time snapshots of the dataset at specified time intervals.

AOF persistence records every write command the server executes and restores the dataset by re-executing those commands when the server starts. All commands in the AOF file are saved in the Redis protocol format, and new commands are appended to the end of the file. Redis can also rewrite the AOF file in the background so that it does not grow beyond the size actually needed to represent the dataset state.

Redis can also use AOF persistence and RDB persistence at the same time. In this case, when Redis restarts, it prefers the AOF file for restoring the dataset, because the dataset saved in the AOF file is usually more complete than the one saved in the RDB file.

You can even turn off persistence so that the data exists only while the server is running.

RDB (Redis DataBase)

RDB writes a snapshot of the in-memory dataset to disk at specified time intervals. On recovery, the snapshot file is read directly back into memory.

Redis forks a separate child process to do the persistence work. The child first writes the data to a temporary file and, once the write is complete, replaces the previous snapshot file with it. Throughout this process the main process performs no disk I/O for the snapshot, which keeps performance very high. If you need to restore large datasets and can tolerate losing the most recent writes, RDB is more efficient than AOF. The disadvantage of RDB is that any data written after the last snapshot may be lost.

Fork creates a copy of the current process. All data in the new process (variables, environment variables, program counter, and so on) matches the original, but it is a brand-new process that runs as a child of the original process.

Hidden risk: if the current process holds a large amount of data, the fork itself takes longer, and memory use can grow toward twice the dataset size in the worst case. Operating systems with copy-on-write share pages between parent and child and duplicate a page only when it is modified, so the real overhead depends on how much data is written during the snapshot. Either way, this can put heavy pressure on the server and reduce performance.

RDB saves the snapshot in the dump.rdb file.

In a test: if you execute FLUSHALL and then close the process with SHUTDOWN, Redis automatically reads dump.rdb the next time it starts, but the restored dataset is empty. The reason is that on shutdown Redis saves a new, empty dump.rdb that replaces the previous file, so the next startup loads the empty file.

RDB save operation

An RDB file is a compressed snapshot of the entire in-memory dataset. Snapshots are triggered when configured conditions are met. The classic defaults are: at least 10,000 changes within 1 minute, at least 10 changes within 5 minutes, or at least 1 change within 15 minutes.
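The way several save rules combine can be sketched in a few lines of Python. This is a simplified model, not Redis source code: the function name snapshot_due and the rule list are illustrative, and the rules shown are the classic defaults quoted above (save 900 1, save 300 10, save 60 10000).

```python
# Simplified model of "save <seconds> <changes>" rules.
# A snapshot is due as soon as ANY single rule is satisfied.
# snapshot_due is a hypothetical helper name, not a Redis API.

CLASSIC_SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]  # (seconds, changes)


def snapshot_due(seconds_since_last_save, changes_since_last_save,
                 rules=CLASSIC_SAVE_RULES):
    """Return True if at least one save rule is satisfied."""
    return any(
        seconds_since_last_save > seconds
        and changes_since_last_save >= changes
        for seconds, changes in rules
    )


print(snapshot_due(61, 10000))  # very busy minute
print(snapshot_due(120, 50))    # too few changes for the 1-minute rule
print(snapshot_due(301, 50))    # enough changes for the 5-minute rule
print(snapshot_due(1000, 0))    # nothing changed, nothing to save
```

With an empty rule list the function never reports a snapshot as due, which mirrors disabling the save policy.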

Disabling save: to disable the RDB persistence policy, either do not set any save directive or pass an empty string to save.

SAVE command: saves the dataset immediately.

How to trigger an RDB snapshot

SAVE: saves synchronously and does nothing else; all other requests are blocked until it finishes.

BGSAVE: Redis performs the snapshot in the background and can keep responding to client requests. The time of the last successful snapshot can be obtained with the LASTSAVE command.

Executing the FLUSHALL command also produces a dump.rdb file, but it is empty.

How to recover:

Move the backup file (dump.rdb) into the Redis working directory and start the service.

The CONFIG GET dir command returns that directory.

How to stop

To stop the RDB save rules dynamically: redis-cli config set save ""

AOF (Append Only File)

AOF records each write operation in the form of a log: every write command executed by Redis is recorded (read operations are not). The file is only appended to, never overwritten. Redis reads the file at startup to rebuild the data. In other words, when Redis restarts, it re-executes the write commands in the log file from first to last to recover the data.

In the APPEND ONLY MODE section of redis.conf:

Enable AOF: appendonly yes (the default is no)

Note:

In production, AOF file corruption occurs fairly often (caused by network transfer or other problems).

The server then fails to start even though the dump.rdb file is intact, which shows that the AOF file is loaded first at startup.

Solution: run redis-check-aof --fix <aof file>, which checks the file for content that does not conform to the AOF format and repairs it.

AOF strategy

appendfsync parameters:

always: synchronous persistence. Every data change is written to disk immediately. Performance is poor, but data integrity is good.

everysec: the factory default and the recommended setting. The fsync is asynchronous and runs once per second, so a crash can lose up to one second of data.

no: never fsync explicitly; leave flushing to the operating system. This is the fastest and least safe option.
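The difference between the three policies comes down to when fsync is called after a write. The toy class below is only a sketch under simplifying assumptions: AppendOnlyLog and its fields are hypothetical names, and the everysec branch checks the clock on each append, whereas the article notes that Redis runs this fsync in a background thread.

```python
import os
import time


class AppendOnlyLog:
    """Toy append-only log illustrating always / everysec / no."""

    def __init__(self, path, policy="everysec", clock=time.monotonic):
        if policy not in ("always", "everysec", "no"):
            raise ValueError("unknown policy: " + policy)
        self.policy = policy
        self.clock = clock          # injectable so the timing can be tested
        self.file = open(path, "ab")
        self.last_fsync = clock()
        self.fsync_calls = 0

    def _fsync(self):
        # Force the operating system to push buffered data to the disk.
        os.fsync(self.file.fileno())
        self.last_fsync = self.clock()
        self.fsync_calls += 1

    def append(self, record):
        self.file.write(record)
        self.file.flush()           # hand the bytes to the operating system
        if self.policy == "always":
            self._fsync()
        elif self.policy == "everysec":
            if self.clock() - self.last_fsync >= 1.0:
                self._fsync()
        # policy "no": the operating system decides when to flush to disk

    def close(self):
        self.file.close()
```

Counting fsync_calls for the same sequence of appends under each policy makes the trade-off visible: always pays one fsync per write, everysec at most about one per second, and no pays none.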

Rewrite

Concept: because AOF works by appending, the file keeps growing. To avoid this, Redis provides a rewrite mechanism: when the AOF file size exceeds the configured threshold, Redis automatically compacts the file, keeping only the minimal set of commands needed to recover the data. You can also trigger a rewrite manually with the BGREWRITEAOF command.

Rewriting principle: when the AOF file has grown too large, Redis forks a new process to rewrite it (writing a temporary file first and then renaming it). The new process traverses the data in its memory and emits a set statement for each record. The rewrite does not read the old AOF file; instead, it writes a new AOF file from the entire in-memory contents of the database, which is similar to a snapshot.

Trigger mechanism: Redis records the size of the AOF file after the last rewrite. With the default configuration, a rewrite is triggered when the AOF file has grown to twice that size and is larger than 64 MB (3G).
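The replay and rewrite ideas can be illustrated with a tiny in-memory model. This is a sketch that assumes only string keys with SET and DEL commands; the functions replay and rewrite are hypothetical names and do not correspond to Redis internals.

```python
def replay(commands):
    """Rebuild the dataset by re-executing logged write commands in order."""
    data = {}
    for name, *args in commands:
        if name == "SET":
            key, value = args
            data[key] = value
        elif name == "DEL":
            for key in args:
                data.pop(key, None)
        else:
            raise ValueError("unsupported command: " + name)
    return data


def rewrite(data):
    """Build a minimal log from the in-memory data, ignoring the old log."""
    return [("SET", key, value) for key, value in data.items()]


old_log = [
    ("SET", "counter", "1"),
    ("SET", "counter", "2"),
    ("SET", "temp", "x"),
    ("DEL", "temp"),
    ("SET", "counter", "3"),
]

state = replay(old_log)      # what the server holds in memory
new_log = rewrite(state)     # one command per surviving key

print(len(old_log), "commands before rewrite,", len(new_log), "after")
```

Replaying new_log yields the same dataset as replaying old_log, which is the property that lets the shorter file replace the longer one.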

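no-appendfsync-on-rewrite no: whether to skip fsync calls while a rewrite is in progress. The default, no, keeps fsync active during the rewrite and preserves data safety.

auto-aof-rewrite-percentage: the growth percentage, relative to the size after the last rewrite, that triggers a rewrite (default 100).

auto-aof-rewrite-min-size: the minimum AOF file size required before a rewrite is triggered (default 64mb).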
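How the two auto-aof-rewrite settings interact can be sketched as a small predicate. This is a simplified model of the trigger described above, not the Redis implementation; rewrite_due and its parameter names are illustrative.

```python
MB = 1024 * 1024


def rewrite_due(current_size, base_size, percentage=100, min_size=64 * MB):
    """Decide whether an automatic AOF rewrite should start.

    current_size: size of the AOF file now, in bytes
    base_size:    size of the AOF file after the last rewrite, in bytes
    percentage:   models auto-aof-rewrite-percentage (0 disables the check)
    min_size:     models auto-aof-rewrite-min-size
    """
    if percentage == 0 or current_size <= min_size:
        return False
    base = base_size if base_size > 0 else 1   # avoid division by zero
    growth = current_size * 100 // base - 100  # growth in percent
    return growth >= percentage


print(rewrite_due(128 * MB, 64 * MB))  # doubled and above 64 MB
print(rewrite_due(100 * MB, 64 * MB))  # grew by only about 56 percent
print(rewrite_due(60 * MB, 10 * MB))   # grew a lot but still under 64 MB
```

Both conditions must hold at once: the minimum size keeps small files from being rewritten repeatedly, and the percentage keeps large files from being rewritten after every minor increase.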

Advantages of AOF

Using AOF persistence makes Redis much more durable: you can choose between fsync policies such as no fsync, fsync every second, or fsync on every write command. The default policy is to fsync once per second. Under this configuration Redis still performs well, and a failure loses at most one second of data (fsync runs in a background thread, so the main thread can keep serving command requests).

The AOF file is an append-only log, so writing to it requires no seek. Even if the log ends with a partially written command for some reason (a full disk, an interrupted write, and so on), the redis-check-aof tool can easily fix the problem.

Redis can automatically rewrite AOF in the background when the AOF file becomes too large: the new AOF file after rewriting contains the minimum set of commands needed to recover the current dataset. The entire rewrite operation is perfectly safe because Redis continues to append commands to existing AOF files during the creation of new AOF files, and existing AOF files will not be lost even if downtime occurs during rewriting. Once the new AOF file is created, Redis switches from the old AOF file to the new AOF file and starts appending the new AOF file.

The AOF file saves all writes to the database in an orderly manner, which are saved in the format of the Redis protocol, so the contents of the AOF file are easy to read and parse. Exporting (export) the AOF file is also very simple: for example, if you accidentally execute the FLUSHALL command, but as long as the AOF file is not rewritten, simply stop the server, remove the FLUSHALL command at the end of the AOF file, and restart Redis, and you can restore the dataset to the state it was before the FLUSHALL execution.
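To make the "Redis protocol format" concrete, the sketch below encodes commands as arrays of bulk strings and removes a trailing FLUSHALL from such a buffer. It assumes the AOF content is plain protocol text with no RDB preamble and no multi-file layout, which may not hold for every Redis version; encode_command, decode_commands and drop_trailing_flushall are hypothetical helper names.

```python
def encode_command(*parts):
    """Encode one command as a protocol array of bulk strings."""
    out = [b"*%d\r\n" % len(parts)]
    for part in parts:
        data = part.encode() if isinstance(part, str) else part
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)


def decode_commands(buf):
    """Parse a buffer of encoded commands into lists of byte strings."""
    commands, pos = [], 0
    while pos < len(buf):
        if buf[pos:pos + 1] != b"*":
            raise ValueError("expected array marker at offset %d" % pos)
        end = buf.index(b"\r\n", pos)
        count = int(buf[pos + 1:end])
        pos = end + 2
        parts = []
        for _ in range(count):
            if buf[pos:pos + 1] != b"$":
                raise ValueError("expected bulk string at offset %d" % pos)
            end = buf.index(b"\r\n", pos)
            length = int(buf[pos + 1:end])
            start = end + 2
            parts.append(buf[start:start + length])
            pos = start + length + 2   # skip the payload and its CRLF
        commands.append(parts)
    return commands


def drop_trailing_flushall(buf):
    """Return the buffer without a final FLUSHALL command, if present."""
    commands = decode_commands(buf)
    if commands and commands[-1] and commands[-1][0].upper() == b"FLUSHALL":
        commands.pop()
    return b"".join(encode_command(*cmd) for cmd in commands)


aof = encode_command("SET", "key", "value") + encode_command("FLUSHALL")
print(aof)
print(drop_trailing_flushall(aof))
```

On a real server this kind of edit should only be attempted on a copy of the file while the server is stopped, as the paragraph above describes.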

Shortcomings of AOF

For the same dataset, the volume of the AOF file is usually larger than that of the RDB file.

Depending on the fsync strategy used, AOF may be slower than RDB. In general, the performance of fsync per second is still very high, and turning off fsync can make AOF as fast as RDB, even under heavy loads. However, RDB can provide a more guaranteed maximum latency (latency) when dealing with large write loads.

AOF has had bugs of this kind in the past: because of specific commands, reloading the AOF file did not restore the dataset to exactly the state it was in when saved. (For example, the blocking command BRPOPLPUSH once caused such a bug.)

Tests have been added to the test suite for this situation: they automatically generate random, complex datasets and reload them to make sure everything is correct. Although such bugs are uncommon in AOF files, they are almost impossible with RDB by comparison.

Backing up Redis data

Be sure to back up your database!

Disk failures, node failures, and so on can make your data disappear, and it is very dangerous not to back up.

Redis is very friendly to data backup because you can copy the RDB file while the server is running: once an RDB file has been created, it is never modified. When the server needs to create a new RDB file, it first writes the contents to a temporary file. When the temporary file is complete, the program uses rename(2) to atomically replace the original RDB file with the temporary file.

That is to say, it is absolutely safe to copy RDB files at any time.
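The write-to-temporary-file-then-rename pattern described above can be sketched in Python. This is an illustration of the general technique, not Redis code: atomic_write is a hypothetical helper, and the atomicity of os.replace relies on the temporary file living on the same filesystem as the target.

```python
import os
import tempfile


def atomic_write(path, payload):
    """Write payload to a temp file next to path, then rename it over path.

    Readers that open path see either the complete old file or the
    complete new file, never a half-written one.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-",
                                    suffix=".rdb")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(payload)
            tmp.flush()
            os.fsync(tmp.fileno())   # make sure the bytes reach the disk
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Because the target file is swapped in a single step, a backup job that copies it at any moment gets a consistent snapshot.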

Recommendations:

Create a regular task (cron job) that backs up one RDB file to one folder every hour and one RDB file to another folder every day.

Make sure that snapshots are backed up with the appropriate date and time information, and use the find command to delete expired snapshots each time you execute a regular task script: for example, you can keep hourly snapshots for the last 48 hours and daily snapshots for the last month or two.

At least once a day, transfer an RDB backup outside your data center, or at least off the physical machine that runs the Redis server.
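The recommendations above could be scripted roughly as follows, with the script run from a cron job. This is a minimal sketch: the file naming scheme, the function names backup_snapshot and prune_snapshots, and the use of Python instead of the find command are all assumptions made for illustration.

```python
import calendar
import os
import shutil
import time

STAMP_FORMAT = "%Y%m%d-%H%M%S"  # assumed naming scheme: dump-<UTC stamp>.rdb


def backup_snapshot(rdb_path, backup_dir, now=None):
    """Copy the RDB file into backup_dir under a timestamped name."""
    now = time.time() if now is None else now
    os.makedirs(backup_dir, exist_ok=True)
    stamp = time.strftime(STAMP_FORMAT, time.gmtime(now))
    target = os.path.join(backup_dir, "dump-" + stamp + ".rdb")
    shutil.copy2(rdb_path, target)
    return target


def prune_snapshots(backup_dir, max_age_seconds, now=None):
    """Delete snapshots whose name timestamp is older than max_age_seconds."""
    now = time.time() if now is None else now
    removed = []
    for name in sorted(os.listdir(backup_dir)):
        if not (name.startswith("dump-") and name.endswith(".rdb")):
            continue
        try:
            taken = calendar.timegm(time.strptime(name[5:-4], STAMP_FORMAT))
        except ValueError:
            continue  # not one of our snapshot files, leave it alone
        if now - taken > max_age_seconds:
            os.remove(os.path.join(backup_dir, name))
            removed.append(name)
    return removed
```

An hourly job might call backup_snapshot on the hourly folder and then prune_snapshots with 48 * 3600, while a daily job does the same on a second folder with a retention of one or two months.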

Disaster recovery backup

The disaster recovery backup of Redis is basically to back up the data and transfer these backups to several different external data centers.

Disaster recovery backups can keep data secure in the event of serious problems in the primary data center where Redis is running and snapshots are taken.

Some Redis users are entrepreneurs who don't have a lot of money to waste, so here are some practical and cheap disaster recovery backup methods:

Amazon S3, and other services like it, are a good place to build a disaster recovery backup system. The easiest way is to encrypt your hourly or daily RDB backups and send them to S3. The data can be encrypted with the gpg -c command (symmetric encryption mode). Remember to keep your password in several different, secure places (for example, give copies to the most important people in your organization). Storing the data files with multiple storage services at the same time improves the safety of the data.

Snapshots can be sent with SCP (a component of SSH). Here is a simple and secure way to transfer them: get a VPS (virtual private server) very far from your data center, install SSH, create a password-less SSH client key, and add this key to the VPS's authorized_keys file, so that you can send snapshot backup files to the VPS. For the best data safety, use at least one VPS from each of two different providers for disaster recovery backups.

It should be noted that this kind of disaster recovery system is easy to fail if it is not handled carefully.

At a minimum, you should check that the size of the transferred backup file is the same as that of the original snapshot file after the file transfer is complete. If you are using VPS, you can also verify that the file is fully transferred by comparing the SHA1 checksum of the file.

In addition, you need a separate alarm system to notify you if the transfer responsible for sending backup files fails.
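The size and SHA1 checks can be sketched with the standard library. In this illustration both files are assumed to be readable locally; with a real VPS the checksum of the copy would be computed on the remote host and the two hex digests compared. The names sha1_of and transfer_ok are hypothetical.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1024 * 1024):
    """Return the SHA1 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_ok(original, copy):
    """Check that the copy has the same size and SHA1 as the original."""
    if os.path.getsize(original) != os.path.getsize(copy):
        return False   # cheap check first: sizes differ
    return sha1_of(original) == sha1_of(copy)
```

A backup script could raise an alert whenever transfer_ok returns False, which covers the separate alarm requirement mentioned above.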

Redis master-slave replication

Redis supports a simple, easy-to-use master-slave replication feature, which makes a slave server an exact replica of its master server.

Here are several important aspects of the Redis replication feature:

Redis uses asynchronous replication. Starting with Redis 2.8, the slave server reports its progress in processing the replication stream to the master server once per second.

A master server can have multiple slave servers.

Slave servers can have their own slave servers too, so multiple slave servers can form a graph structure.

Replication does not block the master server: the master server can continue to process command requests even if one or more slave servers are performing their initial synchronization.

Replication does not block the slave server either: as long as the appropriate settings are made in the redis.conf file, the slave server can use the old version of the dataset to answer queries even while it is performing its initial synchronization.

However, connection requests are blocked during the period in which the slave server deletes the old version of the dataset and loads the new version.

You can also configure the slave server to return an error to clients when its connection to the master server is down.

Replication can be used simply for data redundancy, or it can improve scalability by having multiple slave servers handle read-only command requests: for example, heavy SORT commands can be handed over to slave nodes.

The replication function can be used to save the master server from the persistence operation: just turn off the persistence function of the master server, and then let the slave server perform the persistence operation.

Data safety of replication when master server persistence is turned off

When configuring Redis replication, it is strongly recommended that you turn on persistence on the master server. If that is not possible, for example because of latency concerns, the instance should be configured so that it does not restart automatically.

Case study:

Assume that node A is the master server with persistence turned off, and that node B and node C replicate data from node A.

Node A crashes, and the auto-restart service then restarts it. Because persistence on node A is turned off, it has no data after the restart.

Node B and node C replicate from node A again, but A's dataset is empty, so they delete their own copies of the data.

Even when Sentinel is used to provide high availability for Redis, it is very dangerous to turn off persistence on the master server while automatic restart is enabled, because the master server may restart so quickly that Sentinel does not detect the restart within the configured heartbeat interval, and the data loss described above then takes place.

Data safety matters at all times, so the master server should be prevented from restarting automatically whenever its persistence is turned off.

Configuring a slave server

Configuring a slave server is very simple: just add the following line to the configuration file:

slaveof 192.168.1.1 6379

Another way is to call the SLAVEOF command with the IP and port of the master server, after which synchronization begins:

127.0.0.1:6379> SLAVEOF 192.168.1.1 10086

Read-only slave server

Starting with Redis 2.6, slave servers support a read-only mode, and this mode is the default for slave servers.

Read-only mode is controlled by the slave-read-only option in the redis.conf file, and can also be turned on or off with the CONFIG SET command.

The read-only slave server refuses to execute any write commands, so there is no case of accidentally writing data to the slave server because of an operational error.

In addition, executing the command SLAVEOF NO ONE on a slave server makes it turn off replication and switch from being a slave server back to being a master server, without discarding the dataset it has already synchronized.

Because SLAVEOF NO ONE does not discard the synchronized dataset, a slave server can take over as the new master server when the master server fails, which allows operation to continue without interruption.

Slave server configuration:

If the master server sets the password through the requirepass option, then in order for the synchronization of the slave server to proceed smoothly, we must also make the appropriate authentication settings for the slave server.

For a running server, you can use the client to enter the following command:

config set masterauth <password>

To set this password permanently, add it to the configuration file:

masterauth <password>

The master server performs a write operation only if there are at least N slave servers.

Starting with Redis 2.8, in order to ensure the security of the data, the master server can be configured to execute write commands only if there are at least N currently connected slave servers.

However, because Redis uses asynchronous replication, write data sent by the master server is not guaranteed to be received by the slave server, so the possibility of data loss still exists.

Here is how this feature works:

The slave server pings the master server once per second and reports its progress in processing the replication stream.

The master server records the last time each slave server sent PING to it.

Through configuration, the user can specify the maximum value of the network delay, min-slaves-max-lag, and the minimum number of slave servers required to perform the write operation min-slaves-to-write.

If there are at least min-slaves-to-write slave servers, and all of them have latency values less than min-slaves-max-lag seconds, then the master server will perform the write operation requested by the client.

On the other hand, if the conditions specified by min-slaves-to-write and min-slaves-max-lag are not met, the write operation is not performed and the master server returns an error to the client that requested it.
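The decision described in the steps above can be modeled as a small function. This sketch is not Redis source code: write_allowed and good_slaves are hypothetical names, the lag of each slave is taken to be the number of seconds since its last PING, and the choice to count a lag equal to the limit as acceptable is an assumption.

```python
def good_slaves(lags_seconds, max_lag):
    """Count slaves whose last PING is recent enough.

    Assumption: a lag exactly equal to max_lag still counts as good.
    """
    return sum(1 for lag in lags_seconds if lag <= max_lag)


def write_allowed(lags_seconds, min_slaves_to_write, min_slaves_max_lag):
    """Return True if the master should accept a write command."""
    if min_slaves_to_write == 0:
        return True   # the feature is treated as disabled in this model
    return good_slaves(lags_seconds, min_slaves_max_lag) >= min_slaves_to_write


# Three connected slaves; one has not pinged for 30 seconds.
print(write_allowed([0, 1, 30], min_slaves_to_write=2, min_slaves_max_lag=10))
# Only one slave is within the lag limit, so the write is refused.
print(write_allowed([0, 30, 40], min_slaves_to_write=2, min_slaves_max_lag=10))
```

Passing this check only shows that enough slaves were recently reachable; as noted above, it does not prove that a given write was actually received by them.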

Here are two options for this feature and the parameters they require:

min-slaves-to-write <number of slaves>

min-slaves-max-lag <number of seconds>
