
What is the principle of Redis master-slave replication?

2025-01-28 Update From: SLTechnology News&Howtos


What is the principle of Redis master-slave replication? You may run into this question often in daily study or work. The following reference material walks through it; hopefully you will gain a lot from it.

I. What is Redis master-slave replication?

Master-slave replication means taking two redis servers and synchronizing the data of one redis database to the other. The former is called the master node (master) and the latter the slave node (slave). Data can only be synchronized one way, from master to slave.

In practice, however, master-slave replication rarely involves only two redis servers, which means that any redis server may also act as a master node (master).

For example, a node slave3 can be both a slave node of master and the master node of its own slave.

Keep this concept in mind first; more details follow below.

II. Why do you need Redis master-slave replication?

Suppose we have a single, stand-alone redis server.

The first problem with this setup is server downtime, which directly leads to data loss. If the project handles business-critical data, the consequences are easy to imagine.

The second problem is memory: with only one server, memory will eventually hit its ceiling, and a single server cannot be upgraded indefinitely.

To address these two problems, we prepare a few more servers and configure master-slave replication: the data is saved on multiple servers and kept synchronized across them. Even if one server goes down, users are unaffected; redis stays highly available while also providing redundant backup of the data.

You probably have plenty of questions: how do master and slave connect? How is data synchronized? What if the master server goes down? Don't worry, we will solve them one by one.

III. The role of Redis master-slave replication

Above we covered why redis master-slave replication is used; its role follows directly from those reasons.

The first point is data redundancy: replication achieves a hot backup of the data, which is another form of persistence.

The second point is failure recovery: when the master node (master) has a problem, a slave node (slave) can provide service instead, enabling rapid recovery from failures; this is service redundancy.

The third point is read-write separation: the master server mainly handles writes and the slaves mainly serve reads, which improves the load capacity of the servers. The number of slave nodes can also be adjusted as demand changes.

The fourth point is load balancing: combined with read-write separation, the master node provides the write service and the slave nodes provide the read service, sharing the server load. Especially in write-little-read-much scenarios, spreading reads across multiple slave nodes greatly improves the concurrency and load capacity of the redis servers.

The fifth point is the cornerstone of high availability: master-slave replication is the foundation on which sentinels and clusters are built, so we can say that master-slave replication is the cornerstone of high availability.

IV. Configure Redis master-slave replication

Having said all that, let's configure a simple master-slave replication case, and then talk about how it is implemented.

The redis installation path is: /usr/local/redis

Logs and configuration files are stored in: /usr/local/redis/data

First, prepare two configuration files, redis6379.conf and redis6380.conf, mainly changing the port; for convenience, the log file and persistence file names are identified by their respective ports. Then start two redis services, one on port 6379 and the other on port 6380. Execute the command redis-server redis6380.conf, then connect with redis-cli -p 6380. Because the default port for redis is 6379, the other server can be started directly with redis-server redis6379.conf and connected to with plain redis-cli. At this point we have successfully configured two redis services, one on 6380 and one on 6379. This is only for demonstration; in practice they would be configured on two different servers.
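For reference, the per-instance changes described above might look like this in redis6380.conf; the file names below are illustrative, following this example's port-based naming rather than any required layout:

```conf
# redis6380.conf - only the lines that differ per instance
port 6380
logfile "redis-6380.log"
dbfilename "dump-6380.rdb"
dir /usr/local/redis/data
```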

1. Starting via the client command line

First, keep one concept in mind: when configuring master-slave replication, all operations are performed on the slave node, that is, slave.

So on the slave node we execute the command slaveof 127.0.0.1 6379, which establishes the connection. Let's test whether master-slave replication works: execute set kaka 123 and set master 127.0.0.1 on the master server, then fetch both keys successfully from the slave on port 6380; this means our master-slave replication is working. This is not where a production environment ends, though; replication will be optimized further until high availability is achieved.

2. Starting via the configuration file

Before starting master-slave replication from the configuration file, first tear down the client-command-line setup: execute slaveof no one on the slave node to break the replication link. Where can you confirm that the slave node has been disconnected from the master node? Run the command info on the master node's client to check.

While the slave node is connected, info on the master node prints a slave0 entry; after the slave node executes slaveof no one, that entry disappears from the master's info output, indicating that the slave node has been disconnected from the master node. Then start the redis service from the configuration file: redis-server redis6380.conf.
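For the configuration-file approach, the slave's conf file needs one extra directive; the address and port here simply match this demo's setup:

```conf
# in redis6380.conf: replicate from the master at 127.0.0.1:6379
slaveof 127.0.0.1 6379
```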

Once the slave node restarts, its connection information can be viewed directly on the master node. Test it: whatever the master node writes, the slave node synchronizes automatically.

3. Starting when the redis server starts

This approach is also very simple: start master-slave replication directly when launching the redis server by executing the command redis-server --slaveof host port.

4. Viewing the log information after master-slave replication starts

The master node's log records the replication, and the slave node's log contains the connection to the master node and the RDB snapshot transfer.

V. The working principle of master-slave replication

1. The three stages of master-slave replication

The complete workflow of master-slave replication divides into the following three stages, each with its own internal workflow; we will walk through all three.

Connection establishment: the process by which slave and master establish a connection

Data synchronization: the process of synchronizing data from master to slave

Command propagation: the process of continuously synchronizing subsequent writes

2. The first phase: the connection establishment process

The complete master-slave connection workflow can be described in a few short steps:

Set the address and port of master and save the master information

Establish a socket connection (more on what this connection does later)

Periodically send the ping command

Authenticate

Send the slave's port information

During connection establishment, the slave node saves the master's address and port, and the master node master saves the port of the slave node slave.

3. The second phase: the data synchronization process

This phase covers in detail the data synchronization that happens when a slave node connects to the master node for the first time.

When the slave node connects to the master node for the first time, a full copy is performed first; this full copy is unavoidable.

After the full copy completes, the master node sends the data in the replication backlog buffer, and the slave node performs bgrewriteaof to restore the data; this is partial replication.

Three new terms appear at this stage: full replication, partial replication, and the replication backlog buffer. All three are described in detail in the frequently asked questions below.

4. The third phase: command propagation

When the master database is modified and the data of the master and slave servers becomes inconsistent, the data is synchronized back to a consistent state; this process is called command propagation.

Master sends the write commands it receives to slave; slave executes them upon receipt, keeping the master and slave data consistent.

"Partial replication in the command propagation phase"

During the command propagation phase the network may disconnect, or network jitter may cause the connection to drop (connection lost).

At this point the master node master continues writing data into the replication backlog buffer.

The slave node keeps trying to reconnect to the host (connect to master).

Once reconnected, the slave node sends its runid and replication offset to the master node and executes the psync command to synchronize.

If master determines that the offset is within the range of the backlog buffer, it returns the continue response and sends the backlog data from that offset onward to the slave node.

The slave node receives the data and performs bgrewriteaof to restore it.

VI. The principle of master-slave replication in detail (full replication + partial replication)

This is the most complete walkthrough of the master-slave replication process, so let's briefly introduce each step.

1. The slave node sends the instruction psync runid offset, asking the master with the given runid for data from the given offset. But note: on the first connection the slave node does not know the master's runid and offset at all, so the first instruction sent is psync ? -1, meaning "I want all of the master node's data."

2. The master node starts executing bgsave to generate an RDB file, recording the current replication offset (offset).

3. The master node sends its own runid and offset to the slave node through the socket via the +FULLRESYNC runid offset response, then sends the RDB file.

4. The slave node receives +FULLRESYNC, saves the master's runid and offset, empties all of its current data, receives the RDB file through the socket, and begins restoring the RDB data.

5. After full replication, the slave node has the master's runid and offset and begins sending the instruction psync runid offset.

6. The master node receives the instruction and checks whether the runid matches and whether the offset is still in the replication backlog buffer. If either check fails, another full copy is performed starting from step 2. A runid mismatch here can only be caused by a restart of the node, and an offset mismatch means the replication backlog buffer has overflowed.

7. If the runid and offset checks pass and the slave's offset is the same as the master's offset, the request is ignored (there is nothing to send).

8. If the runid and offset checks pass and the slave's offset differs from the master's, the master sends +CONTINUE offset (the master node's offset), and sends the data in the replication backlog buffer between the slave's offset and the master's offset through the socket. The slave node receives +CONTINUE, saves the master's offset, and executes bgrewriteaof to restore the data.
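The runid/offset check the master performs in steps 5-8 can be sketched in a few lines of Python; the class and method names here are illustrative, not Redis internals:

```python
# Hedged sketch of the master's decision on receiving "psync runid offset".
class Master:
    def __init__(self, runid, backlog_start, backlog_end):
        self.runid = runid
        # Range of offsets still held in the replication backlog buffer.
        self.backlog_start = backlog_start
        self.backlog_end = backlog_end

    def handle_psync(self, slave_runid, slave_offset):
        # Unknown or stale runid (e.g. a restart happened): full resync.
        if slave_runid != self.runid:
            return "+FULLRESYNC"
        # Offset no longer held in the backlog (buffer overflowed): full resync.
        if not (self.backlog_start <= slave_offset <= self.backlog_end):
            return "+FULLRESYNC"
        # Otherwise only the missing range is sent: partial replication.
        return "+CONTINUE"

m = Master(runid="abc", backlog_start=100, backlog_end=200)
print(m.handle_psync("xyz", 150))  # +FULLRESYNC (runid mismatch)
print(m.handle_psync("abc", 50))   # +FULLRESYNC (offset out of backlog)
print(m.handle_psync("abc", 150))  # +CONTINUE   (partial replication)
```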

"1-4 is full replication. 5-8 is partial replication."

Note that from step 3 onward, the master node keeps receiving data from clients during master-slave replication, so the master's offset keeps changing. Whenever it changes, the change is sent to each slave; this ongoing sending process is tied to the heartbeat mechanism.

VII. The heartbeat mechanism

In the command propagation phase, the master node and the slave node always need to exchange information, and the heartbeat mechanism is used to maintain the connection between the master node and the slave node.

Master heartbeat

Instruction: ping, executed every 10 seconds by default and controlled by the parameter repl-ping-slave-period. Its main job is to determine whether the slave node is online. On the slave node you can use info replication to check the interval since the last contact; a lag of 0 or 1 is normal.

Slave heartbeat task

Instruction: replconf ack {offset}, executed once per second. Its main jobs are to send the slave's own replication offset to the master node, to fetch the latest data-change commands from the master node, and to determine whether the master node is online.

"Points to note in the heartbeat phase": to keep data stable, the master node refuses all data synchronization when too few slaves remain or their delay is too high.

Here are two parameters that can be configured:

min-slaves-to-write 2

min-slaves-max-lag 8

These two parameters mean that when fewer than 2 healthy slave nodes remain, or a slave node's delay exceeds 8 seconds, the master node forcibly shuts off the master function and stops data synchronization.

So how does the master node know the number of slaves and their delay? Through the heartbeat mechanism: slave sends the replconf ack instruction every second, which carries the offset, and from it the master derives each slave node's delay and counts the slave nodes.
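The min-slaves-to-write / min-slaves-max-lag check described above can be sketched as follows (function name and structure are illustrative, not Redis internals):

```python
# Hedged sketch: master refuses writes when too few slaves report a
# recent-enough heartbeat. Values mirror the example config above.
MIN_SLAVES_TO_WRITE = 2
MIN_SLAVES_MAX_LAG = 8  # seconds

def master_accepts_writes(slave_lags):
    """slave_lags: seconds since each slave's last replconf ack."""
    healthy = [lag for lag in slave_lags if lag <= MIN_SLAVES_MAX_LAG]
    return len(healthy) >= MIN_SLAVES_TO_WRITE

print(master_accepts_writes([0, 1, 2]))  # True: three healthy slaves
print(master_accepts_writes([0, 30]))    # False: only one within max lag
print(master_accepts_writes([]))         # False: no slaves at all
```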

VIII. The three core elements of partial replication

1. The server's run id (runid)

First, what is this run id? You can see it by executing the info command; it also appears in the startup log information mentioned earlier.

Redis automatically generates a random id at startup (note: the id differs on every startup), made up of 40 random hexadecimal characters, which uniquely identifies a redis node.

When master-slave replication starts for the first time, master sends its own runid to slave, and slave saves master's id; we can see it with the info command.

On disconnect and reconnect, slave sends that id to master. If the runid saved by slave matches master's current runid, master attempts partial replication (the other factor in whether this succeeds is the offset). If the runid saved by slave differs from master's current runid, a full copy is performed directly.

2. The replication backlog buffer

The replication backlog buffer is a first-in-first-out queue that stores the write-command records collected by master. Its default size is 1MB.

You can modify repl-backlog-size 1mb in the configuration file to control the buffer size; the value should be adjusted to your server's memory. On Kaka's side, about 30% is reserved for the buffer.

"What exactly is stored in the replication backlog buffer?"

When executing a command such as set name kaka, we can look at the persistence file and see that the replication backlog buffer stores the AOF-persisted command data, split into bytes, and each byte has its own offset; this offset is the replication offset (offset).

"Then why is it said that the replication backlog buffer may lead to full replication?"

During the command propagation phase, the master node stores the collected data in the replication backlog buffer and then sends it to the slave node. This is where the problem arises: when the master node's write volume spikes beyond the buffer's memory, some of the data gets squeezed out, making the master's and slave's data inconsistent and forcing a full copy. If the buffer size is set improperly, this can even become an endless loop: the slave node copies everything, empties its data, and copies everything again.
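A minimal sketch of the backlog as a fixed-size FIFO shows how overflow forces full replication; the class is illustrative, not Redis's actual implementation:

```python
# Hedged sketch: old bytes are squeezed out as new writes arrive, and a
# slave whose offset predates the oldest retained byte cannot partially sync.
from collections import deque

class Backlog:
    def __init__(self, size):
        self.buf = deque(maxlen=size)  # oldest entries drop off automatically
        self.end = 0                   # offset of the newest byte written

    def write(self, data: bytes):
        for b in data:
            self.buf.append(b)
            self.end += 1

    @property
    def start(self):
        # Oldest offset still retained in the buffer.
        return self.end - len(self.buf)

    def can_partial_sync(self, slave_offset):
        return self.start <= slave_offset <= self.end

bl = Backlog(size=10)
bl.write(b"0123456789")        # fills the buffer
print(bl.can_partial_sync(0))  # True
bl.write(b"abcde")             # overflow: offsets 0..4 squeezed out
print(bl.can_partial_sync(0))  # False -> full replication needed
print(bl.can_partial_sync(7))  # True  -> partial replication still possible
```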

3. The replication offset (offset)

The master node records a replication offset each time it sends data, and the slave node records one each time it receives data.

The offset is used to synchronize information, compare the differences between the master and slave nodes, and restore data after slave disconnects.

This value comes from the offset in the replication backlog buffer.

IX. Common problems of master-slave replication

1. The master node restart problem (internal optimization)

When the master node restarts, the value of runid changes, which causes all slave nodes to perform a full replication.

We don't need to solve this problem ourselves; we just need to know how the system optimizes for it.

After master-slave replication is established, the master node creates a master-replid variable with the same content as runid (41 characters long, versus 40 for runid) and sends it to the slave node.

When the master node executes the shutdown save command, an RDB persistence saves the runid and offset into the RDB file. You can view this information using the command redis-check-rdb.

After the master node restarts, it loads the RDB file, bringing the repl-id and repl-offset from the file into memory, so that all slave nodes still regard it as the previous master node.

2. Slave node offset out of bounds after a network interruption, causing full replication

When the slave node's network is interrupted due to a poor network environment, and the replication backlog buffer is too small, data overflows, the slave node's offset goes out of bounds, and a full replication results. This may even lead to repeated full copies.

Solution: increase the size of the replication backlog buffer: repl-backlog-size

Sizing suggestion: measure the average time it takes the slave node to reconnect to the master node, and the average volume of command data the master node generates per second (write_size_per_second).

Replication backlog buffer size = 2 * master-slave reconnection time * data generated by the master node per second
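Plugging made-up sample numbers into the formula above, just to show the arithmetic:

```python
# Hypothetical measurements for illustration only.
reconnect_seconds = 5                 # average master-slave reconnect time
write_bytes_per_second = 1024 * 1024  # master write volume: 1 MB/s

# backlog >= 2 * reconnect time * write rate
repl_backlog_size = 2 * reconnect_seconds * write_bytes_per_second
print(repl_backlog_size // (1024 * 1024), "MB")  # 10 MB
```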

3. Frequent network interruptions

This happens when the master node's cpu usage is too high, or slave nodes connect and disconnect frequently. The result is that the master node's resources are heavily occupied, including but not limited to buffers, bandwidth, and connections.

Why do the master node's resources get so heavily occupied?

In the heartbeat mechanism, each slave node sends the replconf ack instruction to the master node every second. If a slave node runs a slow query, it consumes a lot of cpu; meanwhile the master node calls the replication timing function replicationCron every second, and the slave node fails to respond for a long time.

Solution:

Set a timeout to release the slave node

Setting parameter: repl-timeout

This parameter defaults to 60 seconds; a slave unresponsive for more than 60 seconds is released.

4. Data inconsistency problem

Due to network factors, the data on multiple slave nodes may be inconsistent; this factor cannot be avoided.

Here are two solutions to this problem:

First: if the data must be strongly consistent, configure a single redis server that handles both reads and writes. This only suits scenarios with a small amount of data where the data needs to be highly consistent.

Second: monitor the offsets of the master and slave nodes, and temporarily block client access to a slave node whose delay is too large. The relevant parameter is slave-serve-stale-data yes|no; once this parameter is set to no, the slave responds only to a few commands such as info and slaveof.

Thank you for reading! Having read the above, do you now have a general understanding of the principle of Redis master-slave replication? I hope the content of this article is helpful to you.
