Example Analysis of Master-Slave replication principle in redis

2025-11-20 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains in detail, with examples, how master-slave replication works in Redis.

1. Replication process

Execute the slaveof command on the slave node.

The slave node only saves the master node's information from the slaveof command; it does not start replication immediately.

A scheduled task inside the slave node finds the saved master node information and connects to the master node over a socket.

After the connection is established, the slave node sends a ping command and expects a pong reply; otherwise it reconnects.

If the master node has a password set, authentication is required; if authentication fails, replication is terminated.

After authentication passes, data synchronization is carried out. This is the longest step: the master node sends all of its data to the slave node.

Once the master node has synchronized its current data to the slave node, the replication setup is complete. From then on, the master node continuously sends write commands to the slave node to keep the master and slave data consistent.
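
The steps above can be sketched as a small in-memory simulation. This is only an illustration of the ordering of the steps, not how Redis implements them: FakeMaster, establish_replication, and the log messages are hypothetical names invented for this example, and no network connection is made.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeMaster:
    """Hypothetical in-memory stand-in for a master node (not a Redis API)."""
    password: Optional[str] = None
    data: dict = field(default_factory=dict)

    def ping(self) -> str:
        return "PONG"

    def auth(self, password: Optional[str]) -> bool:
        # No password configured means no check is needed.
        return self.password is None or self.password == password

    def dump(self) -> dict:
        return dict(self.data)


def establish_replication(master: FakeMaster, masterauth: Optional[str] = None):
    """Walk through the steps in order and return (log, replicated_data)."""
    log = ["saved master address"]           # slaveof only stores the address
    log.append("connected")                  # the scheduled task opens the socket
    if master.ping() != "PONG":              # ping / pong check
        log.append("reconnect")
        return log, None
    log.append("pong received")
    if not master.auth(masterauth):          # authentication
        log.append("auth failed, replication terminated")
        return log, None
    log.append("auth ok")
    replica_data = master.dump()             # the full data set goes to the slave
    log.append("data synchronized")
    log.append("streaming write commands")   # continuous propagation afterwards
    return log, replica_data
```

Calling establish_replication with a wrong password stops at the authentication step and returns no data, which mirrors the "replication is terminated" case described above.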

2. Synchronization between data

One of the steps in the replication process described above is data synchronization, referred to here as "synchronization between data".

Redis has two synchronization commands: sync and psync. sync is the synchronization command used before Redis 2.8; psync is the new command introduced in Redis 2.8 to optimize sync. We will focus on the psync command of 2.8.

The psync command relies on three components:

The replication offsets of the master and slave nodes

The master node's replication backlog buffer

The master node's run ID

Replication offsets of the master and slave nodes:

The master and slave nodes participating in replication each maintain their own replication offset.

After processing a write command, the master node adds the byte length of the command to its offset. The value is reported in the master_repl_offset field of info replication.

The slave node reports its own replication offset to the master node every second, so the master node also stores the replication offset of each slave node.

After receiving a command sent by the master node, the slave node also adds its length to its own offset. The value is reported in the slave_repl_offset field of info replication.

By comparing the replication offsets of the master and slave nodes, we can tell whether their data is consistent.
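
As a rough illustration of that comparison, the sketch below parses the text printed by info replication on the master and computes how many bytes each slave is behind. The helper names (parse_info, slave_lag_bytes) and the SAMPLE text with its numbers are made up for this example; only the general "key:value" layout and the slaveN line format are taken from Redis output.

```python
def parse_info(text: str) -> dict:
    """Parse 'key:value' lines as printed by INFO, skipping section headers."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        fields[key] = value
    return fields


def slave_lag_bytes(master_info: str) -> dict:
    """Return {slave_name: bytes behind the master}, seen from the master."""
    fields = parse_info(master_info)
    master_offset = int(fields["master_repl_offset"])
    lags = {}
    for key, value in fields.items():
        # Slave lines look like: slave0:ip=...,port=...,state=...,offset=...,lag=...
        if key.startswith("slave") and key[5:].isdigit():
            attrs = dict(item.split("=", 1) for item in value.split(","))
            lags[key] = master_offset - int(attrs["offset"])
    return lags


# Illustrative sample only; the addresses and offsets are invented.
SAMPLE = """\
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=6380,state=online,offset=1055130,lag=1
master_repl_offset:1055214
"""

print(slave_lag_bytes(SAMPLE))
```

A lag of zero bytes means the slave has received everything the master has written so far; a value that keeps growing suggests the slave is falling behind.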

The master node's replication backlog buffer:

The replication backlog buffer is a fixed-length first-in-first-out queue stored on the master node, with a default size of 1MB.

The queue is created when a slave node connects. When the master node handles a write command, it not only sends the command to the slave node but also writes it to the replication backlog buffer.

Its role is to recover data lost during partial replication or when replication commands are lost. The relevant information can be viewed through info replication.
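
A minimal sketch of such a fixed-length FIFO is shown below, assuming a simplified offset convention in which read_from(offset) returns every byte written after the first offset bytes. The class name ReplBacklog and its methods are hypothetical; Redis itself uses a circular buffer in C, which this example does not try to reproduce.

```python
class ReplBacklog:
    """Fixed-size FIFO of the most recent replication bytes (illustrative only)."""

    def __init__(self, size: int = 1024 * 1024):
        self.size = size
        self.buf = bytearray()
        self.master_offset = 0  # total number of bytes ever written

    def feed(self, data: bytes) -> None:
        """Append newly propagated bytes and evict the oldest ones if needed."""
        self.buf += data
        self.master_offset += len(data)
        if len(self.buf) > self.size:
            # Keep only the newest `size` bytes.
            del self.buf[: len(self.buf) - self.size]

    @property
    def first_offset(self) -> int:
        """Offset of the oldest byte still held in the backlog."""
        return self.master_offset - len(self.buf)

    def read_from(self, offset: int):
        """Return the bytes after `offset`, or None if they were evicted."""
        if offset < self.first_offset or offset > self.master_offset:
            return None
        return bytes(self.buf[offset - self.first_offset:])
```

When read_from returns None, the requested data has already been pushed out of the queue, which corresponds to the case where only a full replication can bring the slave up to date.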

The master node's run ID:

Each time a Redis node starts, it generates a 40-character run ID.

The main purpose of the run ID is to uniquely identify a Redis node. If only ip+port were used to identify the master node, and the master node restarted and changed its RDB/AOF data, it would be unsafe for the slave node to keep replicating based on its offset. Therefore, when the run ID changes, the slave node performs a full replication. In other words, when Redis is restarted, the slave node performs a full replication by default.

What if you want to keep the run ID unchanged across a restart?

You can use the debug reload command to reload the RDB file while keeping the run ID unchanged, which avoids an unnecessary full replication.

The disadvantage is that debug reload blocks the main thread of the current Redis node, so it must be used with caution on master nodes that hold a large amount of data or on nodes that cannot tolerate blocking. Generally, this problem can be solved through the failover mechanism.
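
To make the role of the run ID concrete, here is a small sketch. It assumes the ID is simply 40 random hexadecimal characters; new_run_id and needs_full_resync are hypothetical helpers written for this example, not Redis functions.

```python
import secrets


def new_run_id() -> str:
    """Return a random 40-character hexadecimal identifier."""
    return secrets.token_hex(20)  # 20 random bytes -> 40 hex characters


def needs_full_resync(saved_master_run_id: str, current_master_run_id: str) -> bool:
    """A slave cannot trust its saved offset once the master's run ID changed."""
    return saved_master_run_id != current_master_run_id


before_restart = new_run_id()
after_restart = new_run_id()  # a restarted node gets a fresh ID
print(len(before_restart), needs_full_resync(before_restart, after_restart))
```

Because a restart produces a new identifier, the comparison fails and the slave falls back to full replication, which is the default behaviour described above.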

How the psync command is used:

The command format is psync {runId} {offset}

runId: the run ID of the master node that the slave node is replicating

offset: the replication offset of the data the slave node has replicated so far

psync execution process:

Process description:

The slave node sends a psync command to the master node. runId is the run ID of the target master node, or ? if the slave node has none. offset is the replication offset saved by the slave node, or -1 if this is the first replication.

The master node decides what to reply based on runId and offset:

If it replies +FULLRESYNC {runId} {offset}, the slave node starts the full replication process.

If it replies +CONTINUE, the slave node starts a partial replication.

If it replies with an error (-ERR), the master node does not support the psync command of 2.8, and the slave node falls back to sync to perform a full replication.
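
The decision described above can be sketched as two small functions. This is a simplified model under stated assumptions: handle_psync and slave_action are hypothetical names, the backlog is represented only by the offset of its oldest byte, and the exact offset arithmetic used by Redis is not reproduced.

```python
def handle_psync(master_run_id: str, master_offset: int,
                 backlog_first_offset: int,
                 req_run_id: str, req_offset: int) -> str:
    """Return the reply line a master would send for psync {runId} {offset}."""
    same_master = req_run_id == master_run_id
    # The requested data must still be inside the replication backlog buffer.
    in_backlog = backlog_first_offset <= req_offset <= master_offset
    if same_master and in_backlog:
        return "+CONTINUE"
    # Unknown run ID ("?"), first replication (-1), or data already evicted.
    return f"+FULLRESYNC {master_run_id} {master_offset}"


def slave_action(reply: str) -> str:
    """Map the master's reply to what the slave does next."""
    if reply.startswith("+FULLRESYNC"):
        return "full replication"
    if reply.startswith("+CONTINUE"):
        return "partial replication"
    if reply.startswith("-ERR"):
        return "fall back to sync"
    return "unexpected reply"


run_id = "a" * 40  # placeholder run ID for the example
print(slave_action(handle_psync(run_id, 1000, 200, "?", -1)))
print(slave_action(handle_psync(run_id, 1000, 200, run_id, 500)))
```

The first call models a brand-new slave (psync ? -1) and the second a slave that reconnects with an offset the master can still serve.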

That covers synchronization between data. This part is fairly long and focuses mainly on the psync command.

3. Full replication

Full replication is the earliest replication mode supported by Redis, and it is also a stage that the master and slave must go through when they establish replication for the first time. The commands that trigger full replication are sync and psync. As mentioned earlier, version 2.8 is the dividing line between the two: before 2.8, sync could only perform full synchronization; from 2.8 on, psync supports both full and partial synchronization.

The process is as follows:

The slave node sends the psync command (psync ? -1).

The master node returns FULLRESYNC in response to the command.

The slave node records the master node's run ID and offset.

The master node runs bgsave and saves the RDB file locally.

The master node sends the RDB file to the slave node.

The slave node receives the RDB file and loads it into memory.

While the slave node is receiving the data, the master node saves new write commands in the "replication client buffer" and sends them once the slave node has loaded the RDB file. (If the slave node takes too long, the buffer overflows and the full synchronization fails.)

The slave node empties its old data and then loads the RDB file. If the RDB file is very large, this step is also time-consuming. If clients read from the slave node during this time, they may see inconsistent data; this can be turned off with the slave-serve-stale-data configuration.

After the slave node has loaded the RDB file successfully, if AOF is enabled, it immediately runs bgrewriteaof.

The bold part above is the time-consuming part of the whole synchronization.
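
The interaction between the snapshot and the replication client buffer can be sketched as follows. This is a toy model, not Redis code: full_resync is a hypothetical function, the data set is a plain dict, and the buffer limit counts commands rather than bytes to keep the example short.

```python
def full_resync(master_data: dict, writes_during_sync: list, buffer_limit: int) -> dict:
    """Simulate a full replication and return the slave's resulting data."""
    snapshot = dict(master_data)              # bgsave: point-in-time copy (the RDB)
    buffered = []                             # the "replication client buffer"
    for key, value in writes_during_sync:     # the master keeps serving writes
        master_data[key] = value
        buffered.append((key, value))
        if len(buffered) > buffer_limit:
            raise OverflowError("replication client buffer overflow; full sync fails")
    slave_data = {}                           # the slave empties its old data
    slave_data.update(snapshot)               # then loads the RDB snapshot
    for key, value in buffered:               # and finally replays buffered writes
        slave_data[key] = value
    return slave_data


master = {"a": 1}
slave = full_resync(master, [("b", 2), ("a", 3)], buffer_limit=10)
print(slave == master)
```

Lowering buffer_limit below the number of writes that arrive during the transfer raises the overflow error, which models the failed full synchronization mentioned in the steps above.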

Note:

If the RDB file is larger than 6GB and the network card is gigabit, Redis's default timeout (60 seconds) will cause full replication to fail. This can be solved by increasing the repl-timeout parameter. Redis also supports diskless replication, in which the data is sent directly to the slave node over the network, but the feature is not yet mature and should be used with caution in production.
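
The 6GB figure can be checked with simple arithmetic. The sketch below is a back-of-the-envelope estimate only: transfer_seconds is a hypothetical helper, the 0.8 efficiency factor is an assumption about real-world throughput, and the time spent in bgsave and in loading the file is ignored.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def transfer_seconds(size_bytes: float, link_bits_per_sec: float,
                     efficiency: float = 1.0) -> float:
    """Estimate how long it takes to push size_bytes over a network link."""
    return size_bytes * 8 / (link_bits_per_sec * efficiency)


ideal = transfer_seconds(6 * GIB, 1e9)           # gigabit link at full line rate
realistic = transfer_seconds(6 * GIB, 1e9, 0.8)  # assumed 80% usable throughput
print(round(ideal, 1), round(realistic, 1))
```

Under these assumptions a 6GB file needs a little over 50 seconds at full line rate and more than 60 seconds at 80% efficiency, so the default timeout leaves almost no margin.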

4. Partial replication

If a network glitch or another exception occurs while the slave node is replicating from the master node, the slave node asks the master node to resend the lost command data. The master node only needs to send the data in the replication backlog buffer to the slave node to restore consistency, which costs much less than full replication.

When the slave node's network is interrupted for longer than repl-timeout, the master node closes the replication connection.

During the interruption the master node keeps serving requests and writes the commands to the replication backlog buffer, which defaults to 1MB.

When the network recovers and the slave node reconnects to the master node, the slave node sends its replication offset and the master node's run ID to the master node.

After the master node verifies them, if the data after the offset is still in the buffer, it replies +CONTINUE, indicating that partial replication is possible.

The master node sends the data in the buffer to the slave node, and master-slave replication returns to its normal state.
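
Whether partial replication succeeds depends on whether the writes made during the outage still fit in the backlog. The sketch below estimates this, assuming a constant write rate; required_backlog_bytes and can_partial_resync are hypothetical helpers, and the 100 KB/s write rate is an invented example value.

```python
MB = 1024 * 1024


def required_backlog_bytes(write_bytes_per_sec: float, outage_seconds: float) -> float:
    """Bytes of write commands the master accumulates during the outage."""
    return write_bytes_per_sec * outage_seconds


def can_partial_resync(backlog_bytes: float, write_bytes_per_sec: float,
                       outage_seconds: float) -> bool:
    """True if everything written during the outage is still in the backlog."""
    return required_backlog_bytes(write_bytes_per_sec, outage_seconds) <= backlog_bytes


# Default 1MB backlog with an assumed write rate of 100 KB/s.
print(can_partial_resync(1 * MB, 100 * 1024, 5))
print(can_partial_resync(1 * MB, 100 * 1024, 60))
```

With these numbers a 5-second outage can be repaired by partial replication, while a 60-second outage cannot, which is why the backlog size (the repl-backlog-size setting) is usually raised on write-heavy masters.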

5. Heartbeat

After the master and slave nodes establish replication, they maintain a persistent connection and send heartbeat commands to each other.

The key mechanisms of the heartbeat are as follows:

Both master and slave have a heartbeat detection mechanism. Each acts as a client of the other to communicate, and the client list command shows the replication-related client information: the master node's connection has flags=M and the slave node's connection has flags=S.

By default, the master node sends a ping command to the slave node every 10 seconds; the repl-ping-slave-period configuration controls the sending frequency.

The slave node sends the replconf ack {offset} command from its main thread every second to report its current replication offset to the master node.

After receiving the replconf command, the master node checks how long it has been since the slave node last reported; if this exceeds repl-timeout (60 seconds by default), the slave node is considered offline.
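
The timeout check on the master side can be sketched like this. SlaveTracker and its methods are hypothetical names for this example; timestamps are passed in explicitly so the logic can be followed without real clocks or sockets.

```python
class SlaveTracker:
    """Track the last replconf ack of each slave (illustrative only)."""

    def __init__(self, repl_timeout: float = 60.0):
        self.repl_timeout = repl_timeout
        self.last_ack = {}  # slave name -> (time of last ack, reported offset)

    def on_replconf_ack(self, slave: str, offset: int, now: float) -> None:
        """Record the offset a slave reported and when it was received."""
        self.last_ack[slave] = (now, offset)

    def offline_slaves(self, now: float) -> list:
        """Return slaves whose last ack is older than repl_timeout seconds."""
        return sorted(name for name, (seen, _) in self.last_ack.items()
                      if now - seen > self.repl_timeout)


tracker = SlaveTracker()
tracker.on_replconf_ack("slave-1", 100, now=0.0)
tracker.on_replconf_ack("slave-2", 100, now=0.0)
tracker.on_replconf_ack("slave-1", 150, now=30.0)  # slave-2 has gone silent
print(tracker.offline_slaves(now=61.0))
```

Only the slave that stopped reporting for longer than the timeout is flagged; the one that acknowledged at 30 seconds is still considered online.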

Note:

To reduce master-slave delay, Redis master and slave nodes are generally deployed in the same data center or in data centers in the same city, to avoid heartbeat interruptions caused by network latency or network partitions.

6. Asynchronous replication

The master node is responsible not only for reading and writing data but also for synchronizing write commands to the slave node. Write commands are sent asynchronously: the master node replies to the client as soon as it has processed the write command and does not wait for the slave node to finish replicating it.

The steps for asynchronous replication are simple, as follows:

The master node receives and processes the command.

The master node returns the response after processing.

Write commands are sent asynchronously to the slave node, and the slave node executes the replicated commands in its main thread.
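
The three steps can be sketched with a queue that separates replying to the client from sending to the slave. AsyncMaster is a hypothetical class for illustration, and only SET and GET are modelled; it shows the ordering of events, not the Redis event loop.

```python
from collections import deque


class AsyncMaster:
    """Toy master that replies first and propagates writes later."""

    def __init__(self):
        self.data = {}
        self.pending = deque()  # write commands not yet sent to the slave

    def execute(self, command: str, key: str, value=None):
        if command == "GET":
            return self.data.get(key)
        if command == "SET":
            self.data[key] = value
            self.pending.append((key, value))  # queued for later propagation
            return "OK"                        # reply without waiting for the slave
        raise ValueError(f"unsupported command: {command}")

    def propagate(self, slave_data: dict) -> int:
        """Send queued writes to the slave and return how many were applied."""
        sent = 0
        while self.pending:
            key, value = self.pending.popleft()
            slave_data[key] = value
            sent += 1
        return sent


master, slave = AsyncMaster(), {}
print(master.execute("SET", "k", "v"), slave)  # the client already has its reply
master.propagate(slave)
print(slave)
```

Between the reply and the call to propagate, the slave does not yet contain the key, which is the short window of inconsistency that asynchronous replication accepts in exchange for low write latency.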
