Detailed explanation of Redis full replication and partial replication examples

2025-07-11 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

Redis master-slave replication

A Redis instance is either a master node (master) or a slave node (slave). By default every Redis instance is a master. Each slave can have only one master, while one master can be replicated by several slaves at the same time. Replication data flows in one direction only, from the master to the slave.

Replication can be configured dynamically at run time with the slaveof command, or written in advance in the configuration file.

The master-slave replication process has the following steps:

1. Save the master node information. After slaveof is executed, the slave only saves the master's address information and returns immediately.

2. Establish the socket connection between master and slave. The slave maintains its replication logic in a scheduled task that runs every second. When the task finds a new master node, it tries to establish a network connection to it. The slave creates a socket dedicated to receiving the replication commands sent by the master. If the connection cannot be established, the scheduled task retries indefinitely until the connection succeeds or slaveof no one is executed to cancel replication.

3. Send the ping command. Once the connection is established, the slave sends a ping request as the first communication, to check whether the master can currently accept and process commands. If the slave receives no pong reply or the request times out (for example because of a network timeout, or because the master is blocked and cannot respond), the slave disconnects the replication connection and the next scheduled task initiates a reconnection.

4. Permission verification. If the master sets the requirepass parameter, password verification is required, and the slave must set masterauth to the same password to pass it. If verification fails, replication is terminated and the slave restarts the replication process.

5. Synchronize the data set. Once the replication connection communicates normally, and replication is being established for the first time, the master sends all of its data to the slave.

6. Continuous command replication. When the master has synchronized its current data to the slave, the replication establishment process is complete. From then on the master continuously sends write commands to the slave to keep master and slave data consistent.

Example: start two instances on ports 6380 and 6381, then execute the following command on 6381:

slaveof 127.0.0.1 6380

In Redis 5.0.0 the command was renamed to replicaof.

[Figures: 6380 startup; 6381 startup; output of info replication]
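The info replication output mentioned above is a list of key:value lines, so it is easy to inspect from a script. The following is only a minimal sketch: parse_info_replication is a hypothetical helper name, and the sample text is made up for illustration (the values are assumptions, not output captured from the 6380/6381 instances).

```python
def parse_info_replication(text):
    """Turn the 'key:value' lines of `info replication` into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


# Illustrative sample only; the values are assumed, not real output.
SAMPLE = """\
# Replication
role:slave
master_host:127.0.0.1
master_port:6380
master_link_status:up
"""

info = parse_info_replication(SAMPLE)
print(info["role"], info["master_host"], info["master_port"])
```

On a slave, role and master_link_status are the first two fields worth checking after running slaveof.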

Data synchronization

Type: Description

Full replication: Generally used in the initial replication scenario. In its early versions Redis supported only full replication, which sends all of the master's data to the slave in one go. When the data set is large, this puts a heavy load on both the master and the network.

Partial replication: Used to handle data lost because of a brief network interruption during master-slave replication. When the slave reconnects to the master, the master resends the lost data to the slave if conditions permit. Because the resent data is far smaller than the full data set, the excessive overhead of full replication is avoided.

Replication offset

Parameter: master_repl_offset

The master and slave nodes that take part in replication each maintain their own replication offset. After processing a write command, the master adds the byte length of the command to its offset, which is reported as master_repl_offset in info replication. The slave reports its own replication offset to the master every second, so the master also stores each slave's offset (shown in entries such as slave0). After receiving commands sent by the master, the slave likewise adds their length to its own offset.
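Comparing the two offsets gives a quick estimate of how far a slave is behind. The sketch below assumes the usual slaveN entry layout in info replication (ip=...,port=...,state=...,offset=...,lag=...); the helper names and the sample numbers are hypothetical.

```python
def parse_slave_entry(value):
    """Parse 'ip=...,port=...,state=...,offset=...,lag=...' into a dict."""
    return dict(field.split("=", 1) for field in value.split(","))


def replication_gap(master_repl_offset, slave_entry):
    """Bytes the slave still has to receive, as seen from the master."""
    slave_offset = int(parse_slave_entry(slave_entry)["offset"])
    return master_repl_offset - slave_offset


# Assumed sample values, for illustration only.
MASTER_OFFSET = 2301
SLAVE0 = "ip=127.0.0.1,port=6381,state=online,offset=2287,lag=1"

gap = replication_gap(MASTER_OFFSET, SLAVE0)
print(gap)
```

A gap that stays at or near zero suggests the slave is keeping up; a gap that keeps growing points to network or slave-side problems.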

Replication backlog buffer

The replication backlog buffer is a fixed-length queue kept on the master node. Its default size is 1MB, and it is created when the master has a connected slave. When the master processes a write command, it not only sends the command to the slave but also writes it to the replication backlog buffer. Because the buffer is essentially a fixed-length FIFO queue, it holds the most recently replicated data, which is used for partial replication and for recovering lost replication commands.

Parameter: Description

repl_backlog_active:1 : the replication backlog buffer is enabled
repl_backlog_size:1048576 : maximum length of the buffer
repl_backlog_first_byte_offset:1 : starting offset, used to calculate the currently available range of the buffer
repl_backlog_histlen:2301 : effective length of the data currently saved
master_replid : replication ID of the master instance; it is the same on the master and its slaves
master_replid2 : the master instance has not been switched, that is, the master has not changed, so the initial value is 0
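The behaviour of the backlog can be pictured with a small model. This is a simplified sketch, not the Redis implementation: ReplBacklog and its methods are hypothetical names, and offsets are treated as "the next byte wanted", with the first byte ever written at offset 1.

```python
from collections import deque


class ReplBacklog:
    """Toy model of a fixed-length FIFO replication backlog."""

    def __init__(self, size):
        self.buf = deque(maxlen=size)  # oldest bytes fall out when full
        self.master_offset = 0         # total bytes ever written

    def feed(self, data):
        """Record the bytes of a write command."""
        self.buf.extend(data)
        self.master_offset += len(data)

    @property
    def histlen(self):
        return len(self.buf)

    @property
    def first_byte_offset(self):
        return self.master_offset - self.histlen + 1

    def read_from(self, offset):
        """Bytes from `offset` onward, or None if they are no longer held."""
        if offset < self.first_byte_offset or offset > self.master_offset + 1:
            return None
        skip = offset - self.first_byte_offset
        return bytes(list(self.buf)[skip:])


backlog = ReplBacklog(size=8)
backlog.feed(b"abcde")
print(backlog.read_from(3))   # still in the buffer
backlog.feed(b"fghij")        # pushes out the two oldest bytes
print(backlog.read_from(2))   # too old: a full resync would be needed
```

The second read returns None because the requested offset has already been overwritten, which is exactly the situation where partial replication is not possible.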

Psync command

The slave node uses the psync command to perform both partial and full replication.

Master node log (pid 30227, 05 Aug 2019), where a partial resynchronization is rejected because the replication ID does not match:

* Replica 127.0.0.1:6381 asks for synchronization
* Partial resynchronization not accepted: Replication ID mismatch (Replica asked for 'e7d71fb600183a175afadbd1354e97edddb2541a', my replication IDs are 'e24f6e42917e7c162ec45a713bee3872005ee8b' and '0000000000000000000000000000000000000000')

Log analysis on the slave node (6381)

31771:S 06 Aug 2019 12:21:40.213 * DB loaded from disk: 0.000 seconds
31771:S 06 Aug 2019 12:21:40 * Before turning into a replica, using my master parameters to synthesize a cached master: I may be able to synchronize with the new master with just a partial transfer.
# started successfully
31771:S 06 Aug 2019 12:21:40.213 * Ready to accept connections
# begins connecting to the master node
31771:S 06 Aug 2019 12:21:40.214 * Connecting to MASTER 127.0.0.1:6380
# starts synchronization
31771:S 06 Aug 2019 12:21:40.214 * MASTER <-> REPLICA sync started
31771:S 06 Aug 2019 12:21:40.214 * Non blocking connect for SYNC fired the event.
31771:S 06 Aug 2019 12:21:40.214 * Master replied to PING, replication can continue...
# attempts a partial (incremental) synchronization
31771:S 06 Aug 2019 12:21:40 * Trying a partial resynchronization (request 668b25f85e84c5900e1032e4b5e1f038f01cfa49:5895).
# full synchronization
31771:S 06 Aug 2019 12:21:40.215 * Full resync from master: c88cd043d66193e867929d9d5fadc952954371e5:0
31771:S 06 Aug 2019 12:21:40.215 * Discarding previously cached master state.
31771:S 06 Aug 2019 12:21:40 * MASTER <-> REPLICA sync: receiving 224 bytes from master
31771:S 06 Aug 2019 12:21:40.241 * MASTER <-> REPLICA sync: Flushing old data
31771:S 06 Aug 2019 12:21:40 * MASTER <-> REPLICA sync: Loading DB in memory
31771:S 06 Aug 2019 12:21:40.241 * MASTER <-> REPLICA sync: Finished with success
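Log lines like the ones above follow a regular "pid:role date time level message" layout, so they can be parsed to measure how long a synchronization took. The sketch below is a rough illustration: parse_log_line and the regular expression are written for this article's examples (assumed layout, millisecond timestamps), not taken from Redis itself.

```python
import re
from datetime import datetime, timedelta

# pid:role, timestamp with milliseconds, one-character level, message.
LOG_RE = re.compile(
    r"^(?P<pid>\d+):(?P<role>[MSCX]) "
    r"(?P<ts>\d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2}\.\d{3}) "
    r"(?P<level>[.\-*#]) (?P<msg>.*)$"
)


def parse_log_line(line):
    """Return a dict with pid, role, ts, level and msg, or None if no match."""
    match = LOG_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["ts"] = datetime.strptime(record["ts"], "%d %b %Y %H:%M:%S.%f")
    return record


first = "31771:S 06 Aug 2019 12:21:40.213 * DB loaded from disk: 0.000 seconds"
last = "31771:S 06 Aug 2019 12:21:40.241 * MASTER <-> REPLICA sync: Finished with success"

elapsed = parse_log_line(last)["ts"] - parse_log_line(first)["ts"]
print(elapsed.total_seconds())
```

The same subtraction between the "Loading DB in memory" and "Finished with success" lines gives the RDB load time on the slave.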

Full replication

Full replication is the earliest replication mode supported by Redis, and it is a stage that master and slave must go through when replication is established for the first time. The commands that trigger full replication are sync and psync. The flow is as follows:

1. The slave sends the psync command to request data synchronization. Because this is the first replication, the slave has neither a replication offset nor the master's run ID, so it sends psync ? -1.

2. From psync ? -1 the master determines that this is a full replication and replies with +FULLRESYNC.

3. The slave receives the master's response and saves the master's run ID and offset.

4. The master executes bgsave and saves the RDB file locally:

31651:M 06 Aug 2019 11:08:40.802 * Starting BGSAVE for SYNC with target: disk
31651:M 06 Aug 2019 11:08:40.802 * Background saving started by pid 31676
31676:C 06 Aug 2019 11:08:40.805 * DB saved on disk
31676:C 06 Aug 2019 11:08:40.806 * RDB: 0 MB of memory used by copy-on-write
31651:M 06 Aug 2019 11:08:40 * Background saving terminated with success

5. The master sends the RDB file to the slave. The slave saves the received RDB file locally and uses it directly as its data file. After receiving the RDB, the slave prints:

31645:S 06 Aug 2019 11:08:40.886 * MASTER <-> REPLICA sync: receiving 224 bytes from master

6. From the moment the slave starts receiving the RDB snapshot until reception completes, the master still serves read and write commands, so it keeps the write command data in the replication client buffer. Once the slave has loaded the RDB file, the master sends the buffered data to the slave to keep master and slave consistent. If the master takes too long to create and transmit the RDB, heavy write traffic can easily overflow this buffer. The default setting in redis.conf is:

client-output-buffer-limit replica 256mb 64mb 60

With this setting, if the buffer stays above 64MB for 60 seconds, or exceeds 256MB at any moment, the master closes the replication client connection and the full synchronization fails.

7. When the master has sent all the data, it considers the full replication complete and logs "Synchronization with replica 127.0.0.1:6381 succeeded".

8. After receiving all the data from the master, the slave empties its old data and logs "MASTER <-> REPLICA sync: Flushing old data".

9. After emptying its data, the slave starts loading the RDB file. For a large RDB file this step is still time-consuming. The total load time can be determined from the time difference between the "MASTER <-> REPLICA sync: Loading DB in memory" and "MASTER <-> REPLICA sync: Finished with success" log lines.

10. After the slave has loaded the RDB successfully, if AOF persistence is enabled on it, it immediately performs a bgrewriteaof, so that the AOF file is usable right after full replication.

Why full replication is time-consuming:

- the master's bgsave time
- the network transfer time of the RDB file
- the time the slave needs to empty its data
- the possible AOF rewrite time

Since Redis 3.0, each log line starts with a role marker: M is the master node log, S is the slave node log, and C is the child process log.
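The effect of client-output-buffer-limit replica 256mb 64mb 60 can be illustrated with a small simulation. This is a simplified model under stated assumptions, not the server's code: first_disconnect_time is a hypothetical function, and it only looks at periodic (time, buffer size) samples.

```python
MB = 1024 * 1024


def first_disconnect_time(samples, hard=256 * MB, soft=64 * MB, soft_seconds=60):
    """Return the sample time at which the master would drop the replica.

    samples: iterable of (time_in_seconds, buffer_bytes), in time order.
    Returns None if neither limit is violated.
    """
    soft_since = None  # when the buffer first went over the soft limit
    for ts, used in samples:
        if used >= hard:
            return ts  # hard limit: disconnect immediately
        if used >= soft:
            if soft_since is None:
                soft_since = ts
            elif ts - soft_since > soft_seconds:
                return ts  # over the soft limit for too long
        else:
            soft_since = None  # dropped back below the soft limit
    return None


# Buffer sits above 64MB for more than 60 seconds.
print(first_disconnect_time([(0, 70 * MB), (30, 80 * MB), (61, 90 * MB)]))
# Buffer jumps above 256MB at once.
print(first_disconnect_time([(0, 10 * MB), (5, 300 * MB)]))
```

For write-heavy masters with large data sets, raising these limits gives a slow RDB transfer more room before the full synchronization is aborted.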

Partial replication

Partial replication is an optimization that Redis introduced to avoid the excessive overhead of full replication. It is implemented with the psync {runId} {offset} command. When a slave is replicating from its master and an abnormal condition occurs, such as a brief network interruption or command loss, the slave asks the master to resend the lost command data. If that data is still in the master's replication backlog buffer, it is sent directly to the slave, so that master and slave stay consistent. The resent data is generally much smaller than the full data set. The flow is as follows:

1. When the network between master and slave is interrupted and repl-timeout is exceeded, the master considers the slave failed and closes the replication connection:

31767:M 06 Aug 2019 14:13:26.096 # Connection with replica 127.0.0.1:6381 lost.

2. While the connection is down, the master still serves commands, but they cannot be sent to the slave because the replication connection is broken. The replication backlog buffer inside the master, however, still keeps the write command data of the most recent period. Its default maximum size is 1MB, and it can be inspected through info replication.

3. When the slave's network is restored, the slave reconnects to the master. The slave prints:

31934:S 06 Aug 2019 14:20:54.745 * MASTER <-> REPLICA sync started
31934:S 06 Aug 2019 14:20:54.745 * Non blocking connect for SYNC fired the event.
31934:S 06 Aug 2019 14:20:54.745 * Master replied to PING, replication can continue...
31934:S 06 Aug 2019 14:20:54.745 * Trying a partial resynchronization (request c88cd043d66193e867929d9d5fadc952954371e5:9996).
31934:S 06 Aug 2019 14:20:54.746 * Successful partial resynchronization with master.
31934:S 06 Aug 2019 14:20:54.746 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.

The master prints:

31767:M 06 Aug 2019 14:21:49.066 * Replica 127.0.0.1:6381 asks for synchronization
31767:M 06 Aug 2019 14:21:49.066 * Partial resynchronization request from 127.0.0.1:6381 accepted. Sending 0 bytes of backlog starting from offset 10066.

4. Because the slave saved its own replication offset and the master's run ID before the interruption, it sends them as the psync parameters when the connection is restored, asking the master for a partial replication. The corresponding slave log:

31938:S 06 Aug 2019 14:21:49.065 * Trying a partial resynchronization (request c88cd043d66193e867929d9d5fadc952954371e5:10066).

5. After receiving the psync command, the master first checks whether the runId parameter matches its own. A match means that the slave was previously replicating this master. The master then looks up the offset parameter in its replication backlog buffer. If the data after that offset is still in the buffer, it sends a +CONTINUE response to the slave, indicating that partial replication can proceed. On receiving the reply the slave prints:

31938:S 06 Aug 2019 14:21:49.066 * MASTER <-> REPLICA sync: Master accepted a Partial Resynchronization.

6. The master sends the data in the replication backlog buffer to the slave, starting from the offset, so that master-slave replication returns to its normal state. The amount of data sent can be read from the master's log:

31767:M 06 Aug 2019 14:21:49.066 * Partial resynchronization request from 127.0.0.1:6381 accepted. Sending 0 bytes of backlog starting from offset 10066.
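The master's choice between +CONTINUE and +FULLRESYNC can be summarised in a few lines. The sketch below is a deliberately simplified model: psync_decision is a hypothetical function, it ignores the secondary replication ID, and the backlog values used in the example (first byte offset 1, 10065 bytes of history) are assumptions chosen to match the "Sending 0 bytes of backlog starting from offset 10066" log line.

```python
def psync_decision(master_replid, first_byte_offset, histlen, req_replid, req_offset):
    """Decide how a master answers 'psync {runId} {offset}'.

    Returns ("+CONTINUE", bytes_to_send) or ("+FULLRESYNC", None).
    """
    # A different ID means the slave was not replicating this master.
    if req_replid != master_replid:
        return ("+FULLRESYNC", None)
    # First offset that is NOT yet in the backlog (the next byte to be written).
    next_offset = first_byte_offset + histlen
    # The requested offset must still be covered by the backlog.
    if req_offset < first_byte_offset or req_offset > next_offset:
        return ("+FULLRESYNC", None)
    return ("+CONTINUE", next_offset - req_offset)


REPLID = "c88cd043d66193e867929d9d5fadc952954371e5"

print(psync_decision(REPLID, 1, 10065, REPLID, 10066))  # slave is fully caught up
print(psync_decision(REPLID, 1, 10065, "?", -1))        # first-time replication
```

The first call models the reconnect from the logs above, where nothing was missed; the second models a brand-new slave that has no run ID or offset to offer.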

Heartbeat

After master and slave establish replication, they keep a long-lived connection and send heartbeat commands to each other. The heartbeat mechanism works as follows:

Both master and slave have a heartbeat detection mechanism, and each acts as a client of the other. In the client list output, the master's connection is shown with flags=M and the slave's connection with flags=S.

By default the master sends a ping command to the slave every 10 seconds (repl-ping-replica-period 10) to check whether the slave is alive and connected.

The slave, in its main thread, sends replconf ack {offset} every second to report its current replication offset to the master.

The master uses the replconf command to judge whether the slave has timed out. This is reflected in the lag field of the info replication statistics. lag is the number of seconds since the slave's last communication; normally it is 0 or 1. If it exceeds the value configured by repl-timeout (60 seconds by default), the master judges the slave to be offline and disconnects the replication client. Even after the master has judged the slave offline, heartbeat detection resumes if the slave recovers.
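The lag and timeout rule described above comes down to simple arithmetic on the time of the last replconf ack. A minimal sketch, assuming whole-second timestamps; replica_lag is a hypothetical helper, not a Redis API.

```python
def replica_lag(last_ack_time, now, repl_timeout=60):
    """Return (lag_in_seconds, is_offline) for one slave.

    last_ack_time: when the master last received 'replconf ack' from the slave.
    now: current time, in the same unit (seconds).
    """
    lag = now - last_ack_time
    return lag, lag > repl_timeout


print(replica_lag(last_ack_time=1000, now=1001))  # healthy slave
print(replica_lag(last_ack_time=1000, now=1061))  # silent for too long
```

A lag that stays at 0 or 1 matches the normal case; anything beyond repl_timeout is treated as an offline slave.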

Asynchronous replication

The master node is not only responsible for reading and writing data, but also responsible for synchronizing write commands to the slave node. The sending process of the write command is completed asynchronously, that is, the master node directly returns to the client after processing the write command, and does not wait for the slave node to complete the replication.
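A toy model makes the consequence of asynchronous sending visible: the client already has its reply while the slave has not yet seen the write. Everything here is a hypothetical sketch (the Master, Replica and deliver names are invented for illustration, and the offset uses a simplified stand-in for the command's byte length).

```python
class Master:
    def __init__(self):
        self.data = {}
        self.offset = 0
        self.pending = []  # writes not yet delivered to the slave

    def set(self, key, value):
        self.data[key] = value
        size = len(f"SET {key} {value}")  # simplified stand-in for the byte length
        self.offset += size
        self.pending.append((key, value, size))
        return "OK"  # reply is returned without waiting for the slave


class Replica:
    def __init__(self):
        self.data = {}
        self.offset = 0

    def apply(self, key, value, size):
        self.data[key] = value
        self.offset += size


def deliver(master, replica):
    """Send all pending writes to the slave, in order."""
    while master.pending:
        replica.apply(*master.pending.pop(0))


master, replica = Master(), Replica()
print(master.set("k", "v"))   # client gets its reply here
print(replica.data.get("k"))  # slave has not applied the write yet
deliver(master, replica)
print(replica.data.get("k"))  # now the slave has caught up
```

Between the reply and the delivery the two offsets differ, which is the window in which a read from the slave can return stale data.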

Separation of reading and writing

For scenarios where the read share is relatively high, the pressure on the master node (master) can be reduced by allocating part of the read traffic to the slave node (slave). At the same time, it is suggested that you should consider using distributed solutions such as Redis Cluster before doing read-write separation.
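Read-write separation is usually done in the client or in a proxy. The sketch below shows one possible routing rule: writes go to the master, reads are spread over the slaves in turn. It is an assumption-laden illustration: ReadWriteRouter is a hypothetical class, WRITE_COMMANDS is a deliberately incomplete list, and the 6382 address is invented for the example.

```python
from itertools import cycle

# Partial, illustrative list of write commands; a real router needs the full set.
WRITE_COMMANDS = {"SET", "DEL", "INCR", "EXPIRE", "LPUSH", "HSET"}


class ReadWriteRouter:
    def __init__(self, master, replicas):
        self.master = master
        # Round-robin over the slaves; fall back to the master if there are none.
        self._replicas = cycle(replicas) if replicas else None

    def route(self, command):
        """Return the address that should serve this command line."""
        name = command.split()[0].upper()
        if name in WRITE_COMMANDS or self._replicas is None:
            return self.master
        return next(self._replicas)


router = ReadWriteRouter("127.0.0.1:6380", ["127.0.0.1:6381", "127.0.0.1:6382"])
print(router.route("SET k v"))
print(router.route("GET k"))
print(router.route("GET k"))
```

Because replication is asynchronous, a read routed to a slave may return slightly stale data, so reads that must see the latest write are better sent to the master.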

Summary

The above is the whole content of this article. I hope the content of this article has a certain reference and learning value for everyone's study or work. Thank you for your support.
