This article explains what master-slave replication, Sentinel, and clustering are used for in Redis. It aims to be easy to follow and clear; I hope it helps resolve your doubts as we study these three mechanisms together.
I. Master-slave replication
1. Introduction
Master-slave replication is both the cornerstone of Redis's distributed capabilities and the guarantee of Redis's high availability. In Redis, the server being replicated is called the master server (Master), and the server that replicates the master is called the slave server (Slave).
Configuring master-slave replication is very simple, and there are three ways (where ip is the master server's IP address and port is the master server's Redis service port):
Configuration file: add slaveof ip port to the redis.conf file
Command: in the Redis client, execute slaveof ip port
Startup parameter: ./redis-server --slaveof ip port
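For the command-based approach, here is a minimal sketch using redis-py (a third-party client, assumed installed via pip install redis); the addresses are placeholders borrowed from the resource lists later in this article:

```python
import redis

# connect to the server that should become the slave (placeholder address)
replica = redis.Redis(host="192.168.211.105", port=6379)

# equivalent to running SLAVEOF 192.168.211.104 6379 in redis-cli
replica.slaveof("192.168.211.104", 6379)

# verify the role switched to "slave"
print(replica.info("replication")["role"])
```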
2. The evolution of master-slave replication
Redis's master-slave replication mechanism did not start out as complete as it is in version 6.x; it was iterated version by version, in roughly three stages:
Before 2.8
2.8 to 4.0
After 4.0
As the versions progressed, Redis's master-slave replication mechanism improved step by step, but it has always revolved around two operations, synchronization (sync) and command propagation (command propagate):
Synchronization (sync): bringing the slave server's data state up to the master server's current data state, used at initialization and for later full synchronization.
Command propagation (command propagate): when the master server's data state is modified (writes, deletes, etc.) and master and slave diverge, the master server propagates the data-changing command to the slave server to bring the two states back into agreement.
2.1 before version 2.8
2.1.1 synchronization
In versions prior to 2.8, the slave server synchronized with the master by sending it the sync command:
The slave server receives the slaveof ip port command from a client and creates a socket connection to the master server at that ip:port
Once the socket connects to the master successfully, the slave server associates the connection with a file event handler dedicated to replication work, which will handle the RDB file the master sends and the commands propagated afterwards
Replication begins: the slave server sends the sync command to the master server
On receiving sync, the master server executes the bgsave command: its main process forks a child process that generates an RDB file, while every write command executed after the snapshot begins is recorded in a buffer
When bgsave completes, the master server sends the generated RDB file to the slave server; the slave first clears all of its own data, then loads the RDB file, bringing its data state up to that of the master's RDB file
The master server sends the buffered write commands to the slave server, which receives and executes them
At this point the synchronization step of master-slave replication is complete
2.1.2 Command propagation
Once synchronization is complete, master and slave maintain consistency through command propagation. For example, suppose that after synchronization the master server receives DEL K6 from a client and deletes K6; K6 still exists on the slave server, so the master and slave data states have diverged. To keep them consistent, the master server propagates every command that changes its data state to the slave server; once the slave executes the same command, the two data states agree again.
2.1.3 defects
From the description above, master-slave replication before version 2.8 may look flawless, but only because we have not yet considered network instability. Anyone familiar with distributed systems will have heard of the CAP theorem, the cornerstone of distributed storage; in CAP, partition tolerance (P) must always be accounted for, and Redis master-slave replication is no exception. When a network failure leaves the slave server unable to communicate with the master for a while, and the master's data state changes in the meantime, the two will be inconsistent once the slave reconnects. In Redis versions before 2.8, the way to resolve this inconsistency was to resend the sync command. Although sync guarantees that the servers end up consistent, it is clearly a very resource-consuming operation.
When the sync command executes, the master and slave servers need the following resources:
The master server executes BGSAVE to generate an RDB file, consuming a great deal of CPU, disk I/O, and memory
The master server sends the generated RDB file to the slave server, consuming a great deal of network bandwidth
While receiving and loading the RDB file, the slave server is blocked and unable to serve requests
As these three points show, sync not only degrades the master server's responsiveness but also leaves the slave server refusing service for the duration.
2.2 version 2.8-4.0
2.2.1 improvement points
Compared with versions before 2.8, Redis improved how data state is synchronized after a slave server reconnects. The direction of improvement was to avoid full resynchronization wherever possible and use partial resynchronization instead. From version 2.8 onward, the psync command replaces sync for synchronization, and psync combines both capabilities:
Full synchronization behaves the same as in previous versions (sync)
Partial synchronization takes different measures for replication after a disconnect and reconnect; when conditions permit, only the data the slave server is missing is sent
2.2.2 how to implement psync
To achieve partial synchronization after a slave server disconnects and reconnects, Redis added three supporting mechanisms:
Replication offset (replication offset)
Replication backlog buffer (replication backlog)
Server run ID (run id)
2.2.2.1 replication offset
Both the master server and the slave server maintain a replication offset
When the master server propagates N bytes of data to the slave servers, it increases its replication offset by N
When a slave server receives N bytes of data from the master, it increases its replication offset by N
During normal synchronization the two offsets advance in step, so comparing the master's and slaves' replication offsets tells us whether their data states agree. Suppose slaves A and B keep receiving propagated commands normally while slave C is disconnected: when C reconnects, its offset lags the master's, say by 100 bytes. Clearly, with replication offsets, after slave C disconnects and reconnects the master server only needs to send the 100 bytes C is missing. But how does the master server know which data the slave is missing?
2.2.2.2 Replication backlog buffer
The replication backlog buffer is a fixed-length queue, 1MB in size by default. When the master server's data state changes, the master synchronizes the data to the slave servers and also keeps a copy in the replication backlog buffer.
To allow matching against offsets, the replication backlog buffer records not only the data content but also the offset of each byte:
When a slave server reconnects after a disconnect, it sends its own replication offset to the master via the psync command, and the master uses this offset to decide between partial and full synchronization:
If the data at offset+1 is still in the replication backlog buffer, a partial synchronization is performed
Otherwise, a full synchronization is performed, identical to sync (a small sketch of this decision follows below)
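A toy sketch (not Redis source code) of that decision, using the 100-byte gap from the earlier example:

```python
def choose_sync(slave_offset: int, master_offset: int, backlog_bytes: int) -> str:
    # bytes the slave is missing since it disconnected
    missing = master_offset - slave_offset
    # the backlog only retains the most recent backlog_bytes of writes,
    # so offset+1 is still available only while the gap fits inside it
    if 0 <= missing <= backlog_bytes:
        return f"partial: resend the last {missing} bytes"
    return "full: fall back to RDB-based synchronization"

print(choose_sync(10000, 10100, 1024 * 1024))  # partial: resend the last 100 bytes
print(choose_sync(10000, 10100 + 2 * 1024 * 1024, 1024 * 1024))  # full
```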
The replication backlog buffer defaults to 1MB; what if we need to customize it? Obviously we want to use partial synchronization as often as possible, without the buffer taking up too much memory. We can size it by estimating the average time T it takes a slave server to reconnect, and the volume M of write command data the master server receives per second, then setting the backlog size S to:
S = 2 * M * T
Note that the factor of two is there to leave some headroom, ensuring that the vast majority of disconnect-reconnect cycles can use partial synchronization.
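A quick back-of-the-envelope calculation of this heuristic (the traffic and reconnect numbers are made up for illustration):

```python
def backlog_size(write_bytes_per_sec: float, avg_reconnect_secs: float) -> int:
    # S = 2 * M * T: doubling leaves headroom so that most
    # disconnect-reconnect cycles can still use partial sync
    return int(2 * write_bytes_per_sec * avg_reconnect_secs)

# e.g. ~1 MB/s of write commands, slaves take ~5 s to reconnect
print(backlog_size(1024 * 1024, 5))  # 10485760, i.e. 10 MB
# which could then be applied with: CONFIG SET repl-backlog-size 10485760
```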
2.2.2.3 Server run ID
Reading this far, you may wonder: the mechanisms above already make partial sync after a reconnect possible, so why do we need a run ID? There is one case not yet covered: the master server goes down and a slave server is elected as the new master. Comparing run IDs lets us recognize this situation.
The run ID (run id) is a 40-character random hexadecimal string generated automatically when a server starts; both master and slave servers generate one.
When a slave server synchronizes with a master for the first time, the master sends it its run ID, and the slave saves it (persisted in the RDB file)
When the slave server reconnects after a disconnect, it sends back the master run ID it saved; if it matches the current master's run ID, the master has not changed and partial synchronization can be attempted
If the run IDs do not match, a full synchronization is performed
2.2.3 The complete psync flow
The complete psync process is quite involved, and it was refined over the 2.8 to 4.0 master-slave replication versions. The psync command is sent with the following parameters:
PSYNC <runid> <offset>
When the slave server has never replicated any master server (or has since executed slaveof no one), the first synchronization must be full, and the slave sends:
PSYNC ? -1
A complete psync process runs as follows:
The slave server receives the SLAVEOF 127.0.0.1 6379 command
The slave server returns OK to the command's issuer (this is asynchronous: OK is returned first, and the address and port are saved afterwards)
The slave server saves the IP address and port into Master Host and Master Port
The slave server opens a socket connection to the master according to Master Host and Master Port, and associates the socket with a file event handler dedicated to replication, for the subsequent RDB file transfer and related work
The master server receives the socket connection request from the slave, creates the corresponding socket for it, and from then on regards the slave server as one of its clients (in master-slave replication, master and slave are in fact each other's client and server)
Once the socket connection is established, the slave server proactively sends a PING command to the master; if the master returns PONG within the specified timeout, the connection is usable, otherwise the slave disconnects and reconnects
If the master server has a password set (masterauth), the slave server sends AUTH masterauth to authenticate. Note: if the slave sends a password but the master has none set, the master returns a no password is set error; if the master requires a password and the slave sends none, the master returns a NOAUTH error; if the password is wrong, the master returns an invalid password error
The slave server sends REPLCONF listening-port xxxx (where xxxx is the slave's port); the master stores this, and returns it when a client queries master-slave information with INFO replication
The slave server sends the psync command; for this step, see the two psync cases described above
The master and slave servers then act as each other's clients, making requests and responses for data
Master and slave use a heartbeat mechanism to tell whether the connection has dropped: every second, the slave server sends REPLCONF ACK <offset> (its replication offset) to the master. This mechanism keeps master-slave data correctly synchronized: if the offsets are unequal, the master takes partial or full synchronization measures to restore consistency (the choice depends on whether the data at offset+1 is still in the replication backlog buffer).
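A small sketch of inspecting this state from the outside with redis-py; the fields read here (role, master_repl_offset, connected_slaves, slaveN) are standard INFO replication output, and the address is a placeholder:

```python
import redis

master = redis.Redis(host="192.168.211.104", port=6379)  # placeholder address
info = master.info("replication")

print(info["role"])                # "master"
print(info["master_repl_offset"])  # the master's replication offset
# each replica appears as slave0, slave1, ... with its ip, port,
# state and the offset it last acknowledged via REPLCONF ACK
for i in range(info.get("connected_slaves", 0)):
    print(info[f"slave{i}"])
```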
2.3 version 4.0
Redis 2.8 to 4.0 still left room for improvement: can synchronization stay partial when the master server is switched? Redis 4.0 optimizes exactly this, upgrading psync to psync2.0. psync2.0 abandons the server run ID and uses replid and replid2 instead, where replid stores the current master server's run ID and replid2 stores the previous master server's run ID. The supporting mechanisms become:
Replication offset (replication offset)
Replication backlog buffer (replication backlog)
Current master server run ID (replid)
Previous master server run ID (replid2)
With replid and replid2, we can solve the problem of partial synchronization across a master switch:
If replid equals the current master's run ID, choose between partial and full synchronization as before
If replid does not match, check replid2 (did the slave belong to the previous master?); if it matches, partial or full synchronization may still be chosen; if not, only full synchronization is possible. A sketch of this decision:
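(A toy model, not Redis source code; the real check also verifies how far the old replication history extends, which is simplified away here.)

```python
def psync2_decision(slave_replid: str, slave_offset: int,
                    replid: str, replid2: str,
                    master_offset: int, backlog_bytes: int) -> str:
    gap = master_offset - slave_offset
    gap_in_backlog = 0 <= gap <= backlog_bytes
    if slave_replid in (replid, replid2):
        # same master as before, or a slave of the previous master:
        # partial sync is possible while the backlog still covers the gap
        return "partial" if gap_in_backlog else "full"
    # unknown replication history: only a full sync is safe
    return "full"
```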
II. Sentinel
1. Introduction
Master-slave replication lays the foundation for distributed Redis, but ordinary master-slave replication alone cannot achieve high availability: in plain master-slave mode, if the master server goes down, only manual intervention by operations staff can switch over to a new master, which is clearly undesirable. For this situation, Redis officially provides a high-availability solution that can withstand node failures: Redis Sentinel. A Sentinel system consists of one or more Sentinel instances; it can monitor any number of master and slave servers, automatically take a monitored master offline when it goes down, and promote one of its slaves to be the new master.
For example: when the old Master has been offline longer than the user-configured limit, the Sentinel system performs a failover on it, consisting of three steps:
Select the Slave with the newest data as the new Master
Send new replication instructions to the other Slaves so that they become slaves of the new Master
Continue to monitor the old Master, and if it comes back online, make it a Slave of the new Master
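From the application side, redis-py ships a Sentinel client that rides along with such failovers; a minimal sketch, where the addresses and the monitored-master name "redis-master" match the configuration used later in this section:

```python
from redis.sentinel import Sentinel

sentinel = Sentinel(
    [("192.168.211.104", 26379),
     ("192.168.211.105", 26379),
     ("192.168.211.106", 26379)],
    socket_timeout=0.5,
)
# ask the Sentinel system who the current master is
print(sentinel.discover_master("redis-master"))  # (ip, port)

# connections that transparently follow a failover to the new master
master = sentinel.master_for("redis-master", socket_timeout=0.5)
replica = sentinel.slave_for("redis-master", socket_timeout=0.5)
master.set("k1", "v1")
print(replica.get("k1"))
```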
This article is based on the following resource list:
| IP address | Node role | Port |
| --- | --- | --- |
| 192.168.211.104 | Redis Master / Sentinel | 6379 / 26379 |
| 192.168.211.105 | Redis Slave / Sentinel | 6379 / 26379 |
| 192.168.211.106 | Redis Slave / Sentinel | 6379 / 26379 |
2. Sentinel initialization and network connections
There is nothing particularly magical about Sentinel: it is a simpler Redis server that loads a different command table and configuration file at startup, so Sentinel is essentially a Redis service with fewer commands and some special functionality. When a Sentinel starts, it goes through the following steps:
Initialize the Sentinel server
Replace the ordinary Redis code with the special code of Sentinel
Initialize the Sentinel state
Initialize the list of master servers monitored by Sentinel based on the Sentinel configuration file supplied by the user
Create network connections to the master servers
Obtain slave server information from the master servers and create network connections to the slave servers
Discover other Sentinels via publish/subscribe and create network connections between Sentinels
2.1 initialize the Sentinel server
Sentinel is essentially a Redis server, so starting Sentinel requires starting a Redis server, but Sentinel does not need to read the RDB/AOF file to restore the data state.
2.2 replace the ordinary Redis code with the special code of Sentinel
Sentinel uses fewer Redis commands, and most ordinary commands cannot be used on a Sentinel client; Sentinel also has special functionality of its own. Therefore, at startup, Sentinel replaces the code used by the regular Redis server with Sentinel-specific code, loading a command table different from a normal Redis server's: commands such as SET and DBSIZE are not supported, while PING, PSUBSCRIBE, SUBSCRIBE, UNSUBSCRIBE, INFO and the like are retained; these commands underpin Sentinel's work.
2.3 Initialize the Sentinel state
After loading Sentinel-specific code, Sentinel initializes the sentinelState structure, which is used to store Sentinel-related state information, the most important of which is the masters dictionary.
```c
struct sentinelState {
    // current epoch, used during failover
    uint64_t current_epoch;
    // master servers monitored by this Sentinel
    // key   -> master server name
    // value -> pointer to a sentinelRedisInstance
    dict *masters;
    // ...
} sentinel;
```
2.4 Initialize the list of master servers monitored by Sentinel
The list of master servers monitored by Sentinel is saved in the masters dictionary of sentinelState, and when the sentinelState is created, initialization of the list of master servers monitored by Sentinel begins.
The key of masters is the master server's name
The value of masters is a pointer to sentinelRedisInstance
The name of the master server is specified by our sentinel.conf configuration file. For example, the name of the following master server is redis-master (here is the configuration of one master and two slaves):
```
daemonize yes
port 26379
protected-mode no
dir "/usr/local/soft/redis-6.2.4/sentinel-tmp"
sentinel monitor redis-master 192.168.211.104 6379 2
sentinel down-after-milliseconds redis-master 30000
sentinel failover-timeout redis-master 180000
sentinel parallel-syncs redis-master 1
```
The sentinelRedisInstance instance holds the information of the Redis server (master server, slave server, and Sentinel information are all stored in this instance).
```c
typedef struct sentinelRedisInstance {
    // identifies the type and status of the instance,
    // e.g. SRI_MASTER, SRI_SLAVE, SRI_SENTINEL
    int flags;
    // instance name: the master server's name is configured by the user,
    // slave servers and Sentinels are named ip:port
    char *name;
    // the server's run ID
    char *runid;
    // configuration epoch, used during failover
    uint64_t config_epoch;
    // the instance's address
    sentinelAddr *addr;
    // duration for the subjective-down judgment
    // (sentinel down-after-milliseconds redis-master 30000)
    mstime_t down_after_period;
    // votes required for the objective-down judgment
    // (sentinel monitor redis-master 192.168.211.104 6379 2)
    int quorum;
    // max number of slaves syncing with the new master during a failover
    // (sentinel parallel-syncs redis-master 1)
    int parallel_syncs;
    // timeout for refreshing the failover state
    // (sentinel failover-timeout redis-master 180000)
    mstime_t failover_timeout;
    // ...
} sentinelRedisInstance;
```
According to the above one master and two slaves configuration, you will get the following instance structure:
2.5 Create network connections to the master servers
Once the instance structures are initialized, Sentinel starts creating network connections to each Master, becoming a client of the Master. Two connections are created between Sentinel and Master, a command connection and a subscription connection:
The command connection is used to obtain master-slave information
The subscription connection is used to broadcast information between Sentinels. Each Sentinel subscribes to the __sentinel__:hello channel on every master and slave server it monitors (note that Sentinels do not create subscription connections to one another; they learn of other Sentinels' existence through the __sentinel__:hello channel)
After creating the command connection, Sentinel sends the INFO command to the Master every 10 seconds. From the Master's replies it learns two kinds of information:
Information of Master itself
Slave information under Master
2.6 Create network connections to the slave servers
Using the slave server information obtained from the master, Sentinel can create network connections to each Slave; command and subscription connections are likewise created between Sentinel and Slave.
Once connected, Sentinel becomes a client of the Slave as well, requesting server information from the Slave via the INFO command every 10 seconds. At this step Sentinel holds data on both the Master and its Slaves; the more important fields are:
Server IP and port
Server run ID (run id)
Server role (role)
Master link status (master_link_status)
Slave replication offset (slave_repl_offset), needed when electing a new Master during a failover
Slave priority (slave_priority)
The instance structure information is as follows:
2.7 Create network connections between Sentinels
You may be wondering how Sentinels discover and communicate with one another; this is where the __sentinel__:hello channel mentioned above comes in. Every Sentinel subscribes to __sentinel__:hello on all the Masters and Slaves it monitors, and every 2 seconds each Sentinel publishes a message to that channel, of the following form:
PUBLISH __sentinel__:hello "<s_ip>,<s_port>,<s_runid>,<s_epoch>,<m_name>,<m_ip>,<m_port>,<m_epoch>"
The s_ parameters describe the Sentinel itself and the m_ parameters the monitored master; ip is the IP address, port the port, runid the run ID, and epoch the configuration epoch.
Multiple Sentinels configure the same master server ip and port in their configuration files, so they all subscribe to __sentinel__:hello, and from the messages received on the channel each Sentinel can obtain the ip and port of the others. Two points to note:
If the obtained runid is the same as Sentinel's own runid, the message is published by itself and discarded directly.
If it is not the same, it means that the received message is published by another Sentinel, and you need to update or add Sentinel instance data according to ip and port.
Subscription connections are not created between Sentinels; they only create command connections with one another:
The instance structure information is as follows:
3. How Sentinel works
Sentinel's main job is to monitor the Redis servers and switch in a new Master when the current Master has been unresponsive beyond the preset limit. There are many details, roughly divided into four steps: detecting whether the Master is subjectively down, detecting whether it is objectively down, electing a leader Sentinel, and failover.
3.1 Detect whether the Master is subjectively down
Every second, Sentinel sends the PING command to all the Masters, Slaves, and Sentinels recorded in its sentinelRedisInstance structures, and uses their responses to judge whether they are still online.
sentinel down-after-milliseconds redis-master 30000
As configured in Sentinel's configuration file, if an instance keeps returning invalid replies to PING for down-after-milliseconds, the current Sentinel considers it subjectively down. The configured down-after-milliseconds applies to all the Masters, Slaves, and Sentinels in that Sentinel's sentinelRedisInstance structures.
Invalid replies are any replies other than +PONG, -LOADING, and -MASTERDOWN, including no response at all
If the current Sentinel judges the Master to be subjectively down, it sets the flags of that Master's sentinelRedisInstance to SRI_S_DOWN
3.2 Detect whether the Master is objectively down
What the current Sentinel has established so far is only subjective-down status. To determine whether the Master is objectively down, it must ask the other Sentinels, and the number of Sentinels that consider the Master subjectively or objectively down must reach the configured quorum; only then will the current Sentinel mark the Master as objectively down.
The current Sentinel sends the following command to the other Sentinels in its sentinelRedisInstance structures:
SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>
ip: the IP address of the Master judged subjectively down
port: the port of the Master judged subjectively down
current_epoch: the current configuration epoch of this Sentinel
runid: the run ID of the current Sentinel
current_epoch and runid are both used in the Sentinel election: after a Master goes objectively down, a leader Sentinel must be elected to carry out the election of the new Master.
A Sentinel that receives this command checks, based on the parameters, whether it also considers that master down, and replies with three parameters:
down_state: the check result; 1 = down, 0 = not down
leader_runid: * means the reply concerns only whether the master is down; a runid means the reply is a vote in the leader Sentinel election
leader_epoch: when leader_runid is a runid, this carries the configuration epoch; otherwise it is always 0
While a Sentinel is merely checking for subjective-down status, it queries the others with current_epoch=0 and runid=*
The receiving Sentinels then reply only with down_state to indicate whether they consider the Master down, with leader_runid=* and leader_epoch=0
3.3 Electing the leader Sentinel
A down_state of 1 means the Sentinel receiving the is-master-down-by-addr command also considers the Master subjectively down. If the number of 1 replies (including the current Sentinel's own judgment) reaches quorum (the value in the configuration file), the Master is formally marked objectively down by the current Sentinel. At that point, the Sentinel sends the following command again:
SENTINEL is-master-down-by-addr <ip> <port> <current_epoch> <runid>
This time runid is no longer *, but the Sentinel's own run ID, indicating that the current Sentinel wants every Sentinel receiving the command to set it as the leader Sentinel. The setting is strictly first-come-first-served: each Sentinel votes for whichever requester reaches it first. The sender then counts the replies to see how many Sentinels set it as leader; if more than half of all Sentinels did (the total can be read from the sentinels dictionary of the sentinelRedisInstance), it considers itself the leader Sentinel and begins the subsequent failover work. Since more than half is required and each Sentinel casts only one vote, at most one leader can emerge; if no Sentinel meets the requirement, the election is rerun until a leader is produced.
3.4 failover
The leader Sentinel is responsible for the failover, which involves three things:
From the slaves of the old master, select the best slave as the new master
Make the other slaves become slaves of the new master
Continue to monitor the old master, and if it comes back online, make it a slave of the new master
The hardest step is choosing the best new Master. The leader Sentinel filters and sorts the candidates as follows (see the sketch after this list):
Determine whether each slave is disconnected, and if so remove it from the candidate list
Remove any slave that has not responded to the Sentinel's INFO command within the last 5 seconds
Remove all slaves whose link to the downed master has been broken for more than down_after_milliseconds * 10
Sort by slave priority (slave_priority) and select the slave with the highest priority as the new master
If priorities are equal, select the slave with the largest replication offset (slave_repl_offset)
If offsets are also equal, sort by run ID and select the slave with the smallest run id
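A toy sketch (not Redis source code) of this filter-and-sort; note that in Redis a numerically smaller slave_priority means a higher priority:

```python
def pick_new_master(slaves, now_ms, down_after_ms):
    candidates = [
        s for s in slaves
        if s["connected"]                                   # drop offline slaves
        and now_ms - s["last_info_reply_ms"] <= 5_000       # replied to INFO within 5 s
        and s["master_link_down_ms"] <= down_after_ms * 10  # not disconnected too long
    ]
    # highest priority first (smaller number = higher priority),
    # then largest replication offset, then smallest run id
    candidates.sort(key=lambda s: (s["slave_priority"],
                                   -s["slave_repl_offset"],
                                   s["runid"]))
    return candidates[0] if candidates else None
```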
Once the new Master is chosen, the leader Sentinel sends SLAVEOF ip port to the remaining slaves of the downed master (excluding the new Master), making them slaves of the new master.
At this point Sentinel's workflow is complete; if the new master ever goes down, the cycle starts again!
III. Cluster
1. Introduction
Redis Cluster is the distributed database solution provided by Redis; the cluster shares data through sharding. Redis Cluster mainly pursues the following goals:
Good performance at up to around 1000 nodes, with linear scalability
No merge operations (multiple nodes never hold the same key), which performs well with the typically large values in Redis's data model
Write safety: the system tries to retain all writes made by clients connected to the majority of nodes; still, Redis cannot fully guarantee that no data is lost, since asynchronous master-slave replication can always lose writes
Availability: when a master node is unavailable, a slave node can take over its work
If you have no prior experience with Redis clusters, these resources (Chinese-language) are recommended: the Redis Cluster tutorial,
redis cluster-tutorial -- Redis Chinese Information Station -- Redis Chinese User Group (CRUG)
the Redis Cluster Specification,
redis cluster-spec -- Redis Chinese Information Station -- Redis Chinese User Group (CRUG)
and a 3-master 3-slave pseudo-cluster deployment walkthrough:
Installing a Redis Cluster (3-master 3-slave pseudo-cluster) on a single CentOS 7 machine in five simple steps -- Li Ziqi's blog -- CSDN
The content below assumes the following 3-master 3-slave structure:
Resource list:
| Node | IP:port | Slot range |
| --- | --- | --- |
| Master[0] | 192.168.211.107:6319 | 0 - 5460 |
| Master[1] | 192.168.211.107:6329 | 5461 - 10922 |
| Master[2] | 192.168.211.107:6339 | 10923 - 16383 |
| Slave[0] | 192.168.211.107:6369 | - |
| Slave[1] | 192.168.211.107:6349 | - |
| Slave[2] | 192.168.211.107:6359 | - |
2. Inside the cluster
Redis Cluster does not use consistent hashing; it introduces the concept of hash slots. The cluster has 16384 hash slots, and each key is checksummed with CRC16 and taken modulo 16384 to decide which slot it is placed in. This structure makes it easy to add or remove nodes. Each node in the cluster is responsible for a portion of the hash slots; for example, in the three-node cluster from the resource list above, the slots are allocated as follows:
Node Master [0] contains hash slots 0 to 5460
Node Master [1] contains hash slots 5461 to 10922
Node Master [2] contains hash slots 10923 to 16383
Before delving into the Redis cluster, you need to understand the internal structure of a cluster Redis instance. When a Redis service node enables cluster mode with cluster-enabled yes, it continues to use the stand-alone server components, but adds structures such as clusterState, clusterNode, and clusterLink to hold cluster-specific data.
The following three data structures deserve a careful read, especially the comments inside them; once you have read them, you will have a general picture of how the cluster works.
2.1 clusterNode
clusterNode stores a node's information, such as its name, IP address, port, and configuration epoch. The code below lists the most important fields:
```c
typedef struct clusterNode {
    // creation time
    mstime_t ctime;
    // node name: 40 random hexadecimal characters
    // (same form as a server run ID in Sentinel)
    char name[REDIS_CLUSTER_NAMELEN];
    // node flags, identifying the node's role and state
    // role  -> e.g. REDIS_NODE_MASTER (master) or REDIS_NODE_SLAVE (slave)
    // state -> e.g. REDIS_NODE_PFAIL (suspected down) or REDIS_NODE_FAIL (down)
    int flags;
    // configuration epoch, used during failover
    uint64_t configEpoch;
    // node IP address and port
    char ip[REDIS_IP_STR_LEN];
    int port;
    // connection information for this node
    clusterLink *link;
    // a 2048-byte binary array (16384 bits)
    // bit i == 0 -> this node is not responsible for slot i
    // bit i == 1 -> this node is responsible for slot i
    unsigned char slots[16384/8];
    // total number of slots this node handles
    int numslots;
    // if this node is a slave: pointer to its master node
    struct clusterNode *slaveof;
    // if this node is a master: the number of its slaves ...
    int numslaves;
    // ... and an array recording all of its slave nodes
    struct clusterNode **slaves;
} clusterNode;
```
The part that may be hard to parse above is slots[16384/8]: simply think of it as a bit array with 16384 bits. A bit value of 1 at index i means slot i is handled by this clusterNode, and 0 means it is not; through slots, a clusterNode can identify which slots it is responsible for. In a freshly created clusterNode, or in a cluster whose slots are unassigned, the slots array is all zeroes:
Suppose the cluster is like the resource list I gave above, and the slots of the clusterNode representing Master [0] is as follows:
2.2 clusterLink
clusterLink is a field of clusterNode that stores the information needed for the node's connection, such as the socket descriptor and the input and output buffers. The code below lists the most important fields:
```c
typedef struct clusterLink {
    // connection creation time
    mstime_t ctime;
    // TCP socket descriptor
    int fd;
    // output buffer: messages to be sent to other nodes are cached here
    sds sndbuf;
    // input buffer: messages received from other nodes are cached here
    sds rcvbuf;
    // the node connected to the node this clusterLink belongs to
    struct clusterNode *node;
} clusterLink;
```
2.3 clusterState
Every node has a clusterState structure that stores, from that node's perspective, the data of the entire cluster: the cluster state, information on all nodes in the cluster (masters and slaves), and so on. The code below lists the most important fields:
```c
typedef struct clusterState {
    // pointer to this node's own clusterNode
    clusterNode *myself;
    // the cluster's current configuration epoch, used during failover
    // (similar to its use in Sentinel)
    uint64_t currentEpoch;
    // cluster state: online / offline
    int state;
    // number of nodes in the cluster handling at least one slot
    int size;
    // dictionary of all cluster nodes, including this one
    dict *nodes;
    // slot assignment across the whole cluster
    clusterNode *slots[16384];
    // for slot reallocation: slots this node is importing from other nodes
    clusterNode *importing_slots_from[16384];
    // for slot reallocation: slots this node is migrating to other nodes
    clusterNode *migrating_slots_to[16384];
    // ...
} clusterState;
```
Three parts of clusterState deserve careful attention. The first is the slots array, which differs from the slots array in clusterNode: in clusterNode, slots records the slots that particular node is responsible for, whereas the slots array in clusterState records, for every slot in the entire cluster, which clusterNode is responsible for it. So when the cluster is working normally, each index of clusterState's slots array points to the clusterNode responsible for that slot, and before slots are assigned it points to null.
The figure showed the clusterState slots array for the resource-list cluster alongside a clusterNode's slots array.
The reason a Redis cluster uses two slots arrays comes down to performance:
When we need to know which clusterNode is responsible for a given slot anywhere in the cluster, we only have to index the slots array in clusterState; without it, we would have to traverse every clusterNode structure, which is obviously slower
The slots array in clusterNode is also necessary, because every node in the cluster needs to tell the others which slots it is responsible for, and for that the nodes simply exchange their clusterNode slots arrays
The second structure to understand is the nodes dictionary. It is simple: it stores every clusterNode, and it is the main place where a single node looks up information about the other master and slave nodes, so keep it in mind. The third is the pair importing_slots_from[16384] and migrating_slots_to[16384]; these two arrays come into play when the cluster reshards, and we will return to them below.
3. How the cluster works
3.1 How are slots (slots) assigned?
A Redis cluster has 16384 slots in total. As the resource list above shows, in a 3-master 3-slave cluster each master node is responsible for its own range of slots; yet during the deployment of those three masters and slaves I never assigned slots to the master nodes myself, because the cluster tooling divided the slots for us. But what if we want to assign slots ourselves? We can send the following command to a node to assign one or more slots to it:
CLUSTER ADDSLOTS <slot> [slot ...]
For example, if we want to assign 0 and 1 slots to Master [0], we just need to send the following command to the Master [0] node:
CLUSTER ADDSLOTS 0 1
When a node is assigned slots, it updates its clusterNode slots array, and it sends the slots it is now responsible for, i.e. the slots array, to the other nodes in the cluster via a message; after receiving the message, the other nodes update the corresponding clusterNode's slots array and their clusterState slots array.
3.2 how is ADDSLOTS implemented within the Redis cluster?
This is actually fairly simple. When we send CLUSTER ADDSLOTS to a node in the Redis cluster, the node first uses the slots array in its clusterState to confirm that none of the requested slots is already assigned to another node; if any is, it raises an error straight back to the client. If none of the requested slots is assigned elsewhere, the node assigns them to itself, in three main steps (sketched in the code after this list):
Update the clusterState slots array so that each assigned slots[i] points to the current clusterNode
Update the clusterNode slots array, setting the value at each assigned slot i to 1
Send a message to the other nodes in the cluster carrying the clusterNode slots array; after receiving it, the other nodes likewise update their clusterState slots array and the corresponding clusterNode slots array
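A toy model (not Redis source code) of this bookkeeping, with the two slots arrays side by side:

```python
NUM_SLOTS = 16384

class Node:
    def __init__(self, name: str):
        self.name = name
        self.slots = bytearray(NUM_SLOTS // 8)  # clusterNode.slots bitmap

    def own_slot(self, i: int):
        self.slots[i // 8] |= 1 << (i % 8)      # set bit i to 1

cluster_slots = [None] * NUM_SLOTS              # clusterState.slots

def addslots(node: Node, slots: list[int]):
    # refuse the whole command if any requested slot is already taken
    if any(cluster_slots[i] is not None for i in slots):
        raise ValueError("ERR slot is already busy")
    for i in slots:
        cluster_slots[i] = node                 # clusterState.slots[i] -> node
        node.own_slot(i)                        # clusterNode.slots bit i -> 1
    # a real node would now broadcast its slots bitmap to the other nodes

m0 = Node("Master[0]")
addslots(m0, [0, 1])
print(cluster_slots[0].name)  # Master[0]
```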
3.3 With so many nodes in the cluster, how does a client know which node to send a request to?
Before answering, you need to know how the Redis cluster computes which slot a key belongs to. According to the official documentation, Redis does not use a consistent hashing algorithm; instead, each requested key is checksummed with CRC16 and taken modulo 16384 to decide which slot it is placed in:
HASH_SLOT = CRC16(key) mod 16384
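A self-contained Python sketch of this mapping (not Redis source); the CRC16 variant (CCITT/XMODEM, polynomial 0x1021, initial value 0) and the hash-tag rule follow the cluster specification:

```python
def crc16(data: bytes) -> int:
    crc = 0
    for b in data:
        crc ^= b << 8
        for _ in range(8):
            crc = ((crc << 1) ^ 0x1021) if crc & 0x8000 else (crc << 1)
            crc &= 0xFFFF
    return crc

def key_hash_slot(key: bytes) -> int:
    # hash tags: if the key contains {...} with a non-empty body,
    # only that part is hashed, so related keys land in the same slot
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:
            key = key[start + 1:end]
    return crc16(key) % 16384

print(key_hash_slot(b"foo"))
print(key_hash_slot(b"{user1}.following") == key_hash_slot(b"{user1}.followers"))  # True
```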
When a client sends a request to some node, the node receiving the command first computes the slot i that the key belongs to, then checks in its clusterState whether it is responsible for slot i. If it is, it responds to the client's request itself; if not, the following happens:
The node returns a MOVED redirection error to the client, carrying the ip and port of the clusterNode that correctly handles the key
When the client receives the MOVED redirection error, it resends the command to the correct node using that ip and port. The whole process is transparent to the programmer, handled jointly by the Redis cluster's server and client.
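Client libraries hide this loop; for instance, redis-py's RedisCluster (available since redis-py 4.0, assumed here) maps keys to slots itself and follows MOVED redirections:

```python
from redis.cluster import RedisCluster

# any reachable node will do; the client discovers the rest of the cluster
rc = RedisCluster(host="192.168.211.107", port=6319)

rc.set("k1", "v1")       # routed to the node owning slot CRC16("k1") % 16384
print(rc.get("k1"))
print(rc.keyslot("k1"))  # the slot the client computed for this key
```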
3.4 What if I want to reassign slots already assigned to node A over to node B?
This question actually covers many scenarios, such as removing nodes from the cluster or adding new ones, all of which boil down to moving hash slots from one node to another. Redis Cluster is very capable here: it supports reassigning slots online (without downtime), officially called live reconfiguration.
Before walking through it, let's look at the CLUSTER subcommands involved; once these are clear, the operation follows:
CLUSTER ADDSLOTS slot1 [slot2]... [slotN]
CLUSTER DELSLOTS slot1 [slot2]... [slotN]
CLUSTER SETSLOT slot NODE node
CLUSTER SETSLOT slot MIGRATING node
CLUSTER SETSLOT slot IMPORTING node
These are the main CLUSTER commands for slot assignment. ADDSLOTS and DELSLOTS are used for rapid bulk slot assignment and deletion, normally only when a cluster is first established; CLUSTER SETSLOT slot NODE node likewise assigns a slot directly to a specified node. Once the cluster is established, we usually reshard with the last two, whose meaning is as follows:
When a slot is set to MIGRATING, the node that held the hash slot still accepts all requests related to it, but only serves a request if the queried key still exists locally; otherwise, the query is forwarded to the migration target node through an -ASK redirection
When a slot is set to IMPORTING, the node accepts queries for that hash slot only if the request is preceded by an ASKING command; if the client never sends ASKING, the query is redirected to the node officially handling the hash slot through a -MOVED redirection error
Do those two sentences feel hard to grasp? That is the official description; in plainer terms, the overall flow is roughly as follows:
redis-trib (the cluster management tool, which is responsible for slot assignment in a Redis cluster) sends CLUSTER SETSLOT slot IMPORTING node to the target node (the slot importer), which prepares to import the slot from the source node (the slot exporter)
redis-trib then sends CLUSTER SETSLOT slot MIGRATING node to the source node, which prepares to export the slot
redis-trib then sends CLUSTER GETKEYSINSLOT slot count to the source node, which returns up to count keys belonging to slot slot
For each key returned by the source node, redis-trib sends it a MIGRATE ip port key 0 timeout command; keys still on the source node are migrated to the target node
When migration completes, redis-trib sends CLUSTER SETSLOT slot NODE node to a node in the cluster; that node updates its clusterNode and clusterState structures and then spreads the slot's new assignment through messages. The migration of the cluster slot is now finished, and the other nodes in the cluster update their slot assignment information as well. A sketch of the whole flow:
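Here it is driven by hand with redis-py's generic execute_command; the node IDs are placeholders you would read from CLUSTER NODES output:

```python
import redis

src = redis.Redis(host="192.168.211.107", port=6319)  # exporting node
dst = redis.Redis(host="192.168.211.107", port=6329)  # importing node
slot, src_id, dst_id = 100, "<source-node-id>", "<target-node-id>"

dst.execute_command("CLUSTER SETSLOT", slot, "IMPORTING", src_id)
src.execute_command("CLUSTER SETSLOT", slot, "MIGRATING", dst_id)

while True:
    keys = src.execute_command("CLUSTER GETKEYSINSLOT", slot, 100)
    if not keys:
        break
    for key in keys:
        # MIGRATE host port key destination-db timeout
        src.execute_command("MIGRATE", "192.168.211.107", 6329, key, 0, 5000)

# finally, publish the slot's new owner to the cluster
dst.execute_command("CLUSTER SETSLOT", slot, "NODE", dst_id)
```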
3.5 what if the slot to which the key accessed by the client belongs is being migrated?
Excellent, you are already thinking about this kind of concurrency!
This case has been considered officially too. Remember the clusterState structure we discussed? importing_slots_from and migrating_slots_to exist precisely to handle it.
```c
typedef struct clusterState {
    // ...
    // for slot reallocation: slots this node is importing from other nodes
    clusterNode *importing_slots_from[16384];
    // for slot reallocation: slots this node is migrating to other nodes
    clusterNode *migrating_slots_to[16384];
    // ...
} clusterState;
```
When a node is exporting a slot, it sets the entry at that slot's index in its clusterState migrating_slots_to array to point to the importing node's clusterNode.
When a node is importing a slot, it sets the entry at that slot's index in its clusterState importing_slots_from array to point to the exporting node's clusterNode.
With these two complementary arrays, a node can tell whether a given slot is being migrated, where it is coming from, and where it is going. Simple, isn't it?
Back to the question: suppose the key the client requests belongs to a slot that is mid-migration. The node receiving the command first tries to find the key in its own database. If the slot has not finished migrating and the key itself has not been moved yet, the node simply serves the request. If the key is gone, the node checks the slot's index in the migrating_slots_to array; if the value there is not null but points to a clusterNode, the key has already been migrated to that node. Rather than processing the command, the node returns an -ASK redirection carrying the ip and port of the importing clusterNode, and the client must redirect the request to the correct node. But there is one thing to note here.
As mentioned earlier, when a node finds that a slot does not belong to it at all, it returns MOVED; so what is different during migration? The cluster plays it like this: when the node sees the slot is migrating, it returns the -ASK redirection with the ip and port of the importing clusterNode, and the client must first send an ASKING command to that node. The point of ASKING is to tell the importing node to handle the next request as an exception: the slot is being migrated to it but has not yet been published in clusterState, so without ASKING the node would consult its clusterState, find the slot not its own, and return MOVED, sending the client around in circles. A node that receives ASKING executes the following request once, and only once; the next request must be preceded by another ASKING.
4. Cluster failure
Failure handling in a Redis cluster is relatively simple, and it resembles Sentinel: when a master node goes down or fails to respond within the specified maximum time, a new master is re-elected from its slave nodes. The precondition, of course, is that we gave every master node in the cluster slave nodes in advance; otherwise there is no chance of failover. The general steps are as follows:
In a working cluster, each node periodically sends PING commands to the other nodes. If a pinged node does not return PONG within the specified time, the pinging node sets the flags of that node's clusterNode to REDIS_NODE_PFAIL; PFAIL does not mean down, but suspected down
Cluster nodes exchange messages to inform one another of the status of every node in the cluster
If more than half of the master nodes responsible for slots mark a given master as suspected down, that master is marked as down: a node sets the flags of its clusterNode to REDIS_NODE_FAIL, indicating it has gone offline
The cluster nodes again spread each node's status via messages; when the slaves of the downed master learn that their master has been marked offline, it is time for them to step forward
The slaves of the downed master elect one among themselves as the new master, and the chosen node executes SLAVEOF no one to become the new master
The new master revokes the old master's slot assignments and assigns those slots to itself, that is, it modifies the clusterNode and clusterState structures
The new master broadcasts a PONG message to the cluster; the other nodes learn that a new master has appeared and update their clusterNode and clusterState structures
The new master sends SLAVEOF instructions to the remaining slaves of the old master, making them its own slaves
Finally, the new master takes over responding for the old master's slots.
That concludes this article on what master-slave replication, Sentinel, and clustering are used for in Redis. Thank you for reading; I hope the content shared here helps you.