2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article walks through how Redis Cluster is implemented (nodes, slots, replication, failover, and resharding) and what that design means for write safety.
I. Interface and architecture
1. Interface
The interface of Redis Cluster is largely backward compatible with standalone Redis and is still a key-value interface.
2. Architecture
Redis Cluster has two kinds of components: servers and clients. A cluster can contain multiple servers and serve multiple clients. Each client can connect to any server to read and write data. The data stored in a Redis Cluster is split into multiple parts that are spread across the servers, and multiple copies of each part are kept.
II. Realization
1. Node
In Redis Cluster, data is stored on multiple Redis servers. Each Redis server is a separate process with its own IP and port, and is also called an instance or a node. A client connects to a node through this IP and port.
Each node has a node id, a globally unique identifier that is randomly generated when the cluster is created. The id of a node never changes, but the IP and port of a node can change.
2. Node table
The information about which nodes a Redis Cluster contains, that is, the cluster membership, is stored in a table structure called the node table. Conceptually, each row of the node table records a node id together with that node's IP and port.
Every node keeps a copy of the node table, and Redis Cluster replicates the node table to all nodes through the gossip protocol. Node table replication is discussed further below.
3. Change of cluster membership
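When you add a node, you only need to send the command to one node in the cluster. That node modifies its local node table, and the change is eventually replicated to all nodes. The command to add a node is CLUSTER MEET, and the command to delete a node is CLUSTER FORGET. Note that in practice CLUSTER FORGET has to be sent to every remaining node, because a node that still remembers the removed node would otherwise re-advertise it through gossip.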
An example:
Suppose you are building a cluster of three master nodes, A, B, and C. Before the cluster is created, the node table of each node contains only itself. Send CLUSTER MEET with node B's IP and port to node A. Node A adds node B to its own node table, connects to node B, and sends its node table to node B. Node B receives the node table sent by node A and updates its own, so node B now knows that node A is also in the cluster.
Next, send CLUSTER MEET with node C's IP and port to node A. Node A adds node C to its own node table and sends the table to node B, which updates its local node table and so learns about node C. Similarly, node A sends its node table to node C, and node C updates its local node table to learn that node A and node B already exist in the cluster it is joining.
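The three-node example can be modelled in a few lines of Python. This is only a sketch under simplifying assumptions: the function names, the addresses, and the "push the whole table to every known peer" behaviour are illustrative, whereas real gossip exchanges partial information over time.

```python
# Hypothetical model: every node table maps node id -> (ip, port).

def gossip(tables, sender):
    """Push the sender's node table to every peer it knows; peers merge it."""
    for peer in list(tables[sender]):
        if peer != sender:
            tables[peer].update(tables[sender])


def cluster_meet(tables, addresses, receiver, newcomer):
    """Model CLUSTER MEET sent to `receiver` with the address of `newcomer`."""
    tables[receiver][newcomer] = addresses[newcomer]
    gossip(tables, receiver)


# Illustrative addresses, not taken from the article.
addresses = {
    "A": ("10.0.0.1", 7000),
    "B": ("10.0.0.2", 7000),
    "C": ("10.0.0.3", 7000),
}

# Before the cluster exists, each node table contains only the node itself.
tables = {node_id: {node_id: addr} for node_id, addr in addresses.items()}

cluster_meet(tables, addresses, "A", "B")
after_first_meet = {node_id: sorted(table) for node_id, table in tables.items()}

cluster_meet(tables, addresses, "A", "C")
after_second_meet = {node_id: sorted(table) for node_id, table in tables.items()}
```

After the first call only A and B know each other; after the second call all three tables list A, B, and C.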
4. Slot
As mentioned earlier, Redis Cluster divides the data into multiple parts. Each part is called a slot. Redis Cluster splits the data into 16384 parts, which means there are 16384 slots.
Redis Cluster uses hashing to split the data. First, a hash value is computed from the key with the CRC16 algorithm. This hash value is taken modulo 16384, and the remainder is the slot number, called the hash slot. The specific CRC16 variant is described in the Redis official documentation. All keys with the same remainder are in one slot; that is, a slot is a batch of keys with the same hash remainder.
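As a rough illustration of the key-to-slot calculation, the sketch below implements the CRC16 variant given in the Redis Cluster Specification (XMODEM: polynomial 0x1021, initial value 0) and takes the result modulo 16384. The function names are my own, and the sketch hashes the whole key without any special handling of key contents.

```python
def crc16_xmodem(data: bytes) -> int:
    """Bitwise CRC16 with polynomial 0x1021 and initial value 0."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def hash_slot(key: str) -> int:
    """Map a key to one of the 16384 hash slots."""
    return crc16_xmodem(key.encode("utf-8")) % 16384


# The specification lists 0x31C3 as the CRC16 of "123456789".
check_value = crc16_xmodem(b"123456789")
check_slot = hash_slot("123456789")
```

Because 0x31C3 is 12739, which is smaller than 16384, the key "123456789" lands in slot 12739.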
Each hash slot is stored on one node of the Redis Cluster. The record of which hash slot is stored on which node forms a map-like data structure called the hash slot map. The hash slot map looks like:
0 -> NodeA
1 -> NodeA
2 -> NodeB
...
16383 -> NodeN
Like the node table, the hash slot map is kept on every node, and Redis Cluster replicates it to all nodes through the gossip protocol. Replication of the hash slot map is also discussed later.
The "Redis official documentation" referred to here is the Redis Cluster Specification, https://redis.io/topics/cluster-spec.
5. Data fragmentation change
To change how data is sharded, connect to any node and send the CLUSTER ADDSLOTS, CLUSTER DELSLOTS, or CLUSTER SETSLOT command to modify the hash slot map on that node. The node replicates the change to all other nodes, which update their hash slot maps with the one they receive.
The CLUSTER ADDSLOTS, CLUSTER DELSLOTS, and CLUSTER SETSLOT commands are used as follows:
CLUSTER ADDSLOTS slot1 [slot2] ... [slotN]
CLUSTER DELSLOTS slot1 [slot2] ... [slotN]
CLUSTER SETSLOT slot NODE node
CLUSTER ADDSLOTS assigns one or more slots to the currently connected node. For example, connect to node A and execute:
CLUSTER ADDSLOTS 1 2 3
This command assigns slots 1, 2, and 3 to node A.
CLUSTER DELSLOTS removes one or more slots from the currently connected node. For example, connect to node A and execute:
CLUSTER DELSLOTS 1 2 3
This command removes slots 1, 2, and 3 from node A.
CLUSTER SETSLOT assigns a slot to a specified node, which need not be the currently connected node. This command can also set the MIGRATING and IMPORTING states, which are covered later. For example, connect to node A and execute:
CLUSTER SETSLOT 1 NODE nodeB
This command assigns slot 1 to node B.
6. Slave
Multiple copies of each slot are kept; that is, a slot is replicated to multiple nodes. Replication in Redis Cluster works node by node: all slots on a node share the same replication setup.
Specifically, one node handles all writes for the slots it holds and is called the master; the nodes that replicate it are called slave nodes. A master can have more than one slave. All writes to all slots on a master are replicated asynchronously from the master to all of its slaves, so a slave holds the same slots as its master.
In standalone Redis, a slave is configured with the SLAVEOF command, which changes the replication settings of a node. In cluster mode the equivalent operation is CLUSTER REPLICATE node-id, because cluster-enabled nodes reject SLAVEOF. The SLAVEOF command has two formats:
SLAVEOF NO ONE
SLAVEOF host port
Specifically, the SLAVEOF NO ONE command stops the replication of a slave node and turns the slave node into a master node. The SLAVEOF host port command stops replication of a slave node, discards the dataset, and starts replication from the new master node specified by host and port.
The relationship between master and slave is recorded in the hash slot map, which means that a slot is mapped to multiple nodes, one of which is the master while the others are slaves.
The hash slot map with master/slave information looks like:
0 -> NodeA, NodeA1 (slave)
1 -> NodeA, NodeA1 (slave)
2 -> NodeB, NodeB1 (slave)
...
16383 -> NodeN, NodeN1 (slave)
As part of hash slot map, master/slave information is also replicated to each node in the cluster through the gossip protocol.
7. Configuration
The node table and the hash slot map together are called the Configuration; other distributed systems call this metadata.
As mentioned earlier, every node keeps a copy of the hash slot map and the node table, and any change to either is propagated to all nodes through the gossip protocol. This article does not go further into the gossip protocol; readers interested in how Redis Cluster implements it can refer to the Redis documentation.
In some distributed systems, metadata is stored in a separate architectural component. Redis Cluster does not have such a component for metadata storage, but instead stores the metadata on all nodes.
The hash slot map and the node table are created when the cluster is created and are updated by later cluster changes (such as failover, and operational tasks such as scaling out and scaling in, which are described later). A change is initiated on one node and propagated to all other nodes through the gossip protocol. Both pieces of information are stored on all nodes and eventually become consistent.
8. Cluster creation
To create a Redis Cluster, you first start multiple Redis servers in cluster mode. At this point each server already has a node id, but the servers do not yet form a cluster; that is, they do not know of each other's existence. Next, run the CLUSTER MEET command to join these nodes into a cluster. The cluster still cannot serve requests at this point, because the slots have not been allocated. Once the slots are assigned to specific nodes with the CLUSTER ADDSLOTS command, the cluster can process client commands.
9. Read and write operations on the client
Although a client can connect to any server, in practice it has to connect to the right server for each key. The client calculates the hash slot from the key and connects to the node responsible for that hash slot to read and write. To do so, the client needs to know the hash slot -> node mapping, that is, the hash slot map.
As mentioned earlier, the hash slot map is stored on every node on the server side. A client can obtain the hash slot map from any node and cache it locally. Subsequent operations use the local cache directly, but the cached information can become stale. If the client finds that the hash slot map has changed (that is, a server returns an error when the client reads or writes data, as described in more detail next), it fetches the new hash slot map from the server again. With the hash slot map, the client can determine which node a key belongs to and connect to that node for read and write operations.
10. MOVED Redirection
The hash slot map changes over time, and these changes are replicated to all nodes. However, gossip only guarantees that the changes eventually reach all nodes, and clients also cache the hash slot map, so a client may send a request for a key to the wrong node.
When the wrong node receives the request, it finds that the key is not its responsibility and returns a MOVED error to the client. The error message tells the client which node is responsible for the slot. When the client receives the MOVED error, it sends the request again to the node named in the message.
The client can update just that slot in its locally cached hash slot map, but a better approach is to fetch the complete hash slot map again and replace the local cache, because in most cases a change to the hash slot map affects more than a single slot.
Even after the client resends the request based on the node information in the MOVED error, it may receive another MOVED error from the new node, because the hash slot map of the previous node may not have been up to date. But because the hash slot map eventually becomes consistent on all nodes, the client eventually reaches the right node after a few MOVED errors.
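The redirect loop can be sketched with an in-memory model. Everything here is hypothetical (the `Node` class, `client_get`, and the three-node setup); the slot number is passed in directly to keep the sketch short, and the client patches a single cache entry rather than refetching the whole map as recommended above.

```python
class Node:
    def __init__(self, name, slot_map):
        self.name = name
        self.slot_map = slot_map  # this node's own view: slot -> owner name
        self.data = {}

    def get(self, slot, key):
        owner = self.slot_map[slot]
        if owner != self.name:
            # Not our slot: tell the client who we believe the owner is.
            return ("MOVED", owner)
        return ("OK", self.data.get(key))


def client_get(nodes, cache, slot, key, max_redirects=5):
    """Follow MOVED errors until a node accepts the request."""
    target = cache[slot]
    for _ in range(max_redirects):
        status, payload = nodes[target].get(slot, key)
        if status == "OK":
            return payload
        target = payload          # MOVED: payload names the next node to try
        cache[slot] = target      # simplest fix-up of the local cache
    raise RuntimeError("too many MOVED redirections")


# Slot 1 really belongs to C. Node A is one change behind and still
# believes B owns it, and the client's cache is older still.
nodes = {
    "A": Node("A", {1: "B"}),
    "B": Node("B", {1: "C"}),
    "C": Node("C", {1: "C"}),
}
nodes["C"].data["user:1"] = "alice"
client_cache = {1: "A"}

value = client_get(nodes, client_cache, 1, "user:1")
```

In this run the client is redirected twice (A to B, then B to C) before the read succeeds, which mirrors the repeated MOVED case described above.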
11. Failover, currentEpoch, lastVoteEpoch
When the master goes down, Redis Cluster will choose a slave to replace the master.
If there is more than one slave, each slave may detect that the master is down and try to promote itself. If more than one slave became a master, each new master would update its local hash slot map, assign the old master's slots to itself, and propagate that update to the other nodes. The hash slot maps on different nodes would then diverge.
As a result, clients connected to different nodes would get different hash slot maps and, for the same slot, would connect to different nodes, so the data on those nodes would diverge. Failover therefore has to ensure that only one slave is selected as the new master.
Redis Cluster uses a technique similar to the Raft algorithm to prevent multiple slaves from being selected as master. Each node has two values called currentEpoch and lastVoteEpoch. When the cluster is first created, the currentEpoch of each node is 0.
When a slave discovers that its master is down, it increments its currentEpoch (currentEpoch++) and sends a FAILOVER_AUTH_REQUEST carrying its own currentEpoch to all masters. When a master receives a FAILOVER_AUTH_REQUEST, it checks whether the currentEpoch in the request is larger than its own currentEpoch and lastVoteEpoch. If so, it records the currentEpoch from the request as its own currentEpoch and lastVoteEpoch, and replies to the slave with a FAILOVER_AUTH_ACK that carries the master's currentEpoch.
So the epoch in the FAILOVER_AUTH_ACK is always the same as the currentEpoch of the slave. When a slave receives FAILOVER_AUTH_ACK from a majority of masters, it becomes the master.
The process above ensures that only one slave is elected. For example, consider a cluster of five masters, A, B, C, D, and E, where node A has two slaves, A1 and A2. Node A goes down. A1 increments its currentEpoch to 5 (4+1) and sends FAILOVER_AUTH_REQUEST to all masters. Nodes B, C, and D receive the request, update their currentEpoch and lastVoteEpoch to 5, and reply to A1 with FAILOVER_AUTH_ACK. With three votes out of five masters, A1 wins the election and becomes the new master. At the same time, A2 also finds that A is down and tries to get elected. A2 increments its currentEpoch to 5 (4+1) and sends FAILOVER_AUTH_REQUEST to all masters, but the lastVoteEpoch of B, C, and D is already 5, so they do not reply to A2. E has not received A1's request, so only E replies to A2. One vote is not a majority, so A2 cannot become master.
In addition to the process described above, Redis Cluster applies an optimization: after a master has replied to a request from one slave, it does not reply to requests from other slaves of the same master.
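The voting rule and the A1/A2 example can be replayed with a small simulation. It is a sketch under assumptions: the `Master` class and `run_election` helper are invented for illustration, message delivery is modelled by simply listing which masters each slave reaches, and the vote check follows the rule as stated in this article.

```python
class Master:
    def __init__(self, name, epoch):
        self.name = name
        self.current_epoch = epoch
        self.last_vote_epoch = 0

    def handle_auth_request(self, request_epoch):
        """Return True (a FAILOVER_AUTH_ACK) if this master grants its vote."""
        if (request_epoch > self.current_epoch
                and request_epoch > self.last_vote_epoch):
            self.current_epoch = request_epoch
            self.last_vote_epoch = request_epoch
            return True
        return False


def run_election(slave_epoch, reachable_masters, total_masters):
    """The slave bumps its epoch, asks for votes, and needs a majority."""
    request_epoch = slave_epoch + 1
    votes = sum(m.handle_auth_request(request_epoch) for m in reachable_masters)
    return votes > total_masters // 2


# Five masters A..E at epoch 4; A has failed, so B, C, D, and E can vote.
b, c, d, e = (Master(name, 4) for name in "BCDE")

# A1's request reaches B, C, and D first (E has not received it yet).
a1_wins = run_election(4, [b, c, d], total_masters=5)

# A2's request reaches everyone, but B, C, and D already voted in epoch 5.
a2_wins = run_election(4, [b, c, d, e], total_masters=5)
```

A1 collects three of five votes and wins; A2 collects only E's vote and loses, matching the walkthrough above.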
12. Configuration epoch
After the failover is completed, the new master will modify the hash slot map, change the node of the corresponding slot record to itself, and propagate the change to the hash slot map to other nodes.
currentEpoch and lastVoteEpoch guarantee that each failover elects only one new master. However, two successive failovers can elect two different masters, and the propagation of their changes to the hash slot map is asynchronous. The change from the later failover may therefore arrive at a node before the change from the earlier one, leaving the hash slot maps on different nodes inconsistent.
Redis Cluster solves this problem through configEpoch. Each node holds a configEpoch value, which is equivalent to adding a configEpoch column to the node table.
After each failover completes, the newly elected master overwrites its configEpoch with its currentEpoch. The failover mechanism ensures that the new masters of two failovers have different currentEpoch values, and that the currentEpoch of the later failover is larger than that of the earlier one. So even with a propagation protocol such as gossip, the hash slot map change from the latest failover takes effect, that is, the change with the larger configEpoch wins, and eventually the hash slot map on all nodes is consistent.
The configEpoch of a slave node is the configEpoch of its master.
Since gossip only guarantees that the hash slot map is eventually consistent, the hash slot map of a slave may be older than that of the masters. A failover must not be carried out on the basis of an old hash slot map, so the failover process described earlier follows an additional rule:
The configEpoch of the slave node is carried in the FAILOVER_AUTH_REQUEST. If this configEpoch is smaller than the configEpoch that the voting master has recorded for the owner of any slot the slave is responsible for, the master will not reply with FAILOVER_AUTH_ACK.
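To see why "the larger configEpoch wins" makes the outcome independent of arrival order, here is a minimal sketch. The `apply_update` function and the tuple layout are assumptions made for illustration, not Redis data structures.

```python
def apply_update(slot_map, slot, node, config_epoch):
    """Accept a gossiped slot claim only if its configEpoch is larger."""
    current = slot_map.get(slot)
    if current is None or config_epoch > current[1]:
        slot_map[slot] = (node, config_epoch)
        return True
    return False


# Two failovers for slot 1: first A1 takes over with configEpoch 5,
# later A2 takes over with configEpoch 6.
first = (1, "A1", 5)
second = (1, "A2", 6)

# One node receives the updates in order, another in reverse order.
in_order = {}
apply_update(in_order, *first)
apply_update(in_order, *second)

reversed_order = {}
apply_update(reversed_order, *second)
late_update_accepted = apply_update(reversed_order, *first)
```

Both nodes end up mapping slot 1 to A2 with configEpoch 6; the node that saw the updates out of order simply rejects the stale claim.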
13. Resharding
After the cluster is created, operational needs such as scaling out, scaling in, and balancing arise. All of these are essentially handled by resharding, which redistributes slots among nodes by moving slots from one node to another. In Redis Cluster, scaling out means adding a new node and then assigning some slots to it.
Scaling in means assigning all the slots on a node to other nodes and then removing that node from the cluster. Balancing is needed when traffic is uneven across nodes: some slots on heavily loaded nodes are moved to nodes with less traffic.
A resharding operation can adjust the entire hash slot map; that is, it can include the migration of multiple slots, each from one node to another. Migrating a slot involves the hash slot map change described earlier as well as a migration of keys. To migrate a slot to another node, first migrate all the keys in the slot to that node; once all keys have been migrated, change the hash slot map. When the hash slot map change is complete, the slot migration ends.
Redis Cluster uses CLUSTER SETSLOT to set up the migration. For example, to migrate slot 1 from node A to node B, execute the following commands on node A and node B, respectively:
On node A: CLUSTER SETSLOT 1 MIGRATING NODEB
On node B: CLUSTER SETSLOT 1 IMPORTING NODEA
MIGRATING indicates that data is moving out of this node, and IMPORTING indicates that data is moving in to this node.
After these two commands are executed, node A no longer creates new keys in slot 1. A dedicated program called redis-trib is responsible for migrating all keys from node A to node B. redis-trib executes the following command:
CLUSTER GETKEYSINSLOT slot count
This command returns up to count keys. For each returned key, redis-trib executes the following command:
MIGRATE target_host target_port key target_database_id timeout
This command atomically migrates a key from node A to node B. Specifically, the MIGRATE command connects to the target node and sends the key to it. Once the target node has received the key, the source node deletes the key from its own database. Both node A and node B are locked while this happens.
After all the keys have been migrated, execute the following command on both nodes:
CLUSTER SETSLOT slot NODE nodeB
Migrating all the keys usually takes some time. During the window after the migration starts and before it completes, the actual distribution of keys is inconsistent with what the hash slot map records, so a client that accesses keys according to the hash slot map can hit an error.
Redis Cluster solves this problem through ASK redirection. According to the client's hash slot map, a request for a key in slot 1 is sent to node A. If the key has already been migrated to node B, node A replies to the client with an ASK redirection. After receiving the ASK redirection, the client sends an ASKING command to node B and then sends the request for the key again.
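The migration window and the ASK handshake can be sketched as follows. The classes and helpers are hypothetical and model a single slot only. One detail is taken from the Redis Cluster Specification rather than from this article: an importing node serves the slot only for requests preceded by ASKING and otherwise redirects to the current owner with MOVED.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.data = {}              # keys of the single modelled slot
        self.migrating_to = None    # set by CLUSTER SETSLOT ... MIGRATING
        self.importing_from = None  # set by CLUSTER SETSLOT ... IMPORTING

    def get(self, key, asking=False):
        if self.importing_from is not None and not asking:
            # Importing node: only requests that follow ASKING are served.
            return ("MOVED", self.importing_from)
        if key in self.data:
            return ("OK", self.data[key])
        if self.migrating_to is not None:
            # The key may already have been moved to the target node.
            return ("ASK", self.migrating_to)
        return ("OK", None)


def migrate_key(source, target, key):
    """Model MIGRATE: the target receives the key, then the source drops it."""
    target.data[key] = source.data.pop(key)


def client_get(nodes, owner, key):
    status, payload = nodes[owner].get(key)
    if status == "ASK":
        # Send ASKING to the target, then retry; the slot cache is unchanged.
        status, payload = nodes[payload].get(key, asking=True)
    return payload


a, b = Node("A"), Node("B")
nodes = {"A": a, "B": b}
a.data = {"k1": "v1", "k2": "v2"}
a.migrating_to = "B"
b.importing_from = "A"
migrate_key(a, b, "k1")          # k1 has moved, k2 has not yet

moved_key = client_get(nodes, "A", "k1")
unmoved_key = client_get(nodes, "A", "k2")
without_asking = b.get("k1")
```

The client keeps sending requests for the slot to node A throughout; only individual keys that have already moved are fetched from node B through ASK.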
14. Actual storage of Configuration
Hash slot map and node table are both logical structures, and their actual storage structures in Redis Cluster are slightly different (see references 1, 2, 3, 4 at the end for details).
In the memory of a node, two variables store this information:
The myself variable: myself represents this node and is a variable of type ClusterNode. It contains the configEpoch of this node as well as slaveof; if this is a slave node, slaveof records its master node. It also contains a bitmap that represents all the slots this node is responsible for. The bitmap consists of 2048 bytes, for a total of 16384 (2048×8) bits. Each bit represents a slot; a bit set to 1 means this node is responsible for that slot.
The cluster variable: cluster represents the state of the cluster. It contains currentEpoch, lastVoteEpoch, and the slots array. The index into the slots array is the slot number, and each element points to a node, which is a variable of type ClusterNode, the same type as the myself variable.
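The two in-memory views can be illustrated with a short sketch: a 2048-byte bitmap per node and a 16384-entry array that maps each slot back to its owner. The helper names and the bit order within each byte are assumptions for illustration, not a description of the Redis source code.

```python
CLUSTER_SLOTS = 16384


def new_bitmap() -> bytearray:
    """One bit per slot: 16384 / 8 = 2048 bytes."""
    return bytearray(CLUSTER_SLOTS // 8)


def set_slot(bitmap: bytearray, slot: int) -> None:
    bitmap[slot // 8] |= 1 << (slot % 8)


def has_slot(bitmap: bytearray, slot: int) -> bool:
    return bool(bitmap[slot // 8] & (1 << (slot % 8)))


def build_slots_array(bitmaps):
    """Invert per-node bitmaps into a slot -> node name array."""
    slots = [None] * CLUSTER_SLOTS
    for node_name, bitmap in bitmaps.items():
        for slot in range(CLUSTER_SLOTS):
            if has_slot(bitmap, slot):
                slots[slot] = node_name
    return slots


# Node A is responsible for slots 0-5460 in this example.
node_a = new_bitmap()
for slot in range(0, 5461):
    set_slot(node_a, slot)

slots = build_slots_array({"A": node_a})
```

The bitmap answers "does this node own slot N" in constant time, while the array answers "which node owns slot N", which is the lookup a node needs when deciding whether to serve or redirect a request.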
All Configuration changes are saved to disk, specifically to a file called node.conf (Redis names it nodes.conf by default), which is written by Redis Cluster itself and does not require manual editing.
node.conf is organized by node. Each row corresponds to a node and contains the following fields: id, ip:port, flags, slaveof, ping timestamp, pong timestamp, configEpoch, link status, slots.
After all the node rows, the currentEpoch and lastVoteEpoch variables are saved at the end of the file. The flags field is an enumerated type that indicates whether the row describes the node itself and whether the node is a master or a slave.
For a slave node, the id of its master is recorded in the slaveof field. For a master node, there is an additional slots field at the end that records which slots the node is responsible for. The flags field also records other very important states, which this article does not cover.
Similarly, the three fields ping timestamp, pong timestamp, and link status are not covered in this article.
A node.conf file looks similar to the following example:
[root@10.112.178.141 data]# cat nodes-6384.conf
fb763117270d14205c41174605b15741co03a945 10.112.178.174 slave 5e35bda1a44c8d781eb54e08be88a3bab42070f3 0 1596683852819 2 connected
3dc5890fb1591e3b20196f81eb5f2f99754253e8 10.112.178.141 connected 0-5461
f1967b687c9b2c27108cce08517e98e7a80d5e7e 10.112.178.171 slave 3dc5890fb1591e3b20196f81eb5f2f99754253e8 0 1596683850813 1 connected
2bbab7353e973e991566df3bb52afb4857a7bf25 10.112.178.171148383 slave 1f0a8cf1bfd0c915ef404482f3dc6bf5c7cf41f5 0 1596683848812 3 connected
5e35bda1a44c8d781eb54e08be88a3bab42070f3 10.112.178.142:6383 master - 0 15966838813 2 connected 5462-10923
1f0a8cf1bfd0c915ef404482f3dc6bf5c7cf41f5 10.112.178.141:6384 myself,master - 0 0 3 connected 10924-16383
When a node starts, it reads the node.conf file and loads the information into the myself and cluster variables. The slot information is converted to a bitmap and stored in the myself variable, and it is also inverted into a slot-to-node mapping stored in the cluster variable.
A change to the hash slot map or the node table modifies the myself and cluster variables in memory, and each change serializes the two variables and saves them to node.conf.
15. View configuration
Redis Cluster provides two commands to view configuration:
The first is the CLUSTER SLOTS command, which displays information organized by hash slot. Its output looks like this:
127.0.0.1:7000> cluster slots
1) 1) (integer) 5461
   2) (integer) 10922
   3) 1) "127.0.0.1"
      2) (integer) 7001
   4) 1) "127.0.0.1"
      2) (integer) 7004
2) 1) (integer) 0
   2) (integer) 5460
   3) 1) "127.0.0.1"
      2) (integer) 7000
   4) 1) "127.0.0.1"
      2) (integer) 7003
3) 1) (integer) 10923
   2) (integer) 16383
   3) 1) "127.0.0.1"
      2) (integer) 7002
   4) 1) "127.0.0.1"
      2) (integer) 7005
The second is the CLUSTER NODES command, which displays information organized by node. Its output looks like this:
$ redis-cli cluster nodes
d1861060fe6a534d42d8a19aeb36600e18785e04 127.0.0.1:6379 myself - 0 1318428930 1 connected 0-1364
3886e65cc906bfd9b1f7e7bde468726a052d1dae 127.0.0.1:6380 master - 1318428930 1318428931 2 connected 1365-2729
d289c575dcbc4bdd2931585fd4339089e461a27d 127.0.0.1 master - 1318428931 1318428931 3 connected 2730-4095
The CLUSTER NODES and CLUSTER SLOTS commands can be executed on any node. Both commands read the local information of the node they are sent to. Because of how gossip works, these two commands may not show the latest configuration.
16. Conflict
The failover process described earlier ensures that only one slave is elected by a majority of master votes and that it produces a unique configEpoch. The resharding process, however, does not go through a majority vote of the masters.
When a slot migration is performed, the configEpoch is simply set to the largest configEpoch in the cluster plus one. And because a resharding generally includes multiple slot migrations, Redis Cluster's current practice is that, within a single resharding, all slot migrations use the same configEpoch as the first slot migration.
Both failover and resharding modify the hash slot map, so if a failover occurs during a resharding, their changes to the hash slot map may conflict. In addition, a manual failover executed with the CLUSTER FAILOVER command and the TAKEOVER parameter is not voted on by the masters either.
A conflict means that the same slot is modified to map to different nodes, and these modifications carry the same configEpoch.
To solve this problem, Redis Cluster has a conflict resolution mechanism. If a master finds that another master has the same configEpoch as its own, it compares the two node ids; the node with the smaller id increments its currentEpoch and uses the new value as its own configEpoch.
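The tie-break can be expressed as a short function. This is a sketch only: `MasterState` and `resolve_collision` are illustrative names, node ids are compared as plain strings, and the ids used are shortened placeholders.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    node_id: str
    current_epoch: int
    config_epoch: int


def resolve_collision(me: MasterState, other_id: str, other_config_epoch: int) -> bool:
    """Bump our configEpoch if we collide with `other` and have the smaller id."""
    if other_config_epoch != me.config_epoch:
        return False              # no conflict
    if me.node_id >= other_id:
        return False              # the other node is the one that must move
    me.current_epoch += 1
    me.config_epoch = me.current_epoch
    return True


# Two masters ended up with the same configEpoch 7.
x = MasterState(node_id="3dc5", current_epoch=7, config_epoch=7)
y = MasterState(node_id="5e35", current_epoch=7, config_epoch=7)

# Each master evaluates the collision from its own point of view.
x_changed = resolve_collision(x, y.node_id, 7)
y_changed = resolve_collision(y, x.node_id, 7)
```

Only the node with the smaller id changes its configEpoch (to 8 here), so the two masters end up with distinct values without any further coordination.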
III. Write Safety
Because conflicts can occur, the hash slot maps on different nodes may be inconsistent. Depending on which nodes they are connected to, some clients may write keys of a slot to one node while other clients write keys of the same slot to another node. When the conflict is resolved, the writes accepted by one of the nodes are lost.
In addition, because replication between master and slave is asynchronous, if a failover occurs before the slave has received the latest data, those writes are lost. Redis Cluster includes an optimization for this case. When a slave discovers that its master is down, it does not start the election immediately. It waits for a period of time, calculated as follows:
DELAY = 500 milliseconds + random delay between 0 and 500 milliseconds + SLAVE_RANK * 1000 milliseconds
In this formula,
The first part is a fixed 500 ms, which gives the masters enough time to discover that the node is down.
The second part is a random wait, which prevents multiple slaves from detecting the master failure at the same moment and starting elections simultaneously, which would split the masters' votes and cause every election to fail.
The third part depends on the rank of the slave, which is determined mainly by its replication progress. The more data a slave has replicated, the smaller its rank and the shorter its wait, so it is more likely to start the election first and be chosen as the new master. This is only an optimization, however, and does not completely rule out data loss.
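The delay formula translates directly into code. In this sketch the function names are my own, and ranking slaves by a numeric replication offset (larger offset means more data replicated) is an assumption about how "replication progress" is measured.

```python
import random


def slave_ranks(replication_offsets):
    """Rank 0 goes to the slave that has replicated the most data."""
    ordered = sorted(replication_offsets, key=replication_offsets.get, reverse=True)
    return {name: rank for rank, name in enumerate(ordered)}


def election_delay_ms(slave_rank, rng=random):
    """DELAY = 500 ms + random 0..500 ms + SLAVE_RANK * 1000 ms."""
    return 500 + rng.randint(0, 500) + slave_rank * 1000


# Hypothetical offsets: A1 is more up to date than A2.
ranks = slave_ranks({"A1": 1000, "A2": 900})
delays = {name: election_delay_ms(rank) for name, rank in ranks.items()}
```

With rank 0 the delay falls between 500 and 1000 ms, and with rank 1 between 1500 and 2000 ms, so the most up-to-date slave always starts its election first in this model.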