Distributed consistency algorithm Raft 04/20 Update SLTechnology News&Howtos

Distributed consistency algorithm Raft

2025-04-20 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)06/02 Report--

Since it was proposed in 1990, Paxos has almost become a synonym for distributed consistency algorithms for a long time. However, because it is difficult to understand and implement, there are only several well-known implementations such as Chubby, Zookeeper and libpaxos, among which ZAB used by Zookeeper has made a lot of improvements to Paxos. To this end, Stanford's Diego Ongaro and John Ousterhout in 2013 proposed a new consistency algorithm that is easier to understand and implement, namely Raft.

Both Raft and Paxos can serve as long as they ensure the normal performance of the nCompact 2x1 node. Compared with Paxos, its advantage is that it is easy to understand and implement. Raf divides the algorithm into several sub-problems, such as selecting leader, log replication, security and so on. Its process is as follows: at the beginning, Leader is elected in the cluster to be responsible for the management of log replication, Leader receives transaction requests (logs) from clients, copies them to other nodes in the cluster, and then notifies other nodes in the cluster to submit logs, and Leader is responsible for ensuring that other nodes synchronize with its logs. When the Leader goes down, the other nodes of the cluster re-initiate the election to select the new Leader.

Role

Raft involves three roles:

Leader: the leader, responsible for handling requests from the client, managing log replication, and maintaining a heartbeat with Follower to maintain its leadership position.

Follower: the follower, responsible for responding to log replication requests from Leader and election requests from Candidate. Initially, all nodes are Follower.

Candidate: the candidate who is responsible for initiating the election. After Raft starts or Leader goes down, a node changes from Follower to Candidate and initiates the election. After the election is successful, the node changes from Candidate to Leader.

The following is the state transition diagram of the Raft role:

Term (term of office)

The concept of Term (term of office) is used in Raft. One round of election is a Term (term of office), and only one Leader can be produced in a Term. Term is represented by a continuously increasing number, and all Follower initially have a Term of 1. One of the Follower timers expires to trigger the election, and its status changes to Candidate, where Term plus 1 becomes 2, and then the election starts. There are several possibilities:

1. If no Leader is elected or an exception occurs during the term of office in which the current Term is 2, the Term will be incremented to 3 and a new round of election will begin.

2. After the Leader is elected during the term of office in which the Term is 2, if the Leader goes down, the other Follower will be incremented into Candidate,Term and a new election will be launched.

3. If Leader or Candidate finds that their Term is smaller than other Follower, Leader or Candidate will be incremented into Follower,Term.

4. If Follower finds that its Term is smaller than other Follower, update Term is consistent with other Follower.

Each Term increment will lead to a new round of elections, and during the normal operation of the Raft, all nodes have the same Term. If the node does not fail, a Term (term of office) will be maintained forever, and when a node receives a request, the Term rejects the request for more hours than the current Term.

Election

Initially, all nodes are Follower, and the timer time is different. After a node timer triggers the election, the Term is incremented, and the node is converted from Follower to Candidate, initiating a voting request (RequestVote RPC) to other nodes. There are several possibilities:

1. Receive the votes of more than half of the nodes, convert Candidate to Leader, and send heartbeats to other nodes to maintain the leadership position.

2. If you receive an AppendEntries RPC request from another node and the node's Term is greater than the current node's Term, a new valid leader is found and converted to Follower, otherwise the request is rejected by keeping Candidate.

3. The election time-out, Term increment, re-launch the election.

During each round of Term, each node can only vote once. If more than one Candidate does not receive more than half of the votes, each Candidate Term increments, restarts the timer and re-initiates the election. Because the timer time is random, there is no problem of multiple Candidate initiating votes at the same time.

Log replication

To ensure the consistency of nodes, it is necessary to ensure that all nodes perform the same sequence of operations in order, and this is the purpose of log replication.

1. Leader receives the client transaction request (that is, the log), appends the log to the local Log, and copies it to other Follower through AppendEntries RPC.

2. After receiving the log, Follower appends it to the local Log and sends an ACK message to Leader.

3. After receiving more than half of the ACK messages from Follower, Leader sets the log to be submitted and formally submits the log, notifies the client, and sends an AppendEntries RPC request to notify Follower to submit the log.

Security.

1. Only one Leader can be elected during each Term period.

2. Leader will not delete or overwrite existing log entries, only append them.

3. If the Term term number of the log entry at the same index location is the same, then it is considered to be the same from the beginning to the index location.

4. If a log entry is submitted during a certain term, then the log entry must appear in all leaders with a larger Term term number.

5. If the log entry of Leader at one index location has been submitted, then other nodes with the same index location will not submit different log entries.

RequestVote RPC and AppendEntries RPC

Node communication in Raft uses two kinds of RPC, namely RequestVote RPC and AppendEntries RPC:

RequestVote RPC: request for a vote, initiated by Candidate during the election.

AppendEntries RPC: the additional entry RPC, initiated by Leader, is used for log replication and heartbeat mechanisms.

Reference documentation

Looking for an easy-to-understand consistency algorithm (extended version)

Detailed explanation of consistency algorithm Raft

Why Raft is a more understandable distributed consistency algorithm

Postscript

The Raft summarized in this paper, and the Paxos, 2PC and 3PC in the previous article are all distributed consistency algorithms based on non-Byzantine fault tolerance, that is, in addition to considering message loss, timeout and disorder, but not considering message tampering. Starting from the next article, we will summarize the distributed consistency algorithm based on Byzantine fault tolerance, which is widely used in Bitcoin, Ethernet Square, and other blockchain products.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.