How to analyze the Raft protocol in RocketMQ


2025-07-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article introduces how to analyze the Raft protocol in RocketMQ. The content is fairly detailed, and interested readers can use it for reference. I hope it will be helpful to you.

Raft is another well-known protocol for solving the consistency problem in distributed systems. It consists mainly of two parts: Leader election and log replication.

1. Leader election

1.1 Voting initiated by a single node

A node in the Raft protocol is always in one of three states (roles):

Follower: a follower.

Candidate: a candidate.

Leader: the leader, usually what we call the master node.

Initially all three nodes are in the Follower state, and each node has a timeout (timer) set to a random value between 150 ms and 300 ms. When the timer expires, the node changes from Follower to Candidate, as shown in the following figure:

In general, the timer of one of the three nodes expires first; that node changes to the Candidate state and initiates an election. Let's first consider how the election proceeds when only one node becomes Candidate.
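The randomized timer can be illustrated with a minimal Python sketch. The function and constant names here are hypothetical and are not taken from RocketMQ or DLedger; only the 150 ms to 300 ms range comes from the text above.

```python
import random

# Range taken from the article; the names are illustrative assumptions.
ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def random_election_timeout_ms(rng=random):
    """Pick a fresh election timeout between 150 ms and 300 ms."""
    return rng.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)


# Each node draws its own timeout, so one of them usually expires first.
timeouts = {name: random_election_timeout_ms() for name in ("A", "B", "C")}

# The node with the smallest timeout is the first to become Candidate.
first_candidate = min(timeouts, key=timeouts.get)
```

Because every node draws independently, it is unlikely (though still possible) that two nodes time out at the same instant, which is the case section 1.2 deals with.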

When a node becomes Candidate, it initiates a round of voting. Since this is the first round, it sets the current voting round (Term) to 1 and first casts a vote for itself. For NodeA in the figure above, the Term is 1 and the Vote Count is 1.

When a node's timer expires, it first casts a vote for itself and then sends a voting request to the other nodes in the group (canvassing is a more fitting word).

When a node in the cluster receives the voting request, it agrees if it has not yet voted in this round and otherwise opposes; it then returns the result and resets its timer.

When node A receives approval votes from more than half of the nodes, it is promoted to Leader of the cluster and then periodically sends heartbeats to the other nodes to assert its leadership, as shown in the following figure.

Node A, the Leader of the cluster, is sending heartbeats to the other nodes.
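One election round as described above (increment the Term, vote for yourself, count approvals, check for a majority) might be sketched as follows. This is a simplified illustration with hypothetical function names, not DLedger's implementation; peer replies are modelled as plain booleans instead of network messages.

```python
def has_majority(votes, cluster_size):
    """True when the votes exceed half of the cluster."""
    return votes > cluster_size // 2


def start_election(current_term, peer_replies, cluster_size):
    """Run one voting round for a Candidate.

    peer_replies holds one boolean per peer that answered
    (True = approve, False = oppose).
    Returns (new_term, votes, became_leader).
    """
    new_term = current_term + 1                    # new voting round
    votes = 1                                      # the Candidate's own vote
    votes += sum(1 for granted in peer_replies if granted)
    return new_term, votes, has_majority(votes, cluster_size)


# Node A in a three-node cluster: B and C both approve.
term, votes, is_leader = start_election(0, [True, True], cluster_size=3)

# Only one peer answers, but 2 of 3 votes is still a majority.
_, votes_one_peer, still_leader = start_election(0, [True], cluster_size=3)
```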

After receiving a heartbeat packet from the Leader, a node returns a response and resets its own timer. If a node in the Follower state does not receive a heartbeat packet from the Leader within the timeout, it changes from Follower to Candidate and initiates the next round of voting.
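The interplay between heartbeats and the Follower timer can be sketched with a simulated clock. The class and method names are assumptions made for this illustration; a real implementation would be driven by actual timers and network events.

```python
FOLLOWER, CANDIDATE = "Follower", "Candidate"


class FollowerTimer:
    """Tracks the election deadline of a single node (illustrative only)."""

    def __init__(self, timeout_ms, now_ms=0):
        self.state = FOLLOWER
        self.timeout_ms = timeout_ms
        self.deadline_ms = now_ms + timeout_ms

    def on_heartbeat(self, now_ms):
        # A heartbeat from the Leader pushes the deadline forward.
        self.deadline_ms = now_ms + self.timeout_ms

    def tick(self, now_ms):
        # Without a heartbeat before the deadline, the node becomes Candidate.
        if self.state == FOLLOWER and now_ms >= self.deadline_ms:
            self.state = CANDIDATE
        return self.state


node = FollowerTimer(timeout_ms=200)
node.on_heartbeat(now_ms=100)                      # deadline moves to 300
state_while_leader_alive = node.tick(now_ms=250)
state_after_leader_down = node.tick(now_ms=301)    # no heartbeat arrived
```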

For example, suppose NodeA goes down and stops sending heartbeats to its slave nodes. Let's take a look at how the cluster elects a new master.

Once the master node is down, it no longer sends heartbeats to the nodes in the cluster. The timers then expire, and node B becomes Candidate before node C, so node B initiates a vote to the other nodes in the cluster, as shown in the following figure.

Node B first sets the voting round to 2, casts a vote for itself, and then sends a voting request to the other nodes.

Node C receives the request. Because the request's voting round is larger than its own and it has not yet voted in that round, it votes in favor, returns the result, and resets its timer. Node B then becomes the new Leader and sends heartbeats periodically.
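How a node such as C might decide on a voting request can be sketched as below. This is a simplified model under stated assumptions: it applies only the two rules given in the text (a larger voting round resets the vote, and each node votes once per round) and leaves out the other checks a full Raft implementation performs, such as comparing logs. The class and field names are hypothetical.

```python
class Voter:
    """Vote bookkeeping for one node (illustrative only)."""

    def __init__(self, term=0):
        self.term = term          # voting round this node knows about
        self.voted_for = None     # who it approved in that round

    def handle_vote_request(self, candidate_id, candidate_term):
        if candidate_term < self.term:
            return False                      # stale round: oppose
        if candidate_term > self.term:
            self.term = candidate_term        # newer round: adopt it
            self.voted_for = None             # and forget the old vote
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True                       # first vote in this round
        return False                          # already voted for someone else


node_c = Voter(term=1)
vote_for_b = node_c.handle_vote_request("B", 2)      # larger round, not voted yet
vote_for_d = node_c.handle_vote_request("D", 2)      # same round, already voted
vote_next_round = node_c.handle_vote_request("D", 3)
```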

That concludes the introduction to leader election among three nodes. Some readers may point out that, although each node's timer is random, two timers can still expire at the same time, or a node can become Candidate before it receives a voting request from another node. In other words, a single round of voting may have more than one Candidate. How is the master elected then?

The following uses a 4-node cluster as an example to illustrate how the master is elected in this case.

1.2 Simultaneous voting initiated by multiple nodes

First, two nodes enter the Candidate state at the same time and a new round of voting begins. The current voting round is 4. Each Candidate first casts a vote for itself and then sends voting requests to the other nodes in the cluster, as shown in the following figure:

Each node then receives voting requests and votes, as shown below:

First of all, nodes C and D return disapproval when they receive each other's voting requests, because each of them has already voted for itself in this round. According to the above figure, node A agrees with node C and node B agrees with node D, so C and D each have only two votes at this point. Of course, if A and B had both voted for the same candidate, C or D, the election could have ended here. As shown in the above figure, C and D each get only two votes, which is not more than half, so neither can become the master node. What happens next? Please take a look at the following figure:

At this point the timers of nodes A, B and D are each counting down. When a node becomes Candidate, or is already a Candidate when its timer fires, it initiates a new round of voting. In the figure, node B and node D initiate a new round of voting at the same time.

The voting results are as follows: node A and node C agree that node B becomes Leader. However, since B and D both initiated the fifth round of voting, the final voting round is updated to 6, as shown in the figure.
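The two rounds described in this section can be tallied with a short sketch. The ballots follow the text (A votes for C and B votes for D in the first round; A and C vote for B in the retry), while the function name and the dictionary representation are assumptions made for illustration.

```python
from collections import Counter


def elect(ballots, cluster_size):
    """Return the winner if one node holds more than half of the votes, else None.

    ballots maps each voter to the node it voted for.
    """
    tally = Counter(ballots.values())
    candidate, votes = tally.most_common(1)[0]
    return candidate if votes > cluster_size // 2 else None


# Round with two Candidates (C and D): each votes for itself,
# A approves C and B approves D, so the vote splits 2 to 2.
split_round = {"A": "C", "B": "D", "C": "C", "D": "D"}
split_winner = elect(split_round, cluster_size=4)

# Retry started by B and D: A and C approve B, so B holds 3 of 4 votes.
retry_round = {"A": "B", "B": "B", "C": "B", "D": "D"}
retry_winner = elect(retry_round, cluster_size=4)
```

In a 4-node cluster a Candidate needs 3 votes, so the first round produces no Leader and the election is only settled once the randomized timers let one Candidate collect a majority.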

That concludes leader election in the Raft protocol. Next, let's think about which problems we must at least consider if we were to implement the Raft protocol ourselves, so as to provide some ideas for reading the source code of the DLedger (RocketMQ multi-replica) module.

1.3 Reflections on how to implement Raft leader election

Node state

Three node states need to be introduced: Follower; Candidate, which is the trigger point for voting; and Leader (the master node).

A timer for entering the voting state

In both the Follower and Candidate states a node needs to maintain a timer. Each timeout is a random value between 150 ms and 300 ms, so the nodes' timers expire at different times. In the Follower state, a round of voting is triggered when the timer expires. A node must reset its timer after receiving and responding to a vote request or a Leader heartbeat.

Voting round (Term)

Each time a node in the Candidate state initiates a round of voting, Term is incremented by one. How Term is stored also needs to be considered.

Voting mechanism

In each round, a node can vote in favor of only one node. For example, if the round maintained by node A is 3 and node A has already voted in favor of node B, then it votes against a request from any other node whose voting round is also 3. If it receives a request with round 4, it can vote in favor again.

Conditions for becoming a Leader

A node must obtain the votes of a majority of the nodes in the cluster, that is, more than half. For example, in a three-node cluster it must get two votes. If one of the servers is down, can the remaining two still elect a Leader? Yes, because 2 votes can still be obtained, which is more than half of the 3 nodes in the initial cluster. For this reason the number of machines in a cluster should preferably be odd: a 4-node cluster offers the same availability as a 3-node one.

Tip: the conclusions above are only some of my own thoughts. We can carry these questions into the study of DLedger; the next article will analyze the source code to see how the experts implemented Raft Leader election. Let's look forward to it.
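The arithmetic behind the odd-cluster-size recommendation can be checked with a few lines of Python. The function names are hypothetical; the rule itself (more than half of the initial cluster) is the one stated above.

```python
def quorum(cluster_size):
    """Smallest number of votes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """How many nodes may be down while a majority can still be reached."""
    return cluster_size - quorum(cluster_size)


# Maps cluster size to (votes needed, failures tolerated).
table = {n: (quorum(n), tolerated_failures(n)) for n in (3, 4, 5)}
```

A 3-node cluster needs 2 votes and survives 1 failure; a 4-node cluster needs 3 votes and still survives only 1 failure, so the fourth machine adds no extra fault tolerance.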

2. Log replication

After the cluster has elected its master, clients send requests to the master node, which is responsible for replicating the data so that it stays consistent across the cluster. The initial state is shown in the following figure:

The client initiates a request to the master node, such as set 5, to update the data to 5, as shown in the following figure:

After receiving the client request, the master node appends the data to the Leader's log (but does not commit it yet), and then forwards the log to the slave nodes in the cluster in the next heartbeat packet, as shown in the following figure:

After a slave node receives the log from the Leader, it appends it to its own log file and returns a confirmation (ACK). After receiving the confirmation from the slave nodes, the Leader sends a confirmation to the client.
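The normal-case flow above can be sketched as follows. This is a minimal in-memory model with hypothetical class names, not DLedger's code. It makes two assumptions that go beyond the text: the Leader pushes the entry to the followers immediately instead of piggybacking on the next heartbeat, and it commits once a majority of the cluster (itself included) has stored the entry, which is the rule Raft normally uses.

```python
class Follower:
    """A slave node that stores entries and acknowledges them."""

    def __init__(self, alive=True):
        self.alive = alive
        self.log = []

    def append(self, entry):
        # Returns True as the ACK, or False when this node is down.
        if not self.alive:
            return False
        self.log.append(entry)
        return True


class Leader:
    def __init__(self, followers):
        self.followers = followers
        self.log = []
        self.commit_index = -1            # index of the last committed entry

    def client_request(self, command):
        self.log.append(command)          # appended, but not yet committed
        index = len(self.log) - 1
        acks = 1 + sum(1 for f in self.followers if f.append(command))
        cluster_size = len(self.followers) + 1
        if acks > cluster_size // 2:      # assumption: majority ACK commits
            self.commit_index = index
            return True                   # confirmation sent to the client
        return False


# One follower is down, but 2 of 3 nodes still hold the entry.
leader = Leader([Follower(), Follower(alive=False)])
confirmed = leader.client_request("set 5")

# Both followers are down: the entry stays uncommitted.
isolated = Leader([Follower(alive=False), Follower(alive=False)])
rejected = isolated.client_request("set 5")
```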

The log replication above is relatively simple because it only considers the normal case. If an exception occurs along the way, how is data consistency guaranteed?

What if one of the slave nodes fails or goes down while the Leader is broadcasting the log to the slave nodes?

At what stage should the log be committed? After receiving a data change request from the client, the Leader first appends it to its own log file, then broadcasts it to the slave nodes and receives their responses. Should the log be committed at that point and an ACK returned to the client, or at some other time?

How can log entries be guaranteed to be unique?

How should network partitions be handled?

That concludes this discussion of how to analyze the Raft protocol in RocketMQ. I hope the above content is of some help to you. If you think the article is good, you can share it so that more people can see it.
