2025-01-19 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains the concept and function of Leader election in ZooKeeper (zk). The approach described here is simple, fast and practical, so read on if you are interested.
I. Introduction
1. Basic concepts
SID: Server ID, used to identify a machine in the ZooKeeper cluster. It must be unique for each machine, and its value matches the one in myid.
ZXID: Transaction ID
Vote: a ballot cast by a server during the election
Quorum: a majority (more than half) of the machines
logicalclock: logical clock, that is, the round number of the zk server's Leader election
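The terms above can be made concrete with a small sketch. It assumes (this is not stated in the article) that a ZXID is a 64-bit value whose high 32 bits hold the epoch and whose low 32 bits hold a counter, and that a quorum means strictly more than half of the voting machines. The class and method names are hypothetical and are not ZooKeeper's API.

```java
// Illustrative sketch only; names are hypothetical, not ZooKeeper's API.
final class ZxidSketch {
    // Assumption: high 32 bits = epoch, low 32 bits = counter.
    static long makeZxid(long epoch, long counter) {
        return (epoch << 32) | (counter & 0xffffffffL);
    }

    static long epochOf(long zxid) {
        return zxid >> 32;
    }

    static long counterOf(long zxid) {
        return zxid & 0xffffffffL;
    }

    // Quorum: strictly more than half of the voting machines.
    static boolean isQuorum(int acks, int votingMembers) {
        return acks > votingMembers / 2;
    }

    public static void main(String[] args) {
        long zxid = makeZxid(3, 7);
        System.out.println(Long.toHexString(zxid));                 // 300000007
        System.out.println(epochOf(zxid) + " " + counterOf(zxid));  // 3 7
        System.out.println(isQuorum(2, 3) + " " + isQuorum(2, 4));  // true false
    }
}
```

Note that 2 of 4 is not a quorum under this definition, which is why clusters are usually sized with an odd number of voting machines.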
Server type:
zk introduces three roles: Leader, Follower, and Observer. All the machines in a zk cluster select one machine as the Leader through the Leader election process, and the Leader server provides read and write services to clients. Both Follower and Observer provide read services. The only difference is that an Observer does not take part in the Leader election, nor in the "more than half of the writes succeed" strategy for write operations. The Observer therefore exists to improve the read performance of the cluster without affecting its write performance.
Server Status:
LOOKING: the Leader election phase;
FOLLOWING: the Follower server stays synchronized with the Leader;
LEADING: the Leader server leads as the master process;
OBSERVING: the current server is an Observer and does not take part in voting.
The purpose of the election is to choose a suitable Leader machine. The Leader then drives the processing of transaction Proposals, which follows a two-phase-commit style protocol (specifically, the ZAB protocol).
II. Start the main election process
When a zk server cluster starts, QuorumPeerMain not only creates the ZooKeeper server object but also creates a QuorumPeer object, which represents one machine in the ZooKeeper cluster. QuorumPeer is responsible for maintaining the running state of that machine for as long as it runs, and it initiates a Leader election when the situation calls for it.
QuorumPeer runs as a separate thread that maintains the state of the zk machine.
Here we focus on the election-related content, and the discussion below starts from startLeaderElection.
1. QuorumPeer maintains cluster machine state
QuorumPeer's responsibility is to continuously check the current state of the zk machine and execute the corresponding logic. Simply put, its main loop runs different logic depending on the state the server is in.
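As a rough illustration of that dispatch-by-state idea, here is a minimal sketch. PeerSketch, step() and the hard-coded lookForLeader() result are hypothetical stand-ins for illustration, not the actual QuorumPeer code.

```java
// Hypothetical sketch of a state-driven loop body; not ZooKeeper's QuorumPeer.
final class PeerSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING, OBSERVING }

    private final long myId;
    private ServerState state = ServerState.LOOKING;

    PeerSketch(long myId) {
        this.myId = myId;
    }

    // Stand-in for the election algorithm; returns the elected leader's SID.
    long lookForLeader() {
        return 3L; // hard-coded purely for illustration
    }

    // One iteration of the main loop: run the logic that matches the state.
    String step() {
        switch (state) {
            case LOOKING: {
                long leader = lookForLeader();
                state = (leader == myId) ? ServerState.LEADING : ServerState.FOLLOWING;
                return "elected " + leader;
            }
            case FOLLOWING:
                return "following the leader";
            case LEADING:
                return "leading";
            case OBSERVING:
                return "observing";
            default:
                throw new IllegalStateException("unknown state: " + state);
        }
    }

    ServerState state() {
        return state;
    }

    public static void main(String[] args) {
        PeerSketch peer = new PeerSketch(3L);
        System.out.println(peer.step());  // elected 3
        System.out.println(peer.state()); // LEADING
        System.out.println(peer.step());  // leading
    }
}
```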
When the machine is in the LOOKING state, QuorumPeer runs an election, but QuorumPeer itself is not responsible for the specific logic, and the overall voting process is independent of it. From the perspective of logic execution, the whole process is split into two main parts:
Communicating with the other machines in the zk cluster
Implementing the specific election algorithm
The default election algorithm used in QuorumPeer is FastLeaderElection.
III. Overall framework of the electoral process
zk has offered several election algorithms, but the older ones have been deprecated. FastLeaderElection is used by default, which corresponds to electionAlg=3 in the configuration file. During cluster startup, QuorumPeer creates the election strategy according to the configuration.
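The selection by configuration value can be pictured as a small factory. This is only a sketch: the Election interface shown here and the create method are hypothetical simplifications, and the only mapping taken from the text is that the value 3 selects FastLeaderElection.

```java
// Hypothetical sketch of choosing an election strategy from a config value.
final class ElectionFactorySketch {
    // Simplified stand-in for the election abstraction.
    interface Election {
        String name();
    }

    static Election create(int electionAlg) {
        switch (electionAlg) {
            case 3:
                return () -> "FastLeaderElection";
            default:
                // Older algorithm numbers are treated as unsupported in this sketch.
                throw new IllegalArgumentException("unsupported electionAlg: " + electionAlg);
        }
    }

    public static void main(String[] args) {
        System.out.println(create(3).name()); // FastLeaderElection
    }
}
```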
The division of labor among QuorumCnxManager, Listener, SendWorker and RecvWorker is clear. QuorumCnxManager as a whole is responsible for listening on the port, sending messages and reading messages, where:
Listener listens for connections and maintains the connections to other servers;
SendWorker sends (vote) information to the corresponding server, using the connection information saved by the Listener;
RecvWorker receives (vote) information from other servers and stores it in a queue;
Each zk machine needs a listener on a TCP port, which is handled by the Listener in QuorumCnxManager using blocking socket I/O (the election port is typically 3888 and is set in the configuration file). To avoid two machines creating duplicate TCP connections to each other, zk imposes a rule: only the server with the larger SID may actively establish a connection to the other server. The implementation is fairly simple. In receiveConnection, the server compares its own SID with the SID of the server that connected to it to decide whether to accept the connection. If its own SID is larger, it closes that connection and then actively connects to the remote server itself. This logic is carried out by the Listener, which runs in its own thread.
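The connection rule boils down to a single comparison. The sketch below shows only that decision; the enum and method names are hypothetical, and the real receiveConnection also deals with sockets and worker threads, which are omitted here.

```java
// Hypothetical sketch of the "larger SID dials" rule; socket handling omitted.
final class ConnectionRuleSketch {
    enum Action { KEEP_INCOMING, CLOSE_AND_DIAL_BACK }

    // Decide what to do with a connection opened by remoteSid towards selfSid.
    static Action onIncoming(long selfSid, long remoteSid) {
        // A smaller SID connected to us: drop it and connect to that server ourselves.
        return (remoteSid < selfSid) ? Action.CLOSE_AND_DIAL_BACK : Action.KEEP_INCOMING;
    }

    public static void main(String[] args) {
        System.out.println(onIncoming(3, 1)); // CLOSE_AND_DIAL_BACK
        System.out.println(onIncoming(1, 3)); // KEEP_INCOMING
    }
}
```

With this rule, each pair of servers ends up with exactly one connection, always opened by the server with the larger SID.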
QuorumCnxManager is only responsible for information exchange with other servers, but is not responsible for information generation and processing. The processing of data must be handed over to the corresponding election algorithm for processing.
The above mainly covers how the communication connections between zk servers are established. The election strategy itself is abstracted in zk as Election. The analysis below focuses on FastLeaderElection (the core part of the election algorithm):
2. Vote for yourself as the new Leader (I vote for me)
3. Compare your current vote with the votes received from everyone else to decide which is better suited to be Leader
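The comparison in step 3 can be sketched as follows. The ordering used here (higher election epoch first, then higher ZXID, then higher SID) is not spelled out in the article; it is stated as an assumption based on how FastLeaderElection is generally described. The class and method names are hypothetical.

```java
// Hypothetical sketch of comparing a received vote with the current one.
final class VoteOrderSketch {
    // Returns true if the new vote should replace the current vote.
    // Assumed ordering: epoch, then zxid, then sid (larger wins at each level).
    static boolean isBetter(long newEpoch, long newZxid, long newSid,
                            long curEpoch, long curZxid, long curSid) {
        if (newEpoch != curEpoch) {
            return newEpoch > curEpoch;
        }
        if (newZxid != curZxid) {
            return newZxid > curZxid;
        }
        return newSid > curSid;
    }

    public static void main(String[] args) {
        System.out.println(isBetter(2, 1, 1, 1, 9, 3)); // true: newer epoch wins
        System.out.println(isBetter(1, 9, 1, 1, 5, 3)); // true: same epoch, larger zxid
        System.out.println(isBetter(1, 5, 1, 1, 5, 3)); // false: tie broken by sid
    }
}
```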
Some points here are still hard to follow. For example, when the vote taken from the ballot box is null, the server has to determine whether it is still connected to the other servers in the cluster. Reading that branch a couple of times usually makes it clear.
Here recvqueue is the ballot box that holds all the votes received from other servers (a singly linked list with a head node), and recvqueue.poll takes the first vote from it.
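To show what taking the first vote looks like, here is a minimal sketch that assumes the queue is a java.util.concurrent.LinkedBlockingQueue and that votes are represented as plain strings. Both are simplifications for illustration, and takeNext is a hypothetical helper.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: polling a vote queue with a timeout.
final class RecvQueueSketch {
    // Returns the oldest queued vote, or null if none arrives within timeoutMs.
    static String takeNext(LinkedBlockingQueue<String> queue, long timeoutMs) {
        try {
            return queue.poll(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            return null;
        }
    }

    public static void main(String[] args) {
        LinkedBlockingQueue<String> recvqueue = new LinkedBlockingQueue<>();
        recvqueue.offer("vote from sid 2");
        recvqueue.offer("vote from sid 3");
        System.out.println(takeNext(recvqueue, 10)); // vote from sid 2
        System.out.println(takeNext(recvqueue, 10)); // vote from sid 3
        System.out.println(takeNext(recvqueue, 10)); // null
    }
}
```

A null result is the case mentioned above where the ballot box is empty and the server has to check its connections.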
At this point, we check whether more than half of the votes received by the current server agree with the current server's vote (after the steps above, the current server's vote has already been updated to the most suitable one). This check is performed by the org.apache.zookeeper.server.quorum.FastLeaderElection#termPredicate method.
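A simplified version of the "more than half" check might look like the sketch below. It reduces a vote to the SID of the proposed leader and takes the number of voting machines as a parameter; the real termPredicate compares complete votes, so treat this as an illustration with hypothetical names rather than the actual method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a majority check over received votes.
final class QuorumCheckSketch {
    // votes maps voter SID -> SID of the leader that voter currently proposes.
    static boolean hasQuorum(Map<Long, Long> votes, long proposedLeader, int votingMembers) {
        int supporters = 0;
        for (long leader : votes.values()) {
            if (leader == proposedLeader) {
                supporters++;
            }
        }
        return supporters > votingMembers / 2;
    }

    public static void main(String[] args) {
        Map<Long, Long> votes = new HashMap<>();
        votes.put(1L, 3L);
        votes.put(2L, 2L);
        System.out.println(hasQuorum(votes, 3L, 3)); // false: 1 of 3
        votes.put(3L, 3L);
        System.out.println(hasQuorum(votes, 3L, 3)); // true: 2 of 3
    }
}
```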
If the current vote does not yet have more than half of the votes, the logic is easy to understand: break out and continue with the next vote.
However, a question arises. If more than half of the votes have already been obtained, why do we still take further votes from the box and compare each one with the current vote to see which is more suitable?
Let's look at the following code:
When I first looked at it, I also found it hard to understand. Why, after taking out the next ballot and finding that it is more suitable than the current one, should it be put back and the loop broken?
The purpose of this while loop is to go through the ballot box in case it contains a vote more suitable than the current one. If n == null, no vote more suitable than the current "more than half" vote was found, so the work can be finished by modifying the current host state:
(proposedLeader == self.getId()) ? ServerState.LEADING : learningState()
The queue is then cleared and the final ballot is returned.
If any of the remaining ballots is more suitable than the current one, it is put back into the ballot box and the previous process is repeated: the current ballot is updated and broadcast again.
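The confirm-or-retry loop described above can be sketched like this. Votes are modelled as long[] {epoch, zxid, sid} and the ballot box as a plain in-memory queue polled without a timeout, which is a simplification. The isBetter ordering is the same assumption as in the comparison step (epoch, then ZXID, then SID), and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the final confirmation loop after a quorum is reached.
final class FinalizeSketch {
    // Votes are {epoch, zxid, sid}; assumed ordering: epoch, then zxid, then sid.
    static boolean isBetter(long[] candidate, long[] current) {
        for (int i = 0; i < 3; i++) {
            if (candidate[i] != current[i]) {
                return candidate[i] > current[i];
            }
        }
        return false;
    }

    // Returns true if the election can be finalized with the current vote.
    // Returns false if a better vote was found; that vote is put back so the
    // main election loop can process it.
    static boolean confirm(Queue<long[]> ballotBox, long[] current) {
        long[] n;
        while ((n = ballotBox.poll()) != null) {
            if (isBetter(n, current)) {
                ballotBox.offer(n);
                return false;
            }
        }
        return true; // ballot box drained without finding a better vote
    }

    public static void main(String[] args) {
        Queue<long[]> box = new ArrayDeque<>();
        box.offer(new long[] {1, 5, 1});
        System.out.println(confirm(box, new long[] {1, 5, 2})); // true
        box.offer(new long[] {1, 9, 1});
        System.out.println(confirm(box, new long[] {1, 5, 2})); // false
        System.out.println(box.size());                         // 1
    }
}
```

Putting the better vote back rather than handling it inside this loop keeps a single code path for updating and re-broadcasting the current vote.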
Note: the ballot box here also refers to recvset, the container that holds the votes received so far. It is essentially a HashMap whose key is the serverId of the voter, so receiving several votes from the same server only updates that server's entry. The design is very clever!
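The effect of keying the map by voter is easy to see in a tiny sketch. A vote is reduced here to the SID of the proposed leader, and record is a hypothetical helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one entry per voter, later votes overwrite earlier ones.
final class RecvSetSketch {
    static void record(Map<Long, Long> recvset, long voterSid, long proposedLeader) {
        recvset.put(voterSid, proposedLeader);
    }

    public static void main(String[] args) {
        Map<Long, Long> recvset = new HashMap<>();
        record(recvset, 1L, 1L); // server 1 first votes for itself
        record(recvset, 1L, 3L); // server 1 later switches to server 3
        System.out.println(recvset.size());  // 1
        System.out.println(recvset.get(1L)); // 3
    }
}
```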
5. Cases where elections are not required
Although this last piece of code is short, it is the hardest to understand. It covers why, during an election, a notification can arrive whose sender state is FOLLOWING, LEADING or OBSERVING. It has to be read carefully several times. In essence, it handles the election state in the following three cases:
A new server (not an Observer) joins a functioning cluster.
When the Leader goes down, not all Followers notice it at the same time. The server that notices first sends notifications to the other servers, but since they have not noticed yet, the state in the notifications they send back to this server is FOLLOWING.
In this round of the election, the other servers have already elected a new Leader but have not yet notified the current server. The notifications sent by those servers, which already know the election is complete, carry the state LEADING or FOLLOWING.
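One way to picture how a LOOKING server can safely join an existing leader in these cases is the sketch below. It assumes that the server only accepts a leader when more than half of the voting machines report that leader and, unless the leader is the server itself, the leader's own notification says LEADING. This is a simplified assumption about the check, and every name in it is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of accepting an already-elected leader.
final class OutOfElectionSketch {
    enum State { LOOKING, FOLLOWING, LEADING, OBSERVING }

    static final class Notice {
        final long leader;
        final State senderState;

        Notice(long leader, State senderState) {
            this.leader = leader;
            this.senderState = senderState;
        }
    }

    // notices maps sender SID -> latest notification from a non-LOOKING server.
    static boolean canJoin(Map<Long, Notice> notices, long leader,
                           long selfSid, int votingMembers) {
        int supporters = 0;
        for (Notice n : notices.values()) {
            if (n.leader == leader) {
                supporters++;
            }
        }
        if (supporters <= votingMembers / 2) {
            return false; // not enough servers agree on this leader
        }
        if (leader == selfSid) {
            return true; // simplification: no further check for ourselves
        }
        Notice fromLeader = notices.get(leader);
        return fromLeader != null && fromLeader.senderState == State.LEADING;
    }

    public static void main(String[] args) {
        Map<Long, Notice> notices = new HashMap<>();
        notices.put(2L, new Notice(3L, State.FOLLOWING));
        System.out.println(canJoin(notices, 3L, 1L, 3)); // false: only 1 of 3
        notices.put(3L, new Notice(3L, State.LEADING));
        System.out.println(canJoin(notices, 3L, 1L, 3)); // true
    }
}
```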
V. Summary
The above is the default election process of zk. Analyzed according to the two phases of the ZAB protocol:
Initialization phase: voting is carried out within the same round until a Leader is selected.
Crash recovery phase:
If the Leader server goes down, the cluster goes through a process similar to initialization to select a Leader.
If a Follower server goes down, then while running the election it receives the Leader vote information from other servers (this corresponds to the branch described above for cases where no election is required), from which it can also determine who the Leader is.
At this point, you should have a deeper understanding of the concept and function of Leader election. The best way to consolidate it is to try it in practice.
© 2024 shulou.com SLNews company. All rights reserved.