Updated 2025-01-22. Source: Shulou (Shulou.com), 06/03.
How do you get the MIT 6.824 Raft lab to pass 3000 runs without a single error? Many people who are new to the lab get stuck here, so this article summarizes the causes of the failures and how to fix them.
A few days ago, in a distributed systems chat group, some friends were discussing how the 6.824 Raft lab always produces a few failures once you run it more than 2000 times, and that the failures show up in two tests: TestFigure8Unreliable2C and TestUnreliableChurn2C. I ran into this problem myself, and here is how I solved it.
This article assumes that your implementation has no major problems, and that only over thousands of runs do you occasionally see a failed election (a livelock) or a conflict in committed log entries.
Put bluntly, both tests scramble the network, write a batch of log entries, and then give the cluster 10 seconds. If no new Leader is elected in that time, the test fails. The failures come in two main kinds: failed to reach agreement, or apply error.
These two problems can be solved in the following two ways.
Do not reset the election timeout arbitrarily
If you read the Students' Guide to Raft carefully, it is clear that the election timeout may be reset in only the following three situations:
A heartbeat request is received from the current Leader. If the term in the AppendEntries arguments is outdated (args.Term < currentTerm), the timeout must not be reset;
The node starts an election;
The node grants its vote to another node (if it does not grant the vote, the timeout must not be reset);
Original text:
you should only restart your election timer if a) you get an AppendEntries RPC from the current leader (i.e., if the term in the AppendEntries arguments is outdated, you should not reset your timer); b) you are starting an election; or c) you grant a vote to another peer.
The rule itself is easy to understand, but it is also easy to get wrong. While editing the code, it is easy to put a timer reset in the wrong place by accident, which plants a bug that is painful to track down later.
Check whether your code resets the timer anywhere else. For me, and for another friend in the group, the bug was in the third case.
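The three reset rules above can be sketched roughly as follows. This is a minimal illustration, not the lab skeleton: the struct fields, the handler names, and the 300-600 ms timeout range are all assumptions, and the log consistency and up-to-date checks that a real handler needs are left out.

```go
package main

import (
	"math/rand"
	"sync"
	"time"
)

// Raft holds only the fields this sketch needs. All names are hypothetical
// and are not taken from the 6.824 skeleton code.
type Raft struct {
	mu               sync.Mutex
	currentTerm      int
	votedFor         int // -1 means "no vote cast in currentTerm"
	electionDeadline time.Time
}

type AppendEntriesArgs struct {
	Term     int
	LeaderID int
}

type RequestVoteArgs struct {
	Term        int
	CandidateID int
}

// resetElectionTimer picks a fresh randomized deadline (range is an assumption).
// The caller must hold rf.mu.
func (rf *Raft) resetElectionTimer() {
	timeout := 300*time.Millisecond + time.Duration(rand.Intn(300))*time.Millisecond
	rf.electionDeadline = time.Now().Add(timeout)
}

// Case (a): AppendEntries from the current leader resets the timer.
// The log consistency check of a real handler is omitted here.
func (rf *Raft) HandleAppendEntries(args AppendEntriesArgs) bool {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if args.Term < rf.currentTerm {
		return false // outdated term: do NOT reset the timer
	}
	if args.Term > rf.currentTerm {
		rf.currentTerm = args.Term
		rf.votedFor = -1
	}
	rf.resetElectionTimer()
	return true
}

// Case (b): starting an election resets the timer.
func (rf *Raft) StartElection(me int) {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	rf.currentTerm++
	rf.votedFor = me
	rf.resetElectionTimer()
}

// Case (c): the timer is reset only when the vote is actually granted.
// The "candidate's log is up to date" check is omitted here.
func (rf *Raft) HandleRequestVote(args RequestVoteArgs) bool {
	rf.mu.Lock()
	defer rf.mu.Unlock()
	if args.Term < rf.currentTerm {
		return false // stale candidate: no vote, no reset
	}
	if args.Term > rf.currentTerm {
		rf.currentTerm = args.Term
		rf.votedFor = -1
	}
	if rf.votedFor != -1 && rf.votedFor != args.CandidateID {
		return false // already voted for someone else: no reset
	}
	rf.votedFor = args.CandidateID
	rf.resetElectionTimer()
	return true
}
```

Keeping the reset inside one helper that is called from exactly three places makes a stray reset easy to spot with a text search.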
Correctly handle RPC responses
Be careful when handling RPC responses, too. When a response arrives, if currentTerm != args.Term, the response belongs to an earlier term; treat it as lost and do not use it.
Likewise, if the node's role has changed since the request was sent, ignore the response.
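The two checks above might look like this for a RequestVote reply. It is a sketch under assumptions: the type names, the fields, and the handleVoteReply helper are hypothetical, and the step-down on a higher term comes from the Raft paper's rules for all servers rather than from this article.

```go
package main

import "sync"

type Role int

const (
	Follower Role = iota
	Candidate
	Leader
)

// Raft holds only the fields this sketch needs; all names are hypothetical.
type Raft struct {
	mu          sync.Mutex
	currentTerm int
	role        Role
	votes       int
}

type RequestVoteArgs struct {
	Term int
}

type RequestVoteReply struct {
	Term        int
	VoteGranted bool
}

// handleVoteReply processes one RequestVote reply and reports whether the
// reply was actually used. args must be the arguments that were sent with
// the request, so that args.Term records the term the request was made in.
func (rf *Raft) handleVoteReply(args RequestVoteArgs, reply RequestVoteReply) bool {
	rf.mu.Lock()
	defer rf.mu.Unlock()

	// A higher term in any reply forces a step-down to follower.
	if reply.Term > rf.currentTerm {
		rf.currentTerm = reply.Term
		rf.role = Follower
		return false
	}
	// The request was sent in an earlier term: treat the reply as lost.
	if rf.currentTerm != args.Term {
		return false
	}
	// The role changed while the RPC was in flight: ignore the reply.
	if rf.role != Candidate {
		return false
	}
	if reply.VoteGranted {
		rf.votes++
	}
	return true
}
```

The same pattern applies to AppendEntries replies, with the role check comparing against Leader instead of Candidate.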
Summary
Debugging this problem mostly comes down to being careful and paying attention to details. Read the guide several times: thesquareplanet.com/blog/students-guide-to-raft/
The script for batch testing is here: gist.github.com/jonhoo/f686cacb4b9fe716d5aa
How to use:
./go-test-many.sh <number of runs> <number of parallel tests (default is number of CPUs)> <which test>
For example, to run the 2C tests 2000 times, 8 tests in parallel:
./go-test-many.sh 2000 8 2C
To run TestFigure8Unreliable2C 2000 times:
./go-test-many.sh 2000 8 TestFigure8Unreliable2C