Hadoop2.4 source code analysis

2025-04-01 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article walks through the Hadoop 2.4 source code behind NameNode HA, focusing on how the active node is elected and how failover happens.

ZKFailoverController is the coordinator of the entire HA mechanism. We will analyze two practical questions.

1. How is the election coordinated? How is the active node elected?

2. What happens after the active node goes down, and how does the switchover occur?

Let's start with the first question: how is the election coordinated, and how is the active node elected?

Step 1: The NameNode source code shows that an NN with HA enabled always starts in the Standby state, except when it is started with the upgrade option.

    protected HAState createHAState(StartupOption startOpt) {
      if (!haEnabled || startOpt == StartupOption.UPGRADE) {
        return ACTIVE_STATE;
      } else {
        return STANDBY_STATE; // standby state
      }
    }
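The rule in createHAState can be checked in isolation. The following is a minimal standalone sketch, not Hadoop code: the class HaStartupStateSketch and its two enums are hypothetical stand-ins, and haEnabled is passed as a parameter rather than read from a field.

```java
// Standalone illustration of the startup-state rule. The class and enums here
// are hypothetical stand-ins and do not come from the Hadoop code base.
final class HaStartupStateSketch {
    enum StartupOption { REGULAR, UPGRADE }
    enum HAState { ACTIVE, STANDBY }

    // Same condition as the snippet above: only a non-HA NameNode or an
    // upgrade start goes straight to ACTIVE; everything else starts STANDBY.
    static HAState createHAState(boolean haEnabled, StartupOption startOpt) {
        if (!haEnabled || startOpt == StartupOption.UPGRADE) {
            return HAState.ACTIVE;
        }
        return HAState.STANDBY;
    }

    public static void main(String[] args) {
        for (boolean ha : new boolean[] {false, true}) {
            for (StartupOption opt : StartupOption.values()) {
                System.out.println("haEnabled=" + ha + ", startOpt=" + opt
                        + " -> " + createHAState(ha, opt));
            }
        }
    }
}
```

Of the four combinations, only HA enabled with a regular start yields STANDBY, which is why the election described next is needed before any NN becomes active.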

Step 2: The HealthMonitor then checks the NN, finds that it is healthy, and executes:

    if (healthy) {
      // set the state, which notifies the callbacks
      enterState(State.SERVICE_HEALTHY);
    }

enterState notifies the registered callbacks. When the NN enters the healthy state, the callback runs the election method:

    elector.joinElection(targetToData(localTarget));
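The hand-off from health monitoring to the election is a listener pattern. The sketch below illustrates it under simplifying assumptions: Monitor, Elector, Callback, and wire are hypothetical names for this example only, and the real HealthMonitor and ActiveStandbyElector do considerably more (RPC health checks, ZooKeeper sessions, locking).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of "state change -> callback -> election".
final class HealthCallbackSketch {
    enum State { INITIALIZING, SERVICE_HEALTHY, SERVICE_UNHEALTHY }

    interface Callback {
        void enteredState(State newState);
    }

    static final class Monitor {
        private final List<Callback> callbacks = new ArrayList<>();
        private State state = State.INITIALIZING;

        void addCallback(Callback cb) {
            callbacks.add(cb);
        }

        // Notify listeners only when the state actually changes.
        void enterState(State newState) {
            if (newState == state) {
                return;
            }
            state = newState;
            for (Callback cb : callbacks) {
                cb.enteredState(newState);
            }
        }
    }

    static final class Elector {
        private boolean inElection;

        void joinElection() { inElection = true; }
        void quitElection() { inElection = false; }
        boolean isInElection() { return inElection; }
    }

    // Assumption for this sketch: healthy joins the election, anything else quits.
    static void wire(Monitor monitor, final Elector elector) {
        monitor.addCallback(new Callback() {
            @Override
            public void enteredState(State newState) {
                if (newState == State.SERVICE_HEALTHY) {
                    elector.joinElection();
                } else {
                    elector.quitElection();
                }
            }
        });
    }

    public static void main(String[] args) {
        Monitor monitor = new Monitor();
        Elector elector = new Elector();
        wire(monitor, elector);
        monitor.enterState(State.SERVICE_HEALTHY);
        System.out.println("in election: " + elector.isInElection());
    }
}
```

Keeping the monitor unaware of the elector means the health check and the election logic can change independently; the callback is the only coupling point.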

The elector competes for the Active role by trying to create an ephemeral lock node in ZK:

    createLockNodeAsync();

Creating the node triggers a ZK event.

The event is handled as follows (see the source code):

    public synchronized void processResult(int rc, String path, Object ctx,
        String name) {
      if (isStaleClient(ctx)) return;
      LOG.debug("CreateNode result: " + rc + " for path: " + path
          + " connectionState: " + zkConnectionState
          + " for " + this);

      Code code = Code.get(rc); // map the return code to a Code value for ease of use
      if (isSuccess(code)) { // success: the lock znode was created
        // we successfully created the znode. we are the leader. start monitoring
        if (becomeActive()) { // switch the NN on this node to active
          monitorActiveStatus(); // keep monitoring the node status
        } else {
          reJoinElectionAfterFailureToBecomeActive(); // failed, try the election again
        }
        return;
      }

      if (isNodeExists(code)) { // the node exists, so an active already exists; wait
        if (createRetryCount == 0) {
          // znode exists and we did not retry the operation. so a different
          // instance has created it. become standby and monitor lock.
          becomeStandby();
        }
        // if we had retried then the znode could have been created by our first
        // attempt to the server (that we lost) and this node exists response is
        // for the second attempt. verify this case via ephemeral node owner. this
        // will happen on the callback for monitoring the lock.
        monitorActiveStatus(); // keep watching the lock; the attempt to become active continues
        return;
      }

      String errorMessage = "Received create error from Zookeeper. code:"
          + code.toString() + " for path " + path;
      LOG.debug(errorMessage);

      if (shouldRetry(code)) {
        if (createRetryCount < maxRetryNum) {
          LOG.debug("Retrying createNode createRetryCount: " + createRetryCount);
          ++createRetryCount;
          createLockNodeAsync();
          return;
        }
        errorMessage = errorMessage
            + ". Not retrying further znode create connection errors.";
      } else if (isSessionExpired(code)) {
        // This isn't fatal - the client Watcher will re-join the election
        LOG.warn("Lock acquisition failed because session was lost");
        return;
      }

      fatalError(errorMessage);
    }
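The branching in processResult can be summarized as a small decision function. The sketch below is a simplified model and not the Hadoop implementation: the Code and Outcome enums are hypothetical, it assumes connection loss is the only retryable code, and it leaves out the case where the create succeeds but becomeActive() fails.

```java
// Hypothetical, simplified model of how a create-lock-node result is handled.
final class CreateResultSketch {
    enum Code { OK, NODEEXISTS, CONNECTIONLOSS, SESSIONEXPIRED, OTHER }
    enum Outcome { ACTIVE, STANDBY, CHECK_OWNER, RETRY, WAIT_FOR_REJOIN, FATAL }

    private final int maxRetryNum;
    private int createRetryCount = 0;

    CreateResultSketch(int maxRetryNum) {
        this.maxRetryNum = maxRetryNum;
    }

    Outcome onCreateResult(Code code) {
        if (code == Code.OK) {
            // We created the lock node, so we are the leader.
            return Outcome.ACTIVE;
        }
        if (code == Code.NODEEXISTS) {
            // Without retries, someone else owns the lock: become standby.
            // After a retry, the node may be ours from a lost first attempt,
            // so the owner has to be verified while monitoring the lock.
            return createRetryCount == 0 ? Outcome.STANDBY : Outcome.CHECK_OWNER;
        }
        if (code == Code.CONNECTIONLOSS) {
            if (createRetryCount < maxRetryNum) {
                ++createRetryCount;
                return Outcome.RETRY;
            }
            return Outcome.FATAL;
        }
        if (code == Code.SESSIONEXPIRED) {
            // Not fatal: the election is re-joined once a new session exists.
            return Outcome.WAIT_FOR_REJOIN;
        }
        return Outcome.FATAL;
    }

    public static void main(String[] args) {
        CreateResultSketch sketch = new CreateResultSketch(2);
        Code[] sequence = {Code.CONNECTIONLOSS, Code.CONNECTIONLOSS, Code.NODEEXISTS};
        for (Code code : sequence) {
            System.out.println(code + " -> " + sketch.onCreateResult(code));
        }
    }
}
```

The retry counter matters for the NODEEXISTS case: a node that exists after a retried create is ambiguous, which is why the original code defers the decision to the lock-monitoring callback instead of becoming standby immediately.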

The machine that wins the Active role calls the becomeActive() method. The elector's becomeActive() calls back into ZKFailoverController, which runs:

    private synchronized void becomeActive() throws ServiceFailedException {
      LOG.info("Trying to make " + localTarget + " active...");
      try {
        // ask the local NN, over RPC, to transition to active
        HAServiceProtocolHelper.transitionToActive(localTarget.getProxy(
            conf, FailoverController.getRpcTimeoutToNewActive(conf)),
            createReqInfo());
        String msg = "Successfully transitioned " + localTarget
            + " to active state";
        LOG.info(msg);
        serviceState = HAServiceState.ACTIVE;
        recordActiveAttempt(new ActiveAttemptRecord(true, msg));
      } catch (Throwable t) {
        String msg = "Couldn't make " + localTarget + " active";
        LOG.fatal(msg, t);
        recordActiveAttempt(new ActiveAttemptRecord(false, msg + "\n"
            + StringUtils.stringifyException(t)));
        if (t instanceof ServiceFailedException) {
          throw (ServiceFailedException) t;
        } else {
          throw new ServiceFailedException("Couldn't transition to active", t);
        }
      }
    }

Through a series of RPC calls, this finally reaches the NameNode's transitionToActive() method:

    synchronized void transitionToActive()
        throws ServiceFailedException, AccessControlException {
      namesystem.checkSuperuserPrivilege();
      if (!haEnabled) {
        throw new ServiceFailedException("HA for namenode is not enabled");
      }
      state.setState(haContext, ACTIVE_STATE);
    }

That completes the election of the active node.

2. What happens after the active node goes down, and how does the switchover occur?

When the active node goes down or hits an exception, either its ZK node disappears or the health monitoring reports an unhealthy state. Either case triggers a new round of election, which works the same way as described above.
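The failover path can be illustrated with an in-memory stand-in for the lock node. This is a sketch under stated assumptions, not how ZooKeeper or Hadoop is implemented: LockNode, tryCreate, sessionClosed, and elect are hypothetical names, and a real deployment relies on ZooKeeper sessions and watches rather than direct method calls.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical in-memory model: the lock behaves like an ephemeral node that
// vanishes when its owner's session ends, which lets another candidate win.
final class FailoverSketch {
    static final class LockNode {
        private String owner;

        // Succeeds only if nobody currently holds the lock.
        boolean tryCreate(String candidate) {
            if (owner != null) {
                return false;
            }
            owner = candidate;
            return true;
        }

        // Models the node disappearing when the owner's session is lost.
        void sessionClosed(String candidate) {
            if (candidate.equals(owner)) {
                owner = null;
            }
        }

        String owner() {
            return owner;
        }
    }

    // Each healthy candidate tries to create the lock; the first one to
    // succeed is active, the rest stay standby. Returns the current owner.
    static String elect(LockNode lock, List<String> healthyCandidates) {
        for (String candidate : healthyCandidates) {
            if (lock.tryCreate(candidate)) {
                break;
            }
        }
        return lock.owner();
    }

    public static void main(String[] args) {
        LockNode lock = new LockNode();
        System.out.println("active: " + elect(lock, Arrays.asList("nn1", "nn2")));
        lock.sessionClosed("nn1"); // nn1 goes down, its lock node disappears
        System.out.println("active: " + elect(lock, Arrays.asList("nn2")));
    }
}
```

An NN that is alive but unhealthy follows the same path in this model: it drops out of the candidate list and gives up the lock, and the remaining healthy candidate wins the next round.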
