SLTechnology News & Howtos (shulou.com), updated 2025-01-18
I. HDFS NameNode HA
1. Overview
In Hadoop 1.0, the NameNode is a single point of failure in an HDFS cluster: when the NameNode is unavailable, the entire HDFS service is unavailable. Likewise, any planned maintenance or other operation that requires stopping the NameNode takes the whole cluster offline.
HA (high availability) largely solves this single-point-of-failure problem.
2. Key points of NameNode HA
1) Metadata management must change:
Each NameNode keeps a copy of the metadata in memory.
Only the NameNode in the Active state may write the edits log.
Both NameNodes can read the edits log.
The shared edits log lives in shared storage (the two mainstream implementations are qjournal, i.e. the quorum journal manager, and NFS).
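As an illustration only (a toy model, not Hadoop's actual implementation), the write/read rules above can be sketched in Python: only the active NameNode may append to the shared edits log, while any NameNode may tail it. The `SharedEdits` class and its method names are invented for this sketch.

```python
# Toy model of the shared edits log: one writer (the active NameNode),
# any number of readers (e.g. the standby replaying edits).
class SharedEdits:
    def __init__(self):
        self.entries = []     # the edits log, stand-in for qjournal/NFS storage
        self.active = None    # alias of the NameNode currently in Active state

    def append(self, writer, op):
        # Only the active NameNode is allowed to write edits
        if writer != self.active:
            raise PermissionError(f"{writer} is not active; write rejected")
        self.entries.append(op)

    def tail(self, from_txid=0):
        # Any NameNode may read edits from a given transaction id onward
        return self.entries[from_txid:]

journal = SharedEdits()
journal.active = "nn1"
journal.append("nn1", "mkdir /data")   # active writes
print(journal.tail())                  # standby reads: ['mkdir /data']
```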
2) A state-management module is required.
Hadoop implements ZKFailoverController (zkfc), which runs on the host of each NameNode. Each zkfc monitors its own NameNode and records its state in ZooKeeper. When a state switch is needed, zkfc performs the switchover, and it must prevent split-brain from occurring.
3) Passwordless SSH must be configured between the two NameNode hosts. It is used for fencing: during failover, zkfc logs in to the other NameNode host via SSH and kills the NameNode process outright, preventing split-brain.
4) Fencing: only one NameNode provides service at any given time.
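The fence-then-promote rule in points 3) and 4) can be sketched in a few lines of Python (an illustration only, not Hadoop's code); the `fence` callback stands in for the real ssh-and-kill step:

```python
# Toy sketch of fencing: a standby may only be promoted after the old
# active has been fenced off, so two NameNodes never serve at once.
def failover(cluster, old_active, new_active, fence):
    # `fence` stands in for "ssh to the old active and kill its NameNode"
    if not fence(old_active):
        raise RuntimeError("fencing failed; refusing to promote (split-brain risk)")
    cluster["active"] = new_active
    return cluster

cluster = {"active": "nn1"}
failover(cluster, "nn1", "nn2", fence=lambda node: True)
print(cluster["active"])  # -> nn2
```

Note the order: if fencing fails, the standby refuses to promote itself rather than risk two simultaneous writers.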
3. NameNode HA automatic failover mechanism
Besides the two NameNodes, automatic failover requires two additional components: a ZooKeeper cluster and the ZKFailoverController (ZKFC).
(1) ZKFC
ZKFC is a ZooKeeper client responsible for monitoring and managing the NameNode's state. A ZKFC process runs on each NameNode host.
1) Health monitoring:
ZKFC periodically pings the NameNode on the same host with a health-check command. As long as the NameNode responds in a timely manner, ZKFC considers the node healthy. If the node crashes, freezes, or otherwise enters an unhealthy state, the health monitor marks it as unhealthy.
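The health-check loop can be illustrated with a small Python sketch (a simplification, not Hadoop's real RPC-based monitor); `probe` stands in for the periodic ping, and the state strings are used here purely as labels:

```python
# Simplified ZKFC-style health check: probe the local NameNode and
# classify it; any exception or slow reply counts as unhealthy.
import time

HEALTHY = "SERVICE_HEALTHY"
UNHEALTHY = "SERVICE_NOT_RESPONDING"

def check_health(probe, timeout=1.0):
    start = time.monotonic()
    try:
        ok = probe()                              # stand-in for the RPC ping
    except Exception:
        return UNHEALTHY                          # crashed or unreachable
    if not ok or time.monotonic() - start > timeout:
        return UNHEALTHY                          # reported unhealthy, or too slow
    return HEALTHY

def dead_namenode():
    raise ConnectionError("NameNode process is gone")

print(check_health(lambda: True))   # SERVICE_HEALTHY
print(check_health(dead_namenode))  # SERVICE_NOT_RESPONDING
```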
2) ZooKeeper session management:
While the local NameNode is healthy, ZKFC keeps a session open in ZooKeeper. If the local NameNode is active, ZKFC additionally holds a special lock znode, implemented as a ZooKeeper ephemeral (temporary) node; if the session terminates, the node is deleted automatically.
ZKFC creates a znode /hadoop-ha/<HA cluster name> in ZooKeeper, with two children:
ActiveBreadCrumb: a persistent node whose value records the HA cluster name and the active node's alias and address. Clients that want to reach the NameNode service read it to find the current active node, so it must be persistent.
ActiveStandbyElectorLock: an ephemeral node whose value likewise records the HA cluster name and the active node's alias and address. It acts as a mutex: only the ZKFC that holds this node may update the value of ActiveBreadCrumb. Because it is ephemeral, the node exists as long as the active NameNode's ZKFC stays connected to ZooKeeper. The standby NameNode's ZKFC is also connected; seeing that the ephemeral node already exists, it knows the lock is taken and does nothing. When the active NameNode fails, its ZKFC's session to ZooKeeper drops and the ephemeral node disappears. The standby's ZKFC then re-creates the ephemeral node, which amounts to acquiring the lock, updates ActiveBreadCrumb, and its NameNode naturally becomes the new active.
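The interplay between ActiveStandbyElectorLock and ActiveBreadCrumb described above can be simulated in-process (a toy model only; the `FakeZK` class is invented for this sketch, and real ZKFCs talk to a real ZooKeeper ensemble):

```python
# In-process sketch of the elector-lock semantics: whichever ZKFC creates
# the ephemeral lock node first becomes active; when its session expires,
# the node vanishes and the standby can take over.
class FakeZK:
    def __init__(self):
        self.ephemeral = {}    # path -> owner (ephemeral nodes)
        self.persistent = {}   # path -> value (persistent nodes)

    def try_lock(self, path, owner):
        # Atomic create-if-absent, like creating an ephemeral znode
        if path in self.ephemeral:
            return False
        self.ephemeral[path] = owner
        return True

    def session_expired(self, owner):
        # Ephemeral nodes vanish when their owner's session dies
        self.ephemeral = {p: o for p, o in self.ephemeral.items() if o != owner}

LOCK = "/hadoop-ha/mycluster/ActiveStandbyElectorLock"
CRUMB = "/hadoop-ha/mycluster/ActiveBreadCrumb"

zk = FakeZK()
for nn in ("nn1", "nn2"):            # both ZKFCs race for the lock
    if zk.try_lock(LOCK, nn):
        zk.persistent[CRUMB] = nn    # the winner records itself as active
print(zk.persistent[CRUMB])          # -> nn1

zk.session_expired("nn1")            # active NameNode dies, session drops
if zk.try_lock(LOCK, "nn2"):         # standby re-creates the node (gets the lock)
    zk.persistent[CRUMB] = "nn2"
print(zk.persistent[CRUMB])          # -> nn2
```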
3) ZooKeeper-based election:
If the local NameNode is healthy and ZKFC sees that no other node currently holds the lock znode, it tries to acquire the lock itself. If it succeeds, it has won the election and runs the failover process to make its local NameNode active.
4. HA configuration
(1) Environment planning
Host / IP: roles
bigdata121 / 192.168.50.121: namenode, journalnode, datanode, zk
bigdata122 / 192.168.50.122: namenode, journalnode, zk
bigdata123 / 192.168.50.123: zk
Software versions: hadoop 2.8.4, zookeeper 3.4.10, CentOS 7.2
Deploying JDK and ZooKeeper is not repeated here; see the previous article.
Basic environment configuration:
Add hostname resolution in /etc/hosts on each machine.
Configure passwordless SSH from each host to itself and to the other two hosts.
Turn off the firewall and SELinux.
(2) Deployment
The full deployment of hadoop is covered in the previous article; here we focus on the NameNode HA configuration.
Modify the configuration files:
core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/HA/hadoop-2.8.4/data/ha_data</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>bigdata121:2181,bigdata122:2181,bigdata123:2181</value>
  </property>
</configuration>
hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>bigdata121:9000</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>bigdata122:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>bigdata121:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>bigdata122:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://bigdata121:8485;bigdata122:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/modules/HA/hadoop-2.8.4/data/jn</value>
  </property>
  <property>
    <name>dfs.permissions.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>
Synchronize the configuration files to each node, using scp or rsync as you prefer.
(3) Start the cluster
The first startup:
cd /opt/modules/HA/hadoop-2.8.4
1) Start the journalnode service on each journalnode node:
sbin/hadoop-daemon.sh start journalnode
2) On nn1, format the namenode and start it:
bin/hdfs namenode -format
sbin/hadoop-daemon.sh start namenode
3) On nn2, synchronize the namenode metadata from nn1 to the local namenode through the running journalnodes:
bin/hdfs namenode -bootstrapStandby
4) Start nn2:
sbin/hadoop-daemon.sh start namenode
5) On nn1, start all datanodes:
sbin/hadoop-daemons.sh start datanode
6) Check the namenode status on the two namenodes:
bin/hdfs haadmin -getServiceState nn1
bin/hdfs haadmin -getServiceState nn2
Normally one is active and the other is standby.
7) Manually switch between active and standby:
bin/hdfs haadmin -transitionToActive <namenode name>
bin/hdfs haadmin -transitionToStandby <namenode name>
Note: to switch manually, you must first turn off automatic failover in hdfs-site.xml, otherwise the command reports an error; alternatively, use --forceactive to force the transition.
After startup completes, you can manually kill the active namenode and watch the former standby automatically become active. If the killed namenode is brought back online, it comes back as standby.
Subsequent startups:
Just run sbin/start-dfs.sh directly.
(4) Why is there no SNN?
After starting the whole namenode HA cluster, there is no SecondaryNameNode (SNN). I naively thought it had to be started manually, tried it, and got an error.
Looking at the SNN startup log reveals the following exception:
org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Failed to start secondary namenode
java.io.IOException: Cannot use SecondaryNameNode in an HA cluster. The Standby Namenode will perform checkpointing.
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:189)
        at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:690)
The message is clear: in HA mode the standby namenode takes over SNN's checkpointing duty, so SNN is no longer needed. This is quite reasonable: it puts the standby namenode to good use instead of leaving it idle.
II. YARN ResourceManager HA
1. Working mechanism
It is similar to the namenode HA above: ZooKeeper-based election is likewise used to monitor the RMs and choose the active one.
A znode /yarn-leader-election/<yarn cluster name> is created in ZooKeeper,
with two children: ActiveBreadCrumb and ActiveStandbyElectorLock.
Their functions are analogous to the namenode case, so they are not repeated here; the working mechanism is essentially the same.
2. HA configuration
(1) Planning
Host: roles
bigdata121: zk, rm
bigdata122: zk, rm
bigdata123: zk
(2) Configuration files
yarn-site.xml
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>cluster-yarn1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>bigdata121</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>bigdata122</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>bigdata121:2181,bigdata122:2181,bigdata123:2181</value>
  </property>
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.store.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
  </property>
</configuration>
Synchronize the configuration file to the other nodes.
(3) Start the cluster
On bigdata121, start yarn:
sbin/start-yarn.sh
On bigdata122, start the second rm:
sbin/yarn-daemon.sh start resourcemanager
Check the service status:
bin/yarn rmadmin -getServiceState rm1
bin/yarn rmadmin -getServiceState rm2
Testing is similar to the namenode case and is not repeated here.