
Hadoop Series (8) -- Building a Hadoop High Availability Cluster Based on ZooKeeper


I. Introduction to high availability

Hadoop high availability (High Availability) is divided into HDFS high availability and YARN high availability. The two implementations are basically similar, but the HDFS NameNode has much higher requirements for data storage and consistency than the YARN ResourceManager, so its implementation is also more complex; let's explain it first:

1.1 High availability overall architecture

The high availability architecture of HDFS is as follows:

Picture quoted from: https://www.edureka.co/blog/how-to-set-up-hadoop-cluster-with-hdfs-high-availability/

The HDFS high availability architecture mainly consists of the following components:

Active NameNode and Standby NameNode: two NameNodes back each other up. One is in the Active state and serves as the primary NameNode; the other is in the Standby state and serves as the standby NameNode. Only the primary NameNode provides read and write services to clients.

ZKFailoverController (the active/standby switching controller): ZKFailoverController runs as an independent process and controls the active/standby failover of the NameNode. It detects the health status of the NameNode in a timely manner, and when the primary NameNode fails, it uses ZooKeeper to automatically elect a new primary and perform the switch. NameNode also supports manual active/standby switching that does not depend on ZooKeeper.

ZooKeeper cluster: provides active/standby election support for the failover controllers.

Shared storage system: the shared storage system is the most critical part of achieving NameNode high availability. It stores the HDFS metadata generated while the NameNode is running. The active NameNode and the standby NameNode synchronize metadata through the shared storage system. During a failover, the new primary NameNode can continue to provide services only after confirming that the metadata is fully synchronized.

DataNode: in addition to sharing HDFS metadata through the shared storage system, the primary NameNode and the standby NameNode also need to share the mapping between HDFS blocks and DataNodes. DataNodes therefore report block location information to both the primary NameNode and the standby NameNode.

1.2 Data synchronization mechanism of the QJM-based shared storage system

Currently, Hadoop supports using the Quorum Journal Manager (QJM) or Network File System (NFS) as the shared storage system. Taking a QJM cluster as an example: the Active NameNode first submits its EditLog to the JournalNode cluster, and the Standby NameNode then periodically synchronizes the EditLog from the JournalNode cluster. When the Active NameNode goes down, the Standby NameNode can provide services after confirming that the metadata is fully synchronized.

It should be noted that writing the EditLog to the JournalNode cluster follows a "write succeeds once more than half of the nodes acknowledge it" strategy, so at least 3 JournalNode nodes are required. You can of course add more nodes, but the total number should be kept odd. With 2N+1 JournalNodes, the quorum-write principle means that up to N JournalNode failures can be tolerated.
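As a quick worked example of the quorum rule (an illustrative shell snippet only, not one of the build steps), a write must be acknowledged by floor(total/2) + 1 JournalNodes, so the tolerated failures for common cluster sizes are:

# illustrative only: quorum size and tolerated failures for a 2N+1-node JournalNode cluster
for total in 3 5 7; do
  quorum=$(( total / 2 + 1 ))
  echo "JournalNodes=$total quorum=$quorum tolerated_failures=$(( total - quorum ))"
done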

1.3 NameNode active/standby switchover

The NameNode active/standby switchover process is shown in the following figure:

1. After HealthMonitor is initialized, it starts internal threads that periodically call the HAServiceProtocol RPC interface of the corresponding NameNode to check its health status.

2. If HealthMonitor detects a change in the NameNode's health status, it calls back the corresponding method registered by ZKFailoverController for processing.

3. If ZKFailoverController decides that an active/standby switchover is needed, it first uses ActiveStandbyElector to carry out an automatic election.

4. ActiveStandbyElector interacts with ZooKeeper to complete the automatic active/standby election.

5. When the election is complete, ActiveStandbyElector calls back the corresponding method of ZKFailoverController to notify the current NameNode whether it should become the active or the standby NameNode.

6. ZKFailoverController calls the HAServiceProtocol RPC interface of the corresponding NameNode to transition it to the Active or Standby state.

1.4 YARN high availability

The high availability of YARN ResourceManager is similar to that of HDFS NameNode, but unlike the NameNode, the ResourceManager does not have as much metadata to maintain, so its state information can be written directly to ZooKeeper, and it relies on ZooKeeper for the active/standby election.
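Both HDFS and YARN rely on the same ActiveStandbyElector mechanism, so once the cluster built in the following sections is running, the election znodes can be inspected with the ZooKeeper command-line client. This is an optional sketch; the paths assume the default znode base paths together with the nameservice (mycluster) and cluster-id (my-yarn-cluster) configured later in this article:

# connect to any ZooKeeper node
zkCli.sh -server hadoop001:2181
# then, inside the ZooKeeper shell, list the election znodes
ls /hadoop-ha/mycluster
ls /yarn-leader-election/my-yarn-cluster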

II. Cluster planning

According to the high availability design goals, there should be at least two NameNodes (one active and one standby) and two ResourceManagers (one active and one standby). At the same time, to satisfy the "write succeeds once more than half of the nodes acknowledge it" principle, at least 3 JournalNode nodes are required. Three hosts are used to build the cluster here. The cluster plan (which matches the process list in section 6.1) is as follows:

hadoop001: NameNode, DataNode, JournalNode, DFSZKFailoverController, NodeManager, ZooKeeper
hadoop002: NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, ZooKeeper
hadoop003: DataNode, JournalNode, ResourceManager, NodeManager, ZooKeeper

III. Pre-conditions

JDK is installed on all servers. For installation steps, please see: installing JDK under Linux.

A ZooKeeper cluster has been built. For steps, please see: ZooKeeper stand-alone environment and cluster environment setup.

Password-free SSH login is configured between all servers.

IV. Cluster configuration

4.1 Download and decompress

Download Hadoop. What I download here is the CDH version of Hadoop. The download address is http://archive.cloudera.com/cdh6/cdh/5/.

# tar -zvxf hadoop-2.6.0-cdh6.15.2.tar.gz

4.2 Configure environment variables

Edit the profile file:

# vim /etc/profile

Add the following configuration:

export HADOOP_HOME=/usr/app/hadoop-2.6.0-cdh6.15.2
export PATH=${HADOOP_HOME}/bin:$PATH

Execute the source command to make the configuration take effect immediately:

# source /etc/profile
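To confirm that the environment variables have taken effect, you can optionally check the Hadoop version (the version string printed should match the package downloaded above):

# verify that the hadoop command is now on the PATH
hadoop version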

4.3 Modify the configuration

Go to the ${HADOOP_HOME}/etc/hadoop directory and modify the configuration files. The contents of each configuration file are as follows:

1. hadoop-env.sh

# specify the installation location of the JDK
export JAVA_HOME=/usr/java/jdk1.8.0_201/

2. core-site.xml

<configuration>
    <property>
        <!-- Address of the HDFS file system -->
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop001:8020</value>
    </property>
    <property>
        <!-- Base directory for temporary files -->
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
    <property>
        <!-- ZooKeeper cluster used for automatic failover -->
        <name>ha.zookeeper.quorum</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
        <!-- ZooKeeper session timeout -->
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>10000</value>
    </property>
</configuration>

3. hdfs-site.xml

<configuration>
    <property>
        <!-- Number of block replicas -->
        <name>dfs.replication</name>
        <value>3</value>
    </property>
    <property>
        <!-- Storage directory for NameNode metadata -->
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/namenode/data</value>
    </property>
    <property>
        <!-- Storage directory for DataNode blocks -->
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/datanode/data</value>
    </property>
    <property>
        <!-- Logical name of the HDFS nameservice -->
        <name>dfs.nameservices</name>
        <value>mycluster</value>
    </property>
    <property>
        <!-- IDs of the NameNodes in the nameservice -->
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>hadoop001:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>hadoop002:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>hadoop001:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>hadoop002:50070</value>
    </property>
    <property>
        <!-- JournalNode cluster that stores the shared edit log -->
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop001:8485;hadoop002:8485;hadoop003:8485/mycluster</value>
    </property>
    <property>
        <!-- Local directory where JournalNodes store edits -->
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/journalnode/data</value>
    </property>
    <property>
        <!-- Fencing method used during failover -->
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
    <property>
        <!-- Proxy class that clients use to find the active NameNode -->
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <!-- Enable automatic failover -->
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>

4. yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <!-- Enable log aggregation -->
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <!-- Retain aggregated logs for one day -->
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>
    <property>
        <!-- Enable ResourceManager HA -->
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>my-yarn-cluster</value>
    </property>
    <property>
        <!-- IDs of the ResourceManagers -->
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop002</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop003</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>hadoop002:8088</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address.rm2</name>
        <value>hadoop003:8088</value>
    </property>
    <property>
        <!-- ZooKeeper cluster used for RM election and state storage -->
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop001:2181,hadoop002:2181,hadoop003:2181</value>
    </property>
    <property>
        <!-- Enable ResourceManager recovery -->
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>
    <property>
        <!-- Store ResourceManager state in ZooKeeper -->
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

5. mapred-site.xml

<configuration>
    <property>
        <!-- Run MapReduce jobs on YARN -->
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
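Note: in Hadoop 2.x packages the etc/hadoop directory often ships only a mapred-site.xml.template. If that is the case with your download, create mapred-site.xml from the template before editing it:

# create mapred-site.xml from the bundled template (only if it does not exist yet)
cp mapred-site.xml.template mapred-site.xml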

6. slaves

Configure the hostname or IP address of all slave nodes, one per line. The DataNode service and NodeManager service on all slave nodes will be started.

hadoop001
hadoop002
hadoop003

4.4 Distribute the installation package

Distribute the Hadoop installation package to the other two servers, and it is recommended that you also configure the environment variables of Hadoop on these two servers.

# distribute the installation package to hadoop002
scp -r /usr/app/hadoop-2.6.0-cdh6.15.2/ hadoop002:/usr/app/
# distribute the installation package to hadoop003
scp -r /usr/app/hadoop-2.6.0-cdh6.15.2/ hadoop003:/usr/app/

V. Start the cluster

5.1 Start ZooKeeper

Start the ZooKeeper service on the three servers:

zkServer.sh start
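Optionally, check the status of the ensemble on each node; in a healthy three-node ZooKeeper cluster, one node should report itself as leader and the other two as followers:

zkServer.sh status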

5.2 Start JournalNode

Go to the ${HADOOP_HOME}/sbin directory on each of the three servers and start the JournalNode process:

hadoop-daemon.sh start journalnode

5.3 Initialize NameNode

Execute the NameNode initialization command on hadoop001:

hdfs namenode -format

After executing the initialization command, copy the contents of the NameNode metadata directory to the other, unformatted NameNode. The metadata storage directory is the directory specified by the dfs.namenode.name.dir property in hdfs-site.xml. Here it needs to be copied to hadoop002:

scp -r /home/hadoop/namenode/data hadoop002:/home/hadoop/namenode/
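As an alternative to copying the directory manually, the official HA documentation also offers hdfs namenode -bootstrapStandby, which pulls the namespace from the NameNode that has already been formatted; note that it requires that NameNode to be running, which is why the scp approach is used here:

# optional alternative, run on hadoop002 with the NameNode on hadoop001 already started
hdfs namenode -bootstrapStandby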

5.4 Initialize the HA state

Use the following command on any NameNode to initialize the HA state in ZooKeeper:

hdfs zkfc -formatZK

5.5 Start HDFS

Go to the ${HADOOP_HOME}/sbin directory of hadoop001 and start HDFS. This will start the NameNode services on hadoop001 and hadoop002 and the DataNode services on all three servers:

start-dfs.sh

5.6 Start YARN

Go to the ${HADOOP_HOME}/sbin directory of hadoop002 and start YARN. This will start the ResourceManager service on hadoop002 and the NodeManager services on all three servers:

start-yarn.sh

It should be noted that the ResourceManager service on hadoop003 is usually not started at this time and needs to be started manually:

yarn-daemon.sh start resourcemanager

VI. View the cluster

6.1 View processes

After a successful startup, the processes on each server should be as follows:

[root@hadoop001 sbin]# jps
4512 DFSZKFailoverController
3714 JournalNode
4114 NameNode
3668 QuorumPeerMain
5012 DataNode
4639 NodeManager

[root@hadoop002 sbin]# jps
4499 ResourceManager
4595 NodeManager
3465 QuorumPeerMain
3705 NameNode
3915 DFSZKFailoverController
5211 DataNode
3533 JournalNode

[root@hadoop003 sbin]# jps
3491 JournalNode
3942 NodeManager
4102 ResourceManager
4201 DataNode
3435 QuorumPeerMain

6.2 View the Web UI

The web UI port numbers of HDFS and YARN are 50070 and 8088 respectively (matching the addresses configured above), and the interfaces should look as follows:

At this point, the NameNode on hadoop001 is in the active state:

The NameNode on hadoop002 is in the standby state:

The ResourceManager on hadoop002 is in the active state:

The ResourceManager on hadoop003 is in the standby state:

At the same time, the interface also shows information about the Journal Manager:
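The active/standby states can also be checked from the command line, using the NameNode IDs (nn1/nn2) and ResourceManager IDs (rm1/rm2) configured above:

# query the HDFS NameNode states
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
# query the YARN ResourceManager states
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
# if needed, a coordinated manual failover (see section 1.1) can be triggered with
hdfs haadmin -failover nn1 nn2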

VII. Starting the cluster again

The initial startup of the cluster above involves some necessary initialization operations, so the process is a little tedious. Once the cluster has been built, however, starting it again is straightforward. The steps are as follows (first make sure that the ZooKeeper cluster is started):

Start HDFS on hadoop001. This starts all services related to HDFS high availability, including NameNode, DataNode, and JournalNode:

start-dfs.sh

Start YARN on hadoop002:

start-yarn.sh

At this time, the ResourceManager service on hadoop003 is usually not started and needs to be started manually:

yarn-daemon.sh start resourcemanager

Reference material

The above setup steps mainly refer to the official documentation:

HDFS High Availability Using the Quorum Journal Manager
ResourceManager High Availability

For a detailed analysis of Hadoop's high availability principles, it is recommended to read:

Hadoop NameNode High Availability implementation analysis

For more articles in the big data series, please see the GitHub open source project: Getting Started Guide to Big Data.
