This article explains in detail how to build a Hadoop 2.7.2 high-availability cluster. The setup is very practical, so it is shared here as a reference; I hope you get something out of reading it.
Cluster planning:

Host      IP               Installed software       Processes
hadoop1   192.168.111.143  jdk, hadoop              NameNode, DFSZKFailoverController (zkfc), ResourceManager
hadoop2   192.168.111.144  jdk, hadoop              NameNode, DFSZKFailoverController (zkfc), ResourceManager
hadoop3   192.168.111.145  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop4   192.168.111.146  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
hadoop5   192.168.111.147  jdk, hadoop, zookeeper   DataNode, NodeManager, JournalNode, QuorumPeerMain
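Every command below addresses the machines by hostname, so hadoop1 through hadoop5 must be resolvable on every node. If you are not using DNS, an /etc/hosts entry set matching the plan above would be:

192.168.111.143 hadoop1
192.168.111.144 hadoop2
192.168.111.145 hadoop3
192.168.111.146 hadoop4
192.168.111.147 hadoop5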
1. ZooKeeper cluster setup

1.1 Decompress

tar -zxvf zookeeper-3.4.9.tar.gz -C /home/hbase

1.2 Modify the configuration

cd /home/hbase/zookeeper-3.4.9/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg

Modify:
dataDir=/home/hbase/zookeeper-3.4.9/tmp

At the end of zoo.cfg, add:
server.1=hadoop3:2888:3888
server.2=hadoop4:2888:3888
server.3=hadoop5:2888:3888

Then create the tmp folder:
mkdir /home/hbase/zookeeper-3.4.9/tmp

Create an empty myid file:
touch /home/hbase/zookeeper-3.4.9/tmp/myid

Finally, write this server's ID into the file:
echo 1 >> /home/hbase/zookeeper-3.4.9/tmp/myid
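For reference, after these edits the complete zoo.cfg should look roughly like the following (tickTime, initLimit, syncLimit, and clientPort are the zoo_sample.cfg defaults; adjust if yours differ):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/home/hbase/zookeeper-3.4.9/tmp
clientPort=2181
server.1=hadoop3:2888:3888
server.2=hadoop4:2888:3888
server.3=hadoop5:2888:3888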
1.3 Copy the configured ZooKeeper to the other nodes

scp -r /home/hbase/zookeeper-3.4.9/ hadoop4:/home/hbase/
scp -r /home/hbase/zookeeper-3.4.9/ hadoop5:/home/hbase/

Note: on hadoop4 and hadoop5, change the contents of /home/hbase/zookeeper-3.4.9/tmp/myid accordingly:
On hadoop4: echo 2 >> /home/hbase/zookeeper-3.4.9/tmp/myid
On hadoop5: echo 3 >> /home/hbase/zookeeper-3.4.9/tmp/myid

2. Install and configure the Hadoop cluster (operate on hadoop1)

2.1 Extract

tar -zxvf hadoop-2.7.2.tar.gz -C /home/hbase/

2.2 Configure HDFS

# Add hadoop to the environment variables
vim /etc/profile
export JAVA_HOME=/home/hbase/jdk/jdk1.7.0_79
export HADOOP_HOME=/home/hbase/hadoop-2.7.2
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin

# The hadoop 2.x configuration files are all under $HADOOP_HOME/etc/hadoop
cd /home/hbase/hadoop-2.7.2/etc/hadoop

2.2.1 Modify hadoop-env.sh

export JAVA_HOME=/home/hbase/jdk/jdk1.7.0_79

2.2.2 Modify core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns1</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hbase/hadoop-2.7.2/tmp</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value>
    </property>
</configuration>

2.2.3 Modify hdfs-site.xml

<configuration>
    <property>
        <name>dfs.nameservices</name>
        <value>ns1</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.ns1</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn1</name>
        <value>hadoop1:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn1</name>
        <value>hadoop1:50070</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.ns1.nn2</name>
        <value>hadoop2:9000</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.ns1.nn2</name>
        <value>hadoop2:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hadoop3:8485;hadoop4:8485;hadoop5:8485/ns1</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hbase/hadoop-2.7.2/journal</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.ns1</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence
shell(/bin/true)</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>30000</value>
    </property>
</configuration>

2.2.4 Modify mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

2.2.5 Modify yarn-site.xml

<configuration>
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yrc</value>
    </property>
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>hadoop1</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>hadoop2</value>
    </property>
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>hadoop3:2181,hadoop4:2181,hadoop5:2181</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>

2.2.6 Modify slaves
The slaves file specifies the worker nodes: the slaves file on hadoop1 determines where the DataNode and NodeManager processes are started.
hadoop3
hadoop4
hadoop5

2.2.7 Configure passwordless login

# First configure passwordless login from hadoop1 to hadoop2, hadoop3, hadoop4, and hadoop5
# Generate a key pair on hadoop1
ssh-keygen -t rsa
# Copy the public key to the other nodes, including hadoop1 itself
ssh-copy-id hadoop1
ssh-copy-id hadoop2
ssh-copy-id hadoop3
ssh-copy-id hadoop4
ssh-copy-id hadoop5
# Note: passwordless ssh must be configured between the two NameNodes in both directions,
# so don't forget hadoop2 to hadoop1. Generate a key pair on hadoop2:
ssh-keygen -t rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop1

2.3 Copy the configured hadoop to the other nodes

scp -r /home/hbase/hadoop-2.7.2/ root@hadoop2:/home/hbase/
scp -r /home/hbase/hadoop-2.7.2/ root@hadoop3:/home/hbase/
scp -r /home/hbase/hadoop-2.7.2/ root@hadoop4:/home/hbase/
scp -r /home/hbase/hadoop-2.7.2/ root@hadoop5:/home/hbase/

3. First startup

3.1 Start the ZooKeeper cluster (on hadoop3, hadoop4, and hadoop5 respectively)

cd /home/hbase/zookeeper-3.4.9/bin/
./zkServer.sh start
# Check the status: one leader, two followers
./zkServer.sh status

3.2 Start the JournalNodes (executed on hadoop3, hadoop4, and hadoop5 respectively)

cd /home/hbase/hadoop-2.7.2
sbin/hadoop-daemon.sh start journalnode
# Run jps to verify that a JournalNode process appeared on hadoop3, hadoop4, and hadoop5

3.3 Format HDFS

# Execute on hadoop1:
hdfs namenode -format
# Then on hadoop2, sync the standby NameNode's metadata (nn1 must be running for this
# to work; if it is not, start it first on hadoop1 with sbin/hadoop-daemon.sh start namenode):
hdfs namenode -bootstrapStandby

3.4 Format ZK (execute on hadoop1 only)

hdfs zkfc -formatZK

3.5 Start HDFS (execute on hadoop1)

sbin/start-dfs.sh
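Before moving on, it is worth confirming that the two NameNodes took the expected roles. Using the nn1/nn2 IDs defined in hdfs-site.xml above, one should report active and the other standby:

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2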
Note:
If the host of a DataNode cannot be found when the DataNodes start, first check that the slaves file is configured correctly; if it looks fine, delete the file and recreate it.
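A quick sanity check before deleting anything (paths as configured above):

cat /home/hbase/hadoop-2.7.2/etc/hadoop/slaves
# should print exactly:
# hadoop3
# hadoop4
# hadoop5
ping -c 1 hadoop3    # each slave hostname should also resolve from hadoop1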
3.6 Start YARN (execute on hadoop1)

sbin/start-yarn.sh

Note: start-yarn.sh only starts the ResourceManager on the machine it runs on; in this HA setup, start the standby ResourceManager on hadoop2 separately with sbin/yarn-daemon.sh start resourcemanager.
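As with HDFS, you can confirm the ResourceManager HA state using the rm1/rm2 IDs defined in yarn-site.xml; one should be active and the other standby:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2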
Check the processes on each machine with jps:
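Going by the cluster plan at the top, jps should report roughly the following on each node (process IDs omitted):

hadoop1, hadoop2: NameNode, DFSZKFailoverController, ResourceManager
hadoop3, hadoop4, hadoop5: DataNode, NodeManager, JournalNode, QuorumPeerMain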
At this point hadoop-2.7.2 is configured, and you can verify it from a browser:

http://192.168.111.143:50070
NameNode 'hadoop1:9000' (active)

http://192.168.111.144:50070
NameNode 'hadoop2:9000' (standby)

The DataNodes are listed on the Datanodes tab of either NameNode's web UI.
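If you prefer the command line, the NameNode web port also serves a JMX servlet; the NameNodeStatus bean reports the HA state (a quick check, assuming your build exposes this bean):

curl -s 'http://192.168.111.143:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'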
To summarize: after the Hadoop cluster installation is complete, first start ZooKeeper and the JournalNodes, then format HDFS and ZKFC, and finally start the NameNodes, ResourceManagers, and DataNodes.
4. Startup and shutdown

4.1 Hadoop startup
1. ./zkServer.sh start (hadoop3, hadoop4, hadoop5)
2. ./hadoop-daemon.sh start journalnode (hadoop3, hadoop4, hadoop5)
3. hdfs namenode -format (hadoop1)
4. hdfs namenode -bootstrapStandby (hadoop2)
5. hdfs zkfc -formatZK (hadoop1)
6. ./start-dfs.sh (hadoop1)
7. ./start-yarn.sh (hadoop1)
8. If a process fails to start, run its start command on that machine by itself
9. ./yarn-daemon.sh start proxyserver
10. ./mr-jobhistory-daemon.sh start historyserver
Description:
The formatting steps (3, 4, and 5) are only performed before starting Hadoop for the first time; the JournalNodes from step 2 must already be running when you format. Do not repeat the formatting later. If something goes wrong during startup, you can reformat and start over.
Start a ResourceManager alone: ./yarn-daemon.sh start resourcemanager
Start a NameNode alone: ./hadoop-daemon.sh start namenode
Start a zkfc alone: ./hadoop-daemon.sh start zkfc
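For routine startups after the first one, the sequence above can be wrapped in a small script run from hadoop1. This is only a sketch: the script name is illustrative, and it assumes the installation paths used throughout this article plus the passwordless ssh configured in section 2.2.7.

#!/usr/bin/env bash
# start-cluster.sh - routine startup for the five-node cluster described above
set -e

ZK_HOME=/home/hbase/zookeeper-3.4.9
HADOOP_HOME=/home/hbase/hadoop-2.7.2

# 1. Start ZooKeeper on the three quorum nodes (same install path on every node)
for node in hadoop3 hadoop4 hadoop5; do
    ssh "$node" "$ZK_HOME/bin/zkServer.sh start"
done

# 2. Start HDFS: NameNodes, DataNodes, JournalNodes, and the zkfc daemons
"$HADOOP_HOME/sbin/start-dfs.sh"

# 3. Start YARN on hadoop1, then the standby ResourceManager on hadoop2
"$HADOOP_HOME/sbin/start-yarn.sh"
ssh hadoop2 "$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager"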
4.2 Hadoop shutdown

1. ./stop-dfs.sh
2. ./stop-yarn.sh
3. ./yarn-daemon.sh stop proxyserver
4. ./mr-jobhistory-daemon.sh stop historyserver
5. Active / standby handover test
Kill the NameNode process on hadoop1 while it is in the active state, and you will see hadoop2 switch from standby to active. Then start hadoop1's NameNode again and you will find that hadoop1 comes back in the standby state.
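Step by step, the test looks like this (<pid> stands for the NameNode process ID that jps reports):

# On hadoop1, currently active:
jps | grep NameNode      # note the PID
kill -9 <pid>
# From any node, hadoop2 should now report active:
hdfs haadmin -getServiceState nn2
# Restart the NameNode on hadoop1; it rejoins as standby:
sbin/hadoop-daemon.sh start namenode
hdfs haadmin -getServiceState nn1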
This concludes the article on how to build a Hadoop 2.7.2 cluster. I hope the content above is of some help and lets you learn something new. If you found the article useful, please share it so more people can see it.