
How to use 4 nodes to build Hadoop2.x HA Test Cluster


The editor shares with you how to use 4 nodes to build a Hadoop 2.x HA test cluster. Most people probably don't know much about this, so this article is shared for your reference; I hope you learn a lot from reading it. Let's get to it!

Building Hadoop 2.x HA

1. Machine preparation

4 virtual machines

10.211.55.22 node1

10.211.55.23 node2

10.211.55.24 node3

10.211.55.25 node4

2. Role arrangement on the four host nodes

node     namenode  datanode  zk  zkfc  jn  rm  nodemanager
node1    1                   1   1
node2    1         1         1   1     1       1
node3              1         1         1   1   1
node4              1                   1   1   1

(A 1 marks a process that runs on that node: zk = ZooKeeper, zkfc = ZKFailoverController, jn = JournalNode, rm = ResourceManager; the NodeManager/ApplicationManager runs alongside each DataNode.)

Summary:

Number of processes started per node (as reported by jps, which also counts its own Jps process): node1: 4, node2: 7, node3: 6, node4: 5.
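For example, after the full startup in section 5.9, jps on node1 would show something like the following (the pids are illustrative, not from the original):

$ jps
2817 NameNode
2901 QuorumPeerMain
3042 DFSZKFailoverController
3155 Jps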

3. Preparation on all machines

3.1 Hostname and hosts/DNS file configuration

Modify each virtual machine's hostname.

Also add the node1-node4 name mappings on the Mac host machine, so it can resolve the nodes.

hostname node1   (node2 / node3 / node4 on the respective machines)
vi /etc/sysconfig/network   (set the hostname to node1 / node2 / node3 / node4)
vi /etc/hosts

10.211.55.22 node1
10.211.55.23 node2
10.211.55.24 node3
10.211.55.25 node4

Restart for the changes to take effect.
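A quick sanity check after the restart (illustrative commands, assuming the hosts file above):

hostname          # should print the node's own name, e.g. node1
ping -c 1 node2   # should resolve to 10.211.55.23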

3.2 Turn off the firewall

service iptables stop && chkconfig iptables off

Check

service iptables status

3.3 Configure key-free (passwordless) SSH

The dsa algorithm is used here.

node1, node2, node3, and node4 each configure key-free login to their own machine:

$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

Copy from node1 to node2 node3 node4

scp ~/.ssh/id_dsa.pub root@node2:~
scp ~/.ssh/id_dsa.pub root@node3:~
scp ~/.ssh/id_dsa.pub root@node4:~

node2, node3, and node4 each append it:

cat ~/id_dsa.pub >> ~/.ssh/authorized_keys

Copy from node2 to node1 node3 node4

scp ~/.ssh/id_dsa.pub root@node1:~
scp ~/.ssh/id_dsa.pub root@node3:~
scp ~/.ssh/id_dsa.pub root@node4:~

node1, node3, and node4 each append it:

cat ~/id_dsa.pub >> ~/.ssh/authorized_keys

Copy from node3 to node1 node2 node4

scp ~/.ssh/id_dsa.pub root@node1:~
scp ~/.ssh/id_dsa.pub root@node2:~
scp ~/.ssh/id_dsa.pub root@node4:~

node1, node2, and node4 each append it:

cat ~/id_dsa.pub >> ~/.ssh/authorized_keys

Copy from node4 to node1 node2 node3

scp ~/.ssh/id_dsa.pub root@node1:~
scp ~/.ssh/id_dsa.pub root@node2:~
scp ~/.ssh/id_dsa.pub root@node3:~

node1, node2, and node3 each append it:

cat ~/id_dsa.pub >> ~/.ssh/authorized_keys
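With all keys exchanged, passwordless login can be verified from any node (an illustrative check, not from the original):

for h in node1 node2 node3 node4; do
  ssh root@$h hostname   # each should print its hostname without a password prompt
done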

3.4 Time synchronization (ntp)

On all machines:

yum install ntp
ntpdate -u s2m.time.edu.cn

The clocks need to be synchronized when the cluster starts; to be safe, it is best to set up local-area-network time synchronization and keep the nodes in sync.

Check: date
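One simple way to keep the nodes in sync, as suggested above, is a periodic ntpdate from cron (an assumption for illustration, not part of the original steps):

# run as root; note this overwrites any existing root crontab
echo '0 * * * * /usr/sbin/ntpdate -u s2m.time.edu.cn' | crontab -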

3.5 Install the Java JDK

Install jdk and configure environment variables

All machines:

Uninstall OpenJDK:

java -version
rpm -qa | grep jdk
rpm -e --nodeps java-1.6.0-openjdk-javadoc-1.6.0.0-1.41.1.10.4.el6.x86_64...
rpm -qa | grep jdk

Install the JDK:

rpm -ivh jdk-7u67-linux-x64.rpm
vi ~/.bash_profile
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$PATH:$JAVA_HOME/bin
source ~/.bash_profile

Check:

java -version

3.6 Upload and decompress the software

Upload hadoop-2.5.1_x64.tar.gz

scp /Users/mac/Documents/happyup/study/files/hadoop/hadoop-2.5.1_x64.tar.gz root@node1:/home

(repeat for node2, node3, node4)

Upload zk

scp /Users/mac/Documents/happyup/study/files/hadoop/ha/zookeeper-3.4.6.tar.gz root@node1:/home

(repeat for node2, node3)
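Equivalently, a short loop on the Mac host saves retyping the scp commands (a sketch; adjust the source paths if yours differ):

for h in node1 node2 node3 node4; do
  scp /Users/mac/Documents/happyup/study/files/hadoop/hadoop-2.5.1_x64.tar.gz root@$h:/home
done
for h in node1 node2 node3; do
  scp /Users/mac/Documents/happyup/study/files/hadoop/ha/zookeeper-3.4.6.tar.gz root@$h:/home
done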

Decompress:

On node1, node2, node3, node4:

tar -xzvf /home/hadoop-2.5.1_x64.tar.gz

On node1, node2, node3:

tar -xzvf /home/zookeeper-3.4.6.tar.gz

3.7 Snapshot

The Hadoop HA preparation work is complete:

3.1 Hostname and hosts/DNS file configuration
3.2 Turn off the firewall
3.3 Configure mutual key-free SSH for all machines
3.4 Time synchronization (ntp)
3.5 Install the Java JDK
3.6 Upload and decompress the software (hadoop, zk)

Take a snapshot at this point, so other machines can reuse it as well.

4. ZooKeeper installation and configuration

4.1 Modify the configuration file zoo.cfg

ssh root@node1
cp /home/zookeeper-3.4.6/conf/zoo_sample.cfg /home/zookeeper-3.4.6/conf/zoo.cfg
vi zoo.cfg

Set dataDir=/opt/zookeeper and append at the end:

server.1=node1:2888:3888
server.2=node2:2888:3888
server.3=node3:2888:3888
:wq

4.2 Create the dataDir working directory

mkdir /opt/zookeeper
cd /opt/zookeeper
vi myid   (fill in 1)
:wq

Copy the directory to node2 and node3, changing myid accordingly:

scp -r /opt/zookeeper/ root@node2:/opt   (change myid to 2)
scp -r /opt/zookeeper/ root@node3:/opt   (change myid to 3)

4.3 Synchronize the configuration

Copy the zk configuration to node2 and node3:

scp -r /home/zookeeper-3.4.6/conf root@node2:/home/zookeeper-3.4.6/
scp -r /home/zookeeper-3.4.6/conf root@node3:/home/zookeeper-3.4.6/
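A quick way to verify that each node got the right myid (an illustrative check, relying on the key-free ssh from 3.3):

for h in node1 node2 node3; do
  ssh root@$h cat /opt/zookeeper/myid   # expect 1, 2, 3 in order
done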

4.4 Add environment variables

On node1, node2, node3:

vi ~/.bash_profile
export ZOOKEEPER_HOME=/home/zookeeper-3.4.6
(append $ZOOKEEPER_HOME/bin to PATH)
source ~/.bash_profile

4.5 Start ZooKeeper

On node1, node2, node3 in turn, from the zk bin directory:

zkServer.sh start

jps should show: 3214 QuorumPeerMain
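Once all three are up, zkServer.sh status reports each node's role in the ensemble (one leader, two followers):

zkServer.sh status
# expect 'Mode: follower' on two of the nodes and 'Mode: leader' on the third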

5. Hadoop installation and configuration

5.1 hadoop-env.sh

cd /home/hadoop-2.5.1/etc/hadoop/
vi hadoop-env.sh

Change: export JAVA_HOME=/usr/java/jdk1.7.0_67

5.2 slaves

vi slaves

node2
node3
node4

5.3 hdfs-site.xml

vi hdfs-site.xml

<configuration>
  <property><name>dfs.nameservices</name><value>cluster1</value></property>
  <property><name>dfs.ha.namenodes.cluster1</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.nn1</name><value>node1:8020</value></property>
  <property><name>dfs.namenode.rpc-address.cluster1.nn2</name><value>node2:8020</value></property>
  <property><name>dfs.namenode.http-address.cluster1.nn1</name><value>node1:50070</value></property>
  <property><name>dfs.namenode.http-address.cluster1.nn2</name><value>node2:50070</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name><value>qjournal://node2:8485;node3:8485;node4:8485/cluster1</value></property>
  <property><name>dfs.client.failover.proxy.provider.cluster1</name><value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/root/.ssh/id_dsa</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/opt/journal/data</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
</configuration>
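Once the environment variables from 5.8 are in place, hdfs getconf offers a quick sanity check that this file is being picked up (a suggested check, not from the original):

hdfs getconf -confKey dfs.nameservices            # expect: cluster1
hdfs getconf -confKey dfs.ha.namenodes.cluster1   # expect: nn1,nn2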

5.4 core-site.xml

vi core-site.xml

<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://cluster1</value></property>
  <property><name>hadoop.tmp.dir</name><value>/opt/hadoop</value></property>
  <property><name>ha.zookeeper.quorum</name><value>node1:2181,node2:2181,node3:2181</value></property>
</configuration>

5.5 mapred-site.xml

cp mapred-site.xml.template mapred-site.xml
vi mapred-site.xml

<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>

5.6 yarn-site.xml

vi yarn-site.xml

(The ApplicationManager/NodeManager hosts do not need to be configured here, because they are the same nodes as the DataNodes.)

<configuration>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>rm</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>node3</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>node4</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>node1:2181,node2:2181,node3:2181</value></property>
</configuration>

5.7 Synchronize the configuration files

Synchronize to node2, node3, node4:

scp /home/hadoop-2.5.1/etc/hadoop/* root@node2:/home/hadoop-2.5.1/etc/hadoop
scp /home/hadoop-2.5.1/etc/hadoop/* root@node3:/home/hadoop-2.5.1/etc/hadoop
scp /home/hadoop-2.5.1/etc/hadoop/* root@node4:/home/hadoop-2.5.1/etc/hadoop

5.8 Modify the environment variables

On node1, node2, node3, node4:

vi ~/.bash_profile
export HADOOP_HOME=/home/hadoop-2.5.1
(append to PATH: :$HADOOP_HOME/bin:$HADOOP_HOME/sbin)
source ~/.bash_profile
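After sourcing the profile, a quick check that the Hadoop binaries are on the PATH:

hadoop version   # should report Hadoop 2.5.1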

5.9 Startup

1. Start ZooKeeper on node1, node2, node3

From the zk bin directory: zkServer.sh start

jps should show: 3214 QuorumPeerMain

Start node1, node2, node3 in turn.

2. Start the JournalNodes in preparation for formatting the NameNode. If this is a second reconfiguration, first delete /opt/hadoop and /opt/journal/data on node1, node2, node3, node4.

Execute the following on node2, node3, node4:

./hadoop-daemon.sh start journalnode

jps to verify that the JournalNode process is present

3. Format a NameNode (node1)

cd bin
./hdfs namenode -format

Check the printed log and verify that files were generated in the working directory.

4. To synchronize this NameNode's edits files to the other NameNode (node2), first start the freshly formatted NameNode on node1:

cd sbin
./hadoop-daemon.sh start namenode

Verify via the log: cd ../logs; tail -n50 hadoop-root-namenode

5. Synchronize the edits files

On node2 (do not format it):

cd bin
./hdfs namenode -bootstrapStandby

Check whether the files were generated on node2.

6. Stop all services (on node1)

cd sbin
./stop-dfs.sh

7. Initialize ZKFC on either NameNode; be sure ZooKeeper is started first.

cd bin
./hdfs zkfc -formatZK

8. Start the cluster

cd sbin
./start-dfs.sh
./start-yarn.sh   (or use start-all.sh)

jps: check for the ResourceManager and NodeManager processes.

In Hadoop 2.x the standby ResourceManager must be started (and stopped) manually on node3 and node4:

yarn-daemon.sh start resourcemanager
yarn-daemon.sh stop resourcemanager

9. Check that the startup succeeded, and test

jps

HDFS web UI: http://node1:50070 and http://node2:50070 (one active, one standby)
RM web UI: http://node3:8088 and http://node4:8088

Upload a file:

cd bin
./hdfs dfs -mkdir -p /usr/file
./hdfs dfs -put /usr/local/jdk /usr/file

Then close one RM and one NameNode and observe the failover.
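A concrete way to exercise the failover (the pid is hypothetical; haadmin and rmadmin are standard Hadoop 2.x tools):

hdfs haadmin -getServiceState nn1    # reports active or standby
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
jps                                  # find the active NameNode's pid
kill -9 <pid-of-active-namenode>     # simulate a crash; the standby should take over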

10. Troubleshooting

1. Check the console output.
2. Check jps.
3. Check the logs on the corresponding node.
4. Before reformatting, delete the hadoop working directory and the journalnode working directory.

These are all the contents of the article "How to use 4 nodes to build Hadoop2.x HA Test Cluster". Thank you for reading! I hope the shared content has helped you; if you want to learn more, welcome to follow the industry information channel!
