How to Build a hadoop 2.0 Cluster


This article explains how to build a hadoop 2.0 cluster. The explanation is simple, clear, and easy to follow; work through the steps below in order to learn the method.

hadoop 2.2.0 Cluster building

PS: The hadoop-2.2.0 installation package that Apache provides is compiled on a 32-bit operating system. Because hadoop relies on some C++ native libraries, hadoop-2.2.0 needs to be recompiled if it is to run on a 64-bit operating system.

1. Preparation (see the pseudo-distributed build):

1.1 Change the Linux hostnames

1.2 Modify the IP addresses

1.3 Modify the mapping between hostnames and IPs (see the example after this list)

1.4 Turn off the firewall

1.5 Configure passwordless SSH login

1.6 Install the JDK and configure environment variables
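
For step 1.3, a minimal /etc/hosts mapping on every node could look like the following sketch. The first two addresses match the ones used in the verification section at the end of this article; 192.168.1.203 for hadoop03 is an assumption for illustration.

192.168.1.201 hadoop01
192.168.1.202 hadoop02
192.168.1.203 hadoop03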

2. Cluster planning:

PS:

In hadoop 2.0, HDFS usually consists of two NameNodes, one active and the other standby. The active NameNode serves client requests, while the standby NameNode serves no requests and only synchronizes the active NameNode's state, so that it can take over quickly if the active one fails.

hadoop 2.0 officially provides two HDFS HA solutions: one is NFS, the other is QJM. Here we use the simpler QJM. In this scheme, the active and standby NameNodes synchronize metadata through a set of JournalNodes; a write is considered successful as long as it reaches a majority of the JournalNodes. Usually an odd number of JournalNodes is configured.

A zookeeper cluster is also configured here for ZKFC (DFSZKFailoverController) failover: if the active NameNode goes down, ZKFC automatically switches the standby NameNode to the active state.
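
The original does not spell the plan out in a table, but it follows from the configuration below; a plausible role assignment (process names as later reported by jps) is:

hadoop01: NameNode, DFSZKFailoverController, JournalNode, DataNode, ResourceManager, NodeManager, QuorumPeerMain (zookeeper)
hadoop02: NameNode, DFSZKFailoverController, JournalNode, DataNode, NodeManager, QuorumPeerMain (zookeeper)
hadoop03: JournalNode, DataNode, NodeManager, QuorumPeerMain (zookeeper)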

3. Installation steps:

3.1. Install and configure the Zookeeper cluster

3.1.1 Unpack

tar -zxvf zookeeper-3.4.5.tar.gz -C /cloud/

3.1.2 Modify the configuration

cd /cloud/zookeeper-3.4.5/conf/

cp zoo_sample.cfg zoo.cfg

vim zoo.cfg

Modify: dataDir=/cloud/zookeeper-3.4.5/tmp

Add at the end:

server.1=hadoop01:2888:3888

server.2=hadoop02:2888:3888

server.3=hadoop03:2888:3888

save and exit

Then create a tmp folder

mkdir /cloud/zookeeper-3.4.5/tmp

Create an empty file.

touch /cloud/zookeeper-3.4.5/tmp/myid

Finally, write the server ID to the file:

echo 1 > /cloud/zookeeper-3.4.5/tmp/myid
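
For reference, a sketch of the resulting zoo.cfg (tickTime, initLimit, syncLimit, and clientPort are zoo_sample.cfg's defaults, which this guide leaves unchanged):

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/cloud/zookeeper-3.4.5/tmp
clientPort=2181
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888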

3.1.3 Copy the configured zookeeper to the other nodes (first create the /cloud directory on hadoop02 and hadoop03: mkdir /cloud)

scp -r /cloud/zookeeper-3.4.5/ hadoop02:/cloud/

scp -r /cloud/zookeeper-3.4.5/ hadoop03:/cloud/

Note: modify the content of /cloud/zookeeper-3.4.5/tmp/myid on hadoop02 and hadoop03 accordingly:

hadoop02:

echo 2 > /cloud/zookeeper-3.4.5/tmp/myid

hadoop03:

echo 3 > /cloud/zookeeper-3.4.5/tmp/myid

3.2. Install and configure the Hadoop cluster

3.2.1 Unpack

tar -zxvf hadoop-2.2.0.tar.gz -C /cloud/

3.2.2 Configure HDFS (in hadoop 2.0, all configuration files live in $HADOOP_HOME/etc/hadoop)

Add hadoop to environment variables

vim /etc/profile

export JAVA_HOME=/usr/java/jdk1.6.0_45

export HADOOP_HOME=/cloud/hadoop-2.2.0

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
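
For the new variables to take effect in the current shell, reload the profile (a standard step the original text omits):

source /etc/profile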

cd /cloud/hadoop-2.2.0/etc/hadoop

3.2.2.1 Modify hadoop-env.sh

export JAVA_HOME=/usr/java/jdk1.6.0_45

3.2.2.2 Modify core-site.xml

<configuration>
	<!-- Default filesystem: the HDFS nameservice defined in hdfs-site.xml -->
	<property>
		<name>fs.defaultFS</name>
		<value>hdfs://ns1</value>
	</property>
	<!-- Base directory for hadoop's working files -->
	<property>
		<name>hadoop.tmp.dir</name>
		<value>/cloud/hadoop-2.2.0/tmp</value>
	</property>
	<!-- Zookeeper quorum used by the ZKFC for automatic failover -->
	<property>
		<name>ha.zookeeper.quorum</name>
		<value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
	</property>
</configuration>

3.2.2.3 Modify hdfs-site.xml

<configuration>
	<!-- Logical name for this HDFS nameservice -->
	<property>
		<name>dfs.nameservices</name>
		<value>ns1</value>
	</property>
	<!-- ns1 has two NameNodes, nn1 and nn2 -->
	<property>
		<name>dfs.ha.namenodes.ns1</name>
		<value>nn1,nn2</value>
	</property>
	<!-- RPC and HTTP addresses of nn1 -->
	<property>
		<name>dfs.namenode.rpc-address.ns1.nn1</name>
		<value>hadoop01:9000</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.ns1.nn1</name>
		<value>hadoop01:50070</value>
	</property>
	<!-- RPC and HTTP addresses of nn2 -->
	<property>
		<name>dfs.namenode.rpc-address.ns1.nn2</name>
		<value>hadoop02:9000</value>
	</property>
	<property>
		<name>dfs.namenode.http-address.ns1.nn2</name>
		<value>hadoop02:50070</value>
	</property>
	<!-- JournalNodes through which the NameNodes share edit logs -->
	<property>
		<name>dfs.namenode.shared.edits.dir</name>
		<value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/ns1</value>
	</property>
	<!-- Where each JournalNode stores its local edits -->
	<property>
		<name>dfs.journalnode.edits.dir</name>
		<value>/cloud/hadoop-2.2.0/journal</value>
	</property>
	<!-- Enable automatic failover via ZKFC -->
	<property>
		<name>dfs.ha.automatic-failover.enabled</name>
		<value>true</value>
	</property>
	<!-- Proxy class HDFS clients use to locate the active NameNode -->
	<property>
		<name>dfs.client.failover.proxy.provider.ns1</name>
		<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
	</property>
	<!-- Fence a failed NameNode over SSH during failover -->
	<property>
		<name>dfs.ha.fencing.methods</name>
		<value>sshfence</value>
	</property>
	<property>
		<name>dfs.ha.fencing.ssh.private-key-files</name>
		<value>/root/.ssh/id_rsa</value>
	</property>
</configuration>
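
Note that sshfence only works if each NameNode host can reach the other over SSH with the key configured above. A quick check (assuming the root account this guide uses throughout): on hadoop01, the following should print the date without a password prompt, and likewise from hadoop02 to hadoop01.

ssh hadoop02 date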

3.2.2.4 Modify slaves (the hosts on which DataNodes and NodeManagers will run):

hadoop01

hadoop02

hadoop03

3.2.3 Configure YARN

3.2.3.1 Modify yarn-site.xml

<configuration>
	<!-- Host that runs the ResourceManager -->
	<property>
		<name>yarn.resourcemanager.hostname</name>
		<value>hadoop01</value>
	</property>
	<!-- Auxiliary shuffle service required by MapReduce -->
	<property>
		<name>yarn.nodemanager.aux-services</name>
		<value>mapreduce_shuffle</value>
	</property>
</configuration>

3.2.3.2 Modify mapred-site.xml

<configuration>
	<!-- Run MapReduce jobs on YARN -->
	<property>
		<name>mapreduce.framework.name</name>
		<value>yarn</value>
	</property>
</configuration>
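
Note: the hadoop 2.2.0 tarball ships mapred-site.xml.template rather than mapred-site.xml, so if the file is missing, create it first:

cp mapred-site.xml.template mapred-site.xml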

3.2.4 Copy the configured hadoop to the other nodes

scp -r /cloud/hadoop-2.2.0/ hadoop02:/cloud/

scp -r /cloud/hadoop-2.2.0/ hadoop03:/cloud/

3.2.5 Start zookeeper cluster

(Start zk on hadoop01, hadoop02 and hadoop03 respectively)

cd /cloud/zookeeper-3.4.5/bin/

./zkServer.sh start

View Status:

./zkServer.sh status

(You should see one leader and two followers.)
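
For reference, the status output on each node ends with a Mode line, roughly like this sketch (exact paths depend on your install):

JMX enabled by default
Using config: /cloud/zookeeper-3.4.5/bin/../conf/zoo.cfg
Mode: leader (on one node; Mode: follower on the other two)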

3.2.6 Start the journalnodes (hadoop-daemons.sh reads the slaves file, so running it once on hadoop01 starts a JournalNode on every node)

cd /cloud/hadoop-2.2.0

sbin/hadoop-daemons.sh start journalnode

(Run the jps command to verify that a JournalNode process is now running on each node.)

3.2.7 Format HDFS

Execute the command on hadoop01:

hadoop namenode -format

Formatting generates files under the directory configured as hadoop.tmp.dir in core-site.xml; here that is /cloud/hadoop-2.2.0/tmp. Then copy /cloud/hadoop-2.2.0/tmp to /cloud/hadoop-2.2.0/ on hadoop02:

scp -r tmp/ hadoop02:/cloud/hadoop-2.2.0/
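
Alternatively, hadoop 2.x can initialize the standby NameNode directly instead of copying tmp by hand; run this on hadoop02 while the NameNode on hadoop01 is up:

hdfs namenode -bootstrapStandby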

3.2.8 Format ZK(executed on hadoop01)

hdfs zkfc -formatZK

3.2.9 Start HDFS(executed on hadoop01)

sbin/start-dfs.sh

3.2.10 Start YARN (executed on hadoop01)

sbin/start-yarn.sh
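
If everything started, jps on each node should roughly match the cluster plan given earlier; for example, on hadoop01 (process IDs omitted):

NameNode
DFSZKFailoverController
JournalNode
DataNode
ResourceManager
NodeManager
QuorumPeerMain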

At this point the hadoop 2.2.0 configuration is complete, and you can check it in a browser:

http://192.168.1.201:50070

NameNode 'hadoop01:9000' (active)

http://192.168.1.202:50070

NameNode 'hadoop02:9000' (standby)

Verify HDFS HA

First upload a file to hdfs

hadoop fs -put /etc/profile /profile

hadoop fs -ls /

Then kill the active NameNode (find its pid with jps):

kill -9 <namenode-pid>

Via browser: 192.168.1.202:50070

NameNode 'hadoop02:9000' (active)

The NameNode on hadoop02 becomes active.

Then execute the command:

hadoop fs -ls /

-rw-r--r-- 3 root supergroup 1926 2014-02-06 15:36 /profile

The file you just uploaded still exists!!!

Manually restart the NameNode that was killed:

sbin/hadoop-daemon.sh start namenode

Via browser: 192.168.1.201:50070

NameNode 'hadoop01:9000' (standby)

Verify YARN:

Run the WordCount program in the demo provided by hadoop:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /profile /out
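
Once the job finishes, you can inspect the result; the reducer output typically lands in a part-r-00000 file under the output directory given above:

hadoop fs -cat /out/part-r-00000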

OK, done!!

Thank you for reading. That covers the hadoop 2.0 cluster building method; after working through this article you should have a deeper understanding of it, though the specifics still need to be verified in practice.
