This article explains how to build a hadoop 2.0 cluster. The content is simple, clear, and easy to follow; work through the steps below to learn the cluster building method step by step.
hadoop 2.2.0 Cluster building
PS: The hadoop-2.2.0 installation package provided by Apache is compiled on a 32-bit operating system. Because hadoop depends on some C++ native libraries, hadoop-2.2.0 needs to be recompiled if it is installed on a 64-bit operating system.
1. Preparation (see the pseudo-distributed setup):
1.1 Change the Linux hostnames
1.2 Modify the IP addresses
1.3 Modify the mapping between hostnames and IPs
1.4 Turn off the firewall
1.5 Configure passwordless SSH login
1.6 Install the JDK and configure environment variables
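As a concrete example for steps 1.2 and 1.3: the addresses used later in this article are 192.168.1.201 for hadoop01 and 192.168.1.202 for hadoop02, so the /etc/hosts mapping might look like the following sketch (the hadoop03 address is a hypothetical placeholder, since it never appears in the article):
192.168.1.201 hadoop01
192.168.1.202 hadoop02
192.168.1.203 hadoop03   # hypothetical address for hadoop03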
2. Cluster planning:
PS:
In hadoop 2.0 there are usually two NameNodes, one active and the other standby. The active NameNode serves client requests, while the standby NameNode does not; it only synchronizes the active NameNode's state so that it can take over quickly if the active fails.
hadoop 2.0 officially provides two HDFS HA solutions: NFS and QJM. Here we use the simpler QJM. In this scheme, the active and standby NameNodes synchronize metadata through a set of JournalNodes; a write is considered successful as long as it reaches a majority of the JournalNodes, so an odd number of JournalNodes is usually configured.
A zookeeper cluster is also configured for ZKFC (DFSZKFailoverController) failover: if the active NameNode goes down, the standby NameNode is automatically switched to active.
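For reference, the role layout implied by the configuration in section 3 is roughly:
hadoop01: NameNode (nn1), DFSZKFailoverController, ResourceManager, JournalNode, DataNode, NodeManager, ZooKeeper
hadoop02: NameNode (nn2), DFSZKFailoverController, JournalNode, DataNode, NodeManager, ZooKeeper
hadoop03: JournalNode, DataNode, NodeManager, ZooKeeper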
3. Installation steps:
3.1 Install and configure the ZooKeeper cluster
3.1.1 Extract the archive
tar -zxvf zookeeper-3.4.5.tar.gz -C /cloud/
3.1.2 Modify the configuration
cd /cloud/zookeeper-3.4.5/conf/
cp zoo_sample.cfg zoo.cfg
vim zoo.cfg
Modify: dataDir=/cloud/zookeeper-3.4.5/tmp
Add at the end:
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888
save and exit
Then create a tmp folder
mkdir /cloud/zookeeper-3.4.5/tmp
Create an empty file.
touch /cloud/zookeeper-3.4.5/tmp/myid
Finally, write the ID into the file
echo 1 > /cloud/zookeeper-3.4.5/tmp/myid
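Putting the edits together, the finished zoo.cfg should look roughly like the sketch below; tickTime, initLimit, syncLimit, and clientPort are assumed to keep the zoo_sample.cfg defaults, and only dataDir and the server.* entries are changed or added:
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/cloud/zookeeper-3.4.5/tmp
clientPort=2181
server.1=hadoop01:2888:3888
server.2=hadoop02:2888:3888
server.3=hadoop03:2888:3888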
3.1.3 Copy the configured zookeeper to the other nodes (first create a /cloud directory under the root of hadoop02 and hadoop03: mkdir /cloud)
scp -r /cloud/zookeeper-3.4.5/ hadoop02:/cloud/
scp -r /cloud/zookeeper-3.4.5/ hadoop03:/cloud/
Note: modify the content of /cloud/zookeeper-3.4.5/tmp/myid on hadoop02 and hadoop03 accordingly:
hadoop02:
echo 2 > /cloud/zookeeper-3.4.5/tmp/myid
hadoop03:
echo 3 > /cloud/zookeeper-3.4.5/tmp/myid
3.2 Install and configure the Hadoop cluster
3.2.1 Extract the archive
tar -zxvf hadoop-2.2.0.tar.gz -C /cloud/
3.2.2 Configure HDFS (in hadoop 2.0, all configuration files are under $HADOOP_HOME/etc/hadoop)
Add hadoop to the environment variables:
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.6.0_45
export HADOOP_HOME=/cloud/hadoop-2.2.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin
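After editing /etc/profile, make the new variables take effect in the current shell (or log out and back in):
source /etc/profile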
cd /cloud/hadoop-2.2.0/etc/hadoop
3.2.2.1 Modify hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_45
3.2.2.2 Modify core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://ns1</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/cloud/hadoop-2.2.0/tmp</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop01:2181,hadoop02:2181,hadoop03:2181</value>
  </property>
</configuration>
3.2.2.3 Modify hdfs-site.xml
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.ns1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn1</name>
    <value>hadoop01:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn1</name>
    <value>hadoop01:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1.nn2</name>
    <value>hadoop02:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1.nn2</name>
    <value>hadoop02:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop01:8485;hadoop02:8485;hadoop03:8485/ns1</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/cloud/hadoop-2.2.0/journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.ns1</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>sshfence</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/root/.ssh/id_rsa</value>
  </property>
</configuration>
3.2.2.4 Modify slaves
hadoop01
hadoop02
hadoop03
3.2.3 Configure YARN
3.2.3.1 Modify yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop01</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
3.2.3.2 Modify mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
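Note: depending on the release package, the hadoop-2.2.0 distribution may ship only mapred-site.xml.template under etc/hadoop; if mapred-site.xml does not exist yet, copy the template first:
cp mapred-site.xml.template mapred-site.xml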
3.2.4 Copy the configured hadoop to the other nodes
scp -r /cloud/hadoop-2.2.0/ hadoop02:/cloud/
scp -r /cloud/hadoop-2.2.0/ hadoop03:/cloud/
3.2.5 Start zookeeper cluster
(Start zk on hadoop01, hadoop02 and hadoop03 respectively)
cd /cloud/zookeeper-3.4.5/bin/
./zkServer.sh start
View Status:
./zkServer.sh status
(There should be one leader and two followers.)
3.2.6 Start the journalnodes (executed on hadoop01; hadoop-daemons.sh starts a JournalNode on every node listed in slaves)
cd /cloud/hadoop-2.2.0
sbin/hadoop-daemons.sh start journalnode
(Run the jps command to verify that a JournalNode process has appeared.)
3.2.7 Format HDFS
Execute the command on hadoop01:
hadoop namenode -format
After formatting, a directory is generated according to the hadoop.tmp.dir setting in core-site.xml (here it is configured as /cloud/hadoop-2.2.0/tmp). Then copy /cloud/hadoop-2.2.0/tmp to /cloud/hadoop-2.2.0/ on hadoop02:
scp -r tmp/ hadoop02:/cloud/hadoop-2.2.0/
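As an aside, Hadoop 2.x also offers hdfs namenode -bootstrapStandby to initialize the standby NameNode's metadata; it requires the active NameNode on hadoop01 to already be running, which is why this walkthrough simply copies the tmp directory instead. A minimal sketch, run on hadoop02 after hadoop01's NameNode is up:
hdfs namenode -bootstrapStandby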
3.2.8 Format ZK(executed on hadoop01)
hdfs zkfc -formatZK
3.2.9 Start HDFS(executed on hadoop01)
sbin/start-dfs.sh
3.2.10 Start YARN(executed on hadoop01)
sbin/start-yarn.sh
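To confirm that the daemons came up on every node, jps can be run over SSH; a minimal sketch, assuming the passwordless SSH from step 1.5 and the JDK path configured above:
for host in hadoop01 hadoop02 hadoop03; do
  echo "==== $host ===="
  ssh $host /usr/java/jdk1.6.0_45/bin/jps
done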
At this point, the hadoop 2.2.0 configuration is complete. You can verify it in a browser:
http://192.168.1.201:50070
NameNode 'hadoop01:9000' (active)
http://192.168.1.202:50070
NameNode 'hadoop02:9000' (standby)
Verify HDFS HA
First upload a file to hdfs
hadoop fs -put /etc/profile /profile
hadoop fs -ls /
Then kill the active NameNode process:
kill -9 <pid of the NameNode>
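The PID can be looked up with jps; a one-line sketch for hadoop01, assuming the usual "pid ClassName" jps output:
kill -9 $(jps | awk '$2 == "NameNode" {print $1}')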
Via browser: 192.168.1.202:50070
NameNode 'hadoop02:9000' (active)
The NameNode on hadoop02 becomes active.
Then execute the command:
hadoop fs -ls /
-rw-r--r-- 3 root supergroup 1926 2014-02-06 15:36 /profile
The file you just uploaded still exists!!!
Manually restart the NameNode that was killed (on hadoop01):
sbin/hadoop-daemon.sh start namenode
Via browser: 192.168.1.201:50070
NameNode 'hadoop01:9000' (standby)
Verify YARN:
Run the WordCount example that ships with hadoop:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar wordcount /profile /out
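To inspect the result, list and print the job output (part-r-00000 is the conventional name of a single reducer's output file, assuming the default one-reducer wordcount run):
hadoop fs -ls /out
hadoop fs -cat /out/part-r-00000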
OK, done!!
Thank you for reading. The above is the "hadoop2.0 cluster building method". After studying this article, you should have a deeper understanding of how to build a hadoop 2.0 cluster; the specifics still need to be verified in practice.