This article explains in detail how to build a fully distributed Hadoop cluster. The steps are quite practical, so they are shared here as a reference; hopefully you will take something away after reading.
Building a fully distributed Hadoop cluster (environment: Linux virtual machines)
1. Preparation (plan the hostnames, IPs and roles; build three machines first, then add the fourth dynamically.
In the role column, namenode, secondaryNamenode and jobTracker can also be deployed on separate machines, depending on actual needs; the layout below is not the only option.)

Hostname    IP               Role
cloud01     192.168.1.101    namenode/secondaryNamenode/jobTracker
cloud02     192.168.1.102    datanode/taskTracker
cloud03     192.168.1.103    datanode/taskTracker
cloud04     192.168.1.104    datanode/taskTracker (added later in step 5)
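For reference, the hostname-to-IP mapping configured in step 2.3 below would make /etc/hosts on each machine look roughly like this (the loopback line is the distribution's default and may differ on your system):

    127.0.0.1       localhost
    192.168.1.101   cloud01
    192.168.1.102   cloud02
    192.168.1.103   cloud03
    192.168.1.104   cloud04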
2. Configure the Linux environment (refer to the pseudo-distributed setup)
2.1 Modify the hostname on each machine (cloud01, cloud02, cloud03)
2.2 Modify the IP of each machine (as planned above)
2.3 Modify the mapping between hostnames and IPs
(edit /etc/hosts on cloud01 only, then copy it to the other machines; instructions:
scp /etc/hosts cloud02:/etc/
scp /etc/hosts cloud03:/etc/)
2.4 Turn off the firewall
2.5 Reboot
(a sketch of the commands for steps 2.1, 2.4 and 2.5 follows below)
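As a rough sketch of those commands, assuming a CentOS 6-style virtual machine (the distribution is an assumption; other distributions use different commands):

    # 2.1 set the hostname persistently: edit HOSTNAME=cloud01 (cloud02, cloud03)
    vi /etc/sysconfig/network
    # 2.4 stop the firewall now and keep it off after reboot
    service iptables stop
    chkconfig iptables off
    # 2.5 reboot so the hostname and IP changes take effect
    reboot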
3. Install the JDK (refer to the pseudo-distributed setup; jdk1.6.0_45 is used as the example version)
It only needs to be installed on one machine and can then be copied to the others (it is best to keep all the software under one directory).
For example, on cloud01 the JDK is installed under /soft/java.
(The instructions
scp -r /soft/java/ cloud02:/soft/
scp -r /soft/java/ cloud03:/soft/
would copy the JDK over, but hold off for now and copy everything together after hadoop is installed below.)
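The JDK also has to be visible to the shell and to hadoop; a minimal sketch of the lines appended to /etc/profile, assuming the /soft/java/jdk1.6.0_45 path above:

    # make the JDK available system-wide
    export JAVA_HOME=/soft/java/jdk1.6.0_45
    export PATH=$JAVA_HOME/bin:$PATH

After editing, run source /etc/profile to load the variables into the current shell, and verify with java -version.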
4. Install the hadoop cluster (hadoop-1.1.2 is used as the example version)
4.1 Upload the hadoop package to the /soft directory and unpack it there (see the pseudo-distributed setup)
4.2 Configure hadoop (six files need to be configured this time)
4.2.1 hadoop-env.sh
On line 9:
export JAVA_HOME=/soft/java/jdk1.6.0_45 (remember to remove the leading #)
4.2.2 core-site.xml
<property>
    <name>fs.default.name</name>
    <value>hdfs://cloud01:9000</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/soft/hadoop-1.1.2/tmp</value>
</property>
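For context, each *-site.xml file wraps its properties in a single configuration element, so the complete core-site.xml would look roughly like this:

    <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
    <configuration>
        <property>
            <name>fs.default.name</name>
            <value>hdfs://cloud01:9000</value>
        </property>
        <property>
            <name>hadoop.tmp.dir</name>
            <value>/soft/hadoop-1.1.2/tmp</value>
        </property>
    </configuration>

hdfs-site.xml and mapred-site.xml below follow the same layout.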
4.2.3 hdfs-site.xml
<property>
    <name>dfs.replication</name>
    <value>3</value>
</property>
4.2.4 mapred-site.xml
<property>
    <name>mapred.job.tracker</name>
    <value>cloud01:9001</value>
</property>
4.2.5 masters (specifies the secondaryNamenode address)
cloud01
4.2.6 slaves (specifies the child nodes)
cloud02
cloud03
4.3 Copy the configured hadoop to the other two machines
Copy the whole /soft folder directly (it contains both the JDK and hadoop, which is why keeping the software under one directory is strongly recommended).
Instructions:
scp -r /soft/ cloud02:/
scp -r /soft/ cloud03:/
4.4 Configure passwordless ssh login
Only the master node needs passwordless login to the child nodes,
that is, from cloud01 to cloud02 and cloud03,
so the key pair only needs to be generated on cloud01.
Instruction: ssh-keygen -t rsa
Then copy the public key to the other two machines.
Instructions: ssh-copy-id -i cloud02
ssh-copy-id -i cloud03
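As a quick sanity check (an extra step, not part of the original recipe), log in from cloud01 once; no password prompt should appear:

    ssh cloud02 hostname    (should print cloud02 without asking for a password)
    ssh cloud03 hostname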
4.5 Format hadoop
This only needs to be done on cloud01 (the master node, where the namenode runs).
Instruction: hadoop namenode -format
4.6 Verification
Start the cluster. Instruction: start-all.sh
If a safemode-related exception appears during startup,
execute: hadoop dfsadmin -safemode leave (leaves safe mode)
and start hadoop again.
Then run jps on each machine to check that the processes match the planned roles.
OK, if everything matches the plan, the build is done.
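With the role plan from step 1, the jps listing should look roughly like the sketch below (the process IDs will of course differ, and jps also lists itself):

    On cloud01:
    2481 NameNode
    2539 SecondaryNameNode
    2610 JobTracker
    On cloud02 and cloud03:
    2201 DataNode
    2305 TaskTracker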
5. Add a node dynamically
(This is very common and practical in actual production.)
cloud04 192.168.1.104 datanode/taskTracker
5.1 Add a Linux machine by cloning (cloning cloud01 is used as the example; this would not happen in actual production, where virtual machines are rarely used and the nodes are physical servers. Note that the machine to be cloned must be shut down before cloning.)
5.2 Modify the hostname and IP address, configure the mapping file, turn off the firewall, add cloud04 to the hadoop slaves file, set up passwordless login, and reboot.
(If you cloned the machine, configuring the mapping file and turning off the firewall are no longer needed, because the machine you cloned from was already configured.)
5.3 After the machine reboots, start the datanode and taskTracker on it.
Instructions: hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker
5.4 Run the refresh command on cloud01, the node where the namenode runs.
Instruction: hadoop dfsadmin -refreshNodes
5.5 Verification
Open http://<namenode ip>:50070 (the hdfs management interface)
and check whether one more node appears; if it does, the addition is done!
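Note that HDFS does not automatically move existing blocks onto the new datanode; if you want to even out the data, hadoop ships a balancer script that can be started from any node (an optional extra, not part of the original steps):

    start-balancer.sh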
6. Delete a node (recorded here for future reference)
6.1 Modify the /soft/hadoop-1.1.2/conf/hdfs-site.xml file on cloud01
and add the configuration:
<property>
    <name>dfs.hosts.exclude</name>
    <value>/soft/hadoop-1.1.2/conf/excludes</value>
</property>
6.2 Determine the machines to take offline
The file named by dfs.hosts.exclude lists the machines to be decommissioned, one per line.
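For example, if cloud04 were the machine being taken offline (a hypothetical choice, just for illustration), the excludes file would contain a single line:

    cloud04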
6.3 Force a configuration reload
Instruction: hadoop dfsadmin -refreshNodes
6.4 Watch the node shut down
Instruction: hadoop dfsadmin -report
This shows the nodes currently connected to the cluster.
While decommissioning is in progress, it shows:
Decommission Status: Decommission in progress
When it has completed, it shows:
Decommission Status: Decommissioned
6.5 Edit the excludes file again
Once the machines are decommissioned, they can be removed from the excludes file.
Log in to the decommissioned machine and you will find that the DataNode process is gone, but the TaskTracker is still running
and has to be stopped by hand.
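A sketch of that manual cleanup, run on the decommissioned machine itself:

    hadoop-daemon.sh stop tasktracker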
This is the end of the article on how to build a fully distributed hadoop cluster. Hopefully the content above is of some help and lets you learn something new; if you think the article is good, please share it so more people can see it.