This article explains in detail how to install hadoop-0.20.2 and how to use it for a simple job. It is shared here as a reference; I hope you will have a solid understanding of the relevant steps after reading it.
The installation steps are as follows:
1.1 Machine Description
Total 4 machines: sc706-26, sc706-27, sc706-28, sc706-29
IP addresses: 192.168.153.89, 192.168.153.90, 192.168.153.91, 192.168.153.92
Operating system: Fedora 12 (Linux)
JDK version: jdk-6u19-linux-i586
Hadoop version: hadoop-0.20.2
sc706-26 serves as the NameNode and JobTracker; the other three serve as DataNodes and TaskTrackers
1.2 Ping machines by machine name
Log in as root and modify the /etc/hosts file on the NameNode and on each DataNode, adding the IP addresses and machine names of all four machines, as follows:
192.168.153.89 sc706-26
192.168.153.90 sc706-27
192.168.153.91 sc706-28
192.168.153.92 sc706-29
After setting this up, verify that the machines can ping each other, using either the machine name or the IP address, e.g. ping sc706-27 or ping 192.168.153.90
1.3 New Hadoop User
Hadoop requires that the hadoop deployment directory structure be the same on all machines and that every machine has an account with the same username. My default path is /home/hadoop.
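The article does not show the exact commands for creating this account. As a hedged sketch (run as root on every machine; the -m flag and default home layout are assumptions, adjust to your own setup):
[root@sc706-26 ~]# useradd -m hadoop    # creates the hadoop user with home directory /home/hadoop
[root@sc706-26 ~]# passwd hadoop        # set the same password for the hadoop user on every machine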
1.4 SSH setup and turning off the firewall (requires root, via su -)
1) After Fedora is installed, the sshd service is started by default. If you are unsure, check it with [root@sc706-26 hadoop]# service sshd status
If it is not started, start it with [root@sc706-26 hadoop]# service sshd start
Create ssh passwordless login on NameNode [hadoop@sc706-26 ~]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Two files will be generated in ~/.ssh/: id_dsa and id_dsa.pub, which appear as a pair. Append the id_dsa.pub file to authorized_keys on each DataNode.
[hadoop@sc706-26 ~]$ scp ~/.ssh/id_dsa.pub sc706-27:/home/hadoop/ (note that there is no space between the colon after the target machine and the destination path, i.e. no space between sc706-27: and /home/hadoop/)
[hadoop@sc706-26 ~]$ scp ~/.ssh/id_dsa.pub sc706-28:/home/hadoop/
[hadoop@sc706-26 ~]$ scp ~/.ssh/id_dsa.pub sc706-29:/home/hadoop/
Log in to each DataNode and run [hadoop@sc706-27 ~]$ cat id_dsa.pub >> ~/.ssh/authorized_keys; do the same on the other two, and also append the key on the NameNode itself. Note: after appending, you must fix the permissions of .ssh and authorized_keys on the NameNode and on the DataNodes with the chmod command, using mode 755. When finished, test it, for example with ssh sc706-27: if you can log in without a password, SSH is set up successfully.
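As a consolidated sketch of the permission fix and the test described above (755 as the article suggests; run the chmod lines on the NameNode and on each DataNode):
[hadoop@sc706-27 ~]$ chmod 755 ~/.ssh
[hadoop@sc706-27 ~]$ chmod 755 ~/.ssh/authorized_keys
[hadoop@sc706-26 ~]$ ssh sc706-27    # should log in without prompting for a password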
2) Turn off the firewall (it must be turned off on both the NameNode and the DataNodes)
[root@sc706-26 ~]# service iptables stop
Note: the firewall must be turned off again every time before Hadoop is restarted, because this setting does not persist across reboots.
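If you prefer the firewall to stay off across reboots, one option on Fedora (my own addition, not one of the original steps) is to disable the iptables service at boot:
[root@sc706-26 ~]# chkconfig iptables off    # keep iptables from starting automatically after a reboot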
1.5 Install JDK 1.6 (the same on all machines)
Download jdk-6u19-linux-i586.bin from http://java.sun.com and install it: [root@sc706-26 java]# chmod +x jdk-6u19-linux-i586.bin followed by [root@sc706-26 java]# ./jdk-6u19-linux-i586.bin. My installation path is /usr/java/jdk1.6.0_19. After installation, add the following lines to /etc/profile:
export JAVA_HOME=/usr/java/jdk1.6.0_19
export JRE_HOME=/usr/java/jdk1.6.0_19/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
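After sourcing /etc/profile, a quick way to confirm the JDK is picked up (the exact version string is an assumption based on jdk1.6.0_19):
[root@sc706-26 ~]# source /etc/profile
[root@sc706-26 ~]# java -version      # should report java version "1.6.0_19"
[root@sc706-26 ~]# echo $JAVA_HOME    # should print /usr/java/jdk1.6.0_19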
1.6 Install hadoop
Download hadoop-0.20.2.tar.gz from http://apache.etoak.com//hadoop/core/
[hadoop@sc706-26 ~]$ tar xzvf hadoop-0.20.2.tar.gz
Add the hadoop installation path to /etc/profile:
export HADOOP_HOME=/home/hadoop/hadoop-0.20.2
export PATH=$HADOOP_HOME/bin:$PATH
To make /etc/profile take effect, source it: [hadoop@sc706-26 ~]$ source /etc/profile
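To confirm the PATH change took effect, a quick check (the exact output wording may differ slightly):
[hadoop@sc706-26 ~]$ hadoop version    # should report Hadoop 0.20.2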
1.7 Configure hadoop
Hadoop's configuration files are in the conf/ directory
1) Configure Java environment
[hadoop@sc706-26 ~]$ vim hadoop-0.20.2/conf/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.6.0_19
2) Configure conf/core-site.xml, conf/hdfs-site.xml, conf/mapred-site.xml files
[hadoop@sc706-26 ~]$ vim hadoop-0.20.2/conf/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/tmp</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://sc706-26:9000</value>
  </property>
</configuration>
[hadoop@sc706-26 ~]$ vim hadoop-0.20.2/conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>hdfs://sc706-26:9001</value>
  </property>
</configuration>
Note: I am not entirely sure whether the hdfs:// prefix is needed in front of sc706-26; I have two clusters, and one of them works without it.
[hadoop@sc706-26 ~]$ vim hadoop-0.20.2/conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>
Note: if dfs.replication is set to 1, there is only one copy of the data, so if one of the DataNodes has a problem, the whole job will fail.
3) Copy the complete hadoop directory from the NameNode to each DataNode. You can compress it first and then scp it directly, or copy it with a disk.
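As a hedged sketch of the scp variant (run from the NameNode as the hadoop user; it assumes the same /home/hadoop layout on every machine):
[hadoop@sc706-26 ~]$ scp -r /home/hadoop/hadoop-0.20.2 sc706-27:/home/hadoop/
[hadoop@sc706-26 ~]$ scp -r /home/hadoop/hadoop-0.20.2 sc706-28:/home/hadoop/
[hadoop@sc706-26 ~]$ scp -r /home/hadoop/hadoop-0.20.2 sc706-29:/home/hadoop/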
4) Configure conf/masters and conf/slaves on NameNode
conf/masters:
192.168.153.89
conf/slaves:
192.168.153.90
192.168.153.91
192.168.153.92
1.8 Running Hadoop
1) Format file system
[hadoop@sc706-26 hadoop-0.20.2]$ hadoop namenode -format
Note: be careful that the NameNode's namespace ID does not become inconsistent with the DataNodes' namespace ID. Each format regenerates the name, data and tmp directories and records a new ID, so formatting repeatedly can leave the nodes with different IDs and prevent Hadoop from running.
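If the IDs do get out of sync, one common remedy (my own hedged sketch, using the directories from the configuration above; note that it wipes all HDFS data) is to clear the name, data and tmp directories on every node and format again:
[hadoop@sc706-26 ~]$ rm -rf /home/hadoop/name /home/hadoop/data /home/hadoop/tmp    # repeat on every node
[hadoop@sc706-26 hadoop-0.20.2]$ hadoop namenode -format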
2) Start Hadoop
[hadoop@sc706-26 hadoop-0.20.2]$ bin/start-all.sh
3) Use the jps command to view the process. The results on NameNode are as follows:
25325 NameNode
25550 JobTracker
28210 Jps
25478 SecondaryNameNode
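On each DataNode, jps should show the DataNode and TaskTracker processes; a rough example (the PIDs below are only placeholders):
[hadoop@sc706-27 ~]$ jps
12345 DataNode
12456 TaskTracker
12678 Jps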
4) View cluster status
[hadoop@sc706-26 hadoop-0.20.2]$ hadoop dfsadmin -report
Make sure the correct number of DataNodes is running (mine is 3); from the report you can also see which DataNodes, if any, are not running.
5) View it through Hadoop's web interface
[hadoop@sc706-26 hadoop-0.20.2]$ links http://192.168.153.89:50070 (i.e. the master's NameNode web page)
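The JobTracker has its own web page as well, by default on port 50030 in this Hadoop version, so you can also check:
[hadoop@sc706-26 hadoop-0.20.2]$ links http://192.168.153.89:50030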
1.9 Run the WordCount.java example program
1) Create two files f1 and f2 on the local disk first
[hadoop@sc706-26 ~]$ echo "hello Hadoop goodbye hadoop" > f1
[hadoop@sc706-26 ~]$ echo "hello bye hadoop hadoop" > f2
2) Create an input directory on hdfs
[hadoop@sc706-26 ~]$ hadoop dfs -mkdir input
3) Copy f1 and f2 to the input directory of hdfs
[hadoop@sc706-26 ~]$ hadoop dfs -copyFromLocal /home/hadoop/f* input
4) Check whether there is an input directory on hdfs
[hadoop@sc706-26 ~]$ hadoop dfs -ls
5) Check whether f1 and f2 were copied successfully into the input directory
[hadoop@sc706-26 ~]$ hadoop dfs -ls input
6) Execute wordcount (make sure there is no existing output directory on hdfs)
[hadoop@sc706-26 hadoop-0.20.2]$ hadoop jar hadoop-0.20.2-examples.jar wordcount input output
7) Run complete, view results
[hadoop@sc706-26 hadoop-0.20.2]$ hadoop dfs -cat output/part-r-00000
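Given the contents of f1 and f2 above, the output should look roughly like the following (the example WordCount is case-sensitive, so Hadoop and hadoop are counted separately, and keys come out in sorted order):
Hadoop	1
bye	1
goodbye	1
hadoop	3
hello	2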
That concludes this walkthrough of how to install hadoop-0.20.2 and use it for a simple job. I hope the above content is of some help and that you can learn more from it. If you found the article useful, feel free to share it so that more people can see it.