

How to install a Hadoop-2.5.1 fully distributed cluster on CentOS 6.4

2025-01-18 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article explains how to install a Hadoop-2.5.1 fully distributed cluster on CentOS 6.4. It has some reference value, and interested readers can follow along. I hope you gain a lot from reading it.

Environment introduction:

Install a Hadoop-2.5.1 distributed cluster on two servers running CentOS 6.4 (32-bit) (2 machines, mainly for experimentation).

1. Modify the hostname and the /etc/hosts file

1) modify the hostname (optional)

vim /etc/sysconfig/network

HOSTNAME=XXX

It takes effect after reboot.

2) /etc/hosts is the file that maps IP addresses to hostnames, so that each machine knows the correspondence between them. The format is as follows:

# IPAddress HostName

192.168.1.67 MasterServer

192.168.1.241 SlaveServer

2. Configure password-free SSH login

1) generate the key:

ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

The -P argument above is two single quotation marks (an empty passphrase).

2) append id_dsa.pub (public key) to the authorized key:

cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys

3) copy the authentication file to other nodes:

scp ~/.ssh/authorized_keys hadooper@192.168.1.241:~/.ssh/

To confirm the connection for the first time, type yes.

If SSH still asks for a password, the permissions of .ssh and authorized_keys are probably incorrect. For more information, see: http://www.linuxidc.com/Linux/2014-10/107762.htm
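A common fix, sketched below on a scratch directory (the real target is ~/.ssh on each node): sshd refuses key-based login when the key directory or authorized_keys file is group- or world-writable, so tighten the modes to 700 and 600 respectively.

```shell
# Demonstration on a scratch directory; apply the same modes to ~/.ssh.
# sshd ignores authorized_keys when these files are too permissive.
mkdir -p /tmp/ssh_demo/.ssh
touch /tmp/ssh_demo/.ssh/authorized_keys
chmod 700 /tmp/ssh_demo/.ssh                   # only the owner may enter the dir
chmod 600 /tmp/ssh_demo/.ssh/authorized_keys   # only the owner may read/write
stat -c '%a' /tmp/ssh_demo/.ssh /tmp/ssh_demo/.ssh/authorized_keys
# prints: 700 then 600
```

After applying the same modes to the real ~/.ssh on both nodes, retry the passwordless login.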

3. Install jdk on each node

1) the selected version is jdk-6u27-linux-i586.bin; download address: http://pan.baidu.com/s/1dDGi5QL

2) upload it to the hadooper user directory and add execution permission

chmod 777 jdk-6u27-linux-i586.bin

3) installation

./jdk-6u27-linux-i586.bin

4) configure environment variables: vi /etc/profile and add the following three lines

# JAVA_HOME

export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27

export PATH=$JAVA_HOME/bin:$PATH

5) execute source /etc/profile to make the environment variable configuration take effect

6) execute java -version to check the jdk version and verify that the installation succeeded.

4. Hadoop installation

Hadoop is installed on each node. Upload hadoop-2.5.1.tar.gz to the user's hadooper directory.

1) decompression

tar -zvxf hadoop-2.5.1.tar.gz

2) add environment variables:

vim /etc/profile and add the following at the end:

export HADOOP_HOME=/home/hadooper/hadoop/hadoop-2.5.1

export HADOOP_COMMON_HOME=$HADOOP_HOME

export HADOOP_HDFS_HOME=$HADOOP_HOME

export HADOOP_MAPRED_HOME=$HADOOP_HOME

export HADOOP_YARN_HOME=$HADOOP_HOME

export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop

export CLASSPATH=.:$JAVA_HOME/lib:$HADOOP_HOME/lib:$CLASSPATH

export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

The setting takes effect immediately:

source /etc/profile
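As a quick sanity check after sourcing the profile, you can confirm how the PATH export composes. This is a sketch that recreates the composition locally; the variable values are the install locations assumed by this tutorial.

```shell
# Recreate the PATH composition from the profile additions above
# (values assumed from this tutorial's install locations).
JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
HADOOP_HOME=/home/hadooper/hadoop/hadoop-2.5.1
NEW_PATH="$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH"
# The first three entries should be the JDK bin and Hadoop bin/sbin dirs:
echo "$NEW_PATH" | tr ':' '\n' | head -n 3
```

If the first three entries look right, the hadoop and hdfs commands used below will resolve without full paths.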

3) modify Hadoop configuration file

(1) core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://MasterServer:9000</value>
  </property>
</configuration>

(2) hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

(3) mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>MasterServer:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>MasterServer:19888</value>
  </property>
</configuration>

jobhistory is a history server that comes with Hadoop to record MapReduce history jobs. It is not started by default; it can be started with the following command:

sbin/mr-jobhistory-daemon.sh start historyserver

(4) yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>MasterServer:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>MasterServer:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>MasterServer:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>MasterServer:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>MasterServer:8088</value>
  </property>
</configuration>

(5) slaves

SlaveServer

(6) add JAVA_HOME to hadoop-env.sh and yarn-env.sh respectively

export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27
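One way to script this edit is sketched below on a scratch file; the real hadoop-env.sh and yarn-env.sh live under $HADOOP_HOME/etc/hadoop, and the same append would be run against each of them on every node.

```shell
# Append the JAVA_HOME line to a scratch stand-in for hadoop-env.sh;
# repeat against the real hadoop-env.sh and yarn-env.sh on each node.
env_file=/tmp/hadoop-env-demo.sh
: > "$env_file"   # start from an empty scratch file for the demo
echo 'export JAVA_HOME=/usr/lib/jvm/jdk1.6/jdk1.6.0_27' >> "$env_file"
grep -c '^export JAVA_HOME=' "$env_file"   # expect exactly 1
```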

5. Run Hadoop

1) formatting

hdfs namenode -format

2) start Hadoop

start-dfs.sh

start-yarn.sh

You can also use one command:

start-all.sh

3) stop Hadoop

stop-all.sh

4) view the processes with jps

7692 ResourceManager

8428 JobHistoryServer

7348 NameNode

14874 Jps

7539 SecondaryNameNode

5) check the running status of the cluster through the browser

(1) http://192.168.1.67:50070

(2) http://192.168.1.67:8088/

(3) http://192.168.1.67:19888

6. Run the wordcount example that comes with Hadoop

1) create an input file:

echo "My first hadoop example. Hello Hadoop in input." > input
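To see what wordcount will actually count (a sketch using plain shell tools rather than Hadoop itself): the example splits on whitespace, is case-sensitive, and keeps punctuation attached to each token, which is why "example." and "hadoop" count separately from "Hadoop".

```shell
# Approximate wordcount's tokenization with shell tools:
# split on whitespace, then count identical tokens (case-sensitive).
echo "My first hadoop example. Hello Hadoop in input." \
  | tr ' ' '\n' | sort | uniq -c
```

Each of the eight distinct tokens appears once, matching the counts that the MapReduce job produces.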

2) set up a directory

hadoop fs -mkdir /user/hadooper

3) upload files

hadoop fs -put input /user/hadooper

4) execute wordcount program

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar wordcount /user/hadooper/input /user/hadooper/output

5) View the results

hadoop fs -cat /user/hadooper/output/part-r-00000

Hadoop 1

Hello 1

My 1

example. 1

first 1

hadoop 1

in 1

input. 1

Thank you for reading this article carefully. I hope "How to install a Hadoop-2.5.1 fully distributed cluster on CentOS 6.4" has been helpful to you. More related knowledge is waiting for you to learn!


© 2024 shulou.com SLNews company. All rights reserved.
