How to build Hadoop 2.7.3 Cluster under CentOS 6.7


This article explains how to build a Hadoop 2.7.3 cluster under CentOS 6.7. It should serve as a useful reference, and I hope you gain a lot from reading it.

A Hadoop cluster can run in three modes: standalone mode, pseudo-distributed mode, and fully distributed mode. Here we set up the third, fully distributed mode, in which a true distributed system runs across multiple nodes.

1 Environment preparation

1.1 Configure DNS (hosts mapping)

Edit the hosts configuration file and add the IP-to-hostname mappings for the master and slave nodes:

# vim /etc/hosts
10.0.0.45 master
10.0.0.46 slave1
10.0.0.47 slave2

1.2 Turn off the firewall

# service iptables stop     // stop the service
# chkconfig iptables off    // disable it at boot

1.3 Configure password-free login

(1) On each node, enter the /root/.ssh directory and generate a key pair:

# ssh-keygen -t rsa    // run the command and press Enter at each prompt

(2) On the master node, copy the public key into the authorized_keys file:

[root@master .ssh]# cp id_rsa.pub authorized_keys

(3) Copy the public key generated on each slave node to the master node:

[root@slave1 .ssh]# scp id_rsa.pub master:/root/.ssh/id_rsa_slave1.pub
[root@slave2 .ssh]# scp id_rsa.pub master:/root/.ssh/id_rsa_slave2.pub

(4) On the master node, merge the slave nodes' public keys into authorized_keys:

[root@master .ssh]# cat id_rsa_slave1.pub >> authorized_keys
[root@master .ssh]# cat id_rsa_slave2.pub >> authorized_keys

(5) Copy the merged authorized_keys file from the master node to each slave node:

[root@master .ssh]# scp authorized_keys slave1:/root/.ssh
[root@master .ssh]# scp authorized_keys slave2:/root/.ssh

The configuration is now complete. Test ssh access from each node to the others; if you can log in without being prompted for a password, the configuration is successful.
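
For example (a minimal check using the hostnames configured above), the following should print the remote hostname without asking for a password:

[root@master ~]# ssh slave1 hostname    // expected output: slave1
[root@master ~]# ssh slave2 hostname    // expected output: slave2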

1.4 Configure the Java environment

First download the JDK and unpack it to the target directory (here /usr/java/jdk1.8.0_112), then set the environment variables:

# vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.8.0_112
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin

# source /etc/profile    // make the configuration take effect

Verify that the configuration is successful:

# java -version

If version information similar to the following appears, the Java environment is configured successfully:
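
As an illustration (the exact build numbers will vary with your JDK package), the output looks roughly like this:

java version "1.8.0_112"
Java(TM) SE Runtime Environment (build 1.8.0_112-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.112-b15, mixed mode)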

2 Deploy the Hadoop cluster

The process of installing and configuring Hadoop is basically the same on every node, so you can install Hadoop on each node, do the configuration once on the master node, and then copy the modified configuration files to each slave node with the scp command. The deployment process is described below.

2.1 Install Hadoop

Download the Hadoop installation package from http://archive.apache.org/dist/hadoop/common/hadoop-2.7.3/ and extract it (in this article it is placed under /home, matching the HADOOP_HOME set below):

# tar xvf hadoop-2.7.3.tar.gz

Configure environment variables:

# vim /etc/profile
export HADOOP_HOME=/home/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
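
After editing the profile, reload it and, as a quick check not in the original steps, confirm that the hadoop command is on the PATH:

# source /etc/profile
# hadoop version    // should report Hadoop 2.7.3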

2.2 Modify the configuration files

Enter the Hadoop configuration directory $HADOOP_HOME/etc/hadoop; you will see many configuration files. Configuring the cluster mainly involves modifying the following:

core-site.xml

hdfs-site.xml

yarn-site.xml

mapred-site.xml

slaves, hadoop-env.sh, yarn-env.sh

The specific configuration of each file is described below; adjust the values to your actual environment:

(1) core-site.xml

<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://master:9000</value></property>
  <property><name>hadoop.tmp.dir</name><value>/home/hadoop_tmp</value></property>
</configuration>

(2) hdfs-site.xml

<configuration>
  <property><name>dfs.permissions.enabled</name><value>false</value></property>
  <property><name>dfs.support.append</name><value>true</value></property>
  <property><name>dfs.replication</name><value>2</value></property>
  <property><name>dfs.datanode.data.dir</name><value>file:///home/dfs_data</value></property>
  <property><name>dfs.namenode.name.dir</name><value>file:///home/dfs_name</value></property>
  <property><name>dfs.namenode.rpc-address</name><value>master:9000</value></property>
  <property><name>dfs.namenode.secondary.http-address</name><value>slave1:50090</value></property>
  <property><name>dfs.namenode.secondary.https-address</name><value>slave1:50091</value></property>
  <property><name>dfs.webhdfs.enabled</name><value>true</value></property>
</configuration>

(3) yarn-site.xml

<configuration>
  <property><name>yarn.resourcemanager.hostname</name><value>master</value></property>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>20480</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>10240</value></property>
  <property><name>yarn.nodemanager.resource.cpu-vcores</name><value>5</value></property>
  <property><name>yarn.nodemanager.vmem-check-enabled</name><value>false</value></property>
</configuration>

(4) mapred-site.xml

<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
</configuration>
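
Note that in the stock 2.7.3 distribution this file usually does not exist yet; if that is the case on your system, create it from the shipped template first:

# cd $HADOOP_HOME/etc/hadoop
# cp mapred-site.xml.template mapred-site.xml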

(5) slaves

When the Hadoop cluster starts, this file is read to determine the hostnames of the slave nodes, on which daemons such as DataNode and NodeManager are then started, so the slave hostnames must be added to this file:

slave1
slave2

(6) hadoop-env.sh

Modify the following:

export JAVA_HOME=/usr/java/jdk1.8.0_112

(7) yarn-env.sh

Add the following:

export JAVA_HOME=/usr/java/jdk1.8.0_112

At this point, all configuration on the master node is complete; simply copy the configuration files to each slave node:

# scp /home/hadoop-2.7.3/etc/hadoop/* slave1:/home/hadoop-2.7.3/etc/hadoop/
# scp /home/hadoop-2.7.3/etc/hadoop/* slave2:/home/hadoop-2.7.3/etc/hadoop/

2.3 Start Hadoop

(1) HDFS must be formatted before it is started for the first time; run this on the master node:

# cd /home/hadoop-2.7.3
# ./bin/hadoop namenode -format

(2) start HDFS:

# ./sbin/start-dfs.sh

After a successful startup, visit http://master:50070/ to see the HDFS Web interface.
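
As an optional sanity check not covered in the original steps, you can run jps on each node to see which daemons came up; with the configuration above, roughly the following is expected:

[root@master ~]# jps    // NameNode
[root@slave1 ~]# jps    // DataNode, SecondaryNameNode (the secondary NameNode runs on slave1 per hdfs-site.xml)
[root@slave2 ~]# jps    // DataNode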

(3) start YARN:

# ./sbin/start-yarn.sh

After a successful startup, visit http://master:8088/ to see the YARN Web interface.
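
Similarly, as an optional check, you can list the NodeManagers registered with the ResourceManager; with the slaves file above, slave1 and slave2 should both appear:

# ./bin/yarn node -list    // expect two nodes in RUNNING state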

You can also start everything with a single command, although this is not recommended for the first startup:

# start-all.sh

At this point, the Hadoop cluster environment has been set up, and you can "play" happily on it according to your business needs.
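
To confirm the whole stack works end to end, a simple smoke test (assuming the examples jar shipped with the 2.7.3 tarball) is to run the bundled pi job:

# cd /home/hadoop-2.7.3
# ./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10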

Thank you for reading this article carefully. I hope "How to build Hadoop 2.7.3 Cluster under CentOS 6.7" has been helpful to you.
