How to Build a Fully Distributed Hadoop 2.7.4 Cluster


This article describes how to build a fully distributed Hadoop 2.7.4 cluster. The editor thinks it is quite practical and shares it here for reference. Let's take a look.

Configuring Linux Environment

Configure the network for each VM (NAT networking mode)

Modify via the Linux graphical interface (desktop CentOS): open the Linux GUI -> right-click the two small computer icons at the top right -> click Edit Connections -> select the current network connection (System eth0) -> click the Edit button -> select IPv4 Settings -> set Method to Manual -> click the Add button -> enter IP: 192.168.1.101, Subnet mask: 255.255.255.0, Gateway: 192.168.1.1 -> Apply

Or modify the configuration file directly:

vi /etc/sysconfig/network-scripts/ifcfg-eth0

DEVICE="eth0"

BOOTPROTO="static" ###

HWADDR="00:0C:29:3C:BF:E7"

IPV6INIT="yes"

NM_CONTROLLED="yes"

ONBOOT="yes"

TYPE="Ethernet"

UUID="ce22eeca-ecde-4536-8cc2-ef0dc36d4a8c"

IPADDR="192.168.1.101" ###

NETMASK="255.255.255.0" ###

GATEWAY="192.168.1.1" ###
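After editing the file, restart the network service (CentOS 6) so the new address takes effect:

service network restart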

Modify the host name on each virtual machine

vi /etc/sysconfig/network

NETWORKING=yes

HOSTNAME=node-1
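On CentOS 6 the new host name takes effect after a reboot; to apply it to the current session right away, it can also be set directly on each node (a minimal sketch, using node-1 as the example):

hostname node-1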

Modify the mapping between host names and IP addresses

vi /etc/hosts

192.168.1.101 node-1

192.168.1.102 node-2

192.168.1.103 node-3
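Every node needs the same mappings. Assuming the file was edited on node-1, it can be copied to the other nodes (you will be prompted for the root password until SSH keys are set up below):

scp /etc/hosts node-2:/etc/hosts
scp /etc/hosts node-3:/etc/hosts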

Turn off the firewall

#View firewall status

service iptables status

#Turn off the firewall

service iptables stop

#Check whether the firewall starts at boot

chkconfig iptables --list

#Disable firewall startup at boot

chkconfig iptables off

Configure passwordless SSH login

#Generate an SSH key pair

ssh-keygen -t rsa (press Enter four times)

After executing this command, two files, id_rsa (private key) and id_rsa.pub (public key), will be generated.

Copy the public key to each target machine you want to log in to without a password

ssh-copy-id node-2

ssh-copy-id node-3
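Since node-1 itself is also listed in the slaves file later, it needs the key as well; afterwards, a quick check confirms that login no longer prompts for a password:

ssh-copy-id node-1
ssh node-2 hostname   # should print node-2 without asking for a password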

Synchronize cluster time

Manual time synchronization:

date -s "2018-03-03 03:03:03"

Or network synchronization:

yum install ntpdate

ntpdate cn.pool.ntp.org
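To keep the clocks from drifting apart again, ntpdate can also be scheduled with cron on each node; a minimal sketch (the 30-minute interval is an arbitrary choice):

crontab -e
# add a line such as:
*/30 * * * * /usr/sbin/ntpdate cn.pool.ntp.org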

Install JDK and configure environment variables

Upload the JDK

rz jdk-8u65-linux-x64.tar.gz

Extract the JDK

tar -zxvf jdk-8u65-linux-x64.tar.gz -C /root/apps

Add java to environment variables

vim /etc/profile

#Add at the end of the file

export JAVA_HOME=/root/apps/jdk1.8.0_65

export PATH=$PATH:$JAVA_HOME/bin

export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar

#Refresh configuration

source /etc/profile
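A quick check that the JDK and PATH are configured correctly:

java -version     # should report java version "1.8.0_65"
echo $JAVA_HOME   # should print /root/apps/jdk1.8.0_65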

Install Hadoop 2.7.4

Upload the Hadoop installation package to the server

hadoop-2.7.4-with-centos-6.7.tar.gz

Extract the installation package (into /root/apps, to match the HADOOP_HOME path configured below)

tar -zxvf hadoop-2.7.4-with-centos-6.7.tar.gz -C /root/apps

Note: Hadoop 2.x configuration file directory: $HADOOP_HOME/etc/hadoop

Configure Hadoop's core configuration files

Configuration file: hadoop-env.sh

vi hadoop-env.sh

export JAVA_HOME=/root/apps/jdk1.8.0_65

Configuration file: core-site.xml (the properties below go inside the <configuration> element)

<!-- Specify the file system schema (URI) used by Hadoop, i.e. the address of the HDFS NameNode -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://node-1:9000</value>
</property>

<!-- Specify the directory where Hadoop stores the files it generates at runtime, default /tmp/hadoop-${user.name} -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/root/apps/hadoop-2.7.4/tmp</value>
</property>

Configuration file: hdfs-site.xml

<property>
    <name>dfs.replication</name>
    <value>2</value>
</property>

<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>node-2:50090</value>
</property>

Configuration file: mapred-site.xml

mv mapred-site.xml.template mapred-site.xml

vi mapred-site.xml

<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

Configuration file: yarn-site.xml

<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>node-1</value>
</property>

<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

Configuration file: slaves, which lists the host names of the slave (worker) nodes

vi slaves

node-1

node-2

node-3
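The configured Hadoop directory and the JDK must exist on every node. Assuming the same /root/apps layout on node-2 and node-3, they can be pushed from node-1 with scp, for example:

ssh node-2 "mkdir -p /root/apps"
ssh node-3 "mkdir -p /root/apps"
scp -r /root/apps/hadoop-2.7.4 node-2:/root/apps/
scp -r /root/apps/hadoop-2.7.4 node-3:/root/apps/
scp -r /root/apps/jdk1.8.0_65 node-2:/root/apps/
scp -r /root/apps/jdk1.8.0_65 node-3:/root/apps/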

Add Hadoop to the environment variables

vim /etc/profile

export JAVA_HOME=/root/apps/jdk1.8.0_65

export HADOOP_HOME=/root/apps/hadoop-2.7.4

export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

source /etc/profile
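The same environment variables are needed on node-2 and node-3 as well; /etc/profile can simply be copied over and sourced there. A quick verification on each node:

scp /etc/profile node-2:/etc/profile
scp /etc/profile node-3:/etc/profile
hadoop version   # should report Hadoop 2.7.4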

Format the NameNode (this initializes HDFS; run it only once, on node-1)

hdfs namenode -format (hadoop namenode -format)

Start Hadoop and verify that it started successfully

Start HDFS first

sbin/start-dfs.sh

Then start YARN

sbin/start-yarn.sh

Use the jps command to verify

27408 NameNode

28218 Jps

27643 SecondaryNameNode

28066 NodeManager

27803 ResourceManager

27512 DataNode
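The worker nodes run a different set of daemons. A quick check over SSH (with the configuration above, node-2 and node-3 should show DataNode and NodeManager, and the SecondaryNameNode is expected on node-2 per hdfs-site.xml):

ssh node-2 jps
ssh node-3 jps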

http://192.168.1.101:50070 (HDFS management interface)

http://192.168.1.101:8088 (MR Management Interface)

Thank you for reading! This article on how to build a fully distributed Hadoop 2.7.4 cluster ends here. I hope the content above is helpful and lets everyone learn something new; if you think the article is good, share it so more people can see it!
