Install a Hadoop Cluster (Multi-Node)


Configure the environment

This document sets up a Hadoop cluster with one master acting as the NameNode and one slave acting as a DataNode:

(1) master:

OS: CentOS release 6.5 (Final)

IP: 172.16.101.58

User: root

hadoop-2.9.0.tar.gz

(2) slave:

OS: CentOS release 6.5 (Final)

IP: 172.16.101.59

User: root

hadoop-2.9.0.tar.gz
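You may also want to map these IPs to hostnames in /etc/hosts on both nodes; the scp step later in this document addresses the slave by hostname (sht-sgmhadoopdn-02), so that name must resolve. A sketch, using the hostnames that appear in the shell prompts throughout this document (substitute your own if they differ):

[root@sht-sgmhadoopdn-01 ~]# cat >> /etc/hosts <<EOF
172.16.101.58 sht-sgmhadoopdn-01
172.16.101.59 sht-sgmhadoopdn-02
EOF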

Prerequisites

(1) Both master and slave have a Java environment installed and the corresponding environment variables configured.

(2) The master node has unpacked hadoop-2.9.0.tar.gz and configured the Hadoop environment variables.

(3) This document performs the installation as root, so root on the master must be able to log in to the slave nodes via passwordless SSH (see the sketch below).
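A minimal sketch of these prerequisites, assuming the Hadoop path used throughout this document; the JAVA_HOME value is a placeholder for wherever your JDK is actually installed:

# Append to /etc/profile on both nodes, then reload with: source /etc/profile
export JAVA_HOME=/usr/local/jdk1.8.0   # placeholder; point at your actual JDK
export HADOOP_HOME=/usr/local/hadoop-2.9.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

# On the master, set up passwordless SSH to the slave as root:
[root@sht-sgmhadoopdn-01 ~]# ssh-keygen -t rsa
[root@sht-sgmhadoopdn-01 ~]# ssh-copy-id root@172.16.101.59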

Configure cluster files

Execute on the master node (this document edits the configuration files on the master node and then copies them to the slave nodes via scp).

(1) The slaves file: list the hostname or IP of each DataNode in this file, one per line. It defaults to localhost, which is why a pseudo-distributed node acts as both NameNode and DataNode.

[root@sht-sgmhadoopdn-01 hadoop]# cat slaves

172.16.101.59

(2) File core-site.xml

[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.16.101.58:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-2.9.0/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
</configuration>

(3) File hdfs-site.xml

[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>172.16.101.58:50090</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop-2.9.0/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop-2.9.0/tmp/dfs/data</value>
  </property>
</configuration>

(4) File mapred-site.xml

[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>172.16.101.58:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>172.16.101.58:19888</value>
  </property>
</configuration>

(5) File yarn-site.xml

[root@sht-sgmhadoopdn-01 hadoop]# cat /usr/local/hadoop-2.9.0/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>172.16.101.58</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Once configured, copy the /usr/local/hadoop-2.9.0 directory on the master to each slave node. Because we previously ran in pseudo-distributed mode, it is recommended to delete the old temporary files before switching to cluster mode.

[root@sht-sgmhadoopdn-01 local]# rm -rf ./hadoop-2.9.0/tmp

[root@sht-sgmhadoopdn-01 local]# rm -rf ./hadoop-2.9.0/logs

# archive with a relative path so it unpacks straight into /usr/local on the slave
[root@sht-sgmhadoopdn-01 local]# tar -zcf hadoop-2.9.0.master.tar.gz ./hadoop-2.9.0

[root@sht-sgmhadoopdn-01 local]# scp hadoop-2.9.0.master.tar.gz sht-sgmhadoopdn-02:/usr/local/

Execute on the slave node:

[root@sht-sgmhadoopdn-02 local]# tar -zxf hadoop-2.9.0.master.tar.gz

Start the Hadoop cluster

Execute on the master node:

# formatting HDFS is required only on the first startup, not on subsequent starts

[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# hdfs namenode -format

[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# start-dfs.sh

[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# start-yarn.sh

[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# mr-jobhistory-daemon.sh start historyserver

[root@sht-sgmhadoopdn-01 hadoop-2.9.0]# jps

20289 JobHistoryServer

19730 ResourceManager

18934 NameNode

19163 SecondaryNameNode

20366 Jps

Execute on the slave node:

[root@sht-sgmhadoopdn-02 hadoop]# jps

32147 DataNode

535 Jps

32559 NodeManager

Execute on the master node:

[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfsadmin -report

Configured Capacity: 75831140352 (70.62 GB)

Present Capacity: 21246287872 (19.79 GB)

DFS Remaining: 21246263296 (19.79 GB)

DFS Used: 24576 (24 KB)

DFS Used%: 0.005%

Under replicated blocks: 0

Blocks with corrupt replicas: 0

Missing blocks: 0

Missing blocks (with replication factor 1): 0

Pending deletion blocks: 0

Live datanodes (1): # number of live DataNodes (surviving slaves)

Name: 172.16.101.59:50010 (sht-sgmhadoopdn-02)

Hostname: sht-sgmhadoopdn-02

Decommission Status: Normal

Configured Capacity: 75831140352 (70.62 GB)

DFS Used: 24576 (24 KB)

Non DFS Used: 50732867584 (47.25 GB)

DFS Remaining: 21246263296 (19.79 GB)

DFS Used%: 0.005%

DFS Remaining%: 28.02%

Configured Cache Capacity: 0 (0 B)

Cache Used: 0 (0 B)

Cache Remaining: 0 (0 B)

Cache Used%: 100.00%

Cache Remaining%: 0.00%

Xceivers: 1

Last contact: Wed Dec 27 11:08:46 CST 2017

Last Block Report: Wed Dec 27 11:02:01 CST 2017

Management console

NameNode - http://172.16.101.58:50070
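The same cluster status can also be pulled from the command line through the NameNode's JMX endpoint, assuming the default web port configured above (the bean name below is the standard one in Hadoop 2.x, but treat it as an illustration):

[root@sht-sgmhadoopdn-01 hadoop]# curl 'http://172.16.101.58:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState'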

Run a distributed MapReduce example job

[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -mkdir -p /user/root/input

[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -put /usr/local/hadoop-2.9.0/etc/hadoop/*.xml input

[root@sht-sgmhadoopdn-01 hadoop]# hadoop jar /usr/local/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.0.jar grep input output 'dfs[a-z.]+'

17-12-27 11:25:33 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... Using builtin-java classes where applicable

17-12-27 11:25:34 INFO client.RMProxy: Connecting to ResourceManager at /172.16.101.58:8032

17-12-27 11:25:36 INFO input.FileInputFormat: Total input files to process: 9

17-12-27 11:25:36 INFO mapreduce.JobSubmitter: number of splits:9

17-12-27 11:25:37 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled

17-12-27 11:25:37 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1514343869308_0001

17-12-27 11:25:38 INFO impl.YarnClientImpl: Submitted application application_1514343869308_0001

17-12-27 11:25:38 INFO mapreduce.Job: The url to track the job: http://sht-sgmhadoopdn-01:8088/proxy/application_1514343869308_0001/

17-12-27 11:25:38 INFO mapreduce.Job: Running job: job_1514343869308_0001

17-12-27 11:25:51 INFO mapreduce.Job: Job job_1514343869308_0001 running in uber mode: false

17-12-27 11:25:51 INFO mapreduce.Job: map 0% reduce 0%

17-12-27 11:26:14 INFO mapreduce.Job: map 11% reduce 0%

17-12-27 11:26:15 INFO mapreduce.Job: map 67% reduce 0%

17-12-27 11:26:29 INFO mapreduce.Job: map 100% reduce 0%

17-12-27 11:26:32 INFO mapreduce.Job: map 100% reduce 100%

17-12-27 11:26:34 INFO mapreduce.Job: Job job_1514343869308_0001 completed successfully

17-12-27 11:26:34 INFO mapreduce.Job: Counters: 50

...

[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -cat output/*

17-12-27 11:30:08 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... Using builtin-java classes where applicable

1 dfsadmin

1 dfs.replication

1 dfs.namenode.secondary.http

1 dfs.namenode.name.dir

1 dfs.datanode.data.dir
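To inspect the results locally rather than through hdfs dfs -cat, one option is to copy the output directory out of HDFS first (paths as used above):

[root@sht-sgmhadoopdn-01 hadoop]# hdfs dfs -get output ./output
[root@sht-sgmhadoopdn-01 hadoop]# cat ./output/*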

You can also visit the console through a browser to view detailed job information:

ResourceManager - http://172.16.101.58:8088

Stop the Hadoop cluster

Execute on the master node:

[root@sht-sgmhadoopdn-01 hadoop]# stop-yarn.sh

[root@sht-sgmhadoopdn-01 hadoop]# stop-dfs.sh

[root@sht-sgmhadoopdn-01 hadoop]# mr-jobhistory-daemon.sh stop historyserver
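If you later need to re-format HDFS, a clean sequence, following the earlier advice about stale temporary files, is to stop all daemons and remove tmp and logs on every node before formatting again (a sketch, using the paths from this document):

# on the master and on every slave, after stopping all daemons:
rm -rf /usr/local/hadoop-2.9.0/tmp /usr/local/hadoop-2.9.0/logs

# then on the master only:
hdfs namenode -format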

Reference links:

http://www.powerxing.com/install-hadoop-cluster/

http://hadoop.apache.org/docs/r2.9.0/hadoop-project-dist/hadoop-common/ClusterSetup.html
