1. Preface
"the shift of things in big clouds" was a hot topic at that time, referring to big data, cloud computing, the Internet of things and the mobile Internet, among which big data talked much about Hadoop. Of course, Hadoop does not represent big data, but big data deals with the field of a more famous open source framework just, usually said big data includes big data's storage, big data's analytical processing and big data's query display, this article mentioned Hadoop is only in which big data's analysis and processing link plays a role, Apache provides an open source family bucket, including Hadoop, HBase, Zookeeper, Spark, Hive and Pig and some other frameworks. However, due to space limitations, this article only covers the pseudo-distributed deployment of Hadoop, including MapReduce and HDFS.
2. Preparation
JDK file: jdk-8u131-linux-x64.tar.gz
Official download address: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Hadoop file: hadoop-2.9.0.tar.gz
Official download address: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.9.0/hadoop-2.9.0.tar.gz
3. Installation
3.1 Install Oracle JDK
3.1.1 Uninstall OpenJDK
Although OpenJDK is an open source implementation of the JDK, I still prefer Oracle JDK, perhaps because it feels more "orthodox". So first check whether OpenJDK is already installed on the server by running:
rpm -qa | grep jdk
If the server already has OpenJDK installed, uninstall it before proceeding; a typical approach is sketched below.
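A minimal sketch of the removal, assuming the query reports the usual java-1.8.0-openjdk packages (substitute whatever package names rpm actually returns on your server):

rpm -qa | grep -i openjdk
yum remove -y java-1.8.0-openjdk java-1.8.0-openjdk-headless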
3.1.2 Install Oracle JDK
Some people may feel a little uneasy about Oracle JDK; at least I do. Sun created Java but was later acquired by Oracle after running into financial trouble, so what used to be called the Sun JDK is now the Oracle JDK. Download the JDK package mentioned above to the /root directory.
Install as follows:
cd /root
tar -zxf /root/jdk-8u131-linux-x64.tar.gz -C /usr/local
The JDK is now installed in the /usr/local/jdk1.8.0_131 directory.
Next, set the environment variables. Linux distinguishes between interactive and non-interactive shells. An interactive shell interacts with a user, for example the shell you get after logging in over SSH, which waits for your input and confirmation; a non-interactive shell runs without user intervention, such as the shell used when a service starts. On this system, interactive login shells read the system-wide environment settings from /etc/profile, while non-interactive shells read them from /etc/bashrc. A script may therefore run fine in an interactive session but fail in a non-interactive one because the environment variables it needs are not set.
To be safe, we configure the Java-related environment variables in both /etc/profile and /etc/bashrc by appending the following lines to the end of each file:
export JAVA_HOME=/usr/local/jdk1.8.0_131
export JRE_HOME=/usr/local/jdk1.8.0_131
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
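If you prefer to append these settings from the command line instead of editing both files by hand, a minimal sketch (run as root; it simply appends the same four export lines shown above to each file):

for f in /etc/profile /etc/bashrc; do
  cat >> "$f" <<'EOF'
export JAVA_HOME=/usr/local/jdk1.8.0_131
export JRE_HOME=/usr/local/jdk1.8.0_131
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
EOF
done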
You can check the end of the modified files with tail:
[root@hadoop ~]# tail /etc/profile -n 5
export JAVA_HOME=/usr/local/jdk1.8.0_131
export JRE_HOME=/usr/local/jdk1.8.0_131
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
[root@hadoop ~]# tail /etc/bashrc -n 5
export JAVA_HOME=/usr/local/jdk1.8.0_131
export JRE_HOME=/usr/local/jdk1.8.0_131
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin
[root@hadoop ~]# source /etc/profile
[root@hadoop ~]# source /etc/bashrc
[root@hadoop ~]# java -version
java version "1.8.0_131"
Java(TM) SE Runtime Environment (build 1.8.0_131-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.131-b11, mixed mode)
[root@hadoop ~]#
Note that for the configuration to take effect in the current session, run source /etc/profile and source /etc/bashrc so the updated settings are loaded immediately.
3.2 Install Hadoop
Download the Hadoop installation package mentioned earlier to the /root directory.
Install Hadoop as follows:
[root@hadoop ~]# cd /root
[root@hadoop ~]# tar -zxf /root/hadoop-2.9.0.tar.gz -C /usr/local
Hadoop 2.9.0 is now installed in the /usr/local/hadoop-2.9.0 directory, which we can verify with the following command:
[root@hadoop ~]# /usr/local/hadoop-2.9.0/bin/hadoop version
Hadoop 2.9.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
Compiled by arsuresh on 2017-11-13T23:15Z
Compiled with protoc 2.5.0
From source with checksum 0a76a9a32a5257331741f8d5932f183
This command was run using /usr/local/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar
[root@hadoop ~]#
Running hadoop with such a long path every time is inconvenient, especially when typing commands by hand. Just as we did for the Java environment variables, we can configure the Hadoop-related variables in /etc/profile and /etc/bashrc by appending the following lines to the end of both files:
export HADOOP_HOME=/usr/local/hadoop-2.9.0
export HADOOP_INSTALL=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
After saving the changes, don't forget to run the following commands to refresh the environment variables:
[root@hadoop ~]# source /etc/profile
[root@hadoop ~]# source /etc/bashrc
Hadoop commands can now be run without specifying the full path:
[root@hadoop ~]# hadoop version
Hadoop 2.9.0
Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r 756ebc8394e473ac25feac05fa493f6d612e6c50
Compiled by arsuresh on 2017-11-13T23:15Z
Compiled with protoc 2.5.0
From source with checksum 0a76a9a32a5257331741f8d5932f183
This command was run using /usr/local/hadoop-2.9.0/share/hadoop/common/hadoop-common-2.9.0.jar
4. Configuration
4.1 User configuration
For ease of management and maintenance, we create a dedicated account named hadoop to run Hadoop-related scripts and tasks. It is created with the following command:
useradd hadoop -s /bin/bash -m
The above command creates a user and group both named hadoop, with /bin/bash as the login shell (-s /bin/bash) and a home directory created automatically (-m). The result can be checked with the id command:
[root@hadoop ~]# id hadoop
uid=1001(hadoop) gid=1001(hadoop) groups=1001(hadoop)
The user created by the command above has no password yet; set one with passwd:
[root@hadoop ~]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: The password is shorter than 8 characters
Retype new password:
passwd: all authentication tokens updated successfully.
Because the hadoop user sometimes needs to run privileged commands, grant it sudo permission: open the /etc/sudoers file, find the line "root ALL=(ALL) ALL", and add a line below it:
hadoop ALL=(ALL) ALL
Then save the file (remember that if you are editing with an editor such as vim, you must use ":wq!" to force-save this read-only file).
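As an alternative to editing /etc/sudoers directly, a drop-in file under /etc/sudoers.d achieves the same thing; a minimal sketch, run as root (visudo -cf checks the file's syntax):

echo 'hadoop ALL=(ALL) ALL' > /etc/sudoers.d/hadoop
chmod 440 /etc/sudoers.d/hadoop
visudo -cf /etc/sudoers.d/hadoop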
4.2 Passwordless SSH configuration
Although this article covers a pseudo-distributed deployment of Hadoop, some distributed-style operations still take place, which requires the ability to log in via ssh. Note that ssh here is not the SSH (Struts+Spring+Hibernate) of the Java world; it stands for Secure Shell, a protocol for remote login between Linux servers.
If you are currently logged in as root, switch to hadoop:
[root@hadoop hadoop]# su hadoop
[hadoop@hadoop]$ cd ~
[hadoop@hadoop ~]$ pwd
/home/hadoop
You can see that the hadoop user's home directory is /home/hadoop. Now log in to this machine with ssh. On the first login you will be asked whether to continue connecting; type "yes", then enter the password of the user you are logging in as (the target host here is localhost). After entering the correct password you are logged in; type "exit" to log out, as shown below:
[hadoop@hadoop ~]$ ssh localhost
hadoop@localhost's password:
Last login: Sat Dec  2 11:48:52 2017 from localhost
[hadoop@hadoop ~]$ rm -rf /home/hadoop/.ssh
[hadoop@hadoop ~]$ ssh localhost
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is aa:21:ce:7a:b2:06:3e:ff:3f:3e:cc:dd:40:38:64:9d.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
hadoop@localhost's password:
Last login: Sat Dec  2 11:49:58 2017 from localhost
[hadoop@hadoop ~]$ exit
logout
Connection to localhost closed.
After doing this, the /home/hadoop/.ssh directory and the known_hosts file inside it have been created.
Logging in this way prompts for a password every time, but while Hadoop is running it executes commands on remote machines through non-interactive shells, so passwordless login must be configured. Generate a key pair with the following command (just press Enter at every prompt):
[hadoop@hadoop ~]$ cd /home/hadoop/.ssh/
[hadoop@hadoop .ssh]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
19:b3:11:a5:6b:a3:26:03:c9:b9:b3:b8:02:ea:c9:25 hadoop@hadoop
The key's randomart image is:
(randomart image omitted)
Then append the public key to the authorized_keys file and set its permissions to 600:
[hadoop@hadoop .ssh]$ cat id_rsa.pub >> authorized_keys
[hadoop@hadoop .ssh]$ chmod 600 authorized_keys
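For reference, the same key generation and authorization can be done more compactly; a minimal sketch, run as the hadoop user (ssh-copy-id appends the public key to authorized_keys on the target host, asking for the password one last time):

ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@localhost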
At this point, ssh localhost no longer asks for a password:
[hadoop@hadoop .ssh]$ ssh localhost
Last login: Sat Dec  2 11:50:44 2017 from localhost
[hadoop@hadoop ~]$ exit
logout
Connection to localhost closed.
Note: a similar procedure was described in part 9 of this series for passwordless login by the git user, where it was also mentioned that git transfers files over the ssh protocol.
4.3 Hadoop configuration
4.3.1 Change the owner of the Hadoop installation directory
First check whether the owner and group of the Hadoop installation directory /usr/local/hadoop-2.9.0 are hadoop. If not, change them with chown:
[hadoop@hadoop .ssh]$ ls -lh /usr/local/hadoop-2.9.0
total 128K
drwxr-xr-x. 2 root root  194 Nov 14 07:28 bin
drwxr-xr-x. 3 root root   20 Nov 14 07:28 etc
drwxr-xr-x. 2 root root  106 Nov 14 07:28 include
drwxr-xr-x. 3 root root   20 Nov 14 07:28 lib
drwxr-xr-x. 2 root root  239 Nov 14 07:28 libexec
-rw-r--r--. 1 root root 104K Nov 14 07:28 LICENSE.txt
-rw-r--r--. 1 root root  16K Nov 14 07:28 NOTICE.txt
-rw-r--r--. 1 root root 1.4K Nov 14 07:28 README.txt
drwxr-xr-x. 3 root root 4.0K Nov 14 07:28 sbin
drwxr-xr-x. 4 root root   31 Nov 14 07:28 share
Here are the commands to change owner and group:
[hadoop@hadoop .ssh]$ sudo chown -R hadoop:hadoop /usr/local/hadoop-2.9.0

We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:

    #1) Respect the privacy of others.
    #2) Think before you type.
    #3) With great power comes great responsibility.

[sudo] password for hadoop:
Checking again shows that the command succeeded:
[hadoop@hadoop .ssh]$ ls -lh /usr/local/hadoop-2.9.0
total 128K
drwxr-xr-x. 2 hadoop hadoop  194 Nov 14 07:28 bin
drwxr-xr-x. 3 hadoop hadoop   20 Nov 14 07:28 etc
drwxr-xr-x. 2 hadoop hadoop  106 Nov 14 07:28 include
drwxr-xr-x. 3 hadoop hadoop   20 Nov 14 07:28 lib
drwxr-xr-x. 2 hadoop hadoop  239 Nov 14 07:28 libexec
-rw-r--r--. 1 hadoop hadoop 104K Nov 14 07:28 LICENSE.txt
-rw-r--r--. 1 hadoop hadoop  16K Nov 14 07:28 NOTICE.txt
-rw-r--r--. 1 hadoop hadoop 1.4K Nov 14 07:28 README.txt
drwxr-xr-x. 3 hadoop hadoop 4.0K Nov 14 07:28 sbin
drwxr-xr-x. 4 hadoop hadoop   31 Nov 14 07:28 share
4.3.2 Change the Hadoop configuration
The Hadoop configuration files are stored in the /usr/local/hadoop-2.9.0/etc/hadoop directory; the main ones are:
core-site.xml
hdfs-site.xml
mapred-site.xml
yarn-site.xml
Among them, the latter two are mainly related to YARN configuration.
Change core-site.xml to the following:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
Then change hdfs-site.xml to the following:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
5. Verify the configuration
5.1 NameNode formatting
With the above configuration in place, Hadoop is configured but cannot be used yet; the file system still needs to be initialized. Because the Hadoop environment variables have already been configured, we can run the following command directly:
hdfs namenode -format
If there is no problem, the output will include a line such as: "INFO common.Storage: Storage directory /tmp/hadoop-hadoop/dfs/name has been successfully formatted."
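As a further check, the newly formatted storage directory can be listed; a minimal sketch, assuming the default location under /tmp reported in the log line above:

ls /tmp/hadoop-hadoop/dfs/name/current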
5.2 Start the NameNode and DataNode daemons
Use the start-dfs.sh command to start the NameNode and DataNode daemons. The first time it runs you will be asked whether to continue connecting; type "yes" (passwordless ssh login has already been configured). Be sure to run it as the hadoop user created earlier; if you are logged in as another user, switch first with su hadoop:
[hadoop@hadoop hadoop]$ start-dfs.sh
17-12-02 13:54:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop-2.9.0/logs/hadoop-hadoop-namenode-hadoop.out
localhost: starting datanode, logging to /usr/local/hadoop-2.9.0/logs/hadoop-hadoop-datanode-hadoop.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is aa:21:ce:7a:b2:06:3e:ff:3f:3e:cc:dd:40:38:64:9d.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /usr/local/hadoop-2.9.0/logs/hadoop-hadoop-secondarynamenode-hadoop.out
17-12-02 13:54:42 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
We can then check the startup status with jps, a tool shipped with the JDK:
[hadoop@hadoop hadoop]$ jps
11441 Jps
11203 SecondaryNameNode
10903 NameNode
11004 DataNode
If startup succeeded, the processes above will be present. If the NameNode or DataNode process is missing, re-check the configuration, or inspect the logs under /usr/local/hadoop-2.9.0/logs to find the error.
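A quick way to inspect the most recent log entries, assuming the default log file names of the form hadoop-<user>-<daemon>-<hostname>.log used in this setup:

tail -n 50 /usr/local/hadoop-2.9.0/logs/hadoop-hadoop-namenode-hadoop.log
tail -n 50 /usr/local/hadoop-2.9.0/logs/hadoop-hadoop-datanode-hadoop.log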
At this point, you can open http://localhost:50070/ in a browser to view information about the NameNode, the DataNode, and HDFS.
If the virtual machine uses bridged networking, the page can also be viewed from outside the virtual machine. On the CentOS 7 system used here, two things need attention (a sketch of the corresponding commands follows the list):
1. Change "SELINUX=enforcing" to "SELINUX=disabled" in the /etc/sysconfig/selinux file.
2. Disable the firewall by running systemctl disable firewalld.
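A minimal sketch of both steps on CentOS 7 (the SELinux change in the config file only takes full effect after a reboot; setenforce 0 switches to permissive mode for the current session):

sudo sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/sysconfig/selinux
sudo setenforce 0
sudo systemctl stop firewalld
sudo systemctl disable firewalld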
5.3 Execute the WordCount program
WordCount is to Hadoop what the Hello World program is to other programming languages: a simple program that demonstrates how programs are written and run.
5.3.1 Introduction to HDFS
Running WordCount requires HDFS, which is one of the cornerstones of Hadoop. Hadoop has to handle very large data files, and those files need to be stored reliably so that data is not lost even when the disks of some machines fail. With relatively small data volumes a disk array (RAID, Redundant Array of Independent Disks) can do this; HDFS implements the same capability in software.
HDFS also provides commands for operating on its file system. Linux itself offers file system commands such as mkdir, rm, and mv, and HDFS provides the same operations, but the way they are invoked changes slightly: mkdir in HDFS is written as hdfs dfs -mkdir, and likewise ls becomes hdfs dfs -ls.
Here are some HDFS commands:
Create an HDFS directory recursively: hdfs dfs -mkdir -p /user/hadoop
List an HDFS directory: hdfs dfs -ls /user
Create an HDFS directory: hdfs dfs -mkdir /input
List an HDFS directory: hdfs dfs -ls /
Delete an HDFS directory: hdfs dfs -rm -r -f /input
Delete an HDFS directory: hdfs dfs -rm -r -f /user/hadoop
Create an HDFS directory recursively: hdfs dfs -mkdir -p /user/hadoop/input
Note: directories created in HDFS are visible only inside HDFS and cannot be seen outside it (for example, from the ordinary Linux command line). To repeat an important point: make sure the Hadoop installation path has been added to the environment variables as described in section 3.2.
5.3.2 Execute the WordCount program
First, change the working directory to the Hadoop installation directory /usr/local/hadoop-2.9.0.
Then create a directory in HDFS: hdfs dfs -mkdir -p /user/hadoop/input
Next, prepare the data to analyze by putting some text files into the input directory just created in HDFS. For simplicity, copy the xml configuration files from the Hadoop installation directory into it:
hdfs dfs -put /usr/local/hadoop-2.9.0/etc/hadoop/*.xml /user/hadoop/input
The files can now be listed through HDFS:
[hadoop@hadoop ~]$ hdfs dfs -ls /user/hadoop/input
17-12-17 10:27:27 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 8 items
-rw-r--r--   1 hadoop supergroup       7861 2017-12-17 10:26 /user/hadoop/input/capacity-scheduler.xml
-rw-r--r--   1 hadoop supergroup        884 2017-12-17 10:26 /user/hadoop/input/core-site.xml
-rw-r--r--   1 hadoop supergroup      10206 2017-12-17 10:26 /user/hadoop/input/hadoop-policy.xml
-rw-r--r--   1 hadoop supergroup        867 2017-12-17 10:26 /user/hadoop/input/hdfs-site.xml
-rw-r--r--   1 hadoop supergroup        620 2017-12-17 10:26 /user/hadoop/input/httpfs-site.xml
-rw-r--r--   1 hadoop supergroup       3518 2017-12-17 10:26 /user/hadoop/input/kms-acls.xml
-rw-r--r--   1 hadoop supergroup       5939 2017-12-17 10:26 /user/hadoop/input/kms-site.xml
-rw-r--r--   1 hadoop supergroup        690 2017-12-17 10:26 /user/hadoop/input/yarn-site.xml
Of course, you can also browse the files in the web interface provided by Hadoop: open http://localhost:50070/explorer.html in a browser and enter the HDFS file path.
Then execute the MapReduce job with the following command:
hadoop jar /usr/local/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar grep input output 'dfs[a-z.]+'
This job finds the words beginning with dfs in the files under the HDFS directory /user/hadoop/input/ and counts their occurrences (the relative paths input and output resolve to /user/hadoop/input and /user/hadoop/output). If the program runs without errors, two files appear in the HDFS directory /user/hadoop/output/:
hdfs dfs -ls /user/hadoop/output
17-12-17 10:41:28 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2017-12-17 10:36 /user/hadoop/output/_SUCCESS
-rw-r--r--   1 hadoop supergroup         29 2017-12-17 10:36 /user/hadoop/output/part-r-00000
Let's view it in HDFS with the following command:
hdfs dfs -cat /user/hadoop/output/*
17-12-17 10:42:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
1       dfsadmin
1       dfs.replication
You can also copy the results out of HDFS to the local file system:
[hadoop@hadoop ~]$ hdfs dfs -get /user/hadoop/output /home/hadoop/output
This command copies everything under the HDFS path /user/hadoop/output to the local directory /home/hadoop/output, where the files can be viewed with the familiar Linux commands.
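For example, to print the result file copied to the local directory (part-r-00000 is the reducer output produced by the job above):

cat /home/hadoop/output/part-r-00000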
Note:
1. The HDFS directory /user/hadoop/output must not exist when the job is launched, otherwise running it again reports an error; either specify a different output directory or delete the old one, for example with hdfs dfs -rm -f -r /user/hadoop/output.
2. To shut down Hadoop, run the stop-dfs.sh command.
3. Once the NameNode has been formatted successfully, the formatting step does not need to be repeated the next time Hadoop is started.
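Although the example above uses the grep job, the same examples jar also contains a real WordCount; a minimal sketch of running it, where /user/hadoop/output-wordcount is just a hypothetical output directory that must not exist beforehand:

hadoop jar /usr/local/hadoop-2.9.0/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar wordcount /user/hadoop/input /user/hadoop/output-wordcount
hdfs dfs -cat /user/hadoop/output-wordcount/part-r-00000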
5.4 Enable YARN mode
YARN, short for Yet Another Resource Negotiator, was split out of MapReduce and is responsible for resource management and job scheduling; MapReduce now runs on top of YARN, which brings better availability and scalability. Starting Hadoop with start-dfs.sh as above only starts HDFS; we can additionally start YARN and let it take over resource management and job scheduling.
To use YARN, first configure mapred-site.xml. This file does not exist by default in /usr/local/hadoop-2.9.0/etc/hadoop, but a template named mapred-site.xml.template does.
First, copy it to mapred-site.xml:
cp /usr/local/hadoop-2.9.0/etc/hadoop/mapred-site.xml.template /usr/local/hadoop-2.9.0/etc/hadoop/mapred-site.xml
Then modify the contents of the file as follows:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Also modify the contents of the yarn-site.xml file in the same directory as follows:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
</configuration>
At this point, you can start YARN through start-yarn.sh and stop YARN through stop-yarn.sh.
Start YARN:
Please confirm that start-dfs.sh has been executed correctly before executing the following command:
start-yarn.sh
To see job history in the web interface, also start the history server with the following command:
mr-jobhistory-daemon.sh start historyserver
At this point, you can check the startup status through jps:
[hadoop@hadoop ~]$ jps
7559 JobHistoryServer
8039 DataNode
8215 SecondaryNameNode
8519 NodeManager
8872 Jps
8394 ResourceManager
7902 NameNode
After YARN is started, the execution of jobs can be monitored in the web interface at http://localhost:8088/.
6. Summary
This article described how to deploy Hadoop on CentOS 7, including its supporting components and configuration, briefly introduced the commands and usage of HDFS as a distributed file system, and finally demonstrated how to run MapReduce programs through a simple example.
Disclaimer: this article was first published on my WeChat subscription account zhoujinqiaoIT, and will also be posted to my CSDN, 51CTO, and OSChina blogs; I will answer questions there.