How to install hadoop in linux

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31 Report --

This article explains how to install Hadoop on Linux. The content is straightforward and easy to follow; work through the steps in order.

Linux Hadoop installation steps: 1. Install the ssh service; 2. Set up password-free ssh login; 3. Download the Hadoop installation package; 4. Decompress the installation package; 5. Configure the corresponding files in Hadoop.

Operating environment: Ubuntu 16.04, Hadoop 2.7.1, Dell G3 computer.

[Big Data] Installing Hadoop (2.7.1) on Linux: detailed walkthrough with a WordCount run

I. Introduction

After finishing the Storm environment configuration, I decided to set up Hadoop as well. There are many tutorials online, but none of them fit my situation exactly, so I ran into plenty of trouble during the installation and had to consult a lot of material before everything finally worked. Without further ado, let's get to the point.

The configuration environment of this machine is as follows:

Hadoop(2.7.1)

Ubuntu Linux(64-bit systems)

The configuration process is explained in several steps below.

II. Install the ssh service

In a shell, run the following command to check whether the ssh service is installed and to install it if it is not:

sudo apt-get install ssh openssh-server

The installation process is quick and painless.
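Before running apt, it can be worth checking whether OpenSSH is already present. The following is a minimal sketch, assuming the usual OpenSSH binary names (ssh, sshd) and the Ubuntu package names used in this article:

```shell
#!/bin/sh
# Hedged sketch: check whether the usual OpenSSH binaries are already
# present before installing. Binary and package names assume Ubuntu/apt,
# as used in this article; sshd normally lives in /usr/sbin.
status=""
for bin in ssh sshd; do
  if command -v "$bin" >/dev/null 2>&1 || [ -x "/usr/sbin/$bin" ]; then
    status="$status $bin=found"
  else
    status="$status $bin=missing(run: sudo apt-get install ssh openssh-server)"
  fi
done
echo "$status"
```

If both report found, the install step can be skipped entirely.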

III. Use ssh for password-free login

1. Create an ssh key. Here we use RSA, with the following command:

ssh-keygen -t rsa -P ""

2. A key fingerprint and a randomart image are printed; this is normal and can be ignored. Then append the public key to the list of authorized keys:

cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

3. You can then log in without entering a password:

ssh localhost

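Steps 1 and 2 can be combined into one script. The sketch below uses a throwaway directory instead of ~/.ssh so it cannot clobber any real keys; for the real setup, use ~/.ssh in place of $keydir:

```shell
#!/bin/sh
# Hedged sketch of steps 1-2: generate an RSA key pair and authorize it.
# A temp directory stands in for ~/.ssh so the demo is side-effect free.
keydir=$(mktemp -d)
chmod 700 "$keydir"                       # ssh requires a private .ssh dir
ssh-keygen -t rsa -P "" -f "$keydir/id_rsa" -q
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"       # ssh ignores group/world-writable files
keys=$(wc -l < "$keydir/authorized_keys")
echo "authorized keys: $keys"
```

The chmod calls matter: sshd refuses authorized_keys files with loose permissions, which is a common reason passwordless login silently keeps prompting for a password.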

IV. Download the Hadoop installation package

There are two ways to download Hadoop:

1. Download directly in a browser from a mirror: http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz

2. Use shell to download, command as follows:

wget http://mirrors.hust.edu.cn/apache/hadoop/core/stable/hadoop-2.7.1.tar.gz

The second method is usually more convenient. After a long wait, the download finally finishes.
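After a large download it is good practice to verify the archive against the checksum published next to it on the mirror before extracting. This is a sketch only: a dummy file stands in for the real tarball, and the "expected" value is computed from it rather than pasted from the mirror, so the comparison always succeeds here.

```shell
#!/bin/sh
# Hedged sketch: compare the SHA-256 of the downloaded tarball against
# the published checksum. The file and expected hash below are stand-ins;
# in practice, paste the checksum published alongside the archive.
tmp=$(mktemp -d)
printf 'stand-in archive bytes' > "$tmp/hadoop-2.7.1.tar.gz"
expected=$(sha256sum "$tmp/hadoop-2.7.1.tar.gz" | awk '{print $1}')
actual=$(sha256sum "$tmp/hadoop-2.7.1.tar.gz" | awk '{print $1}')
if [ "$expected" = "$actual" ]; then
  result="checksum OK - safe to extract"
else
  result="checksum MISMATCH - re-download"
fi
echo "$result"
```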

V. Decompress the Hadoop installation package

Extract the Hadoop installation package using the following command

tar -zxvf hadoop-2.7.1.tar.gz

After decompression, a hadoop-2.7.1 folder appears.
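For anyone unsure what the tar flags do, here is the same extract operation demonstrated on a tiny archive built in a temp directory, so it runs without the real tarball (-z gzip, -x extract, -f archive file; add -v for verbose output as above):

```shell
#!/bin/sh
# Sketch: build a tiny stand-in archive, then extract it with the same
# flags used for hadoop-2.7.1.tar.gz above.
tmp=$(mktemp -d); cd "$tmp" || exit 1
mkdir -p hadoop-2.7.1/etc/hadoop
echo '<configuration/>' > hadoop-2.7.1/etc/hadoop/core-site.xml
tar -zcf demo.tar.gz hadoop-2.7.1   # create the stand-in archive
rm -r hadoop-2.7.1
tar -zxf demo.tar.gz                # same flags as the real extraction
extracted=$(ls hadoop-2.7.1/etc/hadoop)
echo "$extracted"
```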

VI. Configure the corresponding files in Hadoop

The files that need to be configured are hadoop-env.sh, core-site.xml, mapred-site.xml.template, and hdfs-site.xml, all located under hadoop-2.7.1/etc/hadoop. The specific configurations are as follows:

1. core-site.xml is configured as follows:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/leesf/program/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

The path in hadoop.tmp.dir can be set according to your own preference.

2. mapred-site.xml.template is configured as follows (in Hadoop 2.x this template is usually copied to mapred-site.xml first, e.g. cp mapred-site.xml.template mapred-site.xml):

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>

3. hdfs-site.xml is configured as follows:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/leesf/program/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/leesf/program/hadoop/tmp/dfs/data</value>
  </property>
</configuration>

The paths for dfs.namenode.name.dir and dfs.datanode.data.dir can be chosen freely, preferably under the hadoop.tmp.dir directory.
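It can help to create the name and data directories ahead of time so their paths match the values in hdfs-site.xml exactly. In this sketch a temp directory stands in for /home/leesf/program/hadoop/tmp; substitute your own hadoop.tmp.dir:

```shell
#!/bin/sh
# Hedged sketch: pre-create the NameNode/DataNode directories referenced
# in hdfs-site.xml. The temp base is a stand-in for hadoop.tmp.dir.
base=$(mktemp -d)/hadoop/tmp
mkdir -p "$base/dfs/name" "$base/dfs/data"
created=$(ls "$base/dfs")
echo "$created"
```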

In addition, if Hadoop cannot find the JDK at runtime, set the JDK path directly in hadoop-env.sh, as follows:

export JAVA_HOME="/home/leesf/program/java/jdk1.8.0_60"
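Rather than hard-coding a version-specific path like jdk1.8.0_60, the value can be derived from whatever java binary is on the PATH. A sketch, assuming a JDK is installed and readlink -f is available (it is on Ubuntu):

```shell
#!/bin/sh
# Hedged sketch: derive the JAVA_HOME line for hadoop-env.sh from the
# java binary on the PATH, instead of hard-coding a versioned path.
if command -v java >/dev/null 2>&1; then
  java_bin=$(readlink -f "$(command -v java)")   # resolve symlinks
  line="export JAVA_HOME=${java_bin%/bin/java}"  # strip trailing /bin/java
else
  line="# java not found on PATH - install a JDK first"
fi
echo "$line"
```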

VII. Running Hadoop

After configuration is complete, run hadoop.

1. Initialize HDFS system

In the hadoop-2.7.1 directory, run:

bin/hdfs namenode -format

Formatting may ask you to confirm re-formatting an existing storage directory; type Y when prompted. When the command exits normally, initialization is complete.

2. Start the NameNode and DataNode daemons

Start them with the following command:

sbin/start-dfs.sh

3. View process information

Use the following command to view process information

jps

If both DataNode and NameNode appear in the list, the daemons are running.

4. View Web UI

Enter http://localhost:50070 in your browser to view the cluster status and related information.

At this point, the hadoop environment has been set up. Let's start with a WordCount example using hadoop.

VIII. Run the WordCount demo

1. Create a file locally. I created a words document under the /home/leesf directory; its contents can be anything.

2. Create a folder in HDFS into which to upload the local words file. In the hadoop-2.7.1 directory, enter:

bin/hdfs dfs -mkdir /test

This creates a test directory under the HDFS root.

To view the directory structure under HDFS root directory, use the following command

bin/hdfs dfs -ls /

The listing shows that a test directory has been created under the HDFS root directory.

3. Upload the local words document to the test directory

Upload using the following command:

bin/hdfs dfs -put /home/leesf/words /test/

Use the following command to view

bin/hdfs dfs -ls /test/

The listing confirms that the local words document has been uploaded to the test directory.

4. Run wordcount

Run wordcount using the following command:

bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /test/words /test/out


After the job finishes, an out directory is generated under /test. Use the following command to list the files under /test:

bin/hdfs dfs -ls /test

The listing shows that there is now a directory named out under /test.

Enter the following command to view files in the out directory:

bin/hdfs dfs -ls /test/out

The job ran successfully, and the results are saved in part-r-00000.

5. View results

Use the following command to view the results:

bin/hadoop fs -cat /test/out/part-r-00000


At this point, the process is complete.
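To build intuition for what the WordCount job computed, the same result can be reproduced on a small local file with plain shell tools (the real job, of course, reads /test/words from HDFS and writes /test/out/part-r-00000):

```shell
#!/bin/sh
# Hedged sketch: WordCount's word->count output reproduced locally.
# The sample file contents are made up for illustration.
tmp=$(mktemp -d)
printf 'hello hadoop\nhello world\n' > "$tmp/words"
# split on spaces, sort, count duplicates, print "word<TAB>count"
counts=$(tr -s ' ' '\n' < "$tmp/words" | sort | uniq -c | awk '{print $2"\t"$1}')
echo "$counts"
```

For this sample input the output is hadoop 1, hello 2, world 1, in the same word-then-count shape as part-r-00000.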

Thank you for reading. That concludes "how to install hadoop in linux". After working through this article you should have a solid grasp of the installation process; the specifics are best verified in your own environment.
