2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains in detail how to perform a single-node installation of Apache Hadoop 2.4.1. It is meant as a practical reference, and I hope you get something out of it.
I. Purpose
This document describes how to install and configure a single-node Hadoop setup so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
II. Prerequisites
Supported platforms
Hadoop supports GNU/Linux as both a development and a production platform; it has been demonstrated on GNU/Linux clusters of up to 2,000 nodes.
Windows is also supported, but the documentation below covers only installation on Linux. For installing Hadoop on Windows, please refer to the wiki page.
Required software
The software required on Linux includes:
Java must be installed. For recommended Java versions, please refer to HadoopJavaVersions.
ssh must be installed and sshd must be running, because the Hadoop scripts use ssh to manage the remote Hadoop daemons.
Installing software
If you do not have the above software in your cluster, please install it.
For example, under Ubuntu:
$ sudo apt-get install ssh
$ sudo apt-get install rsync
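As a quick sanity check after installing, you can verify that the required binaries are on the PATH. This is an illustrative sketch, not part of the official instructions; note that `sshd` is often not on the PATH even when the ssh server is installed, so a MISSING result for it only means you should check your service manager.

```shell
# Report whether each prerequisite binary is on the PATH.
# A MISSING result for "sshd" is only a hint to check your service
# manager, since many distributions keep sshd out of the user PATH.
for bin in java ssh sshd rsync; do
  if command -v "$bin" >/dev/null 2>&1; then
    echo "$bin: found"
  else
    echo "$bin: MISSING"
  fi
done
```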
III. Download the required software
Obtain a recent stable release of the Hadoop distribution from the Apache Download Mirrors (the current stable version is 2.4.1).
IV. Prepare to start the Hadoop cluster
Extract the downloaded Hadoop software. In the installation directory, edit the file etc/hadoop/hadoop-env.sh and define the following parameters:
# set the JAVA installation directory
export JAVA_HOME=/usr/java/latest
# set the Hadoop installation directory, e.g. if your installation directory is /usr/local/hadoop
export HADOOP_PREFIX=/usr/local/hadoop
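The two settings above can be appended to hadoop-env.sh in one step. The sketch below writes to a temporary file so it is safe to try anywhere; the /usr/java/latest and /usr/local/hadoop paths are the illustrative values from the snippet above, not requirements.

```shell
# Append the two exports shown above to a hadoop-env.sh file.
# A temporary file stands in for etc/hadoop/hadoop-env.sh here.
HADOOP_ENV=$(mktemp)
cat >> "$HADOOP_ENV" <<'EOF'
# set the JAVA installation directory
export JAVA_HOME=/usr/java/latest
# set the Hadoop installation directory
export HADOOP_PREFIX=/usr/local/hadoop
EOF
grep -c '^export' "$HADOOP_ENV"   # prints 2
```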
Try entering the following command in Terminal:
$ bin/hadoop
After entering the above command, the help document for using hadoop scripts will be displayed in Terminal.
Next, you can start your Hadoop cluster in one of three modes:
Local (standalone) mode
Pseudo-distributed mode
Fully distributed mode
V. Standalone installation (running an example)
By default, Hadoop runs as a single Java process in non-distributed mode, which is useful mainly for debugging.
The following example copies the unpacked configuration files to use as input, then finds every match of the given regular expression in those files. Output is written to the given output directory.
$ mkdir input
$ cp etc/hadoop/*.xml input
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
$ cat output/*
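Conceptually, the grep job extracts every match of the regular expression from the XML inputs and counts occurrences. The plain-shell analogue below is illustrative only (the real job distributes the work across map and reduce tasks), and the sample file and its contents are made up for the sketch:

```shell
# Plain-shell analogue of the MapReduce grep example.
# The sample input file and its property names are made up for illustration.
TMP=$(mktemp -d)
printf '<name>dfs.replication</name>\n<name>dfs.permissions</name>\n' > "$TMP/sample.xml"
grep -oE 'dfs[a-z.]+' "$TMP"/*.xml | sort | uniq -c | sort -rn
```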
VI. Pseudo-distributed installation
Hadoop can also run on a single node in pseudo-distributed mode, with each Hadoop daemon running in a separate Java process.
Configuration
The files and properties to configure are as follows:
etc/hadoop/core-site.xml:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
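Writing a property file like these can be scripted with a heredoc. The sketch below targets a temporary directory so it is safe to run anywhere; in a real install the file lives under etc/hadoop/.

```shell
# Write core-site.xml with the fs.defaultFS setting described above.
# A temp dir stands in for etc/hadoop/ so this is safe to run anywhere.
CONF=$(mktemp -d)
cat > "$CONF/core-site.xml" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
EOF
grep -q 'hdfs://localhost:9000' "$CONF/core-site.xml" && echo "core-site.xml written"
```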
Configure passphrase-less ssh login
Use the following command to check whether you can ssh to localhost without a passphrase:
$ ssh localhost
If you cannot log in to localhost without a passphrase, execute the following commands:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
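Modern OpenSSH releases have deprecated DSA keys, so on a current system you may need an RSA key instead of the DSA key the 2.4.x docs use. The sketch below generates into a temporary directory for safety; a real setup targets ~/.ssh.

```shell
# Generate a passphrase-less RSA key pair and authorize it.
# A temp dir stands in for ~/.ssh; -N '' means an empty passphrase.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$KEYDIR/id_rsa"
cat "$KEYDIR/id_rsa.pub" >> "$KEYDIR/authorized_keys"
chmod 600 "$KEYDIR/authorized_keys"
ssh-keygen -lf "$KEYDIR/id_rsa.pub"   # show the new key's fingerprint
```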
Execution
The following commands run a MapReduce job, and if you want to execute a YARN job, please refer to the next section: running YARN on a single node.
Format the file system:
$ bin/hdfs namenode -format
Start the NameNode daemon and the DataNode daemon:
$ sbin/start-dfs.sh
The Hadoop daemon logs are written to the $HADOOP_LOG_DIR directory (which defaults to $HADOOP_HOME/logs).
Browse the web interface for the NameNode; by default it is available at:
NameNode - http://localhost:50070/
Make the HDFS directories required to execute the MapReduce job:
$ bin/hdfs dfs -mkdir /user
$ bin/hdfs dfs -mkdir /user/
Copy the input files (etc/hadoop) into HDFS under the name input:
$ bin/hdfs dfs -put etc/hadoop input
Run the example provided by Hadoop:
$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.4.1.jar grep input output 'dfs[a-z.]+'
Check the output file:
Copy the output files from HDFS to the local file system and check them:
$ bin/hdfs dfs -get output output
$ cat output/*
Or
View the output file directly in HDFS:
$ bin/hdfs dfs -cat output/*
When you are finished, you can stop the daemon using the following command:
$ sbin/stop-dfs.sh
VII. Run YARN on a single node
You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and running the ResourceManager and NodeManager daemons.
Before executing the commands below, make sure that steps 1-4 of the pseudo-distributed instructions above have already been performed.
Configure the parameters for the following files:
etc/hadoop/mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
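As with the HDFS configuration, these files can be written with a heredoc. The sketch below targets a temporary directory so it is safe to run anywhere; a real install targets etc/hadoop/.

```shell
# Write yarn-site.xml with the mapreduce_shuffle aux-service described above.
# A temp dir stands in for etc/hadoop/ so this is safe to run anywhere.
CONF=$(mktemp -d)
cat > "$CONF/yarn-site.xml" <<'EOF'
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
EOF
grep -c '<property>' "$CONF/yarn-site.xml"   # prints 1
```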
Start the ResourceManager daemon and the NodeManager daemon:
$ sbin/start-yarn.sh
Browse the web interface for the ResourceManager; by default it is available at:
ResourceManager-http://localhost:8088/
Run a MapReduce job.
When you are finished, you can stop the YARN daemon using the following command:
$ sbin/stop-yarn.sh
That concludes this article on how to perform a single-node installation of Apache Hadoop 2.4.1. I hope the content above has been helpful; if you found the article useful, please share it so more people can see it.