
Introduction to Apache Hadoop Chapter 4

2025-02-24 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

YARN on a single node

You can run a MapReduce job on YARN in pseudo-distributed mode by setting a few parameters and additionally running the ResourceManager and NodeManager daemons.

Here are the steps to run.

(1) configuration

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
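The two property settings above can be written out as complete config files. A minimal sketch, assuming you are working from the Hadoop installation directory (so the relative etc/hadoop path resolves):

```shell
# Generate minimal mapred-site.xml and yarn-site.xml for pseudo-distributed YARN.
# Run from the Hadoop installation directory; etc/hadoop is the default config dir.
mkdir -p etc/hadoop

cat > etc/hadoop/mapred-site.xml <<'EOF'
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
EOF

cat > etc/hadoop/yarn-site.xml <<'EOF'
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
EOF
```

If the files already contain other properties, add the two <property> blocks inside the existing <configuration> element instead of overwriting the files.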

(2) start the ResourceManager and NodeManager daemons:

$ sbin/start-yarn.sh

(3) browse the web interface of the ResourceManager; by default it is available at:

ResourceManager - http://localhost:8088/

(4) run a MapReduce job

(5) when you are done, stop the daemons:

$ sbin/stop-yarn.sh
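Steps (2) through (5) can be sketched end-to-end. A hedged example, assuming a working Hadoop installation in the current directory with HDFS already running; the guard makes the sketch a no-op elsewhere, and the examples jar name varies by release:

```shell
# Start single-node YARN, run a sample MapReduce job, then stop the daemons.
# Guarded so the sketch only executes inside a real Hadoop installation.
yarn_ui="http://localhost:8088/"   # default ResourceManager web address

if [ -x sbin/start-yarn.sh ]; then
    sbin/start-yarn.sh        # starts the ResourceManager and NodeManager

    # Run the bundled pi estimator as a smoke test (adjust the
    # hadoop-mapreduce-examples-*.jar glob to match your release).
    bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi 2 10

    sbin/stop-yarn.sh         # stops both daemons
fi
```

While the job runs, its progress also appears in the ResourceManager web UI at the address above.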

Running in fully distributed mode

For information on setting up fully distributed mode, see the "Installation configuration on an Apache Hadoop cluster" section below.

Installation configuration on an Apache Hadoop cluster

This section describes how to install, configure, and manage Hadoop clusters, ranging in size from a small cluster of several nodes to a very large cluster of thousands of nodes.

Prerequisites

Make sure that all the necessary software is installed on every node in your cluster. To install a Hadoop cluster, you typically unpack the release on all the machines in the cluster. Refer to the previous section, "installation configuration on a single node in Apache Hadoop".
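Unpacking the same release on every node is commonly scripted over SSH. A dry-run sketch, with hypothetical hostnames and tarball name; the real copy commands are left commented and assume password-less SSH:

```shell
# Distribute one Hadoop release tarball to all cluster nodes (dry-run sketch).
hosts="node01 node02 node03"      # hypothetical cluster hostnames
tarball="hadoop-3.3.6.tar.gz"     # assumed release tarball name

for host in $hosts; do
    echo "distribute $tarball -> $host"
    # Real run (requires password-less SSH):
    # scp "$tarball" "$host:/opt/" && ssh "$host" "tar -xzf /opt/$tarball -C /opt"
done
```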

Typically, one machine in the cluster is designated as the NameNode and another as the ResourceManager; these are the masters. Whether other services, such as the Web application proxy server and the MapReduce Job History server, run on dedicated hardware or on shared infrastructure depends on the load.

The remaining machines in the cluster act as both DataNode and NodeManager; these are the slaves.
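In Hadoop 3.x the slave hosts are listed, one per line, in etc/hadoop/workers (etc/hadoop/slaves in 2.x) so that the cluster start scripts can reach them. A sketch with hypothetical hostnames:

```shell
# List the worker (slave) hosts for the cluster start scripts, one per line.
mkdir -p etc/hadoop
cat > etc/hadoop/workers <<'EOF'
node01
node02
node03
EOF
```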
