How to use Yarn in hadoop

2025-04-26 Update From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/02 report

Many newcomers to Hadoop are unsure how to use YARN. This article walks through the steps in detail: configuring mapred-site.xml and yarn-site.xml, starting the YARN processes, and running a Hadoop sample program on YARN.

1. mapred-site.xml configuration

Go to Hadoop's configuration directory, open mapred-site.xml, and add the mapreduce.framework.name property with the value yarn, which tells MapReduce to run its jobs on YARN.

[Figure: Location of the configuration file]

[Figure: Configuration of mapred-site.xml]
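As a minimal sketch of what this step adds, the snippet below writes a complete mapred-site.xml containing only the framework property. It deliberately writes to a scratch directory rather than the live configuration; the CONF_DIR variable and the scratch location are assumptions made for illustration, since the article does not name the configuration path (in typical installations it is etc/hadoop under the Hadoop home directory).

```shell
# Hypothetical scratch location; the real file usually sits in $HADOOP_HOME/etc/hadoop.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Write a minimal mapred-site.xml that tells MapReduce to run on YARN.
cat > "$CONF_DIR/mapred-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
EOF

echo "Wrote $CONF_DIR/mapred-site.xml"
```

When editing an existing mapred-site.xml, only the property element needs to be added inside the existing configuration element.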

2. yarn-site.xml configuration

Similarly, add the NodeManager service property (typically yarn.nodemanager.aux-services) to yarn-site.xml.
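The article does not show the exact property, so the following is a sketch under an assumption: that the NodeManager service in question is the shuffle auxiliary service used in Hadoop's standard single-node setup (yarn.nodemanager.aux-services set to mapreduce_shuffle). As before, CONF_DIR is a hypothetical scratch directory so that nothing in a live installation is overwritten.

```shell
# Hypothetical scratch location; the real file usually sits in $HADOOP_HOME/etc/hadoop.
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"

# Write a minimal yarn-site.xml that enables the MapReduce shuffle service
# on each NodeManager (assumed to be the service the article refers to).
cat > "$CONF_DIR/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
EOF

echo "Wrote $CONF_DIR/yarn-site.xml"
```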

3. Start the YARN-related processes and verify that they started successfully

Start the YARN-related processes:

./start-yarn.sh    # run this command in the sbin directory to start YARN

Note that HDFS must be started before YARN. As the console output shows, this command starts the ResourceManager and NodeManager processes, and running jps lists both of them with their process IDs. After startup, the YARN management interface is available at http://localhost:8088.

./stop-yarn.sh    # stop the YARN-related processes
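The jps check described above can be sketched as a small helper. The function name yarn_daemons_running is hypothetical; it simply looks for the two process names the article mentions in jps-style output, and the surrounding guard means nothing runs on a machine without a JDK.

```shell
# Typical start sequence, run from Hadoop's sbin directory:
#   ./start-dfs.sh    (HDFS first)
#   ./start-yarn.sh   (then YARN)

# Succeeds only if both YARN daemons appear in jps-style output read from stdin.
yarn_daemons_running() {
  jps_output=$(cat)
  case "$jps_output" in *ResourceManager*) ;; *) return 1 ;; esac
  case "$jps_output" in *NodeManager*) ;; *) return 1 ;; esac
}

# Only inspect a real installation when jps is available.
if command -v jps >/dev/null 2>&1; then
  if jps | yarn_daemons_running; then
    echo "YARN is up; web UI: http://localhost:8088"
  else
    echo "ResourceManager or NodeManager is not running" >&2
  fi
fi
```

If either process is missing, the ResourceManager and NodeManager logs are the natural next place to look.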

4. Run the Hadoop sample program on YARN

[Figure: task in progress, state RUNNING]

[Figure: task completed, state FINISHED]

As in the previous article, we run the program that estimates pi, taken from the examples jar that ships with Hadoop. A few points are worth explaining:

1) The web page tracks the execution status of the task at any time. The task is RUNNING once it is submitted and becomes FINISHED after execution, as shown in the images above.

2) Once YARN is configured, the job connects to the YARN service during the computation. The console output shows a connection being made to the ResourceManager, which is YARN's resource manager.

[Figure: console log for the pi computation after configuring YARN]

3) Compare this with the console output from before YARN was configured. The log is more concise after configuring YARN. The (partial) log before configuration is shown in the following figure, and the (partial) log after configuration is shown in the figure above. Before configuration, the log reports the individual steps of the MapReduce process, such as map tasks and reduce tasks. After configuration, it reports only a MapReduce job, which can be understood as a single MapReduce job running on YARN. The job does not necessarily run faster after configuration, but YARN's unified resource management benefits the cluster as a whole.
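The run described in this section can be sketched as follows. Several details are assumptions, since the article does not give the exact command: the examples jar is looked up under share/hadoop/mapreduce (the usual layout), /opt/hadoop is only a placeholder default for HADOOP_HOME, the helper find_examples_jar is hypothetical, and the pi arguments 2 and 10 are arbitrary small values. The script skips the job when Hadoop is not installed.

```shell
# Assumption: HADOOP_HOME points at the Hadoop installation; /opt/hadoop is a placeholder.
HADOOP_HOME="${HADOOP_HOME:-/opt/hadoop}"

# Print the first examples jar found under the given Hadoop home, or fail.
# The share/hadoop/mapreduce location is the usual layout, not stated in the article.
find_examples_jar() {
  for jar in "$1"/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar; do
    if [ -f "$jar" ]; then
      printf '%s\n' "$jar"
      return 0
    fi
  done
  return 1
}

EXAMPLES_JAR=$(find_examples_jar "$HADOOP_HOME" || true)

if command -v hadoop >/dev/null 2>&1 && [ -n "$EXAMPLES_JAR" ]; then
  # pi <number of map tasks> <samples per map>
  hadoop jar "$EXAMPLES_JAR" pi 2 10
  # List applications YARN reports as finished; use RUNNING while a job is in progress.
  yarn application -list -appStates FINISHED
else
  echo "Hadoop or the examples jar was not found; skipping the sample job." >&2
fi
```

The yarn application listing gives the same RUNNING and FINISHED states as the web page at http://localhost:8088, which is convenient on machines without a browser.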
