Hadoop YARN

2025-04-03 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

1 Overview

YARN is a resource scheduling platform that supplies server computing resources to computing programs. It is comparable to a distributed operating system, and computing programs such as MapReduce are comparable to applications running on that operating system.

2 Basic architecture of YARN

3 Working mechanism of YARN

Detailed explanation of the working mechanism:

1) The MR program is submitted on the node where the client runs.

2) YarnRunner applies to the ResourceManager (RM) for an Application.

3) RM returns the application's resource path to YarnRunner.

4) The program submits the resources it needs to run to HDFS.

5) After the resources are submitted, the program applies to run MRAppMaster.

6) RM initializes the user's request as a Task.

7) One of the NodeManagers picks up the Task.

8) That NodeManager creates a Container and starts MRAppMaster in it.

9) The Container copies the resources from HDFS to the local node.

10) MRAppMaster applies to RM for resources to run MapTasks.

11) RM assigns the MapTasks to two other NodeManagers, each of which picks up its task and creates a container.

12) MRAppMaster sends a startup script to the two NodeManagers that received the tasks. Each of them starts a MapTask, and the MapTasks partition and sort the data.

13) After all MapTasks have finished, MRAppMaster applies to RM for a container and runs the ReduceTask.

14) The ReduceTask fetches the data of its partition from the MapTasks.

15) After the program finishes, MRAppMaster applies to RM to unregister itself.
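The container hand-offs in the steps above can be sketched as a toy model. This is only an illustration under stated assumptions: the `ResourceManager`, `NodeManager`, and `run_job` names are hypothetical stand-ins, not Hadoop APIs, and the "least-loaded node" placement rule is an assumption chosen so that the MapTasks land on the two other nodes, as in step 11. A real scheduler decides placement differently.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy NodeManager that only records the containers it has started."""
    name: str
    containers: list = field(default_factory=list)

    def launch(self, role):
        # Create a container on this node for the given role.
        self.containers.append(role)
        return f"{role}@{self.name}"


class ResourceManager:
    """Toy RM: gives each container request to the least-loaded node (assumption)."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.registered = set()

    def submit_application(self, app_id):
        # Steps 5-8: accept the request; one NodeManager starts MRAppMaster.
        self.registered.add(app_id)
        return self.allocate("MRAppMaster")

    def allocate(self, role):
        node = min(self.nodes, key=lambda n: len(n.containers))
        return node.launch(role)

    def unregister(self, app_id):
        # Step 15: the application master asks RM to unregister the application.
        self.registered.discard(app_id)


def run_job(rm, app_id, num_maps, num_reduces):
    trace = [rm.submit_application(app_id)]
    # Steps 10-12: containers for the MapTasks are requested and started.
    trace += [rm.allocate(f"MapTask-{i}") for i in range(num_maps)]
    # Step 13: ReduceTask containers are requested only after the maps finish.
    trace += [rm.allocate(f"ReduceTask-{i}") for i in range(num_reduces)]
    rm.unregister(app_id)
    return trace


rm = ResourceManager([NodeManager("nm1"), NodeManager("nm2"), NodeManager("nm3")])
trace = run_job(rm, "app_0001", num_maps=2, num_reduces=1)
print(trace)
```

The trace lists which node hosts each container, in the order the containers were requested.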

4 The whole job submission process

4.1 Job submission process: YARN

The whole job submission process explained in detail:

1) Job submission

The client calls job.waitForCompletion() to submit the MapReduce job to the cluster. The client applies to RM for a job ID. RM returns the submission path for the job resources and the job ID to the client. The client submits the jar, the split information, and the configuration files to that path. Once the resources are submitted, the client applies to RM to run MRAppMaster.

2) Job initialization

When RM receives the client's request, it adds the job to the capacity scheduler. An idle NM picks up the job, creates a Container, and starts MRAppMaster. The resources submitted by the client are then downloaded to the local node.

3) Task assignment

MRAppMaster applies to RM for resources to run multiple MapTasks. RM assigns the MapTasks to two other NodeManagers, each of which picks up its task and creates a container.

4) Task running

MRAppMaster sends a program startup script to the two NodeManagers that received the tasks. Each of them starts a MapTask, and the MapTasks partition and sort the data. MRAppMaster waits for all MapTasks to finish, then applies to RM for a container and runs the ReduceTask. The ReduceTask fetches the data of its partition from the MapTasks. After the program finishes, MRAppMaster applies to RM to unregister itself.

5) Progress and status updates

Tasks in YARN report their progress and status (including counters) to the application master. The client requests progress updates from the application master every second (configurable through mapreduce.client.progressmonitor.pollinterval) and shows them to the user.

6) Job completion

Besides requesting the job's progress from the application master, the client checks every 5 seconds whether the job has completed by calling waitForCompletion(). The interval can be set through mapreduce.client.completion.pollinterval. After the job completes, the application master and the Containers clean up their working state, and the job information is stored by the job history server so that users can inspect it later.

4.2 Job submission process: MapReduce
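The two client-side polling intervals described above can be compared with a small simulation. This is a sketch under stated assumptions, not the MapReduce client: `FakeClock`, `FakeJob`, and `poll_until_complete` are hypothetical names, time is simulated rather than slept, and the 1000 ms and 5000 ms values simply mirror the "every second" and "every 5 seconds" intervals mentioned in the text.

```python
class FakeClock:
    """Simulated clock so the example runs instantly."""

    def __init__(self):
        self.now_ms = 0

    def sleep(self, ms):
        self.now_ms += ms


class FakeJob:
    """Hypothetical job that completes after a fixed simulated duration."""

    def __init__(self, clock, duration_ms):
        self.clock = clock
        self.duration_ms = duration_ms

    def is_complete(self):
        return self.clock.now_ms >= self.duration_ms


def poll_until_complete(job, clock, interval_ms):
    """Check the job, wait one interval, and repeat; return the number of polls."""
    polls = 0
    while not job.is_complete():
        polls += 1
        clock.sleep(interval_ms)
    return polls


# Progress-style polling: one check per second.
progress_clock = FakeClock()
progress_polls = poll_until_complete(FakeJob(progress_clock, 12_000), progress_clock, 1000)

# Completion-style polling: one check every five seconds.
completion_clock = FakeClock()
completion_polls = poll_until_complete(FakeJob(completion_clock, 12_000), completion_clock, 5000)

print(progress_polls, completion_polls, completion_clock.now_ms)
```

A longer interval means fewer requests, but the client may notice completion later: in this run the 5-second poller only sees the 12-second job as finished at the 15-second mark.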

5 Resource scheduler

Hadoop currently has three main job schedulers: FIFO, Capacity Scheduler, and Fair Scheduler. The default resource scheduler in Hadoop 2.7.2 is the Capacity Scheduler.

[yarn-default.xml]

<property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
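To check which scheduler a configuration selects, the property can be read programmatically. The sketch below parses an XML fragment shaped like the yarn-default.xml entry and extracts the scheduler class. The `get_property` helper and the inline `CONFIG` string are illustrative assumptions, not part of Hadoop; on a real cluster the file would be read from the Hadoop configuration directory.

```python
import xml.etree.ElementTree as ET

# Inline copy of the property for illustration only.
CONFIG = """
<configuration>
  <property>
    <description>The class to use as the resource scheduler.</description>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>
</configuration>
"""


def get_property(xml_text, name):
    """Return the <value> of the <property> with the given <name>, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


scheduler_class = get_property(CONFIG, "yarn.resourcemanager.scheduler.class")
scheduler_name = scheduler_class.rsplit(".", 1)[-1]  # short class name
print(scheduler_name)
```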

FIFO scheduler

Capacity scheduler

Fair scheduler

6 Speculative execution of tasks

The completion time of a job depends on its slowest task. A job consists of several Map tasks and Reduce tasks, and some of them may run very slowly because of aging hardware, software bugs, and so on. Suppose 99% of the Map tasks in the system have completed and only a few remain stubbornly slow. What should be done?

Speculative execution mechanism

Start a backup task for a lagging task and run both at the same time. The result of whichever finishes first is used.

Prerequisites for performing speculative tasks

1) Each Task can have only one backup task.

2) The fraction of completed Tasks in the current Job must be no less than 0.05 (5%).

3) The speculative execution parameter must be enabled. It is enabled by default and can be set in mapred-site.xml.

When speculative execution should not be enabled

1) There is severe load skew between tasks.

2) Special tasks, such as tasks that write data to a database.
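The prerequisites for speculative execution can be expressed as a small decision function. This is a simplified sketch, not Hadoop's speculator: the `Task` class and `pick_task_to_speculate` function are hypothetical, the completion threshold is assumed to be the fraction 0.05 (5%), and "lowest progress" is used as a stand-in for however the framework actually identifies a lagging task.

```python
from dataclasses import dataclass

# Assumed threshold: at least 5% of the job's tasks must already be complete.
MIN_COMPLETED_FRACTION = 0.05


@dataclass
class Task:
    name: str
    progress: float          # 0.0 (not started) to 1.0 (finished)
    has_backup: bool = False


def pick_task_to_speculate(tasks, speculation_enabled=True):
    """Return the task that should get a backup attempt, or None."""
    # Speculative execution must be switched on.
    if not speculation_enabled or not tasks:
        return None
    # Enough of the job must already be complete.
    completed = sum(1 for t in tasks if t.progress >= 1.0)
    if completed / len(tasks) < MIN_COMPLETED_FRACTION:
        return None
    # A task may have at most one backup, and finished tasks need none.
    candidates = [t for t in tasks if t.progress < 1.0 and not t.has_backup]
    if not candidates:
        return None
    # Simplification: treat the task with the least progress as the laggard.
    return min(candidates, key=lambda t: t.progress)


tasks = [
    Task("map-0", 1.0),
    Task("map-1", 1.0),
    Task("map-2", 0.3),
    Task("map-3", 0.2, has_backup=True),
]
chosen = pick_task_to_speculate(tasks)
print(chosen.name if chosen else None)
```

Here `map-3` is the slowest task but already has a backup, so `map-2` is chosen instead.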

Schematic:
