2025-01-18 Update From: SLTechnology News&Howtos
I. The difference between the Hadoop 1.x and Hadoop 2.x architectures
In Hadoop 1.x, MapReduce itself was responsible for both resource scheduling and business-logic computation, which coupled the two tightly; only the MapReduce framework was supported at that time.
Hadoop 2.x adds YARN, which takes over resource scheduling. MapReduce is then responsible only for business-logic computation, and YARN can also schedule other distributed computing frameworks, such as Spark.
II. Basic structure of YARN
Figure 2.1: YARN basic structure
YARN is mainly divided into the RM (ResourceManager), the NM (NodeManager), and the AM (ApplicationMaster).
1. Explanation of related terms
Resources:
In the context of YARN, resources refer specifically to computing resources, namely CPU and memory. Each process on a machine occupies some CPU and memory, and a task must apply to the RM for resources before it is allowed to start its processes on an NM.
Queue:
YARN divides the cluster's total resources into queues, and every user's task must be submitted to a specified queue. At the same time, each queue's size is limited so that one user's tasks cannot occupy the entire cluster and affect other users.
Vcore, Mem:
Logical CPU and logical memory. Each NM reports to the RM how many vcores and how much memory it has available; the specific values are configured by the cluster administrator. For example, a machine with 48 cores and 128 GB of memory might be configured to offer 120 GB of memory, meaning that is how much it provides to the cluster; the exact values are tuned to the actual situation. The logical resources of all NMs add up to the total resources of the cluster.
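As a sketch, the per-node logical resources described above are set in `yarn-site.xml`. The property names below are the standard Hadoop ones; the memory value matches the 120 GB example in this section, while the vcore value is an illustrative assumption:

```xml
<!-- yarn-site.xml (on each NodeManager) -->
<configuration>
  <!-- Memory the NM advertises to the RM: 120 GB of the machine's 128 GB -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>122880</value>
  </property>
  <!-- Logical CPUs the NM advertises; often set slightly below the 48 physical cores -->
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>44</value>
  </property>
</configuration>
```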
MinResources & MaxResources:
So that each queue receives guaranteed resources without letting idle cluster resources go to waste, queue resource settings are "elastic". Each queue has two resource values, min and max. Min means that as long as there is demand, the cluster is guaranteed to provide that many resources to the queue; if demand exceeds min and the cluster still has free resources, the demand can still be met. However, a queue cannot apply for resources without limit, lest it affect other tasks: its allocation will never exceed max. These values refer to the total resources obtained by the entire queue.
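The elastic min/max idea maps onto the Capacity Scheduler's configuration roughly as follows. This is a minimal sketch: the queue names `prod` and `dev` are made up, `capacity` plays the role of min, and `maximum-capacity` plays the role of max, both as percentages of the cluster:

```xml
<!-- capacity-scheduler.xml -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,dev</value>
  </property>
  <!-- "min": prod is guaranteed 70% of the cluster whenever it needs it -->
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>70</value>
  </property>
  <!-- "max": prod may borrow idle resources, but never beyond 90% -->
  <property>
    <name>yarn.scheduler.capacity.root.prod.maximum-capacity</name>
    <value>90</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>30</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>50</value>
  </property>
</configuration>
```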
Container:
The processes started on NMs after a task has been granted resources are collectively called containers. For example, in MapReduce a container runs a Mapper or Reducer; in Spark, a Driver or Executor.
RM (ResourceManager), NM (NodeManager), AM (ApplicationMaster):
ResourceManager (RM): handles client requests, launches and monitors ApplicationMasters, monitors NodeManagers, and performs resource allocation and scheduling.
NodeManager (NM): manages resources on a single node, and handles commands from the ResourceManager and from ApplicationMasters.
ApplicationMaster (AM): splits input data, requests resources for the application and assigns them to its internal tasks, and handles task monitoring and fault tolerance. This is effectively where the driver side of a MapReduce job runs; it is used for task scheduling, while the RM and NM above are used for resource scheduling.
2. Overview of the YARN working mechanism
1) The user submits a job to the RM through a client, specifying which queue to submit to and how many resources are required. These parameters can be set for each compute engine, or left at their defaults if not specified.
2) After receiving the submission, the RM selects an NM whose resources and queue meet the requirements and tells it to start a special container called the ApplicationMaster (AM); the AM drives the rest of the process. Each job has exactly one AM.
3) After registering with the RM, the AM applies to the RM for containers according to the needs of its tasks, including the number of containers, the resources each requires, locality preferences, and other factors.
4) If the queue has sufficient resources, the RM allocates containers on NMs with enough remaining capacity, and the AM notifies those NMs to start the containers.
5) Once started, each container runs its specific task on the data assigned to it. Besides starting containers, the NM also monitors their resource usage and whether they exit abnormally. If a container's actual memory usage exceeds what it applied for, it is killed so that other containers can run normally.
6) Each container reports its progress to the AM. When all containers are done, the AM deregisters from the RM and exits, the RM tells the NMs to kill the corresponding containers, and the job ends.
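The six steps above can be sketched as a toy, single-process simulation. This is illustrative only: real YARN is distributed and asynchronous, and the class and method names here are invented for the sketch:

```python
# Toy simulation of the YARN flow: client -> RM -> AM -> containers.

class ResourceManager:
    def __init__(self, cluster_mem_gb):
        self.free_mem = cluster_mem_gb
        self.log = []

    def submit(self, job_name, am_mem, task_mems):
        # Steps 1-2: accept the job and start one AM in its own container.
        if not self._allocate(am_mem):
            raise RuntimeError("not enough resources for the AM")
        self.log.append(f"AM started for {job_name}")
        am = ApplicationMaster(job_name, task_mems)
        return am.run(self)

    def _allocate(self, mem):
        if self.free_mem >= mem:
            self.free_mem -= mem
            return True
        return False

    def release(self, mem):
        self.free_mem += mem


class ApplicationMaster:
    def __init__(self, job_name, task_mems):
        self.job_name = job_name
        self.task_mems = task_mems

    def run(self, rm):
        # Steps 3-4: request one container per task from the RM.
        running = [mem for mem in self.task_mems if rm._allocate(mem)]
        # Steps 5-6: tasks "finish", containers are released, AM deregisters.
        for mem in running:
            rm.release(mem)
        rm.log.append(f"{self.job_name} finished {len(running)} tasks")
        return len(running)


rm = ResourceManager(cluster_mem_gb=16)
done = rm.submit("wordcount", am_mem=2, task_mems=[4, 4, 4])
print(done)  # -> 3 (all three tasks got containers)
```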
III. The working mechanism of YARN
Figure 3.1: How YARN works
(0) The MR program is submitted on the node where the client runs.
(1) The YarnRunner applies to the ResourceManager for an application.
(2) The RM returns the application's resource path to the YarnRunner.
(3) The program submits the resources it needs to run to HDFS, in the directory HDFS specifies. As shown in the figure above, these resources have three main parts: the split plan describing all input slices, the job's configuration file (its runtime parameters, etc.), and the jar of the MapReduce program.
(4) Once the resources are submitted, the client applies to run the MRAppMaster.
(5) The RM turns the user's request into a task.
(6) One of the NodeManagers picks up the task.
(7) That NodeManager creates a container and starts the MRAppMaster in it.
(8) The container copies the resources from HDFS to the local node.
(9) The MRAppMaster applies to the RM for resources to run MapTasks.
(10) The RM assigns the MapTasks to two other NodeManagers, which pick up the task and create containers.
(11) The MRAppMaster sends program-startup scripts to the two NodeManagers that received the tasks, and each starts a MapTask; the MapTasks partition and sort the data.
(12) The MRAppMaster waits for all MapTasks to finish, then applies to the RM for containers and runs the ReduceTasks.
(13) Each ReduceTask fetches the data of its partition from the MapTasks.
(14) After the program finishes, the MRAppMaster applies to the RM to deregister itself.
IV. YARN resource schedulers
When YARN allocates resources to tasks, a resource scheduler decides the order. YARN currently supports three schedulers: FIFO, the Capacity Scheduler, and the Fair Scheduler; the default is the Capacity Scheduler.
1. Setting which resource scheduler to use (yarn-default.xml):
<property>
  <!-- The class to use as the resource scheduler. -->
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
2. FIFO
Figure 4.1: FIFO scheduler
Advantages: the scheduling algorithm is simple, and the JobTracker (where submitted jobs are sent) has a light workload.
Disadvantages: it ignores the different needs of different jobs. For example, if a job doing statistical analysis over a large dataset occupies computing resources for a long time, interactive jobs submitted after it may go unprocessed for a long time, hurting the user experience.
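The FIFO policy and its starvation problem can be sketched in a few lines. This is a toy model, not YARN's actual implementation: jobs run strictly in submission order, so a long job delays everything behind it:

```python
from collections import deque

def fifo_schedule(jobs):
    """Run jobs strictly in submission order; return (name, finish_time) pairs.

    `jobs` is a list of (name, duration) tuples, already in submission order.
    """
    queue = deque(jobs)
    clock = 0
    finished = []
    while queue:
        name, duration = queue.popleft()
        clock += duration  # the whole cluster works on one job at a time
        finished.append((name, clock))
    return finished

# A long analytics job starves the short interactive job behind it:
print(fifo_schedule([("analytics", 100), ("interactive", 1)]))
# -> [('analytics', 100), ('interactive', 101)]
```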
3. Capacity Scheduler (developed at Yahoo)
Figure 4.2: Capacity Scheduler
1. Multiple queues are supported, and each queue uses FIFO internally.
2. To prevent one user's jobs from monopolizing a queue's resources, the scheduler limits how many resources jobs submitted by the same user may take.
3. To pick a queue, it first computes, for each queue, the ratio between the number of running tasks and the computing resources the queue should be allocated (the total resources due to the queue's existing tasks), and selects the queue with the lowest ratio.
4. Within that queue, tasks are then ordered by job priority and submission time, while also respecting per-user resource limits and memory limits.
5. The queues execute their tasks in order simultaneously. For example, if job11, job21, and job31 are at the head of their respective queues, they run first, and they run at the same time.
By default this scheduler does not support priorities, but the option can be enabled in the configuration file; when enabled, the scheduling algorithm is FIFO with priorities.
Priority preemption is not supported: once a job starts executing, its resources will not be preempted by higher-priority jobs until it finishes.
The percentage of queue resources available to jobs from the same user is capped so that one user's jobs cannot monopolize the queue; the same user is subject to such a cap in each queue.
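The queue-selection rule in step 3 above can be sketched as follows. This is a toy model (the real CapacityScheduler logic is considerably more involved); the queue names and numbers are invented:

```python
def pick_queue(queues):
    """Pick the queue whose running-task count is smallest relative to its share.

    `queues` maps queue name -> (running_tasks, allocated_resources).
    The queue with the lowest running/allocated ratio is the most under-used,
    so it is chosen to receive the next task.
    """
    return min(queues, key=lambda q: queues[q][0] / queues[q][1])

queues = {
    "prod":  (30, 60),  # ratio 0.50: relatively busy
    "dev":   (5, 30),   # ratio ~0.17: most under-used -> chosen
    "adhoc": (4, 10),   # ratio 0.40
}
print(pick_queue(queues))  # -> dev
```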
4. Fair Scheduler (developed at Facebook)
Figure 4.3: Fair Scheduler
1. Multiple queues and multiple users are supported, the amount of resources in each queue can be configured, and jobs in the same queue fairly share all of that queue's resources.
2. For example, suppose there are three queues, A, B, and C. Jobs in each queue are allocated resources according to priority: the higher the priority, the more resources, but every job receives some resources to ensure fairness. When resources are limited, there is a gap between the computing resources a job would receive in the ideal case and what it actually receives; this gap is called the deficit. Within the same queue, the larger a job's deficit, the sooner it is given resources. Jobs are executed in order of deficit, and as the figure above shows, multiple jobs can run at the same time.
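The deficit-based ordering above can be sketched as a toy selection function (the job names and share numbers are invented; real Fair Scheduler accounting is more elaborate):

```python
def next_job(jobs):
    """Pick the job with the largest deficit (ideal share minus actual share).

    `jobs` maps job name -> (ideal_share, actual_share). The most "starved"
    job, i.e. the one furthest below its fair share, is scheduled next.
    """
    return max(jobs, key=lambda j: jobs[j][0] - jobs[j][1])

jobs = {
    "jobA": (10, 8),  # deficit 2
    "jobB": (10, 3),  # deficit 7 -> most starved, scheduled next
    "jobC": (5, 4),   # deficit 1
}
print(next_job(jobs))  # -> jobB
```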
V. Speculative execution of tasks
1. Definition of speculative execution
Speculative execution refers to the situation where, when MapReduce runs in a cluster, program bugs, uneven load, or other problems cause the tasks of one job to run at very different speeds; for example, some tasks have finished while others are only 10% done. By the barrel (weakest-link) principle, these slow tasks become the bottleneck of the whole job. If speculative execution is enabled, Hadoop starts a backup task for such a slow task; the speculative task processes the same slice of data as the original task at the same time, and whichever finishes first is kept while the other is killed.
2. The mechanism of speculative execution
When a slow task is found, i.e., one running much slower than the average task, a backup task is started for the lagging task and runs simultaneously. Whichever finishes first has its result adopted, and the unfinished one is killed.
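The "much slower than average" check above can be sketched with a simple heuristic. This is a toy rule of thumb, not Hadoop's actual speculation logic (which estimates remaining run time); the threshold and task names are invented:

```python
def stragglers(progress, threshold=0.5):
    """Return tasks whose progress is far below the average (toy heuristic).

    `progress` maps task id -> fraction complete in [0, 1]. A task is flagged
    here if its progress is below `threshold` times the average progress;
    flagged tasks would get a backup (speculative) task launched for them.
    """
    avg = sum(progress.values()) / len(progress)
    return [t for t, p in progress.items() if p < threshold * avg]

progress = {"map_0": 1.0, "map_1": 0.9, "map_2": 0.1}
print(stragglers(progress))  # -> ['map_2']
```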
3. Prerequisites for speculative execution
1) Each task can have at most one backup task.
2) The job must have completed at least 5% of its current tasks.
3) Speculative execution should not be enabled when there is serious load skew between tasks, or for special tasks, such as tasks that write data to a database.
4) Parameters for enabling speculative execution. It is enabled by default in mapred-site.xml:
<property>
  <!-- If true, multiple instances of some map tasks may be executed in parallel. -->
  <name>mapreduce.map.speculative</name>
  <value>true</value>
</property>
<property>
  <!-- If true, multiple instances of some reduce tasks may be executed in parallel. -->
  <name>mapreduce.reduce.speculative</name>
  <value>true</value>
</property>