Hadoop YARN supports scheduling of both memory and CPU resources (only memory is scheduled by default; scheduling CPU requires additional configuration). This article looks at how YARN schedules and isolates these resources, and at the two resource allocation modes it offers for running jobs: the Capacity Scheduler and the Fair Scheduler.
In YARN, resource management is handled jointly by the ResourceManager and the NodeManagers. The scheduler inside the ResourceManager is responsible for allocating resources, while each NodeManager is responsible for supplying and isolating them. After the ResourceManager assigns resources on a NodeManager to a task (this is resource scheduling), the NodeManager must provide those resources as required and, as far as possible, guarantee the task exclusive use of them; this basic guarantee for task execution is called resource isolation.
With this in mind, YARN lets users configure the physical memory available to it on each node. Note the word "available": the memory on a node is shared by several services, such as YARN, HDFS, and HBase, and YARN can only be told which share is its own. The relevant parameters are listed below (a sample yarn-site.xml fragment follows the list):
(1) yarn.nodemanager.resource.memory-mb
The total amount of physical memory available to YARN on this node; the default is 8192 (MB). Note that if the node has less than 8 GB of memory, you must lower this value yourself: YARN does not detect the node's total physical memory.
(2) yarn.nodemanager.vmem-pmem-ratio
The ratio of virtual to physical memory a task may use; the default is 2.1, meaning that for every 1 MB of physical memory a task uses, it may use up to 2.1 MB of virtual memory. For example, a container holding 1024 MB of physical memory may use up to 1024 × 2.1 ≈ 2150 MB of virtual memory.
(3) yarn.nodemanager.pmem-check-enabled
Whether to start a thread that checks the physical memory each task is using and kills any task that exceeds its allocation. The default is true.
(4) yarn.nodemanager.vmem-check-enabled
Whether to start a thread that checks the virtual memory each task is using and kills any task that exceeds its allocation. The default is true.
(5) yarn.scheduler.minimum-allocation-mb
The minimum amount of physical memory a single task can request; the default is 1024 (MB). If a task requests less than this, the request is rounded up to this value.
(6) yarn.scheduler.maximum-allocation-mb
The maximum amount of physical memory that can be requested by a single task is 8192 (MB) by default.
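The following yarn-site.xml fragment collects these six settings, shown here at their default values (illustrative, not tuning advice):

  <!-- yarn-site.xml: the memory settings above, at their defaults -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value> <!-- physical memory YARN may use on this node, MB -->
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value> <!-- virtual memory allowed per MB of physical memory -->
  </property>
  <property>
    <name>yarn.nodemanager.pmem-check-enabled</name>
    <value>true</value> <!-- kill tasks that exceed their physical memory -->
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>true</value> <!-- kill tasks that exceed their virtual memory -->
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value> <!-- smallest container request, MB -->
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value> <!-- largest container request, MB -->
  </property>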
By default, YARN uses thread monitoring to decide whether a task is over-using memory, and kills the task if it finds an excess. YARN does not provide a Cgroups-based memory isolation mechanism, because Cgroups enforces the limit rigidly: a task may never exceed its memory limit at any instant, or it is killed or hits an OOM. The memory of a Java process can briefly double at the moment it spawns child processes and then fall back to normal, and thread monitoring handles this more flexibly: when the memory of a process tree momentarily doubles past the limit, it can be treated as normal and the task is not killed.
Capacity Scheduler
Capacity Scheduler supports the following features:
(1) Capacity guarantees. Multiple queues are supported, and a job is submitted to a queue. Each queue is configured with a percentage of the cluster's computing resources, and all jobs submitted to a queue share that queue's resources.
(2) Flexibility. Resources left idle by one queue can be allocated to queues that need more than their share; when the under-used queue needs its resources again, they are returned to it as soon as resources are freed elsewhere.
(3) Priority support. Within a queue, jobs are scheduled by priority (FIFO by default).
(4) Multi-tenancy. A variety of constraints prevent a single job, user, or queue from monopolizing the resources of a queue or of the cluster.
(5) Resource-based scheduling. Resource-intensive jobs are supported and may request more resources than the default, which accommodates jobs with differing resource needs. Currently, however, only memory is scheduled.
To enable the capacity scheduling mode, first set the yarn.resourcemanager.scheduler.class property in yarn-site.xml to org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler:
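A minimal yarn-site.xml fragment for this setting:

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>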
1. Parameters related to resource allocation
(1) capacity: the queue's resource capacity (a percentage). When the system is busy, each queue's configured capacity should be guaranteed; when a queue has few applications, its leftover resources can be shared with other queues. Note that the capacities of queues at the same level must sum to 100.
(2) maximum-capacity: the upper limit on the queue's resource usage (a percentage). Because of resource sharing, a queue may use more resources than its capacity; this parameter caps its maximum usage.
(3) minimum-user-limit-percent: the minimum resource guarantee per user (a percentage). At any time, the resources available to each user in a queue are limited. When applications from several users run in a queue simultaneously, each user's usage floats between a minimum and a maximum, where the minimum depends on the number of running applications and the maximum is determined by minimum-user-limit-percent. For example, suppose minimum-user-limit-percent is 25: when two users submit applications to the queue, neither may use more than 50% of the queue's resources; with three users, no user may use more than 33%; with four or more users, no user may use more than 25%.
(4) user-limit-factor: a multiple of the queue capacity that caps the resources a single user may consume; the default is 1, meaning a user may use at most the queue's full configured capacity. For example, with a value of 2, a single user may use up to twice the queue's capacity when free resources are available. A sample capacity-scheduler.xml fragment follows.
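A capacity-scheduler.xml sketch of these parameters for a hypothetical queue dev under root (the queue names and all values are assumptions, not recommendations; recall that sibling capacities must sum to 100):

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>dev,prod</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>40</value> <!-- sibling capacities: 40 + 60 = 100 -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>60</value> <!-- dev may borrow up to 60% of the cluster -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.minimum-user-limit-percent</name>
    <value>25</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.user-limit-factor</name>
    <value>1</value> <!-- one user may use at most the queue's capacity -->
  </property>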
2. Parameters related to limiting the number of applications
(1) maximum-applications: the maximum number of applications that may be pending or running in the cluster or in a queue. This is a hard limit: once it is exceeded, newly submitted applications are rejected. The default is 10000. The cluster-wide limit, which also serves as the per-queue default, is set with yarn.scheduler.capacity.maximum-applications, while an individual queue can override it with yarn.scheduler.capacity.<queue-path>.maximum-applications.
(2) maximum-am-resource-percent: the upper limit on the fraction of cluster resources that may be used to run ApplicationMasters; it is usually used to bound the number of concurrently active applications. The value is a float, defaulting to 0.1 (10%). The cluster-wide limit, again the per-queue default, is set with yarn.scheduler.capacity.maximum-am-resource-percent, and an individual queue can override it with yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent. A sample fragment follows.
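Continuing the sketch, a cluster-wide default plus a per-queue override for the hypothetical dev queue (values illustrative):

  <property>
    <name>yarn.scheduler.capacity.maximum-applications</name>
    <value>10000</value> <!-- cluster-wide default -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-applications</name>
    <value>2000</value> <!-- per-queue override -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
    <value>0.1</value> <!-- at most 10% of cluster resources for AMs -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-am-resource-percent</name>
    <value>0.2</value> <!-- per-queue override -->
  </property>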
3. Queue state and access control parameters
(1) state: the queue state, either STOPPED or RUNNING. If a queue is STOPPED, users cannot submit applications to it or to its subqueues. Likewise, if the root queue is STOPPED, no applications can be submitted to the cluster; applications already running still finish normally, which allows a queue to be drained gracefully.
(2) acl_submit_applications: defines which Linux users or user groups may submit applications to the queue. Note that this property is inherited: if a user may submit to a queue, they may also submit to all of its subqueues. In the value, commas separate users (or groups) and a space separates the user list from the group list, e.g. "user1,user2 group1,group2".
(3) acl_administer_queue: designates the queue's administrators, who may control all applications in the queue, for example killing any of them. This property is likewise inherited: a user who can administer a queue can administer all of its subqueues. A sample fragment follows.
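And the state and ACL settings for the same hypothetical dev queue (user and group names are placeholders):

  <property>
    <name>yarn.scheduler.capacity.root.dev.state</name>
    <value>RUNNING</value> <!-- set to STOPPED to drain the queue -->
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.acl_submit_applications</name>
    <value>user1,user2 group1,group2</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.acl_administer_queue</name>
    <value>admin1 admingroup</value>
  </property>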
Fair Scheduler
The Fair Scheduler organizes jobs into resource pools and divides resources fairly among those pools. By default, each user gets a separate resource pool, so every user receives an equal share of the cluster no matter how many jobs they submit. The pool for a job can also be chosen by the user's Unix group or by a job configuration (jobconf) property. Within each pool, fair sharing is used to divide capacity among the running jobs. Pools can also be given weights so that the cluster is shared in deliberately unequal proportions.
Beyond fair sharing, the Fair Scheduler lets you give a pool a guaranteed minimum share, which is useful for ensuring that particular users, groups, or production applications always have sufficient resources. When a pool contains jobs, it receives at least its minimum share; when a pool does not need its full guarantee, the excess is divided among the other pools.
To enable the fair scheduling mode, first set the yarn.resourcemanager.scheduler.class property in yarn-site.xml to org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler:
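The corresponding yarn-site.xml fragment:

  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>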
The Fair Scheduler lets users place queue definitions in a dedicated configuration file (fair-scheduler.xml by default). For each queue, the administrator can configure the following options (a sample file follows the list):
(1) minResources: the minimum resource guarantee, written as "X mb, Y vcores". When a queue's minimum is not satisfied, it receives resources before its sibling queues. The meaning of the guarantee depends on the scheduling policy (described below): under the fair policy only memory is considered, so a queue whose memory usage exceeds its minimum is deemed satisfied; under the drf policy the dominant resource is considered, so a queue is deemed satisfied when its dominant-resource usage exceeds its minimum.
(2) maxResources: the maximum resources the queue may use. The Fair Scheduler ensures that no queue's usage exceeds this limit.
(3) maxRunningApps: the maximum number of applications that may run simultaneously. Limiting this prevents the intermediate output of too many concurrently running Map Tasks from filling the disk.
(4) minSharePreemptionTimeout: the minimum-share preemption timeout. If a pool's resource usage has stayed below its minimum share for this long, it begins to preempt resources.
(5) schedulingMode/schedulingPolicy: the queue's scheduling policy, one of fifo, fair, or drf.
(6) aclSubmitApps: the list of Linux users or user groups that may submit applications to the queue; the default is "*", meaning any user may submit. This property is inherited: a child queue's list includes its parent's list. As with the Capacity Scheduler ACLs, commas separate users (or groups) and a space separates the user list from the group list, e.g. "user1,user2 group1,group2".
(7) aclAdministerApps: the queue's list of administrators, who may manage the queue's resources and applications, for example killing any application. A sample fair-scheduler.xml sketch follows.
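A minimal sketch of a fair-scheduler.xml queue using these options; the queue name "analytics", the user and group names, and all values are illustrative assumptions:

  <allocations>
    <queue name="analytics">
      <!-- minimum guarantee, in the "X mb, Y vcores" form described above -->
      <minResources>10240 mb, 10 vcores</minResources>
      <!-- hard cap on the queue's usage -->
      <maxResources>40960 mb, 40 vcores</maxResources>
      <!-- at most 50 applications running at once -->
      <maxRunningApps>50</maxRunningApps>
      <!-- preempt if below the minimum share for 300 seconds -->
      <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
      <!-- fifo, fair, or drf -->
      <schedulingPolicy>fair</schedulingPolicy>
      <aclSubmitApps>user1,user2 group1,group2</aclSubmitApps>
      <aclAdministerApps>admin1 admingroup</aclAdministerApps>
    </queue>
  </allocations>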
Administrators can also give individual users a maxRunningJobs attribute to limit how many applications each may run simultaneously. In addition, the administrator can set defaults for the properties above with the following parameters (a sketch follows the list):
(1) userMaxJobsDefault: the default value of the user's maxRunningJobs attribute.
(2) defaultMinSharePreemptionTimeout: the default value of the minSharePreemptionTimeout attribute of the queue.
(3) defaultPoolSchedulingMode: the default value of the schedulingMode attribute of the queue.
(4) fairSharePreemptionTimeout: the fair-share preemption timeout. If a pool's resource usage has stayed below half of its fair share for this long, it begins to preempt resources.
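As a sketch, the top-level elements below show how these defaults might look in YARN's fair-scheduler.xml. The prose above uses the older Hadoop 1.x names; the YARN equivalents are assumed here to be userMaxAppsDefault (for userMaxJobsDefault), a per-user maxRunningApps element (for maxRunningJobs), and defaultQueueSchedulingPolicy (for defaultPoolSchedulingMode) — check your release's documentation. Values are illustrative:

  <allocations>
    <!-- per-user cap; "user1" is a hypothetical user name -->
    <user name="user1">
      <maxRunningApps>10</maxRunningApps>
    </user>
    <!-- default per-user cap for users not listed above -->
    <userMaxAppsDefault>5</userMaxAppsDefault>
    <!-- default minimum-share preemption timeout, in seconds -->
    <defaultMinSharePreemptionTimeout>600</defaultMinSharePreemptionTimeout>
    <!-- default scheduling policy for queues that do not set one -->
    <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
    <!-- fair-share preemption timeout, in seconds -->
    <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
  </allocations>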
That covers the scheduling modes of YARN; I hope the material above proves helpful.