How to understand the resource management and scheduling system YARN

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article explains how to understand YARN, a resource management and scheduling system that many people find unclear. The editor has summarized the following points, and I hope you gain something from them.

As a general-purpose resource management system, YARN aims to deploy both short jobs and long-running services onto one cluster and to provide them with unified resource management and scheduling. In short, it mainly solves two problems: 1. improving the utilization of cluster resources; 2. automating the deployment of services.

I. The basic structure of YARN

YARN adopts a master/slave architecture overall: ResourceManager is the master and NodeManager is the slave, and ResourceManager is responsible for the unified management and scheduling of the resources on each NodeManager. When users submit an application, they need to provide an ApplicationMaster to track and manage it. The ApplicationMaster is responsible for requesting resources from ResourceManager and for asking NodeManagers to start tasks that occupy a certain amount of resources. Because different ApplicationMasters are distributed across different nodes, they do not affect each other.

YARN is mainly composed of ResourceManager, NodeManager, ApplicationMaster, and Container.

ResourceManager (RM): a global resource manager, responsible for managing and allocating resources across the whole system. It is mainly composed of the Scheduler and the Applications Manager.

1. Scheduler: its main function is to allocate the resources in the system to each application according to constraints such as resource capacity and queues.

2. Applications Manager: responsible for all applications in the entire system, including accepting application submissions, negotiating resources with the Scheduler to start the ApplicationMaster, monitoring the running status of the ApplicationMaster, and restarting it on failure.

ApplicationMaster (AM): each application submitted by the user contains a separate AM, and its main functions include negotiating with the RM scheduler to obtain resources (represented by Container), further allocating the resulting resources to internal tasks, communicating with NM to start / stop tasks, etc., and monitoring the running status of all tasks.

NodeManager (NM): the resource manager on each node. On the one hand, it regularly reports the node's resource usage and the running status of each Container to RM. On the other hand, it receives and processes requests from AM, such as starting or stopping a task.

Container: the basic resource allocation unit of YARN. It abstracts the running environment of an application and provides it with resource isolation, encapsulating memory, CPU, disk, network and other resources.
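The relationship between these components can be sketched in a few lines of Python. This is a toy model, not YARN's API: the class names mirror the components above, but the fields, the `allocate` and `launch` methods, and the first-fit placement rule are all assumptions made for illustration, and only memory and virtual cores are modelled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    """A bundle of resources granted on one node (toy: memory and vcores only)."""
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Tracks the free resources of a single node."""
    name: str
    free_memory_mb: int
    free_vcores: int

    def can_fit(self, memory_mb: int, vcores: int) -> bool:
        return self.free_memory_mb >= memory_mb and self.free_vcores >= vcores

    def launch(self, memory_mb: int, vcores: int) -> Container:
        # Reserve the resources and return a container describing them.
        self.free_memory_mb -= memory_mb
        self.free_vcores -= vcores
        return Container(self.name, memory_mb, vcores)


@dataclass
class ResourceManager:
    """Global view of all nodes; hypothetical first-fit placement."""
    nodes: list = field(default_factory=list)

    def allocate(self, memory_mb: int, vcores: int):
        for nm in self.nodes:
            if nm.can_fit(memory_mb, vcores):
                return nm.launch(memory_mb, vcores)
        return None  # no node has enough free resources


rm = ResourceManager([NodeManager("node1", 4096, 4), NodeManager("node2", 8192, 8)])
c1 = rm.allocate(3072, 2)   # fits on node1
c2 = rm.allocate(2048, 2)   # node1 has only 1024 MB left, so node2 is used
c3 = rm.allocate(16384, 1)  # larger than any node, so nothing is granted
print(c1, c2, c3)
```

In this sketch the ResourceManager only decides where a request fits, while each NodeManager owns the bookkeeping for its own node, which is the division of labour described above.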

II. High availability of YARN

ResourceManager HA: an Active/Standby pair of ResourceManagers is introduced to remove the ResourceManager single point of failure through redundancy.

ResourceManager Recovery: ResourceManager has a built-in restart recovery function.

NodeManager Recovery: NodeManager has a built-in restart recovery function.
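The Active/Standby idea combined with recovery can be illustrated with a small sketch. Everything here is hypothetical: `StateStore`, `ResourceManagerInstance`, and `failover` are names invented for the example, and the store is an in-memory dictionary standing in for whatever shared, durable store a real cluster would use.

```python
class StateStore:
    """Stands in for a shared store that outlives any single ResourceManager."""

    def __init__(self):
        self.applications = {}


class ResourceManagerInstance:
    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.active = False
        self.known_apps = {}

    def become_active(self):
        # Recovery step: reload the application state saved by the previous active instance.
        self.known_apps = dict(self.store.applications)
        self.active = True

    def submit(self, app_id, state="ACCEPTED"):
        if not self.active:
            raise RuntimeError(f"{self.name} is standby and cannot accept applications")
        self.known_apps[app_id] = state
        self.store.applications[app_id] = state  # persist so a successor can recover it


def failover(failed, standby):
    """Demote the failed instance and promote the standby."""
    failed.active = False
    standby.become_active()
    return standby


store = StateStore()
rm1 = ResourceManagerInstance("rm1", store)
rm2 = ResourceManagerInstance("rm2", store)
rm1.become_active()
rm1.submit("app_1")

current = failover(rm1, rm2)
print(current.name, current.known_apps)
```

The point of the sketch is that redundancy alone is not enough: the standby can take over without losing applications only because the state was written to a store that both instances share.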

III. YARN workflow

When a user submits an application to YARN, YARN runs it in two phases. In the first phase, the ApplicationMaster is started. In the second phase, the ApplicationMaster creates the application, requests resources for it, and monitors its running status.

1. Submit the application

2. Start ApplicationMaster

3. ApplicationMaster registration

4. Obtain resources

5. Request to start Container

6. Container monitoring

7. Unregister ApplicationMaster
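The seven steps can be traced with a short simulation. This is a sketch under stated assumptions: `run_application` and its parameters are hypothetical, tasks always succeed, and `containers_per_round` is an invented knob for how many containers the RM grants per request.

```python
def run_application(app_name, tasks, containers_per_round):
    """Walk one application through the seven steps and return (event log, finished tasks)."""
    if containers_per_round < 1:
        raise ValueError("containers_per_round must be at least 1")

    log = [
        f"1 client submits {app_name} to RM",
        f"2 RM asks an NM to start the AM for {app_name}",
        "3 AM registers with RM",
    ]
    pending = list(tasks)
    finished = []
    while pending:
        # Step 4: the AM obtains as many containers as the RM grants this round.
        granted = pending[:containers_per_round]
        pending = pending[containers_per_round:]
        log.append(f"4 AM obtains {len(granted)} container(s) from RM")
        for task in granted:
            log.append(f"5 AM asks NM to start a container for {task}")
            log.append(f"6 AM observes {task} finish")
            finished.append(task)
    log.append("7 AM unregisters from RM")
    return log, finished


log, finished = run_application("wordcount", ["map0", "map1", "reduce0"], 2)
for line in log:
    print(line)
```

Steps 1 to 3 correspond to the first phase (starting the ApplicationMaster), and steps 4 to 7 correspond to the second phase, where steps 4 to 6 repeat until every task has run.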

IV. YARN resource scheduler

Hierarchical queue management: queues are organized into sub-queues, each with a minimum capacity and a maximum capacity.

Multi-tenant resource schedulers: Capacity Scheduler and Fair Scheduler.

1. Capacity Scheduler: resources are divided by queue. Each queue can be given a minimum guarantee and an upper limit on resource use, and each user can also be given an upper limit to prevent abuse. When a queue has spare resources, they can be temporarily shared with other queues. Capacity Scheduler has the following characteristics: capacity guarantee, elasticity, multi-tenancy, security (ACLs), and dynamic update of configuration files.

2. Fair Scheduler: resources are also divided by queue, and each queue is given a minimum guarantee and an upper limit. It is similar to Capacity Scheduler; the differences are mainly in the following aspects: fair sharing of resources, flexible configuration of scheduling policies, better response time for small applications, and moving applications between queues. Instead of expressing resources as a percentage, Fair Scheduler expresses them as actual amounts.

Scheduling can also be based on node labels, and resource preemption is supported.
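The two policies can be contrasted with a toy calculation. Neither function is the real scheduler's algorithm: `capacity_allocate` is a simplified sketch of "guarantee first, then lend spare capacity up to the queue's ceiling", and `max_min_fair_share` is the textbook max-min fairness procedure, used here as a stand-in for fair sharing. Queue names, percentages, and demands are invented for the example.

```python
def capacity_allocate(total, queues, demand):
    """queues: {name: (guaranteed_pct, max_pct)}; demand: {name: units wanted}."""
    alloc = {}
    # Pass 1: every queue gets its demand, capped at its guaranteed capacity.
    for name, (guaranteed_pct, _) in queues.items():
        guaranteed = total * guaranteed_pct // 100
        alloc[name] = min(demand.get(name, 0), guaranteed)
    # Pass 2: lend spare capacity to queues that still want more, up to their maximum.
    spare = total - sum(alloc.values())
    for name, (_, max_pct) in queues.items():
        ceiling = total * max_pct // 100
        extra = min(demand.get(name, 0) - alloc[name], ceiling - alloc[name], spare)
        if extra > 0:
            alloc[name] += extra
            spare -= extra
    return alloc


def max_min_fair_share(total, demand):
    """Split `total` so that small demands are met in full and the rest share equally."""
    share = {name: 0.0 for name in demand}
    active = {name for name, wanted in demand.items() if wanted > 0}
    remaining = float(total)
    while active and remaining > 1e-9:
        equal = remaining / len(active)
        satisfied = {n for n in active if demand[n] - share[n] <= equal}
        if not satisfied:
            # Nobody can be fully satisfied: split what is left equally and stop.
            for n in active:
                share[n] += equal
            break
        for n in satisfied:
            remaining -= demand[n] - share[n]
            share[n] = float(demand[n])
        active -= satisfied
    return share


capacity_result = capacity_allocate(
    100, {"prod": (60, 80), "dev": (40, 60)}, {"prod": 20, "dev": 70}
)
fair_result = max_min_fair_share(12, {"a": 2, "b": 5, "c": 10})
print(capacity_result)
print(fair_result)
```

In the capacity example, `dev` borrows the capacity that `prod` is not using but stops at its 60% ceiling. In the fair-share example, the small demands of `a` and `b` are met in full while `c` receives what remains, which is one way to see why fair sharing helps the response time of small applications.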

V. Resource isolation of YARN

CPU isolation mechanism
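A minimal sketch of how share-based CPU isolation behaves, assuming the Cgroups scheme mentioned in section VI. Under contention, Cgroups CPU shares divide CPU time in proportion to each group's weight. The function name and the figure of 1024 shares per virtual core are illustrative assumptions, not values taken from this article.

```python
def cpu_fractions(container_vcores, shares_per_vcore=1024):
    """Return each container's fraction of CPU time when all containers are busy.

    container_vcores: {container name: requested virtual cores}
    shares_per_vcore: assumed weight per virtual core (hypothetical default)
    """
    shares = {name: vcores * shares_per_vcore for name, vcores in container_vcores.items()}
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


fractions = cpu_fractions({"container_1": 1, "container_2": 3})
print(fractions)
```

Shares are relative weights rather than hard caps: in this model a container can use more than its fraction while other containers are idle, and is squeezed back to its proportion only when the CPU is contended.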

VI. Ecosystem with YARN as the core

Many types of application frameworks can run on YARN, including the offline computing framework MapReduce, the real-time computing framework Storm, and the DAG computing framework Tez, so that one cluster truly serves multiple purposes. Such a cluster becomes a lightweight, elastic computing platform. It is lightweight because YARN adopts the Cgroups lightweight isolation scheme. It is elastic because YARN can adjust the resources occupied by each computing framework or application according to its load and demand, which allows cluster resources to be shared and scaled flexibly.

As YARN develops into a more complete resource management system, long-running services such as web servers and MySQL Server can eventually be deployed on YARN. In this way, YARN will become a unified service deployment and management platform, and finally form an ecosystem with YARN at its core.

Resource management system Mesos: its design motivation is to solve resource isolation and sharing among different frameworks in a diverse environment. Although its design motivation differs slightly from YARN's, its architecture and implementation strategy are similar. Companies using Mesos include Twitter, Douban and others.

The evolution of resource management system architecture: centralized architecture (MRv1 JobTracker), two-tier scheduling architecture (YARN, Mesos), shared state architecture (Omega).

