2025-04-06 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
How should you understand Yarn, the distributed resource scheduling framework? This article walks through the question in detail, in the hope of giving readers who face it a simpler, workable approach.
As a resource management and task scheduling framework, Yarn is a core part of the big data stack, and in recent years it has become a staple of big data interviews. To help with interview preparation, the editor has put together an interview-style overview of Yarn covering its architecture, its workflow, and its schedulers.
1. The architecture of Yarn
Yarn is a framework for resource management and task scheduling. It consists mainly of three modules: ResourceManager (RM), NodeManager (NM), and ApplicationMaster (AM).
(1) ResourceManager is responsible for monitoring, allocating, and managing all resources.
(2) ApplicationMaster is responsible for scheduling and coordinating one specific application.
(3) NodeManager is responsible for the maintenance of each node.
The RM has final authority over all applications and over the allocation of resources. Each AM negotiates resources with the RM and communicates with the NodeManagers to execute and monitor tasks.
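The division of responsibility described above can be sketched as a toy model. This is only an illustration, not Yarn's API: the class and method names (`allocate`, `launch`, `run_task`), the memory-only resource model, and the first-fit placement policy are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in for an NM: tracks the free memory of one node."""
    name: str
    free_mb: int

    def launch(self, mb: int) -> str:
        # A real NM starts a container process; here we only book the memory.
        if mb > self.free_mb:
            raise ValueError(f"{self.name} has only {self.free_mb} MB free")
        self.free_mb -= mb
        return f"container@{self.name}({mb}MB)"


@dataclass
class ResourceManager:
    """Toy stand-in for the RM: the only component that decides allocations."""
    nodes: list

    def allocate(self, mb: int):
        # Hypothetical policy: pick the first node with enough free memory.
        for node in self.nodes:
            if node.free_mb >= mb:
                return node
        return None


@dataclass
class ApplicationMaster:
    """Toy stand-in for an AM: asks the RM for resources, then uses the NM."""
    rm: ResourceManager
    containers: list = field(default_factory=list)

    def run_task(self, mb: int) -> bool:
        node = self.rm.allocate(mb)               # negotiate with the RM
        if node is None:
            return False                          # the RM has nothing to grant
        self.containers.append(node.launch(mb))   # work with the NM
        return True


rm = ResourceManager([NodeManager("nm1", 2048), NodeManager("nm2", 4096)])
am = ApplicationMaster(rm)
results = [am.run_task(1024), am.run_task(3072), am.run_task(8192)]
print(results, am.containers)
```

The point of the sketch is that the AM never picks a node on its own: it can only use what the RM grants, and the actual launch happens on the NM.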
2. The workflow of Yarn
(1) The client submits the application to the RM, including the information needed to start the application's ApplicationMaster, such as the ApplicationMaster program, the command that starts it, and the user program.
(2) The ResourceManager starts a container to run the ApplicationMaster. On startup the ApplicationMaster registers itself with the ResourceManager, and once it has started successfully it keeps a heartbeat with the RM.
(3) The ApplicationMaster sends a request to the ResourceManager for the number of containers it needs.
(4) The ResourceManager returns information about the containers it has allocated for the ApplicationMaster's request. The ApplicationMaster initializes each successfully allocated container. Once a container's launch information is ready, the AM communicates with the corresponding NodeManager and asks it to start the container. The AM and NM keep a heartbeat so that the AM can monitor and manage the tasks running on the NM.
(5) While a container is running, the ApplicationMaster monitors it. The container reports its progress and status to the corresponding AM over RPC.
(6) While the application is running, the client communicates directly with the AM to obtain the application's status, progress updates, and other information.
(7) After the application has finished running, the ApplicationMaster unregisters from the ResourceManager and allows the containers that belong to it to be reclaimed.
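The seven steps above can be traced as an ordered list of messages. This is a minimal sketch for memorizing the sequence: the function name `run_application` and the message strings are made up for the example, and nothing here calls a real Yarn API.

```python
def run_application(num_containers: int) -> list:
    """Return the ordered events of one toy application run."""
    events = []
    # (1) The client submits the application, including how to start its AM.
    events.append("client -> RM: submit application")
    # (2) The RM starts a container for the AM; the AM registers with the RM.
    events.append("RM: start AM container")
    events.append("AM -> RM: register")
    # (3) The AM asks the RM for the containers it needs.
    events.append(f"AM -> RM: request {num_containers} containers")
    # (4) The RM answers with allocations; the AM asks the NMs to start them.
    for i in range(num_containers):
        events.append(f"AM -> NM: start container {i}")
    # (5) Each running container reports progress and status to the AM.
    for i in range(num_containers):
        events.append(f"container {i} -> AM: status")
    # (6) The client asks the AM directly for the application status.
    events.append("client -> AM: get status")
    # (7) The AM unregisters, and its containers can be reclaimed.
    events.append("AM -> RM: unregister")
    return events


for line in run_application(2):
    print(line)
```

Note who talks to whom: the client talks to the RM only at submission, and to the AM afterwards; the RM is contacted by the AM for registration, container requests, and unregistration.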
3. The schedulers of Yarn
In Yarn, the Scheduler is responsible for allocating resources to applications. There are three schedulers to choose from: FIFO Scheduler, Capacity Scheduler, and Fair Scheduler.
(1) FIFO Scheduler
FIFO Scheduler places applications in a single first-in, first-out queue in the order they were submitted. When allocating resources, it first serves the application at the head of the queue; once that application's needs are met, it moves on to the next one, and so on.
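The FIFO policy can be sketched in a few lines. This is an illustrative model, not Yarn's implementation: resources are reduced to a single abstract unit count, and the function name `fifo_allocate` is hypothetical.

```python
from collections import deque


def fifo_allocate(requests, capacity):
    """Grant resources strictly in submission order.

    requests: list of (app_name, demand) tuples in submission order.
    capacity: total units available in the cluster.
    Returns {app_name: granted_units}; apps still waiting get 0.
    """
    grants = {app: 0 for app, _ in requests}
    queue = deque(requests)
    while queue and capacity > 0:
        app, demand = queue.popleft()      # always serve the head of the queue
        granted = min(demand, capacity)
        grants[app] = granted
        capacity -= granted
    return grants


print(fifo_allocate([("a", 6), ("b", 5), ("c", 2)], 10))
```

With 10 units, `a` gets its full 6, `b` gets the remaining 4, and `c` gets nothing even though its demand is tiny. This is the usual drawback of FIFO: a small job can be stuck behind large ones.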
(2) Capacity Scheduler
The Capacity scheduler allows multiple organizations to share the entire cluster, with each organization getting a portion of the cluster's computing power. Each organization is assigned a dedicated queue, and each queue is allocated a share of the cluster's resources, so a cluster configured with multiple queues can serve multiple organizations. In addition, a queue can be subdivided hierarchically, so that multiple members within an organization share that queue's resources. Within a queue, resources are scheduled using a first-in-first-out (FIFO) strategy.
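A rough sketch of the idea follows: each queue gets a fixed percentage of the cluster, and applications inside a queue are served in FIFO order. The function name `capacity_allocate`, the queue names, and the single-unit resource model are assumptions for the example. The sketch also leaves out queue hierarchies and any borrowing of idle capacity between queues.

```python
def capacity_allocate(cluster_units, queue_percent, submissions):
    """Split the cluster between queues, then serve each queue FIFO.

    queue_percent: {queue_name: integer percentage of the cluster}.
    submissions: list of (queue_name, app_name, demand) in submission order.
    Returns {app_name: granted_units}.
    """
    # Each queue's budget is its configured share of the whole cluster.
    budget = {q: cluster_units * pct // 100 for q, pct in queue_percent.items()}
    grants = {}
    for queue, app, demand in submissions:
        # Within a queue, earlier submissions are satisfied first (FIFO).
        granted = min(demand, budget[queue])
        grants[app] = granted
        budget[queue] -= granted
    return grants


grants = capacity_allocate(
    100,
    {"prod": 70, "dev": 30},
    [("prod", "etl", 50), ("dev", "adhoc", 40), ("prod", "report", 30)],
)
print(grants)
```

Here `adhoc` asks for 40 units but is capped at the 30 units of the `dev` queue, while `report` gets only the 20 units that `etl` left in `prod`. A large job in one queue therefore cannot starve another organization's queue.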
(3) Fair Scheduler
With the Fair scheduler, no fixed amount of system resources has to be reserved in advance; the scheduler dynamically rebalances resources among all running jobs. For example, when a first, large job is submitted and is the only job running, it gets all of the cluster's resources; when a second, small job is then submitted, the Fair scheduler gives half of the resources to it, so that the two jobs share the cluster fairly.
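The rebalancing can be illustrated with a max-min fair split, which is one common way to define "fair" and is used here as an assumption: every running job gets an equal share, and whatever a job does not need is redistributed to the others. The function name `fair_shares` is hypothetical, and the sketch ignores weights, queues, and preemption.

```python
def fair_shares(capacity, demands):
    """Max-min fair split of capacity among jobs.

    demands: {job_name: units requested}.
    Returns {job_name: units granted} as floats.
    """
    shares = {job: 0.0 for job in demands}
    active = set(demands)
    remaining = float(capacity)
    while active and remaining > 1e-9:
        equal = remaining / len(active)
        # Jobs whose unmet demand fits inside an equal share are fully served.
        satisfied = {j for j in active if demands[j] - shares[j] <= equal}
        if not satisfied:
            # Nobody can be fully served: split what is left evenly and stop.
            for j in active:
                shares[j] += equal
            remaining = 0.0
        else:
            for j in satisfied:
                remaining -= demands[j] - shares[j]
                shares[j] = float(demands[j])
            active -= satisfied
    return shares


print(fair_shares(100, {"big": 100}))
print(fair_shares(100, {"big": 100, "small": 100}))
print(fair_shares(100, {"big": 100, "small": 20}))
```

The first call matches the single-job case, where the large job takes the whole cluster. In the second call both jobs want everything, so each gets half. In the third, the small job needs only 20 units, so the large job keeps the remaining 80.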
That concludes this look at how to understand Yarn, the distributed resource scheduling framework. I hope the content above is of some help to you.
© 2024 shulou.com SLNews company. All rights reserved.