What are the main roles in the YARN architecture

2025-01-18 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains the main roles in the YARN architecture and how they interact when a job is submitted.

Since Hadoop 0.23, the MapReduce computing framework has run on a completely new architecture, YARN. In the old MRv1 design, the JobTracker was a single point of failure that carried too many responsibilities: it handled both resource management and scheduling and job life-cycle management (task scheduling, tracking task status, and task fault tolerance). As a result, when a large number of tasks had to be processed, the single JobTracker became a bottleneck in memory and other resources, which limited scalability, resource utilization, and the ability to run workloads other than MapReduce.

The MRv2 YARN framework separates resource scheduling from task management and monitoring. The ResourceManager, working with the NodeManager on each node, is responsible for resource scheduling, while each application (job) is controlled by its own ApplicationMaster (AM), which schedules, manages, and monitors that application's tasks. The ResourceManager also monitors the status of each AM and, if an AM fails, can restart it on another node.
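The monitoring and restart behaviour described above can be pictured with a small toy model. This is only a sketch and does not use any real YARN API: the class name AmLivenessTracker, the node names, and the 10-second timeout are all assumptions made for illustration, as is the idea of detecting failure through missed heartbeats.

```python
class AmLivenessTracker:
    """Toy model of an RM tracking ApplicationMaster liveness (not YARN code)."""

    def __init__(self, nodes, am_timeout_s=10.0):
        self.nodes = list(nodes)          # hypothetical node names
        self.am_timeout_s = am_timeout_s  # assumed expiry interval
        self.am_node = {}                 # app id -> node hosting its AM
        self.last_heartbeat = {}          # app id -> time of last heartbeat

    def start_am(self, app_id, now, exclude=None):
        # Pick the first node that is not the one that just failed.
        node = next(n for n in self.nodes if n != exclude)
        self.am_node[app_id] = node
        self.last_heartbeat[app_id] = now
        return node

    def am_heartbeat(self, app_id, now):
        self.last_heartbeat[app_id] = now

    def check_liveness(self, now):
        """Restart every AM whose last heartbeat is older than the timeout."""
        restarted = {}
        for app_id, seen in list(self.last_heartbeat.items()):
            if now - seen > self.am_timeout_s:
                failed_node = self.am_node[app_id]
                restarted[app_id] = self.start_am(app_id, now, exclude=failed_node)
        return restarted


rm = AmLivenessTracker(nodes=["node-1", "node-2"])
rm.start_am("app-1", now=0.0)        # AM starts on node-1
rm.am_heartbeat("app-1", now=5.0)    # healthy heartbeat
print(rm.check_liveness(now=8.0))    # {} (still within the timeout)
print(rm.check_liveness(now=20.0))   # {'app-1': 'node-2'} (restarted elsewhere)
```

Timestamps are passed in explicitly rather than read from the clock so the example is deterministic.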

The MRv1 MapReduce framework uses slots as its unit of resource, and map slots and reduce slots are kept separate, so they cannot be shared and resource utilization is low. YARN instead describes resources in terms of each node's CPU, memory, and so on, which greatly improves resource utilization.
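To make the difference concrete, here is a minimal sketch that schedules the same map-only workload under both models. The node sizes (4 map slots plus 4 reduce slots, or 8192 MB and 8 vcores), the task sizes, and the first-fit loops are assumptions chosen for illustration; real schedulers are considerably more involved.

```python
from dataclasses import dataclass


@dataclass
class Task:
    kind: str        # "map" or "reduce"
    memory_mb: int   # memory the task needs
    vcores: int      # CPU cores the task needs


def schedule_with_slots(tasks, map_slots, reduce_slots):
    """MRv1-style: a task may only use a free slot of its own kind."""
    free = {"map": map_slots, "reduce": reduce_slots}
    started = []
    for task in tasks:
        if free[task.kind] > 0:
            free[task.kind] -= 1
            started.append(task)
    return started


def schedule_with_resources(tasks, memory_mb, vcores):
    """YARN-style: any task fits while enough memory and vcores remain."""
    started = []
    for task in tasks:
        if task.memory_mb <= memory_mb and task.vcores <= vcores:
            memory_mb -= task.memory_mb
            vcores -= task.vcores
            started.append(task)
    return started


# Hypothetical workload: eight map tasks and no reduce tasks.
map_only_job = [Task("map", 1024, 1) for _ in range(8)]

print(len(schedule_with_slots(map_only_job, map_slots=4, reduce_slots=4)))   # 4
print(len(schedule_with_resources(map_only_job, memory_mb=8192, vcores=8)))  # 8
```

With slots, the four reduce slots sit idle because map tasks cannot use them; with a shared pool of memory and vcores, all eight tasks start.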

MRv1 only supports batch MapReduce computing. The MRv2 YARN framework provides an ApplicationMaster plug-in framework library that supports computing models other than MapReduce, such as real-time and near-real-time stream processing, MPI, and others. This turns Hadoop into a general computing platform in which resources and data are shared, which reduces cluster operation and maintenance costs and improves resource utilization.
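One way to picture the plug-in idea is that each framework supplies its own ApplicationMaster, while the resource layer only ever sees generic container requests. The sketch below is a hypothetical illustration, not the YARN library itself: the class names, the (memory_mb, vcores) tuple shape, and the container sizes are all assumptions.

```python
from abc import ABC, abstractmethod


class ApplicationMasterSketch(ABC):
    """Hypothetical plug-in point: one subclass per computing framework."""

    @abstractmethod
    def container_requests(self):
        """Return a list of (memory_mb, vcores) tuples the framework wants."""


class MapReduceMaster(ApplicationMasterSketch):
    def __init__(self, num_splits, num_reducers):
        self.num_splits = num_splits
        self.num_reducers = num_reducers

    def container_requests(self):
        # Assumed sizes: 1024 MB per map task, 2048 MB per reduce task.
        return [(1024, 1)] * self.num_splits + [(2048, 1)] * self.num_reducers


class MpiMaster(ApplicationMasterSketch):
    def __init__(self, ranks):
        self.ranks = ranks

    def container_requests(self):
        # Assumed size: every MPI rank gets an identical container.
        return [(4096, 2)] * self.ranks


def total_memory_requested(master):
    """The resource layer needs no knowledge of the framework behind the AM."""
    return sum(memory_mb for memory_mb, _ in master.container_requests())


print(total_memory_requested(MapReduceMaster(num_splits=4, num_reducers=2)))  # 8192
print(total_memory_requested(MpiMaster(ranks=3)))                             # 12288
```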

In MRv2, JobHistory runs as a separate service. In MRv1 it was part of the JobTracker and added to the load on it.

The picture above shows the YARN architecture, which mainly includes the following roles:

ResourceManager (RM): receives job requests from clients, receives and monitors resource reports from the NodeManagers (NM), allocates and schedules resources, and starts and monitors each ApplicationMaster (AM).

NodeManager (NM): manages the resources on its node. It starts Containers to run tasks, reports resource and container status to the RM, and reports task progress to the AM.

ApplicationMaster (AM): manages and schedules the tasks of a single application (job). It requests resources from the RM, sends launch-Container instructions to the NMs, and receives task status information from the NMs.
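The reporting relationship between NM and RM can be sketched as a stream of per-node reports that the RM folds into one view of the cluster. This is a toy model under stated assumptions: NodeReport and ClusterView are hypothetical names, and the fields shown are not the real heartbeat format.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """Hypothetical shape of one NodeManager report."""
    node: str
    total_mb: int
    used_mb: int
    running_containers: int


class ClusterView:
    """Toy stand-in for the RM's picture of the cluster."""

    def __init__(self):
        self.latest = {}  # node name -> most recent NodeReport

    def on_heartbeat(self, report):
        # A newer report from the same node replaces the older one.
        self.latest[report.node] = report

    def free_mb(self):
        return sum(r.total_mb - r.used_mb for r in self.latest.values())


view = ClusterView()
view.on_heartbeat(NodeReport("nm-1", total_mb=8192, used_mb=2048, running_containers=2))
view.on_heartbeat(NodeReport("nm-2", total_mb=8192, used_mb=0, running_containers=0))
view.on_heartbeat(NodeReport("nm-1", total_mb=8192, used_mb=4096, running_containers=4))
print(view.free_mb())  # 12288
```

Only the latest report per node counts, so the second report from nm-1 overrides the first.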

Here is a brief introduction to the process of submitting a job:

1. The client submits a job to the RM, where it enters the Scheduler queue to be scheduled.

2. Based on the resources reported by the NMs (each NM regularly reports its resource and container usage), the RM asks a suitable NM to launch a container that runs the AM.

3. After the AM starts, it registers with the RM so that the client can look up the AM's information and then communicate with the AM directly.

4. The AM then negotiates with the RM for container resources, according to the tasks derived from the job's input splits.

5. After the RM allocates container resources to the AM, the AM asks the corresponding NMs to launch containers, using the information in each allocated container.

6. Each NM starts its containers to run tasks and reports progress and status to the AM while they run, similar to task reporting in MRv1. At the same time, the NM regularly reports container usage to the RM.

7. While the application (job) is running, the client can communicate with the AM to obtain the application's progress and status.

8. When the application (job) completes, the AM notifies the RM to clear its information, then shuts down and releases the container it occupies.
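The eight steps can be walked through with a small in-memory simulation. It is a sketch under stated assumptions rather than real YARN client code: the class names, the memory-only resource model, the first-fit allocation, and the container sizes (512 MB for the AM, 1024 MB per task) are all made up for illustration.

```python
class ToyNodeManager:
    def __init__(self, name, free_mb):
        self.name = name
        self.free_mb = free_mb  # memory this node currently reports as free

    def launch_container(self, memory_mb):
        self.free_mb -= memory_mb

    def release_container(self, memory_mb):
        self.free_mb += memory_mb


class ToyResourceManager:
    def __init__(self, node_managers):
        self.nms = node_managers
        self.registered_ams = {}  # app id -> name of the NM hosting the AM

    def allocate(self, memory_mb):
        """Return the first NM reporting enough free memory (assumed first-fit)."""
        for nm in self.nms:
            if nm.free_mb >= memory_mb:
                return nm
        raise RuntimeError("no NodeManager has enough free memory")


def run_job(rm, app_id, num_tasks, am_mb=512, task_mb=1024):
    log = ["1 client submits job to RM"]
    am_nm = rm.allocate(am_mb)
    am_nm.launch_container(am_mb)
    log.append(f"2 RM starts AM on {am_nm.name}")
    rm.registered_ams[app_id] = am_nm.name
    log.append("3 AM registers with RM")
    used = []
    for _ in range(num_tasks):
        nm = rm.allocate(task_mb)     # step 4: AM asks RM for a container
        nm.launch_container(task_mb)  # step 5: AM asks that NM to launch it
        used.append(nm)
    log.append(f"4-5 AM obtained and launched {len(used)} task containers")
    log.append("6-7 tasks run; NMs report to AM and RM, client polls AM")
    for nm in used:
        nm.release_container(task_mb)
    am_nm.release_container(am_mb)
    del rm.registered_ams[app_id]
    log.append("8 AM unregisters from RM and releases its container")
    return log


nms = [ToyNodeManager("nm-1", 2048), ToyNodeManager("nm-2", 4096)]
rm = ToyResourceManager(nms)
for line in run_job(rm, "app-1", num_tasks=3):
    print(line)
```

After the job finishes, both toy NodeManagers report their original free memory again, which mirrors the release in step 8.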

At this point, you should have a clearer understanding of the main roles in the YARN architecture. The best way to reinforce it is to try it out in practice.
