What are the parameters related to RM and NM in Hadoop YARN configuration

2025-03-28 Update From: SLTechnology News&Howtos

What are the parameters related to RM and NM in the Hadoop YARN configuration? This article goes through the ResourceManager (RM) and NodeManager (NM) parameters one by one, with an explanation and the default value for each.

Before configuring these parameters, make sure you fully understand what each one means, so that a misconfiguration does not put the cluster at risk. All of these parameters are configured in yarn-site.xml.

1. ResourceManager-related configuration parameters

(1) yarn.resourcemanager.address

Parameter explanation: the address exposed by ResourceManager to clients. Clients use this address to submit applications to the RM, kill applications, and so on.

Default value: ${yarn.resourcemanager.hostname}:8032

(2) yarn.resourcemanager.scheduler.address

Parameter explanation: the access address exposed by ResourceManager to ApplicationMaster. ApplicationMaster uses this address to request resources from RM, release resources, and so on.

Default value: ${yarn.resourcemanager.hostname}:8030

(3) yarn.resourcemanager.resource-tracker.address

Parameter explanation: the address exposed by ResourceManager to NodeManager. NodeManager uses this address to send heartbeats to the RM, pick up tasks, and so on.

Default value: ${yarn.resourcemanager.hostname}:8031

(4) yarn.resourcemanager.admin.address

Parameter explanation: the access address exposed by ResourceManager to the administrator. The administrator sends management commands to RM through this address.

Default value: ${yarn.resourcemanager.hostname}:8033

(5) yarn.resourcemanager.webapp.address

Parameter explanation: the address of the ResourceManager's external web UI. Users can view all kinds of cluster information in a browser through this address.

Default value: ${yarn.resourcemanager.hostname}:8088
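As a rough sketch of how the five address parameters above fit together: each default is built from ${yarn.resourcemanager.hostname}, so in many setups it is enough to set the hostname once in yarn-site.xml and override an individual address only when a port has to change. The host name rm.example.com and the port 18088 below are placeholders for illustration, not values from this article.

```xml
<configuration>
  <!-- Hypothetical host name; the address defaults expand from this value -->
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>rm.example.com</value>
  </property>

  <!-- Optional: override a single address, here the web UI port (placeholder) -->
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:18088</value>
  </property>
</configuration>
```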

(6) yarn.resourcemanager.scheduler.class

Parameter explanation: the main class of the resource scheduler to enable. FIFO, Capacity Scheduler, and Fair Scheduler are currently available.

Default value: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
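For illustration, switching from the default scheduler to the Fair Scheduler might look like the fragment below. This is only a sketch; the Fair Scheduler usually needs its own additional configuration, which this article does not cover.

```xml
<configuration>
  <!-- Replace the default CapacityScheduler with the Fair Scheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
</configuration>
```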

(7) yarn.resourcemanager.resource-tracker.client.thread-count

Parameter explanation: the number of handler threads that process RPC requests from NodeManagers.

Default value: 50

(8) yarn.resourcemanager.scheduler.client.thread-count

Parameter explanation: the number of handler threads that process RPC requests from ApplicationMasters.

Default value: 50
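A minimal sketch of raising both handler counts, for example on a cluster with many nodes and many running applications. The value 100 is purely illustrative; whether a larger pool helps depends on the cluster and should be checked against actual RPC load.

```xml
<configuration>
  <!-- Handlers for NodeManager RPC requests (illustrative value) -->
  <property>
    <name>yarn.resourcemanager.resource-tracker.client.thread-count</name>
    <value>100</value>
  </property>

  <!-- Handlers for ApplicationMaster RPC requests (illustrative value) -->
  <property>
    <name>yarn.resourcemanager.scheduler.client.thread-count</name>
    <value>100</value>
  </property>
</configuration>
```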

(9) yarn.scheduler.minimum-allocation-mb / yarn.scheduler.maximum-allocation-mb

Parameter explanation: the minimum / maximum amount of memory that a single request can apply for. For example, with settings of 1024 and 3072, each task of a MapReduce job can request at least 1024 MB and at most 3072 MB of memory.

Default value: 1024 / 8192
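The 1024 / 3072 example from the explanation above would be written in yarn-site.xml roughly as follows (values are in MB):

```xml
<configuration>
  <!-- Smallest memory allocation a single request can get, in MB -->
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>1024</value>
  </property>

  <!-- Largest memory allocation a single request can get, in MB -->
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>3072</value>
  </property>
</configuration>
```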

(10) yarn.scheduler.minimum-allocation-vcores / yarn.scheduler.maximum-allocation-vcores

Parameter explanation: the minimum / maximum number of virtual CPUs that a single request can apply for. For example, with settings of 1 and 4, each task of a MapReduce job can request at least one and at most four virtual CPUs. For what a virtual CPU is, see my article "Analysis of YARN Resource Scheduler".

Default value: 1 / 32
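The 1 and 4 example above corresponds to a fragment along these lines:

```xml
<configuration>
  <!-- Fewest virtual CPUs a single request can get -->
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>

  <!-- Most virtual CPUs a single request can get -->
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>4</value>
  </property>
</configuration>
```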

(11) yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path

Parameter explanation: the NodeManager whitelist (include-path) and blacklist (exclude-path). If several NodeManagers turn out to be problematic, for example with a high fault rate or a high task failure rate, you can add them to the blacklist. Note that both parameters can take effect dynamically (just run a refresh command).

Default value: ""
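A sketch of a whitelist / blacklist setup, assuming two plain-text files that list one NodeManager host name per line. Both file paths are hypothetical. After editing either file, the refresh mentioned above is normally done with `yarn rmadmin -refreshNodes`.

```xml
<configuration>
  <!-- Hypothetical path: hosts allowed to register as NodeManagers -->
  <property>
    <name>yarn.resourcemanager.nodes.include-path</name>
    <value>/etc/hadoop/conf/yarn.include</value>
  </property>

  <!-- Hypothetical path: hosts to be excluded (the blacklist) -->
  <property>
    <name>yarn.resourcemanager.nodes.exclude-path</name>
    <value>/etc/hadoop/conf/yarn.exclude</value>
  </property>
</configuration>
```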

(12) yarn.resourcemanager.nodemanagers.heartbeat-interval-ms

Parameter explanation: the NodeManager heartbeat interval.

Default value: 1000 (milliseconds)

2. NodeManager-related configuration parameters

(1) yarn.nodemanager.resource.memory-mb

Parameter explanation: the total physical memory available to the NodeManager. Note that this parameter cannot be modified dynamically: once it is set, it stays fixed for the whole run. In addition, the default value is 8192 MB, and YARN will assume that much memory is available even if the machine has less (not very smart, is it?), so this value must always be configured. However, Apache is already working on making this parameter dynamically modifiable.

Default value: 8192
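As a hedged example, on a machine with 16 GB of RAM one might leave some memory for the operating system and other daemons and give the rest to YARN. The 12288 below is an assumed figure for such a machine, not a recommendation from this article.

```xml
<configuration>
  <!-- Assumed: 16 GB machine, roughly 4 GB reserved for the OS and other services -->
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>12288</value>
  </property>
</configuration>
```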

(2) yarn.nodemanager.vmem-pmem-ratio

Parameter explanation: the maximum amount of virtual memory that can be used for every 1 MB of physical memory used.

Default value: 2.1
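To make the ratio concrete, the fragment below keeps the default of 2.1 and works out the resulting virtual memory limits in comments. The two container sizes are only examples.

```xml
<configuration>
  <!--
    With a ratio of 2.1:
      a container with 1024 MB physical memory may use up to 1024 * 2.1 = 2150.4 MB virtual memory
      a container with 3072 MB physical memory may use up to 3072 * 2.1 = 6451.2 MB virtual memory
  -->
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
</configuration>
```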

(3) yarn.nodemanager.resource.cpu-vcores

Parameter explanation: the total number of virtual CPUs available to the NodeManager.

Default value: 8

(4) yarn.nodemanager.local-dirs

Parameter explanation: the location where intermediate results are stored, similar to mapred.local.dir in Hadoop 1.0. Note that this parameter is usually set to multiple directories to spread the disk I/O load.

Default value: ${hadoop.tmp.dir}/nm-local-dir
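A sketch of spreading intermediate data over several disks, assuming each disk is mounted at its own path. The mount points /data1 to /data3 are hypothetical; the directories are given as a comma-separated list.

```xml
<configuration>
  <!-- Hypothetical mount points, one directory per physical disk -->
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/data1/yarn/local,/data2/yarn/local,/data3/yarn/local</value>
  </property>
</configuration>
```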

(5) yarn.nodemanager.log-dirs

Parameter explanation: log storage address (multiple directories can be configured).

Default value: ${yarn.log.dir}/userlogs

(6) yarn.nodemanager.log.retain-seconds

Parameter explanation: the maximum time logs are kept on the NodeManager (only takes effect when log aggregation is not enabled).

Default value: 10800 (3 hours)
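The two log parameters above could be combined as in the following sketch. The directories and the 24-hour retention are illustrative assumptions. yarn.log-aggregation-enable is not covered in this article; it is included only to show the condition under which the retention setting applies.

```xml
<configuration>
  <!-- Hypothetical log directories on two disks -->
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/data1/yarn/logs,/data2/yarn/logs</value>
  </property>

  <!-- Retention below is only honored while log aggregation is off -->
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>false</value>
  </property>

  <!-- Keep logs for 24 hours (86400 seconds) instead of the default 3 hours -->
  <property>
    <name>yarn.nodemanager.log.retain-seconds</name>
    <value>86400</value>
  </property>
</configuration>
```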

(7) yarn.nodemanager.aux-services

Parameter explanation: auxiliary services that run on the NodeManager. You need to configure mapreduce_shuffle before you can run MapReduce programs.

Default value: ""
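A typical fragment for enabling the MapReduce shuffle service looks roughly like this. The second property names the class that implements the service; it is shown for completeness and may already match the default in your Hadoop version.

```xml
<configuration>
  <!-- Register the shuffle service needed by MapReduce jobs -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <!-- Implementation class of the mapreduce_shuffle auxiliary service -->
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
```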

Original link: http://dongxicheng.org/mapreduce-nextgen/hadoop-yarn-configurations-resourcemanager-

That covers the parameters related to RM and NM in the Hadoop YARN configuration. I hope the above is of some help to you.
