This article introduces the Fair Scheduler to Capacity Scheduler conversion tool. The content is quite detailed; interested readers can use it as a reference, and we hope it is helpful to you.
In Apache Hadoop YARN 3.x (simply YARN from now on), switching to Capacity Scheduler has many advantages, but it also has some disadvantages. To bring these capabilities to users who are currently running Fair Scheduler, Cloudera, together with the upstream YARN community, has created a tool to assist in the migration process.

Why switch to Capacity Scheduler? What do we gain by switching? A few examples:

Improved scheduling throughput:
- Looking at multiple nodes at a time
- Fine-grained locking
- Multiple allocation threads
- 5-10x increase in throughput
- Node partitions and node labels
- Affinity and anti-affinity: run application X only on nodes that are running application Y, or conversely, never run application X and application Y on the same node
- Scheduler and application activities: messages that help debug important scheduling decisions can be recorded and exposed through a RESTful API

In addition, with the release of CDP, Cloudera's vision is to support Capacity Scheduler as the default scheduler for YARN and to phase out Fair Scheduler. Supporting two schedulers at the same time causes problems: it not only requires more support and engineering capacity, but also additional tests, and the feature gaps between the schedulers lead to more complex test cases and test suites.
After a long and careful analysis, we decided to choose Capacity Scheduler as the default scheduler. We compiled a document comparing the features of Capacity Scheduler and Fair Scheduler under YARN-9698 (direct link).
Note that although we tested the tool with various Fair Scheduler and YARN site configurations, it is a new feature in Apache Hadoop. It is strongly recommended that you review the generated output files manually.
The fs2cs conversion tool. The converter itself is a CLI application that is part of the yarn command. To invoke the tool, use the yarn fs2cs command with various command-line arguments. The tool generates two files as output: capacity-scheduler.xml and yarn-site.xml. Note that the generated yarn-site.xml is only a delta: it contains only the new settings for Capacity Scheduler, which means that you must manually copy these values into your existing site configuration. Keeping the existing Fair Scheduler properties is unlikely to cause any damage or failure, but we recommend removing them to avoid confusion. The generated properties can also be written to standard output instead of the files mentioned above. The tool is an official part of the CDH-to-CDP upgrade and is described here.
The basic command-line usage of the converter is:
yarn fs2cs -y /path/to/yarn-site.xml [-f /path/to/fair-scheduler.xml] {-o /output/path/ | -p} [-t] [-s] [-d]

The switches listed between square brackets [] are optional. Curly braces {} indicate that one of the enclosed switches is mandatory. You can also use their long forms:

yarn fs2cs --yarnsiteconfig /path/to/yarn-site.xml [--fsconfig /path/to/fair-scheduler.xml] {--output-directory /output/path/ | --print} [--no-terminal-rule-check] [--skip-verification] [--dry-run]

For example:
yarn fs2cs --yarnsiteconfig /home/hadoop/yarn-site.xml --fsconfig /home/hadoop/fair-scheduler.xml --output-directory /tmp

Important: always use an absolute path for -f / --fsconfig.
For a list of all command-line switches and their descriptions, run yarn fs2cs --help. The CLI options are also listed in this document.
Using fs2cs step by step. Let's walk through a short demonstration of the tool.
The existing configuration. Assume we have a simple fair-scheduler.xml that defines a default queue and a users parent queue, each with a weight of 1.0 and the drf scheduling policy; the users queue also limits dynamically created child queues with maxChildResources set to memory-mb=8192, vcores=1. The queue placement policy ends with a default rule that comes after a nestedUserQueue rule. A minimal sketch of what such an allocation file might look like is shown below.
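The following sketch is an assumption reconstructed from the values above and from the converted output shown later (queue names default and users, equal weights, the drf policy, the maxChildResources limit, and a placement policy whose default rule follows a nestedUserQueue rule); your actual file may differ:

<?xml version="1.0"?>
<allocations>
  <queue name="default">
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
  <queue name="users" type="parent">
    <weight>1.0</weight>
    <schedulingPolicy>drf</schedulingPolicy>
    <!-- limit for dynamically created child queues -->
    <maxChildResources>memory-mb=8192, vcores=1</maxChildResources>
  </queue>
  <queuePlacementPolicy>
    <rule name="specified"/>
    <!-- with create enabled, this rule always returns a queue (terminal rule) -->
    <rule name="nestedUserQueue" create="true">
      <rule name="default" queue="users"/>
    </rule>
    <!-- this rule can never be reached, which is what the error below complains about -->
    <rule name="default"/>
  </queuePlacementPolicy>
</allocations>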
We also have the following entries in yarn-site.xml (only those related to Fair Scheduler are listed):

yarn.scheduler.fair.allow-undeclared-pools = true
yarn.scheduler.fair.user-as-default-queue = true
yarn.scheduler.fair.preemption = false
yarn.scheduler.fair.preemption.cluster-utilization-threshold = 0.8
yarn.scheduler.fair.sizebasedweight = false
yarn.scheduler.fair.assignmultiple = true
yarn.scheduler.fair.dynamic.max.assign = true
yarn.scheduler.fair.max.assign = -1
yarn.scheduler.fair.continuous-scheduling-enabled = false
yarn.scheduler.fair.locality-delay-node-ms = 2000

Running the fs2cs converter. Let's run the converter on these files:

~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp
2020-05-05 14:22 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:prepareOutputFiles) - Output directory for yarn-site.xml and capacity-scheduler.xml is: /tmp
2020-05-05 14:22 INFO [main] converter.FSConfigToCSConfigConverter (FSConfigToCSConfigConverter.java:loadConversionRules) - Conversion rules file is not defined, using default conversion config!

[...] Output trimmed for brevity

2020-05-05 14:22 ERROR [main] converter.FSConfigToCSConfigConverterMain (MarkerIgnoringBase.java:error(159)) - Error while starting FS configuration conversion!

[...] Output trimmed for brevity

Caused by: org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AllocationConfigurationException: Rules after rule 2 in queue placement policy can never be reached
    at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.QueuePlacementPolicy.updateRuleSet(QueuePlacementPolicy.java:115)
[...]

This is a very typical error. If you look at the placement rules in fair-scheduler.xml, you can see that the default rule comes after the nestedUserQueue rule. We need to use the --no-terminal-rule-check switch so that the converter ignores the terminal-rule check in Fair Scheduler. Why? See the next section.
By default, Fair Scheduler strictly checks whether each placement rule is a terminal rule. This means that if you use a terminal rule (one that always returns or creates a queue) and place further rules after it, the configuration is rejected because those later rules can never be reached. However, before YARN-8967 (which changed Fair Scheduler to use the PlacementRule interface), Fair Scheduler was more lenient and accepted certain rule sequences that are no longer considered valid. As mentioned earlier, the tool instantiates a Fair Scheduler to read and parse the allocation file. For that Fair Scheduler instance to accept this kind of configuration, the -t or --no-terminal-rule-check argument must be provided to suppress the exception thrown by Fair Scheduler. This type of placement configuration is common in CDH 5.x, so it is recommended that you always use -t.

Run the tool again with --no-terminal-rule-check:

~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp --no-terminal-rule-check
2020-05-05 14:41 INFO [main] capacity.CapacityScheduler - Initialized CapacityScheduler with calculator=class org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator, minimumAllocation=, maximumAllocation=, asynchronousScheduling=false, asyncScheduleInterval=5ms, multiNodePlacementEnabled=false
2020-05-05 14:41 INFO [main] converter.ConvertedConfigValidator (ConvertedConfigValidator.java:validateConvertedConfig(72)) - Capacity scheduler was successfully started
This time, the conversion succeeded!
Notes on the conversion tool's output log. There are several things worth mentioning in the log:
With the terminal-rule check disabled, Fair Scheduler does not throw an exception; it only prints a warning that the rule is unreachable. Two warnings are also displayed during the conversion:

2020-05-05 14:41 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting is not supported, ignoring conversion
2020-05-05 14:41 WARN [main] converter.FSConfigToCSConfigRuleHandler (ConversionOptions.java:handleWarning(48)) - Setting is not supported, ignoring conversion

As mentioned earlier, there are functional gaps between the two schedulers, so by default a warning is printed whenever an unsupported setting is detected. This is useful for operators to see which features have to be fine-tuned after the upgrade.
You can also clearly see that a Capacity Scheduler instance was started to verify that the converted configuration is valid.

A look at the converted configuration. If we look at /tmp/yarn-site.xml, we will find that it is really short:

yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler

There are not many properties here, because many scheduling-related settings were disabled in the source configuration: no preemption, no continuous scheduling, and no rack or node locality threshold settings.
Let's take a look at the new capacity-scheduler.xml (again, it has been reformatted here and unnecessary XML tags have been removed):

yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.default.capacity = 50.000
yarn.scheduler.capacity.root.default.ordering-policy = fair
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.default.maximum-capacity = 100
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.maximum-capacity = 100
yarn.scheduler.capacity.maximum-am-resource-percent = 0.5

Note that the property yarn.scheduler.capacity.maximum-am-resource-percent is set to 0.5. This setting is missing from fair-scheduler.xml, so why is it here? The tool has to set it because the default is 10% in Capacity Scheduler but 50% in Fair Scheduler.
Let's modify the following properties:

yarn.scheduler.fair.preemption = true
yarn.scheduler.fair.sizebasedweight = true
yarn.scheduler.fair.continuous-scheduling-enabled = true

After running the conversion again, these settings are now reflected in the new yarn-site.xml:
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator
yarn.scheduler.capacity.schedule-asynchronously.scheduling-interval-ms = 5
yarn.scheduler.capacity.schedule-asynchronously.enable = true
yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval = 10000
yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill = 15000
yarn.scheduler.capacity.per-node-heartbeat.multiple-assignments-enabled = true
yarn.resourcemanager.scheduler.class = org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
yarn.resourcemanager.scheduler.monitor.enable = true

The size-based weight setting also affects capacity-scheduler.xml:
yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.ordering-policy.fair.enable-size-based-weight = true
yarn.scheduler.capacity.root.users.capacity = 50.000
yarn.scheduler.capacity.root.queues = default,users
yarn.scheduler.capacity.root.users.maximum-capacity = 100
yarn.scheduler.capacity.root.ordering-policy.fair.enable-size-based-weight = true

[...] The rest is omitted because it is the same as before.
How does the conversion of Fair Scheduler weights work?
A key question is how to convert the weights. The weight determines the long-term "fair share" of a queue. The fair share is the amount of resources available to the queue, which limits how many resources the applications submitted to that queue can use. For example, if root.a and root.b have weights of 3 and 1, respectively, root.a will get 75% of the total cluster resources and root.b will get 25%. But what if we only submit applications to root.b? As long as root.a is empty, the applications in root.b are free to occupy the entire cluster (ignoring other limits for now). How do we simulate weights in Capacity Scheduler? It turns out that the "capacity" concept of Capacity Scheduler is very close to the concept of weight, except that it is expressed as a percentage rather than an integer. By default, however, capacity acts as an upper limit, which means that root.b with a capacity of 25.000 would always use only 25% of the cluster. This is where elasticity comes in. Elasticity means that free resources in the cluster can be allocated to a queue beyond its configured capacity, up to its maximum capacity, which is also expressed as a percentage. Therefore, we must enable full elasticity for all queues.

A simple example of Fair Scheduler weights versus the Capacity Scheduler configuration. All in all, we can use the following properties to achieve Fair Scheduler-like behavior. Weights in Fair Scheduler:

root.a = 3
root.b = 1

The corresponding Capacity Scheduler settings:
yarn.scheduler.capacity.root.a.capacity = 75.000
yarn.scheduler.capacity.root.a.maximum-capacity = 100.000
yarn.scheduler.capacity.root.b.capacity = 25.000
yarn.scheduler.capacity.root.b.maximum-capacity = 100.000

Another example with hierarchical queues. Assume the following simple queue hierarchy with weights in Fair Scheduler:

root = 1
root.users = 20
root.default = 10
root.users.alice = 3
root.users.bob = 1

This translates to the following capacity values:
yarn.scheduler.capacity.root.capacity = 100.000
yarn.scheduler.capacity.root.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.capacity = 66.667
yarn.scheduler.capacity.root.users.maximum-capacity = 100.000
yarn.scheduler.capacity.root.default.capacity = 33.333
yarn.scheduler.capacity.root.default.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.alice.capacity = 75.000
yarn.scheduler.capacity.root.users.alice.maximum-capacity = 100.000
yarn.scheduler.capacity.root.users.bob.capacity = 25.000
yarn.scheduler.capacity.root.users.bob.maximum-capacity = 100.000

Each queue's capacity is simply its weight divided by the sum of its siblings' weights, expressed as a percentage: root.users = 20 / (20 + 10) = 66.667%, root.default = 10 / (20 + 10) = 33.333%, root.users.alice = 3 / (3 + 1) = 75%, and root.users.bob = 1 / (3 + 1) = 25%.

How the fs2cs tool works internally. The tool first performs some basic verification steps (for example, whether the output directory exists and the input files exist). It then loads yarn-site.xml and converts scheduling-related properties, such as preemption, continuous scheduling, and rack/node locality settings. To load and parse the allocation file, the tool uses a Fair Scheduler instance. It also detects unsupported properties and prints a separate warning message for each, indicating that the given setting will not be converted. Unsupported settings and known limitations are explained later in this article. After the conversion is complete and the output files are generated, the final step is validation. By default, fs2cs attempts to start Capacity Scheduler internally using the converted configuration; this step ensures that the Resource Manager will start correctly with the new configuration.
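Two of the switches listed earlier relate directly to this validation step. The following is only a sketch of how they compose, using the same input paths as in the walkthrough above; exact behavior may vary by release, so check yarn fs2cs --help on your version:

~$ # Convert, but skip the internal Capacity Scheduler start-up check (-s / --skip-verification)
~$ yarn fs2cs -y /home/examples/yarn-site.xml -f /home/examples/fair-scheduler.xml -o /tmp --no-terminal-rule-check --skip-verification

The -d / --dry-run switch is the complementary option: it is meant to report whether the conversion would succeed without keeping the generated files.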
Known limitations. Currently, there are some functional gaps between Fair Scheduler and Capacity Scheduler; that is, a full conversion is only possible if the Fair Scheduler configuration does not use settings that are not yet implemented in Capacity Scheduler.
Reservation system. Conversion of the reservation system settings is skipped entirely, and this is unlikely to change in the foreseeable future. The reason is that it is not a frequently used feature, and it works in completely different ways in the two schedulers.

Placement rules. In Fair Scheduler, placement rules define which queue a submitted application is placed in. The placement rules follow a fall-through logic: if the first rule does not apply (for example, the queue returned by the rule does not exist), the next rule is tried, and so on. If the last rule also fails to return a valid queue, the application submission is rejected. Capacity Scheduler uses a conceptually similar mechanism called mapping rules. However, the implementation is different, and converting placement rules to mapping rules cannot be done correctly at this time. There are several reasons:
1) If a mapping rule matches, it returns a queue and does not proceed to the next rule. The target is either a specific queue or root.default.
2) Mapping rules support placeholders such as %primary_group, %secondary_group and %user. This is very similar to the functionality in Fair Scheduler, but there is no equivalent of the "specified" rule.
3) Placement rules can have a create flag: if create=true, the target queue is created dynamically. Capacity Scheduler does not support automatic queue creation on a per-rule basis. A queue can only be created on demand if its parent is a so-called managed parent (a queue with the auto-create-child-queue property enabled). However, a managed parent queue cannot have static leaf queues, i.e. its children cannot be defined in capacity-scheduler.xml.
4) The nested primary-group and secondary-group rules make things even more complicated, because the create flag is interpreted on both the outer and the inner rule.
These differences make it difficult, and sometimes impossible, to convert placement rules into mapping rules. In such cases, the cluster operator has to be creative and deviate from the original placement logic.
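To give a flavor of what mapping rules look like on the Capacity Scheduler side, here is a generic, hedged example using the stock yarn.scheduler.capacity.queue-mappings property; the queue names are made up for illustration and these lines are not something fs2cs generates:

yarn.scheduler.capacity.queue-mappings = u:alice:analytics,u:%user:%primary_group.%user
yarn.scheduler.capacity.queue-mapping-override.enable = false

The first entry places applications submitted by the user alice into the analytics queue; the second uses placeholders to place any other user's application into a queue named after the user, under a parent named after the user's primary group. As described above, evaluation stops at the first matching entry, and there is no equivalent of the Fair Scheduler "specified" rule.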
Unsupported properties. The tool does not convert the following properties:
- Maximum number of running applications per user
- Default maximum number of running applications per user
- Minimum resources of a queue
- Maximum resources of a queue
- Maximum resources of dynamically created queues
- Queue-level DRF ordering policy: in Capacity Scheduler, DRF must be enabled globally, while in Fair Scheduler the regular "fair" policy can be used under a DRF parent (illustrated below)
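To make the last point concrete, here is a minimal, hedged illustration (the queue names are made up for the example): in Fair Scheduler the policy is chosen per queue, whereas in Capacity Scheduler DRF is switched on for the whole scheduler through the resource calculator property that already appeared in the converted yarn-site.xml:

<!-- fair-scheduler.xml: a "fair" child queue under a "drf" parent -->
<queue name="parent">
  <schedulingPolicy>drf</schedulingPolicy>
  <queue name="child">
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</queue>

# Capacity Scheduler: DRF is a single global switch
yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DominantResourceCalculator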
Future improvements. The tool is still being actively developed to provide a better user experience. The most important tasks are:
1) Support capacity as a vector of percentages in Capacity Scheduler (YARN-9936): users will be able to define not just a single capacity value, but separate values for different resource types.
2) Handle maxRunningApps and userMaxAppsDefault per user (YARN-9930). Capacity Scheduler has a "maximum applications per user" setting, but it cannot be configured directly and is tedious because it is a combination of three settings (a rough sketch of this combination follows the list below). We must also be careful not to break existing behavior: if the maximum is exceeded, the existing logic in Capacity Scheduler rejects the application submission, while Fair Scheduler always accepts the application and schedules it later.
3) Handle minResources, maxResources and maxChildResources. This depends largely on YARN-9936. In Fair Scheduler, users can express these settings in several ways (a single percentage, two separate percentages, or absolute resources). To support similar settings in Capacity Scheduler, we need YARN-9936.
4) Make the behavior of mapping rules similar to the placement rule implementation in Fair Scheduler. How mapping rules are evaluated is explained in the Placement rules section above. We will likely need a new, pluggable approach so that we do not introduce regressions into the already complex existing code base.
5) Improvements around DRF and other scheduling policies (YARN-9892). Currently, there is a single global resource calculator defined by the property yarn.scheduler.capacity.resource-calculator. Fair Scheduler handles this in a more fine-grained, per-queue way.
6) General fine-tuning of the entire conversion process. Capacity Scheduler has attributes such as user-limit-factor and minimum-user-limit-percent. The tool does not set these for the time being, but they have proved to be useful in some configurations.
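As a rough sketch of the properties mentioned in item 6, and of the three-setting combination mentioned in item 2, consider the following; the values are made up for illustration and are not fs2cs output. In the current Capacity Scheduler leaf queue logic, the effective per-user application limit is derived roughly as:

maximum applications per user = maximum-applications * (minimum-user-limit-percent / 100) * user-limit-factor

So a configuration such as

yarn.scheduler.capacity.root.users.maximum-applications = 1000
yarn.scheduler.capacity.root.users.minimum-user-limit-percent = 25
yarn.scheduler.capacity.root.users.user-limit-factor = 2

would allow a single user roughly 1000 * 0.25 * 2 = 500 running applications in root.users, cap any one user at 25% of the queue's resources once four or more users are active in it, and let a single user stretch to twice the queue's configured capacity when the cluster is otherwise idle.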
The fs2cs tool is part of the CDH to CDP upgrade path and helps customers convert their Fair Scheduler based configuration to a Capacity Scheduler configuration. We saw why switching to Capacity Scheduler has clear benefits. As we have also seen, not everything is perfect yet: some features of Fair Scheduler may be lost or only partially supported in Capacity Scheduler, and the tool prints a warning whenever such a setting is encountered during the conversion. Some aspects of the conversion are very challenging, especially the translation of placement rules. Even though they are conceptually similar, the queue placement mechanisms of the two schedulers differ slightly, and extra effort is needed to make the Capacity Scheduler mapping rules behave the same way. Nevertheless, we are committed to implementing all the necessary changes to increase customer satisfaction and improve the user experience. That is all that is shared here about the Fair Scheduler to Capacity Scheduler conversion tool. I hope the above content is helpful to you and that you learned something new. If you think the article is good, feel free to share it so more people can see it.