What are the job scheduling algorithms in a Hadoop cluster

2025-01-16 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)05/31 Report--

This article explains the job scheduling algorithms available in a Hadoop cluster.

A Hadoop cluster offers three job scheduling algorithms: FIFO, fair scheduling, and capacity scheduling (sometimes rendered as "computing power" scheduling).

First come, first served (FIFO)

FIFO is the default scheduler in Hadoop (in the classic JobTracker-based MapReduce described here). It selects the next job to run first by job priority and then, among jobs of the same priority, by arrival time.

FIFO is simple. Hadoop keeps a single job queue; submitted jobs line up in order, and each new job is appended to the tail. When a job finishes, the next one is always taken from the head of the queue. The advantage of this strategy is that it is simple, easy to implement, and puts little load on the JobTracker. Its disadvantage is equally clear: apart from priority, it treats all jobs alike and takes no account of how urgent a job is, and it works against small jobs, which can be stuck waiting behind a large job ahead of them in the queue.
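The ordering rule above (priority first, then arrival order) can be sketched with a small priority queue. This is only an illustration, not Hadoop's implementation: the class name FifoJobQueue, its methods, and the job names are all hypothetical, and it assumes a larger number means a higher priority.

```python
import heapq
import itertools


class FifoJobQueue:
    """Hypothetical single job queue: ordered by priority, then by arrival."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # increasing arrival stamp

    def submit(self, name, priority=0):
        # Assumption: larger number = higher priority. heapq is a min-heap,
        # so the priority is negated to make higher priorities pop first.
        heapq.heappush(self._heap, (-priority, next(self._arrival), name))

    def next_job(self):
        # Return the name of the next job to run, or None if the queue is empty.
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = FifoJobQueue()
queue.submit("big-etl")
queue.submit("small-report")
queue.submit("urgent-fix", priority=1)
order = [queue.next_job() for _ in range(3)]
print(order)
```

In this sketch "small-report" still has to wait for "big-etl" to be taken first, which is the weakness for small jobs described above.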

Fair scheduling strategy

In this strategy, the system is configured with task slots. Each slot runs one task, a task being one of the small units of work (a map task or a reduce task) into which a large job is divided. When a user submits several jobs, each job can be given some slots in which to run its tasks.

If job scheduling across a Hadoop cluster is compared with job scheduling in an operating system, FIFO corresponds to the early single-program batch systems, in which only one job runs at a time, while fair scheduling corresponds to a multiprogrammed batch system, in which several jobs run at the same time.

Because Linux is a multi-user system, several users may submit jobs at once. To handle this, each user is assigned a job pool, and each pool is configured with a minimum share of slots. "Minimum" means that whenever the pool needs them, the scheduler should be able to give it at least that many slots.

How can the scheduler make sure free slots are there when the pool needs them? One way would be to reserve a fixed number of slots for the pool, at least equal to its minimum share, so that they are always available on demand. But that wastes slots whenever the pool is not using them all. What the strategy actually does is this: when a pool's demand is below its minimum share, the slots that nominally belong to it but are unused are lent to other pools that need them. When a pool later asks for a slot and none is free, it does not preempt a slot held by another pool (preemption is an optional feature of Hadoop's Fair Scheduler and is assumed to be off here); instead, the next slot to be released is assigned to it immediately.
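The minimum-share idea can be illustrated with a short sketch. It assumes the minimum shares add up to no more than the total number of slots, and it hands spare slots one at a time to whichever pool with unmet demand currently holds the fewest. That tie-breaking rule, the function name allocate_slots, and the pool names are assumptions made for the example, not the scheduler's actual code.

```python
def allocate_slots(total_slots, pools):
    """pools maps pool name -> (min_share, demand); returns name -> slots."""
    if sum(min_share for min_share, _ in pools.values()) > total_slots:
        raise ValueError("minimum shares exceed the number of slots")

    # Step 1: each pool receives what it needs, up to its minimum share.
    # A pool that needs less than its minimum leaves the rest unused.
    allocation = {name: min(min_share, demand)
                  for name, (min_share, demand) in pools.items()}
    spare = total_slots - sum(allocation.values())

    # Step 2: lend the spare slots to pools that still have unmet demand
    # (assumed rule: the pool holding the fewest slots goes first).
    while spare > 0:
        needy = [n for n, (_, demand) in pools.items() if allocation[n] < demand]
        if not needy:
            break
        poorest = min(needy, key=lambda n: (allocation[n], n))
        allocation[poorest] += 1
        spare -= 1
    return allocation


# alice needs only 1 of her 4 guaranteed slots, so bob can borrow the rest.
result = allocate_slots(10, {"alice": (4, 1), "bob": (4, 9), "carol": (2, 2)})
print(result)
```

Here bob ends up above his minimum share only because alice is not using hers; once alice's demand rises, released slots would go back to her pool first.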

Within a user's job pool, the way slots are allocated among that pool's jobs is configurable; FIFO is one option. The strategy therefore works at two levels:

At the first level, slots are allocated among pools; when there are multiple users, each user is assigned one job pool.

At the second level, each pool can apply its own scheduling policy to the jobs inside it.
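The two levels can be sketched as two successive choices when a slot becomes free. Everything here is hypothetical: the Job and Pool classes, the rule that the pool furthest below its minimum share is served first, and the two in-pool policies named "fifo" and "fair" are assumptions chosen to make the idea concrete.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    submitted: int          # arrival order within the cluster
    running_tasks: int = 0


@dataclass
class Pool:
    name: str
    min_share: int
    policy: str = "fifo"    # hypothetical in-pool policy: "fifo" or "fair"
    jobs: list = field(default_factory=list)

    def running(self):
        return sum(job.running_tasks for job in self.jobs)


def pick_job(pools):
    """Return (pool name, job name) for the next free slot, or None."""
    candidates = [p for p in pools if p.jobs]
    if not candidates:
        return None
    # Level 1: choose the pool furthest below its minimum share (assumed rule).
    pool = min(candidates, key=lambda p: (p.running() - p.min_share, p.name))
    # Level 2: choose a job inside that pool using the pool's own policy.
    if pool.policy == "fifo":
        job = min(pool.jobs, key=lambda j: j.submitted)
    else:
        job = min(pool.jobs, key=lambda j: (j.running_tasks, j.submitted))
    return pool.name, job.name


alice = Pool("alice", min_share=2, policy="fifo",
             jobs=[Job("a1", 1, running_tasks=2), Job("a2", 2)])
bob = Pool("bob", min_share=3, policy="fair",
           jobs=[Job("b1", 3, running_tasks=1), Job("b2", 4)])
first = pick_job([alice, bob])
print(first)
```

In this example bob's pool is two slots below its minimum share, so it is served first, and its "fair" policy picks the job with the fewest running tasks.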

Capacity scheduling

Capacity scheduling (sometimes translated as "computing power" scheduling) is similar to fair scheduling. Where fair scheduling allocates task slots to job pools, capacity scheduling allocates TaskTrackers (the worker nodes of the cluster) to queues. Several queues are configured, each with a minimum number of TaskTrackers. As in fair scheduling, when a queue has idle TaskTrackers, the scheduler lends them to other queues. When a TaskTracker becomes free, several queues may be below their minimum and asking for more, so the free TaskTracker goes to the hungriest queue first. Hunger is measured by the ratio of the number of tasks running in a queue to the computing resources allocated to it: the lower the ratio, the hungrier the queue.
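The hunger measure can be written down in a few lines. This sketch assumes each queue is described by three hypothetical fields (running, capacity, pending), that capacity is greater than zero, and that only queues with pending work compete for a free resource; none of these names come from Hadoop itself.

```python
def hungriest_queue(queues):
    """queues maps name -> {"running": int, "capacity": int, "pending": int}.

    Returns the queue with the lowest running/capacity ratio among queues
    that still have pending work, or None if no queue is waiting.
    """
    waiting = {name: q for name, q in queues.items() if q["pending"] > 0}
    if not waiting:
        return None
    # Lower ratio = hungrier; ties are broken by queue name for determinism.
    return min(waiting,
               key=lambda n: (waiting[n]["running"] / waiting[n]["capacity"], n))


queues = {
    "prod":     {"running": 8, "capacity": 10, "pending": 3},  # ratio 0.8
    "research": {"running": 1, "capacity": 5,  "pending": 2},  # ratio 0.2
    "adhoc":    {"running": 0, "capacity": 5,  "pending": 0},  # nothing waiting
}
chosen = hungriest_queue(queues)
print(chosen)
```

The "adhoc" queue has the lowest ratio but no pending work, so the free resource goes to "research".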

Capacity scheduling organizes jobs by queue, so one user's jobs may sit in several queues. Without some limit per user, resources could be shared very unfairly among users. So when a new job is selected to run, the scheduler also checks whether the job's owner has exceeded their resource limit; if so, the job is not selected.

Within a single queue, this strategy applies priority-based FIFO, without preemption.
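Job selection inside one queue, combining priority-based FIFO with the per-user limit, might look like the sketch below. The tuple layout, the function name select_job, and the use of a single numeric limit on running tasks per user are assumptions for illustration only.

```python
def select_job(queue_jobs, user_running, user_limit):
    """Pick the next job from one queue.

    queue_jobs:   list of (priority, arrival, user, job_name) tuples
    user_running: user -> number of tasks that user is currently running
    user_limit:   assumed cap on running tasks per user
    """
    # Priority-based FIFO: higher priority first, then earlier arrival.
    ordered = sorted(queue_jobs, key=lambda job: (-job[0], job[1]))
    for priority, arrival, user, name in ordered:
        # Skip jobs whose owner has already reached the resource limit.
        if user_running.get(user, 0) < user_limit:
            return name
    return None  # nothing eligible; running jobs are not preempted


jobs = [
    (0, 1, "alice", "a-job1"),
    (1, 2, "alice", "a-job2"),
    (0, 3, "bob", "b-job1"),
]
picked = select_job(jobs, {"alice": 5, "bob": 1}, user_limit=4)
print(picked)
```

Alice's jobs come first in priority and arrival order, but she is over the limit, so bob's job is selected instead.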

That covers the job scheduling algorithms in Hadoop clusters. How they behave in a specific deployment still needs to be verified in practice.

© 2024 shulou.com SLNews company. All rights reserved.
