
Brief introduction of scheduler and scheduling strategy of Linux


A process is an abstraction that the operating system uses to organize work on the computer. As a process is handed more and more tasks, it almost seems to take on a life of its own: it consumes CPU time from the moment it is created until it finally exits. That life, however, is managed by the operating system kernel. Like a mother looking after several children, the kernel has to decide how to divide limited computing resources among processes so that users get the best possible experience. The kernel module that decides which process runs, and when, is called the scheduler. The following is an overview of how the scheduler works.

Process states

The scheduler switches processes between states (process states). From creation to termination, a Linux process may pass through many states, such as running, stopped, interruptible sleep, uninterruptible sleep, exited, and so on. We can group the various Linux process states into three basic states.

Ready: the process has acquired all the resources it needs except the CPU, such as its process address space and network connections. A ready process is only waiting for the CPU and can begin executing as soon as it is scheduled.

Running: the process has the CPU and is executing its program.

Blocked: the process cannot execute because it is waiting for some event; it gives up the CPU and waits in the blocked state.

Figure 1: Basic states of a process

After a process is created, it automatically enters the ready state. If the kernel allocates CPU time to it, the process moves from the ready state to the running state, in which it executes instructions and is at its most active. A running process can voluntarily enter the blocked state: for example, when it needs to read data from disk into memory, it does not need the CPU during the read, so it gives up the CPU and blocks. When the read finishes, the hardware signals completion and the process returns from the blocked state to the ready state. A process can also be forced out of the running state, for example when it receives a SIGSTOP signal.

The scheduler is the administrator of CPU time. The Linux scheduler is responsible for two things: selecting ready processes to run, and interrupting running processes to make them ready again. Not every scheduler has the second ability. In some schedulers the state switch is one-way: they can move a ready process to the running state but cannot force a running process back to the ready state. A scheduler that supports switching in both directions is called a preemptive (pre-emptive) scheduler.

When the scheduler moves one process back to the ready state, it immediately lets another ready process start running, so that multiple processes take turns on the CPU and CPU time is used as fully as possible. Likewise, if a running process voluntarily blocks, the scheduler picks another ready process to use the CPU. Context switching (context switch) refers to this act of swapping which process is executing on the CPU. The kernel performs the context switch, saving the CPU state of the outgoing process and restoring that of the incoming one, so that a process never notices that its execution was interrupted. Application developers do not have to write any code to handle context switching.
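The three basic states and the transitions between them can be sketched as a tiny state machine. This is only an illustration of the rules described above; the event names are invented here, and none of this is kernel code.

from enum import Enum

class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"

# Legal transitions between the three basic states described above.
# event name -> (from_state, to_state); purely illustrative.
TRANSITIONS = {
    "scheduled":  (State.READY,   State.RUNNING),  # kernel gives the process the CPU
    "preempted":  (State.RUNNING, State.READY),    # preemptive scheduler takes the CPU back
    "wait_event": (State.RUNNING, State.BLOCKED),  # e.g. waiting for a disk read
    "event_done": (State.BLOCKED, State.READY),    # hardware signals completion
}

def next_state(current, event):
    src, dst = TRANSITIONS[event]
    if current != src:
        raise ValueError(f"cannot '{event}' from {current}")
    return dst

# A newly created process starts in the ready state.
s = State.READY
for e in ["scheduled", "wait_event", "event_done", "scheduled", "preempted"]:
    s = next_state(s, e)
    print(e, "->", s)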

Priority of the process

The basic input the scheduler uses when allocating CPU time is the priority of each process. Depending on the nature of its task, a program can run at different priorities. By priority characteristics, processes fall into two categories.

Real-time process (real-time process): a high-priority process that must be executed as soon as possible and must never be blocked by ordinary processes. Examples include video playback and various monitoring systems.

Normal process (normal process): a lower-priority process with a longer execution time, for example a compiler, batch processing of a document, or graphics rendering.

Depending on their behavior, ordinary processes can be further divided into interactive processes (interactive process) and batch processes (batch process). A graphical interface is a typical interactive process: it may spend long stretches waiting, for example for user input, but once a particular event occurs it needs to respond as quickly as possible. In general, a graphical interface should respond within 50 to 100 milliseconds. Batch processes, which do not interact with users, usually run silently in the background.

Real-time processes are created by the Linux operating system itself; ordinary users can only create normal processes. The two classes have different priorities, and a real-time process always has a higher priority than any normal process. A process priority is an integer from 0 to 139, where a smaller number means a higher priority. Priorities 0 to 99 are reserved for real-time processes, and 100 to 139 for normal processes.

The default priority of a normal process is 120. We can change a process's default priority with the nice command. For example, to run an executable program called app:

$ nice -n -20 ./app

The -20 in the command means subtracting 20 from the default priority. Launched this way, the app process gets a default priority of 100, the highest priority a normal process can have. The -20 can be replaced with any integer from -20 to 19, inclusive. The default priority becomes the static priority (static priority) at execution time. The priority the scheduler finally uses is the dynamic priority of the process:

Dynamic priority = static priority - Bonus + 5

If the result of this formula is less than 100 or greater than 139, it is clamped to the nearest value in the range 100 to 139, and that value becomes the actual dynamic priority. Bonus in the formula is an estimate made by the kernel: the larger it is, the more the process is favored. If the kernel observes that a process interacts with the user frequently, it sets Bonus to a value greater than 5; if the process rarely interacts with the user, it sets Bonus to a value less than 5.
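As a minimal sketch of the formula and the clamping rule above (the function name and the example values are our own, not part of the kernel):

def dynamic_priority(static_priority, bonus):
    """Dynamic priority of a normal process, per the formula above.

    bonus is the kernel's interactivity estimate: greater than 5 for processes
    that interact with the user frequently, less than 5 otherwise. The result
    is clamped to the normal-process range 100..139.
    """
    prio = static_priority - bonus + 5
    return max(100, min(139, prio))

# Example: 'nice -n -20 ./app' gives app a static priority of 100.
print(dynamic_priority(120, 5))   # default process, neutral bonus -> 120
print(dynamic_priority(100, 8))   # interactive, already at the top -> clamped to 100
print(dynamic_priority(139, 2))   # batch process at the bottom    -> clamped to 139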

The O(n) and O(1) schedulers

Linux's scheduling strategies are described below. The most primitive strategy is simply to order processes by priority and not run a lower-priority process until the higher-priority one has finished, but this strategy cannot exploit the advantages of a multitasking system. The operating system's scheduler has therefore evolved through several generations.

Let's first look at the O(n) scheduler used by the Linux 2.4 kernel. The name O(n) comes from big-O notation for algorithmic complexity; the big-O symbol describes worst-case complexity, and the letter n here is the number of active processes in the operating system. O(n) means the scheduler's running time grows in proportion to the number of active processes.

The O(n) scheduler divides time into short epochs (epoch). At the beginning of each epoch, the scheduler examines all processes in the ready state, calculates each one's priority, and selects the highest-priority process to execute. Once switched in, the process may use its time slice undisturbed. If it does not use up its time slice, the remaining time is carried over and added to its next time slice.

The O(n) scheduler checks the priority of all ready processes before each time slice is handed out. This check takes time proportional to the number of processes n, which is why the scheduler's complexity is O(n). When a computer runs a large number of processes, the performance of this scheduler degrades noticeably; in other words, the O(n) scheduler does not scale well. It was the process scheduler used before Linux 2.6. As the Java language became popular, its performance problems became more obvious, because the Java virtual machine creates a large number of threads, which Linux schedules much like processes.
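To see why this design is O(n): at every scheduling decision the scheduler walks the whole list of ready processes and recomputes a priority value for each. A minimal sketch follows; the process records and the priority calculation are stand-ins, not the 2.4 kernel's actual fields or goodness() function.

# Illustrative process record: (name, static_priority, remaining_slice_ms)
ready = [("A", 110, 600), ("B", 120, 100), ("C", 125, 75)]

def compute_priority(proc):
    """Stand-in for the kernel's per-decision priority calculation."""
    _, static_prio, _ = proc
    return static_prio

def pick_next_on(ready_processes):
    """O(n) selection: scan every ready process, keep the highest-priority one."""
    best, best_prio = None, None
    for p in ready_processes:                 # one pass over all n ready processes
        prio = compute_priority(p)            # recomputed at every scheduling decision
        if best is None or prio < best_prio:  # smaller number = higher priority
            best, best_prio = p, prio
    return best

print(pick_next_on(ready))  # ('A', 110, 600) -- but the scan cost grows with n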

To solve the performance problems of the O(n) scheduler, the O(1) scheduler was introduced in the Linux 2.6 kernel. As the name implies, the O(1) scheduler selects the next process to run in constant time, regardless of the number of processes in the system, so its performance does not degrade even when there are very many processes. The innovation of the O(1) scheduler is that it sorts processes by priority into specific data structures; when choosing the next process to run, it can take the highest-priority process directly, without traversing all processes.

Like the O(n) scheduler, O(1) allocates time slices to processes. For a process with priority below 120, the time slice is:

(140 - priority) x 20 ms

For a process with priority 120 or above, the time slice is:

(140 - priority) x 5 ms
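The two formulas can be written as one small helper (a direct transcription of the text above, not the kernel source):

def o1_time_slice_ms(priority):
    """Time slice of the O(1) scheduler for a normal process (priority 100..139)."""
    if priority < 120:
        return (140 - priority) * 20   # below 120: 20 ms per priority step
    return (140 - priority) * 5        # 120 and above: 5 ms per priority step

for prio in (100, 110, 120, 130, 139):
    print(prio, o1_time_slice_ms(prio), "ms")
# 100 -> 800 ms, 110 -> 600 ms, 120 -> 100 ms, 130 -> 50 ms, 139 -> 5 ms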

The O(1) scheduler keeps processes in two sets of queues. One set, called the active queues, holds processes that still have time slice to use; the other, called the expired queues, holds processes that have already used up their time slices. The scheduler takes a process from the active queue and runs it for its time slice; when the time slice is used up, the process is moved to the expired queue. When every process in the active queues has run, the scheduler swaps the active and expired queues and continues in the same way.

The description above ignores priority; adding it makes things a little more complicated. The operating system creates 140 active queues and 140 expired queues, one pair for each priority from 0 to 139. Initially, all processes are placed in the active queues. The scheduler then repeatedly picks a process from the highest-priority non-empty active queue; if two processes have the same priority, they have the same probability of being selected. After running, the process is removed from the active queue, and if its work is not finished within the time slice, it is added to the expired queue of the same priority. When all the processes in the 140 active queues have run, the expired queues hold many processes; the scheduler then swaps the active and expired queues and carries on in the same way. The expired and active queues are shown in figure 2.

Figure 2: Expired queue and active queue
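The active/expired mechanism can be sketched in a few lines. Plain Python deques stand in for the kernel's per-priority run queues, and the linear search over priorities stands in for the priority bitmap the real O(1) scheduler uses to find the highest non-empty queue in constant time; all names here are illustrative.

from collections import deque

# One FIFO queue per priority level (0..139) for both the active and expired sets.
NUM_PRIOS = 140
active  = [deque() for _ in range(NUM_PRIOS)]
expired = [deque() for _ in range(NUM_PRIOS)]

def enqueue(queues, prio, proc):
    queues[prio].append(proc)

def pick_next():
    """Take the first process from the highest-priority non-empty active queue.

    If every active queue is empty, swap active and expired and try again;
    this swap is what prevents low-priority processes from starving.
    """
    global active, expired
    for prio in range(NUM_PRIOS):          # smallest number = highest priority
        if active[prio]:
            return prio, active[prio].popleft()
    active, expired = expired, active      # everyone has used a time slice: swap
    for prio in range(NUM_PRIOS):
        if active[prio]:
            return prio, active[prio].popleft()
    return None, None                      # nothing runnable

# Example: run each process for its slice, then move it to the expired set.
enqueue(active, 120, "A")
enqueue(active, 130, "B")
prio, proc = pick_next()        # -> (120, 'A'): highest-priority active process
enqueue(expired, prio, proc)    # A has used up its time slice
prio, proc = pick_next()        # -> (130, 'B')
enqueue(expired, prio, proc)    # B has used up its time slice too
print(pick_next())              # active is empty: queues swap, A runs again first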

Let's look at an example where there are five processes, as shown in Table 1.

Table 1: Five example processes

The corresponding process queues (run queues) in the Linux operating system are shown in Table 2.

Table 2: Process queues

So in an execution cycle, the selected processes are first A, then B and C, then D, and finally E.

Note that the execution policy for normal processes does not guarantee that a process with priority 100 is always run before a process with priority 101; instead, every process gets a chance to run in each cycle of swapping the active and expired queues. This design avoids process starvation (starvation), the situation in which a low-priority process gets no chance to run for a long time.

We can see that the O(1) scheduler picks the next process easily, without traversing all processes. But it still has shortcomings: the running order and time slice length of processes depend heavily on priority. For example, calculating the time slice length for processes with priorities 100, 110, 120, 130, and 139 gives the values in Table 3.

Table 3: Time slice length by priority

Priority 100: (140 - 100) x 20 ms = 800 ms
Priority 110: (140 - 110) x 20 ms = 600 ms
Priority 120: (140 - 120) x 5 ms = 100 ms
Priority 130: (140 - 130) x 5 ms = 50 ms
Priority 139: (140 - 139) x 5 ms = 5 ms

From the table you can see that the gap in time slice length between priorities 110 and 120 (500 ms) is ten times the gap between priorities 120 and 130 (50 ms). In other words, time slice lengths are distributed very unevenly across priorities. The O(1) scheduler also adjusts process priority based on average sleep time: it assumes that a process which sleeps a lot is waiting for user interaction, and such interactive processes are given higher priority to improve the user experience. When that assumption does not hold, the O(1) scheduler allocates CPU time poorly.

The Completely Fair Scheduler

Since Linux 2.6.23, released in 2007, the Completely Fair Scheduler (CFS) has replaced the O(1) scheduler. The CFS scheduler does not estimate or guess anything about a process's behavior, which is completely different from O(1) with its distinction between interactive and non-interactive processes.

The CFS scheduler introduces the concept of virtual runtime (virtual runtime). Every time a process runs on the CPU for a while, its recorded virtual runtime increases. When choosing the next process to run, CFS does not pick the one with the highest priority; it picks the one with the smallest virtual runtime. In place of the O(1) scheduler's 140 queues, the Completely Fair Scheduler uses a data structure called a red-black tree, which can efficiently find the process with the smallest virtual runtime.
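A minimal sketch of this selection rule follows, with an ordinary min-heap standing in for the kernel's red-black tree (both give cheap access to the smallest virtual runtime). Only process A's 1,000-nanosecond figure comes from the example that follows; the other virtual runtimes and all names are made up for illustration.

import heapq

class CfsRunqueueSketch:
    """Toy run queue keyed by virtual runtime (a heap instead of a red-black tree)."""

    def __init__(self):
        self._heap = []            # entries: (vruntime_ns, name)

    def add(self, name, vruntime_ns):
        heapq.heappush(self._heap, (vruntime_ns, name))

    def min_vruntime(self):
        return self._heap[0][0] if self._heap else 0

    def pick_next(self):
        """Always run the process with the least virtual runtime so far."""
        return heapq.heappop(self._heap)

rq = CfsRunqueueSketch()
# B, C and D's virtual runtimes are hypothetical; only A's 1,000 ns is from the text.
for name, vr in [("A", 1000), ("B", 2000), ("C", 3000), ("D", 4000)]:
    rq.add(name, vr)

# A new process E starts at the current minimum, so it neither monopolizes
# the CPU nor waits behind everyone else.
rq.add("E", rq.min_vruntime())

vr, name = rq.pick_next()      # A runs first (the A/E tie is broken by name here)
rq.add(name, vr + 250)         # A ran for 250 ns, so its virtual runtime grows
print(rq.pick_next())          # next pick: the new smallest virtual runtime (E)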

Let's look at the CFS scheduler through an example. Suppose a running computer initially has four processes: A, B, C, and D. The kernel records each process's virtual runtime, as shown in Table 4.

Table 4: Virtual runtime of each process

Now the system adds a new process, E. The virtual runtime of a newly created process is not set to 0; it is set to the smallest virtual runtime among all current processes. This ensures the new process gets to run reasonably quickly. Among the original processes, the smallest virtual runtime is the 1,000 nanoseconds of process A, so E's initial virtual runtime is set to 1,000 nanoseconds. The new process list is shown in Table 5.

Table 5: New process list

When the scheduler next needs to select a process, it selects process A. Process A runs for a time slice determined by the scheduler; if it runs for 250 nanoseconds, its virtual runtime increases by that amount, while the other processes, which are not running, keep their virtual runtimes unchanged. After A uses up its time slice, the updated process list is shown in Table 6.

Table 6: Updated process list

As you can see, process A drops to third place, and the next process to run is process E. In essence, virtual runtime measures how much CPU time a process has already consumed; the less it has consumed, the sooner it deserves computing resources.

Under this basic design, the CFS scheduler lets all processes share the CPU fairly, which might make process priority sound meaningless. The CFS scheduler accounts for this too: it derives a time slice factor from the process's priority. For the same 250-nanosecond increase in virtual runtime, a low-priority process may actually have run for only 200 nanoseconds of real CPU time, while a high-priority process may have run for 300 nanoseconds. In this way, high-priority processes receive more computing resources.
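One way to realize this, roughly patterned after CFS's per-priority weights, is to make virtual runtime advance more slowly for heavier (higher-priority) processes. The weight table below is an illustrative approximation (each nice step changing the weight by about 25%), not the kernel's actual values.

NICE_0_WEIGHT = 1024   # reference weight for nice 0; illustrative, patterned after CFS

def weight_for_nice(nice):
    """Illustrative weight: each nice step changes the weight by roughly 25%."""
    return NICE_0_WEIGHT / (1.25 ** nice)

def vruntime_delta_ns(real_runtime_ns, nice):
    """Virtual runtime advances more slowly for heavier (higher-priority) processes."""
    return real_runtime_ns * NICE_0_WEIGHT / weight_for_nice(nice)

# How much real CPU time does each process get per 250 ns of virtual runtime?
for nice in (-5, 0, 5):
    # Invert the relation: find the real runtime whose vruntime delta is 250 ns.
    real = 250 * weight_for_nice(nice) / NICE_0_WEIGHT
    print(f"nice {nice:+d}: vruntime +{vruntime_delta_ns(real, nice):.0f} ns "
          f"for {real:.0f} ns of real CPU time")

With these illustrative weights, a high-priority (nice -5) process gets roughly 763 ns of real CPU time per 250 ns of virtual runtime, while a low-priority (nice +5) process gets roughly 82 ns, which is the effect the paragraph above describes.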

These are the basic principles of the scheduler and the scheduling strategies Linux has used. The scheduler allows CPU time to be allocated among processes sensibly; modern computers are multitasking systems, and the scheduler plays a central role in them.
