What is the function of Linux scheduler BFS 07/09 Update SLTechnology News&Howtos

What is the function of Linux scheduler BFS

2025-07-09 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Servers >

Shulou(Shulou.com)05/31 Report--

This article mainly explains "what is the function of Linux scheduler BFS". Interested friends may wish to take a look. The method introduced in this paper is simple, fast and practical. Let's let the editor take you to learn "what is the function of the Linux scheduler BFS?"

BFS is a process scheduler, which can be interpreted as a "brain disability scheduler". This odd name has multiple meanings, and it is easier to accept that it is so simple, but so good, that it makes people doubt their thinking ability.

Linux mainline,BFS, which does not incorporate BFS into Linus maintenance, does not intend to do so. But BFS has a large following for only one reason: BFS is excellent and makes users' desktop environment more fluent than ever before. In an era when the hardware is becoming more and more advanced, but the system is still often dull, this is really exciting.

In 2010, Android used BFS as the standard scheduler for its operating system, which also proved the value of BFS.

BFS vs CFS, performance test competition

After the emergence of BFS, it has been praised by many users, such as "fast, feel fast", "the rapid future of the desktop" and so on. These words are eye-catching, so I began to look around for test data about BFS, hoping to find numbers or curves that illustrate all of this. But the result was quite disappointing.

Testing of Jens Axboe

Shortly after BFS's release, in September 2009, Ingo Molnar released his review, comparing CFS with BFS. As the author of CFS, the test results he claims are not surprising: CFS is better than BFS in every way. However, people have different reactions to his evaluation results, some agree with him, while others have doubts. Jens Axboe is one of the skeptics who wrote a program called Latt.c himself that tried to test two mysterious attributes of the scheduler: "Interactivity" and "Fluidness".

His test results are just the opposite, showing that BFS is better than CFS in terms of interactivity, and its CPU utilization is higher. However, BFS is less stable and in some cases exhibits poor interactivity problems.

From the test data of Jens, BFS is slightly better than CFS, but the advantages are not as exaggerated as it is widely circulated. Interested readers can find detailed data about the Jens test in lkml's mailing list: http://thread.gmane.org/gmane.linux.kernel/886319/focus=887636

As a result, I was a little disappointed when I was looking forward to it. I didn't see BFS taking the lead. On the contrary, it is somewhat similar to the Olympic men's 100-meter final, and it is difficult to tell who is the champion for the time being. It is worth noting, however, that the test accidentally made people aware of a serious problem with CFS itself.

The sleeper fairness feature of CFS leads to severe scheduling delays in some cases, even a delay of 10s in the xmodmap test of Jens. And around the testing of Jens, people have issued statements that there are many interaction problems when using CFS, such as when compiling the kernel, there will be a serious pause in audio and video at the same time, while using BFS does not have these problems. However, these CFS problems mysteriously disappeared after turning off the sleeper fairness feature.

This forced the developers of CFS Scheduler to temporarily shut down the sleeper fairness feature, and at one point said it would officially turn it off in the upcoming 2.6.32 until the problem was resolved. Surprisingly, Ingo threw out a new patch, Gentle Fairness, within a week. The delay in using this patch,10s disappeared, as did other negative reports about mouse lag and video pause about CFS.

Testing of Phoronix

You can see Phoronix's professional testing of BFS at http://www.phoronix.com/scan.php?page=article&item=bfs_scheduler_benchmarks&num=1 and http://global.phoronix-test-suite.com/?k=profile&u=zero-9274-28890-6247. The test was also completed in September 2009, and as mentioned earlier, there have been some updates to both BFS and CFS, so the test does not fully reflect the latest status of both schedulers. However, as an authoritative evaluation institution, the evaluation results are still worth seeing.

From the test results of Phoronix, BFS has a slight lead in many tests, while CFS has surpassed in other test projects. I can't help feeling a little sad again.

The only test project that can reflect the "haste" of BFS comes from the test of network server throughput, and the most convincing and shocking histogram is posted here.

Figure 1. Network throughput test

But apart from this, in general, Phoronix's test results only show that BFS and CFS are neck and neck.

Evaluation of University of New Mexico computer Department

Taylor Groves, Je Knockel and Eric Schulte of the University of New Mexico also released a review of BFS vs. CFS in December 2009.

Their assessment focused on three areas: latency, Turnaround Time and interactivity. Here is an excerpt of their test results.

Figure 2. Delay

Figure 3. Turnaround Time

Figure 4. Interactivity

These three pictures are finally talking to comfort me for the hard work I have been looking for everywhere. According to the results of this evaluation, we can finally come to the following conclusion:

CFS is superior to BFS in terms of turnaround time. However, the scheduling delay of BFS is less than that of CFS. This shows that BFS is more suitable for interactive application environment. CFS is more suitable for batch job environments. This is the same experience as many users.

Summary

The above three tests were completed before the release of Linux2.6.32. However, CFS introduced the GENTLE_FAIR_SLEEPERS feature into Linux2.6.32, and as mentioned in Section 2.1, this patch is said to greatly improve interactivity. Unfortunately, since then, no one seems to have done any more comparative tests on CFS and BFS. Therefore, when Linux has entered the era of 2.6.35, we can not easily come to the conclusion that which is better or worse, BFS or CFS.

On the other hand, although professional reviews do not show the obvious advantages of BFS, from the information that can be gathered on Internet, most users feel that BFS can significantly improve the experience of interactive applications, which is a personal experience, such as whether the mouse moves smoothly and so on. In this kind of experience, the difference between the two schedulers is quite large, which cannot be explained by the previous test data.

Therefore, I think that at present, people do not understand the real reasons that affect interactivity, and the data focused on by professional tests cannot accurately describe subjective feelings such as "fluency". So, for BFS, we might as well believe in feeling for once.

So what improvements have BFS made, and if these improvements are so effective, why aren't mainstream kernels willing to accept BFS?

BFS vs CFS, different in design

During the day, Con Kolivas works as an anesthesiologist in the hospital, relieving pain for people and relieving himself with Linux in his spare time. Well, Kolivas didn't learn Linux to solve the pain, I just guessed. But according to Kolivas, he didn't even learn the C language when he came into contact with the Linux kernel. This fact proves that language is only a tool, and a deep understanding of the nature of the problem is the key to writing programs. There may be persistence, and the struggle between CFS and RSDL led Kolivas to leave the Linux community, and after years of this, when Kolivas started looking at kernel code again, he immediately discovered that CFS had the following design problems:

The goal of CFS is to support all application scenarios from the desktop to high-end servers. This large and comprehensive design idea makes it necessary to make some implementation tradeoffs. In addition, features that are only needed in high-end machines will introduce unnecessary complex code.

Second, in order to maintain fairness on multi-CPU, CFS uses a load balancing mechanism, and Kolivas believes that this complex code offsets the benefits of per cpu queue.

Finally, CFS in the mainstream kernel still has some preference for sleep processes, which means "unfair".

Differences in design objectives

In reality, the scheduling algorithm is similar to an awkward housewife, meeting the children's request for dinner may hurt the elderly's appetite. The Linux kernel has been trying to make a dish that the whole family likes, young and old, and CFS has done a good job in this respect. But a dish that can be accepted by everyone may mean a little insipid. BFS, on the other hand, intends to satisfy only one taste in order to take it to its limit.

According to Linux Magazine, Con Kolivas started thinking about BFS after seeing the following cartoon from xkcd.

Figure 5. Xkcd cartoons that ridicule the Linux scheduler

It all started with some Linux users who found that although Linux claims to be able to make full use of the computing power of 4096 CPU systems, it is unable to play Youtube videos smoothly on an ordinary laptop.

This makes people wonder whether some of the complex features of CFS really make sense for the Desktop environment. Is it necessary for people to use a scheduler that supports 4096 CPU in their personal computers?

BFS is a natural response to this doubt. It is not going to support the behemoth of 4096 CPU, and BFS is aimed at desktop computers used by ordinary people. In addition, BFS removes features that are only needed on the server. For example, BFS abandons the group scheduling feature of CFS, and features like CGROUP are redundant technology for the average desktop user.

This is easy to understand: who designs multiple CPU in a system with only one CGroup, and where else can concepts such as NUMA domain be used?

In addition, BFS uses a single run queue, which eliminates the need for complex load balancing mechanisms. Because there is no longer the concept of CGROUP, there is no need for load balancing between Group.

These simple tailoring makes the BFS code greatly simplified, which means that the number of instructions required to perform a schedule is reduced, and the corresponding footprint is naturally reduced.

Of course, simplifying the code is only an obvious aspect, and more importantly, the difference in concept will have a more profound impact on the final scheduler implementation, which is hard to describe.

Multi-queue vs single queue

When the Linux kernel entered 2.6, the scheduler adopted per cpu run queue to overcome the limitations of a single run queue. In a multi-CPU system, a single run queue means that the run queue becomes the bottleneck of the system, because at the same time, when one CPU accesses the run queue, the other CPU must wait even if it is idle. When per CPU's run queue is used, each CPU no longer has to use large locks, allowing scheduling to be processed in parallel.

But many things are not as simple as they seem at first glance.

Kolivas found that the benefits of adopting per cpu run queue are offset by load balance code that pursues fairness. In the current CFS scheduler, each CPU only maintains the fairness of all processes in the local run queue. In order to achieve cross-CPU scheduling fairness, CFS must regularly load balance and move some processes from the busy CPU run queue to other idle run queue.

This load balance process requires acquiring locks from other run queue, which reduces the parallelism brought about by multiple run queues.

And in complex cases, the introduction of footprint due to load balance will be very considerable.

Of course, the cost of locking introduced by load balance is still lower than that of global locking, and the cost difference becomes more significant with the increase of the number of CPU. Note, however, that BFS does not intend to work for systems with 1024 CPU, and if the number of CPU in the system is limited, the advantage of multi-run queue is not obvious.

After BFS adopts a single queue, each new process that needs to be scheduled can find the most appropriate CPU globally, without having to wait for the load balance code to decide as CFS does, which reduces the delay of adjudication between multiple CPU, and the final result is a smaller scheduling delay.

Look forward or backward?

Kolivas has been following Linux's performance on desktop for many years. For desktop users, the most important thing is not the throughput of the system, but the smooth experience of interactive programs. Since SD, Kolivas has told kernel hackers that complete fairness can fundamentally guarantee interactivity. He always adheres to a basic idea: the scheduler should be forward look only. Never think about the past of a process.

However, CFS has to consider the past of the process. 2.6.23, CFS records and uses sleep time. Shortly after that, when 2.6.24 was released, CFS merged "Real Fair Scheduler" and deleted sleep time. So in the kernel after 2.6.24, CFS finally stopped thinking about the sleep time of the process in the past.

But CFS still retains the idea of sleeper fairness, when the process wakeup, in the place_entity () function, CFS will reward sleeper so that it can get CPU as soon as possible. This strategy is very subtle, and we described the evolution of sleeper fairness in detail in Section 2.1. If you take some time to look back, you will see what a serious delay problem sleeper fairness has caused. Although Ingo claims that Gentle fairness solves the latency problem, from a code point of view, Gentle Fairness only halves the reward for sleeper. So we can say that CFS still rewards the Sleeper process, which represents a preference, an "unfairness". And this is exactly what BFS opposes.

In BFS, when a process wakeup, the scheduler will choose according to the process's deadline (described in detail in Chapter 4 of this article on deadline). As a result, the process that sleeps earlier can be scheduled more quickly; the sleeper fairness of CFS means that the next scheduled process is selected according to the time of wakeup, and the process of earlier wakeup will be scheduled faster.

There is no theoretical basis for reference on what impact this difference will have on desktop applications. But I personally think that BFS's strategy is more reasonable.

You may be a little upset by now, so I'd better introduce the implementation details of BFS as soon as possible. Then maybe you will understand me, some words are better not translated.

Principle of BFS implementation

Scheduler is a very complex topic, especially the CFS scheduler. To describe it clearly, you need an extraordinary pen, which I haven't found yet. But BFS is very simple, so I have the courage to write something about the implementation principle of BFS here. First of all, several key concepts are introduced.

Virtual Deadline (Virtual Deadline)

When a process is created, it is given a fixed time slice and a virtual Deadline. The calculation formula for the virtual deadline is very simple:

Virtual Deadline = jiffies + (user_priority * rr_interval) formula one

Where jiffies is the current time, user_priority is the priority of the process, and rr_interval stands for round-robin interval, which is similar to the deadline that a process must be scheduled, the so-called Deadline. However, before this Deadline, there is an adjective "Virtual", so this Deadline just expresses a wish, not the kind of deadline that many leaders often say.

The virtual Deadline will be used for the picknext decision of the scheduler, which will be described in detail in later chapters.

Representation and scheduling strategy of process queue

Within the operating system, all Ready processes are stored in the process queue, and the scheduler selects the next scheduled process from the process queue. Therefore, how to design the process queue is an important topic for us to study the scheduler. BFS uses a very traditional process queue representation, that is, bitmap plus queue.

BFS divides all processes into four categories, which represent different scheduling policies:

● Realtime, real-time process

● SCHED_ISO,isochronous process for interactive tasks

● SCHED_NORMAL, normal process

● SCHED_IDELPRO, low priority task

Real-time processes can always get CPU, and Round Robin or FIFO methods are used to select real-time processes with the same priority. They need the authority of superuser, which is usually limited to processes that take up little CPU time but care a lot about Latency.

SCHED_ISO is still not implemented in the mainstream kernel. Con proposed this patch as early as 2003, but it has not been able to enter the mainstream kernel. This scheduling strategy is designed for those near-realtime processes. As mentioned earlier, real-time processes require the user to have superuser privileges, and such processes can monopolize CPU, so only a few processes can be configured as real-time processes. For those processes that require high interactivity but cannot become real-time processes, BFS will adopt SCHED_ISO, which can preempt SCHED_NORMAL processes. Their priority is higher than SCHED_NORMAL, but lower than real-time processes. In addition, when the SCHED_ISO process occupies a certain limit of CPU time, it will be downgraded to SCHED_NORMAL to prevent it from monopolizing the entire system resources.

SCHED_NORMAL, similar to SCHED_OTHER in the mainstream scheduler CFS, is a basic time-sharing scheduling strategy.

SCHED_IDELPRO is similar to SCHED_IDLE in CFS, that is, a process that is scheduled only when CPU is about to be in the IDLE state.

In these different scheduling strategies, the real-time process is divided into 100 different priorities, plus the other three scheduling strategies, there are a total of 103 different process types. For each process type, it is possible to have multiple processes Ready at the same time in the system, for example, there are probably two RT processes with a priority of 10 Ready at the same time, so for each type, a queue is needed to store ready processes belonging to that type.

BFS uses 103 bitmap to indicate whether there are processes of the corresponding type ready to schedule. As shown in the following figure:

Figure 6. BFS process queue

When any type of process queue is not empty, that is, there is a Ready process, the corresponding bitmap bit is set to 1.

The problem of how the scheduler selects the next scheduled process in such a complex structure of bitmap plus queue is called Task Selection or pick next.

Task Selection i.e. Pick Next

When the scheduler decides to schedule the process, BFS will select the task according to the following principles:

Figure 7. Task Selection

First check to see if the bitmap has a set bit. For example, in the figure above, the bit corresponding to SCHED_NORMAL is set, indicating that there is a process ready of type SCHED_NORMAL. If any bits of SCHED_ISO or RT task are set, they are given priority.

Once the corresponding bit bit is selected, you need to traverse its corresponding subqueue. If it is a subqueue of a RT process, select the first process. If it is another queue, then use the EEVDF algorithm to select the appropriate process.

EEVDF, or earliest eligible virtual deadline first. BFS will traverse the subqueue, a two-way list, comparing the Virtual Deadline value of each process in the queue to find the smallest one. At worst, this is an O (n) algorithm, that is, it needs to traverse the entire bi-directional list, and if there are n processes, it needs to be read and compared.

But in fact, there is often no need to traverse the entire n processes, because BFS also has such a search condition:

When the Virtual Deadline of a process is less than the current jiffies value, the process is returned directly. It is removed from the ready queue and will be placed at the end of the queue the next time you insert, ensuring that each process is likely to be selected without hunger.

This rule corresponds to a situation where the process has been sleeping for so long that it has slept through its Virtual Deadline, as shown in the following figure:

Figure 8. Sleep and awakening

The original virtual deadline of T1 is T1, and after it sleep, other processes such as T2 start to run. By the time T1 wakeup again, the jiffies is already greater than T1. In this case, T1 does not need to be compared with the virtual deadline of other processes, but is directly selected by the BFS scheduler.

Basic scheduling scenario

The three basic scenario can summarize most scheduling scenarios. Every scheduling that occurs in the system belongs to one of the following three scenarios.

Process wakeup:Task Insertion

When sleeping in the process wakeup, the scheduler needs to perform the operation of task insertion to insert the process into the run queue. The operation of BFS to insert a process into the corresponding queue is to perform a two-way queue insertion operation. The common algorithm structure of the computer tells us that this operation is O (1). However, BFS needs to see if the current process can preempt processes that are currently running on the system before performing the insert operation. So it compares the virtual deadline value of the new process with the virtual deadline value of the process currently running on each CPU, and if the value of the new process is small, it directly preempts the running process on that CPU. This algorithm is O (m), where m is the number of CPU. If there are 16 CPU in the system, then 16 comparisons are required each time. But this design ensures very good low-latency features.

Process Sleep

The currently running process is likely to sleep actively, and at this point the scheduler needs to remove the process from the run queue and select another process to run. However, the value of the process's virtual deadline remains the same.

In this way, when the process wakeup, its virtual deadline will be relatively small, because the jiffies continues to increase over time. A smaller Virtual Deadline ensures that the process can be scheduled more quickly.

Still taking figure 8 as an example, there are two processes in the system. After T1 and T2 Magi T1 enter the sleep state, their virtual deadline is still T1. T2 is scheduled at this time, and its virtual deadline is T2 calculated according to formula 1. Since then, the T1 process wakeup, at this time, although the time slice of T2 has not been used up, but because the virtual deadline of T1 is less than T2, (T1)

The process uses up its own time slice

Each process has its own time slice, and even if it is not preempted by other processes, if its own time slice is used up, the current process will certainly be deprived of CPU time so that other processes have a chance to execute.

After running out of time slices for the current process, you must give up the CPU and recalculate its virtual deadline according to the formula one.

This guarantees a feature: only after all the other ready processes get CPU, the process that runs out of the current time slice can be run again, which avoids hunger.

I feel powerless at the moment. It seems that the introduction should not end abruptly here, but I have indeed finished what I want to say. The only thing I can do is to seize the last opportunity here to make a small summary.

BFS focuses on a single goal, so it can simplify the code to the extreme. It uses a single Queue, thus eliminating the need for load balance, although the concurrency is reduced, but for desktop systems with a small number of CPU, its ability to quickly switch CPU should be able to compensate for the loss of concurrency, and maybe a surplus.

BFS is only concerned about the future, it is completely fair, a process's sleep habits and its past can not affect the timing of its next scheduling. In the BFS world, the scheduler makes fair scheduling strictly according to the Deadline of each process, which is simple, serious and even monotonous.

Well, I have to admit, I can't see any advanced ideas or features from these descriptions, but the real experience of the majority of users speaks for itself. I think this may also mean that the theoretical basis for describing desktop interaction is so lacking that I can only sum it up through perceptual rather than rational. What I want to say is that my experience is "fast, really fast".

At this point, I believe you have a deeper understanding of "what is the role of Linux scheduler BFS". You might as well do it in practice. Here is the website, more related content can enter the relevant channels to inquire, follow us, continue to learn!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.