How to Understand the Workqueue Mechanism in Linux

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article explains how the workqueue mechanism in the Linux kernel works: the data structures behind a work item, why a single worker thread per CPU is not enough, and how worker pools address that.

Workqueue: The workqueue mechanism in Linux spares kernel code from creating and managing its own kernel threads. A caller hands work to the workqueue interface, and the kernel provides the threads that execute it. Worker threads can be created according to the number of CPUs in the system, so that the work they process can run in parallel. Workqueue is a simple and effective kernel mechanism that greatly simplifies tasks which would otherwise need dedicated kernel daemons.

Workqueue mechanism in Linux

The workqueue mechanism in Linux is one implementation of the interrupt bottom half, and it is also a common way to process tasks asynchronously. A work item placed on a workqueue is represented in code by the "work_struct" struct (defined in include/linux/workqueue.h):

    struct work_struct {
        struct list_head entry;
        work_func_t func;
        atomic_long_t data;
    };

Here "entry" is the list node that links the work item into the queue it is pending on, and "func" is the function to execute for the work item. The meaning of "data" is richer. Its lowest four bits hold "flags", and the next four bits hold the "color" used by the flush function. Flushing simply means waiting until the work items on the workqueue have been processed, so that the workqueue is drained (the author has not studied the implementation of this part in depth, so this article does not cover it).

The remaining bits have different meanings in different scenarios (much like a "union" in C). They can hold the address of the workqueue that the work item is on (more precisely, of its pool_workqueue). Because the lower 8 bits are used for other purposes, that address must be aligned to 256 bytes. Alternatively, they can hold the ID of the pool that contains the worker thread processing the work item (more about pools later in this article).
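The packing described above can be sketched in user-space C. This is only an illustration under the bit layout given in the text (4 flag bits, 4 color bits, remaining bits as payload). The names demo_queue, DEMO_FLAG_POINTER, pack_pointer, pack_pool_id and the getter functions are hypothetical and are not the kernel's own identifiers.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, taken from the text: 4 flag bits, 4 color bits, payload. */
#define FLAG_BITS   4
#define COLOR_BITS  4
#define LOW_BITS    (FLAG_BITS + COLOR_BITS)                  /* 8 bits */
#define FLAG_MASK   (((uintptr_t)1 << FLAG_BITS) - 1)
#define COLOR_MASK  ((((uintptr_t)1 << COLOR_BITS) - 1) << FLAG_BITS)
#define LOW_MASK    (((uintptr_t)1 << LOW_BITS) - 1)

/* Hypothetical flag: set when the payload is a pointer, clear for a pool ID. */
#define DEMO_FLAG_POINTER ((uintptr_t)1 << 2)

/* Hypothetical stand-in for the object a work item points back to.
 * 256-byte alignment guarantees that the low 8 address bits are zero. */
struct demo_queue {
    _Alignas(256) int placeholder;
};

/* Store an aligned pointer together with flags and color in one word. */
static uintptr_t pack_pointer(const struct demo_queue *q,
                              unsigned flags, unsigned color)
{
    uintptr_t addr = (uintptr_t)q;

    assert((addr & LOW_MASK) == 0);   /* otherwise the low bits would collide */
    return addr
         | (((uintptr_t)color << FLAG_BITS) & COLOR_MASK)
         | ((flags & FLAG_MASK) | DEMO_FLAG_POINTER);
}

/* Store a pool ID in the upper bits instead of a pointer. */
static uintptr_t pack_pool_id(uintptr_t pool_id, unsigned flags)
{
    return (pool_id << LOW_BITS) | (flags & FLAG_MASK & ~DEMO_FLAG_POINTER);
}

static int holds_pointer(uintptr_t data)
{
    return (data & DEMO_FLAG_POINTER) != 0;
}

static unsigned get_flags(uintptr_t data)
{
    return (unsigned)(data & FLAG_MASK);
}

static unsigned get_color(uintptr_t data)
{
    return (unsigned)((data & COLOR_MASK) >> FLAG_BITS);
}

static struct demo_queue *get_queue(uintptr_t data)
{
    return (struct demo_queue *)(data & ~LOW_MASK);   /* strip the low 8 bits */
}

static uintptr_t get_pool_id(uintptr_t data)
{
    return data >> LOW_BITS;
}
```

A caller would first test holds_pointer() and then use either get_queue() or get_pool_id(), which is how a single word can play the role of a C union without a separate tag field.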

Packing different kinds of data into a single C variable like this is common in Linux code. In the current workqueue mechanism, "flags" and "color" need only a few bits each, and storing them in separate integer variables would increase memory consumption. The cost is readability, and some kernel developers consider the practice "ugly".

To make full use of cache locality, the CPU that handled a hardirq is usually also chosen to execute the workqueue bottom half that corresponds to that hardirq. In early Linux implementations, each CPU had one work queue, and a single worker thread on each CPU processed it.

Let's see what is wrong with this design. Suppose a work item (call it w0) has been added to the workqueue. w0 needs to run for 5ms, sleep for 10ms, and then run for another 5ms. Two more work items (call them w1 and w2) are added to the workqueue 5ms and 10ms after w0 starts running, respectively. w1 and w2 each need to run for 5ms and then sleep for 10ms (this example comes from the kernel document Documentation/core-api/workqueue.rst).

Because there is only one worker thread, no other work item can run while the current one sleeps. The three work items therefore execute strictly one after another, and completing them takes 20ms + 15ms + 15ms = 50ms in total.

Now suppose the CPU has two worker threads, worker 1 and worker 2. While w0 sleeps, the second worker can run w1, and the overall execution time drops to 35ms.

With three worker threads on the CPU, w2 can also start while w0 and w1 are asleep, and the execution time drops further to 25ms.
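The three scenarios above can be checked with a small user-space simulation. This is a rough sketch, not kernel code: it assumes a single CPU, 1ms time steps, and that a free worker only picks up a queued item when no other worker wants the CPU. The names sim_item, example_items, and simulate are hypothetical.

```c
#define MAX_ITEMS  8
#define MAX_PHASES 3

enum item_state { QUEUED, ACTIVE, DONE };

/* One work item. Phases alternate run, sleep, run, ... (all in ms). */
struct sim_item {
    int arrival_ms;             /* when the item is added to the queue */
    int nphases;
    int phase_ms[MAX_PHASES];   /* even index: uses the CPU, odd index: sleeps */
};

/* The three items from the text, listed in queueing order. */
static const struct sim_item example_items[3] = {
    { 0,  3, { 5, 10, 5 } },    /* w0: run 5, sleep 10, run 5 */
    { 5,  2, { 5, 10 } },       /* w1: run 5, sleep 10 */
    { 10, 2, { 5, 10 } },       /* w2: run 5, sleep 10 */
};

/* Returns the time in ms at which the last item completes.
 * Assumes n <= MAX_ITEMS and that every phase lasts at least 1 ms. */
static int simulate(const struct sim_item *items, int n, int max_workers)
{
    enum item_state state[MAX_ITEMS];
    int phase[MAX_ITEMS], left[MAX_ITEMS];
    int active = 0, done = 0, t = 0;

    for (int i = 0; i < n; i++) {
        state[i] = QUEUED;
        phase[i] = 0;
        left[i] = 0;
    }

    while (done < n) {
        int runner = -1;

        /* Prefer a worker that already holds an item and wants the CPU. */
        for (int i = 0; i < n && runner < 0; i++)
            if (state[i] == ACTIVE && phase[i] % 2 == 0)
                runner = i;

        /* CPU otherwise idle: a free worker takes the first queued item. */
        if (runner < 0 && active < max_workers) {
            for (int i = 0; i < n && runner < 0; i++) {
                if (state[i] == QUEUED && items[i].arrival_ms <= t) {
                    state[i] = ACTIVE;
                    left[i] = items[i].phase_ms[0];
                    active++;
                    runner = i;
                }
            }
        }

        t++;    /* advance the clock by 1 ms */
        for (int i = 0; i < n; i++) {
            if (state[i] != ACTIVE)
                continue;
            if (phase[i] % 2 == 0 && i != runner)
                continue;       /* runnable, but still waiting for the CPU */
            if (--left[i] > 0)
                continue;
            if (++phase[i] == items[i].nphases) {
                state[i] = DONE;    /* the worker becomes free again */
                active--;
                done++;
            } else {
                left[i] = items[i].phase_ms[phase[i]];
            }
        }
    }
    return t;
}
```

Under these assumptions, simulate(example_items, 3, 1) returns 50, simulate(example_items, 3, 2) returns 35, and simulate(example_items, 3, 3) returns 25, for one, two, and three worker threads respectively.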

cmwq

This practice of running multiple worker threads on one CPU was introduced in kernel version 2.6.36 and is the concurrency managed workqueue (cmwq) that the Linux kernel uses today. A single CPU cannot run multiple threads truly "simultaneously", which is why the name says concurrency rather than parallelism.

Obviously, choosing the right number of worker threads is critical: too many waste resources, and too few leave the CPU underused. The general principle is that if all worker threads on a CPU are asleep but unprocessed work items remain on the workqueue, another worker thread is started.
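The principle above boils down to a two-part condition, sketched here in user-space C. The struct pool_state and the function should_start_worker are hypothetical names chosen for illustration, and the counters are assumed to be kept up to date by whoever queues work and schedules workers.

```c
#include <stdbool.h>

/* Hypothetical snapshot of one CPU's worker state. */
struct pool_state {
    int nr_running;   /* workers currently able to use the CPU (not asleep) */
    int nr_pending;   /* work items queued but not yet picked up */
};

/* Start another worker only when work is waiting and nobody can run it. */
static bool should_start_worker(const struct pool_state *p)
{
    return p->nr_pending > 0 && p->nr_running == 0;
}
```

The check deliberately ignores how many workers exist in total. What matters is whether the CPU would otherwise sit idle while work is waiting.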

All worker threads on a CPU together form a worker pool (a concept introduced in kernel v3.8). The idea resembles the more familiar memory pool: when memory is needed, it is taken from the pool of spare memory. Similarly, when a work item on the work queue needs processing, an idle worker thread is picked from the worker pool to service it.

The worker pool is represented in code by the "worker_pool" struct (defined in kernel/workqueue.c):

    struct worker_pool {
        spinlock_t lock;                    /* the pool lock */
        int cpu;                            /* the associated cpu */
        int id;                             /* pool ID */
        struct list_head idle_list;         /* list of idle workers */
        DECLARE_HASHTABLE(busy_hash, 6);    /* hash of busy workers */
        ...
    };

A worker that is processing a work item is busy and is placed in "busy_hash", a hash table of order 6 (2^6 = 64 buckets). A hash table needs a key, and here the key is the memory address of the work item being processed.

A worker that is not processing a work item is idle and is placed on the list of idle workers. Idle worker threads are usually few, so a linked list is enough to manage them. Busy worker threads may be more numerous, so they are organized in a hash table to speed up lookups.
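The idle list and the busy hash table can be sketched in user-space C as follows. This is a simplified illustration, not the kernel's implementation: it uses plain singly linked lists instead of the kernel's list and hash helpers, the hash function is an arbitrary choice, and all names (demo_work, demo_worker, demo_pool, and the functions) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define BUSY_HASH_BITS 6
#define BUSY_HASH_SIZE (1u << BUSY_HASH_BITS)    /* 64 buckets */

struct demo_work { int payload; };

struct demo_worker {
    struct demo_worker *next;         /* link in the idle list or a bucket */
    struct demo_work *current_work;   /* NULL while idle */
};

struct demo_pool {
    struct demo_worker *idle_list;
    struct demo_worker *busy_hash[BUSY_HASH_SIZE];
};

/* Key is the work item's address. Shifting drops low bits that tend to be
 * zero because of alignment; the mask keeps 6 bits. */
static unsigned bucket_of(const struct demo_work *work)
{
    return (unsigned)(((uintptr_t)work >> 4) & (BUSY_HASH_SIZE - 1));
}

static void add_idle_worker(struct demo_pool *pool, struct demo_worker *w)
{
    w->current_work = NULL;
    w->next = pool->idle_list;
    pool->idle_list = w;
}

/* Move one idle worker into the busy hash. Returns NULL if none is idle. */
static struct demo_worker *start_work(struct demo_pool *pool,
                                      struct demo_work *work)
{
    struct demo_worker *w = pool->idle_list;
    unsigned b = bucket_of(work);

    if (!w)
        return NULL;
    pool->idle_list = w->next;
    w->current_work = work;
    w->next = pool->busy_hash[b];
    pool->busy_hash[b] = w;
    return w;
}

/* Find the worker that is processing a given work item, if any. */
static struct demo_worker *find_busy_worker(struct demo_pool *pool,
                                            const struct demo_work *work)
{
    struct demo_worker *w;

    for (w = pool->busy_hash[bucket_of(work)]; w; w = w->next)
        if (w->current_work == work)
            return w;
    return NULL;
}

/* Unlink a busy worker from its bucket and put it back on the idle list. */
static void finish_work(struct demo_pool *pool, struct demo_worker *worker)
{
    struct demo_worker **pp =
        &pool->busy_hash[bucket_of(worker->current_work)];

    while (*pp && *pp != worker)
        pp = &(*pp)->next;
    if (*pp)
        *pp = worker->next;
    add_idle_worker(pool, worker);
}
```

Looking up the worker for a given work item only walks one bucket, whereas taking an idle worker just pops the head of the idle list, which is why the two states can use different containers.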

As mentioned earlier, when unprocessed work items remain and no worker is available to run them, the kernel starts a new worker thread to improve efficiency. Conversely, when there are too many idle worker threads, some of them need to be destroyed to save resources. It is like a company that recruits when projects are pressing and staff is short, and may lay people off when projects are scarce and staff is in excess. Deciding how many idle threads to keep in order to strike a good balance involves a fairly complex algorithm, which I won't expand on here.

A worker thread is represented in code by the "worker" struct (defined in kernel/workqueue_internal.h):

    struct worker {
        struct worker_pool *pool;            /* the associated pool */
        union {
            struct list_head entry;          /* while idle */
            struct hlist_node hentry;        /* while busy */
        };
        struct work_struct *current_work;    /* work being processed */
        work_func_t current_func;            /* current_work's fn */
        struct task_struct *task;            /* worker task */
        struct pool_workqueue *current_pwq;  /* current_work's pwq */
        ...
    };

Here "pool" is the worker pool that the worker thread belongs to. Depending on its state, the worker is linked either into the idle worker list (through "entry") or into the busy worker hash table (through "hentry"). Since it can never be in both at once, the two link fields share storage in a union.

"current_work" and "current_func" are, respectively, the work item that the worker thread is processing and that item's entry function. Since the worker thread is a kernel thread, it always has a corresponding task_struct (the "task" field), whether it is idle or busy.

"current_pwq" points to the pool_workqueue ("pwq") of the work item being serviced, which in turn identifies the workqueue that the work item was queued on.
