2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
Concurrency Interview Essentials: Processes, Threads, and Coroutines
Among the common interview questions collected in "Awesome Interviews", concurrency and asynchrony are core topics for both front-end and back-end roles. The "Concurrent Programming" series reviews and summarizes the concurrency knowledge most often tested in interviews; you can also visit "Awesome Interviews" to gauge your mastery against real interview questions, or see "Java Practice" and "Go Practice" for concurrent programming in specific languages.
In a system without an OS, programs execute sequentially: one program must finish before another is allowed to run. In a multiprogramming environment, multiple programs may execute concurrently, and the two execution modes differ significantly. It is this property of concurrent execution that led to the introduction of the process concept in operating systems. The process is the basic unit of resource allocation, while the thread is the basic unit of scheduling.
When an application starts, its static instructions are loaded into memory and then fed to the CPU for execution. The operating system sets aside stack memory to hold instructions and variable values, and this forms a process. Early operating systems scheduled the CPU per process, and memory was not shared between processes, so switching tasks required switching memory-mapped addresses. Because a process's context (variables, references, counters, and other live state) occupies a sizable chunk of memory, frequently switching processes means saving a large amount of state for each suspended process and restoring that state when the process next gets a time slice.
This is costly in both time and space, which is what motivates multithreading. All threads created by a process share one memory space, so the cost of switching between them is very low. Modern operating systems schedule these lighter-weight threads, and what we now call "task switching" usually means "thread switching".
Processes and threads
This section is excerpted from "Linux and operating system / process Management".
Process
A process is the operating system's abstraction of a running program. Multiple processes can run on a system at the same time, and each appears to have exclusive use of the hardware. "Concurrent execution" means that the instructions of one process are interleaved with the instructions of another. On both single-core and multi-core systems, a single CPU can create the illusion of many processes executing simultaneously by switching the processor among them; the mechanism by which the operating system implements this interleaving is called context switching.
The operating system keeps track of all the state information a process needs to run. This state, or context, includes information such as the current values of the PC and the register file, as well as the contents of main memory. A single-processor system can execute the code of only one process at any moment. When the operating system decides to transfer control from the current process to a new one, it performs a context switch: it saves the context of the current process, restores the context of the new process, and passes control to it. The new process then resumes exactly where it last stopped.
As described in the section on virtual memory management, the operating system gives each process the illusion that it has exclusive use of main memory. Each process sees a consistent memory image called its virtual address space. The top region of the virtual address space is reserved for operating-system code and data, identical for all processes; the lower region of the address space holds the code and data defined by the user process.
Program code and data: for every process, the code begins at the same fixed address and is initialized directly from the contents of the executable object file.
Heap: the code and data areas are followed by the run-time heap. Unlike the code and data areas, whose sizes are fixed once the process starts, the heap can expand and contract dynamically at run time as C standard-library functions such as malloc and free are called.
Shared libraries: near the middle of the address space is a region holding the code and data of shared libraries such as the C standard library and the math library.
Stack: at the top of the user virtual address space is the user stack, which the compiler uses to implement function calls. Like the heap, the user stack can expand and contract dynamically during program execution.
Kernel virtual memory: the kernel always resides in memory and is part of the operating system. The region at the top of the address space is reserved for the kernel; applications may not read or write its contents or directly call functions defined in kernel code.
Thread
In modern systems, a process can actually consist of multiple execution units called threads, each running in the context of the process and sharing the same code and global data. Processes are completely independent of one another, while threads within a process are interdependent: in a multi-process environment, the termination of any one process does not affect the others, but in a multithreaded environment, when the parent thread terminates, all of its child threads are forcibly terminated as well (losing their resources).
The termination of any single child thread generally does not affect the other threads, unless a child thread calls the exit() system call; if any thread calls exit(), all threads die together. A multithreaded program has at least one main thread, which is the thread holding the main function. It is in effect the process of the whole program, and all other threads are its children; we usually refer to this main process of a multithreaded program as the main thread.
The environment shared by threads includes the process code segment, the process's public data, open file descriptors, signal handlers, the current working directory, and the process's user ID and group ID. This shared data makes it easy for threads to communicate with one another. Alongside what they share, threads also keep state of their own, which is what makes concurrent execution possible:
Thread ID: each thread has its own thread ID, unique within the process, which the process uses to identify its threads.
Register set: because threads run concurrently, each follows its own line of execution. When switching from one thread to another, the register state of the outgoing thread must be saved so that it can be restored when the thread is later switched back in.
Thread stack: a stack is necessary for a thread to run independently. A thread's function may call other functions, which may nest arbitrarily deep, so each thread must have its own call stack for function calls to execute normally, unaffected by other threads.
Error return code: with many threads running in the same process, one thread may set errno during a system call, and before it handles the error the scheduler may run another thread that overwrites the value. Each thread therefore needs its own error-return variable.
Signal mask: since each thread is interested in different signals, a thread's signal mask should be managed by the thread itself; all threads, however, share the same signal handlers.
Thread priority: since threads must be scheduled much like processes, there has to be a parameter the scheduler can use, namely the thread's priority.
Thread models
Threads implemented in user space
When threads are implemented in user space, the operating system knows nothing of their existence: it sees only processes, never threads. All threading is realized in user space, and from the operating system's point of view every process has exactly one thread. Most early operating systems worked this way, and one advantage of the approach is that threads can be supported through library functions even when the operating system itself does not support them.
Under this model, programmers must implement the thread data structures, creation and destruction, and scheduling maintenance themselves, effectively writing their own small thread-scheduling kernel, while all of these threads run inside a single operating-system process that the OS schedules as a unit.
This has some advantages: it achieves multithreading even without operating-system support, and because thread scheduling stays entirely in user mode, it avoids the overhead of switching between kernel mode and user mode. The fatal drawback of this model is that the operating system is unaware of the threads, so when one thread in a process makes a system call that blocks, a page fault for example, the operating system blocks the entire process, even if other threads in it could still do work. Another problem is that if one thread does not release the CPU for a long time, the other threads in the process starve, because user space has no clock-interrupt mechanism to force a switch.
Threads implemented in the operating system kernel
Kernel threads are supported directly by the operating-system kernel (Kernel): switching is performed by the kernel, which schedules threads through the scheduler (Scheduler) and is responsible for mapping their tasks onto the processors. Each kernel thread can be regarded as a separate part of the kernel, which is what gives the operating system the ability to handle several things at once; a kernel that supports multithreading is called a multi-threaded kernel (Multi-Threads Kernel).
Programmers use the threads implemented by the operating system directly; creation, destruction, scheduling, and maintenance are all handled by the operating system (by the kernel, to be exact). Programmers only need to make system calls, and need not design their own thread-scheduling algorithms or manage threads' preemptive use of the CPU.
Hybrid implementation using user threads and lightweight processes
In this hybrid implementation, user threads and lightweight processes coexist. User threads are still built entirely in user space, so creating, switching, and destroying them remains cheap, and large-scale user-thread concurrency is still possible. Meanwhile, the lightweight processes supported by the operating system act as a bridge between user threads and kernel threads, so the thread scheduling and processor mapping provided by the kernel can still be used, and user threads' system calls go through lightweight processes, greatly reducing the risk of the entire process blocking. In this hybrid mode, the ratio of user threads to lightweight processes is variable, that is, an N:M relationship.
Golang's goroutines use this model: coroutines switch quickly in user mode, avoiding the CPU overhead of thread scheduling. A coroutine is, so to speak, a thread running on top of a thread.
Threads in Linux
Before Linux 2.4, threads were implemented and managed entirely as processes. Before Linux 2.6, the kernel did not support the concept of threads and only simulated them with lightweight processes (Lightweight Process). A lightweight process is a user thread built on and supported by the kernel: a high-level abstraction over kernel threads, with each lightweight process associated with one specific kernel thread. Kernel threads can only be managed by the kernel and are scheduled like ordinary processes. The defining feature of this model is that thread scheduling is done by the kernel, while the other thread operations (synchronization, cancellation) are done by functions of the out-of-kernel thread library (LinuxThreads).
To become fully compatible with the POSIX standard, Linux 2.6 first improved the kernel by introducing the concept of thread groups (still represented by lightweight processes). With this concept a group of threads can be organized as one process, but the kernel prepares no special scheduling algorithm or special data structure for threads: a thread is simply seen as a process (conceptually, a thread) that shares certain resources with other processes (conceptually, threads). The main implementation change was adding the tgid field to task_struct, the field that holds the thread group id. On the user-side thread library, NPTL replaced LinuxThreads, still using the one-to-one model.
A process is created by calling the fork system call: pid_t fork(void); a thread by calling the clone system call: int clone(int (*fn)(void *), void *child_stack, int flags, void *arg, ...). Compared with a standard fork(), the overhead of creating a thread is small because the kernel does not need to separately copy the process's memory space, file descriptors, and so on. This saves a great deal of CPU time, makes creating a thread ten to a hundred times faster than creating a process, and allows threads to be used heavily without much worry about CPU or memory shortages. Whether through fork, vfork, or kthread_create, do_fork is ultimately called, and do_fork allocates the resources a process needs according to its arguments.
Kernel thread
Kernel threads are created by the kernel itself and are also known as daemon threads. Among the processes listed in a terminal by the command ps -Al, those whose names begin with k and end with d, such as kthreadd and kswapd, are usually kernel threads. Like user threads, they are created by do_fork(), each has its own task_struct and kernel stack, and they take part in scheduling: kernel threads have priorities and are swapped in and out by the scheduler on equal terms. The difference between the two is that kernel threads run only in kernel mode, while user threads can run in either kernel mode (when executing system calls) or user mode. A kernel thread has no user address space, so its current->mm is NULL and it uses the same page table as the kernel; a user thread, by contrast, can see the full 0-4G memory space (on 32-bit systems).
In the final phase of Linux kernel startup, two kernel threads are created: init and kthreadd. The init thread runs a series of "init" scripts on the file system and starts the shell process, so the init thread is the ancestor of all user processes in the system; its pid is 1. The kthreadd thread is the kernel's daemon thread; as long as the kernel is working properly it never exits, running an endless loop, and its pid is 2.
Coroutines
A coroutine is a lightweight thread in user mode; the most accurate name is user-space thread, and it goes by different names in different settings, such as Fiber or Green Thread. Coroutine scheduling is controlled entirely by the application; the operating system takes no part in it. A thread can contain one or more coroutines, each with its own register context and stack. When a coroutine is switched out, its register context and stack are saved; when it is switched back in, the previously saved register context and stack are restored.
The advantages of coroutines are as follows:
Memory savings: every thread needs its own stack allocation plus some resources in the kernel.
No thread creation and destruction overhead: creating and destroying a thread each require a syscall.
Much cheaper switching than thread switching.
Combined with NIO, they enable non-blocking programming that improves system throughput.
For example, the go keyword in Golang is responsible for opening a fiber and running the func's logic on it. All of this happens in user mode rather than kernel mode, so there is no context-switch overhead. Commonly used coroutine implementations include Go's goroutines, node-fibers, and Java's Quasar.
The Coroutine Model of Go
Go's thread model is a many-to-many model. Go builds a distinctive two-level thread model on top of the kernel threads provided by the operating system. A goroutine created with the go statement can be thought of as a lightweight user thread, and the Go thread model involves three concepts:
G: stands for goroutine; each goroutine corresponds to one G structure. A G stores the goroutine's running stack, state, and task function, and can be reused. A G is not an executor: each G must be bound to a P before it can be scheduled to run.
P: Processor, a logical processor. To a G, a P is the equivalent of a CPU core: a G can be scheduled only after being bound to a P (placed in P's local runq). To an M, a P supplies the execution context (Context) it needs, such as memory-allocation state (mcache) and a task queue of G's. The number of P's determines the maximum number of G's that can run in parallel in the system (at most the number of physical CPU cores), and it is set by the user through GOMAXPROCS; however large GOMAXPROCS is set, though, the number of P's is capped at 256.
M: Machine, the abstraction of an OS thread; it represents the resource that actually performs computation. After binding a valid P, an M enters the scheduling loop. The number of M's is variable and is adjusted by the Go runtime; to prevent the creation of so many OS threads that the system can no longer schedule, the default upper limit is 10000.
In Go, each logical processor (P) is bound to a kernel thread, and each logical processor has a local queue holding the goroutines assigned by the Go runtime. In the many-to-many threading model, the operating system schedules threads to run on physical CPUs; in Go, the runtime schedules goroutines to run on logical processors (P).
Go allocates stack space dynamically, growing and shrinking it with the amount of data stored. Each newly created goroutine starts with a stack of only about 4 KB, so 1 GB of RAM can hold roughly 256 thousand goroutines, a huge improvement over the roughly 1 MB per thread typical in Java. Golang implements its own scheduler, allowing many goroutines to run on the same OS thread. Even though Go performs the same kind of context switch as the kernel, it avoids switching into ring 0 and back, which saves a great deal of time.
There are two levels of scheduling in Go:
The first level is the operating system's scheduling system, which schedules the logical processors to run on CPU time slices; the second level is the Go runtime's scheduling system, which schedules goroutines to run on the logical processors.
After a goroutine is created with the go statement, it is placed in the global run queue of the Go runtime scheduler. The scheduler then assigns goroutines from the global queue to the different logical processors (P), placing each in that processor's local queue; when a goroutine in a local queue is allocated a time slice, it runs on that logical processor.
A Discussion of Coroutines in Java
At present the JVM itself does not provide a coroutine implementation, and coroutine frameworks such as Quasar still look like a non-mainstream answer to concurrency problems. In this part we discuss whether it is necessary to introduce coroutines in Java. Take an ordinary web-server scenario: the default worker thread pool in Spring Boot holds about 200 threads. In terms of memory, if each thread context takes about 128 KB, then even 500 threads consume only about 60 MB themselves, a small fraction of the whole heap. The thread pools Java provides support thread creation and destruction very well; even the coroutines offered in Vert.x or Kotlin are often built on native thread pools.
From the perspective of switching overhead, what we usually discuss is the cost of switching active threads, yet an ordinary web server naturally has large numbers of threads suspended on operations such as request reads and writes or DB reads and writes; in practice only a few dozen concurrently active threads take part in the OS's thread-switch scheduling. For scenarios that really do have many active threads, the Java ecosystem also offers Actor-model frameworks such as Akka, which can sense when a thread is able to perform work and build a runtime scheduler in user space, supporting millions of concurrent Actors.
In fact, the scenario in which we introduce coroutines is more often the so-called million-connection problem, typified by an IM server that may need to hold a large number of idle links at the same time. In the Java ecosystem we can handle this with Netty, whose scheduling mechanism based on NIO and worker threads is very similar to coroutines and solves most of the resource waste caused by waiting on IO. From the standpoint of concurrency-model comparison, if we want to follow Go's idea of sharing memory by communicating, we can also adopt a model like Disruptor.
Java thread and operating system thread
Before JDK 1.2, Java threads were implemented with user threads known as "green threads" (Green Threads). From JDK 1.2 onward, the JVM switched to the operating system's more robust and easier-to-use native thread model, handing the program's threads to the operating-system kernel for scheduling via system calls. Therefore, in current JDK versions, the thread model the operating system supports largely determines how the threads of the Java virtual machine are mapped; this cannot be made uniform across platforms, and the virtual machine specification does not define which thread model Java threads must use. The thread model only affects the concurrency scale and the cost of thread operations, and these differences are transparent to the coding and running of Java programs.
For the Sun JDK, both the Windows and Linux versions are implemented with the one-to-one thread model, where one Java thread maps to one lightweight process, because the thread model provided by Windows and Linux is itself one-to-one. In other words, today's Java threads are essentially operating-system threads: on Linux, lightweight processes based on the pthread library; on Windows, native threads created through the system calls of the Win32 API.
In current operating systems, because threads are still regarded as lightweight processes, the state of a thread in the operating system is in fact the same as the process state. Practically speaking, setting aside a thread's new and terminated states, its only real states are:
Ready: the thread has been created and is waiting for the system scheduler to grant it use of the CPU.
Running: the thread has acquired the CPU and is executing.
Waiting: the thread is waiting (or suspended), yielding the CPU for other threads to use.
As for the thread states in Java: Timed Waiting, Waiting, and Blocked all correspond to the operating-system thread's waiting state, while the Runnable state corresponds to both the ready and running states in the operating system. Java threads and operating-system threads thus share a common origin, yet they differ considerably.
Extended reading
You can read the author's series of articles in Gitbook through the following navigation, covering technical data induction, programming language and theory, Web and large front-end, server development and infrastructure, cloud computing and big data, data science and artificial intelligence, product design and other areas:
Knowledge system: "Awesome Lists | CS Collection", "Awesome CheatSheets | Quick Learning Quick check Manual", "Awesome Interviews | essential for Job interview", "Awesome RoadMaps | programmer's Advanced Guide", "Awesome MindMaps | brain Map of knowledge context", "Awesome-CS-Books | Collection of Open Source Books (.pdf)"
Programming languages: "programming language theory", "Java practice", "JavaScript practice", "Go practice", "Python practice", "Rust practice"
Software engineering, patterns, and architecture: "Programming Paradigms and Design Patterns", "Data Structures and Algorithms", "Software Architecture Design", "Clean Code and Refactoring", "R&D Methods and Tools"
Web and large Front end: "Modern Web Development Foundation and Engineering practice", "data Visualization", "iOS", "Android", "mixed Development and Cross-end applications"
Server-side development practice and engineering architecture: "Server-side Foundations", "Microservices and Cloud Native", "Testing and High-Availability Assurance", "DevOps", "Node", "Spring", "Information Security and Penetration Testing"
Distributed Infrastructure: distributed system, distributed Computing, Database, Network, Virtualization and orchestration, Cloud Computing and big data, Linux and operating system
Data science, artificial intelligence and deep learning: "mathematical statistics", "data analysis", "machine learning", "deep learning", "natural language processing", "tools and engineering", "industry applications"
Product design and user experience: "product design", "interactive experience", "project management"
Industry applications: "industry myth", "functional domain", "e-commerce", "intelligent manufacturing"
In addition, you can go to xCompass to interactively retrieve and find the articles / links / books / courses you need, or view more detailed directory navigation information such as articles and project source code in the MATRIX article and code index matrix. Finally, you can also follow the official Wechat account: "the Technology Road of a Bear" for the latest information.