How to use Python to implement processes, threads and coroutines


This article mainly explains how to use Python to implement processes, threads and coroutines. The explanation is simple and clear, and easy to learn and understand; follow along to study how processes, threads and coroutines are used in Python.

What is a Process?

A process is an abstract concept provided by the operating system. It is the basic unit of resource allocation and scheduling in the system and the foundation of the operating system's structure. A program is a description of instructions, data, and their organization; a process is the running entity of that program. A program by itself has no life cycle: it is just instructions on disk, and only once it runs does it become a process.

When a program needs to run, the operating system loads the code and all static data into memory, into the process's address space (each process has a unique address space, as shown in the figure below), creates and initializes the stack (local variables, function parameters, and return addresses), allocates heap memory, and performs I/O-related setup. Once this preparation is complete, the program is started and the OS transfers control of the CPU to the newly created process.

The operating system controls and manages each process through its PCB (Process Control Block), usually a contiguous region of system memory that stores all the information the operating system needs to describe and control the process (including the process identifier, process state, process priority, file system pointers, the contents of the registers, and so on). The PCB is the only entity through which the system perceives the process.

A process has at least five basic states: initial state, ready state, waiting (blocking) state, execution state, and termination state.

Initial state: the process has just been created; because other processes are occupying CPU resources, it cannot execute yet and remains in the initial state.
Ready state: only a process in the ready state can be scheduled for execution.
Waiting (blocked) state: the process is waiting for some event to complete.
Execution state: only one process can be executing at any given time (on a single-core CPU).
Termination state: the process has ended.

Switching between processes

On both single-core and multi-core systems, a single CPU appears to execute multiple processes concurrently by switching between them. The operating system's mechanism for transferring CPU control from one process to another is called a context switch: it saves the context of the current process, restores the context of the new process, and then hands CPU control to the new process, which resumes where it left off. In this way processes take turns using the CPU, which is shared among them, with a scheduling algorithm deciding when to stop one process and switch to another.

Single CPU dual process situation

Processes switch contexts according to specific scheduling mechanisms and I/O interrupts, taking turns using CPU resources

Dual-core CPU dual-process case

Each process monopolizes a CPU core resource, and the CPU is blocked while processing I/O requests.

Interprocess data sharing

Processes in the system share CPU and main memory resources with one another. To manage main memory better, the operating system provides an abstraction of it called virtual memory (VM), which gives each process the illusion that it is using main memory exclusively.

Virtual memory provides three main capabilities:

- It uses main memory more efficiently: by treating main memory as a cache for data stored on disk, only the active regions are kept in main memory, and data is transferred back and forth between disk and main memory as needed.
- It provides a consistent address space for each process, which simplifies memory management.
- It protects each process's address space from corruption by other processes.

The CPU translates virtual addresses into real physical addresses through address translation, and each process can only access its own address space. Therefore, without the aid of other mechanisms (interprocess communication), processes have no way to share data with each other.

For example, multiprocessing in Python:

import multiprocessing
import threading
import time

n = 0

def count(num):
    global n
    for i in range(100000):
        n += i
    print("Process {0}:n={1},id(n)={2}".format(num, n, id(n)))

if __name__ == '__main__':
    start_time = time.time()
    process = list()
    for i in range(5):
        p = multiprocessing.Process(target=count, args=(i,))  # Test multiprocessing usage
        # p = threading.Thread(target=count, args=(i,))  # Test multithreading usage
        process.append(p)
    for p in process:
        p.start()
    for p in process:
        p.join()
    print("Main:n={0},id(n)={1}".format(n, id(n)))
    end_time = time.time()
    print("Total time:{0}".format(end_time - start_time))

results

Process 1:n=4999950000,id(n)=139854202072440
Process 0:n=4999950000,id(n)=139854329146064
Process 2:n=4999950000,id(n)=139854202072400
Process 4:n=4999950000,id(n)=139854201618960
Process 3:n=4999950000,id(n)=139854202069320
Main:n=0,id(n)=9462720
Total time:0.03138256072998047

The variable n has a unique address space in each of the processes p{0,1,2,3,4} and in the main process (Main).
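By contrast, when an explicit IPC mechanism is used, processes can share data. Below is a minimal sketch, assuming the same counting workload as above and using multiprocessing.Value (a ctypes value placed in shared memory); the typecode and loop size are illustrative:

import multiprocessing

def count(num, n):
    # n lives in shared memory rather than in a per-process copy
    for i in range(100000):
        with n.get_lock():       # guard the read-modify-write against races
            n.value += i
    print("Process {0}:n={1}".format(num, n.value))

if __name__ == '__main__':
    n = multiprocessing.Value('q', 0)   # 'q' = signed 64-bit integer
    process = [multiprocessing.Process(target=count, args=(i, n)) for i in range(5)]
    for p in process:
        p.start()
    for p in process:
        p.join()
    print("Main:n={0}".format(n.value))  # all five processes contributed: 5 * 4999950000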

What is a Thread?

A thread is also an abstract concept provided by the operating system. It is a single sequential flow of control in program execution, the smallest unit of the program's execution flow, and the basic unit of processor scheduling and dispatch. A process can have one or more threads, and multiple threads in the same process share all of the process's system resources, such as the virtual address space, file descriptors, and signal handling. However, each thread in a process has its own call stack and thread-local storage (as shown in the figure below).
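To illustrate the thread-local storage point, here is a minimal sketch using Python's threading.local (the attribute name is just an example):

import threading

local_data = threading.local()   # each thread gets its own independent set of attributes

def worker(num):
    local_data.value = num       # stored per-thread; invisible to the other threads
    print("Thread {0} sees value={1}".format(num, local_data.value))

if __name__ == '__main__':
    threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()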

Just as the system uses the PCB to control and manage processes, it allocates a thread control block (TCB) for each thread and records in it all the information used to control and manage the thread. A TCB usually includes:

- thread identifier
- set of registers
- thread running status
- priority
- thread-specific storage area
- signal mask

Like processes, threads have at least five states: initial, ready, waiting (blocking), executing, and terminating

Switching between threads requires context switching just like processes, so I won't go into details here.

There are many similarities between processes and threads, so what is the difference between them?

Process vs Thread

- Processes are independent units of resource allocation and scheduling. A process has a complete virtual address space, and when a process switch occurs, different processes use different virtual address spaces; multiple threads of the same process share the same address space (threads of different processes cannot share one).
- Threads are the basic unit of CPU scheduling; a process contains at least one thread.
- Threads are lighter than processes and own essentially no system resources of their own, so creating and destroying a thread takes much less time than creating and destroying a process.
- Because threads share an address space, synchronization and mutual exclusion must be taken into account (a minimal sketch follows this list).
- The unexpected termination of a thread affects the normal operation of the entire process, whereas the unexpected termination of one process does not affect other processes, so multi-process programs are more robust.

In short, multi-process programs are safer but have higher switching overhead and lower efficiency; multi-threaded programs have higher maintenance costs but lower switching overhead and higher efficiency. (Python's multithreading is pseudo-multithreading, described in more detail below.)
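Because threads share one address space, updates to shared state must be synchronized. A minimal sketch of the mutual-exclusion point above, reusing the counting loop from earlier with a threading.Lock (the counter and loop size are illustrative):

import threading

n = 0
lock = threading.Lock()

def count(num):
    global n
    for i in range(100000):
        with lock:               # only one thread updates n at a time
            n += i

if __name__ == '__main__':
    threads = [threading.Thread(target=count, args=(i,)) for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("Main:n={0}".format(n))   # deterministic: 5 * 4999950000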

What is a Coroutine?

Coroutines (also known as microthreads) are lighter-weight entities than threads; they are not managed by the operating system kernel but are controlled entirely by the program. The relationship between coroutines, threads, and processes is shown in the figure below.

A coroutine can be compared to a subroutine, except that during execution a coroutine can suspend itself internally, switch to another coroutine, and resume execution at an appropriate later time. Switching between coroutines involves no system calls and no blocking calls. Coroutines execute within a single thread, switching between subroutines in user mode, whereas thread blocking is handled by the operating system kernel in kernel mode, so coroutines avoid the thread creation and switching overhead. Because only one coroutine runs at a time within the thread, there are no simultaneous writes to shared variables, so no synchronization primitives such as mutexes or semaphores are needed to guard critical sections, and no operating system support is required.

Coroutines are well suited to I/O-bound scenarios with large amounts of concurrency. When an I/O block occurs, the coroutine scheduler takes over: it yields control, records the state on the current stack, restores the coroutine's stack as soon as the block is over, and resumes running with the blocking result on the same thread.
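A minimal sketch of this scheduling behaviour using Python's asyncio coroutines, where asyncio.sleep stands in for a blocking I/O call (the task names and delays are made up for illustration):

import asyncio
import time

async def fetch(name, delay):
    # await yields control back to the event loop while the "I/O" is in progress
    await asyncio.sleep(delay)
    return "{0} done after {1}s".format(name, delay)

async def main():
    start_time = time.time()
    # three coroutines run concurrently inside a single thread
    results = await asyncio.gather(fetch("task-a", 1), fetch("task-b", 1), fetch("task-c", 1))
    print(results)
    print("Total time:{0:.2f}".format(time.time() - start_time))   # about 1s, not 3s

if __name__ == '__main__':
    asyncio.run(main())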

Below, we will analyze how to choose to use Python processes, threads, and coroutines in different application scenarios.

How to choose?

Before comparing the differences between the three for different scenarios, we first need to introduce Python multithreading (which has been criticized by programmers as "fake" multithreading).

So why is multithreading in Python considered "pseudo" multithreading?

In the multiprocessing example above, replace p = multiprocessing.Process(target=count, args=(i,)) with p = threading.Thread(target=count, args=(i,)) and leave the rest of the code unchanged; the result is as follows:

To reduce code redundancy and article length, ignore naming and printing irregularities

Process 0:n=5756690257,id(n)=140103573185600
Process 2:n=10819616173,id(n)=140103573185600
Process 1:n=11829507727,id(n)=140103573185600
Process 4:n=17812587459,id(n)=140103573072912
Process 3:n=14424763612,id(n)=140103573185600
Main:n=17812587459,id(n)=140103573072912
Total time:0.1056210994720459

n is a global variable, and the value printed by Main matches the value seen in the threads, which shows that data is shared between threads.

But why does multithreading take longer than multiprocessing here? Part of the answer is what we said above (the overhead of thread switching), but the main reason is CPython's Global Interpreter Lock (GIL): it allows only one thread to execute Python bytecode at a time, so for a CPU-bound loop like count the five threads simply take turns on one core and add switching overhead on top. This is why Python's multithreading is called "pseudo" multithreading.
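For completeness, threads still pay off for I/O-bound work, because blocking calls release the GIL. A minimal sketch, with time.sleep standing in for real I/O (the task count and delay are illustrative):

import threading
import time

def io_task(num):
    # time.sleep releases the GIL, like a blocking network or disk call would
    time.sleep(1)
    print("Thread {0} finished".format(num))

if __name__ == '__main__':
    start_time = time.time()
    threads = [threading.Thread(target=io_task, args=(i,)) for i in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("Total time:{0:.2f}".format(time.time() - start_time))   # about 1s, not 5s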
