

How to understand the Python process

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

This article explains how to understand Python processes. The explanation is simple, clear, and easy to follow; read on step by step to study the topic together.

Preface

"Process" may be an unfamiliar word to some. A process is the carrier in which the system runs a program: a program may run as a single process or as several. In general, how many processes can truly run in parallel is bounded by the number of CPU cores in the system. We can look at the processes running on the system:

You can see that the 360 browser is quite cheeky, spawning that many processes. You can also inspect per-process thread usage very clearly like this:

Through the Task Manager's Resource Monitor. Quite powerful, isn't it? With that said, let's move on to usage.

1. Basic Usage

First, let's think about what a process can do. We know that a program creates processes when it runs, and those processes in turn create threads so that the work proceeds in an orderly way.

Multiple threads work together inside a process, but processes should not be created in excess, or they will consume too many resources (unless you are building a large system). Now let's create a process.

Before creating a process, we import the process module. A first attempt might look like this:

import multiprocess as m
m.Process(target, args)

In fact, this writing is wrong, just like BeautifulSoup in bs4: importing bs4 first and then referencing BeautifulSoup through it does not work. You must write it like this:

from multiprocessing import Process
Process(group, target, name, args, kwargs)

group: reserved for extensions; should always be None
target: the callable object to be invoked by run()
args: the argument tuple for the target call
kwargs: the keyword-argument dictionary for the target call
name: the name of the child process

As you can see, process usage is basically the same as thread usage, apart from details such as the name. The module also offers other useful functions:

# Returns a list of live children of the current process. Calling it has the
# side effect of "joining" any process that has already finished.
multiprocessing.active_children()
# Returns the number of CPUs in the system.
multiprocessing.cpu_count()

2. Creating a Single Process

As the parameters above show, the Process API is basically indistinguishable from the Thread API.

start()          # Start the process; arranges for run() to be called in the child.
run()            # The method representing the process's activity.
terminate()      # Force-terminate the process without any cleanup. Children of the
                 # terminated process become zombies; any lock it holds is never
                 # released, which can deadlock other processes.
is_alive()       # True if the process is still running, otherwise False.
join([timeout])  # Block until the child terminates; timeout is an optional limit.
                 # Note that join() only works on processes launched with start(),
                 # not on ones where run() was called directly.
daemon           # Mark the process as a daemon: it terminates when its parent
                 # terminates and cannot create child processes of its own.
                 # Must be set before start().
name             # The process name.
pid              # The process id; available only after start().
exitcode         # The child's exit code: None while it is still running, and a
                 # negative value -N if it was terminated by signal N.
authkey          # The process's authentication key, by default a random string
                 # generated by os.urandom().
sentinel         # A numeric handle of a system object that becomes "ready" when
                 # the process ends.
kill()           # Like terminate(), but uses SIGKILL on Unix.
close()          # Release all resources held by the Process object.

Note: make sure the process-creating code is guarded by the following statement:

if __name__ == '__main__':

That is how we implement a basic process. We can also do it by inheriting from the Process class:

Every process we create has an id that identifies it, as follows:

3. Creating Multiple Processes

A single process is often not enough, so we need to create several. Doing so is very simple: just add a layer of loop:

This makes it easy to create multiprocess tasks and get work done faster than before.

4. Process Pool

The process pool exists to help us use resources effectively and avoid waste: when the workload is heavy, several cores pitch in; when it is light, only one or two are used. Let's walk through how to use it:

First import package:

from multiprocessing import Pool
import multiprocessing as m

The Pool class implements the process pool. First, let's check the number of CPU cores:

num = m.cpu_count()  # number of CPU cores

Then we create a process pool:

pool = Pool(num)

There are also many methods in the process pool that we can use:

apply(func, args, kwargs)        # Synchronous (serial) execution; blocking.
apply_async(func, args, kwargs, callback, error_callback)
                                 # Asynchronous (parallel) execution; non-blocking.
                                 # Returns an object from which the result can be
                                 # fetched. The callback should return quickly,
                                 # or it blocks the thread handling results.
terminate()                      # Force-stop the pool; unfinished tasks are dropped.
close()                          # Stop accepting new tasks into the pool.
join()                           # Block the main process until the workers exit;
                                 # must be called after close() or terminate().
map(func, iterable, chunksize)   # Parallel version of the built-in map(); blocks
                                 # until all results are in.
map_async(func, iterable, chunksize, callback, error_callback)
                                 # Non-blocking version of map().
imap(func, iterable, chunksize)  # Lazy version of map(); returns an iterator.
imap_unordered(func, iterable, chunksize)
                                 # Like imap(), but results come back in
                                 # arbitrary order.
starmap(func, iterable, chunksize)
                                 # Like map(), except each item of iterable is
                                 # unpacked and used as the function's arguments.

We can build both synchronous and asynchronous programs with this. If you are writing crawlers, synchronous is fine for small ones, while asynchronous suits large ones better. Many people confuse the two; here, synchronous and asynchronous simply mean serial and parallel. Let's look at both through examples:

I. Serial

II. Parallel

As you can see, only one call changes; everything else is the same. We get the pid of the current process and print it out.

5. Lock

Asynchronous multiprocessing brings us convenience, but once processes start they run uncontrolled. To make them do what we consider meaningful work, we need to lock them, and the lock works the same way as with threads:

First import the process lock module:

from multiprocessing import Lock

Then let's create a program about locks:

As you can see, the locked run is still fairly smooth, as simple as multithreading, though somewhat slower. Since there is a Lock, there is also an RLock; in Python, many process APIs mirror their thread counterparts, and the lock is one of them. We can switch to RLock, a reentrant lock that the same process can acquire recursively:

import time
import multiprocessing as m
from multiprocessing import Process, RLock

lock1 = RLock()
lock2 = RLock()
s = time.time()

def jc(num):
    lock1.acquire()
    lock2.acquire()
    print('start')
    print(m.current_process().pid, 'run----', str(num))
    lock1.release()
    lock2.release()
    print('end')

if __name__ == '__main__':
    aa = []
    for y in range(12):
        pp = Process(target=jc, args=(y,))
        pp.start()
        aa.append(pp)
    for x in aa:
        x.join()
    e = time.time()
    print(e - s)

6. Interprocess Communication

1. Event

Interprocess communication works much the same way as it does with threads, so here is just a small example without a detailed description (if anything is unclear, see my previous article on threads). What we will cover today are the other interprocess communication methods, shown below:

import time
from multiprocessing import Process, Event

e = Event()

def main(num):
    while True:
        if num == 5:
            e.wait(timeout=1)   # wait for the flag to become True
            e.set()
            print('Start')
        if num == 10:
            e.wait(timeout=3)
            e.clear()
            print('exit')
            break
        num += 1
        time.sleep(2)

if __name__ == '__main__':
    for y in range(10):
        pp = Process(target=main, args=(y,))
        pp.start()
        pp.join()

2. Pipelining Messages

When initialized, the pipe returns two connection objects, one for each end. A parameter selects full duplex or half duplex: in full duplex each end can both send and receive, while in half duplex one end can only send and the other can only receive. First, its methods:

p1, p2 = m.Pipe(duplex=bool)  # whether full duplex; returns two connection objects

p1.send(obj)                    # send an object
p2.recv()                       # receive an object
p1.close()                      # close the connection
p1.fileno()                     # integer file descriptor used by the connection
p1.poll([timeout])              # True if data is available; timeout is the
                                # maximum time to wait
p2.recv_bytes([maxlength])      # receive a byte message of at most maxlength bytes
p1.send_bytes(buffer)           # send byte data from a bytes-like object
p2.recv_bytes_into(buffer[, offset])
                                # receive a complete byte message into a buffer;
                                # offset is where in the buffer to place it

In practice, we can use a lock to control which end transmits first, so that one side can receive while the other sends.

3. Queue

A queue differs from the others in that it works by inserting and removing items. Let's see:

from multiprocessing import Process, Queue

def fd(a):
    for y in range(10):
        a.put(y)                 # insert data
        print('insert:', str(y))

def df(b):
    while True:
        aa = b.get(True)         # remove data, blocking until an item arrives
        print('release:', str(aa))

if __name__ == '__main__':
    q = Queue()
    ff = Process(target=fd, args=(q,))
    dd = Process(target=df, args=(q,))
    ff.start()                   # start running
    dd.start()
    ff.join()
    dd.terminate()               # stop the endless consumer

The queue above is meant for plain multi-process use; for process pools there is a separate queue, found in the Manager module.

7. Semaphore

It works just as it does with threads, so I will not repeat the details here. See the following example:

from multiprocessing import Semaphore

s = Semaphore(3)
s.acquire()
print(s.get_value())   # note: get_value() is not available on every platform
s.release()
print(s.get_value())
s.release()
print(s.get_value())

Output:
2
3
4

8. Data Sharing

Some shared data types can be created directly through the multiprocessing module:

Value: m.Value()
Array: m.Array()

Dictionaries and lists are shared through the Manager module instead:

Manager().dict()
Manager().list()

Here are some examples:

You can see that we successfully added data to the shared objects, achieving data sharing between processes.

Thank you for reading. That is the content of "how to understand Python process". After studying this article, I believe everyone has a deeper understanding of the topic; the specifics still need to be verified in practice.



