This article explains Python multiprocessing in detail. The editor shares it here for your reference; I hope you will have a solid understanding of the relevant knowledge after reading it.
1. Python multiprocess module
Multiprocessing in Python is implemented through the multiprocessing package, which works much like the multithreading module threading: a multiprocessing.Process object is used to create a process, and the process object has almost the same methods as the thread object, such as start(), run() and join(). One difference concerns daemons: a Thread object is made a daemon through the setDaemon() method, while a Process object is made a daemon by setting its daemon attribute.
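A minimal sketch of this difference (the work() function and its names are illustrative, not from the original article):
import threading
from multiprocessing import Process

def work():
    print('working')

if __name__ == '__main__':
    t = threading.Thread(target=work)
    t.setDaemon(True)   # threads: set via a method (newer code can use t.daemon = True)
    p = Process(target=work)
    p.daemon = True     # processes: a plain attribute, set before start()
    t.start()
    p.start()
    t.join()
    p.join()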
Let's look at how Python multiprocessing is implemented; it is similar to multithreading.
2. Python multiprocess implementation method 1
from multiprocessing import Process

def fun1(name):
    print('test %s multiprocess' % name)

if __name__ == '__main__':
    process_list = []
    for i in range(5):  # start five child processes to execute the fun1 function
        p = Process(target=fun1, args=('Python',))  # instantiate the process object
        p.start()
        process_list.append(p)

    for p in process_list:
        p.join()

    print('end the test')
Result
test Python multiprocess
test Python multiprocess
test Python multiprocess
test Python multiprocess
test Python multiprocess
end the test
Process finished with exit code 0
The above code starts five child processes to execute the function, and the results are printed almost simultaneously. This is real parallelism: multiple CPU cores executing tasks at the same time. The process is the smallest unit of resource allocation in the operating system, which means that processes do not share data or memory with each other; every time a process is started, resources must be allocated and the data it accesses must be copied independently, so the cost of creating and destroying processes is relatively high. In practice, the number of processes should be chosen according to the server's configuration.
3. Python multiprocess implementation method 2
Remember the second way of implementing Python multithreading? It is done through class inheritance, and the second way of implementing Python multiprocessing is the same.
from multiprocessing import Process

class MyProcess(Process):  # inherit the Process class
    def __init__(self, name):
        super(MyProcess, self).__init__()
        self.name = name

    def run(self):
        print('test %s multiprocess' % self.name)

if __name__ == '__main__':
    process_list = []
    for i in range(5):  # start five child processes to execute the run method
        p = MyProcess('Python')  # instantiate the process object
        p.start()
        process_list.append(p)

    for p in process_list:
        p.join()

    print('end the test')
Result
test Python multiprocess
test Python multiprocess
test Python multiprocess
test Python multiprocess
test Python multiprocess
end the test
Process finished with exit code 0
The effect is the same as with the first method.
As you can see, Python multiprocessing is implemented in almost the same way as multithreading.
Other methods of the Process class
Constructor:
Process([group [, target [, name [, args [, kwargs]]]]])
group: thread group; in practice always None (the parameter exists for compatibility with threading.Thread)
target: the callable to be executed by the process
name: the process name
args/kwargs: positional and keyword arguments to pass to target
Instance methods:
is_alive(): returns whether the process is running (bool).
join([timeout]): blocks the calling process until the process whose join() is called terminates or the optional timeout is reached.
start(): starts the process; it becomes ready and waits to be scheduled by the CPU.
run(): invoked by start(); if no target was passed when the process was instantiated, the default run() method is executed.
terminate(): stops the process immediately, whether or not its task is complete.
Attributes:
daemon: plays the same role as the thread's setDaemon(); set it before start()
name: the process name
pid: the process ID
The usage of join and daemon is the same as in Python multithreading, so it is not repeated here.
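As a minimal sketch of these methods in action (the 5-second sleep is an arbitrary stand-in for real work):
from multiprocessing import Process
import time

def work():
    time.sleep(5)  # simulate a long-running task

if __name__ == '__main__':
    p = Process(target=work)
    p.start()
    print(p.is_alive())  # True: the child is still running
    p.join(timeout=1)    # wait at most 1 second
    p.terminate()        # stop it whether or not the task finished
    p.join()             # reap the process so its status is updated
    print(p.is_alive())  # False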
4. Communication between Python processes
The process is the basic unit by which the operating system schedules and allocates resources (CPU, memory). Processes are independent of each other: starting a new process is equivalent to cloning the parent's data, so modifying data in a child process cannot affect the main process, and different child processes cannot share data with each other. This is the most obvious difference between multiprocessing and multithreading. But are Python processes completely isolated from each other? Of course not. Python provides several ways for processes to communicate and to share data (that is, to let several processes modify the same piece of data).
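A minimal sketch of this isolation (the variable and function names are illustrative): a child process appends to a module-level list, but the change is invisible to the main process because the child works on its own copy:
from multiprocessing import Process

data = []  # module-level variable, copied into the child process

def modify():
    data.append(1)
    print('in child:', data)   # [1] - the child's own copy changed

if __name__ == '__main__':
    p = Process(target=modify)
    p.start()
    p.join()
    print('in parent:', data)  # [] - the parent's copy is untouched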
Process queue: Queue
Queue was also mentioned in the multithreading article: used in the producer-consumer pattern it is thread-safe, acting as the data pipeline between producer and consumer. In Python multiprocessing, Queue is likewise the data pipeline between processes and is used for inter-process communication.
from multiprocessing import Process, Queue

def fun1(q, i):
    print('child process %s starts put data' % i)
    q.put('I am %s communicating through Queue' % i)

if __name__ == '__main__':
    q = Queue()
    process_list = []
    for i in range(3):
        # note: the Queue object q is passed via args; only then can the
        # child processes communicate with the main process through it
        p = Process(target=fun1, args=(q, i))
        p.start()
        process_list.append(p)

    for p in process_list:
        p.join()

    print('the main process obtains Queue data')
    print(q.get())
    print(q.get())
    print(q.get())
    print('end the test')
Result
child process 0 starts put data
child process 1 starts put data
child process 2 starts put data
the main process obtains Queue data
I am 0 communicating through Queue
I am 1 communicating through Queue
I am 2 communicating through Queue
end the test
Process finished with exit code 0
From the result above we can see that the main process can obtain the data put by the child processes through the Queue, achieving inter-process communication.
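Since Queue is typically used in the producer-consumer pattern mentioned above, here is a minimal sketch of that pattern between processes (the None sentinel convention is an illustrative choice, not part of the original article):
from multiprocessing import Process, Queue

def producer(q):
    for item in range(3):
        q.put(item)       # produce work items
    q.put(None)           # sentinel: tells the consumer to stop

def consumer(q):
    while True:
        item = q.get()
        if item is None:  # sentinel received, no more work
            break
        print('consumed', item)

if __name__ == '__main__':
    q = Queue()
    c = Process(target=consumer, args=(q,))
    c.start()
    producer(q)           # produce from the main process
    c.join()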
Pipe
Pipe serves roughly the same purpose as Queue: it also implements communication between processes. Let's see how it is used.
from multiprocessing import Process, Pipe

def fun1(conn):
    print('child process sends message:')
    conn.send('Hello main process')
    print('child process receives message:')
    print(conn.recv())
    conn.close()

if __name__ == '__main__':
    conn1, conn2 = Pipe()  # key point: Pipe() returns the two ends of a duplex (two-way) pipe
    p = Process(target=fun1, args=(conn2,))  # pass conn2 to the child process
    p.start()
    print('main process receives message:')
    print(conn1.recv())
    print('main process sends message:')
    conn1.send('Hello child process')
    p.join()
    print('end the test')
Result
main process receives message:
child process sends message:
child process receives message:
Hello main process
main process sends message:
Hello child process
end the test
Process finished with exit code 0
You can see above that the main process and child processes can send messages to each other.
Managers
Queue and Pipe only implement data exchange, not data sharing, that is, having one process modify another process's data. For that we need Managers.
from multiprocessing import Process, Manager

def fun1(dic, lis, index):
    dic[index] = 'a'
    dic['2'] = 'b'
    lis.append(index)  # ends up roughly as [0, 1, 2, 3, 4, 0, 3, 1, ...]
    # print(lis)

if __name__ == '__main__':
    with Manager() as manager:
        dic = manager.dict()  # note how the dictionary is declared: via the manager, not directly with {}
        l = manager.list(range(5))  # [0, 1, 2, 3, 4]
        process_list = []
        for i in range(10):
            p = Process(target=fun1, args=(dic, l, i))
            p.start()
            process_list.append(p)

        for res in process_list:
            res.join()

        print(dic)
        print(l)
Results:
{0: 'a', '2': 'b', 3: 'a', 1: 'a', 2: 'a', 4: 'a', 5: 'a', 7: 'a', 6: 'a', 8: 'a', 9: 'a'}
[0, 1, 2, 3, 4, 0, 3, 1, 2, 4, 5, 7, 6, 8, 9]
As you can see, the main process defined a dictionary and a list; in the child processes we can add to and modify the dictionary's contents and insert new data into the list, achieving data sharing between processes, that is, several processes can modify the same data together.
5. Process pool
The process pool maintains a fixed set of worker processes internally. When it is used, a process is taken from the pool; if no process in the pool is currently available, the program waits until one becomes free. In other words, only a fixed number of processes can ever run at the same time.
There are two task-submission methods on the process pool:
apply: synchronous; generally not used
apply_async: asynchronous
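For contrast with the asynchronous example that follows, here is a minimal sketch of the synchronous apply call, which blocks until each task finishes, so the tasks run one after another:
from multiprocessing import Pool
import os

def fun1(name):
    print('Run task %s (%s)...' % (name, os.getpid()))

if __name__ == '__main__':
    with Pool(5) as pool:
        for i in range(3):
            pool.apply(fun1, args=(i,))  # blocks until the task completes
    print('end the test')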
from multiprocessing import Pool
import os, time, random

def fun1(name):
    print('Run task %s (%s)...' % (name, os.getpid()))
    start = time.time()
    time.sleep(random.random() * 3)
    end = time.time()
    print('Task %s runs %0.2f seconds.' % (name, (end - start)))

if __name__ == '__main__':
    pool = Pool(5)  # create a process pool of 5 processes
    for i in range(10):
        pool.apply_async(func=fun1, args=(i,))

    pool.close()
    pool.join()
    print('end the test')
Result
Run task 0 (37476)...
Run task 1 (4044)...
Task 0 runs 0.03 seconds.
Run task 2 (37476)...
Run task 3 (17252)...
Run task 4 (16448)...
Run task 5 (24804)...
Task 2 runs 0.27 seconds.
Run task 6 (37476)...
Task 1 runs 0.58 seconds.
Run task 7 (4044)...
Task 3 runs 0.98 seconds.
Run task 8 (17252)...
Task 5 runs 1.13 seconds.
Run task 9 (24804)...
Task 6 runs 1.46 seconds.
Task 4 runs 2.73 seconds.
Task 8 runs 2.18 seconds.
Task 7 runs 2.93 seconds.
Task 9 runs 2.93 seconds.
end the test
Calling join() on the Pool object waits for all child processes to finish executing. close() must be called before join(), and after close() no new tasks can be submitted to the pool.
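As a small addition not in the original article: since Python 3.3 a Pool can also be used as a context manager. The pool is terminated when the block exits, so pair it with a blocking call such as map(), which also previews the next section:
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    with Pool(4) as pool:                      # the pool is terminated when the block exits
        results = pool.map(square, range(10))  # map blocks until every result is back
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]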
Process pool map method
The following example comes from the Internet; if it infringes on anyone's rights, please let me know and I will remove it, thank you.
I came across this example online and think it makes the point well, so rather than writing my own case I use it here; it is more convincing.
import os
from multiprocessing import Pool
from PIL import Image

SIZE = (75, 75)
SAVE_DIRECTORY = 'thumbs'

def get_image_paths(folder):
    return (os.path.join(folder, f)
            for f in os.listdir(folder)
            if 'jpeg' in f)

def create_thumbnail(filename):
    im = Image.open(filename)
    im.thumbnail(SIZE, Image.ANTIALIAS)  # ANTIALIAS was renamed LANCZOS in newer Pillow versions
    base, fname = os.path.split(filename)
    save_path = os.path.join(base, SAVE_DIRECTORY, fname)
    im.save(save_path)

if __name__ == '__main__':
    folder = os.path.abspath('11_18_2013_R000_IQM_Big_Sur_Mon__e10d1958e7b766c3e840')
    os.mkdir(os.path.join(folder, SAVE_DIRECTORY))
    images = get_image_paths(folder)
    pool = Pool()
    pool.map(create_thumbnail, images)  # key point: images is an iterable object
    pool.close()
    pool.join()
The main job of the above code is to traverse the image files in the given folder and generate a thumbnail of each one, saving the thumbnails to a specific folder. On my machine, processing 6,000 images with this program took 27.9 seconds. The map function involves no manual process management, which makes the related debugging work extremely simple.
map can also be used in the crawler field: for example, to crawl the content of multiple URLs, put the URLs into a tuple and pass it to the worker function.
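A minimal sketch of that idea (the URLs are placeholders and fetch() is an illustrative helper, not from the original article):
from multiprocessing import Pool
from urllib.request import urlopen

def fetch(url):
    with urlopen(url) as resp:   # download one page
        return len(resp.read())  # use the page size as a simple result

if __name__ == '__main__':
    urls = ['https://example.com', 'https://example.org']  # placeholder URLs
    with Pool(2) as pool:
        sizes = pool.map(fetch, urls)  # one URL per worker task
    print(sizes)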
That is all I have to share on how to analyze Python multiprocessing. I hope the content above helps you learn more. If you found the article useful, feel free to share it so more people can see it.