Many beginners are not clear about how to use multiprocessing and multithreading in Python, or about the scenarios each one suits. To help with that, this article explains the topic in detail; anyone who needs it is welcome to follow along, and I hope you get something out of it.
[Preface]
When I first started learning about concurrent reads and writes in Python, I kept seeing experienced developers say that multithreading is Python's weak point and not worth much, and that if you want to learn concurrency you should learn multiprocessing instead. At that point I did not even know how to write multithreaded code.
So I wrote the sample code below. It writes the contents of the test.txt file into a new file concurrently, once with multiple threads and once with multiple processes, in order to compare the efficiency of the two approaches.
[sample code]
# coding=utf-8
# @Author: Pengge
# @Date: 2019

from multiprocessing import Pool
import time
import threading

# shared lock: all threads must acquire the same Lock object for it to be effective
write_lock = threading.Lock()


# write data (used by the process pool)
def writedata(content):
    with open("new1.txt", "a") as f:
        f.writelines(content)


# custom thread class that inherits from threading.Thread
class myThread(threading.Thread):
    def __init__(self, content):
        threading.Thread.__init__(self)
        self.content = content

    # the work each thread performs
    def run(self):
        write_lock.acquire()
        self.my_writedata(self.content)
        write_lock.release()

    # write data from a thread
    def my_writedata(self, content):
        with open("new2.txt", "a") as f:
            f.writelines(content)


if __name__ == "__main__":
    # create test.txt so there is something to read
    with open("test.txt", "w") as f_w:
        for i in range(1000):
            f_w.write(str(i) + "\n")

    # multi-process read and write
    print("start timing (multi-process write)")
    t0 = time.time()
    with open("test.txt", "r", encoding="utf-8") as f:
        content = f.readlines()
    pool = Pool(processes=4)
    pool.map_async(writedata, content)  # writedata is called once per line
    pool.close()
    pool.join()
    t1 = time.time()
    print("completion time is: {0}".format(t1 - t0))

    # multi-threaded read and write
    print("start timing (multi-threaded write)")
    t2 = time.time()
    with open("test.txt", "r", encoding="utf-8") as f:
        content = f.readlines()
    threads = []
    threadnum = 4
    eline = len(content) // threadnum  # lines per thread
    for i in range(threadnum):
        threadtemp = myThread(content[i * eline:(i + 1) * eline])
        threadtemp.start()
        threads.append(threadtemp)
    for t in threads:
        t.join()
    t3 = time.time()
    print("completion time is: {0}".format(t3 - t2))
[Results]
(The first timing shown is the multi-process run; I mistyped a word in the printed label at the beginning.)
[Key points]
1. Multi-process code flow:
(1) Create a process pool and specify the number of processes. I use 4 processes here because my local PC has 4 CPU cores; you can check the core count with multiprocessing.cpu_count() (see the sketch after point (3)).
(2) Hand the function to be executed to the pool's map_async method (the plain map method works as well). map_async takes two arguments: the function to run and an iterable of inputs, with the function called once per element. In the sample code each call to writedata writes a single line, so it is called len(content) times in total.
(3) Finally, shut the pool down: close() stops new tasks from being submitted, and join() waits for the child processes to finish before the main program continues. join() serves the same purpose here as it does with threads.
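As a recap of steps (1) to (3), here is a minimal sketch of the same pool workflow with a made-up worker function; square and the numbers list are placeholders for illustration and are not part of the original example.

from multiprocessing import Pool, cpu_count

# hypothetical worker, used only to show the pool workflow
def square(x):
    return x * x

if __name__ == "__main__":
    numbers = list(range(10))
    # size the pool from the local core count, as in step (1)
    pool = Pool(processes=cpu_count())
    # map_async takes the function and an iterable; the worker runs once per element
    result = pool.map_async(square, numbers)
    # step (3): stop accepting tasks, then wait for the children to finish
    pool.close()
    pool.join()
    print(result.get())  # [0, 1, 4, 9, ...]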
In terms of implementation, I find the multi-process version simple and clear, and the data the processes write comes out in order, whereas the multithreaded output arrives in whatever order the threads happen to run, which I put down to the GIL and thread scheduling. The one drawback is that the multi-process run takes noticeably longer to finish writing.
2. Multi-thread code flow:
This article takes the approach of subclassing the threading.Thread class, which means overriding its run method. Personally I find this keeps the structure clearer, and run can be reworked flexibly for whatever purpose you have. The steps of this approach are described below.
(1) Define a class that inherits from threading.Thread and override its run method. The run method has to deal with locking. Why lock at all? Because several threads write to the same file at the same time; without a lock their writes can interleave and produce corrupted data. Note that the lock only works if every thread acquires the same Lock object, which is why the sample code creates it once at module level.
Refactoring run is then very simple: acquire the lock, do the work you want, and release the lock at the end.
(2) Writing the myThread class was quick to pick up; the real question was how to hand the contents of test.txt out to the threads.
At first, I wrote:
for i in range(threadnum):
    threadtemp = myThread(content)
    threadtemp.start()
    threads.append(threadtemp)
and found that every thread wrote the full contents of test.txt, so the data ended up repeated 4 times. This is where the usage differs from the multi-process case.
[description]:
With multiple processes, nothing is shared, so each process can simply be handed its own piece of content. With multiple threads, memory is shared, so to get genuine concurrency you have to tell each thread explicitly which part to write. That is why the final sample code gives each thread a different slice via content[i * eline:(i + 1) * eline] (a small reusable version of this chunking is sketched after point (3) below).
(3) In the main function, every thread has to be started with start(), and join() ensures the main thread only continues once all the child threads have finished.
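As a side note, here is a small helper of my own (not part of the original code) that does the same chunking but lets the last chunk absorb the remainder, so no lines are dropped when len(content) is not evenly divisible by the thread count.

def split_chunks(items, n):
    # split items into n chunks; the last chunk takes any leftover elements
    size = len(items) // n
    chunks = [items[i * size:(i + 1) * size] for i in range(n - 1)]
    chunks.append(items[(n - 1) * size:])
    return chunks

# usage with the myThread class from the sample code:
# for chunk in split_chunks(content, threadnum):
#     t = myThread(chunk)
#     t.start()
#     threads.append(t)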
Looking at those last lines of code, multithreading does feel like a bit more hassle. There is another inconvenience as well: the threads are scheduled in no particular order. What does that mean? After thread1 finishes, the next one to complete might be thread9 rather than thread2. So the question becomes: since the experts recommend multiprocessing, and multiprocessing is easier to write, should we just use multiple processes in every scenario?
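Here is a tiny demonstration of that scheduling behaviour, using threads that sleep for a random amount of time (the worker function is made up purely for illustration): the order in which they finish usually differs from the order in which they were started.

import random
import threading
import time

def worker(name):
    time.sleep(random.random())           # simulate work of unpredictable length
    print("{0} finished".format(name))    # completion order varies from run to run

threads = [threading.Thread(target=worker, args=("thread{0}".format(i),)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()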
3. Usage scenarios for multithreading and multiprocessing:
Multithreading suits I/O-intensive scenarios, such as crawlers and web access, where data is frequently read from and written to the network, disk, or memory. A single thread would waste time blocked waiting on I/O, whereas with multiple threads, thread B can run while thread A is still waiting (a short sketch comparing both cases follows this list).
Multiprocessing suits CPU-intensive scenarios, such as scientific computing or heavy loop processing. With that much computation, a thread holds the GIL until it is forced to give it up, and multiple threads just end up fighting over it without any real gain, so separate processes work better.
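To make the two scenarios concrete, here is a hedged sketch using concurrent.futures, which the original article does not use; the URLs and the cpu_heavy function are placeholders chosen for illustration, and the fetch example needs network access to run.

from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
from urllib.request import urlopen

# I/O-bound: threads spend most of their time waiting on the network,
# the GIL is released while waiting, so the threads overlap nicely
def fetch(url):
    with urlopen(url) as resp:
        return len(resp.read())

# CPU-bound: pure Python computation holds the GIL, so separate
# processes are needed to use more than one core
def cpu_heavy(n):
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    urls = ["https://www.python.org", "https://www.example.com"]  # placeholder URLs
    with ThreadPoolExecutor(max_workers=4) as tp:
        sizes = list(tp.map(fetch, urls))

    with ProcessPoolExecutor(max_workers=4) as pp:
        totals = list(pp.map(cpu_heavy, [10 ** 6] * 4))

    print(sizes, totals)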
So why do the experts say it is best to use multiple processes? Presumably to make full use of multicore CPUs; otherwise why would our PCs need multicore CPUs at all?
Coming back to the sample code: judging by the results, the multithreaded version is faster, which also suggests that multithreading is the better choice for this disk read/write scenario. The catch is that the multithreaded output, new2.txt, ends up out of order. I suspect there is a way around that, but I will look into it next time.
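One possible way to get ordered output, offered as a sketch rather than a verified solution from the article: let the worker threads only process their chunks and have the main thread do all the writing, for example via ThreadPoolExecutor.map, which yields results in the same order as the inputs. The process_chunk function below is a placeholder.

from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk):
    # placeholder for per-chunk work; here it just passes the lines through
    return chunk

if __name__ == "__main__":
    with open("test.txt", "r", encoding="utf-8") as f:
        content = f.readlines()

    threadnum = 4
    eline = len(content) // threadnum
    chunks = [content[i * eline:(i + 1) * eline] for i in range(threadnum)]

    with ThreadPoolExecutor(max_workers=threadnum) as pool:
        results = pool.map(process_chunk, chunks)  # results come back in input order

    # a single writer in the main thread keeps new2.txt in the original order
    with open("new2.txt", "w") as f:
        for chunk in results:
            f.writelines(chunk)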