2025-01-17 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains multithreading and thread pools for Python asynchronous crawlers. The material is simple and clear, so follow along to understand how each approach works.
Catalogue
Background
Asynchronous crawler mode
Multi-thread, multi-process (not recommended)
Thread pool, process pool (appropriate use)
Single thread + asynchronous coroutine (recommended)
Multithreading
Thread pool
Background
When sending requests to multiple URLs, the second request cannot begin until the first one finishes (`requests` is a blocking operation). All of that waiting is very inefficient. Can we start a separate process or thread for each request while it waits, move on to the next URL, and so issue the requests in parallel?
Asynchronous crawler modes
Multi-thread, multi-process (not recommended)
Benefits: a separate thread or process can be opened for each blocking operation, so the blocking operations run asynchronously.
Disadvantages: threads and processes cannot be opened without limit, and frequently creating and destroying them is expensive.
Thread pool, process pool (appropriate use)
Benefits: reduces the frequency of thread or process creation and destruction, which lowers system overhead.
Disadvantages: the number of threads or processes in the pool is capped.
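As an aside not covered by the original article, the standard library also offers `concurrent.futures.ThreadPoolExecutor`, a more modern thread-pool API than `multiprocessing.dummy`. A minimal sketch (the `download` function and its inputs are illustrative placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
from time import sleep

def download(name):
    # simulate a blocking download with sleep
    sleep(0.1)
    return f'done: {name}'

urls = ['xiaozi', 'aa', 'bb', 'cc']

# max_workers caps the pool size, so threads are reused
# instead of being created and destroyed per task
with ThreadPoolExecutor(max_workers=4) as executor:
    # executor.map preserves input order in its results
    results = list(executor.map(download, urls))

print(results)
```

Unlike `multiprocessing.dummy.Pool`, the `with` block shuts the pool down automatically when the work is done.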
Single thread + asynchronous coroutine (recommended)
Multithreading
The following code normally takes 8 seconds to run, because sleep is a blocking operation: nothing else executes while waiting, which greatly reduces efficiency.
```python
from time import sleep
import time

start = time.time()

def xx(s):
    print('downloading:', s)
    sleep(2)

# note: renamed from "str" to avoid shadowing the built-in
strs = ['xiaozi', 'aa', 'bb', 'cc']
for i in strs:
    xx(i)

end = time.time()
print('running time:', end - start)
```
After using multithreading
```python
from threading import Thread
from time import sleep
import time

start = time.time()

def xx(s):
    print('downloading:', s)
    sleep(2)

strs = ['xiaozi', 'aa', 'bb', 'cc']

def main():
    for s in strs:
        # start a thread: target is the function to run; args must be
        # a tuple, so a trailing comma is required for a single argument
        t = Thread(target=xx, args=(s,))
        t.start()

if __name__ == '__main__':
    main()
    end = time.time()
    print('program run time:', end - start)
```
But notice that the output order is no longer deterministic: the threads run concurrently, so their print statements can interleave. Also, since the main thread never waits for the workers, the printed run time measures only thread startup, not the downloads themselves.
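To measure the real elapsed time, the main thread has to wait for every worker with `join()`. A minimal sketch (not from the original article) of the same example with joining added:

```python
from threading import Thread
from time import sleep
import time

start = time.time()

def xx(s):
    print('downloading:', s)
    sleep(2)

strs = ['xiaozi', 'aa', 'bb', 'cc']

# create and start all four threads first...
threads = [Thread(target=xx, args=(s,)) for s in strs]
for t in threads:
    t.start()

# ...then block until every worker thread finishes
for t in threads:
    t.join()

end = time.time()
print('program run time:', end - start)  # roughly 2 seconds, not 8
```

Because all four `sleep(2)` calls overlap, the total is about 2 seconds instead of 8.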
Thread pool
Rewriting the example above to use a thread pool:
```python
# multiprocessing.dummy provides a thread-based Pool with the same API
# as the process-based multiprocessing.Pool
from multiprocessing.dummy import Pool
from time import sleep
import time

start = time.time()

def xx(s):
    print('downloading:', s)
    sleep(2)

strs = ['xiaozi', 'aa', 'bb', 'cc']

# instantiate a thread pool with four threads, so four parallel
# threads handle the four blocking operations
pool = Pool(4)

# map passes each element of the iterable to xx (the blocking
# operation); it returns a list of the function's return values,
# but xx returns nothing, so we ignore it here
pool.map(xx, strs)

end = time.time()
print('program run time:', end - start)
```
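The catalogue names "single thread + asynchronous coroutine" as the recommended mode, but the article stops at the thread pool. A minimal sketch of that mode using `asyncio` (this example is not from the original article) looks like:

```python
import asyncio
import time

async def xx(s):
    print('downloading:', s)
    # await a non-blocking sleep; time.sleep would block the event loop
    await asyncio.sleep(2)

async def main():
    # schedule all four coroutines as tasks on the single-threaded
    # event loop, then wait for them all to finish
    tasks = [asyncio.create_task(xx(s)) for s in ['xiaozi', 'aa', 'bb', 'cc']]
    await asyncio.gather(*tasks)

start = time.time()
asyncio.run(main())
end = time.time()
print('program run time:', end - start)  # roughly 2 seconds
```

All four waits overlap on one thread, so the total is about 2 seconds, with none of the thread-creation overhead of the earlier approaches. Note that a real crawler would need an async HTTP client (such as `aiohttp`), since blocking `requests` calls would defeat the event loop.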
Thank you for reading. That covers multithreading and thread pools for Python asynchronous crawlers; after working through this article you should have a deeper understanding of both.