This article "python how to achieve multithreading and get the return value" most people do not understand, so the editor summed up the following content, detailed, clear steps, with a certain reference value, I hope you can get something after reading this article, let's take a look at this "python how to achieve multithreading and get the return value" article bar.
1. Multithreading

1.1 Implementation code with return values

# -*- coding: utf-8 -*-
"""
author: wyt
date: April 21, 2022
"""
import threading
import requests
import time

urls = [
    f'https://www.cnblogs.com/#p{page}'  # addresses to crawl
    for page in range(1, 10)             # crawl pages 1-9
]

def craw(url):
    r = requests.get(url)
    num = len(r.text)  # number of characters on the crawled cnblogs page
    return num         # hand the count back to the caller

def sigle():  # single-threaded version
    res = []
    for i in urls:
        res.append(craw(i))
    return res

class MyThread(threading.Thread):  # subclass threading.Thread and add functions
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url  # store the incoming url

    def run(self):  # overridden function; its purpose is to:
        # 1. call craw(), passing in the url, to do the actual crawling
        # 2. keep craw()'s return value in self.result
        self.result = craw(self.url)

    def get_result(self):  # added function that returns the result produced by run()
        return self.result

def multi_thread():
    print("start")
    threads = []  # define a thread group
    for url in urls:
        threads.append(
            MyThread(url)  # pass each url to the rewritten MyThread class
        )
    for thread in threads:  # start every thread in the group
        thread.start()
    for thread in threads:  # join every thread in the group
        thread.join()
    results = []
    for thread in threads:
        results.append(thread.get_result())  # collect each thread's result
    print("end")
    return results  # the list of values returned by the threads

if __name__ == '__main__':
    start_time = time.time()
    result_multi = multi_thread()
    print(result_multi)  # print the returned list
    # result_sig = sigle()
    # print(result_sig)
    end_time = time.time()
    print('usage:', end_time - start_time)

1.2 Results
Single thread:
Multithreading:
The speedup from multithreading is obvious.
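As an aside, the standard library's concurrent.futures module can collect return values without subclassing Thread at all. A minimal sketch, assuming the craw() function and urls list from section 1.1 are already defined (max_workers=8 is an arbitrary choice):

from concurrent.futures import ThreadPoolExecutor

def multi_thread_pool():
    # map() runs craw() over urls on a pool of worker threads and
    # yields each call's return value in input order
    with ThreadPoolExecutor(max_workers=8) as executor:
        return list(executor.map(craw, urls))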
2. Implementation process

2.1 A plain crawler function

import threading
import requests
import time

urls = [
    f'https://www.cnblogs.com/#p{page}'  # addresses to crawl
    for page in range(1, 10)             # crawl pages 1-9
]

def craw(url):
    r = requests.get(url)
    num = len(r.text)  # number of characters on the page
    print(num)

def sigle():  # single-threaded version
    res = []
    for i in urls:
        res.append(craw(i))
    return res

def multi_thread():
    print("start")
    threads = []  # define a thread group
    for url in urls:
        threads.append(
            threading.Thread(target=craw, args=(url,))  # note that args=(url,) is a tuple
        )
    for thread in threads:  # start every thread in the group
        thread.start()
    for thread in threads:  # join every thread in the group
        thread.join()
    print("end")

if __name__ == '__main__':
    start_time = time.time()
    result_multi = multi_thread()
    # result_sig = sigle()
    # print(result_sig)
    end_time = time.time()
    print('usage:', end_time - start_time)
Output:

start
69915
69915
69915
69915
69915
69915
69915
69915
69915
end
usage: 0.316709041595459
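Note that this version cannot hand results back to the caller: threading.Thread.run() discards the target function's return value, and thread.join() always returns None. Besides subclassing Thread (the approach in section 1.1), a thread-safe queue.Queue is another common way to collect results. A minimal sketch, assuming the return-value version of craw() from section 1.1 and the same urls list:

import queue
import threading

def multi_thread_queue():
    q = queue.Queue()  # thread-safe FIFO shared by all worker threads
    threads = [
        threading.Thread(target=lambda u: q.put(craw(u)), args=(url,))
        for url in urls
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    # results come out in completion order, not in the order of urls
    return [q.get() for _ in range(len(urls))]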
2.2 A simple multithreaded value-passing example

import time
from threading import Thread

def foo(number):
    time.sleep(1)
    return number

class MyThread(Thread):
    def __init__(self, number):
        Thread.__init__(self)
        self.number = number

    def run(self):
        self.result = foo(self.number)

    def get_result(self):
        return self.result

if __name__ == '__main__':
    thd1 = MyThread(3)
    thd2 = MyThread(5)
    thd1.start()
    thd2.start()
    thd1.join()
    thd2.join()
    print(thd1.get_result())
    print(thd2.get_result())
Output:

3
5
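One caveat: self.result only exists once run() has finished, so call get_result() only after join(); on a thread that has not completed yet it can raise AttributeError. A hypothetical illustration using the MyThread class above:

thd = MyThread(7)
thd.start()
# thd.get_result()       # too early: run() may not have set self.result yet
thd.join()               # wait until run() finishes
print(thd.get_result())  # safe: the result is guaranteed to exist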
2.3 Implementation keys

Multithreading entry point:

threading.Thread(target=craw, args=(url,))  # note that args=(url,) is a tuple

Getting values back from threads:

You need to subclass threading.Thread and add a function that exposes the return value, as in the MyThread class above; a generalized version of the pattern is sketched below.
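The pattern can be generalized so that a single subclass works for any target function, instead of hard-coding craw() into run(). A minimal sketch; the class name ReturnThread and its attribute names are my own, not from the original code:

import threading

class ReturnThread(threading.Thread):
    # a Thread subclass that stores the target function's return value
    def __init__(self, target, args=()):
        threading.Thread.__init__(self)
        self._target_func = target
        self._call_args = args
        self.result = None

    def run(self):
        self.result = self._target_func(*self._call_args)

    def get_result(self):
        return self.result

Usage would look like ReturnThread(target=craw, args=(url,)), mirroring the plain threading.Thread call from section 2.1.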
3. Code practice

Using this return-value multithreading technique, I rewrote a subdomain-enumeration script I had published earlier. The original code is here: https://blog.csdn.net/qq_45859826/article/details/124030119
import threading
import requests
from bs4 import BeautifulSoup
from static.plugs.headers import get_ua  # project-local helper that supplies request headers

# https://cn.bing.com/search?q=site%3Abaidu.com&go=Search&qs=ds&first=20&FORM=PERE
def search_1(url):
    Subdomain = []
    html = requests.get(url, stream=True, headers=get_ua())
    soup = BeautifulSoup(html.content, 'html.parser')
    job_bt = soup.findAll('h3')
    for i in job_bt:
        link = i.a.get('href')
        # print(link)
        if link not in Subdomain:
            Subdomain.append(link)
    return Subdomain

class MyThread(threading.Thread):
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url

    def run(self):
        self.result = search_1(self.url)

    def get_result(self):
        return self.result

def Bing_multi_thread(site):
    print("start")
    threads = []
    for i in range(1, 30):
        url = "https://cn.bing.com/search?q=site%3A" + site + "&go=Search&qs=ds&first=" + str((int(i) - 1) * 10) + "&FORM=PERE"
        threads.append(MyThread(url))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    res_list = []
    for thread in threads:
        res_list.extend(thread.get_result())
    res_list = list(set(res_list))  # deduplicate the list
    number = 1
    for i in res_list:
        number += 1
    number_list = list(range(1, number + 1))
    dict_res = dict(zip(number_list, res_list))  # number the results from 1
    print("end")
    return dict_res

if __name__ == '__main__':
    print(Bing_multi_thread("qq.com"))
Output:
{
    1: 'https://transmart.qq.com/index',
    2: 'https://wpa.qq.com/msgrd?v=3&uin=448388692&site=qq&menu=yes',
    3: 'https://en.exmail.qq.com/',
    4: 'https://jiazhang.qq.com/wap/com/v1/dist/unbind_login_qq.shtml?source=h6_wx',
    5: 'http://imgcache.qq.com/',
    6: 'https://new.qq.com/rain/a/20220109A040B600',
    7: 'http://cp.music.qq.com/index.html',
    8: 'http://s.syzs.qq.com/',
    9: 'https://new.qq.com/rain/a/20220321A0CF1X00',
    10: 'https://join.qq.com/about.html',
    11: 'https://live.qq.com/10016675',
    12: 'http://uni.mp.qq.com/',
    13: 'https://new.qq.com/omn/TWF20220/TWF2022042400147500.html',
    14: 'https://wj.qq.com/?from=exur#!',
    15: 'https://wj.qq.com/answer_group.html',
    16: 'https://view.inews.qq.com/a/20220330A00HTS00',
    17: 'https://browser.qq.com/mac/en/index.html',
    18: 'https://windows.weixin.qq.com/?lang=en_US',
    19: 'https://cc.v.qq.com/upload',
    20: 'https://xiaowei.weixin.qq.com/skill',
    21: 'http://wpa.qq.com/msgrd?v=3&uin=286771835&site=qq&menu=yes',
    22: 'http://huifu.qq.com/',
    23: 'https://uni.weixiao.qq.com/',
    24: 'http://join.qq.com/',
    25: 'https://cqtx.qq.com/',
    26: 'http://id.qq.com/',
    27: 'http://m.qq.com/',
    28: 'https://jq.qq.com/?_wv=1027&k=pevCjRtJ',
    29: 'https://v.qq.com/x/page/z0678c3ys6i.html',
    30: 'https://live.qq.com/10018921',
    31: 'https://m.campus.qq.com/manage/manage.html',
    32: 'https://101.qq.com/',
    33: 'https://new.qq.com/rain/a/20211012A0A3L000',
    34: 'https://live.qq.com/10021593',
    35: 'https://pc.weixin.qq.com/?t=win_weixin&lang=en',
    36: 'https://sports.qq.com/lottery/09fucai/cqssc.htm'
}
The speedup is clearly noticeable, and blasting out subdomains this way is satisfying. Without multithreading my project would be missing several features, because some scripts I wrote before were cut off for taking too long to execute. This technique is genuinely practical.
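If long execution time is the concern, one common guard (my addition, not from the original post) is a per-request timeout, so that a single slow page cannot stall its thread indefinitely. A minimal sketch adapting the craw() function from section 1.1:

import requests

def craw(url):
    try:
        # timeout=(connect, read) in seconds; a slow or dead host raises
        r = requests.get(url, timeout=(3, 10))
        return len(r.text)
    except requests.exceptions.RequestException:
        return 0  # treat failed or timed-out pages as empty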
That is all of the content of this article on "how to implement multithreading in Python and get return values". I believe you now have a working understanding of it, and I hope what was shared here is helpful to you.