
How to implement multithreading and get the return value in Python

2025-04-02 Update From: SLTechnology News&Howtos > Development


Shulou (Shulou.com) 05/31 Report

Many readers are unsure how to implement multithreading in Python and collect each thread's return value, so this article summarizes the technique in detailed, clear steps. It has some reference value, and I hope you get something out of it; let's take a look.

1. Multithreading

1.1 Implementation code with return value

```python
# -*- coding: utf-8 -*-
"""
author: wyt
date: April 21, 2022
"""
import threading
import requests
import time

urls = [
    f'https://www.cnblogs.com/#p{page}'  # addresses to crawl
    for page in range(1, 10)             # pages 1-9
]

def craw(url):
    r = requests.get(url)
    num = len(r.text)  # character count of the fetched page
    return num         # return the page's character count

def sigle():  # single-threaded version
    res = []
    for i in urls:
        res.append(craw(i))
    return res

class MyThread(threading.Thread):  # subclass threading.Thread and add methods
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url  # store the url passed in

    def run(self):
        # 1. call craw() with the stored url, doing the crawl work
        # 2. keep craw()'s return value in self.result
        self.result = craw(self.url)

    def get_result(self):
        # return the result stored by run()
        return self.result

def multi_thread():
    print("start")
    threads = []  # thread group
    for url in urls:
        threads.append(MyThread(url))  # one MyThread instance per url
    for thread in threads:  # start every thread
        thread.start()
    for thread in threads:  # join every thread
        thread.join()
    results = []
    for thread in threads:
        results.append(thread.get_result())  # collect each thread's result
    print("end")
    return results  # list of the values returned by the threads

if __name__ == '__main__':
    start_time = time.time()
    result_multi = multi_thread()
    print(result_multi)  # print the returned list
    # result_sig = sigle()
    # print(result_sig)
    end_time = time.time()
    print('usage:', end_time - start_time)
```

1.2 Results

Single thread: (timing screenshot in the original post)

Multithreading: (timing screenshot in the original post)

The acceleration effect is obvious.

2. Implementation process

2.1 An ordinary crawler function

```python
import threading
import requests
import time

urls = [
    f'https://www.cnblogs.com/#p{page}'  # addresses to crawl
    for page in range(1, 10)             # pages 1-9
]

def craw(url):
    r = requests.get(url)
    num = len(r.text)  # character count
    print(num)

def sigle():  # single-threaded version
    res = []
    for i in urls:
        res.append(craw(i))
    return res

def multi_thread():
    print("start")
    threads = []  # thread group
    for url in urls:
        threads.append(
            threading.Thread(target=craw, args=(url,))  # note: args=(url,) is a tuple
        )
    for thread in threads:  # start every thread
        thread.start()
    for thread in threads:  # join every thread
        thread.join()
    print("end")

if __name__ == '__main__':
    start_time = time.time()
    result_multi = multi_thread()
    # result_sig = sigle()
    # print(result_sig)
    end_time = time.time()
    print('usage:', end_time - start_time)
```

Return:

```
start
69915
69915
69915
69915
69915
69915
69915
69915
69915
end
usage: 0.316709041595459
```
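Why does the 2.1 version have to print() instead of returning? Because `Thread.join()` does not propagate the target function's return value; it always returns `None`. A minimal sketch (not from the original article) demonstrating this, where `work` is a hypothetical stand-in for `craw`:

```python
import threading

def work(x):
    return x * 2  # this return value is discarded by the Thread machinery

results = []

t = threading.Thread(target=work, args=(21,))
t.start()
ret = t.join()  # join() returns None, not work()'s result

# With plain target/args, the target itself must write to shared state:
t2 = threading.Thread(target=lambda: results.append(work(21)))
t2.start()
t2.join()

print(ret, results)  # None [42]
```

This limitation is exactly what the subclassing trick in section 2.2 works around.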

2.2 A minimal multithreaded value-passing example

```python
import time
from threading import Thread

def foo(number):
    time.sleep(1)
    return number

class MyThread(Thread):
    def __init__(self, number):
        Thread.__init__(self)
        self.number = number

    def run(self):
        self.result = foo(self.number)

    def get_result(self):
        return self.result

if __name__ == '__main__':
    thd1 = MyThread(3)
    thd2 = MyThread(5)
    thd1.start()
    thd2.start()
    thd1.join()
    thd2.join()
    print(thd1.get_result())
    print(thd2.get_result())
```

Return:

3

5
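For comparison, the standard library's `concurrent.futures.ThreadPoolExecutor` can collect return values without subclassing `Thread` at all. This is an alternative sketch of the same `foo(3)` / `foo(5)` example, not part of the original article:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def foo(number):
    time.sleep(1)
    return number

with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(foo, n) for n in (3, 5)]   # submit() returns a Future
    results = [f.result() for f in futures]           # result() blocks until done

print(results)  # [3, 5]
```

Both calls sleep concurrently, so the whole block takes about one second rather than two.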

2.3 Implementation focus

Starting threads:

```python
threading.Thread(target=craw, args=(url,))  # note: args=(url,) must be a tuple
```
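With this target/args entry style there is also a queue-based way to collect results, which I sketch here as an alternative (it is not from the original article): pass a shared `queue.Queue` as an extra argument and have each thread put its result into it. `craw_into` is a hypothetical stand-in for `craw` that just records the url length:

```python
import threading
import queue

def craw_into(url, out_q):
    # stand-in for a real crawl; thread-safely record (url, "result")
    out_q.put((url, len(url)))

urls = ['https://www.cnblogs.com/#p1', 'https://www.cnblogs.com/#p2']
q = queue.Queue()  # Queue is thread-safe, so no locking is needed here
threads = [threading.Thread(target=craw_into, args=(u, q)) for u in urls]
for t in threads:
    t.start()
for t in threads:
    t.join()

results = dict(q.get() for _ in urls)  # drain one item per thread
print(results)
```

Note that the queue yields results in completion order, not submission order, so pairing each result with its url (as above) is important.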

Getting return values from threads:

You need to subclass threading.Thread, store the result of the call in run(), and add a method that returns that stored value.

3. Code practice

Using this return-value multithreading technique, I rewrote some subdomain-enumeration code I had published earlier. The original code is here: https://blog.csdn.net/qq_45859826/article/details/124030119

```python
import threading
import requests
from bs4 import BeautifulSoup
from static.plugs.headers import get_ua  # project-local helper returning a User-Agent header

# example query: https://cn.bing.com/search?q=site%3Abaidu.com&go=Search&qs=ds&first=20&FORM=PERE
def search_1(url):
    Subdomain = []
    html = requests.get(url, stream=True, headers=get_ua())
    soup = BeautifulSoup(html.content, 'html.parser')
    job_bt = soup.findAll('h3')
    for i in job_bt:
        link = i.a.get('href')
        # print(link)
        if link not in Subdomain:
            Subdomain.append(link)
    return Subdomain

class MyThread(threading.Thread):
    def __init__(self, url):
        threading.Thread.__init__(self)
        self.url = url

    def run(self):
        self.result = search_1(self.url)

    def get_result(self):
        return self.result

def Bing_multi_thread(site):
    print("start")
    threads = []
    for i in range(1, 30):
        url = ("https://cn.bing.com/search?q=site%3A" + site +
               "&go=Search&qs=ds&first=" + str((int(i) - 1) * 10) + "&FORM=PERE")
        threads.append(MyThread(url))
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    res_list = []
    for thread in threads:
        res_list.extend(thread.get_result())
    res_list = list(set(res_list))  # deduplicate the list
    number = 1
    for i in res_list:
        number += 1
    number_list = list(range(1, number + 1))
    dict_res = dict(zip(number_list, res_list))  # number the results
    print("end")
    return dict_res

if __name__ == '__main__':
    print(Bing_multi_thread("qq.com"))
```

Return:

```
{
 1: 'https://transmart.qq.com/index',
 2: 'https://wpa.qq.com/msgrd?v=3&uin=448388692&site=qq&menu=yes',
 3: 'https://en.exmail.qq.com/',
 4: 'https://jiazhang.qq.com/wap/com/v1/dist/unbind_login_qq.shtml?source=h6_wx',
 5: 'http://imgcache.qq.com/',
 6: 'https://new.qq.com/rain/a/20220109A040B600',
 7: 'http://cp.music.qq.com/index.html',
 8: 'http://s.syzs.qq.com/',
 9: 'https://new.qq.com/rain/a/20220321A0CF1X00',
 10: 'https://join.qq.com/about.html',
 11: 'https://live.qq.com/10016675',
 12: 'http://uni.mp.qq.com/',
 13: 'https://new.qq.com/omn/TWF20220/TWF2022042400147500.html',
 14: 'https://wj.qq.com/?from=exur#!',
 15: 'https://wj.qq.com/answer_group.html',
 16: 'https://view.inews.qq.com/a/20220330A00HTS00',
 17: 'https://browser.qq.com/mac/en/index.html',
 18: 'https://windows.weixin.qq.com/?lang=en_US',
 19: 'https://cc.v.qq.com/upload',
 20: 'https://xiaowei.weixin.qq.com/skill',
 21: 'http://wpa.qq.com/msgrd?v=3&uin=286771835&site=qq&menu=yes',
 22: 'http://huifu.qq.com/',
 23: 'https://uni.weixiao.qq.com/',
 24: 'http://join.qq.com/',
 25: 'https://cqtx.qq.com/',
 26: 'http://id.qq.com/',
 27: 'http://m.qq.com/',
 28: 'https://jq.qq.com/?_wv=1027&k=pevCjRtJ',
 29: 'https://v.qq.com/x/page/z0678c3ys6i.html',
 30: 'https://live.qq.com/10018921',
 31: 'https://m.campus.qq.com/manage/manage.html',
 32: 'https://101.qq.com/',
 33: 'https://new.qq.com/rain/a/20211012A0A3L000',
 34: 'https://live.qq.com/10021593',
 35: 'https://pc.weixin.qq.com/?t=win_weixin&lang=en',
 36: 'https://sports.qq.com/lottery/09fucai/cqssc.htm'
}
```

I can really feel how much faster this is, and enumerating subdomains this way is satisfying. Without multithreading, my project would probably be missing several features: some programs I wrote before were cut off because they took too long to execute. This technique is very practical.

That is all for "how to implement multithreading and get the return value in Python". I believe you now have a general understanding, and I hope the content shared here is helpful to you.
