
How to use Python to crawl the top 100 Cat's Eye (Maoyan) movies

2025-03-29 Update From: SLTechnology News&Howtos


This article mainly introduces "how to use Python to crawl the top 100 Cat's Eye movies". In daily work, I believe many people have doubts about this topic, so the editor consulted all kinds of materials and sorted out a simple, easy-to-use method, hoping it helps answer your doubts about "how to use Python to crawl the top 100 Cat's Eye movies". Next, please follow the editor and study!

import requests
import re
from bs4 import BeautifulSoup
from lxml import etree
import traceback
import csv

# define a function to fetch one page of the board
def get_one_page(url, code='utf-8'):
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.90 Safari/537.36'}
    try:
        r = requests.get(url, headers=headers)
        if r.status_code == 200:
            r.encoding = code
            return r.text
        else:
            print("response failed")
            return None
    except:
        traceback.print_exc()

def process(raw):
    # keep only the part of the image URL before the "@" suffix
    right = raw.split("@")
    return right[0]

def area(a):
    # the release-time string ends with ")" only when it includes an area in parentheses
    if a[-1] == ")":
        return a[16:]
    else:
        return None

def parse_one_page(slst, html):
    # regular expression; the HTML tags inside the original pattern were lost when the
    # article was published, so the pattern below is reconstructed to capture the seven
    # fields used in the yield: rank, cover image, name, stars, release time, and the
    # integer and fraction parts of the score
    pattern = re.compile(
        r'<dd>.*?board-index.*?>(\d+)</i>.*?data-src="(.*?)".*?name.*?<a.*?>(.*?)</a>'
        r'.*?star.*?>(.*?)</p>.*?releasetime.*?>(.*?)</p>'
        r'.*?integer.*?>(.*?)</i>.*?fraction.*?>(.*?)</i>.*?</dd>', re.S)
    items = re.findall(pattern, html)
    # print(items)
    for item in items:
        # yield works like return in a function, but with a difference:
        # the yield statement turns the function into a generator (iterator)
        yield {'rank': item[0],
               'img': process(item[1]),
               'MovieName': item[2],
               "star": item[3].strip()[3:],
               "time": item[4].strip()[5:15],
               "area": area(item[4].strip()),
               "score": str(item[5]) + str(item[6])}
    # return ""

def write_to_file(item):
    # 'a' is append mode; newline="" prevents blank lines between csv rows
    with open("Cat's Eye top100.csv", 'a', encoding="utf_8_sig", newline="") as f:
        fieldnames = ['rank', 'img', 'MovieName', 'star', 'time', 'area', 'score']
        w = csv.DictWriter(f, fieldnames=fieldnames)  # write the dictionary into the csv
        # w.writeheader()
        w.writerow(item)
    return ""

def down_img(name, url, num):
    try:
        response = requests.get(url)
        with open('C:/Users/HUAWEI/Desktop/py/crawler/douban/' + name + '.jpg', 'wb') as f:
            f.write(response.content)
        print("picture %s downloaded" % str(num))
        print("=" * 20)
    except Exception as e:
        print(e.__class__.__name__)  # print the error type name

def main(i):
    num = 0
    url = 'https://maoyan.com/board/4?offset=' + str(i)
    html = get_one_page(url)
    # print(html)
    lst = []  # of no use here, but handy later if you want to store some kind of information separately
    iterator = parse_one_page(lst, html)
    for a in iterator:
        # print(a)
        num += 1
        write_to_file(a)
        down_img(a['MovieName'], a['img'], num)

# single-process version:
# if __name__ == '__main__':
#     for i in range(10):
#         main(i)

# crawl the ten pages in parallel with a process pool
from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool()
    pool.map(main, [i * 10 for i in range(10)])
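The comment on yield above is worth unpacking: a function containing yield returns a generator, and its body runs lazily as the caller iterates, which is why main() can loop over parse_one_page() one movie at a time. A minimal standalone sketch (the count_up helper is hypothetical, not part of the crawler):

def count_up(n):
    # each yield hands back one value and pauses the function here
    for i in range(n):
        yield i

gen = count_up(3)    # nothing runs yet; gen is a generator object
for value in gen:
    print(value)     # prints 0, then 1, then 2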

The final running result is as follows:

The cover pictures are saved to the local folder.

The crawled information is saved to the CSV file.
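To double-check the rows, the file can be read straight back with csv.DictReader. A minimal sketch, assuming the same file name and field order used by write_to_file above (the script never calls writeheader, so the field names are supplied explicitly):

import csv

fieldnames = ['rank', 'img', 'MovieName', 'star', 'time', 'area', 'score']
with open("Cat's Eye top100.csv", encoding="utf_8_sig", newline="") as f:
    reader = csv.DictReader(f, fieldnames=fieldnames)
    for row in reader:
        print(row['rank'], row['MovieName'], row['score'])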

At this point, the study of "how to use Python to crawl the top 100 Cat's Eye movies" is over. I hope it can resolve your doubts. Combining theory with practice helps you learn better, so go and try it out! If you want to continue learning more related knowledge, please keep following the website; the editor will keep working hard to bring you more practical articles!
