Shulou (Shulou.com) 06/02 report, updated 2025-04-04. From: SLTechnology News&Howtos
This article explains how to use Python to crawl bilibili's anime (bangumi) update information. Many people have questions about how to do this in practice, so the editor consulted various sources and put together a simple, easy-to-follow method. Follow along below.
Goal: crawl the latest anime updates on bilibili.
Output format: name + play count + introduction
So let's get started.
Libraries used:
requests: sends HTTP requests
pyquery: parses XML/HTML documents with a jQuery-like API
1. Analyze the page layout and find the content you need to crawl
Target URL:
https://bangumi.bilibili.com/22/
Design the video class:
import requests
from pyquery import PyQuery as pq


class Video(object):
    def __init__(self, name, see, intro):
        self.name = name    # title
        self.see = see      # play count
        self.intro = intro  # introduction

    def __str__(self):
        return "{}--{}--{}".format(self.name, self.see, self.intro)
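As a quick check of the output format (name + play count + introduction), here is a minimal sketch that builds one Video and prints it. The class is repeated so the snippet runs on its own, and the sample values are made up for illustration.

```python
class Video(object):
    """Holds one scraped entry: name, play count, and introduction."""

    def __init__(self, name, see, intro):
        self.name = name
        self.see = see
        self.intro = intro

    def __str__(self):
        return "{}--{}--{}".format(self.name, self.see, self.intro)


# Hypothetical values, not real bilibili data.
sample = Video(name="Example Anime", see="12345", intro="A short introduction.")
print(sample)
```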
After analyzing the page, write the crawler class:
class bilibili(object):
    host = "https://bangumi.bilibili.com"

    def __init__(self):
        # Download and parse the listing page once.
        self.dom = pq(requests.get('https://bangumi.bilibili.com/22/').text)

    def get_recent(self):
        '''recently updated'''
        items = self.dom('#list_bangumi_new .c-list .new .c-item')
        videos = []
        # .items() yields PyQuery objects, so .find() and .attr() work on each match.
        for i in items.items():
            name = i.find('.r-i .t').attr('title')
            link = self.host + i.find('.r-i .t').attr('href')
            # Fetch the detail page for the play count and introduction.
            d = pq(requests.get(url=link).text)
            see = d(".info-count .info-count-item").eq(1).find('em').text()
            intro = d('.info-row').eq(3).find('.info-desc').text()
            videos.append(Video(name=name, see=see, intro=intro))
        return videos
Oh, what's going on? The result is empty.
Don't panic. If the code itself raises no errors, an empty result usually has one of two causes:
The selector does not match the target, or the page content is loaded dynamically by JavaScript.
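The two causes can be told apart by looking at the raw HTML the server sends, before any script runs. The sketch below is only an illustration: it uses the standard library's html.parser instead of pyquery so it runs without extra packages, and RAW_HTML, fill_list.js, and the class names are assumed stand-ins for the real page.

```python
from html.parser import HTMLParser


class ClassCounter(HTMLParser):
    """Counts start tags whose class attribute contains a given class name."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self.count += 1


def count_class(html, class_name):
    parser = ClassCounter(class_name)
    parser.feed(html)
    return parser.count


# Hypothetical raw HTML as a server might send it: the list container is
# empty because a script fills it in later, inside the browser.
RAW_HTML = """
<div id="list_bangumi_new"><ul class="c-list"></ul></div>
<script src="fill_list.js"></script>
"""

print(count_class(RAW_HTML, "c-list"))  # the container is in the raw HTML
print(count_class(RAW_HTML, "c-item"))  # the items are not
```

If the container is present but the items are missing from the raw HTML while the same selector matches in the browser console, the content is most likely filled in by JavaScript.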
Let's check the first case. Open the browser, press F12, paste the selector into the console, and run it:
$('#list_bangumi_new .c-list .new .c-item')
The selector matches in the browser, so the selector itself is fine and the second case applies: the list is loaded dynamically by JavaScript, which requests never executes. The data comes from an API (https://bangumi.bilibili.com/api/timeline_v2_global). Each item in its response contains the name we want, so the next step is to go to the detail page for the play count and introduction. But where is the link to the detail page? It is not in the API response. Press F12 again and inspect the element.
The link here is /anime/6439. The API response contains no such link, so it has to be assembled. The key is the number 6439. Searching the API response for it turns up a matching season_id field. The detail page link is then built as follows:
detail_url = "https://bangumi.bilibili.com/anime/{season_id}"
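The search for the field that carries 6439 can also be done in code. This is a minimal sketch under assumptions: the item dict is a made-up stand-in for one API entry (only title and season_id are named in this article), and keys_with_value is a hypothetical helper.

```python
DETAIL_URL = "https://bangumi.bilibili.com/anime/{season_id}"

# Hypothetical API item; other_field is invented for the example.
item = {"title": "Example Anime", "season_id": 6439, "other_field": 1}


def keys_with_value(record, wanted):
    """Return the keys whose value equals `wanted`, compared as strings."""
    return [key for key, value in record.items() if str(value) == str(wanted)]


print(keys_with_value(item, 6439))
print(DETAIL_URL.format(season_id=item["season_id"]))
```

Comparing as strings means the match still works if the API returns the id as text rather than as a number.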
The next step is to analyze the detail page and extract the play count and introduction. The extraction code is:
see = d(".info-count .info-count-item").eq(1).find('em').text()
intro = d('.info-desc-wrp').find('.info-desc').text()
The key code of the crawler class is then:
import json


class bilibili(object):
    recent_url = "https://bangumi.bilibili.com/api/timeline_v2_global"  # recently updated
    detail_url = "https://bangumi.bilibili.com/anime/{season_id}"

    def __init__(self):
        self.dom = pq(requests.get('https://bangumi.bilibili.com/22/').text)

    def get_recent(self, num=0):
        '''recently updated; num limits the number of results (0 returns all)'''
        items = json.loads(requests.get(self.recent_url).text)['result']
        if num > 0:
            items = items[:num]
        videos = []
        for i in items:
            name = i['title']
            # Build the detail page link from the season_id field.
            link = self.detail_url.format(season_id=i['season_id'])
            d = pq(requests.get(url=link).text)
            see = d(".info-count .info-count-item").eq(1).find('em').text()
            intro = d('.info-desc-wrp').find('.info-desc').text()
            videos.append(Video(name=name, see=see, intro=intro))
        return videos
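The JSON handling can be tried offline before hitting the live site. The sketch below assumes a response shaped like the one described above (a result list whose items carry title and season_id). SAMPLE_BODY is invented data and parse_timeline is a hypothetical helper, not part of the original code.

```python
import json

DETAIL_URL = "https://bangumi.bilibili.com/anime/{season_id}"

# Hypothetical response body; the real API may carry more fields.
SAMPLE_BODY = json.dumps({
    "result": [
        {"title": "Anime A", "season_id": 6439},
        {"title": "Anime B", "season_id": 1234},
        {"title": "Anime C", "season_id": 5678},
    ]
})


def parse_timeline(body, num=0):
    """Return (title, detail_url) pairs; num <= 0 means no limit."""
    items = json.loads(body).get("result") or []
    if num > 0:
        items = items[:num]
    return [(i["title"], DETAIL_URL.format(season_id=i["season_id"])) for i in items]


for title, url in parse_timeline(SAMPLE_BODY, num=2):
    print(title, url)
```

Keeping the parsing separate from the network call makes it easy to see which step fails when the crawler returns nothing.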
Run the crawler and it works as expected. Next, turn it into a command-line tool.
2. Make a command-line version
Libraries used:
argparse: parses command-line arguments
The main code is as follows:
import argparse

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--recent', help="get the recent info", action="store_true")
    parser.add_argument('--num', help="The number of results returned, default show all", type=int, default=0)
    parser.add_argument('-v', '--version', action="store_true")
    args = parser.parse_args()
    if args.version:
        print("bilibili 1.0")
    elif args.recent:
        b = bilibili()
        # Print each result as name--play count--introduction.
        for video in b.get_recent(args.num):
            print(video)
Take a look at the result:
OK, it's done. From here, feel free to add more features.
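When adding features, the argument handling can be checked without touching the network by passing an explicit argument list to parse_args(). This is a minimal sketch. The build_parser function and the prog name are assumptions made for the example, while the three options mirror the ones used in this article.

```python
import argparse


def build_parser():
    """Build the parser with the three options used in this article."""
    parser = argparse.ArgumentParser(prog="bilibili")
    parser.add_argument("--recent", help="get the recent info", action="store_true")
    parser.add_argument("--num", help="number of results returned, default show all",
                        type=int, default=0)
    parser.add_argument("-v", "--version", action="store_true")
    return parser


# Passing a list to parse_args() bypasses sys.argv, which is handy for tests.
args = build_parser().parse_args(["--recent", "--num", "3"])
print(args.recent, args.num, args.version)
```

argparse also provides a built-in action="version" that prints a version string and exits, which could replace the manual args.version check.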
That concludes this walkthrough of how to use Python to crawl bilibili's anime update information. I hope it has answered your questions. Theory is best learned together with practice, so go and try it! To learn more on related topics, keep following the site, where the editor will continue to publish practical articles.