Updated 2025-01-16. From: SLTechnology News&Howtos (Shulou.com), Internet Technology
This article explains how to crawl bilibili's mini videos with Python. The content is detailed and the logic is clear. Many readers are probably not yet familiar with this, so the article is shared here for reference. I hope you get something out of it.
Bilibili's mini video ranking page:
http://vc.bilibili.com/p/eden/rank#/?tab=all
I crawled the daily mini video ranking. Once the daily ranking works, crawling the weekly and monthly rankings is easy: just change the tag, which is covered in the detailed analysis later.
[Figure: crawl result]
Project environment
Language: Python 3
Tool: PyCharm
Program structure
The program consists of three parts:
get_json(): extracts the JSON data of the target web page.
downloader(): downloads a short video and shows the download progress.
Main function: downloads videos in a loop until everything is downloaded.
Code analysis
Watching the request parameters while the ranking page loads more entries shows that only the next_offset field changes, increasing by 10 with each request.
That makes things easy: we pull the parameters out on their own, pass the changing next_offset in as a variable, and return the JSON data of the target page.
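A minimal sketch of that idea follows, using only the standard library. The endpoint URL, the page_size and tag parameter names, and the build_params() and build_url() helpers are assumptions made for illustration; only next_offset and the name get_json() come from the article. Copy the real URL and parameters from the request your browser sends.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint: replace it with the request URL seen in the
# browser's developer tools while the ranking page loads more entries.
API_URL = "http://example.com/ranking/top"
PAGE_SIZE = 10


def build_params(next_offset, tag="daily"):
    """Fixed query parameters plus next_offset, the one value that changes.

    The parameter names page_size and tag (and the value "daily") are
    assumptions; only next_offset is named in the article.
    """
    return {"page_size": PAGE_SIZE, "next_offset": next_offset, "tag": tag}


def build_url(next_offset, tag="daily"):
    """Full request URL for one page of the ranking."""
    return API_URL + "?" + urllib.parse.urlencode(build_params(next_offset, tag))


def get_json(next_offset, headers=None):
    """Request one page of the ranking and return the decoded JSON."""
    request = urllib.request.Request(build_url(next_offset), headers=headers or {})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping the parameter dictionary in its own function means that switching to the weekly or monthly ranking should only require a different tag value.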
Next comes downloading the short videos. To make the output look nicer, I wrote a downloader that shows the download speed.
[Figure: downloader output showing download progress]
One thing to note: when you request the target page, you must send the headers for this page. The website has anti-crawling measures, and without the headers the downloaded video is empty. (PS: when you run the code, change the headers to the ones your own browser sends for this page.)
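The downloader could look roughly like the sketch below. It is not the article's original code: the header names (User-Agent and Referer), their placeholder values, the chunk size, and the progress_line() helper are all assumptions, and the standard library is used so the example has no third-party dependencies.

```python
import time
import urllib.request

# Placeholder header values (assumed names): paste in the headers your own
# browser sends for this page, as the article advises.
HEADERS = {
    "User-Agent": "<your browser's User-Agent>",
    "Referer": "<the ranking page URL>",
}
CHUNK_SIZE = 1024  # bytes read per iteration


def progress_line(done, total, elapsed):
    """Format percent complete and average speed in KB/s."""
    percent = 100.0 * done / total if total else 0.0
    speed = done / 1024 / elapsed if elapsed > 0 else 0.0
    return "{:6.2f}%  {:8.1f} KB/s".format(percent, speed)


def downloader(url, path, headers=HEADERS):
    """Stream url into path chunk by chunk, printing progress on one line."""
    request = urllib.request.Request(url, headers=headers)
    start = time.time()
    done = 0
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        # Content-Length may be missing; progress then shows 0.00%.
        total = int(response.headers.get("Content-Length", 0))
        while True:
            chunk = response.read(CHUNK_SIZE)
            if not chunk:
                break
            out.write(chunk)
            done += len(chunk)
            print("\r" + progress_line(done, total, time.time() - start), end="")
    print()
    return done
```

Reading in fixed-size chunks keeps memory use small and gives a natural point to update the progress display.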
To extract more videos, the main function loops over next_offset and then pulls each video's title and download link out of the JSON data. A look at the JSON structure of the page shows that the video title and the download link are both easy to get.
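As an illustration of that step, the sketch below generates successive next_offset values and pulls (title, link) pairs out of one page of JSON. The starting offset of 0, the nesting data -> items -> item, and the field names description and video_playurl are assumptions, not confirmed by the article; inspect the real response and adjust them.

```python
def offsets(pages, step=10, start=0):
    """next_offset values for successive requests, growing by step each time.

    The starting value is an assumption; the article only states that the
    field grows by 10 per request.
    """
    return [start + page * step for page in range(pages)]


def extract_videos(page):
    """Return (title, download_url) pairs from one page of ranking JSON.

    The key names used here are hypothetical. Missing keys yield None
    instead of raising, so the caller can decide how to handle them.
    """
    videos = []
    for entry in page.get("data", {}).get("items", []):
        item = entry.get("item", {})
        videos.append((item.get("description"), item.get("video_playurl")))
    return videos


# Usage with a hand-written sample in the assumed shape.
sample = {
    "data": {
        "items": [
            {"item": {"description": "first clip", "video_playurl": "http://example.com/1.mp4"}},
            {"item": {"description": "no link here"}},
        ]
    }
}
print(offsets(3))
print(extract_videos(sample))
```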
Because some videos do not provide a download link, I added exception handling. Careful readers will have noticed that the result screenshot earlier in the article shows only 84 videos; this is the reason. Finally, to keep the IP from being blocked, a random wait time is set between downloads. Even so, 100 videos can be downloaded in less than 5 minutes overall.
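The exception handling and random wait can be sketched as follows. The function name download_all(), the 1 to 3 second wait range, and the numbered file names are assumptions for illustration; the download argument stands in for whatever downloader function you use.

```python
import random
import time


def download_all(videos, download, min_wait=1.0, max_wait=3.0, sleep=time.sleep):
    """Try each (title, url) pair, skip failures, and pause between items.

    download is any callable taking (url, path). The wait range is an
    assumed value; the article only says the wait is random.
    """
    saved = 0
    for title, url in videos:
        try:
            if not url:
                raise ValueError("no download link")
            download(url, "{}.mp4".format(saved + 1))
            saved += 1
        except Exception as error:
            # One bad entry should not stop the whole run.
            print("skipped {!r}: {}".format(title, error))
        # Random pause so requests do not arrive at a fixed rhythm.
        sleep(random.uniform(min_wait, max_wait))
    return saved
```

Skipped entries are why the number of saved files can be lower than the number of ranking entries requested.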
That is everything in the article "how to crawl bilibili's mini videos with Python". Thank you for reading!