2025-01-30 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains in detail how to use Python to download the videos of a Douyin "big V" (a verified, high-profile account). The method is practical, so it is shared here for reference. I hope you get something out of it.
Approach
To download videos in bulk, first get a single download working and make sure the result has no watermark, then wrap that step in a loop for batch downloading.
Difficulty: downloading one video is fairly simple, but downloading many is more complicated, because you need to collect the URL of every video. Douyin applies an anti-crawling measure: the video list on a personal home page is visible only on a phone, not on a computer. Getting the list therefore requires capturing the phone's HTTPS traffic, which is done here with BurpSuite.
The video information contains a play_addr field, which in turn contains a url_list. Copy url_list[0] and open it in a browser. The site redirects to the real playback address, where a download button is visible.
Downloading this video shows that it is watermarked. How can the video be downloaded without the watermark? According to a search online, the method is to change playwm in url_list[0] to play.
Then write the code: get url_list[0], rewrite it, and download:
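The rewrite described above can be sketched in a few lines. This is only an illustration: the `sample` dictionary is a made-up item shaped like the play_addr structure, the example.com address is a placeholder, and the function name `unwatermarked_url` is not from the original article.

```python
def unwatermarked_url(item: dict) -> str:
    """Return the first playback URL of an item with 'playwm' swapped for 'play'."""
    first_url = item["video"]["play_addr"]["url_list"][0]
    # Replace only the first occurrence, which is the path segment.
    return first_url.replace("playwm", "play", 1)


# Hypothetical item for illustration; a real one comes from the video info API.
sample = {
    "video": {
        "play_addr": {
            "url_list": ["https://example.com/aweme/v1/playwm/?video_id=abc123"]
        }
    }
}
print(unwatermarked_url(sample))
```

Whether the rewritten address still serves an unwatermarked file depends on the site, so it is worth opening the result in a browser once before building on it.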
import re

import requests

import downloader  # assumed importable: the helper from the earlier Zhihu video article (not shown)


def get(share_url) -> list:
    """
    share_url -> Douyin video sharing url
    returns format [{'url': '', 'title': '', 'format': ''}, {}]
    """
    data = []
    headers = {
        'accept': 'application/json',
        'user-agent': 'Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 Mobile/15A372 Safari/604.1',
    }
    api = "https://www.iesdouyin.com/web/api/v2/aweme/iteminfo/?item_ids={item_id}"
    # rep.url is the final URL after any redirects; it carries the numeric video id
    rep = requests.get(share_url, headers=headers, timeout=10)
    if rep.ok:
        # item_id
        item_id = re.findall(r'video/(\d+)', rep.url)
        if item_id:
            item_id = item_id[0]
            # video info
            rep = requests.get(api.format(item_id=item_id), headers=headers, timeout=10)
            if rep.ok and rep.json()["status_code"] == 0:
                info = rep.json()["item_list"][0]
                tmp = {}
                tmp["title"] = info["desc"]
                # unwatermarked video link
                play_url = info["video"]["play_addr"]["url_list"][0].replace('playwm', 'play')
                tmp["url"] = play_url
                tmp["format"] = 'mp4'
                data.append(tmp)
    return data


if __name__ == '__main__':
    videos = get('https://www.iesdouyin.com/share/video/6920538027345415431/?region=&mid=6920538030852885262&u_code=48&titleType=title&did=0&iid=0')
    for video in videos:
        downloader.download(video['url'], video['title'], video['format'], './download')
The downloader.download function used here is the same one as in the earlier article on downloading Zhihu videos, so its code is not repeated.
Getting the video links from a personal home page
The steps so far give an unwatermarked download of a single Douyin video. All that remains is to find a large number of such links and loop over them.
Open a big V's home page, tap share, copy the link, and open it in a browser: not a single video is shown, although the videos are visible in the Douyin app.
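Since the downloader code is not shown, here is a minimal sketch of what a function with the same call shape (url, title, format, directory) might look like. It is an assumption, not the author's implementation: it uses the standard library's urllib instead of requests, and the `safe_filename` helper and `MOBILE_UA` constant are names introduced for illustration.

```python
import os
import re
import shutil
import urllib.request

# Same mobile Safari user agent as in the article's request headers.
MOBILE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 11_0 like Mac OS X) "
    "AppleWebKit/604.1.38 (KHTML, like Gecko) Version/11.0 "
    "Mobile/15A372 Safari/604.1"
)


def safe_filename(title: str, ext: str) -> str:
    """Turn a video title into a file name without path or reserved characters."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "_", title).strip("_")
    return f"{cleaned or 'untitled'}.{ext}"


def download(url: str, title: str, ext: str, save_dir: str) -> str:
    """Stream url into save_dir/<title>.<ext> and return the path written."""
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, safe_filename(title, ext))
    request = urllib.request.Request(url, headers={"User-Agent": MOBILE_UA})
    # copyfileobj reads in chunks, so a large video is never held fully in memory.
    with urllib.request.urlopen(request, timeout=30) as response, open(path, "wb") as out:
        shutil.copyfileobj(response, out)
    return path
```

Video titles often contain spaces, emoji, or slashes, which is why the title is cleaned before it is used as a file name.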
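The loop itself could look roughly like the sketch below. It assumes a `get` function that returns a list of dictionaries with url, title, and format keys, and a `download` function that takes (url, title, format, directory), as in the single-video code. Both are passed in as arguments, and the name `download_all` is made up for this example.

```python
from typing import Callable, Iterable


def download_all(
    share_urls: Iterable[str],
    get: Callable[[str], list],
    download: Callable[[str, str, str, str], object],
    save_dir: str = "./download",
) -> list:
    """Resolve and download every share URL; return the share URLs that failed."""
    failed = []
    for share_url in share_urls:
        try:
            videos = get(share_url)
            if not videos:
                # An empty list means the lookup did not succeed.
                failed.append(share_url)
                continue
            for video in videos:
                download(video["url"], video["title"], video["format"], save_dir)
        except Exception as exc:
            # Keep going so that one bad link does not stop the whole batch.
            print(f"skipped {share_url}: {exc}")
            failed.append(share_url)
    return failed
```

Returning the failed links makes it easy to retry them later instead of rerunning the entire batch.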
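1. Set up the BurpSuite proxy
When setting the proxy IP in BurpSuite, be careful not to use 127.0.0.1. With that address only local requests can use the proxy, and the phone cannot connect to it. Use the computer's address on the local network instead.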
2. Set up the proxy on the phone
Connect the phone and the computer to the same Wi-Fi network. On an iPhone, go to Settings -> WLAN, tap the information symbol to the right of that network, scroll down, tap Configure Proxy, and enter the same IP and port as in BurpSuite. The settings on Android phones are similar. At this point BurpSuite can capture the phone's HTTP traffic.
3. Download the Burp certificate on the phone and trust it
In the phone's browser, go to http://burp.
Tap CA to download the certificate.
Settings -> General -> Profiles -> tap PortSwigger CA -> Install
Settings -> General -> About -> Certificate Trust Settings, then enable trust for the BurpSuite certificate
Now you can capture the HTTPS traffic sent from the phone.
4. Turn on interception in BurpSuite
Then open the Repeater tab of BurpSuite, where you can see the request that was just sent. Replay it, look at the returned data, and decide which interface you need.
This API turns out to meet the need. The capture shows the interface's URL and its headers. The User-Agent header is an important marker that distinguishes a browser client from the app, so you can write code that simulates the request and obtains the links needed for batch downloading.
The URL has many parameters: some are fixed, and some change with each person's home page. If the script is only for your own use, you can simply extract the video URLs from the captured data with regular expressions and then download them in bulk.
If you want to write a script for others to use, more work is needed. You have to examine more APIs to determine how the parameters in the URL and headers are obtained or generated, and then write a script that automates the process. In some cases this involves anti-crawling measures such as encryption and obfuscation.
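As a rough sketch of simulating the request, the headers copied from BurpSuite can be replayed from Python. Everything concrete below is a placeholder: the example.com URL, the header values, and the function name `build_app_request` are assumptions for illustration, and the real values have to come from your own capture.

```python
import urllib.request


def build_app_request(url: str, captured_headers: dict) -> urllib.request.Request:
    """Build a request that replays headers copied from a captured app request."""
    # Skip headers the HTTP client manages itself. Accept-Encoding is skipped
    # because urllib does not decompress gzip responses automatically.
    skip = {"host", "content-length", "connection", "accept-encoding"}
    headers = {
        name: value
        for name, value in captured_headers.items()
        if name.lower() not in skip
    }
    return urllib.request.Request(url, headers=headers)


# Placeholder values; paste the real ones from the captured request.
captured = {
    "Host": "example.com",
    "User-Agent": "ExampleApp/1.0 (iPhone; iOS 14.0)",
    "Accept-Encoding": "gzip",
}
request = build_app_request("https://example.com/api/list?user_id=123", captured)
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would send it. The point here is only that the User-Agent and the other captured headers travel with the request, so the server sees it as coming from the app rather than a browser.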
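For the personal-use route, a sketch of the regular-expression extraction might look like this. It assumes the response body was saved from BurpSuite as text and that playback addresses contain /play or /playwm, as in the single-video example. The sample string and the name `extract_video_urls` are invented for illustration.

```python
import re

# Matches an http(s) URL up to the closing quote of a JSON string value.
URL_PATTERN = re.compile(r'https?://[^"\s]+')


def extract_video_urls(response_text: str) -> list:
    """Return unique playback URLs found in a saved response, without 'playwm'."""
    # JSON bodies may escape slashes as \/ ; undo that before matching.
    text = response_text.replace("\\/", "/")
    urls = []
    for url in URL_PATTERN.findall(text):
        if "/play" not in url:
            continue  # skip covers, avatars and other non-video links
        url = url.replace("playwm", "play", 1)
        if url not in urls:
            urls.append(url)
    return urls


# Made-up response fragment for illustration only.
sample_text = r'{"url_list":["https:\/\/example.com\/aweme\/v1\/playwm\/?video_id=a1","https://example.com/aweme/v1/play/?video_id=a1"],"cover":"https://example.com/img/a1.jpg","next":"https://example.com/aweme/v1/playwm/?video_id=b2"}'
print(extract_video_urls(sample_text))
```

If the saved body is valid JSON, parsing it with the json module and walking the fields is more robust. The regular expression is the quick option the text describes.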
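One way to start on that analysis is to compare the same interface captured for two different home pages and see which query parameters stay fixed. The sketch below does only that comparison. The URLs and parameter names are placeholders, and `compare_params` is a name made up for this example.

```python
from urllib.parse import parse_qsl, urlsplit


def compare_params(url_a: str, url_b: str) -> dict:
    """Split the query parameters of two captured URLs into fixed and varying."""
    params_a = dict(parse_qsl(urlsplit(url_a).query, keep_blank_values=True))
    params_b = dict(parse_qsl(urlsplit(url_b).query, keep_blank_values=True))
    result = {"fixed": {}, "varying": {}}
    for name in sorted(set(params_a) | set(params_b)):
        value_a, value_b = params_a.get(name), params_b.get(name)
        if value_a == value_b:
            result["fixed"][name] = value_a
        else:
            # None means the parameter was absent from that capture.
            result["varying"][name] = (value_a, value_b)
    return result


# Placeholder captures; real ones come from the BurpSuite history.
first = "https://example.com/api/list?user_id=1&count=21&sig=aaa"
second = "https://example.com/api/list?user_id=2&count=21&sig=bbb&extra="
print(compare_params(first, second))
```

Parameters that vary between captures are the ones whose origin needs to be traced. A value that changes on every request even for the same page is a hint that it is generated or signed on the client.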
Final words
The key to crawling a video is finding its playback address. With the playback address, you can download the video in a browser even without writing any code. Finding the playback address is not enough on its own, though: also consider whether the watermark can be removed. To download in bulk, you need to know how to get more video links. When the browser cannot capture them, consider using BurpSuite to capture the phone's traffic and extract the interface data, or simulate the phone's requests. For anyone who works on crawlers, BurpSuite is a Swiss Army knife and very practical.
This concludes the article on how to use Python to download Douyin big V videos. I hope it is of some help and that you learned something from it. If you think the article is good, please share it so that more people can see it.