How to scrape material download links from Aitu.com with Python

2025-04-05 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Many people who are new to scraping do not know how to collect the material download links on Aitu.com with Python. This article summarizes the cause of the problem and a solution, and I hope it helps you solve it.

Preface

Scrapers usually download images in bulk, but sometimes you only want a few individual images. What should you do then?

Project goal

Scrape the download addresses of the material on Aitu.com.

Clicking a material opens its details page, where you can see the local download address. Copying the download links of a few more materials gives:

http://www.aiimg.com/sucai.php?open=1&aid=126632&uhash=70a6d2ffc358f79d9cf71392
http://www.aiimg.com/sucai.php?open=1&aid=126630&uhash=99b07c347dc24533ccc1c144
http://www.aiimg.com/sucai.php?open=1&aid=126634&uhash=d7e8f7f02f57568e280190b4

The aid differs in each link, so it is presumably the ID of each material. But what is the uhash that follows it?
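To compare the links field by field, the query string can be split into its parameters. The following is a minimal sketch using only the standard library; the helper name link_params is an assumption made for illustration and is not part of the original article.

```python
from urllib.parse import urlsplit, parse_qs


def link_params(link):
    """Return the query parameters of a download link as a flat dict.

    link_params is a hypothetical helper, used here only for illustration.
    """
    query = urlsplit(link).query
    # parse_qs maps each key to a list of values; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


sample = 'http://www.aiimg.com/sucai.php?open=1&aid=126632&uhash=70a6d2ffc358f79d9cf71392'
params = link_params(sample)
print(params['aid'], params['uhash'])
```

Running this on the three links above makes it easy to see that open stays the same while aid and uhash change together.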

I first wondered whether some interface data in the page exposed this parameter directly, but searching for it in the developer tools found nothing. The next step was to check whether the download link itself appears in the page source.

If the link is there, we can extract it and download the file directly.
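One way to check whether a page's source contains such a link is a regular-expression search over the HTML text. This is only a sketch: the pattern is derived from the three example links above, and sample_html is a made-up fragment (its class names are borrowed from the selectors used later in this article), not the site's real markup.

```python
import re

# Matches absolute or relative links of the form sucai.php?...uhash=<hex digits>.
# The pattern is an assumption based on the example links in this article.
DOWNLOAD_LINK = re.compile(r'[\w:/.]*sucai\.php\?[\w=&;]*uhash=[0-9a-f]+')


def find_download_links(html):
    """Return every download-style link found in the HTML text."""
    return DOWNLOAD_LINK.findall(html)


# Hypothetical fragment for illustration only.
sample_html = (
    '<div class="downlist"><a class="down1" '
    'href="/sucai.php?open=1&aid=126632&uhash=70a6d2ffc358f79d9cf71392">'
    'download</a></div>'
)
print(find_download_links(sample_html))
```

An empty list would mean the link is not in the source and has to come from somewhere else, for example a separate request made by the page.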

It turns out that all the data we need is in the HTML tags of the page, so we request the page and work with the returned data.

    import requests

    url = 'http://www.aiimg.com/list.php?tid=1&ext=0&free=2&TotalResult=5853&PageNo=1'
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'
    }
    response = requests.get(url=url, headers=headers)
    print(response.text)

Parsing the crawled data

    import parsel

    selector = parsel.Selector(response.text)
    # Collect the href of every material link on the list page.
    lis = selector.css('.imglist_d ul li a::attr(href)').getall()
    for li in lis:
        # The file name without '.html' is the material ID (aid).
        num_id = li.replace('.html', '').split('/')[-1]
        new_url = 'http://www.aiimg.com/sucai.php?aid={}'.format(num_id)
        response_2 = requests.get(url=new_url, headers=headers)
        selector_2 = parsel.Selector(response_2.text)
        data_url = selector_2.css('.downlist a.down1::attr(href)').get()
        title = selector_2.css('.toart a::text').get()
        # Skip pages where no download link was found.
        if data_url is None:
            continue
        download_url = 'http://www.aiimg.com' + data_url
        print(title, download_url)
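The URL handling in the loop above can also be written with the standard library, which avoids manual string replacement and concatenation. The sketch below makes several assumptions: the helper names are hypothetical, '/photo/126632.html' is an invented href shape, and the href read from the details page is assumed to be either site-relative or already absolute.

```python
import posixpath
from urllib.parse import urljoin, urlsplit

BASE_URL = 'http://www.aiimg.com'


def detail_url(href):
    """Build the sucai.php URL from a list-page href.

    The href shape (for example '/photo/126632.html') is an assumption.
    """
    filename = posixpath.basename(urlsplit(href).path)  # e.g. '126632.html'
    num_id, _ext = posixpath.splitext(filename)         # e.g. '126632'
    return '{}/sucai.php?aid={}'.format(BASE_URL, num_id)


def absolute_download_url(data_url):
    """Resolve the href read from the details page; None if it is missing."""
    if not data_url:
        return None
    # urljoin keeps absolute URLs as they are and resolves relative ones.
    return urljoin(BASE_URL, data_url)


print(detail_url('/photo/126632.html'))
print(absolute_download_url('/sucai.php?open=1&aid=126632&uhash=70a6d2ffc358f79d9cf71392'))
```

Returning None for a missing href keeps the caller from raising a TypeError when a details page has no download link.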

That covers how to collect the download links of Aitu.com material with Python: request the list page, open the details page of each material by its aid, and read the download link from the page source. Thank you for reading!
