
How to use python to crawl a group of pictures of little sisters

2025-04-04 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/02 Report--

This article explains how to use Python to crawl a set of pictures of "little sisters". The method introduced here is simple, fast, and practical. If you are interested, read on and follow along.

Import the libraries

import time
import requests
from lxml import etree

These three libraries let us make HTTP requests, parse the returned page source, and pause between requests so that we do not hammer (or get blocked by) the target website.

Web page analysis

Use the browser's developer tools to analyze the page and find the url of each image cover we need.
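If you want to experiment with this "collect every link's href" step without hitting a live website, here is a minimal sketch using only the standard library's html.parser. The article itself uses lxml XPath for this; the sample markup and example.com URLs below are made up purely for illustration:

```python
from html.parser import HTMLParser

# Toy page mimicking a gallery: each cover is an <a href=...> link
SAMPLE = """
<div id="features">
  <a href="https://example.com/album/1">cover 1</a>
  <a href="https://example.com/album/2">cover 2</a>
</div>
"""

class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag seen while parsing."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href":
                    self.hrefs.append(value)

p = HrefCollector()
p.feed(SAMPLE)
print(p.hrefs)  # the two cover urls, in document order
```

In the real script, these collected urls play the role of the href list that the loop below iterates over.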

href = tree.xpath('//*[@id="features"]/div/div[1]/div/div[1]/a/@href')

We now have the cover urls, but those are not what we need: what we need are the pictures behind each hyperlink. Opening a cover link, we find that every picture sits inside the detail page, so we use a loop to get the url of each picture:

for url_img in href:
    img_url = requests.get(url_img, headers=head)
    # print(img_url.text)
    time.sleep(1)
    t = etree.HTML(img_url.text)
    url_list = t.xpath("/html/body/section/div/div/div[1]/div[2]/p[2]/img/@src")

The rest is simple: we just need to save each file to get the result we want.

with open(f"./img/{name}", mode="wb") as f:
    f.write(download_img.content)
    print("downloading: " + name)
time.sleep(1)
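Two small details worth checking before running the save step: the filename is derived by splitting the url at its last "/", and the ./img directory must exist or open() will fail. A short sketch (the url below is a hypothetical example, not from the site):

```python
import os

url = "https://example.com/uploads/photo_001.jpg"  # hypothetical image url
name = url.rsplit("/", 1)[1]  # keep everything after the last "/"
os.makedirs("./img", exist_ok=True)  # create the output folder if it is missing
print(name)
```

Adding the makedirs call to the script avoids a FileNotFoundError on the first run.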

Complete code

import time
import requests
from lxml import etree


def get_page_url():
    for i in range(1, 4):  # loop over 3 pages
        url = f"https://mm.tvv.tw/category/xinggan/{i}/"
        # request the page and get its source code
        res = requests.get(url, headers=head)
        # parse the source code
        tree = etree.HTML(res.text)
        # get the cover url (href) of each picture
        href = tree.xpath('//*[@id="features"]/div/div[1]/div/div[1]/a/@href')
        # print(href)
        time.sleep(3)
        for url_img in href:
            img_url = requests.get(url_img, headers=head)
            # print(img_url.text)
            time.sleep(1)
            t = etree.HTML(img_url.text)
            url_list = t.xpath("/html/body/section/div/div/div[1]/div[2]/p[2]/img/@src")
            # print(url_list)
            time.sleep(1)
            for url_src in url_list:
                get_img(url_src)


def get_img(url):
    name = url.rsplit("/", 1)[1]
    time.sleep(2)
    download_img = requests.get(url, headers=head)
    with open(f"./img/{name}", mode="wb") as f:
        f.write(download_img.content)
        print("downloading: " + name)
    time.sleep(1)


if __name__ == '__main__':
    head = {
        "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36"
    }
    get_page_url()

(Note that the file does not need an explicit f.close(): the with block closes it automatically.)

At this point, I believe you have a deeper understanding of how to use Python to crawl a set of pictures. Why not try it out in practice? Follow us to keep learning!
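The script above sleeps between requests to stay polite, but it does not retry failed requests. A common companion pattern is exponential backoff: retry a failed fetch after increasing delays. A minimal, library-agnostic sketch (fetch_with_backoff and the fetch callable are hypothetical names, not part of the original script; fetch could be, e.g., lambda u: requests.get(u, headers=head)):

```python
import time

def fetch_with_backoff(fetch, url, retries=3, base_delay=1.0):
    """Call fetch(url), retrying with exponentially growing delays on failure.

    Raises the last exception if all attempts fail.
    """
    for attempt in range(retries):
        try:
            return fetch(url)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: propagate the error
            time.sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
```

Wrapping each requests.get call this way makes the crawler more robust against transient network errors without changing its overall structure.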
