2025-09-21 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to use Python to crawl a set of pictures from a website. The method introduced here is simple, fast, and practical, so let's walk through it step by step.
Import the library
import time
import requests
from lxml import etree

These three libraries let us pause between requests (so we neither get blocked nor overload the target site), send the HTTP requests themselves, and parse the page source that comes back.

Web page analysis
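To make the "take a break" idea concrete, here is a minimal sketch with no network access; the one-second delay is an arbitrary choice, and in the real crawler the sleep runs after each requests.get(...) call:

```python
import time

DELAY = 1  # seconds to wait between requests (arbitrary choice)

start = time.monotonic()
time.sleep(DELAY)  # in the crawler, this would follow each requests.get(...)
elapsed = time.monotonic() - start
print(f"waited at least {DELAY}s: {elapsed >= DELAY}")
```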
Use the browser's developer tools to analyze the page and find the URL of each image cover we need.
href = tree.xpath('//*[@id="features"]/div/div[1]/div/div[1]/a/@href')
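As a self-contained illustration of this step, the snippet below runs the same XPath against a made-up HTML fragment (a stand-in for the real page, not the site's actual markup). The trailing /@href is what makes xpath return the attribute values as plain strings:

```python
from lxml import etree

# A made-up snippet mimicking the page's nesting (not the real site's HTML).
html = """
<div id="features">
  <div>
    <div>
      <div><div><a href="https://example.com/album/1">cover 1</a></div></div>
      <div><div><a href="https://example.com/album/2">cover 2</a></div></div>
    </div>
  </div>
</div>
"""
tree = etree.HTML(html)
# /@href at the end returns the attribute values themselves, not elements.
href = tree.xpath('//*[@id="features"]/div/div[1]/div/div[1]/a/@href')
print(href)  # ['https://example.com/album/1', 'https://example.com/album/2']
```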
We now have the cover URLs, but those are not what we ultimately need; the actual pictures sit behind each hyperlink. Opening a cover page confirms that every picture lives inside it, so we use a loop to fetch each picture's URL:
for url_img in href:
    img_url = requests.get(url_img, headers=head)
    # print(img_url.text)
    time.sleep(1)
    t = etree.HTML(img_url.text)
    url_list = t.xpath("/html/body/section/div/div/div[1]/div[2]/p[2]/img/@src")
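One caveat: this loop assumes the hrefs scraped from the listing page are already absolute URLs. If a site returns relative links instead, they must be joined with the page URL first. A sketch using the standard library's urljoin (the paths here are hypothetical examples):

```python
from urllib.parse import urljoin

# Hypothetical base page and links; the real site's hrefs may already be absolute.
base = "https://mm.tvv.tw/category/xinggan/1/"
abs_url = urljoin(base, "/photo/123.html")                    # root-relative -> absolute
passthrough = urljoin(base, "https://cdn.example.com/a.jpg")  # absolute passes through
print(abs_url)      # https://mm.tvv.tw/photo/123.html
print(passthrough)  # https://cdn.example.com/a.jpg
```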
The rest is straightforward: we just save each file to disk to get the result we want.
with open(f"./img/{name}", mode="wb") as f:
    f.write(download_img.content)
print("downloading: " + name)
time.sleep(1)
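Two details worth checking before running this: the filename is derived by splitting the image URL at its last slash, and the ./img/ directory must already exist or open(..., mode="wb") raises FileNotFoundError. A small sketch (the URL is a hypothetical example):

```python
import os

url = "https://cdn.example.com/pics/girl_01.jpg"  # hypothetical image URL
name = url.rsplit("/", 1)[1]  # everything after the last slash becomes the filename
print(name)  # girl_01.jpg

# Create the output directory up front so the open(..., "wb") call cannot fail
# with FileNotFoundError; exist_ok=True makes this safe to run repeatedly.
os.makedirs("./img", exist_ok=True)
```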
Complete code

import time
import requests
from lxml import etree

def get_page_url():
    for i in range(1, 4):  # loop over 3 pages
        url = f"https://mm.tvv.tw/category/xinggan/{i}/"
        # request the page to get its source code
        res = requests.get(url, headers=head)
        # parse the source code
        tree = etree.HTML(res.text)
        # get the cover url (href) of each picture
        href = tree.xpath('//*[@id="features"]/div/div[1]/div/div[1]/a/@href')
        # print("----")
        time.sleep(3)
        for url_img in href:
            img_url = requests.get(url_img, headers=head)
            # print(img_url.text)
            time.sleep(1)
            t = etree.HTML(img_url.text)
            url_list = t.xpath("/html/body/section/div/div/div[1]/div[2]/p[2]/img/@src")
            # print(url_list)
            time.sleep(1)
            for url_src in url_list:
                get_img(url_src)

def get_img(url):
    name = url.rsplit("/", 1)[1]
    time.sleep(2)
    download_img = requests.get(url, headers=head)
    with open(f"./img/{name}", mode="wb") as f:
        f.write(download_img.content)
    print("downloading: " + name)
    time.sleep(1)

if __name__ == '__main__':
    head = {"user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/97.0.4692.71 Safari/537.36"}
    get_page_url()

At this point, I believe you have a deeper understanding of how to use Python to crawl a set of pictures. Why not try it out in practice? Follow us to keep learning more related content!