In addition to Weibo, there is also WeChat
Please pay attention
WeChat public account
Shulou
2025-01-17 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >
Share
Shulou(Shulou.com)06/02 Report--
This article mainly explains "Python how to climb to find knowledge net material picture", the content of the article is simple and clear, easy to learn and understand, now please follow the editor's train of thought slowly in depth, together to study and learn "Python how to climb to find knowledge net material picture"!
[1. Project background]
If you want to find the right picture in the material network, you need to turn down one page after another. Now learn python and you can save all the pictures with the program and slowly select the right one.
[II. Project objectives]
1. Get the source code of the web page according to the given URL.
2. Use regular expressions to filter out the image addresses in the source code.
3. The filtered image address downloads the material picture.
[III. Libraries and websites involved]
1. The website is as follows:
Https://www.51miz.com/
2. Libraries involved: requests, lxml
[IV. Project Analysis]
First of all, you need to solve the problem of how to make a request for the URL on the next page. You can click the button on the next page and observe the changes in the website as follows:
Https://www.51miz.com/so-sucai/1789243.htmlhttps://www.51miz.com/so-sucai/1789243/p_2/https://www.51miz.com/so-sucai/1789243/p_3/
We can find that the number of pages of the picture is 1789243, and the number of curly braces in p {} indicates which page of the picture.
[v. Project implementation]
1. Open the search website and enter the picture material you want in the search (take the picture of the year of the Rat as an example).
2. According to the analysis of the URL in the previous step, we first define a class called ImageSpider, which defines the initialization function, the request to get the response data function, the parsing function and the main function. First initialize the function and prepare the url address and headers, as shown in the following figure.
3. Send a request to get the response data function.
4. Parse the data, use xpath to get the secondary page link, and finally store the picture in the folder. Select the developer tool using Google browser or press F12 directly, and find that the image src we need is under the img tag, so use Python's requests to extract the component.
5, the main function, the code is shown in the following figure.
[VI. Effect display]
1. Run the program and enter the number of pages you want to climb in the console, as shown in the following figure.
2. You can see the effect image locally, as shown in the following figure.
Thank you for your reading, the above is the content of "how Python crawls to find knowledge net material pictures". After the study of this article, I believe you have a deeper understanding of how Python crawls to find knowledge net material pictures, and the specific use needs to be verified in practice. Here is, the editor will push for you more related knowledge points of the article, welcome to follow!
Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.
Views: 0
*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.
Continue with the installation of the previous hadoop.First, install zookooper1. Decompress zookoope
"Every 5-10 years, there's a rare product, a really special, very unusual product that's the most un
© 2024 shulou.com SLNews company. All rights reserved.