Shulou (Shulou.com), SLTechnology News&Howtos. Updated 2025-01-17.
Today we will build a web page image crawler by combining several JMeter components.
Components used: Loop Controller + Counter + XPath Extractor + nested functions + BeanShell code
First, identify the photo site to crawl: https://dp.pconline.com.cn/list/all_t5.html
Inspecting the page's HTML with F12 shows that each picture is referenced by an src attribute reached through an href link.
So we can first send a request to the site and use an XPath expression to extract each picture's src, that is, the link used to fetch the picture.
The picture titles are then extracted with XPath as well.
Because each picture corresponds to exactly one title, the match numbers of the two extractions also correspond one to one.
In the debug output, we can see that 50 URLs and 50 titles are extracted, and that they correspond one to one.
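As a rough illustration of the two extractions described above, here is a minimal Java sketch using the JDK's built-in XPath support. The sample markup, the example.com image links, and the expressions //li/a/img/@src and //li/a/img/@alt are all assumptions made for the example; the real page structure has to be checked with F12 and the expressions adjusted to match.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathExtractSketch {
    // Hypothetical, well-formed stand-in for the list page; the real markup may differ.
    static final String SAMPLE =
        "<ul>"
      + "<li><a href='/photo/1.html'><img src='https://img.example.com/a.jpg' alt='Sunset'/></a></li>"
      + "<li><a href='/photo/2.html'><img src='https://img.example.com/b.jpg' alt='Harbor'/></a></li>"
      + "</ul>";

    // Evaluates an XPath expression and returns the matched node values in document order.
    static List<String> extract(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
            expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getNodeValue());
        }
        return values;
    }

    public static void main(String[] args) throws Exception {
        List<String> urls = extract(SAMPLE, "//li/a/img/@src");   // assumed expression
        List<String> titles = extract(SAMPLE, "//li/a/img/@alt"); // assumed expression
        System.out.println(urls.size() + " urls, " + titles.size() + " titles");
    }
}
```

Because both expressions walk the same img nodes in document order, the n-th URL and the n-th title belong to the same picture.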
Next, add a Loop Controller and set its loop count to the matchNr value shown in the debug output.
Add a Counter under the Loop Controller to count the crawl iterations.
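To make the loop and counter idea concrete: assuming the two XPath Extractors use the reference names url and title, and the Counter exports a variable named index (all three names are hypothetical), JMeter stores the matches as url_1 ... url_N plus url_matchNr, and a nested function such as ${__V(url_${index})} resolves the entry for the current iteration. The Java sketch below only simulates that lookup with a plain map; it is not JMeter API code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LoopCounterSketch {
    // Mimics how an extractor stores N matches: name_1 .. name_N plus name_matchNr.
    static void store(Map<String, String> vars, String name, List<String> matches) {
        vars.put(name + "_matchNr", String.valueOf(matches.size()));
        for (int i = 0; i < matches.size(); i++) {
            vars.put(name + "_" + (i + 1), matches.get(i));
        }
    }

    // Walks both variable families with one shared counter so URL and title stay paired.
    static List<String[]> pair(Map<String, String> vars) {
        int loops = Integer.parseInt(vars.get("url_matchNr")); // loop count = matchNr
        List<String[]> pairs = new ArrayList<>();
        for (int index = 1; index <= loops; index++) {          // counter: start 1, step 1
            String url = vars.get("url_" + index);     // same idea as ${__V(url_${index})}
            String title = vars.get("title_" + index); // same idea as ${__V(title_${index})}
            pairs.add(new String[] {url, title});
        }
        return pairs;
    }

    public static void main(String[] args) {
        Map<String, String> vars = new LinkedHashMap<>();
        store(vars, "url", List.of("https://img.example.com/a.jpg", "https://img.example.com/b.jpg"));
        store(vars, "title", List.of("Sunset", "Harbor"));
        for (String[] p : pair(vars)) {
            System.out.println(p[1] + " <- " + p[0]);
        }
    }
}
```

The point of the shared counter is that a single index drives both lookups, which is why the one-to-one correspondence between the two match lists matters.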
Add an HTTP request under the Loop Controller. On each iteration it requests one of the URLs seen in the debug output and, in step with it, picks up the matching picture title.
After each request, BeanShell code writes the returned image to a local file, using the title for that iteration as the file name.
Run the script, then check the response results and the files written locally.
Tip: this crawler script drops the earlier approach of crawling with the ForEach Controller. Instead, it uses nested functions to traverse several parameters in step and write the files. The hard parts are understanding how the nested-function traversal works and locating elements with XPath. I hope you will study it carefully, and feel free to discuss any questions with me.
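The file-writing step can be sketched in plain Java as follows. Inside a JMeter BeanShell element, the response bytes would typically come from prev.getResponseData() and the title from vars.get(...); here both are passed in as parameters so the example runs on its own. The .jpg extension and the safeName helper, which replaces characters that are not allowed in Windows file names, are assumptions added for the example and are not part of the original script.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ImageWriterSketch {
    // Hypothetical helper: replaces characters that are invalid in Windows file names.
    static String safeName(String title) {
        return title.replaceAll("[\\\\/:*?\"<>|]", "_").trim();
    }

    // Writes the response body to <dir>/<title>.jpg and returns the path written.
    static Path save(byte[] body, String title, Path dir) throws IOException {
        Files.createDirectories(dir);                        // make sure the target folder exists
        Path target = dir.resolve(safeName(title) + ".jpg"); // assumed extension
        return Files.write(target, body);                    // creates or overwrites the file
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("crawler-demo");
        byte[] fakeImage = {(byte) 0xFF, (byte) 0xD8, (byte) 0xFF}; // placeholder bytes, not a real picture
        Path written = save(fakeImage, "Sunset: harbor", dir);
        System.out.println("Wrote " + Files.size(written) + " bytes to " + written);
    }
}
```

If two pictures share a title, the later file overwrites the earlier one; appending the counter value to the file name is one way to avoid that.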