2025-02-24 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
How do you build your own proxy IP pool? This article walks through the analysis and a solution in detail, in the hope of giving anyone facing this problem a simpler, workable method.
I promised an article on setting up your own proxy IP pool, and to keep that promise, here it is. Let's get down to business.
one
Target website
To crawl proxy IPs, we first need a web page that lists them. I know of several websites that offer proxy IPs for free:
Worry-free proxy IP
Sesame proxy IP
West Thorn proxy IP
Cloud Link proxy IP
I chose to crawl the West Thorn proxy site.
two
Analyze the structure of the website
We want the high-anonymity proxies. Press F12 to open the developer tools.
The data we need are the IP address, port, and type. Each record sits inside a single tr tag, but the tr tags come in two different forms. We can therefore use regular expressions: first match the whole content of each row using the surrounding HTML structure, then pull out the important fields. Each result goes into a list in the form {'https': 'https://ip:port'}. Finally, pick an IP from the list at random and check that it works before using it as the proxy for your project. To check it, pick any website, such as Baidu, send a GET request through the proxy, and see whether the response's status_code is 200. If it is, the proxy is fine.
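A minimal sketch of that check, assuming the third-party requests library (its response object exposes status_code as an attribute, not a method). The name is_proxy_alive, the 5-second timeout, and the injectable get parameter are illustrative assumptions, not the author's original code.

```python
def is_proxy_alive(proxy, test_url="https://www.baidu.com", timeout=5, get=None):
    """Return True if a GET sent through `proxy` comes back with status 200.

    `proxy` uses the article's format, e.g. {'https': 'https://ip:port'}.
    `get` is a hypothetical hook: it defaults to requests.get, but any
    callable with the same signature can be passed in for offline testing.
    """
    if get is None:
        import requests  # third-party dependency: pip install requests
        get = requests.get
    try:
        response = get(test_url, proxies=proxy, timeout=timeout)
    except Exception:
        # Timeouts, refused connections, and malformed proxies all count as dead.
        return False
    return response.status_code == 200
```

Free proxies fail often, so a short timeout keeps a dead entry from stalling the crawler.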
three
Code part
1. Match the data and select the data to store in the list
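The two-stage match described earlier might look roughly like this. SAMPLE_HTML is a made-up stand-in for the table markup (the real page's attributes and column order may differ), and extract_proxies is a hypothetical helper name.

```python
import re

# Hypothetical sample: two row variants, as the article describes.
SAMPLE_HTML = """
<table>
<tr class="odd">
  <td>1.2.3.4</td>
  <td>8080</td>
  <td>HTTPS</td>
</tr>
<tr class="">
  <td>5.6.7.8</td>
  <td>3128</td>
  <td>HTTP</td>
</tr>
</table>
"""

# Stage 1: grab the full content of every row, whatever its attributes.
ROW_RE = re.compile(r"<tr[^>]*>(.*?)</tr>", re.S)
# Stage 2: pull the individual fields out of one row.
IP_RE = re.compile(r"<td>(\d{1,3}(?:\.\d{1,3}){3})</td>")
PORT_RE = re.compile(r"<td>(\d{1,5})</td>")
TYPE_RE = re.compile(r"<td>(HTTPS?)</td>", re.I)


def extract_proxies(html):
    """Return a list of (ip, port, type) tuples found in the table rows."""
    proxies = []
    for row in ROW_RE.findall(html):
        ip = IP_RE.search(row)
        port = PORT_RE.search(row)
        kind = TYPE_RE.search(row)
        # Header rows and incomplete rows lack one of the fields; skip them.
        if ip and port and kind:
            proxies.append((ip.group(1), port.group(1), kind.group(1).lower()))
    return proxies
```

Matching the row first and the fields second keeps each pattern short and means both tr variants are handled by the same code path.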
2. Get an IP at random and write it in the proxy format
I keep the proxies in a list and use them straight away, because my current crawler projects are very small and that is all I need.
The above is my simple proxy IP pool. As you refine it later, you can store the proxies in a database, take one out at random, check whether it works, delete it if it does not, and use it if it does.
That is how to build your own proxy IP pool. I hope the above is of some help to you.
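A short sketch of this step, assuming each stored entry is an (ip, port, type) tuple. The helper names to_proxy_dict and pick_random_proxy are illustrative, not taken from the original code.

```python
import random


def to_proxy_dict(ip, port, kind="https"):
    """Format one entry the way the article describes: {'https': 'https://ip:port'}."""
    return {kind: f"{kind}://{ip}:{port}"}


def pick_random_proxy(entries, rng=random):
    """Choose one (ip, port, type) entry at random and format it."""
    ip, port, kind = rng.choice(entries)
    return to_proxy_dict(ip, port, kind)
```

The returned dictionary is in the shape that HTTP clients such as requests accept for a proxies argument.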
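One possible shape for that database-backed version, sketched with the standard-library sqlite3 module. The article names no particular database, so the table layout and the function names open_pool, add_proxy, and take_working_proxy are all assumptions.

```python
import sqlite3


def open_pool(path=":memory:"):
    """Open (or create) the proxy table. ':memory:' keeps the sketch self-contained."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS proxies "
        "(ip TEXT, port TEXT, kind TEXT, UNIQUE(ip, port))"
    )
    return conn


def add_proxy(conn, ip, port, kind):
    """Store one proxy; duplicates of the same ip and port are ignored."""
    conn.execute("INSERT OR IGNORE INTO proxies VALUES (?, ?, ?)", (ip, port, kind))
    conn.commit()


def take_working_proxy(conn, is_alive):
    """Draw random proxies until one passes `is_alive`, deleting the ones that fail.

    `is_alive` is any callable that takes a proxy dict and returns True or False.
    Returns None once the pool is empty.
    """
    while True:
        row = conn.execute(
            "SELECT ip, port, kind FROM proxies ORDER BY RANDOM() LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        ip, port, kind = row
        proxy = {kind: f"{kind}://{ip}:{port}"}
        if is_alive(proxy):
            return proxy
        # Useless proxy: remove it so it is never drawn again.
        conn.execute("DELETE FROM proxies WHERE ip = ? AND port = ?", (ip, port))
        conn.commit()
```

Deleting on failure means the pool shrinks over time, so a real crawler would need to re-scrape the source page periodically to top it up.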
© 2024 shulou.com SLNews company. All rights reserved.