2025-03-14 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains the anti-crawling strategies commonly used by websites and the corresponding solutions. The explanations are short and easy to follow.
1. Limiting access frequency by user IP only
When a website limits access frequency by IP, the usual symptom is this: once collection from the local IP exceeds a certain rate, collection errors, page redirects, and similar failures appear. Some websites also store the visitor's IP information in a cookie, which makes crawling harder.
Solution:
(1) When the cookie contains no IP record, use dynamic short-term proxy IPs or a tunnel proxy. Adjust the collection speed to match how strictly the website limits IPs, purchase a suitable number of proxy IPs, and configure them as the proxy IPs of the ForeSpider crawler.
(2) When the cookie records the IP, use static long-term proxy IPs. Adjust the collection speed to match the website's IP limit, purchase a suitable number of proxy IPs, and configure them as the proxy IPs of the ForeSpider data acquisition system.
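The two adjustments described above, slowing the collection speed and spreading requests over several proxy IPs, can be sketched in a few lines of Python. This is only an illustrative sketch and not how ForeSpider works internally; the class name `ThrottledProxyPool`, the `min_interval` parameter, and the proxy addresses are assumptions made for the example.

```python
import itertools
import time


class ThrottledProxyPool:
    """Rotate through proxies while enforcing a minimum delay between requests.

    Hypothetical helper for illustration; not part of any crawler product.
    """

    def __init__(self, proxies, min_interval, clock=time.monotonic, sleep=time.sleep):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._min_interval = min_interval  # seconds between two requests
        self._clock = clock
        self._sleep = sleep
        self._last_request = None

    def next_proxy(self):
        """Wait until the minimum interval has elapsed, then return the next proxy."""
        now = self._clock()
        if self._last_request is not None:
            wait = self._min_interval - (now - self._last_request)
            if wait > 0:
                self._sleep(wait)
                now = self._clock()
        self._last_request = now
        return next(self._cycle)


if __name__ == "__main__":
    # Placeholder addresses from a documentation IP range, not real proxies.
    pool = ThrottledProxyPool(
        ["http://198.51.100.1:8080", "http://198.51.100.2:8080"],
        min_interval=1.0,
    )
    for _ in range(3):
        print(pool.next_proxy())
```

The value of `min_interval` has to be found by experiment for each website, since the threshold at which a site starts blocking is not published.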
2. Limiting access frequency by user ID only
When a website limits access frequency by user ID, the usual symptom is this: after collecting for a while, collection stops or returns errors, and the page can no longer be displayed in the browser (page redirect, CAPTCHA, error page, etc.). After the browser's browsing data is cleared, the page opens and displays normally again.
At this point, check the page's cookie to confirm whether the server is restricting by user ID. If the cookie sent when visiting the page contains a UID or another ID string, the server is identifying the user. The UID may also be encrypted, in which case the cookie contains an encrypted string instead.
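The cookie check can be partly automated. The sketch below uses Python's standard `http.cookies` module to flag cookie fields whose name looks like a user ID or whose value looks like an IPv4 address. The name patterns (`uid`, `userid`, `user_id`) and the function name `inspect_cookie` are assumptions; real sites use arbitrary names, and an encrypted UID cannot be recognized this way and still has to be judged by eye.

```python
import re
from http.cookies import SimpleCookie

# Heuristic patterns; these are assumptions, adjust them for the target site.
ID_NAME = re.compile(r"uid|user_?id", re.IGNORECASE)
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def inspect_cookie(header):
    """Return cookie names that look like user IDs or that carry an IP address."""
    cookie = SimpleCookie()
    cookie.load(header)
    findings = {"id_like": [], "ip_like": []}
    for name, morsel in cookie.items():
        if ID_NAME.search(name):
            findings["id_like"].append(name)
        if IPV4.search(morsel.value):
            findings["ip_like"].append(name)
    return findings


if __name__ == "__main__":
    # Example Cookie header copied from the browser's developer tools.
    print(inspect_cookie("UID=abc123; client_ip=203.0.113.7; theme=dark"))
```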
Solution: use the multi-channel collection function in the advanced settings of the ForeSpider collector, set the maximum number of logged-in users, and set proxy IPs (static long-term proxy IPs). Simulating several users browsing the website works around the website's ID limit.
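The idea behind multi-channel collection, several simulated users that each keep their own cookies and their own fixed proxy IP, can be illustrated with Python's standard library. This is a generic sketch under assumptions, not ForeSpider's implementation; `Channel`, `build_channels`, and the proxy addresses are hypothetical names chosen for the example.

```python
import urllib.request
from dataclasses import dataclass
from http.cookiejar import CookieJar


@dataclass
class Channel:
    """One simulated user: a fixed proxy plus a private cookie jar."""
    proxy: str
    jar: CookieJar
    opener: urllib.request.OpenerDirector


def build_channels(proxies):
    """Create one independent channel per static proxy address."""
    channels = []
    for proxy in proxies:
        jar = CookieJar()  # cookies (and any UID in them) stay per channel
        opener = urllib.request.build_opener(
            urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
            urllib.request.HTTPCookieProcessor(jar),
        )
        channels.append(Channel(proxy, jar, opener))
    return channels


if __name__ == "__main__":
    # Placeholder addresses from a documentation IP range, not real proxies.
    for channel in build_channels(
        ["http://198.51.100.1:8080", "http://198.51.100.2:8080"]
    ):
        print(channel.proxy, len(channel.jar))
```

Because each channel always uses the same proxy, the website sees a given user ID arriving from one stable IP, which is why static rather than dynamic proxy IPs are suggested here. A page would then be fetched with `channel.opener.open(url)`.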
3. Double restrictions on user ID access frequency and user IP access frequency
When a website limits both user ID and user IP access frequency, the usual symptom is this: after collecting for a while, collection stops or returns errors, and the page can no longer be displayed in the browser (page redirect, CAPTCHA, error page, etc.). After the browser's browsing data is cleared, the page opens and displays normally again.
If the crawler has been set to multi-channel collection and the IP is still blocked after collecting for a while, both restrictions are in effect. You can also confirm this by checking whether the page cookie contains both the IP and a UID (or an encrypted UID).
Solution: use the multi-channel collection function in the advanced settings of the ForeSpider data acquisition system, turn on dynamic IP locking, set proxy IPs (static long-term proxy IPs), and set the maximum number of logged-in users. This works around the combined ID and IP restrictions imposed by the website.
4. Limiting access frequency by user account
The usual symptom is this: the website requires login, and the account used for collection is blocked after logging in. This generally happens because the server identifies the user account and limits its access frequency.
Solution: register multiple accounts and switch to another account whenever one is blocked.
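Switching accounts when one is blocked amounts to simple bookkeeping, sketched below. The class `AccountPool` and the account names are hypothetical, and how a blocked account is detected (for example, a redirect to a login or error page) depends on the website and is left out.

```python
from collections import deque


class AccountPool:
    """Hand out one account at a time and retire accounts that get blocked."""

    def __init__(self, accounts):
        self._available = deque(accounts)
        self.blocked = []

    def current(self):
        """Return the account in use, or raise if every account is blocked."""
        if not self._available:
            raise RuntimeError("all accounts are blocked")
        return self._available[0]

    def replace_blocked(self):
        """Retire the current account and return the next available one."""
        if not self._available:
            raise RuntimeError("all accounts are blocked")
        self.blocked.append(self._available.popleft())
        return self.current()


if __name__ == "__main__":
    pool = AccountPool(["user_a", "user_b", "user_c"])
    print(pool.current())          # account used for collection
    print(pool.replace_blocked())  # called once the current account is blocked
```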
5. Double restrictions on user account access frequency and user IP access frequency
The usual symptom is this: the website requires login, the account used for collection is blocked after logging in, and the IP is blocked as well. Using multi-channel collection alone or proxy IPs alone does not help. This happens because the server restricts both the user account and the accessing IP.
Thank you for reading. That covers the anti-crawling strategies and solutions commonly seen on websites. How well each approach works in a specific case still needs to be verified in practice.
© 2024 shulou.com SLNews company. All rights reserved.