How a crawler can switch IPs on its own with Proxy-Tunnel

2025-04-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article explains how a crawler can decide for itself when its proxy IP switches by using the Proxy-Tunnel header. The explanation is short and meant to be easy to follow.

When collecting data, we often run into this situation: the target website requires a login, and the login request and the follow-up request that fetches the data must both be made from the same IP. In that case, we only need to set the same Proxy-Tunnel value for that group of requests. For example, with Proxy-Tunnel: 12345, every request in the group uses the same proxy IP for as long as that proxy IP remains valid. This is Yiniuyun's Proxy-Tunnel IP switching. It suits crawlers that need precise control over when the IP changes, such as those that must log in or reuse cached cookies. The crawler sets the HTTP header Proxy-Tunnel to a random number; as long as the number stays the same, requests to the target website go out through the same proxy IP.
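The grouping idea described above can be sketched in a few lines of Python 3. This is only an illustration, not Yiniuyun's client code: `new_tunnel_id` and `TunnelGroup` are hypothetical names, the 1 to 10000 range is an arbitrary choice, and the sketch builds header dictionaries without sending any request.

```python
import random


def new_tunnel_id(low=1, high=10000):
    """Return a random tunnel ID as a string, ready to use as a header value."""
    return str(random.randint(low, high))


class TunnelGroup:
    """Hypothetical helper: one Proxy-Tunnel value shared by related requests."""

    def __init__(self):
        self.tunnel_id = new_tunnel_id()

    def headers(self):
        # Every request in the group carries the same value, so the proxy
        # is expected to route them all through the same exit IP.
        return {"Proxy-Tunnel": self.tunnel_id}

    def rotate(self):
        # Pick a different value when the crawler wants to allow an IP switch.
        old = self.tunnel_id
        while self.tunnel_id == old:
            self.tunnel_id = new_tunnel_id()
        return self.tunnel_id


group = TunnelGroup()
login_headers = group.headers()  # would be sent with the login request
fetch_headers = group.headers()  # would be sent with the follow-up request
assert login_headers == fetch_headers

group.rotate()  # from here on, requests carry a new tunnel value
assert group.headers() != login_headers
```

Keeping the value in one object makes the switching point explicit: the crawler calls `rotate()` only between login sessions, never in the middle of one. Whether a new value actually yields a different exit IP is up to the proxy service.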

    # -*- coding: utf-8 -*-
    # Python 2 (urllib2 / httplib)
    import random
    import urllib2
    import httplib


    class HTTPSConnection(httplib.HTTPSConnection):
        # Add the Proxy-Tunnel header to the CONNECT request sent to the proxy
        def set_tunnel(self, host, port=None, headers=None):
            httplib.HTTPSConnection.set_tunnel(self, host, port, headers)
            if hasattr(self, 'proxy_tunnel'):
                self._tunnel_headers['Proxy-Tunnel'] = self.proxy_tunnel


    class HTTPSHandler(urllib2.HTTPSHandler):
        # Make urllib2 open HTTPS URLs with the connection class above
        def https_open(self, req):
            return urllib2.HTTPSHandler.do_open(self, HTTPSConnection, req, context=self._context)


    # destination pages to visit
    targetUrlList = [
        "https://httpbin.org/ip",
        "https://httpbin.org/headers",
        "https://httpbin.org/user-agent",
    ]

    # proxy server (product website www.16yun.cn)
    proxyHost = "t.16yun.cn"
    proxyPort = "31111"

    # proxy verification information
    proxyUser = "username"
    proxyPass = "password"

    proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
        "host": proxyHost,
        "port": proxyPort,
        "user": proxyUser,
        "pass": proxyPass,
    }

    # use the HTTP proxy for both http and https access
    proxies = {
        "http": proxyMeta,
        "https": proxyMeta,
    }

    # set the IP-switching header
    tunnel = random.randint(1, 10000)
    # header form of the same value (not used by the HTTPS requests below,
    # which get the header through set_tunnel)
    headers = {"Proxy-Tunnel": str(tunnel)}
    HTTPSConnection.proxy_tunnel = str(tunnel)

    proxy = urllib2.ProxyHandler(proxies)
    opener = urllib2.build_opener(proxy, HTTPSHandler)
    urllib2.install_opener(opener)

    # visit each page three times; with the same tunnel ID, every request
    # should keep the same exit IP
    for i in range(3):
        for url in targetUrlList:
            r = urllib2.Request(url)
            print(urllib2.urlopen(r).read())

Thank you for reading. That is how the Proxy-Tunnel header lets a crawler switch IPs on its own schedule. How it behaves in a specific setup still needs to be verified in practice.
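The listing above targets Python 2, where `urllib2` and `httplib` exist. On Python 3 those modules became `urllib.request` and `http.client`, so a rough port might look like the sketch below. It is an adaptation rather than the vendor's published code: the names `TunnelHTTPSConnection`, `TunnelHTTPSHandler`, `build_tunnel_opener` and `demo` are made up for illustration, and it relies on the private attributes `_tunnel_headers` and `_context`, which could change between Python releases.

```python
import random
import http.client
import urllib.request


class TunnelHTTPSConnection(http.client.HTTPSConnection):
    """HTTPSConnection that adds a Proxy-Tunnel header to the CONNECT request."""

    proxy_tunnel = None  # set on the class before any connection is opened

    def set_tunnel(self, host, port=None, headers=None):
        super().set_tunnel(host, port, headers)
        if self.proxy_tunnel is not None:
            # _tunnel_headers is a private attribute of http.client (assumption)
            self._tunnel_headers["Proxy-Tunnel"] = str(self.proxy_tunnel)


class TunnelHTTPSHandler(urllib.request.HTTPSHandler):
    """Make urllib open HTTPS URLs with the connection class above."""

    def https_open(self, req):
        return self.do_open(TunnelHTTPSConnection, req, context=self._context)


def build_tunnel_opener(proxy_url, tunnel_id):
    """Return an opener that uses proxy_url and a fixed tunnel ID for HTTPS."""
    TunnelHTTPSConnection.proxy_tunnel = tunnel_id
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(proxy, TunnelHTTPSHandler)


def demo():
    # Placeholder credentials, as in the Python 2 listing; replace them
    # with real ones before calling demo().
    opener = build_tunnel_opener(
        "http://username:password@t.16yun.cn:31111",
        random.randint(1, 10000),
    )
    for url in ["https://httpbin.org/ip", "https://httpbin.org/headers"]:
        print(opener.open(url, timeout=10).read().decode())
```

Calling `demo()` performs the real requests, so it needs valid proxy credentials and network access. Because the tunnel ID is stored on the class, every opener in the process shares it; a crawler that runs several independent sessions at once would need a different design.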

© 2024 shulou.com SLNews company. All rights reserved.
