2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article explains how to install mitmproxy and how to use it in a Python crawler. Many people are unsure where to start with it, so the steps below collect a simple, practical way of working. Follow along to learn how it is done.
Contents
I. Introduction and installation
1.1 Concept and role
Concept
Role
1.2 Installation
1.3 Introduction to the tools
II. Setting up the proxy
2.1 Setting the proxy on a PC
2.2 Installing the certificate on a PC
2.3 Setting the proxy on a mobile device
III. mitmdump
3.1 Plug-in usage
3.2 Common events
3.2.1 request event
3.2.2 response event
3.3 Downloading pictures
I. Introduction and installation
1.1 Concept and role
Concept
Mitmproxy is a free, open-source, interactive HTTPS proxy. MITM stands for man-in-the-middle, the attack technique the tool is named after.
Role
It acts as a proxy: it forwards requests and keeps the client and the server communicating.
It can view, record and modify the data passing through, which makes it possible to trigger specific behavior on the server or the client.
Supplement: Mitmproxy, Fiddler and Charles
Similarities:
a. All three capture HTTP and HTTPS requests (for other protocols such as TCP, UDP, IP and ICMP, use Wireshark).
b. All three support packet capture, breakpoint debugging, request replacement, request construction, weak-network simulation and so on.
Differences:
a. Fiddler Classic runs only on Windows; Mitmproxy and Charles are cross-platform and run on Windows, Mac and Linux.
b. Mitmproxy is free and open source; Fiddler Classic is free to use but not open source; Charles is paid software.
c. Mitmproxy offers a command-line interactive mode as well as a web interface (mitmweb); Fiddler and Charles offer only a GUI.
(Fiddler does have a command-line box at the bottom of its window, called QuickExec.)
1.2 Installation
pip install mitmproxy
or
pip install -i https://pypi.douban.com/simple mitmproxy
If the direct installation is too slow, add a domestic mirror with -i to speed it up, as in the second command. Note: the article requires Python 3.6 or later; newer mitmproxy releases need a more recent Python, so check the official installation notes for the version you install.
To check that the installation succeeded, run this on the command line: mitmdump --version
After a successful installation, mitmdump.exe, mitmproxy.exe and mitmweb.exe can be found in the Scripts directory of the Python installation.
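The same check can be scripted. The following is a minimal sketch, not part of the original article, that looks for the three executables on PATH and prints the mitmdump version if it is found. The helper names find_tools and tool_version are made up for this illustration.

```python
import shutil
import subprocess

# The three executables that "pip install mitmproxy" is expected to provide.
TOOLS = ("mitmproxy", "mitmweb", "mitmdump")


def find_tools(names=TOOLS):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def tool_version(path):
    """Run '<tool> --version' and return whatever it prints."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    return result.stdout.strip()


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
    dump = shutil.which("mitmdump")
    if dump:
        print(tool_version(dump))
```

If a tool is reported as not found even though pip finished without errors, the Scripts directory is probably missing from PATH.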
1.3 Introduction to the tools
mitmproxy: a command-line interface for interactively inspecting and modifying HTTP traffic. Older releases did not support Windows.
mitmweb: a web interface where you can watch requests in real time, filter them and view the request data.
mitmdump: a command-line tool with no interface and no interaction. Its behavior is customized through start-up parameters and custom scripts, and it is the tool used in the rest of this article.
All three provide the same core functionality and can load custom scripts; they differ only in their interface.
mitmproxy and mitmweb are mainly used for debugging, while mitmdump is used when deploying a project.
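Because mitmdump is driven entirely by start-up parameters, it is easy to launch from a deployment script. The sketch below only assembles the command line; the script name addon.py is a placeholder, 8080 is mitmproxy's default listening port, and the helper names are invented for this example.

```python
import subprocess


def mitmdump_command(script, port=8080, quiet=False):
    """Build the argument list for running mitmdump with one addon script."""
    cmd = ["mitmdump", "-p", str(port), "-s", script]
    if quiet:
        cmd.append("-q")  # suppress mitmdump's own log output
    return cmd


def start_mitmdump(script, **options):
    """Start mitmdump as a child process and return the Popen handle."""
    return subprocess.Popen(mitmdump_command(script, **options))


if __name__ == "__main__":
    # Only print the command here; call start_mitmdump() to actually run it.
    print(" ".join(mitmdump_command("addon.py", quiet=True)))
```

Keeping the command builder separate from the launcher makes it possible to log or test the exact command without starting a proxy.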
II. Setting up the proxy
2.1 Setting the proxy on a PC
Before turning on this proxy, turn off all other proxies.
Turn on the proxy.
Note: at this point the proxy is on but the certificate has not been installed yet, so visiting HTTPS websites produces a certificate error.
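If pages stop loading altogether once the system proxy is switched on, a quick thing to rule out is that nothing is listening on the proxy port. This is a small sketch under the assumption that mitmproxy runs locally on its default port 8080; is_listening is a name chosen for this example.

```python
import socket


def is_listening(host="127.0.0.1", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("proxy reachable:", is_listening())
```

A result of False usually means mitmweb or mitmdump has not been started yet, or that it was started on a different port.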
2.2 Installing the certificate on a PC
With the proxy turned on, visit http://mitm.it/. The procedure is the same on PC and on mobile.
(Note: if the browser reports that it has no network connection after the proxy is set, start mitmweb.exe or mitmdump.exe first and then open the link.)
Download the certificate that matches your operating system.
Click the downloaded certificate and follow the steps to import it.
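Importing the certificate into the operating system covers browsers, but a Python client has to be told about the proxy and the certificate explicitly. The sketch below uses only the standard library. It assumes mitmproxy's default listening address (127.0.0.1:8080) and its default certificate location (~/.mitmproxy/mitmproxy-ca-cert.pem); both may differ on your machine, and proxied_opener is a name invented here.

```python
import ssl
import urllib.request
from pathlib import Path

# Assumptions: mitmproxy's default CA file and default listening address.
CA_CERT = Path.home() / ".mitmproxy" / "mitmproxy-ca-cert.pem"
PROXY = "http://127.0.0.1:8080"


def proxied_opener(proxy=PROXY, ca_cert=CA_CERT):
    """Return a urllib opener that sends HTTP and HTTPS through the proxy.

    If the CA file exists it is added to the trusted certificates, so HTTPS
    responses re-signed by mitmproxy pass verification.
    """
    context = ssl.create_default_context()
    if Path(ca_cert).is_file():
        context.load_verify_locations(cafile=str(ca_cert))
    return urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy, "https": proxy}),
        urllib.request.HTTPSHandler(context=context),
    )


if __name__ == "__main__":
    opener = proxied_opener()
    print("CA file present:", CA_CERT.is_file())
    # With mitmdump running, a request like this would appear in its output:
    # print(opener.open("https://example.com", timeout=10).status)
```

Adding the mitmproxy certificate to the trust store is preferable to disabling certificate verification, because all other certificates are still checked normally.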
2.3 Setting the proxy on a mobile device
This example uses the Nox (Night God) Android emulator. Make sure the phone and the computer are on the same local area network.
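The phone or emulator needs the computer's LAN address as its proxy host. The following sketch makes a best-effort guess at that address; the probe address 8.8.8.8 is an arbitrary public IP chosen for this example, port 8080 is mitmproxy's default, and lan_ip is a made-up helper name.

```python
import socket


def lan_ip(probe=("8.8.8.8", 80)):
    """Best-effort guess of this machine's LAN address.

    Connecting a UDP socket sends no packets; it only asks the operating
    system which local interface it would use to reach the probe address.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect(probe)
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable route; fall back to loopback
    finally:
        sock.close()


if __name__ == "__main__":
    print(f"On the phone or emulator, set the proxy to {lan_ip()}:8080")
```

On machines with several network adapters the result may not be the adapter the phone can reach, so compare it with the output of ipconfig or ifconfig if the connection fails.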
After setting up the proxy, open the browser and visit mitm.it/
Download and install the certificate.
III. mitmdump
Official documentation: docs.mitmproxy.org/stable/addons-overview/
3.1 Plug-in usage
A plug-in is essentially a script file; in Python terms, it is an instance of a class.
Here the plug-in is an instance of Counter, and its request method handles an event.
For the request event, the argument is a mitmproxy.http.HTTPFlow object.
Example (from the official documentation):

    """
    Basic skeleton of a mitmproxy addon.

    Run as follows: mitmproxy -s anatomy.py
    """
    from mitmproxy import ctx


    class Counter:
        def __init__(self):
            self.num = 0

        def request(self, flow):
            self.num = self.num + 1
            ctx.log.info("We've seen %d flows" % self.num)


    addons = [
        Counter()
    ]

The above is a simple plug-in that counts the flows (more precisely, the HTTP requests) it has seen. Each time it sees a new flow, it announces the running total through mitmproxy's internal logging mechanism. The output appears in the event log of the interactive tools or in the mitmdump console.
Run the plug-in with mitmdump -s ./anatomy.py (anatomy.py is the name of the file you created).
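Since an addon is just a class instance, its hooks can be exercised by hand without starting a proxy. The sketch below is a variant of the Counter addon above that reports through Python's standard logging module rather than ctx.log; recent mitmproxy documentation appears to prefer that style, but treat it as an assumption and check the docs for your installed version. The dummy flow objects are placeholders, not real HTTPFlow instances.

```python
import logging

logger = logging.getLogger(__name__)


class Counter:
    """Counts requests, reporting through the standard logging module."""

    def __init__(self):
        self.num = 0

    def request(self, flow):
        # mitmproxy calls this hook once for every complete HTTP request.
        self.num += 1
        logger.info("We've seen %d flows", self.num)


# mitmproxy looks for this module-level list when loading the script.
addons = [Counter()]

if __name__ == "__main__":
    # Stand-in for mitmproxy: call the hook directly with dummy flows.
    logging.basicConfig(level=logging.INFO)
    counter = Counter()
    for _ in range(3):
        counter.request(flow=object())
    print("requests counted:", counter.num)
```

Running the file directly exercises the counting logic; running it with mitmdump -s lets mitmproxy call the same hook for real traffic.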
3.2 Common events

    def request(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP request has been read.
        """

    def response(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP response has been read.
        """

3.2.1 request event
(Note: remember to run this with the proxy turned on. Command: mitmdump -s ./xxx.py)

    from mitmproxy import http


    def request(flow: http.HTTPFlow):
        # Note: the function name must be exactly "request"
        # Request headers
        print('request headers', flow.request.headers)
        # Full request URL
        print('request url', flow.request.url)
        # Domain name
        print('domain', flow.request.host)
        # Request path: the part of the URL after the domain name
        print('request path', flow.request.path)
        # URL query parameters, returned as a MultiDictView
        print('url query', flow.request.query)
        # Request method
        print('request method', flow.request.method)
        # Request scheme (http or https)
        print('request scheme', flow.request.scheme)
        # Request body (left disabled, as in the original)
        '''
        print('request content', flow.request.get_text())
        print('request content type', type(flow.request.get_text()))
        print('request content bytes', flow.request.raw_content)
        print('request content bytes', flow.request.get_content())
        '''
        if 'https://www.baidu.com' in flow.request.url:
            # Value of the request parameter wd
            print(flow.request.query.get('wd'))
            # Names of all request parameters
            print(list(flow.request.query.keys()))
            # Modify the request parameter
            flow.request.query.set_all('wd', ['python'])
            # Print the modified parameter
            print(flow.request.query.get('wd'))

3.2.2 response event
(Note: remember to run this with the proxy turned on. Command: mitmdump -s ./xxx.py)

    from mitmproxy import http


    def response(flow: http.HTTPFlow):
        # Note: the function name must be exactly "response"
        # Status code
        print('status code', flow.response.status_code)
        # Response body, decoded to str
        print('response text', flow.response.text)
        # Response body as bytes
        print('response bytes', flow.response.content)
        # Response text again, via the method form
        print('response text', flow.response.get_text())
        # Replace the response text
        flow.response.set_text('Your response has been modified!')

3.3 Downloading pictures
(Note: remember to run this with the proxy turned on. Command: mitmdump -q -s ./xxx.py. Adding -q silences mitmdump's own log output, so the printed lines are easier to read.)

    import os

    index = 0


    def response(flow):
        global index
        print('===============')
        print(flow.request.url)
        # Save every response whose URL ends in "jpg"
        if flow.request.url[-3:] == 'jpg':
            image_dir = 'images'
            if not os.path.exists(image_dir):
                os.mkdir(image_dir)
            # Files are numbered in the order they are captured
            filename = image_dir + '/' + str(index) + '.jpg'
            with open(filename, 'wb') as f:
                f.write(flow.response.get_content())
            index += 1

That completes this walkthrough of installing mitmproxy and using it in a Python crawler. Theory works best alongside practice, so go and try it!
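One possible refinement to the picture script in 3.3: it matches only URLs whose last three characters are jpg, so an image URL that carries a query string or uses another extension is skipped. The helper below is a sketch of a more tolerant check; the name image_extension and the set of extensions are choices made for this example, not part of the original script.

```python
import os
from urllib.parse import urlparse

# Extensions treated as images in this sketch; adjust as needed.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def image_extension(url):
    """Return the lowercase image extension of a URL's path, or None."""
    path = urlparse(url).path  # drops the "?query" and "#fragment" parts
    ext = os.path.splitext(path)[1].lower()
    return ext if ext in IMAGE_EXTENSIONS else None


if __name__ == "__main__":
    for url in ("https://example.com/a/photo.JPG?size=large",
                "https://example.com/index.html"):
        print(url, "->", image_extension(url))
```

Inside the response hook, the test on flow.request.url[-3:] could be replaced by a call to this helper, and the returned extension reused when building the file name.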