
Installing and using mitmproxy for Python crawlers

Updated 2025-04-05. Source: SLTechnology News&Howtos (shulou), Development


This article covers installing and using mitmproxy for Python crawlers: what the tool is, how to install it, how to set up the proxy and its certificate on PC and mobile, and how to write scripts for mitmdump.

Contents

I. Introduction and installation

1.1 Concept and role

concept

role

1.2 Installation

1.3 Introduction to tools

II. Setting up the proxy

2.1 Setting the proxy on a PC

2.2 Installing the certificate on a PC

2.3 Setting the proxy on a mobile device

III. mitmdump

3.1 Plug-in usage

3.2 Common Events

3.2.1 request event

3.2.2 response event

3.3 Download pictures

I. Introduction and installation

1.1 Concept and role

Concept

Mitmproxy is a free, open source, interactive HTTPS proxy. MITM stands for man-in-the-middle, which describes the position the proxy takes between the client and the server.

Role

It acts as a proxy: it forwards requests so that the client and the server can still communicate.

It lets you view, record, and modify the traffic, which can be used to trigger specific behavior on the server or the client.

Supplement: mitmproxy compared with Fiddler and Charles

Similarities:

a. All three are used to capture HTTP and HTTPS requests (for other protocols such as TCP, UDP, IP, or ICMP, use Wireshark).

b. All three cover packet capture, breakpoint debugging, request replacement, request construction, simulation of weak networks, and so on.

Differences:

a. Fiddler Classic runs only on Windows; mitmproxy and Charles are cross-platform and run on Windows, Mac, or Linux.

b. mitmproxy is free and open source; Fiddler Classic is free to use but not open source; Charles is paid software.

c. mitmproxy offers a command-line interactive mode as well as a web interface (mitmweb); Fiddler and Charles offer only a GUI.

(Fiddler does have a small command-line box at the bottom of its window, called QuickExec.)

1.2 Installation

pip install mitmproxy

or

pip install -i https://pypi.douban.com/simple mitmproxy

If the direct installation is too slow, add a mirror index to the command to speed it up, as in the second command. Note: the original article gives Python 3.6 as the minimum version; recent mitmproxy releases require a newer Python, so check the requirements of the release you install.

To check that the installation succeeded, run this on the command line: mitmdump --version

After a successful installation on Windows, mitmdump.exe, mitmproxy.exe, and mitmweb.exe can be found in the Scripts directory of the Python installation.
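The same check can be scripted. The following is a minimal sketch, assuming the three executables are expected on PATH; the helper name find_mitm_tools is hypothetical and is not part of mitmproxy.

```python
import shutil

# The three entry points the installation is expected to provide.
MITM_TOOLS = ("mitmproxy", "mitmweb", "mitmdump")


def find_mitm_tools(tools=MITM_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}


if __name__ == "__main__":
    for name, path in find_mitm_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

A tool reported as not found usually means the Scripts directory is not on PATH, rather than that the installation failed.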

1.3 Introduction to tools

mitmproxy: an interactive command-line interface for inspecting and modifying HTTP traffic. The original article notes that it does not support Windows; this limitation applied to older releases and may not hold for the version you install.

mitmweb: a web interface where users can see requests in real time, filter requests, and view request data.

mitmdump: a command-line tool with no interface and no interaction. Its behavior is customized through startup parameters and custom scripts, and it is the tool used in the rest of this article.

All three tools share the same core functionality and can load custom scripts. The only difference is the interface.

mitmproxy and mitmweb are mainly used for debugging, and mitmdump is used when deploying projects.
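Because mitmdump is driven entirely by startup parameters and scripts, a deployment usually comes down to assembling one command line. The following is a minimal sketch of that idea. The helper build_mitmdump_command and the script name addon.py are hypothetical; -s (load a script) and -q (quiet) are the flags used later in this article, and -p is mitmdump's listen-port option.

```python
import shlex


def build_mitmdump_command(script, quiet=False, listen_port=None):
    """Return the argument list for running mitmdump with one addon script."""
    command = ["mitmdump"]
    if quiet:
        command.append("-q")  # quiet mode: less output from mitmdump itself
    if listen_port is not None:
        command += ["-p", str(listen_port)]  # listen on a non-default port
    command += ["-s", script]  # load the addon script
    return command


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_mitmdump_command("./addon.py", quiet=True)))
```

The returned list can be passed directly to subprocess.run when the proxy should be started from Python.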

II. Setting up the proxy

2.1 Setting the proxy on a PC

Before turning on this proxy, turn off all other proxies.

Turn on the proxy.

Note: at this point the proxy is on but the certificate has not been installed, so visiting other websites will produce errors (typically certificate warnings on HTTPS sites).
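Before troubleshooting browser errors, it helps to confirm that something is actually listening on the proxy port. The following is a minimal sketch, assuming the proxy runs locally on port 8080 (mitmproxy's default listen port, which this article does not state); the helper name proxy_is_listening is hypothetical.

```python
import socket


def proxy_is_listening(host="127.0.0.1", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


if __name__ == "__main__":
    print("proxy reachable:", proxy_is_listening())
```

This only shows that the port accepts connections; it does not prove that the listener is mitmproxy.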

2.2 Installing the certificate on a PC

With the proxy turned on, visit http://mitm.it/. The steps are the same on PC and mobile.

(Note: if the browser reports no network connection after the proxy is set, start mitmweb.exe or mitmdump.exe first and then open the link.)

Download the certificate that matches your system.

Open the downloaded certificate and follow the steps to import it.
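A Python crawler needs the same two pieces as the browser: the proxy address and trust in the mitmproxy certificate. The following is a minimal sketch using only the standard library. It assumes the proxy listens on 127.0.0.1:8080 (mitmproxy's default, not stated in this article); the helper name build_proxy_opener is hypothetical, and ca_file would point at the PEM certificate you downloaded (mitmproxy also keeps a copy under ~/.mitmproxy/ by default).

```python
import ssl
import urllib.request


def build_proxy_opener(proxy_url="http://127.0.0.1:8080", ca_file=None):
    """Build a urllib opener that routes HTTP and HTTPS through the proxy.

    ca_file, if given, is a PEM file to trust for HTTPS verification,
    for example the mitmproxy CA certificate.
    """
    proxy_handler = urllib.request.ProxyHandler(
        {"http": proxy_url, "https": proxy_url}
    )
    # With ca_file=None the system's default trust store is used.
    context = ssl.create_default_context(cafile=ca_file)
    https_handler = urllib.request.HTTPSHandler(context=context)
    return urllib.request.build_opener(proxy_handler, https_handler)
```

Calling opener.open(url) on the result sends the request through the proxy. Without the certificate, HTTPS requests fail verification in the same way the browser does.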

2.3 Setting the proxy on a mobile device

Take the NoxPlayer Android emulator ("Night God Simulator") as an example, and make sure the phone and the computer are on the same local area network.

After setting up the proxy, open the browser and visit mitm.it/

Download and install the certificate.

III. mitmdump

Official documentation: docs.mitmproxy.org/stable/addons-overview/

3.1 Plug-in usage

A plug-in (an addon, in mitmproxy's terms) is a script file; in Python, the addon itself is an instance of a class.

In the example that follows, the addon is an instance of Counter, and its request method handles the request event.

The request event receives one argument, a mitmproxy.http.HTTPFlow object.

Example (from the official documentation):

    """
    Basic skeleton of a mitmproxy addon.

    Run as follows: mitmproxy -s anatomy.py
    """
    from mitmproxy import ctx


    class Counter:
        def __init__(self):
            self.num = 0

        def request(self, flow):
            self.num = self.num + 1
            ctx.log.info("We've seen %d flows" % self.num)


    addons = [
        Counter()
    ]

The above is a simple addon that tracks the number of flows (more specifically, HTTP requests) seen so far. Each time it sees a new flow, it uses mitmproxy's internal logging mechanism to report the count. The output appears in the event log of the interactive tools or in the console of mitmdump.

Run the addon with mitmdump -s ./anatomy.py (anatomy.py is the name of the script file).
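The same pattern extends to any per-request bookkeeping. The following is a minimal sketch of a second addon that counts requests per host. The class name HostCounter is hypothetical, and the sketch logs through the standard logging module, which assumes a mitmproxy release that shows Python logging output in its event log; the official example uses ctx.log instead.

```python
import logging
from collections import Counter


class HostCounter:
    """Count requests per host; mitmproxy calls request() once per request."""

    def __init__(self):
        self.hosts = Counter()

    def request(self, flow):
        # flow.request.host is the target host of the intercepted request.
        host = flow.request.host
        self.hosts[host] += 1
        logging.info("%s: %d request(s) so far", host, self.hosts[host])


# mitmproxy looks for this module-level list when it loads the script.
addons = [HostCounter()]
```

Because the hook only reads attributes of the flow object, the class can be exercised in a plain Python session with stand-in objects before it is loaded into the proxy.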

3.2 Common Events

    def request(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP request has been read.
        """

    def response(self, flow: mitmproxy.http.HTTPFlow):
        """
        The full HTTP response has been read.
        """

3.2.1 request event

(Note: run this with the proxy turned on, using the command: mitmdump -s ./xxx.py)

    from mitmproxy import http


    def request(flow: http.HTTPFlow):
        # The function must be named exactly "request"
        # Request headers
        print('request header', flow.request.headers)
        # Full request URL
        print('request url', flow.request.url)
        # Host name
        print('domain', flow.request.host)
        # Request path: the part of the URL after the host name
        print('request path', flow.request.path)
        # URL query parameters, returned as a MultiDictView
        print('url', flow.request.query)
        # Request method
        print('request method', flow.request.method)
        # Request scheme (http or https)
        print('request type', flow.request.scheme)
        # Request body (disabled by default)
        '''
        print('request content', flow.request.get_text())
        print('request content type', type(flow.request.get_text()))
        print('request content bytes', flow.request.raw_content)
        print('request content bytes', flow.request.get_content())
        '''
        if 'https://www.baidu.com' in flow.request.url:
            # Value of the request parameter wd
            print(flow.request.query.get('wd'))
            # Names of all request parameters
            print(list(flow.request.query.keys()))
            # Modify the request parameter
            flow.request.query.set_all('wd', ['python'])
            # Print the modified parameter
            print(flow.request.query.get('wd'))

3.2.2 response event

(Note: run this with the proxy turned on, using the command: mitmdump -s ./xxx.py)

    from mitmproxy import http


    def response(flow: http.HTTPFlow):
        # The function must be named exactly "response"
        # Status code
        print('status code', flow.response.status_code)
        # Response body, decoded to str
        print('return content', flow.response.text)
        # Response body as bytes
        print('return content bytes type', flow.response.content)
        # Response text, via the getter method
        print('response text', flow.response.get_text())
        # Replace the response text
        flow.response.set_text('Your response has been modified!')

3.3 Download pictures

(Note: run this with the proxy turned on, using the command: mitmdump -q -s ./xxx.py. The -q option quiets mitmdump's own output, so the printed lines are easier to see.)

    import os

    index = 0


    def response(flow):
        global index
        print('===============')
        print(flow.request.url)
        # Save only responses whose URL ends in "jpg"
        if flow.request.url[-3:] == 'jpg':
            image_dir = 'images'
            # Create the output directory on first use
            if not os.path.exists(image_dir):
                os.mkdir(image_dir)
            filename = image_dir + '/' + str(index) + '.jpg'
            with open(filename, 'wb') as f:
                f.write(flow.response.get_content())
            index += 1
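Checking only the last three characters of the URL misses images served with a query string or in another format. The following is a minimal sketch of a more tolerant check, assuming the Content-Type header is the better signal when it is present. The helper image_extension and the IMAGE_EXTENSIONS mapping are hypothetical and not part of mitmproxy.

```python
import os
from urllib.parse import urlsplit

# Hypothetical mapping from Content-Type to the file extension to save under.
IMAGE_EXTENSIONS = {
    "image/jpeg": ".jpg",
    "image/png": ".png",
    "image/gif": ".gif",
    "image/webp": ".webp",
}


def image_extension(url, content_type=""):
    """Return a file extension if the response looks like an image, else None."""
    # Prefer the Content-Type header, ignoring parameters such as charset.
    media_type = content_type.split(";")[0].strip().lower()
    if media_type in IMAGE_EXTENSIONS:
        return IMAGE_EXTENSIONS[media_type]
    # Fall back to the URL path, so a query string does not hide the extension.
    extension = os.path.splitext(urlsplit(url).path)[1].lower()
    if extension == ".jpeg":
        extension = ".jpg"
    return extension if extension in IMAGE_EXTENSIONS.values() else None
```

Inside a response hook, the helper could be called as image_extension(flow.request.url, flow.response.headers.get("content-type", "")), saving the body only when the result is not None.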
