
How to reverse-engineer and crawl the WeChat Index with Python

Updated: 2025-03-29. From: SLTechnology News&Howtos (shulou)


This article explains how to reverse-engineer and crawl the WeChat Index Mini Program with Python. It is shared here as a practical reference.

WeChat Index crawling

Appium + mitmproxy + the NetEase MuMu Android emulator are combined to crawl the WeChat Index Mini Program.

Appium sends instructions to the phone to perform the required operations, mitmproxy runs a Python script that filters out the relevant requests, and an Android emulator stands in for a real device so that the project is easier to deploy.

1. Setting up the Appium environment on macOS

1. Install Homebrew

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"

2. Install node through brew

brew install node

Check whether node was installed successfully

node -v

3. Install npm

sudo bash
sudo curl -L https://npmjs.org/install.sh | sh

Check whether the installation of npm is complete

npm -v

4. Install android-sdk-macosx

Link: android-sdk-macosx.

After the download completes, the SDK still lacks the platform-tools and build-tools commands. Tick platform-tools and build-tools in the pop-up window to download them.

5. Install the JDK

Download the dmg installer from the official website and install it directly.

Link: JDK

6. Environment variable configuration

You can use the following configuration as a reference. Adjust the JDK version and the SDK path to match your own machine.

cd ~

vi .bash_profile

JAVA_HOME=/Library/java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH
export JAVA_HOME
export CLASSPATH
export PATH
export ANDROID_HOME=/Users/admin/Desktop/android-sdk-macosx
export PATH=$PATH:$ANDROID_HOME/tools
export PATH=$PATH:$ANDROID_HOME/platform-tools

source .bash_profile
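After sourcing the profile, it can be worth confirming that the two variables really point at existing directories. The following is a minimal sketch, not part of the original article; the function name check_sdk_env is hypothetical, and it only checks that JAVA_HOME and ANDROID_HOME are set and exist, nothing more.

```python
import os


def check_sdk_env(env=None):
    """Return a list of human-readable problems with JAVA_HOME / ANDROID_HOME."""
    # Assumption: only these two variables from the profile above are checked.
    env = os.environ if env is None else env
    problems = []
    for name in ("JAVA_HOME", "ANDROID_HOME"):
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif not os.path.isdir(value):
            problems.append(f"{name} points to a missing directory: {value}")
    return problems


if __name__ == "__main__":
    for problem in check_sdk_env():
        print(problem)
```

An empty result means both variables are set and resolve to directories. appium-doctor, installed in the next step, performs a more thorough check.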

7. Install appium-doctor

Use it to check whether the environment set up so far is complete.

npm install -g appium-doctor

After installing appium-doctor, run the appium-doctor command in the terminal. It automatically checks whether any package that Appium depends on is missing.

8. Install the Appium command-line version

npm install -g appium

Run appium -v to check the version number.

9. Install mitmproxy

(A packet-capture, man-in-the-middle proxy tool that supports SSL)

brew install mitmproxy

Detailed usage is left for self-study. This article only covers basic use.

10. Install the NetEase MuMu Android emulator

Download the Mac version directly from the official website.
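With all ten steps done, a quick way to see whether the command-line tools are reachable is to look each one up on PATH. This is a rough sketch under the assumption that the tools are exposed under the names listed in REQUIRED_TOOLS (the names used in the commands above); missing_tools is a hypothetical helper, not something shipped by Appium or mitmproxy.

```python
import shutil

# Assumption: these are the executable names used by the steps above.
REQUIRED_TOOLS = ("node", "npm", "java", "adb", "appium", "mitmdump")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("missing:", ", ".join(missing) if missing else "none")
```

The which parameter exists only so the lookup can be replaced in tests. In normal use, call missing_tools() with no arguments.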

2. Crawling the WeChat Index Mini Program

1. Start Appium by entering appium at the terminal

2. Start the NetEase MuMu Android emulator and install WeChat

3. View the devices connected to adb

adb devices

The first time, you need to connect to the emulator manually. The NetEase MuMu port number is 7555, so enter the following at the terminal:

adb connect 127.0.0.1:7555
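If the connection check is to be automated, the output of adb devices can be parsed from Python. The sketch below assumes the usual output layout (a "List of devices attached" header followed by one serial and state per line); parse_adb_devices and connected_devices are hypothetical helper names.

```python
import subprocess


def parse_adb_devices(output):
    """Parse `adb devices` output into a list of (serial, state) tuples."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip the header, blank lines, and adb daemon start-up messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def connected_devices():
    """Run `adb devices` and return the parsed result (requires adb on PATH)."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)
```

A device is ready for Appium when its state is device. A state such as offline usually means the adb connect step has to be repeated.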

4. Install the mitmproxy certificate in the emulator

On the Mac, open the certificate, find it in the keychain, and change its trust setting to trust it for all uses.

Then install it in the emulator: open the emulator's Settings, then Security, then Install from SD card.

Open the internal storage, then the MuMu shared folder, and drag the trusted certificate into it.

5. Find the interface that generates search_key in the WeChat Index Mini Program by capturing packets

Write a Python script that filters out this request and writes the search_key from its response to MongoDB.

import json
import time
import sys

from pymongo import MongoClient


def response(flow):
    # mitmproxy calls this hook for every response that passes through the proxy.
    client = MongoClient("xx.xx.xx.xx", 27017)
    db = client["Spider"]
    url = "https://search.weixin.qq.com/cgi-bin/searchweb/weapplogin"
    # Only handle the request that generates search_key.
    if flow.request.url.startswith(url):
        text = flow.response.text
        data = json.loads(text)
        search_key = data.get("data").get("search_key")
        # Write the key to a local file, overwriting the previous one.
        with open("./search_key.txt", "w") as f:
            f.write(search_key)
        '''
        The author stores search_key in the database, and a Scrapy crawler
        then reads it from there to make requests. Adapt this to your own needs.
        '''
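The chained data.get("data").get("search_key") call fails with an AttributeError when the response has no data object, for example on an error response. A more defensive extraction could look like the sketch below; extract_search_key is a hypothetical helper, and the payload shape (a data object holding a search_key string) is taken only from the script above.

```python
import json


def extract_search_key(body):
    """Return data.search_key from a JSON response body, or None if absent."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (json.JSONDecodeError is a ValueError subclass).
        return None
    if not isinstance(payload, dict):
        return None
    data = payload.get("data")
    if not isinstance(data, dict):
        return None
    key = data.get("search_key")
    return key if isinstance(key, str) and key else None
```

Inside the response hook, the result can be checked for None before anything is written to the file or the database.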

Run the Python script with mitmdump -s xxx.py

mitmdump -s test.py

First, manually tap through to the WeChat Index Mini Program so that the search_key interface is triggered. mitmproxy then runs the Python script, which filters the request as coded and writes the search_key from the response to the local file.

At this point the crawling approach for the WeChat Index Mini Program should be clear. The search_key interface is triggered in two cases:

1. The WeChat Index Mini Program is entered for the first time.
2. The search_key has expired, which happens after 30 minutes.
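Because the key expires after 30 minutes, the crawler side needs some way to decide when to trigger the interface again. The following is a small sketch of that check, assuming the time the key was obtained is stored as a Unix timestamp; the one-minute safety margin and the name needs_refresh are assumptions for illustration, not from the original.

```python
import time

SEARCH_KEY_TTL_SECONDS = 30 * 60  # the article states that the key expires after 30 minutes
SAFETY_MARGIN_SECONDS = 60        # hypothetical margin so the key is renewed slightly early


def needs_refresh(obtained_at, now=None,
                  ttl=SEARCH_KEY_TTL_SECONDS, margin=SAFETY_MARGIN_SECONDS):
    """Return True when a key obtained at `obtained_at` should be regenerated."""
    now = time.time() if now is None else now
    return now - obtained_at >= ttl - margin
```

When needs_refresh returns True, the Appium script from the next step can be run again to produce a fresh search_key.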

6. Write the Appium script that taps through WeChat into the WeChat Index Mini Program to trigger the search_key request

import time

from appium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait
from pymongo import MongoClient

PLATFORM = 'Android'
deviceName = 'emulator-5554'
# app_package and app_activity can be obtained through adb shell
app_package = 'com.tencent.mm'
app_activity = '.ui.LauncherUI'
driver_server = 'http://127.0.0.1:4723/wd/hub'


class Moments():
    def __init__(self):
        self.desired_caps = {
            'platformName': PLATFORM,
            'deviceName': deviceName,
            'appPackage': app_package,
            'appActivity': app_activity,
            'noReset': "True",
        }
        self.driver = webdriver.Remote(driver_server, self.desired_caps)
        self.wait = WebDriverWait(self.driver, 300)

    def login(self):
        # Allow the first permission request
        yunxu1 = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.android.packageinstaller:id/permission_allow_button')))
        yunxu1.click()
        time.sleep(5)
        # Allow the second permission request
        yunxu2 = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.android.packageinstaller:id/permission_allow_button')))
        yunxu2.click()
        time.sleep(5)
        # Login button
        login = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/d75')))
        login.click()
        time.sleep(3)
        # Mobile number
        phone = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/hz')))
        phone.send_keys("xxxxxx")
        time.sleep(3)
        # Next step
        nextButton = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/alr')))
        nextButton.click()
        time.sleep(2)
        # Password
        passButton = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/hz')))
        passButton.send_keys("xxxxx")
        time.sleep(2)
        # Log in
        login2 = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/alr')))
        login2.click()
        time.sleep(6)
        # Decline access to the address book
        notButton = self.wait.until(EC.presence_of_element_located(
            (By.ID, 'com.tencent.mm:id/an2')))
        notButton.click()
        time.sleep(5)

    def test(self):
        '''After logging in, tap Discover, then Mini Programs, then WeChat Index to trigger the interface'''
        # All coordinates depend on the emulator resolution; verify them with uiautomatorviewer.
        time.sleep(10)
        # Discover tab
        self.driver.tap([(428, 1214), (471, 1251)], 100)
        time.sleep(5)
        # Mini Programs entry on the Discover page
        self.driver.tap([(85, 787), (148, 816)], 100)
        time.sleep(5)
        self.driver.tap([(114, 237), (206, 269)], 100)
        time.sleep(20)
        # The coordinates of the final tap are garbled in the source text.
        # Read the correct bounds from uiautomatorviewer before enabling this call.
        # self.driver.tap([(644, ...), (708, ...)], 100)

    def main(self):
        # Log in first
        self.login()
        self.test()


M = Moments()
M.main()

Note: after the first login, each later run only needs to execute the test method, which taps Discover, then Mini Programs, then WeChat Index. Setting noReset to True keeps the app from being reset on every run, so there is no need to log in to the account each time.

Use uiautomatorviewer to inspect the page elements that Appium needs to locate.
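uiautomatorviewer typically shows each node's rectangle as a bounds string in the form [left,top][right,bottom]. Turning that string into a single tap point avoids copying corner coordinates by hand. The helper below is a minimal sketch under that assumption; bounds_center is a hypothetical name and is not part of the Appium client.

```python
import re

# Assumption: bounds look like "[85,787][148,816]", as displayed by uiautomatorviewer.
_BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def bounds_center(bounds):
    """Return the (x, y) center of a bounds string such as "[85,787][148,816]"."""
    match = _BOUNDS_RE.fullmatch(bounds.strip())
    if match is None:
        raise ValueError(f"unrecognized bounds string: {bounds!r}")
    left, top, right, bottom = map(int, match.groups())
    return ((left + right) // 2, (top + bottom) // 2)
```

The result can be passed to the tap call used in the script, for example self.driver.tap([bounds_center("[85,787][148,816]")], 100).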

