2025-03-29 Update From: SLTechnology News&Howtos, Shulou (Shulou.com)
This article shows how to crawl the WeChat Index by reverse engineering its mini program with Python. The approach is practical, so it is shared here for reference.
WeChat Index crawling
Appium + mitmproxy + NetEase MuMu Android emulator: crawling the WeChat Index mini program
Appium sends instructions to the phone to perform the required operations, mitmproxy runs a Python script that filters out the relevant requests, and the Android emulator stands in for a real device so that the project is easier to deploy.
I. Building the Appium environment on macOS
1. Install Homebrew
    /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
2. Install node through brew
    brew install node
Check whether node was installed successfully
    node -v
3. Install npm
    sudo bash
    sudo curl -L https://npmjs.org/install.sh | sh
Check whether npm was installed successfully
    npm -v
4. Install android-sdk-macosx
Link: android-sdk-macosx.
After the download completes, the SDK still lacks platform-tools and build-tools. Tick both in the pop-up window to download them.
5. Install jdk
Download the dmg from the official website and install it directly.
Link: JDK
6. Configure the environment variables
You can refer to the following configuration
    cd ~
    vi .bash_profile
Add the following lines to the file:
    JAVA_HOME=/Library/java/JavaVirtualMachines/jdk1.8.0_201.jdk/Contents/Home
    CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    PATH=$JAVA_HOME/bin:$PATH:
    export JAVA_HOME
    export CLASSPATH
    export PATH
    export ANDROID_HOME=/Users/admin/Desktop/android-sdk-macosx
    export PATH=$PATH:$ANDROID_HOME/tools
    export PATH=$PATH:$ANDROID_HOME/platform-tools
Then reload the profile:
    source .bash_profile
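One way to confirm that the configuration took effect is to check from Python that the variables are set and the tools resolve on PATH. The following is a minimal sketch; the function name check_environment and the default tool list are illustrative assumptions, not part of the original setup.

```python
import os
import shutil


def check_environment(env_vars=("JAVA_HOME", "ANDROID_HOME"),
                      tools=("java", "adb", "node", "npm")):
    """Return a dict listing the variables and tools that are missing.

    The default names mirror the ones configured in this article; the
    function itself is a hypothetical helper.
    """
    missing_vars = [name for name in env_vars if not os.environ.get(name)]
    missing_tools = [tool for tool in tools if shutil.which(tool) is None]
    return {"missing_vars": missing_vars, "missing_tools": missing_tools}


if __name__ == "__main__":
    report = check_environment()
    print("Missing environment variables:", report["missing_vars"] or "none")
    print("Tools not found on PATH:", report["missing_tools"] or "none")
```

Run it in a new terminal session so that the reloaded .bash_profile is in effect.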
7. Install appium-doctor
appium-doctor checks whether the environment has been set up correctly.
    npm install -g appium-doctor
After appium-doctor is installed, run the appium-doctor command in the terminal to check automatically whether any package that Appium depends on is missing.
8. Install the command-line version of Appium
    npm install -g appium
Run appium -v to check the version number.
9. Install mitmproxy
(a packet-capture, man-in-the-middle proxy tool with SSL support)
    brew install mitmproxy
This article only covers basic usage; study the details on your own.
10. Install the NetEase MuMu Android emulator
Download the Mac version directly from the official website.
II. Crawling the WeChat Index mini program
1. Start Appium by entering appium at the terminal
2. Start the NetEase MuMu Android emulator and install WeChat
3. View the devices connected to adb
    adb devices
The first time, you need to connect to the emulator manually. The NetEase MuMu port number is 7555, so enter the following at the terminal:
    adb connect 127.0.0.1:7555
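If you want to verify the connection from a script before starting Appium, the output of adb devices can be parsed in Python. This is a minimal sketch; the helper names parse_adb_devices and connected_devices are hypothetical, and the parser assumes the usual two-column "serial, state" output.

```python
import subprocess


def parse_adb_devices(output):
    """Parse `adb devices` output into a list of (serial, state) tuples."""
    devices = []
    for line in output.splitlines():
        line = line.strip()
        # Skip the header, blank lines, and daemon start-up messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices.append((parts[0], parts[1]))
    return devices


def connected_devices():
    """Run `adb devices` and return the parsed result (adb must be on PATH)."""
    result = subprocess.run(["adb", "devices"], capture_output=True,
                            text=True, check=True)
    return parse_adb_devices(result.stdout)
```

After adb connect 127.0.0.1:7555 succeeds, connected_devices() should include an entry whose serial is 127.0.0.1:7555 and whose state is device.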
4. Install the mitmproxy certificate in the emulator
Open the certificate in Keychain on the Mac and change its trust setting to always trust.
Then install it in the emulator: open the emulator's Settings - Security - Install from SD card.
Open Internal storage - MuMu shared folder and drag the trusted certificate into it.
5. Find the interface that generates search_key in the WeChat Index mini program by capturing packets
Write a Python script that filters out this request and saves the search_key from its response. The author writes it to MongoDB; the script below writes it to a local file.
    import json
    from pymongo import MongoClient

    def response(flow):
        # MongoDB connection used by the author to store search_key
        client = MongoClient("xx.xx.xx.xx", 27017)
        db = client["Spider"]
        url = "https://search.weixin.qq.com/cgi-bin/searchweb/weapplogin"
        # only handle responses from the interface that generates search_key
        if flow.request.url.startswith(url):
            text = flow.response.text
            data = json.loads(text)
            search_key = data.get("data").get("search_key")
            # overwrite the file so it always holds the latest key
            with open("./search_key.txt", "w") as f:
                f.write(search_key)
The author stores the search_key in the database, and a Scrapy crawler then reads it from there to make requests. How you use it depends on your situation.
Run the Python script with mitmdump -s xxx.py
    mitmdump -s test.py
First, manually tap into the WeChat Index mini program to trigger the interface that generates search_key. mitmproxy then runs the Python script, which filters the request as coded and writes the search_key from the response to a local file.
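The script in this step reads data.get("data").get("search_key") directly, which raises an exception if the response is not JSON or carries no data object. A more defensive extraction could look like the sketch below. The function name extract_search_key is hypothetical, and the response layout (a top-level data object containing search_key) is taken from the script in this step rather than from any official documentation.

```python
import json


def extract_search_key(text):
    """Return data.search_key from a JSON response body, or None if absent."""
    try:
        payload = json.loads(text)
    except (TypeError, ValueError):
        # Not JSON at all (json.JSONDecodeError is a subclass of ValueError).
        return None
    if not isinstance(payload, dict):
        return None
    data = payload.get("data")
    if not isinstance(data, dict):
        return None
    key = data.get("search_key")
    return key if isinstance(key, str) and key else None
```

Inside the mitmproxy response hook you would call extract_search_key(flow.response.text) and write the file only when the result is not None.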
By this point, you should have a clear picture of how the WeChat Index mini program is crawled. The interface that generates search_key is triggered under two rules:
1. You enter the WeChat Index mini program for the first time.
2. The search_key expires after 30 minutes.
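Because the key expires after 30 minutes, the crawler that consumes it needs to know when to trigger a refresh. The sketch below assumes the key is stored in ./search_key.txt, as in the mitmproxy script, and uses the file's modification time as the moment the key was issued. The helper names and the use of the modification time are assumptions made for illustration.

```python
import os
import time

# The article states that a search_key becomes invalid after 30 minutes.
SEARCH_KEY_TTL_SECONDS = 30 * 60


def is_key_fresh(written_at, now=None, ttl=SEARCH_KEY_TTL_SECONDS):
    """Return True if a key written at `written_at` (epoch seconds) is still valid."""
    if now is None:
        now = time.time()
    return 0 <= now - written_at < ttl


def load_fresh_search_key(path="./search_key.txt"):
    """Return the saved key if the file exists and is younger than the TTL, else None."""
    try:
        written_at = os.path.getmtime(path)
        with open(path) as f:
            key = f.read().strip()
    except OSError:
        # Missing or unreadable file: treat as "no valid key".
        return None
    return key if key and is_key_fresh(written_at) else None
```

When load_fresh_search_key() returns None, run the Appium steps again to re-enter the mini program and generate a new key.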
6. Write the Appium code that simulates taps in WeChat to enter the WeChat Index mini program and trigger search_key
    import time
    from appium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from pymongo import MongoClient

    PLATFORM = 'Android'
    deviceName = 'emulator-5554'
    # app_package and app_activity can be obtained through adb shell
    app_package = 'com.tencent.mm'
    app_activity = '.ui.LauncherUI'
    driver_server = 'http://127.0.0.1:4723/wd/hub'

    class Moments():
        def __init__(self):
            self.desired_caps = {
                'platformName': PLATFORM,
                'deviceName': deviceName,
                'appPackage': app_package,
                'appActivity': app_activity,
                'noReset': "True",
            }
            self.driver = webdriver.Remote(driver_server, self.desired_caps)
            self.wait = WebDriverWait(self.driver, 300)

        def login(self):
            # allow the first permission request
            yunxu1 = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.android.packageinstaller:id/permission_allow_button')))
            yunxu1.click()
            time.sleep(5)
            # allow the second permission request
            yunxu2 = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.android.packageinstaller:id/permission_allow_button')))
            yunxu2.click()
            time.sleep(5)
            # login button
            login = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/d75')))
            login.click()
            time.sleep(3)
            # mobile number
            phone = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/hz')))
            phone.send_keys("xxxxxx")
            time.sleep(3)
            # next step
            nextButton = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/alr')))
            nextButton.click()
            time.sleep(2)
            # password
            passButton = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/hz')))
            passButton.send_keys("xxxxx")
            time.sleep(2)
            # log in
            login2 = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/alr')))
            login2.click()
            time.sleep(6)
            # decline access to the address book
            notButton = self.wait.until(EC.presence_of_element_located(
                (By.ID, 'com.tencent.mm:id/an2')))
            notButton.click()
            time.sleep(5)

        def test(self):
            '''After logging in, tap Discover - Mini Programs - WeChat Index to trigger the interface.'''
            # The coordinates below are specific to the author's emulator;
            # read your own from uiautomatorviewer before running.
            time.sleep(10)
            self.driver.tap([(428, 1214), (471, 1251)], 100)
            time.sleep(5)
            # coordinates of Mini Programs on the Discover page
            self.driver.tap([(85, 787), (148, 816)], 100)
            time.sleep(5)
            self.driver.tap([(114, 237), (206, 269)], 100)
            time.sleep(20)
            self.driver.tap([(644, 85), (708, 148)], 100)

        def main(self):
            # log in first (only needed on the first run)
            self.login()
            self.test()

    M = Moments()
    M.main()
Note: after the first login, each later run only needs to execute the test method, which taps Discover - Mini Programs - WeChat Index. Setting noReset to True keeps the app from being reset on every session, so you do not have to log in to the account again each time.
Use uiautomatorviewer to inspect the page elements and obtain the IDs and coordinates that Appium needs for positioning.
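uiautomatorviewer reports each element's position as a bounds string of the form [x1,y1][x2,y2]. To tap the middle of an element rather than its corners, you can convert that string into a center point. This is a minimal sketch; the helper name bounds_center and the sample bounds value are hypothetical.

```python
import re

# Matches a bounds string such as "[100,200][300,400]".
_BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def bounds_center(bounds):
    """Convert a bounds string "[x1,y1][x2,y2]" to its center point (x, y)."""
    match = _BOUNDS_RE.fullmatch(bounds.strip())
    if match is None:
        raise ValueError(f"unrecognized bounds string: {bounds!r}")
    x1, y1, x2, y2 = map(int, match.groups())
    # Integer division keeps the result usable as pixel coordinates.
    return ((x1 + x2) // 2, (y1 + y2) // 2)
```

The result can be passed to the driver as a single tap position, for example self.driver.tap([bounds_center("[100,200][300,400]")], 100).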