2025-03-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article introduces a method for collecting Douyin data on Android. Many people have questions about how to do this in day-to-day work, so the editor has consulted various sources and distilled them into a simple, practical procedure.
Tools used: the mobile automation tool Appium, the Nox ("Night God") emulator (a real device also works), and the adb tool.
Preparation of operating environment
Data collection starts with an automation environment, building on the previous article. First configure android-sdk, then verify that the adb command is available: run adb version in a command-line window. If version information is printed, adb is working. For an android-sdk download and installation tutorial, see the following link:
https://www.cnblogs.com/woniu123/p/10755262.html
Once android-sdk is configured, install Appium. This article uses appium-desktop-setup-1.9.0.exe, available at:
https://github.com/appium/appium-desktop/releases/download/v1.9.0/appium-desktop-setup-1.9.0.exe
After downloading, the installer is mostly a matter of clicking Next. When it finishes, start the application. The following window shows that the installation succeeded:
Click "Start Server v1.9.0" to start the service. The following page shows that the service started successfully, listening on port 4723:
Open the emulator configured earlier and run adb devices in a command-line window. The connected emulator should be listed, which confirms that the environment is ready.
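The adb devices check can also be scripted. The following is a minimal sketch, assuming adb is on PATH and prints its usual "serial, tab, state" listing. The helper names parse_adb_devices and connected_devices are illustrative and not from the article.

```python
import subprocess


def parse_adb_devices(output):
    """Return the serials of devices that `adb devices` reports as ready."""
    serials = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header line, and adb daemon start-up messages.
        if not line or line.startswith("List of devices") or line.startswith("*"):
            continue
        parts = line.split()
        # A ready device is listed as "<serial> device"; other states
        # such as "offline" or "unauthorized" are ignored.
        if len(parts) >= 2 and parts[1] == "device":
            serials.append(parts[0])
    return serials


def connected_devices():
    """Run `adb devices` (adb must be on PATH) and parse its output."""
    result = subprocess.run(
        ["adb", "devices"], capture_output=True, text=True, check=True
    )
    return parse_adb_devices(result.stdout)


# Sample output in the shape adb prints for one connected emulator.
sample = "List of devices attached\n127.0.0.1:62001\tdevice\n\n"
print(parse_adb_devices(sample))
```

With a live setup, connected_devices() should return a list that contains the emulator's serial.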
Next, configure the session. Click Start Inspector Session in Appium.
The following startup parameters need to be configured:
{
  "platformName": "Android",
  "platformVersion": "5.1.1",
  "deviceName": "127.0.0.1:62001",
  "appPackage": "com.ss.android.ugc.aweme",
  "appActivity": "com.ss.android.ugc.aweme.main.MainActivity",
  "noReset": true
}
platformName: the platform the emulator runs. Enter Android.
platformVersion: the emulator's Android version.
deviceName: the device reported by the adb devices command. For the current emulator this is 127.0.0.1:62001.
appPackage and appActivity: the package name and launch activity name of the Douyin app, which can be obtained with the aapt.exe tool under android-sdk\build-tools\29.0.2.
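The article does not spell out how aapt is used. One common way is to run aapt dump badging on the APK file and read the package and launchable-activity lines. The sketch below parses that output. The function name parse_badging and the version fields in the sample are illustrative assumptions.

```python
import re


def parse_badging(output):
    """Extract (package name, launch activity) from `aapt dump badging` output."""
    package = re.search(r"^package: name='([^']+)'", output, re.MULTILINE)
    activity = re.search(
        r"^launchable-activity: name='([^']+)'", output, re.MULTILINE
    )
    return (
        package.group(1) if package else None,
        activity.group(1) if activity else None,
    )


# Illustrative sample; versionCode and versionName are made-up values.
sample = (
    "package: name='com.ss.android.ugc.aweme' versionCode='1' versionName='1.0'\n"
    "launchable-activity: name='com.ss.android.ugc.aweme.main.MainActivity'"
    "  label='' icon=''\n"
)

app_package, app_activity = parse_badging(sample)
print(app_package)
print(app_activity)
```

The two values map directly onto the appPackage and appActivity startup parameters.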
After the configuration is finished, click Start Session. If the emulator launches the Douyin app, the environment is configured correctly.
Business scenario description
With the environment running, here are the requirements. Open the Douyin app installed in the emulator, swipe down to refresh the video feed, then enter the user's home page and collect the home page data, the following list, the followers list, and the Works and Likes tabs. The corresponding Appium actions are:
1. Swipe down to refresh the video
2. Swipe left to enter the user's home page
3. Tap the follow button
4. Swipe down the following list until "there is no more for the time being" appears
5. Return to the user's home page
6. Tap the fans button
7. Swipe down the fans list until "there is no more for the time being" appears
8. Return to the user's home page
9. Tap the Works tab
10. Swipe down the list of works until "there is no more for the time being" appears
11. Tap the Likes tab
12. Swipe down the list of liked videos until "there is no more for the time being" appears
13. Return to the video page and repeat from step 1
Code preparation
Install the Appium client for python:
pip install Appium-Python-Client
Now for the code.
1. Start the app
from appium import webdriver

device_name = '127.0.0.1:62001'
device_port = '4723'
desired_caps = {
    "platformName": "Android",
    "platformVersion": "5.1.1",
    "deviceName": device_name,
    "appPackage": "com.ss.android.ugc.aweme",
    "appActivity": "com.ss.android.ugc.aweme.main.MainActivity",
    "noReset": True,
    "unicodeKeyboard": True,
    "resetKeyboard": True,
}
# Connect to the Appium server started earlier.
device_driver = webdriver.Remote('http://127.0.0.1:' + str(device_port) + '/wd/hub', desired_caps)
Once the app has launched, handle step 1: swipe down to refresh the video, using our own wrapper around the swipe method:
def swipe_page(driver, x1, y1, x2, y2):
    # x1, y1, x2, y2 are fractions of the screen size; convert them to pixels.
    screen = AppiumOprationPage.get_size(driver)
    screen_x1 = int(screen[0] * x1)
    screen_y1 = int(screen[1] * y1)
    screen_x2 = int(screen[0] * x2)
    screen_y2 = int(screen[1] * y2)
    driver.swipe(screen_x1, screen_y1, screen_x2, screen_y2)

swipe_page(device_driver, 0.5, 0.25, 0.5, 0.75)
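AppiumOprationPage.get_size is the author's own helper and is not shown in the article. As a rough sketch, it could be built on the WebDriver get_window_size() call. The get_size and to_pixels functions and the FakeDriver test double below are hypothetical stand-ins, so the example runs without an Appium server.

```python
def get_size(driver):
    """Hypothetical stand-in for AppiumOprationPage.get_size: (width, height)."""
    size = driver.get_window_size()  # WebDriver call; returns a dict
    return size["width"], size["height"]


def to_pixels(screen, x1, y1, x2, y2):
    """Convert fractional coordinates (0.0 to 1.0) into absolute pixels."""
    width, height = screen
    return int(width * x1), int(height * y1), int(width * x2), int(height * y2)


class FakeDriver:
    """Test double so the sketch runs without a device."""

    def get_window_size(self):
        return {"width": 720, "height": 1280}


# Same fractions as the swipe-down call: middle of the screen, 25% to 75% height.
print(to_pixels(get_size(FakeDriver()), 0.5, 0.25, 0.5, 0.75))
```

Working in fractions rather than pixels keeps the same call usable on emulators with different resolutions.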
After the video has refreshed, perform step 2: swipe left to enter the user's home page:
flick_page(device_driver, 0.8, 0.5, 0.2, 0.5)
flick_page is implemented the same way as swipe_page. Appium provides two gesture methods: swipe is an ordinary swipe between the given coordinates, while flick is a fast swipe that, after passing through the given coordinates, decelerates until it stops.
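The article does not list flick_page itself. A minimal sketch, assuming it mirrors the swipe wrapper and delegates to the driver's flick(start_x, start_y, end_x, end_y) method, might look like this. RecordingDriver is a hypothetical test double that records calls instead of driving a device.

```python
class RecordingDriver:
    """Test double that records gesture calls instead of touching a device."""

    def __init__(self, width, height):
        self._size = {"width": width, "height": height}
        self.calls = []

    def get_window_size(self):
        return self._size

    def flick(self, start_x, start_y, end_x, end_y):
        self.calls.append(("flick", start_x, start_y, end_x, end_y))


def flick_page(driver, x1, y1, x2, y2):
    """Flick between two points given as fractions of the screen size."""
    size = driver.get_window_size()
    width, height = size["width"], size["height"]
    driver.flick(
        int(width * x1), int(height * y1), int(width * x2), int(height * y2)
    )


driver = RecordingDriver(720, 1280)
# Right-to-left flick across the vertical middle of the screen.
flick_page(driver, 0.8, 0.5, 0.2, 0.5)
print(driver.calls)
```

With a real session, the Appium driver object would be passed in place of RecordingDriver.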
On the user's home page, we need a locator (id, XPath, and so on) for each button we want to tap. The following shows how to obtain the locator for the "follow" button:
After launching the app through Appium, manually swipe to the user's home page, click the refresh button in the middle of the Appium window, then click "follow" in the screen view on the left. The XML structure is shown in the middle and the button's basic information on the right. From this information you can derive an XPath for the follow button:
//android.widget.TextView[@text='follow']
Tap this button to enter the following page, then swipe down repeatedly until the end of the list:
device_driver.find_element_by_xpath("//android.widget.TextView[@text='follow']").click()
flick_page(device_driver, 0.5, 0.75, 0.5, 0.25)
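The "swipe until the end" loop is described but not shown. One way to sketch it is a helper that keeps swiping until an end-of-list marker appears in the page source, with a cap on the number of swipes. The name swipe_until_end, its parameters, and the fake page data are illustrative assumptions. The marker string must match the text exactly as the app displays it.

```python
def swipe_until_end(swipe_once, get_page_source, end_marker, max_swipes=500):
    """Swipe repeatedly until end_marker shows up in the page source.

    Returns the number of swipes performed. max_swipes guards against an
    endless loop if the marker never appears.
    """
    for count in range(max_swipes):
        if end_marker in get_page_source():
            return count
        swipe_once()
    return max_swipes


# Fake list that reaches its end after two swipes, so the sketch runs offline.
pages = iter(["<list/>", "<list/>", "<list>No more for the time being</list>"])
state = {"source": next(pages)}


def fake_swipe():
    state["source"] = next(pages)


count = swipe_until_end(
    fake_swipe, lambda: state["source"], "No more for the time being"
)
print(count)
```

Against a live session, swipe_once could be a lambda that calls flick_page(device_driver, 0.5, 0.75, 0.5, 0.25), and get_page_source a lambda that returns device_driver.page_source.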
After the swiping is done, use the same method to get the XPath of the back button:
//android.widget.ImageView[@resource-id='com.ss.android.ugc.aweme:id/nj']
Then tap it to return to the user's home page:
device_driver.find_element_by_xpath("//android.widget.ImageView[@resource-id='com.ss.android.ugc.aweme:id/nj']").click()
Tips:
1. Do not use an absolute path as the XPath. In testing, the absolute path differed between environments, whereas a relative path was more stable.
2. You can anchor on page text or an id for relative positioning, then navigate from there to the element you need.
3. Avoid locating by resource-id. In testing, this id was not unique, and the lookup only returned the first match.
4. You can also use the uiautomatorviewer tool under android-sdk\tools to find an XPath, but it needs to be upgraded first. In testing, uiautomatorviewer could not get the XPath for some newer versions of the Douyin app.
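Tips 1 to 3 can be checked offline against a saved page source. The sketch below uses Python's xml.etree.ElementTree, which supports a subset of XPath. The XML is a hypothetical, heavily simplified hierarchy, and the com.example ids are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified page source; a real dump is far larger.
PAGE = """
<hierarchy>
  <android.widget.FrameLayout>
    <android.widget.LinearLayout>
      <android.widget.TextView text="follow" resource-id="com.example:id/a1"/>
      <android.widget.TextView text="fans" resource-id="com.example:id/a1"/>
    </android.widget.LinearLayout>
  </android.widget.FrameLayout>
</hierarchy>
"""

root = ET.fromstring(PAGE.strip())

# Relative lookup by visible text: independent of how deeply the node is nested.
by_text = root.findall(".//android.widget.TextView[@text='follow']")

# Lookup by resource-id: both TextViews share the id, so the match is ambiguous.
by_id = root.findall(
    ".//android.widget.TextView[@resource-id='com.example:id/a1']"
)

print(len(by_text), len(by_id))
```

The text-based relative path selects exactly one node, while the shared resource-id matches two, which is the situation tip 3 warns about.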
In the same way, tap [follow], [fans], [works], and [like] in turn to complete a full pass. With the mitmproxy setup from the previous article acting as a proxy and parsing the captured traffic, all the data can be stored in your own database, or the videos downloaded to local disk.
Advanced level
Extensive testing showed that the amount of data collected per day was very limited, for two reasons:
1. A single emulator can only swipe so fast.
2. Data parsing is inefficient.
To address these two problems, a new scheme was added that supports scaling out across multiple emulators (which requires adequate computer hardware), as well as distributed data parsing and batch storage.
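The article gives no detail on how batch storage is implemented. As one possible illustration, the sketch below buffers parsed records and writes them in batches. SQLite, the records table, its columns, and the BatchWriter class are all assumptions made for the example, not the author's setup.

```python
import sqlite3


class BatchWriter:
    """Buffer records and write them to SQLite in batches."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.buffer = []
        # Hypothetical schema: one serialized payload per captured item.
        conn.execute("CREATE TABLE IF NOT EXISTS records (uid TEXT, payload TEXT)")

    def add(self, uid, payload):
        self.buffer.append((uid, payload))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        # One transaction per batch instead of one per record.
        with self.conn:
            self.conn.executemany("INSERT INTO records VALUES (?, ?)", self.buffer)
        self.buffer.clear()


conn = sqlite3.connect(":memory:")
writer = BatchWriter(conn, batch_size=2)
for i in range(5):
    writer.add(str(i), "{}")
writer.flush()  # write the final partial batch
print(conn.execute("SELECT COUNT(*) FROM records").fetchone()[0])
```

Grouping inserts this way trades a little latency for far fewer commits, which is usually what limits write throughput.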
With the new scheme in place, two emulators were used in a two-day test. On the first day, 530,000 records were collected in 10 hours. On the second day, a performance test brought the total to 1.116 million records in 10 hours. Data parsing did not appear to be saturated, and I estimate the setup could drive four emulators, but my hardware cannot run four, so no limit test was done.
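To put those totals in perspective, they can be converted into per-second rates. This is plain arithmetic on the figures quoted above. The function name is illustrative.

```python
def records_per_second(total_records, hours):
    """Average collection rate over the whole run."""
    return total_records / (hours * 3600)


day1 = records_per_second(530_000, 10)
day2 = records_per_second(1_116_000, 10)
emulators = 2

print(round(day1, 1), round(day2, 1))  # overall rate per day
print(round(day1 / emulators, 1), round(day2 / emulators, 1))  # per emulator
```

That works out to roughly 14.7 records per second on the first day and 31 on the second, or about 7.4 and 15.5 per emulator.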
The following is a screenshot of the sliding process of the two simulators and a statistical chart of the amount of data collected every day:
This concludes the walkthrough of the Android Douyin data collection method. Theory is best paired with practice, so go and try it!
© 2024 shulou.com SLNews company. All rights reserved.