
How to Implement a Douyin Video Crawler with Java and PyCharm

Updated 2025-02-28. Source: SLTechnology News&Howtos


Shulou (Shulou.com), 06/01 report

This article explains how to build a Douyin video crawler with Java and PyCharm. The editor found it quite practical and is sharing it here for reference. Let's take a look.

1. Tools used:

Charles (any packet capture tool will do; use whichever you find convenient)

Target dynamic field: x-gorgon (values begin with 0408***)

Douyin version: 12.8.0 (the latest version at the time of posting), or Douyin Speed Edition (fewer files, so it decompiles faster)

IDA or JEB

jadx-gui

Frida

PyCharm

A rooted Android device or an emulator

2. Decompilation:

Open the APK directly in jadx-gui, or rename the APK to .zip, decompress it, and load all of the .dex files into jadx-gui together.
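The manual route (APK, then zip, then the .dex files) can be sketched in Python. This is a minimal sketch, not part of the original workflow: the helper name extract_entries and the output directory are hypothetical, and it relies only on the fact that an APK is an ordinary ZIP archive.

```python
import zipfile
from pathlib import Path


def extract_entries(apk_path, out_dir, suffix=".dex"):
    """Extract every archive entry ending in `suffix`; return the extracted paths.

    Hypothetical helper for illustration: an APK is a plain ZIP archive,
    so the standard zipfile module can read it without renaming.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(apk_path) as apk:
        for name in sorted(apk.namelist()):
            if name.endswith(suffix):
                # ZipFile.extract returns the path of the file it created
                extracted.append(Path(apk.extract(name, out_dir)))
    return extracted
```

Calling extract_entries("app.apk", "out") would collect the .dex files to load into jadx-gui; passing suffix=".so" lists the native libraries in the same way.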

Keyword search: x-gorgon

[Image: cdn.nlark.com/yuque/0/2020/png/97322/1607128046539-933da4e3-cc27-4aa6-86dd-93ecf6c9281f.png]

First look at the call:

The full name of the .so file is libcms.so. It can be found inside the APK once it has been unzipped.

Next, look at the declaration of the function:

None of the decompilers I tried restores this code perfectly, so do not waste time switching tools. Read the output as it is; if you have a solid foundation, you can read the Smali code directly.

You can see that r8 is the value we want, so trace r8 back to where it is assigned.

Here com.ss.a.b.a.a() is an overloaded method, so take care to select the right overload when hooking it.

Frida code (note the overload("") call):

This code decompiles cleanly and is easy to reimplement:

Now comes the hard part:

byte[] r0 = com.ss.sys.ces.a.leviathan(r8, r7, r0)

This one is not easy to reimplement. It takes three parameters:

leviathan() is declared native; its method body lives in libcms.so, the library loaded at the very beginning.

Open the library in IDA or JEB and you will find that no functions are recognized anywhere. It is protected with measures such as junk-instruction ("flower instruction") obfuscation.

Breaking it directly is too difficult, so consider a simpler route: call the function instead of reimplementing it.

Recommended first choice: frida-rpc. The implementation idea is outlined below; work out the details yourself.

(I make the active calls from Java myself. If you want to know more, follow the account and leave a message, or contact me on WeChat.)

This approach is relatively simple and can also be exposed as a microservice.

The arguments passed in can be captured with a Frida hook.

The output is:

Parameter 1: -1

Parameter 2: the ten-digit timestamp (derived from the thirteen-digit timestamp in the URL parameters). It is the same value as x-khronos.

Parameter 3: the data part built from the request parameters, shown as r0 in the figure below.
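Parameter 2 can be derived as in the following minimal sketch. The article does not name the URL parameter that carries the thirteen-digit timestamp, so the default name "_rticket" used here is an assumption, as is the helper name khronos_from_url. The sketch only shows the millisecond-to-second conversion.

```python
from urllib.parse import urlsplit, parse_qs


def khronos_from_url(url, param="_rticket"):
    """Return the 10-digit (seconds) timestamp from a 13-digit (ms) URL parameter.

    `param` is an assumed parameter name; the article does not state it.
    """
    query = parse_qs(urlsplit(url).query)
    millis = query[param][0]
    if len(millis) != 13 or not millis.isdigit():
        raise ValueError(f"expected a 13-digit timestamp, got {millis!r}")
    # Dropping the last three digits converts milliseconds to seconds
    return int(millis) // 1000
```

The returned value would be used both as parameter 2 and as the x-khronos header, since the article states that the two are the same.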

Parameter arrangement:

r0 = md5(the parameters after "?" in the URL): the MD5 of the URL's query parameters

r13 = x-ss-stub: present only for POST requests, otherwise 32 zeros

r11 = md5(cookie): the MD5 of the cookie

r12 = md5(cookie['sessionid']): the MD5 of the sessionid in the cookie, otherwise 32 zeros
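The four fields above could be prepared as in this minimal sketch. Several things here are assumptions and not taken from the article: the helper names, hashing the raw query string as UTF-8 text, and passing x-ss-stub in as a ready-made value (the article does not say how it is computed). The order in which the fields are joined before being passed to leviathan() is also not stated, so the sketch returns them separately.

```python
import hashlib
from urllib.parse import urlsplit

ZEROS = "0" * 32  # placeholder the article uses when a field is absent


def md5_hex(text):
    """MD5 of a string as 32 hex characters (UTF-8 encoding is an assumption)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def cookie_value(cookie, name):
    """Return the value of `name` in a 'k=v; k2=v2' cookie string, or None."""
    for part in cookie.split(";"):
        key, _, value = part.strip().partition("=")
        if key == name:
            return value
    return None


def build_fields(url, cookie, stub=None):
    """Build r0, r13, r11 and r12 as described in the parameter arrangement."""
    sessionid = cookie_value(cookie, "sessionid")
    return {
        "r0": md5_hex(urlsplit(url).query),      # MD5 of the part after "?"
        "r13": stub if stub else ZEROS,          # x-ss-stub, POST requests only
        "r11": md5_hex(cookie),                  # MD5 of the whole cookie
        "r12": md5_hex(sessionid) if sessionid else ZEROS,
    }
```

For a GET request, build_fields(url, cookie) leaves r13 as 32 zeros; for a POST request, the captured x-ss-stub header value would be passed as stub.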

