2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
This article shows how a Python crawler can quickly crack JS-encrypted content. It should be a useful reference for anyone interested, so let's walk through it.
Preface
There are generally two ways to crack JS encryption. One is to rewrite the JS logic in Python; the other is to use a third-party library to run the JS code and collect the result. Each has its trade-offs. The first performs well but requires a solid command of both JS and Python. The second is quick and convenient, and it works well even on complex JS encryption. This time we will use a third-party library.
Target website
The target is the qimingpian.com website. The site encrypts the data it displays, so you can't find that data directly in the page source.
Target URL: https://www.qimingpian.com/finosda/project/pinvestment
JS analysis and debugging tool
For analyzing and debugging JS, use Google Chrome; its developer tools make debugging and testing very convenient. First, press F12 to open the developer tools, select the Network tab, and check the Preserve log option. Then enter the target URL to capture the requests. You will find that the page source does not contain the content shown on the page, and searching for text you can see on the page turns up nothing, which means the page content is encrypted.
Next, go through the captured requests one by one and look for anything suspicious. We usually start with the XHR requests. A quick look turns up a response that contains an encrypt_data field, which looks like a likely candidate.
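As a rough illustration of what the crawler has to pick out of that XHR response, here is a minimal Python sketch. It assumes the response body is JSON with encrypt_data as a top-level string field; the sample body and the helper name extract_encrypt_data are made up for the example and are not taken from the site.

```python
import json


def extract_encrypt_data(response_text: str) -> str:
    """Return the encrypt_data string from a JSON response body."""
    payload = json.loads(response_text)
    # Assumption: encrypt_data sits at the top level of a JSON object.
    if not isinstance(payload, dict):
        raise ValueError("response body is not a JSON object")
    value = payload.get("encrypt_data")
    if not isinstance(value, str) or not value:
        raise ValueError("response has no encrypt_data string")
    return value


# Made-up response body for illustration only; the real one may carry other fields.
sample = '{"status": 0, "encrypt_data": "bm90IHJlYWwgZGF0YQ=="}'
print(extract_encrypt_data(sample))
```

The value returned here is still encrypted; it is only the input to the decryption function traced in the following steps.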
Press Ctrl+Shift+F and search for encrypt_data. Find the match in the first JS file, go to the return e.encrypt_data line below it, and set a breakpoint there to see what it holds. (When a search lands on a return statement, it is usually worth setting a breakpoint to inspect the value.)
After setting the breakpoint, refresh the page. Select e.encrypt_data, right-click, and choose the option that evaluates the selection in the console; the value of the selection then appears below. Try Object(d.a)(e.encrypt_data) in the same way: at this first pause it does not yet return the page content.
Pay close attention to the breakpoint here. If you resume once and repeat the steps above, you will find that Object(d.a)(e.encrypt_data) now returns exactly what we want. e.encrypt_data is the encrypt_data from our XHR response, and Object(d.a) is the function that decrypts it. All we have to do is reproduce this function.
Select Object(d.a) to see where it is defined, and click to jump to it. You can see that the function returns a JSON object. The return expression calls an s function whose only variable argument is a.a.decode(t); the remaining arguments are constants.
So we use the same method to locate the s function and the a.a.decode() function: set a new breakpoint on the return JSON.parse line, step into the next call, and you will reach the bodies of those functions.
[Screenshot: the s function]
[Screenshot: the a.a.decode() function]
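The shape of that return expression, JSON.parse applied to s(a.a.decode(t), constants), can be sketched in Python as a three-stage pipeline. This is only a structural sketch: the article does not show what a.a.decode or s actually compute, so both are passed in as functions, and the Base64 decoder and pass-through "cipher" used in the demo are placeholders, not the site's algorithms.

```python
import base64
import json
from typing import Any, Callable


def decrypt_payload(
    encrypt_data: str,
    decode: Callable[[str], bytes],
    decrypt: Callable[[bytes], str],
) -> Any:
    """Mirror JSON.parse(s(a.a.decode(t), ...)) as decode -> decrypt -> parse."""
    raw = decode(encrypt_data)    # stands in for a.a.decode(t)
    plaintext = decrypt(raw)      # stands in for s(..., constants)
    return json.loads(plaintext)  # stands in for JSON.parse


# Placeholders only: Base64 and a pass-through "cipher" are assumptions for the demo.
def demo_decode(text: str) -> bytes:
    return base64.b64decode(text)


def demo_decrypt(raw: bytes) -> str:
    return raw.decode("utf-8")


sample = base64.b64encode(json.dumps({"list": [1, 2]}).encode("utf-8")).decode("ascii")
print(decrypt_payload(sample, demo_decode, demo_decrypt))
```

Rewriting the JS logic in Python would mean replacing the two placeholders with faithful ports of the real functions; calling the original JS instead avoids that work.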
Run the functions above in WebStorm for debugging
1. Install Node.js: download it from the official website and install it. This provides the JS runtime. Installation tutorials are easy to find, so we won't go into detail here.
2. Install and activate WebStorm. Tutorials are easy to find online. It works much like PyCharm.
Copy out all the functions we need from above, put them into WebStorm, and run them. Note that some functions reference names that are not defined in the copied code. When you run into these, look them up one by one in Chrome. Most of them turn out to be constants that can be substituted directly.
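If you would rather not edit the copied JS by hand, one option is to define the missing constants in front of it. The following is a minimal sketch of that idea, assuming the values have already been read off in the browser's debugger; the helper name prepend_constants and the names KEY and ROUNDS are hypothetical.

```python
import json


def prepend_constants(js_source: str, constants: dict) -> str:
    """Define missing names as JavaScript variables ahead of the extracted code."""
    lines = []
    for name, value in constants.items():
        if not name.isidentifier():
            raise ValueError(f"not a usable variable name: {name!r}")
        # json.dumps renders strings, numbers, lists and dicts as JavaScript literals.
        lines.append(f"var {name} = {json.dumps(value)};")
    return "\n".join(lines) + "\n" + js_source


# Hypothetical extracted function that relies on two names it never defines.
extracted = "function demo() { return KEY + ROUNDS; }"
print(prepend_constants(extracted, {"KEY": "abc", "ROUNDS": 3}))
```

Using var rather than const keeps the script valid even if the copied code declares the same name again with var.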
Then we call the method above and see that the data comes back correctly. I renamed some of the JS methods here. Note that only code snippets are shown.
Finally, call the decryption function from Python. To protect the site, the complete code is not posted here.
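The article does not say which third-party library it uses for this last step. As a stand-in, the sketch below uses only the Python standard library and runs the extracted JS with the Node.js installed earlier, passing the script on standard input. The function names build_node_script and call_js and the demo decrypt function are hypothetical; the real decryption code would take their place.

```python
import json
import shutil
import subprocess


def build_node_script(js_source: str, func_name: str, arg: str) -> str:
    """Append a call that prints func_name(arg) as JSON to the extracted JS source."""
    # json.dumps yields a valid JavaScript string literal for the argument.
    call = f"console.log(JSON.stringify({func_name}({json.dumps(arg)})));"
    return js_source + "\n" + call + "\n"


def call_js(js_source: str, func_name: str, arg: str, node: str = "node"):
    """Run the script with Node.js and parse the JSON it prints."""
    if shutil.which(node) is None:
        raise RuntimeError("Node.js was not found on PATH")
    script = build_node_script(js_source, func_name, arg)
    # "node -" reads the script from standard input, so a long script
    # never has to fit on the command line.
    result = subprocess.run(
        [node, "-"], input=script, capture_output=True,
        text=True, check=True, timeout=30,
    )
    return json.loads(result.stdout)


# Hypothetical stand-in for the extracted decryption function.
demo_js = 'function decrypt(s) { return {reversed: s.split("").reverse().join("")}; }'

if __name__ == "__main__":
    if shutil.which("node"):
        print(call_js(demo_js, "decrypt", "abc"))
    else:
        print(build_node_script(demo_js, "decrypt", "abc"))
```

Exchanging the argument and the result as JSON keeps quoting and non-ASCII characters intact in both directions.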