How to Use Python to Scrape Weibo Comments on a Schedule

Updated 2025-02-25. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/03

This article explains how to use Python to scrape Weibo comments on a schedule. The approach is simple and easy to follow; work through it step by step.

[Part 1: Theory]

Consider a question: if we want to scrape the comments on a post by a Weibo "Big V" (a verified, high-profile account), how should we go about it? The easiest way is to find the interface that serves Weibo comment data, then vary its parameters to fetch the latest data and save it. Start by looking for a comment interface in the Weibo API, as shown in the following figure.

Unfortunately, that interface is rate limited. It gets blocked after only a few requests, so this approach fails before it really gets going.

Next, we turn to Weibo's mobile website. Log in first, open the post whose comments you want to scrape, open the browser's built-in network analysis tool, and keep scrolling down through the comments until the comment data interface shows up, as shown in the following figure.

Then click the parameters tab to see the request parameters, as shown in the following figure:

There are 4 parameters in total. The first and second are the id of the post, which works like a person's ID card number. max_id is the paging parameter and must change on every request; the max_id value for the next request is contained in the data returned by the current request.
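As a minimal sketch of how these parameters fit together, the function below builds the URL for one page of comments. The endpoint path, the parameter names `id`, `mid` and `max_id_type`, and the value 0 are assumptions; check them against what your own browser's network panel shows.

```python
from urllib.parse import urlencode

# Assumed endpoint; confirm it in the browser's network panel.
COMMENTS_URL = "https://m.weibo.cn/comments/hotflow"


def build_comment_url(post_id, max_id=None):
    """Return the URL for one page of a post's comments."""
    # The first two parameters both carry the post's id (names assumed).
    params = {"id": post_id, "mid": post_id}
    if max_id is not None:
        # Every page after the first needs the max_id from the previous response.
        params["max_id"] = max_id
    # Assumed name and value for the fourth parameter.
    params["max_id_type"] = 0
    return f"{COMMENTS_URL}?{urlencode(params)}"
```

Calling `build_comment_url(post_id)` gives the first-page URL, and `build_comment_url(post_id, max_id)` gives every later one.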

[Part 2: Hands-on]

With the basics above in place, let's write the code and implement it in Python.

1. First, distinguish between the two URL forms. The first request does not need max_id; every later request uses the max_id returned by the previous one.

2. Send cookie data with each request. A Weibo cookie stays valid for a long time, long enough to scrape the comments of one post. The cookie can be copied from the browser's network analysis tool.
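One possible way to attach the cookie, sketched with the standard library's `urllib.request`. The helper names `build_request` and `fetch_json` are hypothetical, and the browser-like `User-Agent` header is an assumption rather than something the steps above require.

```python
import json
import urllib.request


def build_request(url, cookie):
    """Attach the logged-in session cookie to a request for `url`."""
    headers = {
        "Cookie": cookie,  # copied from the browser's network analysis tool
        "User-Agent": "Mozilla/5.0",  # assumption: a browser-like agent is expected
    }
    return urllib.request.Request(url, headers=headers)


def fetch_json(url, cookie, timeout=10):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(build_request(url, cookie), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Keeping the cookie as a function argument, rather than hard-coding it, makes it easy to swap in a fresh one when the old one eventually expires.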

3. Then parse the returned data as JSON and extract fields such as the comment text, the commenter's nickname, and the comment time. The output is shown in the following figure.
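The extraction step might look like the sketch below. The response layout (`data` → `data` → a list of comments) and the field names `text`, `user.screen_name` and `created_at` are assumptions about what the interface returns; adjust them to the JSON you actually see.

```python
import json


def parse_comments(body):
    """Decode a response body and pull out text, nickname and time."""
    payload = json.loads(body)
    # Assumed shape: {"data": {"data": [ {...comment...}, ... ]}}
    items = (payload.get("data") or {}).get("data") or []
    comments = []
    for item in items:
        comments.append({
            "text": item.get("text", ""),
            "nickname": (item.get("user") or {}).get("screen_name", ""),
            "time": item.get("created_at", ""),
        })
    return comments
```

Using `.get` with defaults means a response without comments yields an empty list instead of raising a `KeyError`.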

4. Before saving the comment text, strip the emoticons from it with regular expressions, as shown in the following figure.
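A rough sketch of the cleaning step, assuming emoticons arrive either as HTML markup embedded in the comment text or as short bracketed codes such as `[doge]`. Both patterns are guesses about the data, not the original article's exact expressions.

```python
import re

# HTML tags, which is how emoticon images are assumed to be embedded.
TAG_RE = re.compile(r"<[^>]+>")
# Short bracketed emoticon codes such as [doge] (assumed format).
EMOTICON_RE = re.compile(r"\[[^\[\]]{1,10}\]")


def clean_comment(text):
    """Remove emoticon markup and codes, leaving only the plain text."""
    text = TAG_RE.sub("", text)
    text = EMOTICON_RE.sub("", text)
    return text.strip()
```

Note that the bracket pattern also removes any ordinary text a user happened to write in square brackets, so loosen or tighten it to suit your data.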

5. Then save the content to a txt file using the built-in open function, as shown in the following figure.
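Saving can be as small as the sketch below. The file name `comments.txt` and the one-comment-per-line layout are assumptions chosen for illustration.

```python
def save_comments(lines, path="comments.txt"):
    """Append each comment to `path` as one line of UTF-8 text."""
    # Append mode keeps what earlier runs already wrote.
    with open(path, "a", encoding="utf-8") as f:
        for line in lines:
            # Flatten embedded newlines so each comment stays on one line.
            f.write(line.replace("\n", " ") + "\n")
```

Passing `encoding="utf-8"` explicitly matters here, because comments are mostly Chinese text and the platform default encoding may not handle it.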

6. The key point: this interface returns at most 16 pages of data (20 entries per page). Some sources online say 50 pages can be returned, and the number of entries returned varies between interfaces, so I simply added a for loop to walk through all the pages in one go, as shown in the following figure.
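The paging loop can be sketched as follows. `fetch_page` is a hypothetical callable standing in for the request and parsing steps, and treating a missing or zero max_id as the last page is an assumption.

```python
def crawl_pages(fetch_page, max_pages=16):
    """Follow max_id from page to page and collect every comment.

    `fetch_page(max_id)` is a hypothetical callable that returns
    (comments, next_max_id) for one page; it receives None for the first page.
    """
    collected = []
    max_id = None
    for _ in range(max_pages):
        comments, max_id = fetch_page(max_id)
        collected.extend(comments)
        if not max_id:
            # Assumption: a missing or zero max_id marks the last page.
            break
    return collected
```

The `max_pages` cap mirrors the 16-page limit mentioned above, so the loop stops even if the interface keeps handing back a max_id.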

7. The function is named job. To keep fetching the latest data, use the schedule library to run it on a timer, for example every 10 minutes or every half hour, as shown in the following figure.
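A minimal sketch of the timer, assuming the third-party `schedule` package is installed (`pip install schedule`). The body of `job` is a placeholder, and `run_every` is a hypothetical helper name.

```python
import time


def job():
    """Placeholder for one scraping pass; replace the body with the real work."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S")
    print("scraping at", stamp)
    return stamp


def run_every(minutes, task=job):
    """Run `task` every `minutes` minutes until the process is stopped."""
    # Imported here so the rest of the module loads without the package.
    import schedule

    schedule.every(minutes).minutes.do(task)
    while True:
        schedule.run_pending()  # runs the task when it is due
        time.sleep(1)


# run_every(10)  # blocks forever; uncomment to start
```

`schedule` only decides when a task is due; the `while` loop is what keeps the process alive, so the script has to stay running for the timer to fire.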

8. Deduplicate the fetched data, as shown in the following figure. If a comment has already been saved, skip it; otherwise, add it.
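Deduplication could be done against the saved file itself, as in this sketch. It assumes one comment per line in `comments.txt` and compares comments by their exact text; `append_new` is a hypothetical helper name.

```python
import os


def append_new(comments, path="comments.txt"):
    """Append only comments not already in the file; return how many were added."""
    seen = set()
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            seen = {line.rstrip("\n") for line in f}
    added = 0
    with open(path, "a", encoding="utf-8") as f:
        for comment in comments:
            if comment in seen:
                continue  # already saved, possibly on an earlier run
            f.write(comment + "\n")
            seen.add(comment)
            added += 1
    return added
```

Because each scheduled run fetches the same recent pages again, comparing against what is already on disk keeps repeated runs from writing the same comment twice.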

The work is almost finished at this point.

Thank you for reading. That covers how to use Python to scrape Weibo comments on a schedule. You should now have a clearer picture of the approach; the specifics still need to be verified in practice.

© 2024 shulou.com SLNews company. All rights reserved.
