Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How does Python climb Tencent Video's on-screen comment

2025-01-30 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Internet Technology >

Share

Shulou(Shulou.com)06/01 Report--

This article mainly introduces "how Python climbs Tencent Video's on-screen comment". In daily operation, I believe that many people have doubts about how Python climbed Tencent Video's on-screen comment. The editor consulted all kinds of materials and sorted out simple and easy-to-use methods of operation. I hope it will be helpful for you to answer the question of "how does Python climb Tencent Video's on-screen comment". Next, please follow the editor to study!

Basic development environment

Python 3.6

Pycharm

Use of related modules

Jieba

Wordcloud

Install Python and add it to the environment variable, and pip installs the relevant modules you need.

First, define the needs

Choose to crawl the on-screen comment sent by netizens.

Second, analyze the web page data

Copy the on-screen comment from the web page and search it in the developer's tool.

There is the corresponding on-screen comment data. One small feature of this url address is that the link contains danmu, so give it a try, filter and search for the keyword danmu to see if there is anything like it.

You can climb the entire video on-screen comment by iterating through it.

Third, analyze the data

Here I would like to ask, what kind of data do you think is returned by requesting this url address? You'll have three seconds to think about it.

1... 2... 3...

OK, now the answer is announced. It's a string. You heard it right. If you get respons.json () directly, you will report an error.

So how can it program json data? after all, json data is better to extract data.

The first method

Regular matching extracts data from the middle part of the data

Import json module, string to json data

Import requestsimport reimport jsonimport pprinturl = 'https://mfm.video.qq.com/danmu?otype=json&callback=jQuery19108312825154929784_1611577043265&target_id=6416481842%26vid%3Dt0035rsjty9&session_key=30475%2C0%2C1611577043 × tamp=105&_=1611577043296'headers = {' user-agent': 'Mozilla/5.0 (Windows NT 10.0) WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'} response = requests.get (url=url, headers=headers) result = re.findall ('jQuery19108312825154929784_1611577043265\ ((. *?)\), response.text) [0] json_data = json.loads (result) pprint.pprint (json_data)

The second method

Delete the callback=jQuery19108312825154929784_1611577043265 in the link and you can directly use response.json ()

Import requestsimport pprinturl = 'https://mfm.video.qq.com/danmu?otype=json&target_id=6416481842%26vid%3Dt0035rsjty9&session_key=30475%2C0%2C1611577043 × tamp=105&_=1611577043296'headers = {' user-agent': 'Mozilla/5.0 (Windows NT 10.0) WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'} response = requests.get (url=url, headers=headers) # result = re.findall ('jQuery19108312825154929784_1611577043265\ ((. *?)\), response.text) [0] json_data = response.json () pprint.pprint (json_data)

This is also possible, and it makes the code simpler.

Small knowledge points:

Pprint is a formatting output module that makes data output similar to json more beautiful.

The complete implementation code import requestsfor page in range (15150,15): url = 'https://mfm.video.qq.com/danmu' params = {' otype': 'json',' target_id': '6416481842 vidperformt0035rsjty913,' session_key': '30475 people 0Magi 161157704313,' timestamp': page,'_': '1611577043296' } headers = {'user-agent':' Mozilla/5.0 (Windows NT 10.0) WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36'} response = requests.get (url=url, params=params, headers=headers) json_data = response.json () contents = json_data ['comments'] for i in contents: content = I [' content'] with open ('comedian barrage .txt', mode='a' Encoding='utf-8') as f: f.write (content) f.write ('\ n') print (content) so far The study on "how Python climbs Tencent Video's on-screen comment" is over. I hope I can solve your doubts. The collocation of theory and practice can better help you learn, go and try it! If you want to continue to learn more related knowledge, please continue to follow the website, the editor will continue to work hard to bring you more practical articles!

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Internet Technology

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report