How to crawl bilibili video bullet comments (danmaku) with Python and make a word cloud

Updated: 2025-01-17. Source: SLTechnology News&Howtos (Shulou.com), Internet Technology, 06/02

Crawling the bullet comments (danmaku) of a bilibili video with Python and turning them into a word cloud can stump newcomers. This article walks through the problem and one way to solve it.

Preface

Today we introduce bilibili_api, a Python extension library for obtaining data from bilibili (Station B).

The available modules include:

video - video module

user - user module

dynamic - dynamic module

A comparison

Nothing shows the difference like a side-by-side comparison, much like the recent comparisons of "Harbin Institute of Technology" and "Zhejiang University" students.

This is how bullet comments were obtained before:

1. Find the bullet comment data interface

https://comment.bilibili.com/123072475.xml (a fixed URL prefix + the video's cid + .xml)

2. Use the requests module to fetch the data

3. Use XPath to parse the data
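The three steps above can be sketched roughly as follows. This is a minimal sketch, not the original author's code: the function names are hypothetical, the standard library's xml.etree.ElementTree (which supports a limited XPath subset) stands in for a full XPath library, and it assumes the interface returns XML in which each bullet comment is stored in a <d> element.

```python
import xml.etree.ElementTree as ET

# Fixed URL prefix + cid + .xml, as described in step 1.
DANMAKU_URL = "https://comment.bilibili.com/{cid}.xml"


def fetch_danmaku_xml(cid):
    """Download the raw bullet comment XML for a video cid (needs network access)."""
    import requests  # third-party: pip install requests

    resp = requests.get(DANMAKU_URL.format(cid=cid), timeout=10)
    resp.raise_for_status()
    return resp.content  # bytes; the XML parser reads the declared encoding


def parse_danmaku_xml(xml_data):
    """Return the text of every <d> element (assumed to hold one comment each)."""
    root = ET.fromstring(xml_data)
    return [d.text for d in root.findall(".//d") if d.text]


# Usage (hypothetical, requires network access):
# for text in parse_danmaku_xml(fetch_danmaku_xml(123072475)):
#     print(text)
```

Keeping the download and the parsing in separate functions makes the parsing step easy to check against a saved XML file without touching the network.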

Now for the real thing.

With bilibili_api's encapsulation, fetching the bullet comment data takes only one line of code:

danmu = video_info.get_danmaku()

Obtaining the video's basic information and comments is just as convenient:

basic_info = video_info.get_video_info()
comments = video_info.get_comments()

Quick start

Next, we use bilibili_api to get the bullet comments of the "Running Man" 10th anniversary special and draw a word cloud.

Video link:

https://www.bilibili.com/video/BV1gC4y1h722

bilibili videos have both an AV number and a BV number. Since the site's revision, the BV number is shown directly in the link. One of the two must be provided.

The bvid is bilibili's new unique video identifier. It is a 12-character string of digits and letters and is case-sensitive. Include the "BV" prefix when passing it in.

For example: "BV1gC4y1h722"
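If you only have the video link, the BV number can be pulled out of it with a regular expression. A minimal sketch: extract_bvid is a hypothetical helper, not part of bilibili_api, and the pattern assumes the 12-character format described above ("BV" followed by 10 letters or digits).

```python
import re

# Assumption: "BV" followed by 10 letters/digits, 12 characters in total.
BVID_PATTERN = re.compile(r"BV[0-9A-Za-z]{10}")


def extract_bvid(url_or_id):
    """Return the first BV number found in the text, or None if there is none."""
    match = BVID_PATTERN.search(url_or_id)
    return match.group(0) if match else None


print(extract_bvid("https://www.bilibili.com/video/BV1gC4y1h722"))  # BV1gC4y1h722
```

The match is case-sensitive on purpose, since a lowercased BV number is a different (invalid) identifier.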

0) Installation

bilibili_api wraps bilibili's data APIs and depends on the requests module.

Install via pip:

pip install bilibili_api

1) Import modules

from bilibili_api import Verify
from bilibili_api.video import VideoInfo
from bilibili_api.video import Danmaku

VideoInfo class: gets video information (bullet comments, comments, number of coins, number of views, etc.)

Danmaku class: the bullet comment class, used to get and send bullet comments

Verify class: optional. Some video information (such as historical bullet comments) requires login, that is, SESSDATA.

SESSDATA and csrf are both required for user actions such as liking a video or giving it coins.

For more information on how to obtain SESSDATA and csrf, please refer to the following wiki:

https://github.com/Passkou/bilibili_api/wiki (page: SESSDATA and CSRF acquisition methods, using Chrome as an example)

2) Get bullet comment data

Create a VideoInfo object and pass in two parameters:

bvid="BV1gC4y1h722" (the video's BV number)

verify=verify (credentials built from sessdata and csrf)

The returned bullet comment data is a list of Danmaku objects. You can iterate over it and print each item's text.

The code:

verify = Verify(sessdata="yours", csrf="yours")
video_info = VideoInfo(bvid="BV1gC4y1h722", verify=verify)
danmu = video_info.get_danmaku()
for i in danmu:
    print(i.text)

3) Draw the word cloud

Draw the word cloud using jieba for word segmentation and WordCloud for rendering.
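The segmentation step is not shown in the article, so here is one way it might look. This is a minimal sketch under assumptions: build_words_str and its stopwords parameter are hypothetical names, and the tokenizer defaults to jieba.lcut but can be swapped out. The result is a space-separated string of the kind WordCloud.generate accepts.

```python
def build_words_str(texts, cut=None, stopwords=frozenset()):
    """Segment each comment into words and join them with spaces."""
    if cut is None:
        import jieba  # third-party: pip install jieba

        cut = jieba.lcut  # returns a list of tokens for one string

    words = []
    for text in texts:
        for word in cut(text):
            word = word.strip()
            # Skip whitespace-only tokens and any user-supplied stopwords.
            if word and word not in stopwords:
                words.append(word)
    return " ".join(words)


# Usage (hypothetical): words_str = build_words_str(d.text for d in danmu)
```

Passing the tokenizer as a parameter keeps the function usable on non-Chinese text, for example with cut=str.split.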

Parameters such as the background color, background (mask) image, and font can be passed to the WordCloud object.

The code:

import matplotlib.pyplot as plt
from wordcloud import WordCloud, random_color_func

# background_Image (the mask array) and words_str (space-separated words)
# are assumed to have been prepared beforehand.
wc = WordCloud(
    background_color='white',
    mask=background_Image,
    font_path=r'./SourceHanSerifCN-Medium.otf',
    color_func=random_color_func,
    random_state=50,
)
word_cloud = wc.generate(words_str)  # generate the word cloud
word_cloud.to_file("rm.jpg")  # save the picture

# show the word cloud picture
plt.imshow(word_cloud)
plt.axis('off')
plt.show()

4) Final result

In the word cloud, the most prominent phrases are "Happy 10th Anniversary", "RM 10th Anniversary", "Hahahaha", and so on. Thanks to Running Man...
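Word size in a word cloud reflects how often a word appears, so the same ranking can be cross-checked numerically. A minimal sketch using collections.Counter: top_danmaku is a hypothetical helper, and the sample list is made up for illustration, not real data from the video.

```python
from collections import Counter


def top_danmaku(texts, n=3):
    """Return the n most frequent comment strings with their counts."""
    cleaned = (t.strip() for t in texts if t and t.strip())
    return Counter(cleaned).most_common(n)


# Made-up sample; in practice pass [d.text for d in danmu].
sample = [
    "Hahahaha",
    "Happy 10th Anniversary",
    "Hahahaha",
    "RM 10th Anniversary",
    "Hahahaha",
    "Happy 10th Anniversary",
]
print(top_danmaku(sample, n=2))
# [('Hahahaha', 3), ('Happy 10th Anniversary', 2)]
```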

That covers how to crawl a bilibili video's bullet comments with Python and turn them into a word cloud. Thank you for reading!
