Updated 2025-03-29. From: SLTechnology News&Howtos (shulou), Development
How do you use Python to analyze bilibili's on-screen comments (danmaku, sometimes translated as "barrage")? Many beginners are unsure how to go about it, so this article walks through the process in detail. I hope you find it useful.
Analyzing bilibili's on-screen comments with Python
A paper towel is dry rubbish no matter how wet it is. A melon seed shell is wet rubbish no matter how dry it is. Lately everyone has been tormented by garbage sorting. Are you hopelessly confused too? Since July 1, 2019, Shanghai has taken the lead in enforcing a garbage sorting system, and those who violate the regulations face fines.
To avoid heavy losses, I decided to go to bilibili to learn how to sort garbage. Why bilibili? I hear it is one of the most popular places for young people to learn.
When I opened bilibili and searched for garbage sorting, one title startled me (that is, drew me in): "The correct way to be humiliated in Shanghai".
Of course, "humiliated" here is a pun. It is not about losing face; it refers to throwing out rubbish.
Clicking through, I found a crosstalk routine performed by two cute (AI) girls that explains in detail how to sort garbage, and I was instantly interested.
I watched it over and over and could not stop; it was like being brainwashed. After all, the video is great fun, and the on-screen comments in the video are even more fun!
Having fun alone is not as good as having fun together, so why not use Python to save the on-screen comments and make a word cloud? And so it was happily decided!
1 Environment
Operating system: Windows
Python version: 3.7.3
2 Requirements analysis
First, we use the browser's developer tools to find the cid of this video's on-screen comment data.
After you get the cid, fill it into the link below.
http://comment.bilibili.com/{cid}.xml
After opening it, you can see a list of on-screen comments for the video.
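As a minimal sketch of these two steps, the helpers below fill a cid into the URL template and pull the comment text out of the returned XML. The sketch uses only the standard library's xml.etree.ElementTree rather than the modules installed in section 3. The function names and the sample XML string are made up for illustration, and the assumption that each comment is the text of a <d> element comes from the parsing code in section 3.

```python
import xml.etree.ElementTree as ET

URL_TEMPLATE = "http://comment.bilibili.com/{cid}.xml"


def comment_url(cid):
    """Fill a video's cid into the comment URL template."""
    return URL_TEMPLATE.format(cid=cid)


def extract_comments(xml_text):
    """Return the text of every non-empty <d> element in the comment XML."""
    root = ET.fromstring(xml_text)
    return [d.text for d in root.iter("d") if d.text]


# Hand-written sample standing in for a downloaded file (not real data).
SAMPLE_XML = """<i>
  <d p="hypothetical-attributes">first comment</d>
  <d p="hypothetical-attributes">second comment</d>
</i>"""

print(comment_url(99768393))
print(extract_comments(SAMPLE_XML))
```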
After we have the on-screen comment data, we need to parse it first and save it locally to facilitate further processing, such as making a word cloud map for display.
3 Code implementation
Here we use the requests module to fetch the page, the beautifulsoup4 module (with lxml as its parser) to parse the returned XML, and the pandas module to save the result as CSV. These are all third-party modules; if they are not already in your environment, install them with pip.
pip install requests
pip install beautifulsoup4
pip install lxml
pip install pandas
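To check which of these modules still need installing, something like the following may help. It is a small sketch using only the standard library, and the helper name is made up here. Note that the import name can differ from the pip package name: beautifulsoup4 is imported as bs4.

```python
import importlib.util


def missing_modules(import_names):
    """Return the names from import_names that cannot be found for import."""
    return [name for name in import_names
            if importlib.util.find_spec(name) is None]


# Import names for the four packages above (beautifulsoup4 is imported as bs4).
print(missing_modules(["requests", "bs4", "lxml", "pandas"]))
```

An empty list means everything is already available; any name that is printed still needs a pip install.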
After the modules are installed, import them:
import requests
from bs4 import BeautifulSoup
import pandas as pd
Request, parse and save on-screen comment data
# Request the on-screen comment data
url = 'http://comment.bilibili.com/99768393.xml'
html = requests.get(url).content

# Parse the on-screen comment data: each comment is the text of a <d> element
html_data = str(html, 'utf-8')
soup = BeautifulSoup(html_data, 'lxml')
results = soup.find_all('d')
comments = [comment.text for comment in results]
comments_dict = {'comments': comments}

# Save the on-screen comment data locally
br = pd.DataFrame(comments_dict)
br.to_csv('barrage.csv', encoding='utf-8')
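With its default settings, to_csv writes the DataFrame index as an unnamed first column and the column name as a header row, so barrage.csv starts with the line ",comments" followed by rows of the form "0,<text>". The sketch below reads such a file back with the standard library's csv module instead of pandas, purely to show the layout. The function name and the sample rows are invented for illustration.

```python
import csv
import os
import tempfile


def load_comments(path):
    """Read the comment column from a CSV laid out as index,comments."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (",comments")
        return [row[1] for row in reader if len(row) > 1]


# Build a tiny file in the same layout (made-up rows, not real comments).
sample_path = os.path.join(tempfile.mkdtemp(), "barrage_sample.csv")
with open(sample_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f)
    writer.writerow(["", "comments"])
    writer.writerow([0, "first comment"])
    writer.writerow([1, "second, with a comma"])

print(load_comments(sample_path))
```

The comment text sits in the second column (index 1), which is the column the later processing step reads.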
Next, we process the saved on-screen comment data further.
To make the word cloud we need the wordcloud, matplotlib, and jieba modules. These are also third-party modules and can be installed directly with pip.
pip install wordcloud
pip install matplotlib
pip install jieba
After the modules are installed, import them. Since we use the pandas module to read the file, we import it as well.
from wordcloud import WordCloud, ImageColorGenerator
import matplotlib.pyplot as plt
import pandas as pd
import jieba
We can choose any picture we like and generate a custom word cloud based on it, and we can also customize the word cloud's style. The code is as follows:
# Read the background image to use as the mask
mask_img = plt.imread('Bulb.jpg')

# Set the word cloud style
wc = WordCloud(
    # Font file
    font_path='SIMYOU.TTF',
    # Maximum number of words
    max_words=2000,
    # Maximum font size
    max_font_size=80,
    # Background image (mask)
    mask=mask_img,
    # Background color of the output picture; None with RGBA mode gives a transparent background
    background_color=None, mode="RGBA",
    # Seed for the random generator, so the layout and colors are reproducible
    random_state=30)
Next, we read the text (the on-screen comment data), segment it into words, and join the words with spaces:
# Read the contents of the file
br = pd.read_csv('barrage.csv', header=None)

# Segment each comment into words and join them with spaces;
# the trailing space keeps adjacent comments from running together
text = ''
for line in br[1]:
    text += ' '.join(jieba.cut(line, cut_all=False)) + ' '
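Since a word cloud sizes words by how often they occur, a quick frequency count of the space-joined text gives a preview of which words will dominate the picture. This is a sketch using collections.Counter; the helper name and the sample string are made up, and in practice the text built by the loop above would be passed in.

```python
from collections import Counter


def top_words(text, n=10, min_length=2):
    """Return the n most common space-separated tokens of at least min_length characters."""
    tokens = [t for t in text.split() if len(t) >= min_length]
    return Counter(tokens).most_common(n)


# Made-up sample standing in for the segmented comment text.
sample_text = "dry rubbish wet rubbish dry rubbish a a a"
print(top_words(sample_text, n=2))
```

The min_length filter is an illustrative choice for dropping single-character tokens; whether it suits segmented Chinese text depends on the data.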
Finally, let's take a look at the result.
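The code above builds wc and text but stops short of the call that actually draws the picture. A minimal sketch of that last step, assuming wc is the WordCloud object configured earlier: generate() lays out the words from the text and to_file() saves the image. The helper name and the output filename are made up for illustration. Because the cloud was set up with mode="RGBA" and no background color, a PNG filename is used so the transparency can be kept.

```python
def render_word_cloud(wc, text, out_path="barrage_wordcloud.png"):
    """Generate the cloud from space-separated text and save it as an image."""
    if not text.strip():
        raise ValueError("no text to build a word cloud from")
    wc.generate(text)     # count the words and lay them out inside the mask
    wc.to_file(out_path)  # write the rendered image to disk
    return out_path
```

With the objects defined earlier, this would be called as render_word_cloud(wc, text). To preview the result in a window instead, plt.imshow(wc) followed by plt.axis('off') and plt.show() is the usual matplotlib route.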
Can you feel everyone's enthusiasm for the topic of garbage sorting? An inexplicable joy wells up.
4 Postscript
These two cute AI girls perform crosstalk very well; I wonder how Guo Degang would feel if he saw this work. Back to the topic of garbage sorting: the Shanghai Municipal Solid Waste Management Regulations have formally taken effect, and friends outside Shanghai should not celebrate too soon. The Ministry of Housing and Urban-Rural Development has said that 46 other key cities across the country are about to experience it too... Ha, interesting!