2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article walks through a worked example of data visualization and text analysis in Python. Many people run into difficulties when working on real cases like this one, so follow along to see how each step is handled.
1. Word frequency statistics
We use the jieba word segmentation library and the matplotlib module in Python to analyze the vocabulary of the whole article and extract the 20 most frequent words. The results are as follows:
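The word-frequency step can be sketched roughly as follows. This is a minimal sketch, not the author's actual code. The function names `top_words` and `plot_top_words`, the single-character filter, and the output file name are all assumptions made for illustration. jieba is imported lazily, so the counting logic can also be tried with any other tokenizer.

```python
from collections import Counter


def top_words(text, n=20, tokenizer=None, stopwords=frozenset()):
    """Return the n most frequent tokens in text as (word, count) pairs."""
    if tokenizer is None:
        # Assumption: jieba is installed; jieba.lcut returns a list of tokens.
        import jieba
        tokenizer = jieba.lcut
    tokens = (tok.strip() for tok in tokenizer(text))
    # Assumption: drop whitespace, single characters, and stopwords,
    # which in Chinese text are mostly punctuation and particles.
    kept = [tok for tok in tokens if len(tok) > 1 and tok not in stopwords]
    return Counter(kept).most_common(n)


def plot_top_words(pairs, out_path="top_words.png"):
    """Draw a horizontal bar chart of (word, count) pairs with matplotlib."""
    import matplotlib.pyplot as plt

    words = [word for word, _ in pairs]
    counts = [count for _, count in pairs]
    fig, ax = plt.subplots(figsize=(8, 6))
    ax.barh(words[::-1], counts[::-1])  # reversed so the top word is drawn first
    ax.set_xlabel("Occurrences")
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

Calling `plot_top_words(top_words(article_text))` on the article text would then produce the chart. matplotlib's default font usually lacks Chinese glyphs, so a CJK-capable font has to be configured before the labels render correctly.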
As expected, words related to the two stars appear most often, followed by Sister Alijie (I don't know whether she is a stand-in for the author). Zanzan is written as a female character in the story, so it is no wonder fans are furious. The overall ranking does not reveal much else, so let's narrow the analysis a little.
Looking only at sensitive words, what do the frequencies look like? Because I am too innocent to show them outright, I have covered them with a small mosaic (if you can still guess the words, I recommend watching more Teletubbies):
According to the statistics, the article contains 20367 non-explicit words and 284 pornography-related words. Explicit words therefore make up about 1.4% of all words (284 of 20651), or roughly one in every 73 words, which is quite high. I feel that "Norwegian Wood" falls slightly short of this, while "Paradise Lost" could put up a fight.
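The proportion can be checked with a few lines of arithmetic. The two counts are the ones reported in the article; the variable names are only illustrative.

```python
non_explicit = 20367  # non-explicit word count reported in the article
explicit = 284        # explicit word count reported in the article

total = non_explicit + explicit
rate = explicit / total

print(f"{rate:.2%} of all words are explicit")
print(f"about one in every {total / explicit:.0f} words")
```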
Finally, let's end this part with a word cloud:
2. Sentence pattern analysis
We use an LSTM model to analyze the article line by line and see whether the sentiment of its sentences leans heavily to one side. A sentence is classified as positive when its positive confidence is greater than 0.7, and as negative when its negative confidence is greater than 0.7. All other cases are neutral:
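The thresholding rule described here can be sketched as below. The LSTM model itself is not shown in the article, so this sketch assumes it yields a (positive, negative) confidence pair for each line. The names `label_sentence` and `tally` and the example scores are hypothetical.

```python
from collections import Counter

THRESHOLD = 0.7  # confidence cut-off stated in the article


def label_sentence(pos_conf, neg_conf, threshold=THRESHOLD):
    """Map a (positive, negative) confidence pair to 'pos', 'neg', or 'mid'."""
    if pos_conf > threshold:
        return "pos"
    if neg_conf > threshold:
        return "neg"
    return "mid"  # neither side is confident enough


def tally(scores):
    """Count labels over an iterable of (pos_conf, neg_conf) pairs."""
    return Counter(label_sentence(pos, neg) for pos, neg in scores)


# Made-up confidences for illustration only, not the article's data.
example_scores = [(0.91, 0.09), (0.20, 0.80), (0.55, 0.45), (0.70, 0.30)]
print(dict(tally(example_scores)))
```

The comparison is strict, so a confidence of exactly 0.7 falls into the neutral class. Whether the original code treats that boundary the same way is not stated.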
The results are as follows:
> {'neg': 988, 'pos': 332, 'mid': 471}
Negative sentences account for about 55% (988 of 1791), so negative sentiment dominates the article. This figure only measures the article's emotional tendency and does not mean anything beyond that.
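The share of each class follows directly from the reported counts. A short check, with illustrative variable names:

```python
counts = {"neg": 988, "pos": 332, "mid": 471}  # results reported in the article

total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```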
The next step is to identify how explicit each sentence is. As before, a sentence is assigned to a class only when the predicted probability is greater than 0.7:
The results are as follows:
> {'porn': 280, 'not_porn': 1511}
That concludes this walkthrough of the Python data visualization example. Thank you for reading. If you want to learn more about the industry, you can follow the website, where the editor will publish more practical articles.