Case Analysis of Python Data Visualization

2025-01-16 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article walks through a worked example of text analysis and data visualization in Python. Many people get stuck when they try this on a real case, so the steps below show how to handle the common situations. Read it carefully and you should come away with something useful.

1. Word frequency statistics

We use the jieba word segmentation module and matplotlib in Python to analyze the vocabulary of the whole article and extract the 20 most frequent words. The results are as follows:
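A minimal sketch of this step, assuming the article text is already loaded as a string. The function names (`tokenize`, `top_words`, `plot_top_words`), the minimum word length of 2, and the `SimHei` font are illustrative assumptions, not the author's original code.

```python
from collections import Counter


def tokenize(text):
    """Segment Chinese text into a list of words with jieba."""
    import jieba  # third-party; imported lazily so the counting logic runs without it
    return jieba.lcut(text)


def top_words(tokens, n=20, min_len=2):
    """Return the n most frequent tokens, skipping whitespace and very short tokens."""
    cleaned = [t.strip() for t in tokens]
    counts = Counter(t for t in cleaned if len(t) >= min_len)
    return counts.most_common(n)


def plot_top_words(pairs, font="SimHei"):
    """Draw a horizontal bar chart of (word, count) pairs.

    `font` is an assumption: any installed font with CJK glyphs will do.
    """
    import matplotlib.pyplot as plt  # third-party, imported lazily
    plt.rcParams["font.sans-serif"] = [font]
    words, counts = zip(*pairs)
    plt.barh(words[::-1], counts[::-1])  # most frequent word ends up at the top
    plt.xlabel("Frequency")
    plt.tight_layout()
    plt.show()
```

With the text in a variable `text`, the chart would be produced by `plot_top_words(top_words(tokenize(text)))`. Filtering out single-character tokens is a common way to drop particles and punctuation, but a proper stop-word list may work better.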

As expected, words related to the two stars are the most frequent, followed by Sister Alijie (I don't know whether she is a stand-in for the author). Zanzan is a female character in the story, so it is no wonder fans are furious. This view alone does not reveal much, though, so let's narrow the dimension a little.

What does the word frequency look like if we restrict it to sensitive words? Because I am far too innocent to look at them directly, I have applied a little mosaic (if you can still guess the words, well, watching more Teletubbies is recommended):

According to the statistics, the article contains 20367 non-pornographic words and 284 pornography-related words. Pornographic words therefore make up about 1.4% of all words (284 of 20651), or roughly one in every 73 words, which is quite high. My impression is that "Norwegian Wood" falls slightly short of this, while "Paradise Lost" could compete.
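The rate can be checked with a few lines of arithmetic. The sketch below also shows one plausible way to obtain such counts, by matching tokens against a word list; the `sensitive_rate` helper and the idea of a lexicon are assumptions, since the article does not say how the sensitive words were identified.

```python
def sensitive_rate(tokens, lexicon):
    """Return (hits, total, rate), where hits counts tokens found in the lexicon."""
    total = len(tokens)
    hits = sum(1 for t in tokens if t in lexicon)
    rate = hits / total if total else 0.0
    return hits, total, rate


# Reproduce the arithmetic from the counts reported in the article.
non_sensitive, sensitive = 20367, 284
rate = sensitive / (non_sensitive + sensitive)
print(f"{rate:.2%}")    # prints 1.38%
print(round(1 / rate))  # prints 73, i.e. about one sensitive word in every 73
```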

Finally, let's end this part with a word cloud:
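A word cloud of this kind could be produced with the third-party `wordcloud` package, as in the sketch below. This is an assumption about tooling, since the article does not name the library it used; the helper names, image size, and output path are illustrative, and `font_path` must point to a font file with CJK glyphs on your machine.

```python
from collections import Counter


def build_frequencies(tokens, stopwords=frozenset(), min_len=2):
    """Map each kept token to its count, dropping stop words and short tokens."""
    return dict(Counter(
        t for t in tokens if len(t) >= min_len and t not in stopwords
    ))


def render_wordcloud(freqs, font_path, out_path="wordcloud.png"):
    """Render a word cloud image from a {word: count} mapping and save it."""
    from wordcloud import WordCloud  # third-party package, imported lazily
    cloud = WordCloud(
        font_path=font_path,  # without a CJK font, Chinese words render as boxes
        width=800,
        height=600,
        background_color="white",
    )
    cloud.generate_from_frequencies(freqs)
    cloud.to_file(out_path)
```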

2. Sentence pattern analysis

We use an LSTM model to analyze the article line by line and see whether the sentiment of its sentences leans heavily in one direction. A sentence is classified as positive when its positive confidence is greater than 0.7, as negative when its negative confidence is greater than 0.7, and as neutral in all other cases:
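The thresholding rule can be sketched as follows. The LSTM model itself is not shown in the article, so the sketch takes its per-sentence confidences as given; the function names and the sample values are hypothetical.

```python
from collections import Counter


def label_sentence(pos_conf, neg_conf, threshold=0.7):
    """Map a pair of model confidences to 'pos', 'neg', or 'mid'."""
    if pos_conf > threshold:
        return "pos"
    if neg_conf > threshold:
        return "neg"
    return "mid"


def tally(confidences, threshold=0.7):
    """Count labels over an iterable of (pos_conf, neg_conf) pairs."""
    return dict(Counter(label_sentence(p, n, threshold) for p, n in confidences))


# Hypothetical confidences, standing in for per-line model outputs.
sample = [(0.91, 0.09), (0.20, 0.80), (0.55, 0.45), (0.70, 0.30)]
print(tally(sample))
```

Note that the comparison is strict, so a confidence of exactly 0.7 falls into the neutral class.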

The results are as follows:

> {'neg': 988, 'pos': 332, 'mid': 471}

Negative sentences account for about 55% of the total (988 of 1791), so negative sentiment dominates the article. This only measures the emotional tendency of the text, however, and does not mean anything by itself.
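The 55% figure follows directly from the reported counts, as this short check shows (the variable names are illustrative):

```python
counts = {"neg": 988, "pos": 332, "mid": 471}

total = sum(counts.values())
# Percentage share of each label, rounded to one decimal place.
shares = {label: round(100 * n / total, 1) for label, n in counts.items()}

print(total)   # prints 1791
print(shares)  # prints {'neg': 55.2, 'pos': 18.5, 'mid': 26.3}
```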

The next key step is to identify how pornographic each sentence is. As before, the classification uses a confidence threshold of 0.7:

The results are as follows:

> {'porn': 280, 'not_porn': 1511}
