Shulou (Shulou.com) 06/01 report, updated 2025-01-16
This article shows how to make a word cloud with Python. The content is easy to follow, and I hope it clears up your questions as we work through it together.
[sample code]
# coding=utf-8
# @Software: PyCharm
import numpy as np
import jieba
from PIL import Image
from wordcloud import WordCloud, STOPWORDS
import matplotlib.pyplot as plt


def draw_word_cloud(word):
    # jieba.cut returns a generator; join the tokens with spaces to get one string
    words = jieba.cut(word)
    wordstr = " ".join(words)
    sw = set(STOPWORDS)
    sw.add("ok")
    mask = np.array(Image.open('2.jpg'))
    wc = WordCloud(
        font_path='C:/Windows/Fonts/simhei.ttf',  # font that can render Chinese
        mask=mask,
        max_words=200,
        max_font_size=100,
        stopwords=sw,
        scale=4,
    ).generate(wordstr)
    # display word cloud image
    plt.imshow(wc)
    plt.axis("off")
    plt.show()
    # save word cloud image
    wc.to_file('result.jpg')


if __name__ == "__main__":
    with open("test2.txt", "rb") as f:
        word = f.read()
    draw_word_cloud(word)
[the effect is as follows]
[knowledge points]
1. Before making a word cloud, prepare the following:
(1) The wordcloud library, the key library for generating word clouds. Installing it often failed for me because of network timeouts; retrying a few times eventually worked.
(2) The numpy library, used to turn the image into an array after it has been read.
(3) The jieba library, if you need to segment Chinese text; for English text you do not need it.
(4) matplotlib, if you want to display the word cloud directly on screen.
(5) PIL (installed today as the Pillow package), the standard choice for reading and processing images.
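Before running the sample, it can help to confirm that the five libraries above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_packages` is made up for illustration, and the import names listed are the ones the sample code uses (PIL is provided by the Pillow package).

```python
import importlib.util


def missing_packages(import_names):
    """Return the names from import_names that cannot be found on this interpreter."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Top-level import names used by the sample code.
    required = ["wordcloud", "numpy", "jieba", "matplotlib", "PIL"]
    missing = missing_packages(required)
    if missing:
        print("Not installed yet:", ", ".join(missing))
    else:
        print("All libraries are available.")
```

Anything reported as missing still has to be installed before the sample code will run.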
2. Next, prepare the content to be analyzed and the shape of the word cloud. The txt content in the sample code is my previous article, and the shape image is 2.jpg from the sample code, shown below:
3. With the preparation done, it is time to write the code.
(1) jieba.cut(): this segments the txt content. Note that it returns a generator, so the result has to be joined into a string. You can also use jieba.lcut(), which returns a list.
(2) The STOPWORDS set: stop words are words you want filtered out, such as "good" and "yes". There are two ways to filter them: pass stopwords as a parameter to WordCloud, as the sample code does, which is the easiest; or write your own code to filter them by hand.
(3) Open the image whose shape you want the cloud to take, convert it to an array, and pass it to WordCloud as the mask parameter.
(4) For the meaning of the WordCloud parameters, see other posts such as:
https://blog.csdn.net/kouyi5627/article/details/80530569
The one I want to highlight is regexp, a regular expression that decides which words are kept; for example, \d keeps only digits. I fell into a pit with this parameter, which is covered later in this article.
In addition, images generated with scale=4 are usually about 500 KB, while with the default scale they are only about 10 KB.
(5) To generate the word cloud from the content, generate is the simplest method: pass the string in directly. generate_from_frequencies instead takes a dictionary that maps each word to the number of times it occurs, so you have to count the words yourself.
(6) Displaying the word cloud on screen takes very little code. plt.axis("off") hides the axes, which looks cleaner.
(7) Saving the generated word cloud locally with to_file needs no further explanation.
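Items (1) and (2) can be combined in a small hand-written filter. This is a minimal sketch, assuming the tokens come from any iterable (a generator such as the one jieba.cut returns, or a list from jieba.lcut). The function name `filter_stop_words` and the sample tokens are hypothetical, and a plain generator stands in for jieba so the snippet runs on its own.

```python
def filter_stop_words(tokens, stop_words):
    """Drop stop words and blank tokens, then join the rest with spaces.

    tokens may be a generator; it is consumed exactly once here.
    """
    kept = []
    for token in tokens:
        token = token.strip()
        # Compare case-insensitively so "OK" and "ok" are treated alike.
        if token and token.lower() not in stop_words:
            kept.append(token)
    return " ".join(kept)


if __name__ == "__main__":
    # Stand-in for jieba.cut(text): a generator that yields tokens one by one.
    tokens = (t for t in ["python", "ok", "word", " ", "cloud", "OK"])
    print(filter_stop_words(tokens, {"ok"}))  # python word cloud
```

The returned string plays the same role as wordstr in the sample code and could be handed to generate in the same way.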
That is roughly the flow of the sample code. Quite simple, isn't it? Now try it yourself.
4. Next, the pit I ran into while making a word cloud.
At first I wanted to run a word cloud analysis on the two-color ball lottery numbers from each draw, but it always failed with:
ValueError: We need at least 1 word to plot a word cloud, got 0.
Does that mean the wordstr I passed in was empty? How could that be, when it clearly contained numbers? I finally found the reason in the official description of WordCloud: if the regexp parameter is left empty, single-character words are filtered out by default, which is why my numbers kept disappearing. There are two ways to solve it. The first is to pass in the regexp parameter, such as regexp="\d*". The second is to use the generate_from_frequencies method: because every number comes with its own frequency, none of them is filtered out automatically. In my own use, the second method gave better results.
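Both fixes can be tried out with the standard library alone. The sketch below assumes that the pattern wordcloud documents as its default when regexp is None is `\w[\w']+` (worth checking against the installed version). It uses made-up draw numbers, and `count_numbers` is a hypothetical helper. It shows that this pattern never matches a one-character token, that `\d*` can also produce empty matches (so `\d+` is probably the safer spelling), and how to build the word-to-count dictionary that generate_from_frequencies takes.

```python
import re
from collections import Counter

# Assumed default tokenizer pattern of wordcloud when regexp is None.
DEFAULT_PATTERN = r"\w[\w']+"


def count_numbers(text):
    """Count how often each run of digits occurs in text."""
    return dict(Counter(re.findall(r"\d+", text)))


if __name__ == "__main__":
    draws = "3 7 12 7 3 7"  # made-up numbers, for illustration only

    # One-character tokens such as "3" and "7" are not matched by the pattern.
    print(re.findall(DEFAULT_PATTERN, draws))  # ['12']

    # \d* also matches the empty string between numbers; \d+ does not.
    print("" in re.findall(r"\d*", draws))  # True
    print(re.findall(r"\d+", draws))  # ['3', '7', '12', '7', '3', '7']

    # Word -> count mapping, the shape generate_from_frequencies takes.
    print(count_numbers(draws))  # {'3': 2, '7': 3, '12': 1}
```

With wordcloud installed, the dictionary would be passed along the lines of WordCloud(...).generate_from_frequencies(count_numbers(draws)) in place of the generate call in the sample code.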