2025-03-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
Many beginners are not sure how to use Python to learn more about their WeChat friends. To help with this, the article below walks through the process in detail; hopefully you will get something out of it.
I have been using WeChat for several years and have accumulated a lot of WeChat friends, but how well do you really know your friends? Which city has the most friends? What is the male-to-female ratio? What do their signatures say? Today we take a thorough look at our WeChat friends.
Running platform: Windows
Python version: Python3.6
IDE: Sublime Text
1. Preparation
1.1 Introduction to the library
You can only get information about your WeChat friends after logging in to WeChat. This article uses the third-party library wxpy to log in and obtain that information.
wxpy builds on itchat. It improves the module's ease of use through extensive interface optimization and adds a rich set of extensions.
Some common scenarios for wxpy:
Control routers, smart homes and other things with open interfaces
Automatically send logs to your WeChat account when you run a script
Automatically invite people into a group after they add the group owner as a friend
Forward messages across accounts or groups
Chat with people automatically
Tease people for fun
In short, it can be used to automate all kinds of operations on a personal WeChat account.
1.2 wxpy library installation
wxpy supports Python 3.4-3.6 and Python 2.7.
Replace "pip" with "pip3" or "pip2" in the commands below to make sure wxpy is installed for the corresponding Python version.
Download and install from the official PyPI source (which may be slow or unstable in China):
pip install -U wxpy
Download and install from the Douban PyPI mirror (recommended for users in China):
pip install -U wxpy -i "https://pypi.doubanio.com/simple/"
1.3 Logging in to WeChat
wxpy provides a robot object, Bot, which can be thought of as a Web WeChat client. Bot logs in when it is initialized, which requires scanning a QR code with your phone.
Through the chats(), friends(), groups() and mps() methods of the Bot object, you can get the current robot's lists of all chat objects, friends, group chats and official accounts, respectively.
This article mainly obtains all friend information through friends() and then processes the data.
from wxpy import *

# Initialize the robot and log in by scanning the QR code
bot = Bot()

# Get all friends
my_friends = bot.friends()
print(type(my_friends))
The following is the output message:
Getting uuid of QR code.
Downloading QR code.
Please scan the QR code to log in.
Please press confirm on your phone.
Loading the contact, this may take a little while.
A wxpy.api.chats.chats.Chats object is a collection of chat objects that can be searched or used for statistics. The friend attributes used in this article are sex (gender), province, city and signature (personal signature).
2. Male-to-female ratio of WeChat friends
2.1 Data statistics
Use a dictionary, sex_dict, to count the number of men and women among your friends.
# Use a dictionary to count male and female friends
sex_dict = {'male': 0, 'female': 0}

for friend in my_friends:
    # Count by gender: 1 is male, 2 is female
    if friend.sex == 1:
        sex_dict['male'] += 1
    elif friend.sex == 2:
        sex_dict['female'] += 1

print(sex_dict)
The following is the output:
{'male': 255, 'female': 104}
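The same tally can be tried without logging in. The sketch below is only an illustration: the Friend namedtuple and the sample list are hypothetical stand-ins for the objects returned by bot.friends(), and it assumes, as the loop above does, that sex is 1 for male and 2 for female. It also adds a small helper that turns the counts into percentages.

```python
from collections import Counter, namedtuple

# Hypothetical stand-in for a wxpy friend object; only `sex` matters here.
Friend = namedtuple("Friend", ["name", "sex"])

# Assumed encoding, matching the loop above: 1 = male, 2 = female.
SEX_LABELS = {1: "male", 2: "female"}


def count_by_sex(friends):
    """Return a dict with the number of male and female friends."""
    counts = Counter(SEX_LABELS.get(f.sex, "unknown") for f in friends)
    return {"male": counts["male"], "female": counts["female"]}


def ratio_percent(sex_dict):
    """Return each gender's share of the total, in percent, rounded to 1 decimal."""
    total = sum(sex_dict.values())
    if total == 0:
        return {key: 0.0 for key in sex_dict}
    return {key: round(100 * value / total, 1) for key, value in sex_dict.items()}


sample_friends = [Friend("a", 1), Friend("b", 2), Friend("c", 1), Friend("d", 0)]
sample_counts = count_by_sex(sample_friends)
print(sample_counts)
print(ratio_percent(sample_counts))
```

Friends whose gender is not set fall into an "unknown" bucket and are left out of the result, which mirrors the if/elif in the original loop.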
2.2 Data presentation
This article uses an ECharts pie chart to present the data. Open the link http://echarts.baidu.com/echarts2/doc/example/pie1.html, and you will see the following:
Figure 1. Original content of the ECharts pie chart example
As the figure shows, the data is on the left and the rendered chart is on the right; other chart types use the same left-right layout. Take a look at the data on the left:
option = {
    title: {
        text: 'user access source of a site',
        subtext: 'purely fictional',
        x: 'center'
    },
    tooltip: {
        trigger: 'item',
        formatter: "{a} {b}: {c} ({d}%)"
    },
    legend: {
        orient: 'vertical',
        x: 'left',
        data: ['direct access', 'email marketing', 'affiliate advertising', 'video advertising', 'search engine']
    },
    toolbox: {
        show: true,
        feature: {
            mark: {show: true},
            dataView: {show: true, readOnly: false},
            magicType: {
                show: true,
                type: ['pie', 'funnel'],
                option: {
                    funnel: {
                        x: '25%',
                        width: '50%',
                        funnelAlign: 'left',
                        max: 1548
                    }
                }
            },
            restore: {show: true},
            saveAsImage: {show: true}
        }
    },
    calculable: true,
    series: [
        {
            name: 'access source',
            type: 'pie',
            radius: '55%',
            center: ['50%', '60%'],
            data: [
                {value: 335, name: 'direct access'},
                {value: 310, name: 'email marketing'},
                {value: 234, name: 'affiliate advertising'},
                {value: 135, name: 'video advertising'},
                {value: 1548, name: 'search engine'}
            ]
        }
    ]
};
The curly braces after option = hold the chart configuration in a JSON-like format. Next, let's go through its keys:
title: title
  text: title content
  subtext: subtitle
  x: title location
tooltip: prompt. Hover the mouse over the pie chart and you will see the prompt.
legend: legend
  orient: direction
  x: legend location
  data: legend content
toolbox: toolbox, the icons arranged horizontally at the top right of the pie chart
  mark: auxiliary line switch
  dataView: data view. Click to view the pie chart data.
  magicType: toggle between pie and funnel
  restore: restore
  saveAsImage: save as picture
calculable: I am not yet sure what this does.
series: main data
  data: the data to render
The data for other chart types has a similar format and will not be analyzed in detail later. You only need to modify the title, legend -> data and series -> data. The modified data is:
option = {
    title: {
        text: 'sex ratio of WeChat friends',
        subtext: 'real data',
        x: 'center'
    },
    tooltip: {
        trigger: 'item',
        formatter: "{a} {b}: {c} ({d}%)"
    },
    legend: {
        orient: 'vertical',
        x: 'left',
        data: ['male', 'female']
    },
    toolbox: {
        show: true,
        feature: {
            mark: {show: true},
            dataView: {show: true, readOnly: false},
            magicType: {
                show: true,
                type: ['pie', 'funnel'],
                option: {
                    funnel: {
                        x: '25%',
                        width: '50%',
                        funnelAlign: 'left',
                        max: 1548
                    }
                }
            },
            restore: {show: true},
            saveAsImage: {show: true}
        }
    },
    calculable: true,
    series: [
        {
            name: 'access source',
            type: 'pie',
            radius: '55%',
            center: ['50%', '60%'],
            data: [
                {value: 255, name: 'male'},
                {value: 104, name: 'female'}
            ]
        }
    ]
};
After modifying the data, click the green refresh button on the page to get the pie chart below (you can change the theme to suit your preferences):
Figure 2. Sex ratio of friends
Hover your mouse over the pie chart to see the detailed data:
Figure 3. Viewing the sex ratio data
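Rather than typing the numbers into the ECharts editor by hand, the legend and series entries can be generated from sex_dict. This is a minimal sketch, not part of the original workflow; the helper name pie_option_parts is made up for illustration, and the output is meant to be pasted into legend -> data and series -> data.

```python
import json


def pie_option_parts(sex_dict):
    """Build the legend list and the series data list for a pie chart."""
    legend_data = list(sex_dict.keys())
    series_data = [{"value": value, "name": name} for name, value in sex_dict.items()]
    return legend_data, series_data


legend_data, series_data = pie_option_parts({"male": 255, "female": 104})

# ensure_ascii=False keeps non-ASCII labels readable when pasted into the editor.
print(json.dumps(legend_data, ensure_ascii=False))
print(json.dumps(series_data, ensure_ascii=False))
```

Because the legend names and the series names come from the same dictionary keys, they always match, which ECharts needs in order to link a legend entry to its slice.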
3. National distribution map of WeChat friends
3.1 Data statistics
# Use a dictionary to count the number of friends in each province
province_dict = {'Beijing': 0, 'Shanghai': 0, 'Tianjin': 0, 'Chongqing': 0,
    'Hebei': 0, 'Shanxi': 0, 'Jilin': 0, 'Liaoning': 0, 'Heilongjiang': 0,
    'Shaanxi': 0, 'Gansu': 0, 'Qinghai': 0, 'Shandong': 0, 'Fujian': 0,
    'Zhejiang': 0, 'Taiwan': 0, 'Henan': 0, 'Hubei': 0, 'Hunan': 0,
    'Jiangxi': 0, 'Jiangsu': 0, 'Anhui': 0, 'Guangdong': 0, 'Hainan': 0,
    'Sichuan': 0, 'Guizhou': 0, 'Yunnan': 0, 'Inner Mongolia': 0,
    'Xinjiang': 0, 'Ningxia': 0, 'Guangxi': 0, 'Xizang': 0,
    'Hong Kong': 0, 'Macao': 0}

# Count friends by province
for friend in my_friends:
    if friend.province in province_dict.keys():
        province_dict[friend.province] += 1

# To make the data easier to present, generate it in JSON array format
data = []
for key, value in province_dict.items():
    data.append({'name': key, 'value': value})

print(data)
The following is the output:
[{'name': 'Beijing', 'value': 91}, {'name': 'Shanghai', 'value': 12}, {'name': 'Tianjin', 'value': 15}, {'name': 'Chongqing', 'value': 1}, {'name': 'Hebei', 'value': 53}, {'name': 'Shanxi', 'value': 2}, {'name': 'Jilin', 'value': 1}, {'name': 'Liaoning', 'value': 1}, {'name': 'Heilongjiang', 'value': 2}, {'name': 'Shaanxi', 'value': 3}, {'name': 'Gansu', 'value': 0}, {'name': 'Qinghai', 'value': 0}, {'name': 'Shandong', 'value': 7}, {'name': 'Fujian', 'value': 3}, {'name': 'Zhejiang', 'value': 4}, {'name': 'Taiwan', 'value': 0}, {'name': 'Henan', 'value': 1}, {'name': 'Hubei', 'value': 4}, {'name': 'Hunan', 'value': 4}, {'name': 'Jiangxi', 'value': 4}, {'name': 'Jiangsu', 'value': 9}, {'name': 'Anhui', 'value': 2}, {'name': 'Guangdong', 'value': 63}, {'name': 'Hainan', 'value': 0}, {'name': 'Sichuan', 'value': 2}, {'name': 'Guizhou', 'value': 0}, {'name': 'Yunnan', 'value': 1}, {'name': 'Inner Mongolia', 'value': 0}, {'name': 'Xinjiang', 'value': 2}, {'name': 'Ningxia', 'value': 0}, {'name': 'Guangxi', 'value': 1}, {'name': 'Xizang', 'value': 0}, {'name': 'Hong Kong', 'value': 0}, {'name': 'Macao', 'value': 0}]
The province with the most friends is Beijing. So why reorganize the data into this format? Because the ECharts map expects its data as a list of name/value objects like this.
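The counting and conversion steps can be tried on their own with made-up data. In this sketch the Friend namedtuple, the sample friends and the helper names are all hypothetical; only the shape of the result (a list of name/value dictionaries) is taken from the output above.

```python
from collections import namedtuple

# Hypothetical stand-in for a wxpy friend object; only `province` is modeled.
Friend = namedtuple("Friend", ["name", "province"])


def count_by_province(friends, provinces):
    """Count friends per province, ignoring provinces not in the given list."""
    counts = dict.fromkeys(provinces, 0)
    for friend in friends:
        if friend.province in counts:
            counts[friend.province] += 1
    return counts


def to_echarts_data(counts):
    """Convert {name: value} into a list of {'name': ..., 'value': ...} dicts."""
    return [{"name": name, "value": value} for name, value in counts.items()]


sample = [
    Friend("a", "Beijing"),
    Friend("b", "Hebei"),
    Friend("c", "Beijing"),
    Friend("d", ""),  # friend with no province set
]
data = to_echarts_data(count_by_province(sample, ["Beijing", "Hebei", "Guangdong"]))
print(data)
```

Starting every province at zero matters for the map: provinces with no friends still appear in the data, so they are rendered in the lowest color of the range rather than left without a value.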
3.2 Data presentation
An ECharts map is used to present the friend distribution data. Open the URL and change the data on the left to:
option = {
    title: {
        text: 'national distribution map of WeChat friends',
        subtext: 'real data'
    },
    tooltip: {
        trigger: 'item'
    },
    legend: {
        orient: 'vertical',
        x: 'left',
        data: ['number of friends']
    },
    dataRange: {
        min: 0,
        max: 100,
        x: 'left',
        y: 'bottom',
        text: ['high', 'low'],  // text; defaults to numeric text
        calculable: true
    },
    toolbox: {
        show: true,
        orient: 'vertical',
        x: 'right',
        y: 'center',
        feature: {
            mark: {show: true},
            dataView: {show: true, readOnly: false},
            restore: {show: true},
            saveAsImage: {show: true}
        }
    },
    roamController: {
        show: true,
        x: 'right',
        mapTypeControl: {
            'china': true
        }
    },
    series: [
        {
            name: 'number of friends',
            type: 'map',
            mapType: 'china',
            roam: false,
            itemStyle: {
                normal: {label: {show: true}},
                emphasis: {label: {show: true}}
            },
            data: [
                {'name': 'Beijing', 'value': 91}, {'name': 'Shanghai', 'value': 12},
                {'name': 'Tianjin', 'value': 15}, {'name': 'Chongqing', 'value': 1},
                {'name': 'Hebei', 'value': 53}, {'name': 'Shanxi', 'value': 2},
                {'name': 'Jilin', 'value': 1}, {'name': 'Liaoning', 'value': 1},
                {'name': 'Heilongjiang', 'value': 2}, {'name': 'Shaanxi', 'value': 3},
                {'name': 'Gansu', 'value': 0}, {'name': 'Qinghai', 'value': 0},
                {'name': 'Shandong', 'value': 7}, {'name': 'Fujian', 'value': 3},
                {'name': 'Zhejiang', 'value': 4}, {'name': 'Taiwan', 'value': 0},
                {'name': 'Henan', 'value': 1}, {'name': 'Hubei', 'value': 4},
                {'name': 'Hunan', 'value': 4}, {'name': 'Jiangxi', 'value': 4},
                {'name': 'Jiangsu', 'value': 9}, {'name': 'Anhui', 'value': 2},
                {'name': 'Guangdong', 'value': 63}, {'name': 'Hainan', 'value': 0},
                {'name': 'Sichuan', 'value': 2}, {'name': 'Guizhou', 'value': 0},
                {'name': 'Yunnan', 'value': 1}, {'name': 'Inner Mongolia', 'value': 0},
                {'name': 'Xinjiang', 'value': 2}, {'name': 'Ningxia', 'value': 0},
                {'name': 'Guangxi', 'value': 1}, {'name': 'Xizang', 'value': 0},
                {'name': 'Hong Kong', 'value': 0}, {'name': 'Macao', 'value': 0}
            ]
        }
    ]
};
Pay attention to two points:
dataRange -> max should be adjusted to fit your own statistics
series -> data must follow the name/value format shown above
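One way to pick dataRange -> max is to round the largest count up to a convenient step. The helper below is only a suggestion under that assumption; the function name and the default step of 10 are made up for illustration and are not something ECharts requires.

```python
import math


def suggest_data_range_max(data, step=10):
    """Round the largest 'value' up to the next multiple of `step`.

    Returns at least one full step so the range is never zero-width.
    """
    largest = max((item["value"] for item in data), default=0)
    return max(step, math.ceil(largest / step) * step)


sample_data = [
    {"name": "Beijing", "value": 91},
    {"name": "Hebei", "value": 53},
    {"name": "Guangdong", "value": 63},
    {"name": "Gansu", "value": 0},
]
print(suggest_data_range_max(sample_data))
```

With the statistics in this article the largest count is 91, so the sketch suggests 100, which is the value used in the option above.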
After clicking the Refresh button, the following map is generated:
Figure 4. National distribution map of friends
The map shows that my friends are mainly distributed in Beijing, Hebei and Guangdong.
Interestingly, there is a slider on the left side of the map that represents the range of the map data. Dragging the upper slider all the way to the bottom shows the provinces that have no WeChat friends:
Figure 5. Provinces without WeChat friends
Following the same idea, you can use the sliders to see which provinces have any given number of friends. Readers can give it a try.
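The same question the slider answers visually can be answered in code. This sketch assumes the data list has the name/value shape produced in section 3.1; the function name is hypothetical and the sample values are a subset of the statistics above.

```python
def split_by_presence(data):
    """Split map data into ranked provinces with friends and names without any."""
    with_friends = sorted(
        (item for item in data if item["value"] > 0),
        key=lambda item: item["value"],
        reverse=True,
    )
    without_friends = [item["name"] for item in data if item["value"] == 0]
    return with_friends, without_friends


sample_data = [
    {"name": "Beijing", "value": 91},
    {"name": "Gansu", "value": 0},
    {"name": "Hebei", "value": 53},
    {"name": "Guangdong", "value": 63},
    {"name": "Taiwan", "value": 0},
]
ranked, empty = split_by_presence(sample_data)
print([item["name"] for item in ranked])
print(empty)
```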
4. Signature statistics of friends
4.1 Data statistics
import re

def write_txt_file(path, txt):
    '''Append text to a txt file'''
    with open(path, 'a', encoding='gb18030', newline='') as f:
        f.write(txt)

# Collect the signatures
# Clean the data: keep only runs of Chinese characters, so that punctuation
# and other noise do not distort the word frequency statistics
pattern = re.compile(r'[\u4e00-\u9fa5]+')
for friend in my_friends:
    filterdata = re.findall(pattern, friend.signature)
    write_txt_file('signatures.txt', ''.join(filterdata))
The code above cleans and saves each friend's signature. After it runs, a signatures.txt file is generated in the current directory.
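To see what the cleaning step keeps and what it drops, here is a small standalone sketch. It assumes the pattern is meant to match runs of Chinese characters, written here as the range \u4e00-\u9fa5; the helper name and the sample signature are made up.

```python
import re

# Runs of characters in the range U+4E00..U+9FA5 (assumed definition of
# "Chinese characters" for this cleaning step).
CHINESE_RUN = re.compile(r"[\u4e00-\u9fa5]+")


def clean_signature(signature):
    """Keep only the Chinese characters of a signature, joined together."""
    return "".join(CHINESE_RUN.findall(signature))


print(clean_signature("努力，奋斗! Keep going 2018"))
```

Note that this also removes Latin letters, digits and emoji, so signatures written entirely in English become empty strings and do not contribute to the word cloud.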
4.2 Data presentation
The data is presented through word frequency statistics and a word cloud, which give us a sense of our WeChat friends' attitude to life.
The jieba, numpy, pandas, scipy and wordcloud libraries are used for this. If they are not installed on your computer, run the following install commands:
pip install jieba
pip install pandas
pip install numpy
pip install scipy
pip install wordcloud
4.2.1 Reading the txt file
The friends' signatures have been saved in a txt file; now let's read it back:
def read_txt_file(path):
    '''Read text from a txt file'''
    with open(path, 'r', encoding='gb18030', newline='') as f:
        return f.read()
4.2.2 Stop words
First, a concept: stop words. Text contains a large number of very common words, such as "in", "inside", "also", "of", "it" and "Wei". These words are used so frequently that they appear on almost every web page, so search engine developers ignore all of them. If our text contains many such words, processing them is a waste of resources, so we filter them out.
Search Baidu for stopwords.txt, download it, and put it in the same directory as the .py file.
import jieba
import pandas as pd

txt_filename = 'signatures.txt'
content = read_txt_file(txt_filename)
# Split the text into words
segment = jieba.lcut(content)
words_df = pd.DataFrame({'segment': segment})

# Load the stop words (one per line) and drop them from the word list
stopwords = pd.read_csv("stopwords.txt", index_col=False, quoting=3, sep="\t", names=['stopword'], encoding='utf-8')
words_df = words_df[~words_df.segment.isin(stopwords.stopword)]
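The filtering idea itself does not depend on pandas. The sketch below shows it with plain Python sets; the function names, the token list and the stop word lines are all made-up examples, and it assumes the stop word file has one word per line.

```python
def load_stopwords(lines):
    """Build a set of stop words from an iterable of lines, one word per line."""
    return {line.strip() for line in lines if line.strip()}


def remove_stopwords(tokens, stopwords):
    """Drop stop words and whitespace-only tokens, keeping the original order."""
    return [token for token in tokens if token.strip() and token not in stopwords]


# In practice `lines` would come from open("stopwords.txt", encoding="utf-8").
stopwords = load_stopwords(["的\n", "也\n", "很\n", "\n"])
tokens = ["我", "的", "生活", "也", "很", "快乐", " "]
filtered = remove_stopwords(tokens, stopwords)
print(filtered)
```

Using a set makes each membership check constant time on average, which matters little here but helps when the signature text and the stop word list are both large.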
4.2.3 Word frequency statistics
Now for the main event: word frequency statistics using numpy.
import numpy

# Count how many times each word occurs, then sort from most to least frequent
words_stat = words_df.groupby(by=['segment'])['segment'].agg({"count": numpy.size})
words_stat = words_stat.reset_index().sort_values(by=["count"], ascending=False)
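If the agg call above raises an error on a newer pandas version (the dictionary form that renames the result column is no longer accepted by recent releases), the same counts can be obtained with the standard library. This is a hedged alternative rather than the author's method; the function name and the sample tokens are hypothetical.

```python
from collections import Counter


def word_frequencies(tokens, top_n=None):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common(top_n)


tokens = ["生活", "快乐", "生活", "选择", "生活", "快乐"]
top_two = word_frequencies(tokens, top_n=2)
print(top_two)

# A plain {word: count} dict is the shape a word cloud's
# generate_from_frequencies function expects.
frequency_dict = dict(word_frequencies(tokens))
print(frequency_dict)
```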
4.2.4 Word frequency visualization: word cloud
The word frequency statistics give us a ranking, but a ranking is not very intuitive, so next we visualize it. We use the wordcloud library; see its GitHub page for more information.
import matplotlib.pyplot as plt
from scipy.misc import imread
from wordcloud import WordCloud, ImageColorGenerator

# Set the word cloud attributes
color_mask = imread('background.jfif')
wordcloud = WordCloud(font_path="simhei.ttf",    # font that can display Chinese
                      background_color="white",  # background color
                      max_words=100,             # maximum number of words shown
                      mask=color_mask,           # background image
                      max_font_size=100,         # maximum font size
                      random_state=42,
                      width=1000, height=860, margin=2,
                      # width and height set the default picture size, but if a
                      # background image is used, the saved picture takes the
                      # size of that image; margin is the spacing around words
                      )

# Generate the word cloud. You can pass all the text to generate, or compute
# the word frequencies first and use generate_from_frequencies
word_frequence = {x[0]: x[1] for x in words_stat.head(100).values}
print(word_frequence)
word_frequence_dict = {}
for key in word_frequence:
    word_frequence_dict[key] = word_frequence[key]

wordcloud.generate_from_frequencies(word_frequence_dict)
# Generate color values from the background image
image_colors = ImageColorGenerator(color_mask)
# Recolor the word cloud
wordcloud.recolor(color_func=image_colors)
# Save the picture
wordcloud.to_file('output.png')
plt.imshow(wordcloud)
plt.axis("off")
plt.show()
The result looks like this (background image on the left, word cloud on the right):
Figure 6. Comparison between the background image and the word cloud
From the word cloud we can read off some characteristics of my friends:
Do: action-oriented
Life: love of life
Happy: optimistic
Choice: decisive
Professional: professional
Love: loving
5. Summary
This completes the analysis of WeChat friends. wxpy has many other features, such as chatting and viewing official account information; interested readers can consult the official documentation.
6. Complete code
The code above is rather scattered. In the complete code below, each functional module is wrapped in a function:
# -*- coding: utf-8 -*-
import re
from wxpy import *
import jieba
import numpy
import pandas as pd
import matplotlib.pyplot as plt
from scipy.misc import imread
from wordcloud import WordCloud, ImageColorGenerator


def write_txt_file(path, txt):
    '''Append text to a txt file'''
    with open(path, 'a', encoding='gb18030', newline='') as f:
        f.write(txt)


def read_txt_file(path):
    '''Read text from a txt file'''
    with open(path, 'r', encoding='gb18030', newline='') as f:
        return f.read()


def login():
    # Initialize the robot and log in by scanning the QR code
    bot = Bot()
    # Get all friends
    my_friends = bot.friends()
    print(type(my_friends))
    return my_friends


def show_sex_ratio(friends):
    # Use a dictionary to count male and female friends
    sex_dict = {'male': 0, 'female': 0}
    for friend in friends:
        # Count by gender: 1 is male, 2 is female
        if friend.sex == 1:
            sex_dict['male'] += 1
        elif friend.sex == 2:
            sex_dict['female'] += 1
    print(sex_dict)


def show_area_distribution(friends):
    # Use a dictionary to count the number of friends in each province
    province_dict = {'Beijing': 0, 'Shanghai': 0, 'Tianjin': 0, 'Chongqing': 0,
        'Hebei': 0, 'Shanxi': 0, 'Jilin': 0, 'Liaoning': 0, 'Heilongjiang': 0,
        'Shaanxi': 0, 'Gansu': 0, 'Qinghai': 0, 'Shandong': 0, 'Fujian': 0,
        'Zhejiang': 0, 'Taiwan': 0, 'Henan': 0, 'Hubei': 0, 'Hunan': 0,
        'Jiangxi': 0, 'Jiangsu': 0, 'Anhui': 0, 'Guangdong': 0, 'Hainan': 0,
        'Sichuan': 0, 'Guizhou': 0, 'Yunnan': 0, 'Inner Mongolia': 0,
        'Xinjiang': 0, 'Ningxia': 0, 'Guangxi': 0, 'Xizang': 0,
        'Hong Kong': 0, 'Macao': 0}

    # Count friends by province
    for friend in friends:
        if friend.province in province_dict.keys():
            province_dict[friend.province] += 1

    # To make the data easier to present, generate it in JSON array format
    data = []
    for key, value in province_dict.items():
        data.append({'name': key, 'value': value})
    print(data)


def show_signature(friends):
    # Collect the signatures
    # Clean the data: keep only runs of Chinese characters, so that punctuation
    # and other noise do not distort the word frequency statistics
    pattern = re.compile(r'[\u4e00-\u9fa5]+')
    for friend in friends:
        filterdata = re.findall(pattern, friend.signature)
        write_txt_file('signatures.txt', ''.join(filterdata))

    # Read the file and split the text into words
    content = read_txt_file('signatures.txt')
    segment = jieba.lcut(content)
    words_df = pd.DataFrame({'segment': segment})

    # Read the stop words and drop them from the word list
    stopwords = pd.read_csv("stopwords.txt", index_col=False, quoting=3, sep="\t", names=['stopword'], encoding='utf-8')
    words_df = words_df[~words_df.segment.isin(stopwords.stopword)]
    print(words_df)

    # Count word frequencies and sort from most to least frequent
    words_stat = words_df.groupby(by=['segment'])['segment'].agg({"count": numpy.size})
    words_stat = words_stat.reset_index().sort_values(by=["count"], ascending=False)

    # Set the word cloud attributes
    color_mask = imread('background.jfif')
    wordcloud = WordCloud(font_path="simhei.ttf",    # font that can display Chinese
                          background_color="white",  # background color
                          max_words=100,             # maximum number of words shown
                          mask=color_mask,           # background image
                          max_font_size=100,         # maximum font size
                          random_state=42,
                          width=1000, height=860, margin=2,
                          # width and height set the default picture size, but if
                          # a background image is used, the saved picture takes
                          # the size of that image; margin is the word spacing
                          )

    # Generate the word cloud. You can pass all the text to generate, or compute
    # the word frequencies first and use generate_from_frequencies
    word_frequence = {x[0]: x[1] for x in words_stat.head(100).values}
    print(word_frequence)
    word_frequence_dict = {}
    for key in word_frequence:
        word_frequence_dict[key] = word_frequence[key]

    wordcloud.generate_from_frequencies(word_frequence_dict)
    # Generate color values from the background image
    image_colors = ImageColorGenerator(color_mask)
    # Recolor the word cloud
    wordcloud.recolor(color_func=image_colors)
    # Save the picture
    wordcloud.to_file('output.png')
    plt.imshow(wordcloud)
    plt.axis("off")
    plt.show()


def main():
    friends = login()
    show_sex_ratio(friends)
    show_area_distribution(friends)
    show_signature(friends)


if __name__ == '__main__':
    main()