How to use Python to send love

2025-03-27 Update From: SLTechnology News&Howtos

Today I would like to share how to use Python to send love by turning Weibo posts into a word cloud. The content is detailed and the logic is clear. I believe most people are not yet very familiar with this topic, so I am sharing this article for reference. I hope you gain something from reading it. Let's go through it together.

Preparation

Once I had the idea, I started to act. Naturally, the first tool I thought of was Python. The general plan: crawl the Weibo data, clean it, segment it into words, hand the segmented text to a word cloud tool, and render the image with scientific computing and plotting tools. The toolkits involved are:

requests, for the network requests that crawl the Weibo data; jieba ("stutter"), for Chinese word segmentation; wordcloud, the word cloud library; Pillow, the image processing library; NumPy, for scientific computing; and Matplotlib, a MATLAB-like 2D plotting library.

Installing the tools

Installing these toolkits may raise different errors on different platforms. wordcloud, requests, and jieba can be installed online with plain pip:

pip install wordcloud

pip install requests

pip install jieba

Installing Pillow, NumPy, and Matplotlib directly with pip tends to cause various problems on Windows. One recommended approach is to download the matching .whl files from a third-party site called Python Extension Packages for Windows. Pick the files that fit your environment: cp27 means Python 2.7 and amd64 means a 64-bit system. Download them locally and then install them:

pip install Pillow-4.0.0-cp27-cp27m-win_amd64.whl

pip install scipy-0.18.0-cp27-cp27m-win_amd64.whl

pip install numpy-1.11.3+mkl-cp27-cp27m-win_amd64.whl

pip install matplotlib-1.5.3-cp27-cp27m-win_amd64.whl
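To pick the right file, it can help to ask the interpreter itself which tags apply. The following is only a minimal sketch: the helper name `wheel_tags` is made up for illustration, and it assumes CPython (hence the `cp` prefix).

```python
import struct
import sys


def wheel_tags():
    """Return (python_tag, bits) for the running interpreter, e.g. ('cp27', 64)."""
    # "cp" + major + minor version, assuming the interpreter is CPython
    python_tag = "cp%d%d" % (sys.version_info[0], sys.version_info[1])
    # Pointer size in bits tells whether the interpreter is 32-bit or 64-bit
    bits = struct.calcsize("P") * 8
    return python_tag, bits


if __name__ == "__main__":
    tag, bits = wheel_tags()
    print("Python tag: %s, interpreter: %d-bit" % (tag, bits))
```

The architecture part of the file name follows the interpreter rather than the operating system: a 32-bit Python on 64-bit Windows still needs the win32 files.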

On other platforms, search Google for the error message to resolve any problems. Alternatively, start from Anaconda, a Python distribution that bundles a large number of scientific computing and machine learning modules.

Getting the data

Sina Weibo's official API is poor: it only returns a user's 5 most recent posts. The next best option is to grab the data with a crawler. Before crawling, I assessed the difficulty and checked whether someone had already written one. I looked around GitHub, and basically nothing met my needs, but it did give me some ideas, so I decided to write my own crawler. It crawls the mobile site, http://m.weibo.cn/. The interface http://m.weibo.cn/index/my?format=cards&page=1 returns Weibo data page by page in JSON format, which saves a lot of trouble, but it requires the cookies of a logged-in session. Log in to your account and you can find the cookies with the Chrome browser.
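The cookies copied from the browser usually arrive as one long `name=value; name=value` string, while the `cookies=` argument of `requests.get` accepts a dict. A small sketch of the conversion follows. The helper name `parse_cookie_header` and the cookie names in the example are placeholders, not actual Weibo values.

```python
def parse_cookie_header(header):
    """Turn 'a=1; b=2' (as copied from the browser) into {'a': '1', 'b': '2'}."""
    cookies = {}
    for pair in header.split(";"):
        # Split on the first "=" only, since values may contain "=" themselves
        name, sep, value = pair.strip().partition("=")
        if sep and name:  # skip empty or malformed fragments
            cookies[name] = value
    return cookies


# Placeholder string for illustration; paste your own Cookie header here.
cookies = parse_cookie_header("name1=value1; name2=value2")
```

The resulting dict can then be passed as the `cookies` argument of each request.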

Implementation code:

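import requests

# cookies: dict holding the logged-in session's cookies (copied from Chrome)
# cleanring: the text-cleaning helper described below (not shown in this article)

def fetch_weibo():
    api = "http://m.weibo.cn/index/my?format=cards&page=%s"
    # Request pages 1 through 101
    for i in range(1, 102):
        response = requests.get(url=api % i, cookies=cookies)
        data = response.json()[0]
        groups = data.get("card_group") or []
        for group in groups:
            text = group.get("mblog").get("text")
            text = text.encode("utf-8")  # Python 2: unicode -> UTF-8 str
            text = cleanring(text).strip()
            yield text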

The posts span 101 pages in total. Returning a single list object would use too much memory, so the function uses yield to return a generator. It also cleans the text, for example by removing punctuation, HTML tags, and words such as "retweet".
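The cleaning helper itself is not shown in the article. As a rough sketch of what such a step might look like, assuming Python 3 and regular expressions: the function name `clean_text` and the list of noise words are assumptions made for illustration, not the author's implementation.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")       # HTML tags such as <a href="...">
NON_WORD_RE = re.compile(r"[^\w\s]")  # punctuation and symbols, ASCII or CJK
# Assumed "retweet" boilerplate; list longer phrases first and extend as needed
NOISE_WORDS = ("转发微博", "转发")


def clean_text(text):
    """Strip HTML tags, retweet boilerplate and punctuation from one post."""
    text = TAG_RE.sub(" ", text)
    for word in NOISE_WORDS:
        text = text.replace(word, " ")
    text = NON_WORD_RE.sub(" ", text)
    return " ".join(text.split())  # collapse runs of whitespace


if __name__ == "__main__":
    print(clean_text('转发微博 <a href="x">hello</a>, world!'))
```

Replacing matches with a space rather than an empty string keeps neighbouring words from being glued together before segmentation.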

Saving the data

Once the data has been fetched, save it offline so that it can be reused next time without crawling again. It is saved in CSV format to the file weibo.csv for the next step. The CSV file may look garbled when opened in some programs. That does not matter: it displays correctly in Notepad++.
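One likely cause of the garbled display is that some programs, spreadsheet applications in particular, do not assume UTF-8 unless the file starts with a byte order mark. As a hedged sketch, assuming Python 3 (the article's own code targets Python 2.7), the hypothetical helpers `save_posts` and `load_posts` below use the `utf-8-sig` codec, which writes that mark and strips it again on reading.

```python
import csv


def save_posts(texts, path="weibo.csv"):
    """Write one post per row; utf-8-sig prepends a BOM to the file."""
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["text"])
        writer.writeheader()
        for text in texts:
            writer.writerow({"text": text})


def load_posts(path="weibo.csv"):
    """Yield the text column row by row; utf-8-sig drops the BOM on read."""
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        for row in csv.DictReader(f):
            yield row["text"]
```

Passing `newline=""` follows the csv module's recommendation and avoids blank rows on Windows.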

import codecs
import csv

def write_csv(texts):
    with codecs.open('weibo.csv', 'w') as f:
        writer = csv.DictWriter(f, fieldnames=["text"])
        writer.writeheader()
        for text in texts:
            writer.writerow({"text": text})

def read_csv():
    with codecs.open('weibo.csv', 'r') as f:
        reader = csv.DictReader(f)
        for row in reader:
            yield row['text']

Word segmentation

Each post read from weibo.csv is segmented and then handed to wordcloud to generate the word cloud. jieba is suitable for most Chinese usage scenarios. A stop-word file, stopwords.txt, filters out uninformative words (such as "then" and "because").

import jieba.analyse

def word_segment(texts):
    jieba.analyse.set_stop_words("stopwords.txt")
    for text in texts:
        # Keep the 20 highest-weighted keywords of each post
        tags = jieba.analyse.extract_tags(text, topK=20)
        yield " ".join(tags)

Generating the image

Once the data has been segmented, pass it to wordcloud, which weights each keyword by how often it appears in the data and scales its font size accordingly. This first produces a square image (figure omitted).
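The frequency-to-size idea can be illustrated without any plotting library. The sketch below only demonstrates the principle, assuming Python 3. The function `font_sizes` and its size range are invented for this example, and wordcloud's actual scaling and layout are more involved.

```python
from collections import Counter


def font_sizes(text, min_size=10, max_size=80):
    """Map each space-separated word to a font size proportional to its count."""
    counts = Counter(text.split())
    if not counts:
        return {}
    top = max(counts.values())  # the most frequent word gets max_size
    return {
        word: min_size + (max_size - min_size) * count / top
        for word, count in counts.items()
    }


if __name__ == "__main__":
    for word, size in font_sizes("love love love love python python you").items():
        print(word, size)
```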

True, the generated picture has little aesthetic appeal. It is a gift for someone else, after all, so it should look good enough to show off, right? So we find an artistic picture to use as a template and shape the word cloud after it. I found a heart image on the Internet (image omitted).

Image generation code:

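import matplotlib.pyplot as plt
from scipy.misc import imread  # available in the SciPy 0.18 installed above; removed in SciPy 1.2
from wordcloud import WordCloud

def generate_img(texts):
    data = " ".join(text for text in texts)
    # Load the heart template as a grayscale array to use as the mask
    mask_img = imread('./heart-mask.jpg', flatten=True)
    wordcloud = WordCloud(
        font_path='msyh.ttc',  # Microsoft YaHei, so Chinese characters render
        background_color='white',
        mask=mask_img
    ).generate(data)
    plt.imshow(wordcloud)
    plt.axis('off')
    plt.savefig('./heart.jpg', dpi=600)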
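One practical detail about the mask: wordcloud treats pure white (255) pixels as masked out and draws on everything else, and a JPEG template may contain near-white pixels that are not exactly 255. If words spill outside the heart, thresholding the mask first may help. The sketch below shows the idea on plain nested lists, assuming Python 3. The function `binarize_mask` and the threshold of 200 are assumptions for illustration.

```python
def binarize_mask(gray_rows, threshold=200):
    """Return a copy where near-white pixels become 255 (masked out), others 0."""
    return [
        [255 if pixel > threshold else 0 for pixel in row]
        for row in gray_rows
    ]


if __name__ == "__main__":
    # A tiny 2x3 "image": bright background values and dark shape values
    print(binarize_mask([[255, 250, 30], [199, 200, 201]]))
```

On a NumPy array such as the one returned by imread, the same step can be written as a single `numpy.where(mask > 200, 255, 0)` call.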

Note that you need to specify a Chinese font for matplotlib during processing, otherwise the characters are displayed garbled. Find the font in the system font folder, C:\Windows\Fonts\Microsoft YaHei UI, and copy it to the matplotlib installation directory: C:\Python27\Lib\site-packages\matplotlib\mpl-data\fonts\ttf

That is about it.
