
How to Use Python for Sentiment Analysis of Financial Market Text Data

Updated 2025-01-16. From: SLTechnology News&Howtos


This article explains in detail how to use Python to compute sentiment scores for financial market text data. It is shared here as a reference and should give you a working understanding of the approach.

1. Introduction to tushare

Tushare is a popular free, open-source library for financial and economic data. It comes in a basic version and an advanced version: the basic version can be used without points, while the advanced version requires points.

The basic version of tushare provides:

Trading data, such as historical quotes, adjusted-price data, real-time quotes, etc.

Investment reference data, such as distribution plans, earnings forecasts, lifting of restricted shares, fund holdings, new share (IPO) data, margin trading

Stock classification data: industry, concept, region, small and medium-sized enterprise board, ChiNext (GEM), risk warning board, etc.

Fundamental data: stock list, performance report (main table), profitability, operating ability, solvency, etc.

Macroeconomic data, such as deposit interest rates, loan interest rates, GDP data, producer price index, consumer price index

News event data, such as Sina Guba (stock forum)

Dragon and Tiger list (top trader list) data

Interbank offered rates

Movie box office

Installation

    !pip3 install tushare

Output:

    Collecting tushare
      Downloading https://files.pythonhosted.org/packages/a9/8b/2695ad38548d474f4ffb9c95548df126e82adb24e56e2c4f7ce1ef9fbd01/tushare-1.2.43.tar.gz (168kB)
    Building wheels for collected packages: tushare
      Running setup.py bdist_wheel for tushare ... done
      Stored in directory: /Users/thunderhit/Library/Caches/pip/wheels/4b/28/7b/62d7a4155b34be251c1840e7cecfa4c374812819c59edba760
    Successfully built tushare
    Installing collected packages: tushare
    Successfully installed tushare-1.2.43
    You are using pip version 18.1, however version 19.2.3 is available.
    You should consider upgrading via the 'pip install --upgrade pip' command.

2. News data

The news event interface mainly provides rolling news on domestic finance, securities, Hong Kong stocks and futures, as well as notice ("information mine") data for individual stocks. At present, however, only the Sina Guba (stock forum) API is available; the others require the advanced version of tushare.

The guba_sina interface fetches the featured posts on the home page of the Sina Finance stock forum (Guba). It currently returns about 17 featured posts. A parameter controls whether the post content is included; by default it is not.

Parameter description:

show_content: boolean, whether to include the content, default False

Return value description:

title, message title

content, message content (only when show_content=True)

ptime, release time

rcounts, number of reads

Call method

    import tushare as ts

    # show_content=True also returns the body of each post
    newsdata = ts.guba_sina(show_content=True)
    newsdata.head(10)

3. Read the dictionary

The Chinese financial sentiment dictionary prepared earlier is stored as csv files, so we read it with pandas.

    import pandas as pd

    df = pd.read_csv('CFSD/pos.csv', encoding='gbk')
    df.head()

We wrap the dictionary loading in a function.

    def read_dict(file, header):
        """
        file: path to the dictionary csv file
        header: column name in the csv file, e.g. 'postive'
        Read the csv dictionary and return its words as a list.
        """
        df = pd.read_csv(file, encoding='gbk')
        return list(df[header])

    poswords = read_dict(file='CFSD/pos.csv', header='postive')
    negwords = read_dict(file='CFSD/neg.csv', header='negative')
    negwords[:5]

Output:

    ['working behind closed doors', 'blocking', 'in the clouds', 'dragging', 'overheating']
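For readers who want to see what the dictionary loading step amounts to without pandas, here is a minimal sketch using only the standard library. The helper name read_word_column and the in-memory sample data are hypothetical stand-ins for illustration; they are not part of the CFSD dictionary or of the original code.

```python
import csv
import io


def read_word_column(csv_file, header):
    """Return the non-empty entries of one named column from an open CSV file."""
    reader = csv.DictReader(csv_file)
    # Strip whitespace and drop empty cells so they never end up in the word list
    return [row[header].strip() for row in reader if (row.get(header) or "").strip()]


# Hypothetical in-memory stand-in for a dictionary file such as CFSD/pos.csv
sample = io.StringIO("postive\nprofit\ngrowth\n\nrally\n")
words = read_word_column(sample, "postive")
print(words)
```

For a real file, the same function could be given a handle opened with open(path, encoding='gbk', newline=''), assuming the file uses the gbk encoding as in the pandas call.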

4. Sentiment analysis method

Here we run a sentiment analysis on the news content column. The idea is to compute the proportion of positive words and of negative words in each content text. We use the pandas df.agg(func) method to score the text in the content column, which requires defining the sentiment functions to be called. Note that the denominator can be 0 (for example, when the text is empty), so each function wraps the calculation in try/except and returns 0 when an exception is raised.

    import jieba

    def pos_senti(content):
        """
        content: text to analyze
        Return the proportion of positive words among all words in the text.
        """
        try:
            pos_word_num = 0
            words = jieba.lcut(content)
            for kw in poswords:
                pos_word_num += words.count(kw)
            return pos_word_num / len(words)
        except Exception:
            # e.g. empty text (zero denominator) or non-string content
            return 0

    def neg_senti(content):
        """
        content: text to analyze
        Return the proportion of negative words among all words in the text.
        """
        try:
            neg_word_num = 0
            words = jieba.lcut(content)
            for kw in negwords:
                neg_word_num += words.count(kw)
            return neg_word_num / len(words)
        except Exception:
            # e.g. empty text (zero denominator) or non-string content
            return 0
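To make the ratio concrete, the following sketch computes the same kind of proportion on an already tokenized text, so it runs without jieba or the CFSD files. The function word_ratio, the toy lexicons and the token list are all hypothetical and chosen only for illustration. Unlike the loop over the dictionary with words.count, this variant counts the tokens once with collections.Counter and checks the empty case explicitly instead of relying on an exception.

```python
from collections import Counter


def word_ratio(words, lexicon):
    """Share of tokens in `words` that appear in `lexicon` (0.0 for empty input)."""
    if not words:
        # Explicit guard for the zero-denominator case
        return 0.0
    counts = Counter(words)
    # set() removes duplicate lexicon entries so no token is counted twice
    hits = sum(counts[w] for w in set(lexicon))
    return hits / len(words)


# Toy lexicons and tokens, invented for this example
toy_pos = ["gain", "strong"]
toy_neg = ["loss"]
tokens = ["shares", "gain", "on", "strong", "earnings", "despite", "loss", "warning"]

pos_score = word_ratio(tokens, toy_pos)  # 2 of 8 tokens are positive
neg_score = word_ratio(tokens, toy_neg)  # 1 of 8 tokens is negative
print(pos_score, neg_score)
```

Counting once keeps the cost roughly proportional to the text length plus the lexicon size, which may matter when the dictionary holds thousands of words.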

Apply the sentiment functions pos_senti and neg_senti to the content column, and assign the scores to the pos and neg columns.

    newsdata['pos'] = newsdata['content'].agg(pos_senti)
    newsdata['neg'] = newsdata['content'].agg(neg_senti)
    newsdata.head(10)

Our data now has two scores, pos and neg. We can also define a rule that classifies the sentiment of each text.

When pos is larger than neg, the text is judged to be positive.

When pos is smaller than neg, the text is judged to be negative.

This is not rigorous: to keep the tutorial simple, the case where the two scores are equal is not handled separately, so a comparison of the form pos > neg labels such a text as negative.

    newsdata['senti_classification'] = newsdata['pos'] > newsdata['neg']
    newsdata['senti_classification'] = newsdata['senti_classification'].map({True: 'positive', False: 'negative'})
    newsdata.head(10)
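If the equal case matters for your data, one possible extension is a three-way rule that returns a neutral label on ties. The sketch below is an assumption about how that could look, not part of the original tutorial; the function name classify and the sample score pairs are hypothetical.

```python
def classify(pos, neg):
    """Three-way sentiment label from a positive and a negative score."""
    if pos > neg:
        return "positive"
    if pos < neg:
        return "negative"
    # Equal scores, including the case where both are 0
    return "neutral"


# Invented (pos, neg) score pairs for illustration
scores = [(0.25, 0.125), (0.0, 0.1), (0.0, 0.0)]
labels = [classify(p, n) for p, n in scores]
print(labels)
```

On the DataFrame, a rule like this could be applied row by row, for example with newsdata.apply(lambda row: classify(row['pos'], row['neg']), axis=1).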

Summary

At this point we have implemented a simple sentiment calculation.

When reusing this article, keep the following in mind:

The sentiment dictionary used in this Python tutorial is CFSD, a Chinese financial sentiment dictionary. You can use a dictionary from your own field to build poswords and negwords.

The sentiment scoring functions (pos_senti and neg_senti) give different results when a different algorithm is used.

The positive/negative classification here is fairly rough and does not treat equal scores as a neutral case.
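As an illustration of how a different algorithm changes the result, the sketch below combines the two scores into a single net polarity value, (pos - neg) / (pos + neg). This formula is a commonly used alternative and is an assumption here, not the method of this tutorial; the function name net_polarity is hypothetical.

```python
def net_polarity(pos, neg):
    """Net polarity in [-1, 1] for non-negative scores; 0.0 if both are zero."""
    total = pos + neg
    if total == 0:
        # No sentiment words at all: treat the text as neutral
        return 0.0
    return (pos - neg) / total


print(net_polarity(0.25, 0.125))  # positive words dominate, result above 0
print(net_polarity(0.0, 0.2))     # only negative words, result is -1.0
print(net_polarity(0.0, 0.0))     # no sentiment words, result is 0.0
```

A single signed score like this makes texts easier to rank, but it discards how dense the sentiment words are in the text, so it is not interchangeable with the separate pos and neg proportions.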

With these points in mind, the code in this Python tutorial can be reused. However good the code is, though, you still need to know Python, understand how programs are structured, and be comfortable writing and modifying code; otherwise it will be hard to adapt.

That concludes this walkthrough of using Python for sentiment analysis of financial market text data. I hope it is of some help and gives you a starting point for learning more.
