How to draw a box plot of numerical data with Python's seaborn

Updated 2025-03-29. Source: SLTechnology News&Howtos (shulou), Development

This article shows how to draw a box plot of numerical data with Python's seaborn. The content is short and straightforward, and we walk through it step by step below.

I. Concept introduction

A box plot (also called a box diagram or box-and-whisker plot) is often used in data exploration and descriptive analysis to show the numerical distribution of continuous data across several classes, which makes it easy to compare classes and to spot outliers quickly.

In a box plot, each sequence of continuous values forms one box.

Each box mainly shows the lower quartile Q1 (25%), the median (50%), and the upper quartile Q3 (75%). The distance from Q1 to Q3 is the interquartile range (IQR = Q3 - Q1). The boundaries for outliers are called the upper and lower limits: the upper limit is Q3 + 1.5 IQR and the lower limit is Q1 - 1.5 IQR. Points beyond these limits are called outliers. How much outliers matter depends on the scenario. If we want to study the salary level of a target group, we usually focus on the median and the IQR rather than on the outliers.
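As a minimal sketch of the quantities described above, the following computes Q1, the median, Q3, the IQR, and the 1.5 IQR limits with NumPy. The sample values and the helper name box_stats are illustrative assumptions, not part of the original data.

```python
import numpy as np

def box_stats(values):
    """Return quartiles, IQR, outlier limits, and outliers of a 1-D sample."""
    data = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr  # lower limit
    upper = q3 + 1.5 * iqr  # upper limit
    outliers = data[(data < lower) | (data > upper)]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower": lower,
        "upper": upper,
        "outliers": outliers.tolist(),
    }

# Made-up sample: eight ordinary values and one extreme value.
sample = [8, 9, 10, 11, 12, 13, 14, 15, 40]
stats = box_stats(sample)
print(stats)
```

For this sample the limits work out to 4 and 20, so only the value 40 is flagged as an outlier.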

II. Data display

Using a web crawler, we collected a sample of job postings for three search terms (data analyst, data mining engineer, algorithm engineer) in eight cities from the home page of the Boss direct employment (Boss Zhipin) recruitment site at a single point in time. The resulting Excel files are shown below.

The data in each table is as follows:

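(The average monthly salary is calculated simply as the midpoint of the salary range multiplied by the number of salary months per year and divided by 12. For example, for a posting of 8k-10k with 16 months of salary, the average monthly salary is 9000 * 16 / 12 = 12000.)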
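The calculation above can be sketched as a small function. The name average_monthly_salary and its parameters are hypothetical; the original article does not show how this column was actually computed.

```python
def average_monthly_salary(low, high, months_per_year=12):
    """Midpoint of the salary range, spread over 12 calendar months."""
    midpoint = (low + high) / 2
    return midpoint * months_per_year / 12

# The example from the text: 8k-10k with 16 months of salary per year.
example = average_monthly_salary(8000, 10000, months_per_year=16)
print(example)  # 12000.0
```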

III. Data import

You only need to import the job title and average monthly salary columns from each table.

    import pandas as pd

    city8_fullname = ['Beijing', 'Chongqing', 'Wuhan', 'Shenzhen',
                      'Nanjing', 'Guangzhou', 'Chengdu', 'Shanghai']
    job_type = ['data analyst', 'data mining engineer', 'algorithm engineer']

    # Read the two needed columns from each city's file and tag rows with the city.
    salary_dic = {}
    for i in range(len(city8_fullname)):
        df = pd.read_excel('./Boss direct employment data-eight cities/Boss direct employment-'
                           + city8_fullname[i] + '.xls')
        # .copy() avoids modifying a slice of df when the city column is added
        salary_dic[city8_fullname[i]] = df[['job title', 'average monthly salary']].copy()
        salary_dic[city8_fullname[i]]['city'] = city8_fullname[i]

    # Integrate into the format required for drawing: one long table for all cities
    salary = salary_dic[city8_fullname[0]]
    for i in range(1, len(city8_fullname)):
        salary = pd.concat([salary, salary_dic[city8_fullname[i]]], ignore_index=True)

The resulting data structure is as follows:
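To illustrate the shape of the combined table, here is a minimal sketch that builds the same long format (one row per posting, with job title, average monthly salary, and city columns) from two small in-memory tables. The values and the name salary_demo are made up for illustration and stand in for the Excel files, which are not available here.

```python
import pandas as pd

# Hypothetical per-city tables standing in for the Excel files.
tables = {
    "Beijing": pd.DataFrame({
        "job title": ["data analyst", "algorithm engineer"],
        "average monthly salary": [20.0, 35.0],
    }),
    "Wuhan": pd.DataFrame({
        "job title": ["data analyst", "algorithm engineer"],
        "average monthly salary": [12.0, 22.0],
    }),
}

# Tag each table with its city, then stack them into one long table.
frames = [df.assign(city=city) for city, df in tables.items()]
salary_demo = pd.concat(frames, ignore_index=True)
print(salary_demo)
```

Each row keeps its own city label, which is what lets seaborn place the city on the x axis and group by job title within each city.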

IV. Drawing

Here we use Python with the seaborn package.

    import matplotlib.pyplot as plt
    import seaborn as sns

    # Display settings for Chinese characters and the minus sign
    plt.rcParams['font.sans-serif'] = 'Microsoft YaHei'
    plt.rcParams['axes.unicode_minus'] = False

    # Draw
    plt.figure(figsize=(14, 8), dpi=100)
    sns.boxplot(x='city', y='average monthly salary', data=salary, hue='job title')
    sns.stripplot(x='city', y='average monthly salary', data=salary,
                  color='black', size=2, jitter=1)
    # Light vertical lines between neighbouring cities
    for i in range(len(salary['city'].unique()) - 1):
        plt.vlines(i + .5, 10, 45, linestyles='solid', colors='gray', alpha=0.2)
    plt.title('Salary distribution by job type in eight cities', fontsize=20)
    plt.legend(title='job type')
    plt.xticks(fontsize=14)
    plt.xlabel('city', fontsize=16)
    plt.ylabel('average monthly salary', fontsize=16)
    plt.yticks(fontsize=14)
    plt.savefig(r'./drawing results/payroll-payroll distribution-box chart.png')

sns.stripplot: draws the individual data points as a scatter over each category. It is not suitable for large samples; in that case a half-density, half-box style of plot can be used instead to keep points from covering each other.

plt.vlines: draws auxiliary vertical lines, used here to separate neighbouring cities.

hue: the grouping variable. Here it acts as a second classification dimension in addition to the city.
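The three pieces described above can be tried together without the original data. The following is a minimal sketch on randomly generated values for two cities and two job titles. The data, the variable names, and the output file name boxplot_demo.png are assumptions for illustration only, and the separator lines span the current y-axis range rather than fixed values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up salaries: 30 postings per city and job title.
rng = np.random.default_rng(0)
rows = []
for city in ["Beijing", "Wuhan"]:
    for job in ["data analyst", "algorithm engineer"]:
        for value in rng.normal(loc=20, scale=5, size=30):
            rows.append({"city": city, "job title": job,
                         "average monthly salary": value})
demo = pd.DataFrame(rows)

fig, ax = plt.subplots(figsize=(8, 5))
# hue splits each city's box into one box per job title.
sns.boxplot(x="city", y="average monthly salary", hue="job title",
            data=demo, ax=ax)
# Overlay the raw points in black, grouped by city only.
sns.stripplot(x="city", y="average monthly salary", data=demo,
              color="black", size=2, jitter=True, ax=ax)

# One separator line between each pair of neighbouring cities.
n_cities = demo["city"].nunique()
y_min, y_max = ax.get_ylim()
for i in range(n_cities - 1):
    ax.vlines(i + 0.5, y_min, y_max, linestyles="solid",
              colors="gray", alpha=0.2)

ax.legend(title="job type")
ax.set_xlabel("city")
ax.set_ylabel("average monthly salary")
fig.savefig("boxplot_demo.png")
```

Passing hue only to the box plot keeps the legend limited to the job titles, while the strip plot adds the points for each city as a whole.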

The result is as follows:

In the picture above, scattered points are added on top of the box plot so that we can see where salaries are concentrated in each city.
