2025-01-31 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how to draw box plots with Python's matplotlib library. It covers the plt.boxplot() method and its main parameters, a simple example, a more elaborate example, the criterion for outliers, and how to print the outliers.
1. Box plots and the plt.boxplot() method
A box plot is also called a box-and-whisker plot. Its advantage is that it describes the spread of the data in a fairly robust way and makes outliers in the data easy to identify.
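To make concrete what a box plot summarizes, here is a minimal sketch that computes the minimum, the quartiles, the median and the maximum of a data set with NumPy. The function name five_number_summary and the sample values are made up for illustration; the box of a box plot spans Q1 to Q3 with a line at the median.

```python
import numpy as np


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) of a 1-D sequence."""
    arr = np.asarray(values, dtype=float)
    # np.quantile uses linear interpolation between data points by default
    q1, median, q3 = np.quantile(arr, [0.25, 0.5, 0.75])
    return float(arr.min()), float(q1), float(median), float(q3), float(arr.max())


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # made-up data for illustration
print(five_number_summary(sample))
```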
In Python's matplotlib library, the plt.boxplot() method is used to draw a box plot.
The main parameters of this method are as follows:
Parameter: description
x: the data to draw in the box plot
notch: whether to draw the box in notched form; the default is not notched
sym: the marker shape for outliers; the default is the plus sign (+)
vert: whether to place the box plot vertically
whis: the distance between the whisker limits and the upper and lower quartiles; the default is 1.5 times the interquartile range
positions: the positions of the boxes; the default is [1, 2, ..., N] for N boxes
widths: the width of the boxes; the default is 0.5
patch_artist: whether to fill the boxes with color
meanline: whether to show the mean as a line; the default is a point. This parameter only has an effect when showmeans is True
showmeans: whether to display the mean; not displayed by default
showcaps: whether to display the two cap lines at the ends of the whiskers; displayed by default
showbox: whether to display the box; displayed by default
showfliers: whether to display outliers; displayed by default
boxprops: properties of the box, such as border color and fill color. The fill color (facecolor key) only takes effect when patch_artist is True
medianprops: properties of the median line, such as line style and thickness
meanprops: properties of the mean marker, such as point size and color
capprops: properties of the cap lines at the ends of the whiskers, such as color and thickness
whiskerprops: properties of the whiskers, such as color, thickness and line style
2. Draw a simple box plot
Fix the random seed and generate three sets of random data, so that the data is random but reproducible. The three sets are drawn as three boxes in a single figure.
The global font is set to STKaiti (a KaiTi typeface).
import matplotlib.pyplot as plt
import numpy as np

# figure with a colored background
fig = plt.figure(1, facecolor='#33ff99', figsize=(10, 6))
# global font and axes settings
plt.rcParams['font.sans-serif'] = ['STKAITI']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['axes.facecolor'] = '#cc00ff'
# fixed seed: the data is random but reproducible
np.random.seed(30)
data1 = np.random.randint(20, 100, 200)
data2 = np.random.randint(30, 120, 200)
data3 = np.random.randint(40, 110, 200)
# one box per data set, drawn at x = 1, 2, 3
plt.boxplot([data1, data2, data3])
plt.xticks(range(1, 4), ['A', 'B', 'C'], fontsize=20)
plt.yticks(fontsize=20)
plt.title('Box Chart', fontsize=25, color='#0033cc')
plt.show()
Running this code displays a figure with three boxes labeled A, B and C.
3. Draw a more elaborate box plot
The random data generated above is fairly uniform, so it rarely contains outliers and does not show what a box plot is good at. In the following code, a few values are therefore overwritten by hand with extreme values.
Outliers are marked with the * symbol, and the mean of each data set is marked with a line.
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1, facecolor='#33ff99', figsize=(10, 6))
plt.rcParams['font.sans-serif'] = ['STKAITI']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['axes.facecolor'] = '#cc00ff'
np.random.seed(110)
data1 = np.random.randint(20, 100, 200)
data2 = np.random.randint(30, 120, 200)
data3 = np.random.randint(40, 110, 200)
# overwrite a few values with extreme ones so that outliers show up
data1[100:102] = [142, 150]
data3[100:103] = [1, 5, 154]
plt.boxplot([data1, data2, data3],
            notch=True,          # notched boxes
            sym='*',             # marker for outliers
            patch_artist=True,   # required for facecolor to take effect
            boxprops={'color': '#ffff00', 'facecolor': '#0066ff'},
            capprops={'color': '#ff3333', 'linewidth': 2},
            showmeans=True, meanline=True)  # draw the mean as a line
plt.xticks(range(1, 4), ['A', 'B', 'C'], fontsize=20)
plt.yticks(fontsize=20)
plt.title('Box Chart', fontsize=25, color='#0033cc')
plt.show()
Running this code displays notched, filled boxes, with the mean of each data set drawn as a line and the outliers marked with *.
4. The criterion for outliers
The criterion for judging outliers can be changed with the whis parameter. By default, a value is treated as an outlier if it lies outside the range [Q1 - 1.5 × IQR, Q3 + 1.5 × IQR], where Q1 and Q3 are the lower and upper quartiles and IQR = Q3 - Q1 is the interquartile range.
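The effect of whis can be checked numerically without drawing anything. The sketch below is an illustration only: the helper names whisker_limits and find_outliers and the sample values are made up, and the quartiles are computed with np.quantile using its default linear interpolation.

```python
import numpy as np


def whisker_limits(values, whis=1.5):
    """Return (lower, upper) limits: Q1 - whis * IQR and Q3 + whis * IQR."""
    arr = np.asarray(values, dtype=float)
    q1, q3 = np.quantile(arr, [0.25, 0.75])
    iqr = q3 - q1
    return float(q1 - whis * iqr), float(q3 + whis * iqr)


def find_outliers(values, whis=1.5):
    """Return the values that fall outside the whisker limits."""
    arr = np.asarray(values, dtype=float)
    low, high = whisker_limits(arr, whis)
    return arr[(arr < low) | (arr > high)]


sample = [10, 12, 13, 14, 15, 16, 17, 18, 24]  # made-up data for illustration
for whis in (1.5, 2):
    print('whis =', whis,
          'limits =', whisker_limits(sample, whis),
          'outliers =', find_outliers(sample, whis))
```

In this sample Q1 is 13 and Q3 is 17, so the value 24 lies outside the limits for whis=1.5 but inside them for whis=2. A larger whis widens the accepted range and flags fewer points.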
Make a slight modification to the code from section 3: set whis=2.
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1, facecolor='#33ff99', figsize=(10, 6))
plt.rcParams['font.sans-serif'] = ['STKAITI']
plt.rcParams['axes.unicode_minus'] = False
plt.rcParams['axes.facecolor'] = '#cc00ff'
np.random.seed(110)
data1 = np.random.randint(20, 100, 200)
data2 = np.random.randint(30, 120, 200)
data3 = np.random.randint(40, 110, 200)
# overwrite a few values with extreme ones so that outliers show up
data1[100:102] = [142, 150]
data3[100:103] = [1, 5, 154]
plt.boxplot([data1, data2, data3],
            whis=2,              # limits at 2 x IQR instead of the default 1.5
            notch=True,
            sym='*',
            patch_artist=True,
            boxprops={'color': '#ffff00', 'facecolor': '#0066ff'},
            capprops={'color': '#ff3333', 'linewidth': 2},
            showmeans=True, meanline=True)
plt.xticks(range(1, 4), ['A', 'B', 'C'], fontsize=20)
plt.yticks(fontsize=20)
plt.title('Box Line', fontsize=25, color='#0033cc')
plt.show()
With whis=2, the result no longer shows any outliers.
5. Printing the outliers
The plots above only present the outliers visually. For data analysis this is usually not enough: the outliers often need to be processed further, for example removed.
The following Python code prints the outliers of data3:
import numpy as np

# the seed value is missing in the source; 110 is assumed here to match the sections above
np.random.seed(110)
data1 = np.random.randint(20, 100, 200)
data2 = np.random.randint(30, 120, 200)
data3 = np.random.randint(40, 110, 200)
# overwrite a few values with extreme ones so that outliers show up
data1[100:102] = [142, 150]
data3[100:103] = [1, 5, 154]
Q1 = np.quantile(a=data3, q=0.25)
Q3 = np.quantile(a=data3, q=0.75)
# calculate the interquartile range
QR = Q3 - Q1
# lower limit and upper limit
low_limit = Q1 - 1.5 * QR
up_limit = Q3 + 1.5 * QR
print('lower limit is:', low_limit)
print('upper limit is:', up_limit)
print('outliers are:')
# keep the values below the lower limit or above the upper limit
print(data3[(data3 < low_limit) | (data3 > up_limit)])
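Since removal is named as a typical next step, the same limits can also be used to filter the data. The following is only a sketch under stated assumptions: remove_outliers is a hypothetical helper, not part of matplotlib or NumPy, and the seed value 110 is assumed to match the earlier sections.

```python
import numpy as np


def remove_outliers(values, whis=1.5):
    """Return only the values inside [Q1 - whis * IQR, Q3 + whis * IQR]."""
    arr = np.asarray(values)
    q1, q3 = np.quantile(arr, [0.25, 0.75])
    iqr = q3 - q1
    low, high = q1 - whis * iqr, q3 + whis * iqr
    keep = (arr >= low) & (arr <= high)  # boolean mask of the values to keep
    return arr[keep]


np.random.seed(110)  # assumed seed, as in the earlier sections
data3 = np.random.randint(40, 110, 200)
data3[100:103] = [1, 5, 154]  # same hand-made extreme values as above
cleaned = remove_outliers(data3)
print('values before:', data3.size, 'values after:', cleaned.size)
```

Whether dropping, capping or keeping such values is appropriate depends on the analysis; the filter above only illustrates the mechanics.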
This concludes the article on how to draw a box plot with Python's matplotlib library. Thank you for reading!
© 2024 shulou.com SLNews company. All rights reserved.