
A Case Study for Getting Started with Python Data Visualization

2025-04-06 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article walks through an introductory case study of Python data visualization. The content is simple, clear, and easy to follow, so let's work through it step by step.

First of all, what libraries do we use to draw pictures?

Matplotlib

matplotlib is the most basic visualization library in Python. Python data visualization generally starts with matplotlib and then branches out to other libraries.

Seaborn

Seaborn is a higher-level visualization library built on matplotlib, aimed at variable and feature selection in data mining and machine learning. With short code, Seaborn can draw charts that describe data in more dimensions.

Other libraries include

Bokeh (a library for interactive visualization in the browser, which lets analysts interact with the data); Mapbox (a visualization toolkit with a more powerful engine for handling geographic data), etc.

This article mainly uses matplotlib for case study.

Step 1: identify the problem and select the graphic

The business may be complex, but once it is broken down we need to identify the specific question we want the chart to answer. To train this kind of analytical thinking, see the methods in the McKinsey method and the Pyramid Principle.

Summaries of how to choose a chart type are widely available on the Internet.

In Python, charts can be built from the following four basic visual elements:

Points: scatter plot, for 2D data and simple two-variable relationships

Lines: line plot, for 2D data such as time series

Columns: bar plot, for 2D data such as statistics by category

Color: heatmap, for showing a third dimension
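The four elements above can be sketched side by side. This is a minimal illustration, not part of the original case study: the data is randomly generated, and the variable names (rng, heat) and the Agg backend line are assumptions made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the made-up data is reproducible
x = np.arange(10)
y = rng.normal(size=10).cumsum()

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].scatter(x, y)  # points: a simple 2D relationship
axes[0, 0].set_title("scatter")

axes[0, 1].plot(x, y)  # line: change along an ordered axis
axes[0, 1].set_title("line")

axes[1, 0].bar(list("abcde"), rng.integers(1, 10, size=5))  # bars: one value per category
axes[1, 0].set_title("bar")

heat = axes[1, 1].imshow(rng.random((5, 5)), cmap="viridis")  # color encodes a third dimension
axes[1, 1].set_title("heatmap")
fig.colorbar(heat, ax=axes[1, 1])

fig.tight_layout()
```

Each panel uses a different Axes method (scatter, plot, bar, imshow), which is why choosing the chart type first makes the later function lookup straightforward.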

There are relationships among data, such as distribution, composition, comparison, connection and changing trend. Corresponding to different relationships, select the corresponding graphics to display.

Step 2: convert data and apply functions

A great deal of programming work in data analysis and modeling is based on data preparation: loading, cleaning, transformation, and reshaping. In the visualization step, we also need to sort out the data, convert it to the format we need, and then use the visualization method to complete the drawing.

Here are some common data conversion methods:

Merge: merge, concat, combine_first (similar to a full outer join in a database)

Reshape: reshape; pivoting: pivot (similar to an Excel PivotTable)

Deduplication: drop_duplicates

Mapping: map

Fill and replace: fillna, replace

Rename axis index: rename

Other common operations include converting categorical variables into a "dummy variable matrix" with the get_dummies function, capping the values in a DataFrame column at a limit, and so on.

As for the plotting function, look up the Python function that corresponds to the chart type chosen in step 1.
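Several of the conversions listed above can be chained on a small table. The sketch below uses invented data (the sales and regions frames and their column names are hypothetical) and uses clip as one possible way to cap a column at a limit value.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format data with one missing value and one duplicated row
sales = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q2"],
    "amount": [10.0, np.nan, 7.0, 300.0, 300.0],
})
regions = pd.DataFrame({"city": ["A", "B"], "region": ["north", "south"]})

clean = (
    sales.drop_duplicates()                      # deduplication: drops the repeated B/Q2 row
         .fillna({"amount": 0.0})                # fill the missing amount with 0
         .merge(regions, on="city", how="left")  # database-style join on the city column
)
clean["amount"] = clean["amount"].clip(upper=100)                # cap the column at 100
clean["quarter_no"] = clean["quarter"].map({"Q1": 1, "Q2": 2})   # map labels to numbers

# Pivot from long to wide: one row per city, one column per quarter
wide = clean.pivot(index="city", columns="quarter", values="amount")

# Dummy variable matrix for a categorical column
dummies = pd.get_dummies(clean["region"])
```

The wide table is in the shape that plotting methods such as DataFrame.plot expect: the index becomes the x-axis and each column becomes one series.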

Step 3: parameter setting, clear at a glance

After the initial chart is drawn, we can adjust the color (color), line style (linestyle), and marker (marker), as well as other decorations such as the title (title), axis labels (xlabel, ylabel), axis ticks (set_xticks), and legend (legend), to make the chart more intuitive.

Step 3 builds on step 2: it refines the chart so that it reads more clearly. The specific parameters are documented with each drawing function.

The basis of visual drawing

The basis of Matplotlib drawing

# import packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Figure and Subplot

All matplotlib graphics live inside a Figure (the canvas), and subplots provide the space in which images are drawn. You cannot draw on an empty figure directly; you must first create one or more subplots with add_subplot.

The figsize parameter specifies the image size.

# create the canvas
fig = plt.figure()
# create a subplot; 221 means the first image in a grid of 2 rows and 2 columns
ax1 = fig.add_subplot(221)
# nowadays it is more common to create the canvas and the images in one call;
# 2, 2 means a 2x2 canvas that can hold four images
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
# the sharex and sharey parameters of plt.subplots make all subplots use the same x- and y-axis scale
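As a small illustration of the plt.subplots form, the returned axes object is a NumPy array of Axes that can be indexed or iterated. The data and panel titles below are made up, and the Agg backend line is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(8, 6))

# axes is a 2x2 array; axes.flat walks it row by row
for i, ax in enumerate(axes.flat):
    ax.plot(np.arange(10) * (i + 1))  # each panel gets a steeper line
    ax.set_title(f"panel {i + 1}")

# a single panel can also be addressed by row and column
axes[0, 1].set_ylabel("row 0, column 1")
```

Because sharex and sharey are set, all four panels end up with the same axis limits, which makes the slopes directly comparable.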

Color (color), marker (marker), and line style (linestyle)

The plot function of matplotlib accepts a set of x and y coordinates, plus an optional format string that abbreviates the color and line style: 'g--' means green with a dashed line ('--'). The same settings can also be given explicitly as keyword arguments.

A line can also carry markers that highlight the position of the data points. The marker can go in the format string too, conventionally after the color and before the line style, as in 'go--'.

plt.plot(np.random.randn(30), color='g', linestyle='--', marker='o')
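To make the equivalence concrete, the sketch below draws one line with a format string and one with explicit keyword arguments, then keeps the returned Line2D objects so their properties can be compared. The data and the variable names short and explicit are illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

y = np.arange(10) ** 2
fig, ax = plt.subplots()

# format string: color 'g', marker 'o', line style '--'
(short,) = ax.plot(y, "go--")

# the same styling spelled out with keyword arguments
(explicit,) = ax.plot(y + 10, color="g", marker="o", linestyle="--")
```

Both calls return a list with one Line2D, and the two lines report the same color, marker, and line style. The keyword form is longer but easier to read when many properties are set.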

Ticks, labels and legends

The xlim and xticks functions of plt control the axis range and the tick positions and tick labels; the corresponding methods on a subplot are set_xlim, set_xticks, and set_xticklabels.

The current parameter value is returned when the method is called without a parameter, and the parameter value is set when the method is called with a parameter.

plt.plot(np.random.randn(30), color='g', linestyle='--', marker='o')
plt.xlim([0, 15])  # horizontal axis range changed to 0-15
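The getter and setter behaviour described above can be checked directly. In this sketch the tick labels "start" and "end" are invented for illustration, and the Agg backend line is an assumption so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.arange(30))

before = plt.xlim()   # no argument: returns the current (left, right) limits
plt.xlim([0, 15])     # with an argument: sets new limits
after = plt.xlim()    # reading again shows the new range

# plt.xticks takes tick positions and, optionally, their labels in one call
plt.xticks([0, 5, 10, 15], ["start", "5", "10", "end"])
```

The pyplot functions act on the current subplot, so the same result can be obtained with ax.set_xlim and ax.set_xticks when an Axes object is at hand.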


Set title, axis label, scale and scale label

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(np.random.randn(1000).cumsum())
ticks = ax.set_xticks([0, 250, 500, 750, 1000])  # set tick values
labels = ax.set_xticklabels(['one', 'two', 'three', 'four', 'five'])  # set tick labels
ax.set_title('My first Plot')  # set the title
ax.set_xlabel('Stage')  # set the axis label

Add Legend

The legend is another important tool for identifying chart elements. Pass the label parameter when adding each line to the subplot.

fig = plt.figure(figsize=(12, 5))
ax = fig.add_subplot(1, 1, 1)
# pass in the label parameter to define each label name
# (the style strings and label names below are placeholders)
ax.plot(np.random.randn(1000).cumsum(), 'k', label='one')
ax.plot(np.random.randn(1000).cumsum(), 'k--', label='two')
ax.plot(np.random.randn(1000).cumsum(), 'k.', label='three')
# after the lines are created, call legend to display the labels
ax.legend(loc='best')  # unless the requirements are strict, loc='best' lets matplotlib choose the best location

Notes

In addition to standard chart objects, we can also customize the addition of some text notes or arrows.

Annotations can be added with functions such as text, arrow, and annotate. The text function draws text at the specified x, y coordinates and accepts custom formatting.

plt.plot(np.random.randn(1000).cumsum())
plt.text(600, 10, 'test', family='monospace', fontsize=10)  # 'test' is a placeholder string
# Chinese annotations do not display properly in the default environment; the configuration must be modified to support Chinese fonts. Please search for the specific steps.
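For arrows, annotate combines a text label with an arrow pointing at a data point. The following is a minimal sketch on a sine curve; the curve, the label strings, and the offsets used for the text position are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

y = np.sin(np.linspace(0, 2 * np.pi, 100))
peak = int(np.argmax(y))  # index of the highest point

fig, ax = plt.subplots()
ax.plot(y)

# xy is the point the arrow points at; xytext is where the label is drawn
note = ax.annotate(
    "peak",
    xy=(peak, y[peak]),
    xytext=(peak + 20, y[peak] - 0.3),
    arrowprops=dict(arrowstyle="->"),
)

# plain text without an arrow, placed in data coordinates
ax.text(60, -0.5, "second half", family="monospace", fontsize=10)
```

Both text and annotate take positions in data coordinates by default, so the labels move with the data if the axis limits change.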

Save the chart to a file

Using plt.savefig, you can save the current chart to a file. For example, to save the chart as a PNG file, call plt.savefig with a file path that ends in .png.

The file type is based on the extension. Other parameters include:

fname: a string containing the file path; the extension specifies the file type

dpi: resolution, 100 by default

facecolor, edgecolor: background color of the image, 'w' (white) by default

format: explicitly sets the file format ('png', 'pdf', 'svg', 'ps', 'jpg', etc.)

bbox_inches: the part of the chart to keep. If set to "tight", matplotlib tries to trim the white space around the image
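The parameters above can be combined in a single call. The sketch below writes to an in-memory buffer instead of a file (an assumption made so that nothing is written to disk); because a buffer has no extension, the format parameter has to be given explicitly.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])

buf = io.BytesIO()
fig.savefig(
    buf,
    format="png",         # required here: a buffer has no file extension to infer from
    dpi=150,              # higher resolution than the default
    bbox_inches="tight",  # trim surrounding white space
    facecolor="w",        # white background
)
png_bytes = buf.getvalue()
```

Replacing buf with a path such as a string ending in .png would write the same image to disk, in which case format can be omitted.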

plt.savefig('./plot.jpg')  # save the chart as a JPG image named plot

Drawing functions in Pandas

Matplotlib drawing

Matplotlib is the most basic drawing library and a relatively low-level tool: assembling a chart means calling each underlying component separately. pandas provides many higher-level plotting methods built on matplotlib, so a chart that would otherwise take many lines of code needs far fewer.

Here we use the plotting functionality built into pandas.

import matplotlib.pyplot as plt

Line chart

Both Series and DataFrame have a plot method for generating various types of charts. By default, they generate line charts.

s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
s.plot()  # the index of the Series object is passed to matplotlib to draw the x-axis

df = pd.DataFrame(np.random.randn(10, 4).cumsum(0), columns=['A', 'B', 'C', 'D'])
df.plot()  # plot automatically uses a different color for each column and adds a legend

Parameters of the Series.plot method

label: the label for the chart

style: style string, such as 'g--'

alpha: fill opacity of the image (0-1)

kind: chart type (bar, line, hist, kde, etc.)

xticks: sets the x-axis tick values

yticks: sets the y-axis tick values

xlim, ylim: sets the axis limits, such as [0, 10]

grid: displays grid lines, off by default

rot: rotates the tick labels

use_index: uses the index of the object as tick labels

logy: uses a logarithmic scale on the y-axis
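Several of these parameters can be passed in a single Series.plot call. The sketch below uses made-up data (squares of 1 to 10) and an explicitly created subplot; the values chosen for each parameter are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

s = pd.Series(np.arange(1, 11) ** 2, index=np.arange(0, 100, 10))

fig, ax = plt.subplots()
s.plot(
    ax=ax,
    kind="line",
    style="g--",             # green dashed line
    label="squares",         # name shown in the legend
    alpha=0.7,
    grid=True,
    rot=45,                  # rotate the tick labels
    logy=True,               # logarithmic y-axis
    xlim=[0, 90],
    xticks=[0, 30, 60, 90],
)
ax.legend()
```

The method returns the matplotlib Axes it drew on, so any further decoration from step 3 can be applied to ax afterwards.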

Parameters of the DataFrame.plot method

In addition to the parameters of Series.plot, DataFrame.plot has some options of its own.

subplots: draws each DataFrame column in a separate subplot

sharex, sharey: shares the x and y axes

figsize: controls the image size

title: image title

legend: adds a legend, displayed by default

sort_columns: draws columns in alphabetical order; the current order is used by default (this option has been deprecated in recent pandas versions)
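A short sketch of the DataFrame-specific options follows. The data and the title string are invented, and the call relies only on the subplots, sharex, figsize, title, and legend parameters listed above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(40).reshape(10, 4), columns=list("ABCD"))

# with subplots=True the call returns one Axes per column
axes = df.plot(
    subplots=True,
    sharex=True,
    figsize=(6, 8),
    title="One subplot per column",
    legend=True,
)
```

With subplots=True the return value is an array of Axes rather than a single one, so per-column tweaks are applied by indexing, for example axes[0].set_ylabel.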

Bar chart

Add kind='bar' or kind='barh' to the code that generates a line chart to produce a vertical or horizontal bar chart.

fig, axes = plt.subplots(2)  # two subplots stacked vertically
data = pd.Series(np.random.rand(10), index=list('abcdefghij'))
data.plot(kind='bar', ax=axes[0], rot=0, alpha=0.3)
data.plot(kind='barh', ax=axes[1], grid=True)

There is a very practical technique for bar charts:

Use value_counts to display graphically how often each value occurs in a Series or DataFrame.

For example: df.value_counts().plot(kind='bar')
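As a worked instance of this technique, the sketch below counts the values in a small made-up Series of letter grades and plots the frequencies. The grades variable and its contents are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

grades = pd.Series(list("AABBBCABBA"))

# value_counts returns the frequency of each distinct value, most frequent first
counts = grades.value_counts()

# one bar per distinct value, bar height equal to its frequency
ax = counts.plot(kind="bar", rot=0)
ax.set_ylabel("count")
```

Here B occurs five times, A four times, and C once, so the bars appear in that order from left to right.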

That covers the basic syntax of Python visualization; other chart types are drawn in much the same way.

The key is to follow the three steps in order: think about the problem, choose the chart, and apply the functions. Practice will make you more proficient.

Thank you for reading. That concludes this introductory case study of Python data visualization; how well these techniques work for your own data is best verified through practice.
