2025-04-04 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article introduces the steps for visualizing data in Python. Many people have questions about these steps in day-to-day work, so the editor consulted a range of materials and put together a simple, easy-to-use approach. Hopefully it answers the question of how to visualize data in Python. Let's go through it.
Python data visualization takes three steps:
Identify the problem and choose the chart type
Transform the data and apply the plotting function
Set the parameters so the chart is clear at a glance
1. Which libraries do we use for plotting?
Matplotlib
matplotlib is the most basic visualization library in Python. Python data visualization generally starts with matplotlib and then branches out to other libraries.
Seaborn
Seaborn is a higher-level visualization library built on matplotlib, aimed at variable and feature selection in data mining and machine learning. With Seaborn, a few lines of code can draw charts that describe data with more dimensions.
Other libraries include
Bokeh (a library for interactive visualization in the browser, which lets analysts interact with the data) and Mapbox (a more powerful visualization tool for geographic data), among others.
This article mainly uses matplotlib for its examples.
Step 1: identify the problem and choose the chart type
The business may be complex, but once it is broken down we need to pin down the specific question we want the chart to answer. To train this kind of analytical thinking, study the methods in the McKinsey method and the Pyramid Principle.
Summaries of how to choose a chart type are widely available on the Internet and are a useful reference.
In Python, four basic visual elements cover most charts:
Points: scatter plot, 2D data, suitable for simple two-dimensional relationships
Lines: line plot, 2D data, suitable for time series
Bars: bar plot, 2D data, suitable for category statistics
Color: heatmap, suitable for showing a third dimension
Data can express relationships such as distribution, composition, comparison, connection, and trend over time. Choose the chart type that matches the relationship you want to show.
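The four basic elements above can be sketched in one figure. This is a minimal illustration, not a recommended layout: the data is random, the variable names are made up for the example, and the Agg backend is assumed only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # fixed seed; illustrative data only
x = np.arange(20)
y = rng.normal(size=20).cumsum()

fig, axes = plt.subplots(2, 2, figsize=(8, 6))

axes[0, 0].scatter(x, y)                      # points: simple 2D relationship
axes[0, 0].set_title("scatter")

axes[0, 1].plot(x, y)                         # line: change along an ordered axis
axes[0, 1].set_title("line")

axes[1, 0].bar(["a", "b", "c"], [3, 7, 5])    # bars: category statistics
axes[1, 0].set_title("bar")

heat = axes[1, 1].imshow(rng.random((5, 5)), cmap="viridis")  # color: third dimension
axes[1, 1].set_title("heatmap")
fig.colorbar(heat, ax=axes[1, 1])             # legend for the color scale

fig.tight_layout()
```

In an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure.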
Step 2: transform the data and apply the function
Much of the programming work in data analysis and modeling goes into data preparation: loading, cleaning, transforming, and reshaping. Visualization is no different: we first tidy the data and convert it to the format we need, then apply the plotting method.
Here are some common data transformation methods:
Merge: merge, concat, combine_first (similar to a full outer join in a database)
Reshape: reshape; pivoting: pivot (similar to an Excel PivotTable)
Deduplication: drop_duplicates
Mapping: map
Fill and replace: fillna, replace
Rename axis indexes: rename
Others include the get_dummies function, which converts categorical variables into a dummy-variable matrix, capping the values in a DataFrame column at a limit, and so on.
As for the plotting function, look up the Python function that corresponds to the chart type chosen in step 1.
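A few of the transformation methods listed above can be chained as in the following sketch. The tables, column names, and the cap of 9.5 are hypothetical and exist only to make the example self-contained.

```python
import pandas as pd

# Hypothetical sample tables, for illustration only
sales = pd.DataFrame({
    "city": ["A", "A", "B", "B", "B"],
    "quarter": ["Q1", "Q2", "Q1", "Q2", "Q2"],
    "amount": [10.0, None, 7.0, 9.0, 9.0],
})
regions = pd.DataFrame({"city": ["A", "B"], "region": ["north", "south"]})

clean = (
    sales.drop_duplicates()                      # drop the repeated (B, Q2, 9.0) row
         .fillna({"amount": 0.0})                # fill the missing amount
         .merge(regions, on="city", how="left")  # database-style join
)

# Mapping: derive a label from each value
clean["level"] = clean["amount"].map(lambda v: "high" if v >= 9 else "low")

# Cap a column at a limit value
clean["amount"] = clean["amount"].clip(upper=9.5)

# Pivot from long to wide format: one row per city, one column per quarter
wide = clean.pivot(index="city", columns="quarter", values="amount")

# Convert a categorical column into a dummy-variable matrix
dummies = pd.get_dummies(clean["region"])
```

A wide table like `wide` is convenient for plotting, because each column can become one series in the chart.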
Step 3: set the parameters so the chart is clear at a glance
Once the basic chart is drawn, we can adjust the color (color), line style (linestyle), and marker (marker), or other decorations such as the title (title), axis labels (xlabel, ylabel), axis ticks (set_xticks), and legend (legend), to make the chart more intuitive.
Step 3 builds on step 2: it is the polishing work that makes the chart clearer. The specific parameters are documented with each plotting function.
2. The basics of plotting
The basics of Matplotlib plotting
# Import packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
Figure and Subplot
Every matplotlib chart lives in a Figure (the canvas), and a Subplot provides the space for an individual image. You cannot draw on an empty figure; you must first create one or more subplots with add_subplot.
The figsize argument specifies the figure size.
fig = plt.figure()
# Create a subplot; 221 means this is the first image in a 2-row, 2-column grid.
ax1 = fig.add_subplot(221)
# Nowadays it is more common to create the figure and the axes together as follows.
# 2, 2 means a 2x2 canvas that can hold four images.
fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
# The sharex and sharey parameters of plt.subplots make all subplots use the same x- or y-axis scale.
The spacing between subplots can be adjusted with the Figure's subplots_adjust method.
subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=None, hspace=None)
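As a small sketch of how the spacing parameters behave, the example below draws a 2x2 grid and removes the gaps between subplots. The histogram data is random and the Agg backend is assumed only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2, sharex=True, sharey=True)
for ax in axes.flat:
    ax.hist(np.random.randn(200), bins=20, color="k", alpha=0.5)

# wspace and hspace are fractions of the average axes width and height;
# setting both to 0 removes the gaps between subplots.
fig.subplots_adjust(wspace=0, hspace=0)
```

With shared axes and zero spacing the tick labels of neighbouring subplots can overlap, so in practice a small positive value is often easier to read.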
Color, marker, and line style
matplotlib's plot function accepts a set of x and y coordinates plus a string abbreviation for color and line style: 'g--' means the color is green and the line style is a '--' dashed line. You can also specify them explicitly with keyword arguments.
Line plots can also carry markers to highlight the position of each data point. The marker can go in the format string too, but the marker type and line style must come after the color.
plt.plot(np.random.randn(30), color='g', linestyle='--', marker='o')
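To illustrate the format-string shorthand described above, the following sketch draws the same data twice, once with a format string and once with explicit keyword arguments. The variable names are made up for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(30)
fig, ax = plt.subplots()

# Format string: color 'g', marker 'o', line style '--'
(short,) = ax.plot(data, "go--")

# The same appearance, spelled out with keyword arguments
(explicit,) = ax.plot(data, color="g", marker="o", linestyle="--")
```

The keyword form is longer but easier to read, and it is the only option for colors that have no one-letter abbreviation.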
Ticks, labels, and legends
The xlim and xticks functions of plt control the axis range and the tick positions and tick labels, respectively (on an Axes object, the corresponding methods are set_xlim, set_xticks, and set_xticklabels).
Called without arguments, these functions return the current value; called with arguments, they set the value.
plt.plot(np.random.randn(30), color='g', linestyle='--', marker='o')
plt.xlim()  # called without arguments, returns the current limits; xticks can be called the same way
# Output: approximately (-1.45, 30.45)
plt.plot(np.random.randn(30), color='g', linestyle='--', marker='o')
plt.xlim([0, 15])  # x-axis range changed to 0-15
# Output: (0, 15)
Set the title, axis labels, ticks, and tick labels
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(np.random.randn(1000).cumsum())
ticks = ax.set_xticks([0, 250, 500, 750, 1000])  # set tick positions
labels = ax.set_xticklabels(['one', 'two', 'three', 'four', 'five'])  # set tick labels
ax.set_title('My first Plot')  # set the title
ax.set_xlabel('Stage')  # set the axis label
# Output: Text(0.5, 0, 'Stage')
Add a legend
A legend is another important tool for identifying chart elements. Pass the label argument when adding each plot to the subplot.
fig = plt.figure(figsize=(12, 5))
ax = fig.add_subplot(1, 1, 1)
# Pass the label argument to name each line (the style strings and label names here are illustrative).
ax.plot(np.random.randn(1000).cumsum(), 'k', label='one')
ax.plot(np.random.randn(1000).cumsum(), 'k--', label='two')
ax.plot(np.random.randn(1000).cumsum(), 'k.', label='three')
# Once the plots are created, just call legend to show the labels.
ax.legend(loc='best')  # if the requirements are not strict, loc='best' lets matplotlib choose the best location
Annotations
In addition to the standard chart objects, we can add custom text notes or arrows.
Annotations can be added with functions such as text, arrow, and annotate. The text function draws text at the specified x, y coordinates and also accepts custom formatting.
plt.plot(np.random.randn(1000).cumsum())
plt.text(600, 10, 'test', family='monospace', fontsize=10)  # 'test' is placeholder annotation text
# Chinese annotations do not display properly in the default environment; you need to modify the configuration to support Chinese fonts. Please search for the specific steps yourself.
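The annotate function mentioned above combines a text label with an arrow. The sketch below marks the highest point of a random walk; the label text, the vertical offset of 5, and the variable names are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

values = np.random.randn(1000).cumsum()
peak = int(values.argmax())  # position of the highest point

fig, ax = plt.subplots()
ax.plot(values)
note = ax.annotate(
    "max",                              # placeholder label text
    xy=(peak, values[peak]),            # the point being annotated
    xytext=(peak, values[peak] + 5),    # where the label is drawn
    arrowprops=dict(arrowstyle="->"),   # arrow from the label to the point
    ha="center",
)
```

Because xy and xytext are both given in data coordinates here, the label moves with the data if the axis limits change.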
Save the chart to a file
plt.savefig saves the current chart to a file. For example, to save the chart as a PNG file, call plt.savefig with a file name that ends in .png.
The file type is inferred from the extension. Other parameters include:
fname: a string containing the file path; the extension determines the file type
dpi: resolution, default 100
facecolor, edgecolor: background color of the image, default 'w' (white)
format: sets the file format explicitly ('png', 'pdf', 'svg', 'ps', 'jpg', etc.)
bbox_inches: the part of the chart to keep. If set to 'tight', matplotlib tries to trim the white space around the image
plt.savefig('./plot.jpg')  # saves the image in JPG format with the name plot
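The following sketch passes several of the parameters listed above in one call. The output path is a hypothetical temporary location, and the dpi value of 200 is an arbitrary choice for the example.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(np.random.randn(100).cumsum())

# Hypothetical output location; the .png extension selects the file type
out_path = os.path.join(tempfile.mkdtemp(), "plot.png")

fig.savefig(
    out_path,
    dpi=200,              # higher resolution than the default
    bbox_inches="tight",  # trim white space around the figure
    facecolor="w",        # white background
)
```

Calling savefig on the Figure object, as here, saves that specific figure, whereas plt.savefig saves whichever figure is current.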
3. Plotting functions in pandas
Matplotlib plotting
matplotlib is the most basic plotting library and a relatively low-level tool: assembling a chart requires separate calls to each underlying component. pandas offers many higher-level plotting methods built on matplotlib, so a chart that would otherwise take many lines of code needs only a few with pandas.
Here we use the plotting support built into pandas.
import matplotlib.pyplot as plt
Line chart
Both Series and DataFrame have a plot method for generating various types of charts. By default, they generate line charts.
s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))
s.plot()  # the index of the Series is passed to matplotlib and used for the x-axis
df = pd.DataFrame(np.random.randn(10, 4).cumsum(0), columns=['A', 'B', 'C', 'D'])
df.plot()  # plot automatically uses a different color for each column and adds a legend
Parameters of the Series.plot method
label: the label used in the legend
style: style string, such as 'g--'
alpha: fill opacity of the plot (0-1)
kind: chart type (bar, line, hist, kde, etc.)
xticks: sets the x-axis tick values
yticks: sets the y-axis tick values
xlim, ylim: set the axis limits, for example [0, 10]
grid: show grid lines, off by default
rot: rotate the tick labels
use_index: use the object's index for the tick labels
logy: use a logarithmic scale on the y-axis
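Several of the Series.plot parameters above can be combined in one call, as in this sketch. The label text and the specific tick values are assumptions made for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(10).cumsum(), index=np.arange(0, 100, 10))

ax = s.plot(
    kind="line",
    style="g--",             # green dashed line
    label="walk",            # hypothetical legend label
    alpha=0.8,
    grid=True,               # grid lines are off by default
    rot=45,                  # rotate the tick labels
    xlim=[0, 90],
    xticks=[0, 30, 60, 90],
)
ax.legend()  # show the label passed above
```

The plot method returns a matplotlib Axes, so any further adjustment from the earlier sections (title, annotations, saving) can be applied to `ax` afterwards.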
Parameters of the DataFrame.plot method
In addition to the Series parameters, DataFrame has some options of its own.
subplots: draw each DataFrame column in a separate subplot
sharex, sharey: share the x or y axis
figsize: controls the image size
title: image title
legend: add a legend, shown by default
sort_columns: draw columns in alphabetical order; the current column order is used by default (this option is deprecated in recent pandas versions)
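A minimal sketch of the DataFrame-specific options above, using random data. The title text and figure size are arbitrary choices for the example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import numpy as np
import pandas as pd

df = pd.DataFrame(
    np.random.randn(10, 4).cumsum(0),
    columns=["A", "B", "C", "D"],
)

# One subplot per column, sharing the x-axis
axes = df.plot(
    subplots=True,
    sharex=True,
    figsize=(8, 6),
    title="Random walks",  # hypothetical title
    legend=True,
)
```

With subplots=True the call returns an array of Axes, one per column, rather than a single Axes.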
Bar chart
Adding kind='bar' or kind='barh' to the code that generates a line chart produces a vertical or horizontal bar chart instead.
fig, axes = plt.subplots(2, 1)
data = pd.Series(np.random.rand(10), index=list('abcdefghij'))
data.plot(kind='bar', ax=axes[0], rot=0, alpha=0.3)
data.plot(kind='barh', ax=axes[1], grid=True)
Bar charts have one especially practical use: combined with value_counts, they show how often each value occurs in a Series or DataFrame.
For example, df.value_counts().plot(kind='bar')
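The value_counts pattern above can be tried on a small Series. The data below is hypothetical and chosen so the frequencies are easy to check by eye.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend so the sketch runs headless
import pandas as pd

# Hypothetical categorical data: five a's, three b's, two c's
grades = pd.Series(list("abacabbaac"))

counts = grades.value_counts()       # frequency of each value, most frequent first
ax = counts.plot(kind="bar", rot=0)  # one bar per distinct value
```

Because value_counts sorts by frequency, the bars appear in descending order without any extra sorting step.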
That covers the basic syntax of Python visualization; other chart types are drawn in much the same way.
The key is to follow the three steps: think about the problem, choose the chart, and apply the function. More practice will make you more proficient.
This concludes the walkthrough of the steps for visualizing data in Python. Combining theory with practice is the best way to learn, so go and try it!