2025-01-16 Update From: SLTechnology News&Howtos, Shulou (Shulou.com), 06/03 report
This article answers the question "What are the Python visualization toolkits?" The content is easy to follow and clearly laid out, and I hope it clears up your doubts. Let's go through the toolkits one by one.
Matplotlib, Seaborn and Pandas
There are several reasons for putting these three packages together. First, Seaborn and Pandas are built on Matplotlib: when you use Seaborn, or df.plot() in Pandas, you are actually running code that other people wrote with Matplotlib. As a result, the plots look similar, and the syntax used to customize them is also very similar.
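A quick way to see this relationship is to check what the Pandas plotting call returns. The sketch below uses a small made-up DataFrame (the column names and values are illustrative only, not the article's data) and the non-interactive Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"team": ["A", "B", "C"], "salary": [3.0, 5.0, 4.0]})

# df.plot() hands the drawing over to Matplotlib and returns a Matplotlib Axes
ax = df.plot(x="team", y="salary", kind="bar")
is_mpl_axes = isinstance(ax, matplotlib.axes.Axes)

# Because it is an ordinary Axes, the usual Matplotlib calls customize it
ax.set_title("Salary by team")
ax.set_ylabel("Salary")
print(is_mpl_axes)
```

This is why customization tricks learned for one of the three packages usually carry over to the other two.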
When it comes to these visualization tools, I think of three words: Exploratory, Data, and Analysis. These packages are great for a first look at the data, but they fall short for presentations.
Matplotlib is a relatively low-level library, but the degree of customization it supports is incredible (so don't simply rule it out for presentations!). Still, there are other tools that are better suited to presenting results.
Matplotlib also offers style selection, which imitates the look of popular tools such as ggplot2 and xkcd. The following are example plots I made with Matplotlib and related tools:
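Before those examples, here is a minimal sketch of the style selection just mentioned. It assumes only a standard Matplotlib install; the plotted numbers are made up, and the Agg backend is used so nothing needs a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Style sheets bundled with Matplotlib; 'ggplot' imitates the ggplot2 look
has_ggplot_style = "ggplot" in plt.style.available

# Apply the style only inside this block so other figures are unaffected
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([1, 2, 3, 4], [1, 4, 9, 16], label="y = x^2")
    ax.legend()

# plt.xkcd() is also usable as a context manager and gives a hand-drawn look
with plt.xkcd():
    fig2, ax2 = plt.subplots()
    ax2.bar(["a", "b", "c"], [3, 1, 2])

# Call fig.savefig("name.png") or plt.show() to actually view the figures
```

Using the context-manager form keeps the style change local; plt.style.use("ggplot") would instead change the style for every later figure in the session.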
While working with basketball team salary data, I wanted to find the teams with the highest median salary. To show the result, I drew each team's salary as a bar colored with that team's color, so you can see which team a player should join to be better paid.
    import seaborn as sns
    import matplotlib.pyplot as plt

    # One xkcd color per team. top10 is assumed to be a DataFrame with
    # Team and Salary columns that was built earlier (not shown here).
    color_order = ['xkcd:cerulean', 'xkcd:ocean',
                   'xkcd:black', 'xkcd:royal purple',
                   'xkcd:royal purple', 'xkcd:navy blue',
                   'xkcd:powder blue', 'xkcd:light maroon',
                   'xkcd:lightish blue', 'xkcd:navy']
    sns.barplot(x=top10.Team,
                y=top10.Salary,
                palette=color_order).set_title('Teams with Highest Median Salary')
    # Show the salary axis in scientific notation
    plt.ticklabel_format(style='sci', axis='y', scilimits=(0, 0))
The second plot is a Q-Q plot of the residuals from a regression experiment. Its main purpose is to show how to make a useful plot in as few lines as possible, even if it is not especially pretty.
    import matplotlib.pyplot as plt
    import scipy.stats as stats

    # model2 is a regression model
    log_resid = model2.predict(X_test) - y_test
    # Compare the residuals with a normal distribution and draw onto pyplot
    stats.probplot(log_resid, dist="norm", plot=plt)
    plt.title("Normal Q-Q plot")
    plt.show()
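The snippet above depends on a fitted model and test data that are not shown. As a self-contained sketch of the same call, the version below substitutes randomly generated numbers for the residuals (an assumption made purely so the code runs on its own) and draws onto an explicit Axes.

```python
import numpy as np
import scipy.stats as stats
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

# Stand-in for model residuals: 200 draws from a standard normal
rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=1.0, size=200)

fig, ax = plt.subplots()
# probplot returns the plotted points (theoretical quantiles, ordered values)
# and a least-squares line fitted through them
(osm, osr), (slope, intercept, r) = stats.probplot(residuals, dist="norm", plot=ax)
ax.set_title("Normal Q-Q plot")
```

If the points hug the fitted line (r close to 1), the residuals are roughly normal; systematic curvature at the ends suggests heavy or light tails.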
In short, Matplotlib and its related tools are very efficient, but they are not the best tools for presentations.
ggplot(2)
You might ask, "Aaron, ggplot is the most commonly used visualization package in R, but aren't you supposed to be writing about Python packages?" People have implemented ggplot2 in Python, copying everything from the aesthetics to the syntax of the package.
From all the material I have seen, it looks just like ggplot2, with the added bonus that it builds on the Pandas Python package. However, Pandas recently deprecated some methods, which left the Python ggplot package incompatible with current Pandas versions.
If you want to use the real ggplot in R (the look, feel, and syntax are the same, minus the dependencies), I discussed this in another article.
In other words, if you must use ggplot in Python, you have to install version 0.19.2 of Pandas, but I recommend that you do not downgrade Pandas just to use a less capable plotting package.
What makes ggplot2 (and, I think, Python's ggplot too) important is that they use the "Grammar of Graphics" to build plots. The basic premise is that you instantiate a plot and then add different features to it separately; that is, the title, axes, data points, trend line, and so on are each styled on their own.
The following is a simple example of ggplot code. We first instantiate the plot with ggplot, set the aesthetic mappings and the data, and then add points, a theme, and the axis and title labels.
    # All Salaries
    ggplot(data=df, aes(x=season_start, y=salary, colour=team)) +
      geom_point() +
      theme(legend.position="none") +
      labs(title = 'Salary Over Time', x='Year', y='Salary ($)')
Bokeh
Bokeh is beautiful. Conceptually, Bokeh is similar to ggplot in that both use the Grammar of Graphics to build plots, but Bokeh has an easy-to-use interface that produces professional graphics and business reports. To illustrate this point, I wrote code to make a histogram from the 538 Masculinity Survey dataset:
    import pandas as pd
    from bokeh.plotting import figure
    from bokeh.io import show

    # is_masc is a one-hot encoded dataframe of responses to the question:
    # "Do you identify as masculine?"

    # Dataframe Prep
    counts = is_masc.sum()
    resps = is_masc.columns

    # Bokeh
    # The figure() call was garbled in this copy; x_range is an assumption
    # (Bokeh needs the category names there to draw a categorical axis)
    p2 = figure(x_range=list(resps))
    p2.vbar(x=resps, top=counts, width=0.6, fill_color='red', line_color='black')
    show(p2)

    # Pandas
    counts.plot(kind='bar')
Using Bokeh to express the survey results
The red bar chart shows the 538 survey responses to the question "Do you identify as masculine?" The short Bokeh portion of the code builds an elegant and professional histogram of response counts, with a sensible font size, y-axis ticks, and formatting.
Most of the code I wrote goes into labeling the axes and title and into giving the bars a color and a border. When I want beautiful, presentable plots, I lean toward Bokeh, because it has already done a lot of the aesthetic work for us.
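To make the labeling and styling work concrete, here is a minimal self-contained sketch. The response categories and counts are hypothetical stand-ins for the survey data, and the title and axis labels are my own choices rather than the author's.

```python
from bokeh.plotting import figure

# Hypothetical response counts standing in for is_masc.sum()
responses = ["Very", "Somewhat", "Not very", "Not at all"]
counts = [10, 8, 3, 1]

# Categorical x-axes need the category names passed as x_range
p = figure(
    x_range=responses,
    title="Do you identify as masculine?",
    x_axis_label="Response",
    y_axis_label="Count",
)

# vbar draws one vertical bar per category; color and border are set here
p.vbar(x=responses, top=counts, width=0.6, fill_color="red", line_color="black")
```

Passing p to show() from bokeh.io opens the result in a browser, where the default toolbar already provides pan, zoom, and save.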
Using Pandas to represent the same data
The blue plot comes from the single Pandas line at the end of the code above, counts.plot(kind='bar'). Both histograms show the same values, but they serve different purposes. In an exploratory setting it is convenient to write one line of Pandas to look at the data, but Bokeh's aesthetics are clearly superior.
Everything that Bokeh provides for convenience has to be customized by hand in Matplotlib, including the angle of the x-axis labels, the background lines, the y-axis ticks, and the fonts (size, italics, bold). The following figure shows some random trends with more customization, using a legend and different colors and line styles.
Bokeh is also a great tool for making interactive business reports.
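For comparison, the hand customization that Matplotlib needs for the items listed above (label angle, background lines, y-axis ticks, fonts) looks roughly like the sketch below. The categories and counts are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt

labels = ["Very", "Somewhat", "Not very", "Not at all"]  # hypothetical categories
counts = [10, 8, 3, 1]                                    # hypothetical counts

fig, ax = plt.subplots()
ax.bar(labels, counts, color="red", edgecolor="black")

ax.tick_params(axis="x", labelrotation=45)     # angle of the x-axis labels
ax.grid(axis="y", linestyle="--", alpha=0.5)   # background lines
ax.set_yticks(range(0, 13, 2))                 # y-axis ticks
ax.set_title("Response counts",                # font size, bold, italics
             fontsize=14, fontweight="bold", fontstyle="italic")
fig.tight_layout()
```

None of these calls is difficult, but each one is a decision that Bokeh makes for you by default.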
Plotly
Plotly is very powerful, but it takes a lot of time to set up and to create plots, and it is not intuitive. After spending most of a morning on Plotly I had almost nothing to show for it, so I went off to lunch. All I had produced was a bar chart without axis labels and a "scatter chart" with lines I could not remove. There are some points to watch out for when getting started with Plotly:
Installation requires an API key and registration, not just a pip install.
Plotly draws from its own data and layout objects, which are unique to it and not intuitive.
The plot layout did not work for me (40 lines of code that did nothing at all!)
But it also has advantages, and there are workarounds for all of the setup drawbacks:
You can edit pictures in the Plotly website and Python environment
Support for interactive pictures and business reports
Plotly works with Mapbox to customize maps
Has great potential to draw excellent graphics.
Here is the code I wrote for this package:
    # Assumed imports (not shown in the original article):
    # import plotly.plotly as py   # this module is chart_studio.plotly in plotly >= 4
    # import plotly.graph_objs as go

    # plot 1 - barplot
    # **note** - the layout lines do nothing and trip no errors
    data = [go.Bar(x=team_ave_df.team,
                   y=team_ave_df.turnovers_per_mp)]

    layout = go.Layout(
        title=go.layout.Title(
            text='Turnovers per Minute by Team',
            xref='paper',
            x=0
        ),
        xaxis=go.layout.XAxis(
            title=go.layout.xaxis.Title(
                text='Team',
                font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
            )
        ),
        yaxis=go.layout.YAxis(
            title=go.layout.yaxis.Title(
                text='Average Turnovers/Minute',
                font=dict(family='Courier New, monospace', size=18, color='#7f7f7f')
            )
        ),
        autosize=True,
        hovermode='closest')

    py.iplot(figure_or_data=data, layout=layout, filename='jupyter-plot',
             sharing='public', fileopt='overwrite')

    # plot 2 - attempt at a scatterplot
    data = [go.Scatter(x=player_year.minutes_played,
                       y=player_year.salary,
                       marker=go.scatter.Marker(color='red', size=3))]

    layout = go.Layout(title="test",
                       xaxis=dict(title='why'),
                       yaxis=dict(title='plotly'))

    py.iplot(figure_or_data=data, layout=layout, filename='jupyter-plot2', sharing='public')
A bar chart showing the average number of turnovers per minute for different NBA teams.
A scatter chart showing the relationship between salary and playing time in the NBA.
Overall, the out-of-the-box aesthetics look good, but I failed many times when I tried to copy the documentation verbatim and change the axis labels. The following picture shows the potential of Plotly and why I spent hours on it:
Some sample diagrams on the Plotly page
Pygal
Pygal is less well known, and like other common plotting packages it uses the Grammar of Graphics to build plots. Because its plot objects are relatively simple, it is a relatively simple plotting package. Using Pygal takes three steps:
Instantiate a figure
Format it using the figure object's attributes
Use figure.add() to add data to the figure.
The main problem I ran into with Pygal was rendering the plot. I had to use the render_to_file option and then open the file in a web browser to see what I had just built.
In the end, it seems to be worth it, because the picture is interactive and has a satisfying and customizable beautification feature. All in all, the package looks good, but it's troublesome to create and render the file.
Networkx
Although Networkx is based on matplotlib, it is still an excellent solution for graph analysis and visualization. Graphs and networks are not my area of expertise, but Networkx makes it quick and easy to represent the connections within a network. Below are different representations I built of a simple graph, followed by some code for plotting a small Facebook network downloaded from Stanford SNAP.
I color-coded each node according to its node number (1-10). The code is as follows:
    options = {
        'node_color': range(len(G)),
        'node_size': 300,
        'width': 1,
        'with_labels': False,
        'cmap': plt.cm.coolwarm
    }
    nx.draw(G, **options)
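The snippet above assumes that a graph G and the pyplot import already exist. A self-contained sketch follows; the ring-shaped graph is a hypothetical stand-in, since the article does not say how the simple graph was built.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import networkx as nx

# Hypothetical stand-in graph: 10 nodes joined in a ring
G = nx.cycle_graph(10)

options = {
    "node_color": list(range(len(G))),  # one number per node ...
    "cmap": plt.cm.coolwarm,            # ... mapped onto this colormap
    "node_size": 300,
    "width": 1,
    "with_labels": False,
}

fig, ax = plt.subplots()
nx.draw(G, ax=ax, **options)
```

Because node_color holds numbers rather than color names, Matplotlib maps them through cmap, so low-numbered nodes come out blue and high-numbered nodes red.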
The code to visualize the sparse Facebook graph mentioned above is as follows:
    import itertools
    import networkx as nx
    import matplotlib.pyplot as plt

    # Each line of the file is a circle name followed by its member ids
    with open('data/facebook/1684.circles', 'r') as f:
        circles = [line.split() for line in f]

    # Drop the circle name and convert the member ids to integers
    network = []
    for circ in circles:
        cleaned = [int(val) for val in circ[1:]]
        network.append(cleaned)

    G = nx.Graph()
    for v in network:
        G.add_nodes_from(v)

    # Connect every pair of members within a circle
    edges = [itertools.combinations(net, 2) for net in network]
    for edge_group in edges:
        G.add_edges_from(edge_group)

    options = {
        'node_color': 'lime',
        'node_size': 3,
        'width': 1,
        'with_labels': False,
    }
    nx.draw(G, **options)
This graph is very sparse, and Networkx shows this sparseness by maximizing the interval between each cluster.
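The clustering comes from how the edges are built: every pair of members in a circle is connected, so each circle becomes a fully connected group. The sketch below shows that step on three made-up circles written in the same "name followed by member ids" layout the code above reads; the ids are hypothetical.

```python
import itertools
import networkx as nx

# Hypothetical circles: a circle name, then the ids of its members
lines = ["circle0 1 2 3", "circle1 3 4", "circle2 5"]

G = nx.Graph()
for line in lines:
    members = [int(val) for val in line.split()[1:]]  # drop the circle name
    G.add_nodes_from(members)
    # Connect every pair of members so each circle becomes a clique
    G.add_edges_from(itertools.combinations(members, 2))

# Groups of nodes that are reachable from one another
components = list(nx.connected_components(G))
```

Node 3 belongs to two circles and so joins them into one component, while node 5 sits alone; a force-directed layout pushes such separate components apart, which matches the spacing seen in the plot.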
There are many packages for data visualization, and it is impossible to say which one is the best. Hopefully, after reading this article, you have a sense of how the different aesthetics and code styles fit different situations.