2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shows how to build a combined bubble and stacked-bar association chart with Python data visualization tools, using a simple method that is easy to follow.
Preface
Some friends say that charts made with matplotlib are not as attractive as those from js-based libraries (pyecharts, bokeh, plotly, etc.), which are also interactive. Meanwhile seaborn, which wraps matplotlib, seems to need less code.
I wanted to write an overall comparison of these libraries, but doing that without practical examples doesn't fit my style.
So today's target is a chart that the higher-level visualization libraries find difficult (or impossible) to produce:
This chart imitates one from The Economist and shows Canadian immigrants by place of birth.
The visualization libraries that wrap js should be able to do this in a js environment. In Python, the outlook is less optimistic.
I will share how to do this in d3.js when I have the opportunity, and you will find that its approach is very similar to matplotlib's.
The libraries required for this article are as follows:
Line 8: the cycler package is only there to make defining color swatches convenient.
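The original import listing is not reproduced here. A minimal sketch of a plausible import block, assuming only the libraries the text mentions (pandas, matplotlib, cycler); the exact list and its ordering are assumptions, so cycler will not necessarily sit on line 8:

```python
# Assumed import block; the article's own listing may differ.
import pandas as pd                           # wide-table data handling
import matplotlib.pyplot as plt               # figures and axes
from matplotlib.patches import Rectangle      # basic shape used later on
from cycler import cycler                     # convenient color swatches
```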
The data goes like this:
Row 3: the data column for the bubble chart
Row 4: the data columns for the stacked chart
All the general functions in this article assume a wide table: the row index goes on the x-axis, and each column is a separate chart series.
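To make the wide-table layout concrete, here is a small sketch. The region names and every number in it are made up for illustration and are not the article's data; only the column name migrant comes from the text.

```python
import pandas as pd

# Hypothetical wide table: the index holds the x-axis labels (years as strings),
# and each column is one chart series. All values are invented.
df = pd.DataFrame(
    {
        "Asia": [10, 12, 15],
        "Europe": [8, 7, 6],
        "Africa": [3, 4, 5],
        "migrant": [21, 23, 26],   # column later used for the bubbles
    },
    index=["2001", "2002", "2003"],
)
df.index.name = "year"

stack_columns = ["Asia", "Europe", "Africa"]   # series for the stacked chart
bubble_column = "migrant"                      # series for the bubble chart
```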
The colors are defined like this:
M_color_cycle defines the colors for 7 series; the color values were sampled from the original chart.
M_bubble_color is the color of the bubbles.
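A sketch of how such a swatch could be defined with cycler. The hex values below are placeholders, not the colors sampled from the original chart, and the lowercase variable names are an assumption about how the identifiers are spelled in code.

```python
from cycler import cycler

# Placeholder colors for 7 series (NOT the values taken from the sample chart).
m_color_cycle = cycler(color=[
    "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728",
    "#9467bd", "#8c564b", "#e377c2",
])
m_bubble_color = "#7f7f7f"   # placeholder bubble color

# Iterating a cycler yields one property dict per entry.
series_colors = [props["color"] for props in m_color_cycle]
```

A cycler can also be handed to Axes.set_prop_cycle so that successive plotting calls pick up the colors automatically; here the colors are pulled out as a plain list so they can be passed explicitly.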
Space is limited, so I won't explain all the knowledge points in detail.
Breaking it down piece by piece
A complex visualization is usually a combination of several kinds of graphics, and the target chart clearly consists of three parts:
The stacked chart, which is really a set of quadrilaterals.
The bubble chart, which is really a set of circles.
A rectangle in the middle that serves as a connecting decoration.
Why do I use "graphics" to describe them?
If you use higher-level visualization libraries, you will find that they are organized by chart type, which certainly lets you draw quickly. The drawback is that the chart cannot be customized freely.
Matplotlib gives you control over the underlying "graphics" as well as the basic chart operations.
First, let's look at how to make a stacked chart. Here are two series as an example:
Line 7: the Axes.bar method draws a bar chart; its bottom parameter sets the starting position of each column and defaults to 0
Line 11: when drawing the second series, pass the y values of the first series as the bottom of the second, and the stacking effect follows naturally.
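The original listing is not reproduced here, so its line numbers do not apply to the sketch below. It illustrates the same idea with two made-up series: the second call to Axes.bar receives the first series as bottom.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Invented example data
years = ["2001", "2002", "2003"]
first = [10, 12, 15]
second = [8, 7, 6]

fig, ax = plt.subplots()
bars1 = ax.bar(years, first)                   # bottom defaults to 0
bars2 = ax.bar(years, second, bottom=first)    # starts where the first series ends
```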
The chart is as follows:
Knowing this principle, you can define general functions:
All the general functions in this article are based on wide table data.
Line 3: compute the bottom value of each series with a cumulative sum followed by a shift (offset)
Line 5: iterate over the columns of the DataFrame and draw each one separately. M_color_cycle is the color swatch defined earlier
Line 3 uses basic pandas operations. If you are interested, see my pandas column.
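A sketch of what such a general function might look like. The function name plot_stacked_bars, its signature, and the sample data are assumptions for illustration and not the author's code; the cumulative sum plus shift is the technique the text describes.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt
import pandas as pd

def plot_stacked_bars(ax, df, colors):
    """Draw each column of a wide DataFrame as one layer of a stacked bar chart."""
    # Cumulative sum across columns, shifted one column to the right:
    # each series starts where the sum of the previous series ends.
    bottoms = df.cumsum(axis=1).shift(1, axis=1, fill_value=0)
    for name, color in zip(df.columns, colors):
        ax.bar(df.index, df[name], bottom=bottoms[name], color=color, label=name)
    return bottoms

# Invented example data
df = pd.DataFrame(
    {"Asia": [10, 12, 15], "Europe": [8, 7, 6], "Africa": [3, 4, 5]},
    index=["2001", "2002", "2003"],
)
fig, ax = plt.subplots()
bottoms = plot_stacked_bars(ax, df, ["#1f77b4", "#ff7f0e", "#2ca02c"])
```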
The call is as follows:
Line 3: the original data has extra columns, so select the desired columns and sort them horizontally (along the column axis) by the values of the first year.
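The selection and horizontal sort can be sketched with DataFrame.sort_values and axis=1. The data is invented, and the descending order is an assumption, since the text does not say which direction is used.

```python
import pandas as pd

# Invented data with one extra column (migrant) that is not part of the stack
df = pd.DataFrame(
    {
        "Europe": [8, 7, 6],
        "Africa": [3, 4, 5],
        "Asia": [10, 12, 15],
        "migrant": [21, 23, 26],
    },
    index=["2001", "2002", "2003"],
)

regions = ["Europe", "Africa", "Asia"]     # keep only the columns to be stacked
first_year = df.index[0]

# axis=1 reorders the columns by their values in the row labelled first_year.
ordered = df[regions].sort_values(by=first_year, axis=1, ascending=False)
```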
The chart is as follows:
With the basic chart done, all that remains is to adjust some details (such as the position of the y-axis and the tick marks). These are routine operations and very simple.
Next, make a bubble chart.
Graphic attribute mapping
The essence of data visualization is actually the mapping of data to graphic elements.
Looking back at the stacked chart, we successfully mapped three dimensions of the data:
Year, mapped to the horizontal position of the column (x-axis position)
Numeric value, mapped to the height of the column (the height parameter when calling the bar method)
Region, mapped to the color of the column
Consider an extreme example. The data also has a column of immigrant counts (migrant), which we can still map onto the stacked chart:
Although the chart now looks very strange, it does work:
The width of each year's column is tied to the migrant data. The wider the column, the greater the number of immigrants that year.
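A sketch of this width mapping with invented data. Scaling so that the largest value gets a width of 0.9 is an arbitrary choice made here for illustration, not something the text specifies.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt
import pandas as pd

# Invented example data
df = pd.DataFrame(
    {"Asia": [10, 12, 15], "migrant": [21, 23, 26]},
    index=["2001", "2002", "2003"],
)

# Map migrant to column width: the largest value gets width 0.9 (arbitrary scale).
widths = (df["migrant"] / df["migrant"].max() * 0.9).to_numpy()

fig, ax = plt.subplots()
bars = ax.bar(df.index, df["Asia"], width=widths)
```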
By now you should have a feel for the nature of data visualization, and you can also see that each chart has only a limited number of dimensions it can reasonably map.
For example, the column width in the stacked chart above is clearly not a sensible attribute to map to.
The solution is to continue mapping with other "graphics".
We draw a scatter plot in the same coordinate system, with the following mapping:
The year is mapped to the horizontal position of the dot
The vertical position of the dot is a fixed value (anywhere below the columns)
The migrant data is mapped to the size of the dot
The code is as follows:
All the general functions in this article rely on fixed DataFrame column names. For example, the data needs a column named size, which holds the size of each bubble.
Line 6: the dots are drawn with Axes.scatter, and the parameter s sets the size of each dot (in matplotlib, s is the marker area in points squared rather than the radius).
The parameter clip_on is set to False so that dots extending beyond the visible area are not cropped.
The call is as follows:
Line 6: rename the column to match what the function expects
Line 7: the parameter y determines the vertical position of the bubbles. Note that the -25 here is a y-axis value in data coordinates.
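The function and its call can be sketched as follows. The name plot_bubbles, the max_area scaling parameter, and the data are assumptions for illustration; the column name size, clip_on=False, and y=-25 come from the text.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt
import pandas as pd

def plot_bubbles(ax, df, y, color="grey", max_area=1500):
    """Draw one bubble per row; df must contain a column named 'size'."""
    # s is a marker area in points**2, so scale the data into a visible range.
    areas = df["size"] / df["size"].max() * max_area
    return ax.scatter(list(df.index), [y] * len(df), s=areas,
                      color=color, clip_on=False)

# Invented example data
df = pd.DataFrame({"migrant": [21, 23, 26]}, index=["2001", "2002", "2003"])

fig, ax = plt.subplots()
bubbles = plot_bubbles(
    ax,
    df.rename(columns={"migrant": "size"}),   # match the expected column name
    y=-25,                                    # y-axis value in data coordinates
)
```

Because s is an area, a value twice as large yields a bubble with twice the area, not twice the diameter.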
Look at the chart:
Next, add the rectangle in the middle that serves as the connecting decoration.
Draw a figure
Matplotlib has a lot of basic graphics built into it, so it's not hard to create graphics:
This is in
Line 9: create a rectangle; the first parameter is a sequence giving its x and y position.
Line 10: add this graph to the coordinate system
Note that the parameter values set in line 9 above are interpreted in data coordinates by default.
For example, the 40 in [0, 40] means that the lower-left corner of the rectangle sits at the y-axis value 40.
But the 0 in [0, 40] should refer to the x-axis, so why 0?
This is because when we draw, what is passed to the x-axis is a string:
In that case, matplotlib converts the x-axis categories into an ascending sequence of positions starting at 0.
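A sketch that shows both points at once, using invented data: string x values land on integer positions 0, 1, 2, ..., and a Rectangle from matplotlib.patches is placed in those same data coordinates. The rectangle's position and size follow the values discussed in the text; the colors and the bar data are assumptions.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle

# Invented data: 11 year labels passed to the x-axis as strings
years = [str(year) for year in range(2001, 2012)]
values = [50] * len(years)

fig, ax = plt.subplots()
bars = ax.bar(years, values)

# Each string label is placed at an integer position, so bar centers are 0, 1, 2, ...
centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]

# Lower-left corner at x=0 (center of the first column) and y=40,
# spanning 10 x-units and 20 y-units, all in data coordinates.
rect = Rectangle((0, 40), width=10, height=20, facecolor="lightgrey", alpha=0.5)
ax.add_patch(rect)
```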
Matplotlib has six kinds of coordinate system transforms. They are its most important core mechanism, but they are not explained in depth here.
Look at the effect:
The lower-left corner of the rectangle is at the middle of the first column, at the y-axis value 40
The height is exactly 20 units of the y-axis.
The width is exactly the sum of 10 column widths.
Once you know the principle, meeting the requirement is easy:
Look at the effect:
Very good. Now label the bubbles with their data values; the principle is the same as before:
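A sketch of the labelling step, assuming the labels are placed with Axes.text at the same data coordinates as the bubbles. The data, the marker size, and the text alignment are choices made here for illustration.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt
import pandas as pd

# Invented example data
df = pd.DataFrame({"size": [21, 23, 26]}, index=["2001", "2002", "2003"])
bubble_y = -25                   # same fixed y value used for the bubbles

fig, ax = plt.subplots()
ax.scatter(list(df.index), [bubble_y] * len(df), s=900, clip_on=False)

# String categories sit at positions 0, 1, 2, ..., so the row number is the x value.
for position, value in enumerate(df["size"]):
    ax.text(position, bubble_y, f"{value}", ha="center", va="center")
```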
Finally, adjust the details of the axes as required:
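The exact adjustments are not reproduced here. The sketch below shows the kind of axis clean-up that is typical for this chart style; every specific choice in it (y-axis on the right, hidden spines, zero-length ticks, horizontal grid lines) is an assumption rather than the author's settings.

```python
import matplotlib
matplotlib.use("Agg")            # headless backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.bar(["2001", "2002", "2003"], [10, 12, 15])   # invented data

ax.yaxis.tick_right()                            # move y ticks and labels to the right
for side in ("top", "left", "right"):
    ax.spines[side].set_visible(False)           # keep only the bottom spine
ax.tick_params(axis="both", length=0)            # hide tick marks, keep the labels
ax.grid(axis="y", linewidth=0.5)                 # light horizontal guide lines
ax.set_axisbelow(True)                           # draw the grid behind the bars
```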
The full call is as follows:
The effect is as follows:
You will notice that throughout the process we have been defining the relationship between data and graphics, which is the core idea of matplotlib!
It may seem like a lot of code, but it is easy to reuse across different data.
Next I will keep writing about charts with more unconventional requirements, so please stay tuned!
That concludes this walkthrough of the bubble and stacked-bar association chart in Python. Theory works best when paired with practice, so go and try it!
© 2024 shulou.com SLNews company. All rights reserved.