Updated 2025-01-28. From: SLTechnology News&Howtos (Shulou)
This article walks through some commonly used Pandas functions in detail: sorting, apply/applymap/map, and grouping.
First, we randomly generate a table with 5 rows and 3 columns, save it to a CSV file, and read it back.
import numpy as np
import pandas as pd

# Generate 15 random integers in [0, 100) and reshape them into 5 rows x 3 columns.
sample = np.random.randint(0, 100, size=15)
sample_reshape = sample.reshape((5, 3))
sample_pd = pd.DataFrame(sample_reshape)
# Write the table without a header row or an index column.
sample_pd.to_csv("sample.csv", header=False, index=False)

# Read the file back; header=None tells pandas the file has no header row.
sample = pd.read_csv("sample.csv", header=None)
print(sample.head())
'''
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
'''
# Sorting
Let's start with sorting a DataFrame. pandas offers two ways to sort: by index labels, or by the values in a given column or row. As in Excel, sorting by a column reorders whole rows of the table rather than just that one column. If you want to sort a single row or column on its own, select it first and then sort it.
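To illustrate that last point, here is a small sketch that sorts one column on its own and leaves the rest of the table untouched. It hard-codes the sample values printed earlier (an assumption, so that the example is self-contained), and column_sorted is an illustrative name:

```python
import pandas as pd

df = pd.DataFrame([[6, 40, 24],
                   [5, 24, 56],
                   [59, 21, 44],
                   [58, 4, 25],
                   [83, 74, 58]])

# Select column 1 as a Series, then sort only that Series.
column_sorted = df[1].sort_values()
print(column_sorted.tolist())

# The DataFrame itself keeps its original row order.
print(df.index.tolist())
```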
## sort_index
The by parameter specifies the column to sort on. axis defaults to 0, which reorders the rows by the values in that column; after sorting by column 1, its values read 4, 21, 24, 40, 74. With axis=1, the columns are reordered by the values in the row labeled 1, which gives 5, 24, 56.
import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
# Note: the by argument of sort_index was deprecated in pandas 0.17,
# so these calls need an old pandas release.
sort_index_1 = sample.sort_index(by=1)
print(sort_index_1)
'''
    0   1   2
3  58   4  25
2  59  21  44
1   5  24  56
0   6  40  24
4  83  74  58
'''
sort_index_axis_1 = sample.sort_index(by=1, axis=1)
print(sort_index_axis_1)
'''
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
'''
Setting ascending=False sorts in descending order, from largest to smallest.
sort_index_ascend = sample.sort_index(by=1, ascending=False)
print(sort_index_ascend)
'''
    0   1   2
4  83  74  58
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
'''
## sort_values
The results show that sort_values behaves almost exactly like sort_index with by. The by argument of sort_index has since been deprecated in favor of sort_values (sort_index itself remains, for sorting by index labels), so sort_values is the one to learn.
import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
# Reorder the rows by the values in column 1.
sample_sort_value = sample.sort_values(by=1)
print(sample_sort_value)
print("-*-" * 5)
# Reorder the columns by the values in the row labeled 1.
sample_sort_axis = sample.sort_values(by=1, axis=1)
print(sample_sort_axis)
print("-*-" * 5)
# Reorder the rows by column 1, largest first.
sort_value_ascend = sample.sort_values(by=1, ascending=False)
print(sort_value_ascend)
'''
    0   1   2
3  58   4  25
2  59  21  44
1   5  24  56
0   6  40  24
4  83  74  58
-*--*--*--*--*-
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
-*--*--*--*--*-
    0   1   2
4  83  74  58
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
'''
Now for something slightly more advanced: sorting by the maximum value of each row or column. First we add a new column holding the maximum of each row, then we sort in descending order by that column.
import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
# axis=1 passes each row to the function, so row_max holds the per-row maximum.
sample['row_max'] = sample.apply(lambda x: x.max(), axis=1)
new = sample.sort_values(by='row_max', ascending=False)
print(new)
'''
    0   1   2  row_max
4  83  74  58       83
2  59  21  44       59
3  58   4  25       58
1   5  24  56       56
0   6  40  24       40
'''
Once you know how to sort by the maximum, sorting by other statistics such as the mean or the minimum works the same way.
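As a sketch of that idea, the following sorts by the row mean and by the row minimum. The data is the sample table hard-coded for reproducibility (an assumption), and row_mean, row_min, by_mean and by_min are illustrative names:

```python
import pandas as pd

df = pd.DataFrame([[6, 40, 24],
                   [5, 24, 56],
                   [59, 21, 44],
                   [58, 4, 25],
                   [83, 74, 58]])

# Mean of each row over the three data columns, then sort largest first.
df['row_mean'] = df[[0, 1, 2]].mean(axis=1)
by_mean = df.sort_values(by='row_mean', ascending=False)

# Minimum of each row over the same columns, then sort smallest first.
df['row_min'] = df[[0, 1, 2]].min(axis=1)
by_min = df.sort_values(by='row_min')

print(by_mean)
print(by_min)
```

Selecting df[[0, 1, 2]] before computing the statistic keeps the helper columns themselves out of the calculation.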
# apply, applymap, map
Of these three functions, the first two operate on a DataFrame, while map operates on a Series. Their signatures in the documentation already make the basic usage fairly clear.
DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)
DataFrame.applymap(func)
Series.map(arg, na_action=None)
apply applies a function func along an axis of the DataFrame, so func receives each column (axis=0) or each row (axis=1). The other parameters are rarely needed, so we skip them here; look them up when you need them. applymap applies func directly to every element. map replaces each value of a Series with the value it maps to. Let's look at an example.
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 3))
print(df)
# Format every element as a string with two decimal places.
# On pandas 2.1 and later, DataFrame.map is the newer name for applymap.
df = df.applymap(lambda x: '%.2f' % x)
print(df)
'''
          0         1         2
0  0.776506 -0.605382  1.843036
1  0.522743  1.267487  1.288286
2  0.495450  0.583332 -0.590918
      0      1      2
0  0.78  -0.61   1.84
1  0.52   1.27   1.29
2  0.50   0.58  -0.59
'''
import pandas as pd

x = pd.Series([1, 2, 3], index=['one', 'two', 'three'])
print(x)
y = pd.Series(['foo', 'bar', 'baz'], index=[1, 2, 3])
print(y)
# Each value of x is looked up in the index of y and replaced by the matching value.
print(x.map(y))
'''
one      1
two      2
three    3
dtype: int64
1    foo
2    bar
3    baz
dtype: object
one      foo
two      bar
three    baz
dtype: object
'''
# Grouping
DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)
Continuing with the same data, we add a new column named key1. Grouping splits the DataFrame into groups according to some label; here key1 is the grouping key, so the data is split into two groups. Once the data is grouped, we can compute statistics within each group. Grouped statistics are useful for comparing, for example, different genders or age groups.
Note that grouped is a SeriesGroupBy object. To get actual statistics, call SeriesGroupBy methods such as mean() or max() on it.
import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
sample['key1'] = ['a', 'b', 'b', 'a', 'b']
print(sample)
# Group the values of column 1 by the labels in key1.
grouped = sample[1].groupby(sample['key1'])
print(grouped)  # a SeriesGroupBy object; its repr is not included in the output below
print(grouped.mean())
print(grouped.max())
'''
    0   1   2 key1
0   6  40  24    a
1   5  24  56    b
2  59  21  44    b
3  58   4  25    a
4  83  74  58    b
key1
a    22.000000
b    39.666667
Name: 1, dtype: float64
key1
a    40
b    74
Name: 1, dtype: int64
'''