What are the common functions of Pandas

Updated 2025-01-28. From: SLTechnology News&Howtos (shulou), Internet Technology


This article introduces some commonly used pandas functions in detail, with a worked example for each, and is intended as a practical reference.

First, we randomly generate a data table with 5 rows and 3 columns, save it to a CSV file, and read it back.
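Because the table is random, the numbers change on every run. A minimal sketch, assuming you want the same five rows that appear in this article's printed output: hard-code them and round-trip through CSV. The in-memory `io.StringIO` buffer is a substitution of mine for the `sample.csv` file, used only to keep the example self-contained.

```python
import io
import pandas as pd

# The five rows printed in the article, hard-coded so results are repeatable.
rows = [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
sample = pd.DataFrame(rows)

# Write CSV text without a header row or an index column ...
buffer = io.StringIO()
sample.to_csv(buffer, header=False, index=False)

# ... and read it back; header=None tells pandas the first line is data.
buffer.seek(0)
restored = pd.read_csv(buffer, header=None)

print(restored.head())
```

With `header=None`, pandas labels the columns 0, 1, 2, which is why the later examples refer to columns by integer.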

import pandas as pd
import numpy as np

# 15 random integers in [0, 100), reshaped to 5 rows x 3 columns
sample = np.random.randint(0, 100, size=15)
sample_reshape = sample.reshape((5, 3))
sample_pd = pd.DataFrame(sample_reshape)
# write without a header row or an index column
sample_pd.to_csv("sample.csv", header=False, index=False)

import pandas as pd
import numpy as np

sample = pd.read_csv("sample.csv", header=None)
print(sample.head())
"""
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
"""

# sort

First, let's look at how to sort a DataFrame. pandas provides two ways to sort: one sorts by index labels, and the other sorts by the values in a column or row. This works like sorting in Excel: the ordering is applied to the whole table, so entire rows (or columns) move together rather than a single row or column being rearranged on its own. To sort one row or column separately, select it first and then sort the result.
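The difference between the two kinds of sort, and the "select first, then sort" case, can be sketched as follows. The five rows are hard-coded from this article's printed table so the example runs on its own.

```python
import pandas as pd

sample = pd.DataFrame(
    [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
)

# Sort by the values in column 1: entire rows move together.
by_column = sample.sort_values(by=1)

# Sort by index labels: this puts the rows back in their original order.
by_index = by_column.sort_index()

# Sort one column on its own: select it first, then sort the Series.
column_only = sample[1].sort_values()

print(by_column.index.tolist())  # [3, 2, 1, 0, 4]
print(by_index.index.tolist())   # [0, 1, 2, 3, 4]
print(column_only.tolist())      # [4, 21, 24, 40, 74]
```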

## sort_index

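The by parameter specifies the label to sort on. axis defaults to 0, which sorts the rows by the values in that column; after sorting by column 1, its values read 4, 21, 24, 40, 74. Setting axis=1 sorts the columns by the values in the row with that label instead; row 1 is 5, 24, 56, which is already in ascending order, so the table is unchanged.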

import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
# Note: the by argument of sort_index is deprecated and was removed in
# later pandas releases; on those versions use sort_values(by=1) instead.
sort_index_1 = sample.sort_index(by=1)
print(sort_index_1)
"""
    0   1   2
3  58   4  25
2  59  21  44
1   5  24  56
0   6  40  24
4  83  74  58
"""

sort_index_axis_1 = sample.sort_index(by=1, axis=1)
print(sort_index_axis_1)
"""
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
"""

The ascending parameter controls the sort direction. Setting ascending=False sorts in descending order, from largest to smallest.
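A small sketch of descending order on the article's five rows (hard-coded here). It assumes a recent pandas release, so it sorts by value with `sort_values` rather than `sort_index(by=...)`, and uses `sort_index` only for sorting by index label.

```python
import pandas as pd

sample = pd.DataFrame(
    [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
)

# Descending by the values in column 1.
desc_by_value = sample.sort_values(by=1, ascending=False)

# Descending by index label.
desc_by_index = sample.sort_index(ascending=False)

print(desc_by_value[1].tolist())     # [74, 40, 24, 21, 4]
print(desc_by_index.index.tolist())  # [4, 3, 2, 1, 0]
```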

sort_index_ascend = sample.sort_index(by=1, ascending=False)
print(sort_index_ascend)
"""
    0   1   2
4  83  74  58
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
"""

## sort_values

From the results below, sort_values behaves almost exactly like sort_index with by. However, sorting by values through sort_index is deprecated and will be dropped, so it is enough to learn sort_values.
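With `axis=1`, `by` names a row label and it is the columns that get reordered. Sorting by row 1 leaves this particular table unchanged, which can hide what is happening, so the sketch below also sorts by row 0 (my own addition for illustration), where the effect is visible. The rows are hard-coded from the article's printed table.

```python
import pandas as pd

sample = pd.DataFrame(
    [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
)

# Row 1 is (5, 24, 56), already ascending, so the column order is unchanged.
by_row_1 = sample.sort_values(by=1, axis=1)

# Row 0 is (6, 40, 24), so columns 1 and 2 swap places.
by_row_0 = sample.sort_values(by=0, axis=1)

print(by_row_1.columns.tolist())  # [0, 1, 2]
print(by_row_0.columns.tolist())  # [0, 2, 1]
print(by_row_0)
```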

import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
sample_sort_value = sample.sort_values(by=1)
print(sample_sort_value)
print("-*-" * 5)
sample_sort_axis = sample.sort_values(by=1, axis=1)
print(sample_sort_axis)
print("-*-" * 5)
sort_value_ascend = sample.sort_values(by=1, ascending=False)
print(sort_value_ascend)
"""
    0   1   2
3  58   4  25
2  59  21  44
1   5  24  56
0   6  40  24
4  83  74  58
-*--*--*--*--*-
    0   1   2
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
4  83  74  58
-*--*--*--*--*-
    0   1   2
4  83  74  58
0   6  40  24
1   5  24  56
2  59  21  44
3  58   4  25
"""

Now for a slightly more advanced use: sorting by the maximum value of each row or column. First we add a new column holding the maximum of each row, and then we sort in descending order by that column.

import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
# axis=1 passes each row to the lambda, so row_max is the per-row maximum
sample['row_max'] = sample.apply(lambda x: x.max(), axis=1)
new = sample.sort_values(by='row_max', ascending=False)
print(new)
"""
    0   1   2  row_max
4  83  74  58       83
2  59  21  44       59
3  58   4  25       58
1   5  24  56       56
0   6  40  24       40
"""

Once you can sort by the maximum, sorting by other statistics, such as the mean or the minimum, works the same way.
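For instance, a sketch of sorting by row mean and row minimum, on the article's five rows (hard-coded). The column names `row_mean` and `row_min` are my own, chosen by analogy with `row_max`.

```python
import pandas as pd

sample = pd.DataFrame(
    [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
)

# Compute the statistics from the original three columns only,
# so that the helper columns never feed into each other.
values = sample[[0, 1, 2]]
sample["row_mean"] = values.mean(axis=1)
sample["row_min"] = values.min(axis=1)

by_mean = sample.sort_values(by="row_mean", ascending=False)
by_min = sample.sort_values(by="row_min")

print(by_mean.index.tolist())  # [4, 2, 3, 1, 0]
print(by_min.index.tolist())   # [3, 1, 0, 2, 4]
```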

# apply, applymap, map

Of these three functions, the first two operate on a DataFrame, while map operates on a Series. Start with the signatures from the documentation, which already make the usage fairly clear.

DataFrame.apply(func, axis=0, broadcast=False, raw=False, reduce=None, args=(), **kwds)

DataFrame.applymap(func)

Series.map(arg, na_action=None)
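As a rough illustration of the `axis` and `args` parameters in the `apply` signature, here is a minimal sketch. The `spread` helper and its `scale` argument are hypothetical names invented for this example.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

def spread(values, scale):
    """Range (max - min) of one row or column, times a scale factor."""
    return (values.max() - values.min()) * scale

# axis=0 (default): func receives each column as a Series.
per_column = df.apply(spread, args=(1,))

# axis=1: func receives each row as a Series; args supplies extra positionals.
per_row = df.apply(spread, axis=1, args=(2,))

print(per_column.tolist())  # [2, 20]
print(per_row.tolist())     # [18, 36, 54]
```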

apply applies the function func along an axis of the DataFrame: with axis=0 (the default) func receives each column, and with axis=1 it receives each row. The other parameters are used less often and are not covered here; look them up when you need them. applymap applies func directly to every element. Series.map maps each value of a Series to a new value, using a function, a dict, or another Series as the lookup. Let's look at some examples.
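The element-wise functions can be sketched on a tiny frame. One assumption to flag: recent pandas releases (2.1 and later, to my knowledge) deprecate `DataFrame.applymap` in favor of `DataFrame.map`, so the sketch picks whichever is available.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# DataFrame element-wise: applymap on older pandas, DataFrame.map on newer.
elementwise = df.map if hasattr(df, "map") else df.applymap
doubled = elementwise(lambda v: v * 2)

# Series.map with a function: called once per value.
parity = df["a"].map(lambda v: "odd" if v % 2 else "even")

# Series.map with a dict: values are looked up as keys.
# 3 has no entry, so the result holds NaN in that position.
names = df["a"].map({1: "foo", 2: "bar"})

print(doubled)
print(parity.tolist())  # ['odd', 'even', 'odd']
print(names)
```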

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randn(3, 3))
print(df)
print()
# format every element as a string with two decimal places
df = df.applymap(lambda x: '%.2f' % x)
print(df)
"""
          0         1         2
0  0.776506 -0.605382  1.843036
1  0.522743  1.267487  1.288286
2  0.495450  0.583332 -0.590918

      0      1      2
0  0.78  -0.61   1.84
1  0.52   1.27   1.29
2  0.50   0.58  -0.59
"""

import pandas as pd

x = pd.Series([1, 2, 3], index=['one', 'two', 'three'])
print(x)
y = pd.Series(['foo', 'bar', 'baz'], index=[1, 2, 3])
print()
print(y)
print()
# each value of x is looked up in the index of y
print(x.map(y))
"""
one      1
two      2
three    3
dtype: int64

1    foo
2    bar
3    baz
dtype: object

one      foo
two      bar
three    baz
dtype: object
"""

# grouping

DataFrame.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Continuing with the previous data, we add a new column named key1. Grouping means splitting the DataFrame into groups according to some label; here key1 is the grouping key, so the rows are split into two groups. The point of grouping is that we can then compute statistics within each group. Grouped statistics are useful, for example, when analyzing differences by gender or by age.

Note that grouped is a SeriesGroupBy object. To get specific statistics, call the SeriesGroupBy methods on it.
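A hedged sketch of computing several group statistics at once with `agg`, which this article does not otherwise cover. The five rows and the key1 labels (a, b, b, a, b) are hard-coded from the article's printed table.

```python
import pandas as pd

sample = pd.DataFrame(
    [[6, 40, 24], [5, 24, 56], [59, 21, 44], [58, 4, 25], [83, 74, 58]]
)
sample["key1"] = ["a", "b", "b", "a", "b"]

# Several statistics for column 1 in one call; the result is a DataFrame
# with one row per group and one column per statistic.
summary = sample.groupby("key1")[1].agg(["mean", "max", "count"])

# Grouping the whole frame: every remaining column gets its own group mean.
all_means = sample.groupby("key1").mean()

print(summary)
print(all_means)
```

Selecting the column after `groupby("key1")` gives the same kind of SeriesGroupBy object as `sample[1].groupby(sample['key1'])`.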

import pandas as pd

sample = pd.read_csv("sample.csv", header=None)
sample['key1'] = ['a', 'b', 'b', 'a', 'b']
print(sample)
print()
# group the values of column 1 by the labels in key1
grouped = sample[1].groupby(sample['key1'])
print(grouped)
print()
print(grouped.mean())
print()
print(grouped.max())
"""
    0   1   2 key1
0   6  40  24    a
1   5  24  56    b
2  59  21  44    b
3  58   4  25    a
4  83  74  58    b

key1
a    22.000000
b    39.666667
Name: 1, dtype: float64

key1
a    40
b    74
Name: 1, dtype: int64
"""

That is all for this overview of commonly used pandas functions. Thank you for reading, and I hope it helps.
