How to use apply with Pandas groupby for per-group transformations

Updated 2025-04-07. From: SLTechnology News&Howtos (shulou)

Shulou(Shulou.com)06/02 Report--

This article explains how to use apply with Pandas groupby to transform data group by group. Many people run into difficulties with this in real-world cases, so the examples below walk through how to handle them.

Background: Pandas GroupBy follows the split-apply-combine pattern

Here, split is the grouping done by pandas groupby, apply is a function that we write ourselves, and combine is pandas merging the results returned for each group into the final result.
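
As a rough sketch of what this pattern means, the snippet below performs the three steps by hand on a small made-up DataFrame and compares the outcome with letting groupby do the splitting and combining. The data, the column names key and value, and the helper demean are hypothetical and serve only as illustration.

```python
import pandas as pd

# Hypothetical toy data, not taken from the article's datasets
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "value": [1, 3, 10, 30],
})

def demean(values):
    # The "apply" step: subtract the group's mean from each value
    return values - values.mean()

# split: iterate over (key, sub-DataFrame) pairs, then apply by hand
pieces = [demean(group["value"]) for _, group in df.groupby("key")]

# combine: stitch the per-group results back together
manual = pd.concat(pieces).sort_index()

# The same computation with pandas doing the split and combine steps;
# group_keys=False keeps the original row index
auto = df.groupby("key", group_keys=False)["value"].apply(demean)

print(manual.tolist())
print(auto.tolist())
```

Both approaches should produce the same values; the groupby version simply hides the loop and the concatenation.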

GroupBy.apply(function)

The first parameter of function is a DataFrame holding the rows of one group.

function may return a DataFrame, a Series, or a single value, and the result can even be completely unrelated to the input DataFrame.
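
To make the three kinds of return value concrete, here is a minimal sketch on invented data. The DataFrame, its team and score columns, and the variable names are assumptions for illustration. The column selection [["score"]] is used so that each group is passed to the function as a DataFrame containing only the score column.

```python
import pandas as pd

# Hypothetical toy data
df = pd.DataFrame({
    "team": ["x", "x", "y", "y", "y"],
    "score": [1, 5, 2, 4, 9],
})
groups = df.groupby("team")[["score"]]

# Function returns a single value: one entry per group
spread = groups.apply(lambda d: d["score"].max() - d["score"].min())

# Function returns a Series: one row per group, one column per Series label
low_high = groups.apply(
    lambda d: pd.Series({"low": d["score"].min(), "high": d["score"].max()})
)

# Function returns a DataFrame: the pieces are stacked under the group key
top = groups.apply(lambda d: d.nlargest(1, "score"))

print(spread)
print(low_high)
print(top)
```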

This article works through two examples:

How to normalize a column of values within each group

How to get the top-N rows of each group

Example 1: How to normalize a column of values by grouping?

Normalization rescales numeric columns with different ranges onto the interval [0, 1]. This has two benefits:

Values become easier to compare across columns; for example, a price field may range from hundreds to thousands while a percentage-increase field ranges from 0 to 100.

Machine learning models often train faster and perform better on normalized inputs.

Normalization formula: x_norm = (x - min) / (max - min), where min and max are taken within each group.

Demo: normalizing users' movie ratings

Rating habits differ between users: optimists tend to score high and pessimists tend to score low. Normalizing per user makes the ratings comparable.

import pandas as pd

ratings = pd.read_csv(
    "./datas/movielens-1m/ratings.dat",
    sep="::",
    engine='python',
    names="UserID::MovieID::Rating::Timestamp".split("::")
)
ratings.head()

# Group by user ID, then normalize the Rating column within each group
def ratings_norm(df):
    """
    @param df: the DataFrame holding one user's ratings
    """
    min_value = df["Rating"].min()
    max_value = df["Rating"].max()
    # Note: a user whose ratings are all identical yields 0/0, i.e. NaN
    return df.assign(
        Rating_norm=(df["Rating"] - min_value) / (max_value - min_value))

# group_keys=False keeps the original row index instead of adding UserID as an index level
ratings = ratings.groupby("UserID", group_keys=False).apply(ratings_norm)
ratings[ratings["UserID"]==1].head()

For UserID == 1, a rating of 3 is this user's lowest score, which marks the user as an optimist, so it is normalized to 0.
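
Because the normalized column has the same length as the input, the same idea can also be expressed with transform on a single column. The following is a minimal sketch on invented ratings rather than the MovieLens file. The toy data, the helper min_max, and the choice to map a constant group to 0.0 instead of NaN are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical ratings; user 2 gave the same rating twice
toy = pd.DataFrame({
    "UserID": [1, 1, 1, 2, 2],
    "Rating": [3, 4, 5, 2, 2],
})

def min_max(s):
    span = s.max() - s.min()
    if span == 0:
        # Assumption: a user with identical ratings gets 0.0 rather than NaN
        return pd.Series(0.0, index=s.index)
    return (s - s.min()) / span

# transform returns one value per input row, aligned with the original index
toy["Rating_norm"] = toy.groupby("UserID")["Rating"].transform(min_max)
print(toy)
```

For user 1, the lowest rating (3) maps to 0 and the highest (5) maps to 1, matching the formula above.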

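Example 2: How do I get the TOPN data for each group?

Goal: for each month of 2018, find the days with the highest temperature (the bWendu column). The task asks for the top 2 days per month; the call below passes topn=1, which returns only the single hottest day of each month.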

fpath = "./datas/beijing_tianqi/beijing_tianqi_2018.csv"
df = pd.read_csv(fpath)

# Strip the ℃ suffix from the temperature columns and convert them to integers
df["bWendu"] = df["bWendu"].str.replace("℃", "", regex=False).astype('int32')
df["yWendu"] = df["yWendu"].str.replace("℃", "", regex=False).astype('int32')

# Add a month column (the first 7 characters of ymd)
df['month'] = df['ymd'].str[:7]
df.head()

def getWenduTopN(df, topn):
    """
    df here is the DataFrame for one month's group
    """
    # Sort ascending by bWendu and keep the last topn rows, i.e. the hottest days
    return df.sort_values(by="bWendu")[["ymd", "bWendu"]][-topn:]

df.groupby("month").apply(getWenduTopN, topn=1).head()

As this shows, the DataFrame returned by the function passed to apply can have a completely different shape and set of columns from the original DataFrame.
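
The top-N idea can be tried without the weather file. The sketch below uses a handful of invented rows whose column names mirror ymd and bWendu; the data and the helper top_n are hypothetical. It uses DataFrame.nlargest as an alternative to sorting and slicing, and requests two rows per month.

```python
import pandas as pd

# Hypothetical temperatures; column names mirror the article's ymd / bWendu
weather = pd.DataFrame({
    "ymd": ["2018-01-01", "2018-01-02", "2018-01-03",
            "2018-02-01", "2018-02-02"],
    "bWendu": [3, 6, 2, 8, 5],
})
weather["month"] = weather["ymd"].str[:7]

def top_n(group, n):
    # Keep the n rows with the highest bWendu, highest first
    return group.nlargest(n, "bWendu")

# Extra keyword arguments to apply (here n=2) are forwarded to top_n
top2 = weather.groupby("month")[["ymd", "bWendu"]].apply(top_n, n=2)
print(top2)
```

The result has two rows per month and is indexed by month plus the original row label, so its shape differs from the input.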
