2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article walks through examples of grouping time series data in pandas. Many readers may not be familiar with these operations, so the examples are shared here for reference.
1 Brief introduction
When analyzing and processing time series data with pandas, we often need to regroup and aggregate data from its original time granularity to a different one, for example calculating the lowest and highest closing price of each month from the closing price of each trading day.
In pandas, resample(), groupby() and Grouper() handle such tasks efficiently across different application scenarios.
Figure 1
2 Time grouping and aggregation in pandas
Depending on the task, grouping and aggregation of time series in pandas can be done in the following two ways:
2.1 Grouping and aggregating time series data using resample()
resample literally means "resampling", which covers both "upsampling" and "downsampling". Downsampling is the more common case: computing lower-frequency data from higher-frequency data according to some rule, such as the monthly summary of daily data mentioned at the beginning.
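To make the two directions concrete, here is a minimal sketch on a small made-up daily series (the values and dates are assumptions for illustration, not the AAPL data used in this article). It downsamples to 2-day buckets and upsamples to 12-hour steps:

```python
import pandas as pd

# Hypothetical data: one value per day for the first four days of 2018
daily = pd.Series(
    [1.0, 2.0, 3.0, 4.0],
    index=pd.date_range("2018-01-01", periods=4, freq="D"),
)

# Downsampling: collapse daily values into 2-day buckets and average them
down = daily.resample("2D").mean()

# Upsampling: expand to 12-hour steps; the newly created time points have
# no data of their own, so here they are filled with the last known value
up = daily.resample("12h").ffill()

print(down)
print(up)
```

Downsampling needs an aggregation (mean, max and so on), while upsampling needs a fill strategy, because the new time points have no original records.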
If you are familiar with the groupby() operation in pandas, resample() is easy to pick up: it essentially "groups" time series data. Its most basic parameter is rule, which sets how to resample, as in the following example:
import pandas as pd

# share price of Apple on every trading day from 2013-02-08 to 2018-02-07
AAPL = pd.read_csv('AAPL.csv', parse_dates=['date'])

# calculate the monthly maximum and minimum closing price
(
    AAPL
    .set_index('date')  # set date as the index
    .resample('M')  # monthly window
    .agg({
        'close': ['max', 'min']
    })
)
Figure 2
As the example shows, resample() is applied to a DataFrame whose index is of datetime type. The 'M' passed in is the rule parameter, the first positional parameter of resample(), which determines the time window. Here the string 'M' means "group by month, and label each group in the result with the last day of that month". Note that recent pandas releases (2.2 and later) deprecate some of these aliases, for example 'M' in favor of 'ME'. The commonly used fixed time window rules are shown in the following table:
Rule         Description
W            week
M            month, shown as the last day of the month
MS           month, shown as the first day of the month
Q            quarter, shown as the last day of the quarter
QS           quarter, shown as the first day of the quarter
A            year, shown as the last day of the year
AS           year, shown as the first day of the year
H            hour
T or min     minute
S            second
L or ms      millisecond
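As a rough illustration of how the rule controls both the window and its label, the sketch below applies three of these rules to a made-up daily series (the data is an assumption for illustration). It sticks to 'MS', 'QS' and 'W', whose spelling has stayed the same across pandas versions:

```python
import pandas as pd

# Hypothetical data: values 0..89 on 90 consecutive days from 2018-01-01
s = pd.Series(
    range(90),
    index=pd.date_range("2018-01-01", periods=90, freq="D"),
)

monthly = s.resample("MS").max()    # labelled with the first day of each month
quarterly = s.resample("QS").max()  # labelled with the first day of each quarter
weekly = s.resample("W").max()      # labelled with the Sunday that ends each week

print(monthly)
print(quarterly)
print(weekly.head())
```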
These rules can also be prefixed with a number to use a multiple of the unit:
# calculate the average closing price over 6-month windows,
# each labeled with the first day of its first month
(
    AAPL
    .set_index('date')  # set date as the index
    .resample('6MS')  # 6-month window
    .agg({
        'close': 'mean'
    })
)
Figure 3
Another convenient feature of resample() is that it automatically aligns the result to regular time units. For example, our data only has records on trading days; if some time units at the chosen frequency have no records, resample() still keeps those time points in the result, with missing values:
(
    AAPL
    .set_index('date')  # set date as the index
    .resample('1D')  # 1-day window
    .agg({
        'close': 'mean'
    })
)
Figure 4
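The same behavior can be reproduced without the CSV file. The sketch below uses three made-up closing prices around a weekend (the numbers are assumptions for illustration) and shows one possible way to fill the gaps afterwards:

```python
import pandas as pd

# Hypothetical closing prices: a Friday, then the following Monday and Tuesday
prices = pd.DataFrame(
    {
        "date": pd.to_datetime(["2018-02-02", "2018-02-05", "2018-02-06"]),
        "close": [160.5, 156.5, 163.0],
    }
)

# Daily resampling keeps the weekend as rows, with NaN in 'close'
daily = prices.set_index("date").resample("1D").agg({"close": "mean"})

# One possible follow-up: carry the last known close forward over the gaps
filled = daily.ffill()

print(daily)
print(filled)
```

Whether forward filling is appropriate depends on the analysis; leaving the NaN values in place is often the safer default.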
The closed parameter sets which side of each time interval is closed. For example, with closed='right', the first record is treated as the right boundary of the window it falls into, which shifts the division of all subsequent time units:
(
    AAPL
    .set_index('date')  # set date as the index
    .resample('2D', closed='right')  # 2-day window, closed on the right
    .agg({
        'close': 'mean'
    })
)
Figure 5
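The effect of closed is easier to see on a tiny made-up series (an assumption for illustration). The sketch compares the default left-closed windows with right-closed ones for the same 2-day rule:

```python
import pandas as pd

# Hypothetical data: values 1..4 on four consecutive days
s = pd.Series(
    [1, 2, 3, 4],
    index=pd.date_range("2018-01-01", periods=4, freq="D"),
)

# Default for '2D': windows are [start, end), so Jan 1-2 and Jan 3-4 pair up
left = s.resample("2D").sum()

# closed='right': windows are (start, end], so Jan 1 becomes the right edge
# of a window that starts on Dec 30, and every later window shifts with it
right = s.resample("2D", closed="right").sum()

print(left)
print(right)
```

With right-closed windows the result gains an extra leading window, labelled by its left edge, that contains only the first record.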
Even if the index of your DataFrame is not of datetime type, you can pass the name of a datetime column through the on parameter to achieve the same effect.
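A minimal sketch of the on parameter, using a made-up frame whose dates sit in an ordinary column (the data is an assumption for illustration), compared with the set_index() approach used so far:

```python
import pandas as pd

# Hypothetical frame: dates are a regular column, the index is 0, 1, 2
df = pd.DataFrame(
    {
        "date": pd.to_datetime(["2018-01-02", "2018-01-03", "2018-02-01"]),
        "close": [10.0, 12.0, 20.0],
    }
)

# `on` tells resample() which datetime column to use for the windows
by_column = df.resample("MS", on="date").agg({"close": "mean"})

# Same aggregation after moving the column into the index
by_index = df.set_index("date").resample("MS").agg({"close": "mean"})

print(by_column)
```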
2.2 Mixed grouping using groupby() + Grouper()
In some cases we need to group not by a time column alone but by several columns together, one of which is a time column. For this we can use Grouper().
Its freq parameter plays the same role as rule in resample(), and its key parameter names the time column to use. Grouper() builds a grouping rule that is then passed into groupby():
# average monthly closing price of Apple and Microsoft, computed separately
(
    pd
    .read_csv('AAPL&MSFT.csv', parse_dates=['date'])
    .groupby(['Name', pd.Grouper(freq='MS', key='date')])
    .agg({
        'close': 'mean'
    })
)
Figure 6
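For readers without the CSV file, the same pattern can be tried on a small made-up frame. The column names follow the example above, while the prices are assumptions for illustration:

```python
import pandas as pd

# Hypothetical long-format data: two tickers, a few trading days each
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2018-01-02", "2018-01-02", "2018-01-03", "2018-02-01", "2018-02-01"]
        ),
        "Name": ["AAPL", "MSFT", "AAPL", "AAPL", "MSFT"],
        "close": [170.0, 86.0, 172.0, 168.0, 94.0],
    }
)

# Group by ticker and by calendar month of the 'date' column
monthly_mean = (
    df
    .groupby(["Name", pd.Grouper(freq="MS", key="date")])
    .agg({"close": "mean"})
)

print(monthly_mean)
```

The result is indexed by (Name, date), with one row per ticker and month.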
This mixed grouping mode also works with apply, transform and other operations, which are not covered in detail here.
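As one possible illustration of combining this grouping with transform, the sketch below attaches each row's per-ticker monthly mean back onto the original rows. The data and the new column names are assumptions for illustration:

```python
import pandas as pd

# Hypothetical long-format data, sorted by date
df = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2018-01-02", "2018-01-02", "2018-01-03", "2018-02-01", "2018-02-01"]
        ),
        "Name": ["AAPL", "MSFT", "AAPL", "AAPL", "MSFT"],
        "close": [170.0, 86.0, 172.0, 168.0, 94.0],
    }
)

# transform() returns one value per original row instead of one per group
df["monthly_mean"] = (
    df
    .groupby(["Name", pd.Grouper(freq="MS", key="date")])["close"]
    .transform("mean")
)

# Deviation of each close from its ticker's mean for that month
df["deviation"] = df["close"] - df["monthly_mean"]

print(df)
```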