How to calculate percent change of a pandas dataframe across groups with a custom period?
Question:
I am trying to compute the percent change in a column of a pandas DataFrame. It works when I don't specify periods, but I would like to specify the number of periods to consider. The problem is that it has to work across groups: I can get it to work without groups, but with groups it fails:
_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].apply(pd.Series.pct_change(periods=7)).abs()
It yells at me with:
TypeError: pct_change() missing 1 required positional argument: 'self'
I don't understand how that can happen inside of apply. Google and the pandas docs aren't helping here.
Further, all the Stack Overflow answers I can find only cover calculating the percent change in the first place. I cannot find an example that combines grouping with a period other than 1.
Answers:
Q. I don't understand how that's happening inside of apply?
You're calling an instance method (pct_change) on the class Series without creating an instance first, hence the error. The expression pd.Series.pct_change(periods=7) is evaluated before apply ever runs, so no Series is passed in as self. Here is a simple example to illustrate:
class Foo:
    def greet(self, name):
        print('Hi', name)

Foo.greet(name='shubham') # Wrong: this will raise TypeError
Foo().greet(name='shubham') # Correct
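The same thing can be reproduced with pandas itself. This is only a minimal sketch on a made-up three-element Series (the values are an assumption for illustration): calling the method through the class with no instance fails immediately, while supplying a Series as self behaves exactly like the normal method call.

```python
import pandas as pd

# Made-up data, purely for illustration.
s = pd.Series([10.0, 20.0, 40.0])

# Calling through the class without an instance fails right away,
# before any surrounding apply() would get a chance to run.
try:
    pd.Series.pct_change(periods=1)
    raised = False
except TypeError:
    raised = True

# The same function works once a Series is supplied as `self`.
via_class = pd.Series.pct_change(s, periods=1)
via_instance = s.pct_change(periods=1)
```

Here via_class and via_instance hold the same values, which is why wrapping the call in a lambda that receives each group fixes the original line.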
Solution to your problem:
_df_ty['velocity_7'] = (
    _df_ty.groupby('store')['sales']
    .apply(lambda s: s.pct_change(periods=7))  # s is an instance of Series
    .abs()
)
Further, you can simplify your solution, since there is no need to use apply:
_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].pct_change(periods=7).abs()
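To see that the grouped call really stays within each store, here is a small sketch on a toy frame. The store names, sales figures, and the period of 2 are all assumptions chosen to keep the output readable. It passes group_keys=False to the apply variant because, as far as I know, pandas 2.0 and later otherwise prepend the group keys to the result index, which stops the result from lining up with the original rows.

```python
import pandas as pd

# Toy data: two stores with four rows each (values are made up).
df = pd.DataFrame({
    "store": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "sales": [10.0, 20.0, 30.0, 50.0, 100.0, 50.0, 25.0, 200.0],
})
PERIODS = 2  # stands in for the 7 used in the question

# Direct grouped call: compares each row with the row PERIODS earlier
# in the same store.
direct = df.groupby("store")["sales"].pct_change(periods=PERIODS).abs()

# Equivalent apply-based version; group_keys=False keeps the original index.
via_apply = (
    df.groupby("store", group_keys=False)["sales"]
    .apply(lambda s: s.pct_change(periods=PERIODS))
    .abs()
)

# Ungrouped call for contrast: it leaks store "a" values into store "b".
ungrouped = df["sales"].pct_change(periods=PERIODS).abs()

print(pd.DataFrame({"direct": direct, "ungrouped": ungrouped}))
```

The first two rows of store "b" are NaN in direct but hold numbers in ungrouped, which is the cross-group leakage that groupby avoids.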
Alternatively, call pct_change() directly on the grouped column, so that the percent change is computed within each store, and then fill the resulting gaps:
_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].pct_change(periods=7).abs().fillna(0)
Note that we're calling fillna(0) at the end to replace any missing values (which occur for the first 7 rows of each store) with 0, since the percent change for those rows is undefined. No second groupby is needed for this: the result of pct_change is a plain Series aligned with the original rows, and it has no 'store' key to group by.
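A quick way to check the claim about the leading rows is to count the missing values per store before filling them. The following is a rough sketch on made-up data, again with a period of 2 instead of 7; the column name velocity is hypothetical.

```python
import pandas as pd

# Made-up data: store "a" has four rows, store "b" has three.
df = pd.DataFrame({
    "store": ["a"] * 4 + ["b"] * 3,
    "sales": [10.0, 20.0, 30.0, 50.0, 100.0, 50.0, 25.0],
})
PERIODS = 2  # stands in for the 7 used in the question

change = df.groupby("store")["sales"].pct_change(periods=PERIODS).abs()

# Each store should have exactly PERIODS undefined values at its start.
missing_per_store = change.isna().groupby(df["store"]).sum()

# Replace the undefined values with 0, as in the answer above.
df["velocity"] = change.fillna(0)
print(df)
```

Keep in mind that after fillna(0) a filled row is indistinguishable from a row whose sales genuinely did not change, so leaving the NaN values in place may be preferable if that distinction matters downstream.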