How to calculate percent change of a pandas dataframe across groups with a custom period?

Question:

I am trying to compute the percent change in a column of a pandas dataframe. I can get it to work when I don’t specify periods, but I would like to specify the number of periods to consider. The problem is that it has to work across groups: I can get it to work without groups, but for some reason it’s not working with them.

_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].apply(pd.Series.pct_change(periods=7)).abs()

It yells at me with:

TypeError: pct_change() missing 1 required positional argument: 'self'

Which I don’t understand how that’s happening inside of apply? Google and the pandas docs aren’t helping here.

Further, all the Stack Overflow answers I can find are just about calculating the percent change to begin with. I cannot find an example that combines grouping with a period other than 1.

Asked By: capnshanty


Answers:

Q. Which I don’t understand how that’s happening inside of apply?

You’re calling an instance method (pct_change) on the class Series itself, without creating an instance first, so there is nothing to bind to self. The call fails before apply ever runs, hence the error. Here is a simple example to illustrate this:

class Foo:
    def greet(self, name):
        print('Hi', name)

Foo.greet(name='shubham') # Wrong: this will raise TypeError
Foo().greet(name='shubham') # Correct
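
The same thing can be reproduced with pandas directly. This is only a minimal sketch: the Series s and its values are made up for illustration, and periods=1 is used just to keep the data short.

```python
import pandas as pd

s = pd.Series([10.0, 12.0, 15.0])  # hypothetical data

# Calling the method on the class runs it immediately with no Series
# bound to `self`, so it fails before apply() would ever see it.
try:
    pd.Series.pct_change(periods=1)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the instance explicitly is equivalent to calling the method on it.
via_class = pd.Series.pct_change(s, periods=1)
via_instance = s.pct_change(periods=1)
```

Both via_class and via_instance hold the same values, which is why wrapping the call in a lambda that receives the Series fixes the error.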

Solution to your problem:

_df_ty['velocity_7'] = (_df_ty.groupby('store', group_keys=False)['sales']  # do not add 'store' to the result index
                              .apply(lambda s: s.pct_change(periods=7))     # s is an instance of Series
                              .abs())

Further, you can simplify your solution, since there is no need to use apply at all:

_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].pct_change(periods=7).abs()
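
To see what this computes, here is a small sketch on made-up data. The frame df, its values, and the column name velocity_2 are assumptions for illustration, and periods=2 stands in for 7 so the example stays short.

```python
import pandas as pd

# Hypothetical data: two stores, four rows each.
df = pd.DataFrame({
    "store": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "sales": [100.0, 110.0, 120.0, 99.0, 50.0, 40.0, 25.0, 60.0],
})

# Percent change against the value two rows earlier within the same store.
df["velocity_2"] = df.groupby("store")["sales"].pct_change(periods=2).abs()

# The apply-based form, for comparison. group_keys=False keeps the
# group label out of the result index so it lines up with df.
via_apply = (
    df.groupby("store", group_keys=False)["sales"]
      .apply(lambda s: s.pct_change(periods=2))
      .abs()
)
```

The first two rows of each store come out as NaN, and the comparison never crosses from store A into store B.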
Answered By: Shubham Sharma

Instead, you should call the pct_change() method directly on the grouped column, which calculates the percent change within each store, and then fill in the missing values:

_df_ty['velocity_7'] = _df_ty.groupby('store')['sales'].pct_change(periods=7).abs().fillna(0)

Note that we’re calling fillna(0) at the end to replace any missing values (which will occur for the first 7 rows of each store) with 0, since the percent change for those rows is undefined.
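
As a rough check of that claim, the sketch below counts the missing values per store before filling them. The data and the names change, missing_per_store and filled are invented for illustration, with periods set to 2 instead of 7.

```python
import pandas as pd

# Hypothetical data: two stores, four rows each.
df = pd.DataFrame({
    "store": ["A"] * 4 + ["B"] * 4,
    "sales": [100.0, 110.0, 120.0, 99.0, 50.0, 40.0, 25.0, 60.0],
})
periods = 2

change = df.groupby("store")["sales"].pct_change(periods=periods).abs()

# The first `periods` rows of each store have no earlier value to compare to.
missing_per_store = change.isna().groupby(df["store"]).sum()

# Replace those undefined entries with 0.
filled = change.fillna(0)
```

Here missing_per_store shows as many missing rows per store as the value of periods, and filled contains no NaN.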

Answered By: August Infotech