# How to get the growth between two rows

## Question:

I’m trying to get the growth (in %) between two values at different period. Here is how my DataFrame looks like:

``````     sessionSource dateRange activeUsers
0    instagram.com   current           5
1    instagram.com  previous           0
2  l.instagram.com   current          83
3  l.instagram.com  previous          11
4     snapchat.com   current           2
5     snapchat.com  previous           1

``````

What I’m trying to get is:

``````     sessionSource dateRange activeUsers  Growth
0    instagram.com   current           5     xx%
2  l.instagram.com   current          83     xx%
4     snapchat.com   current           2     xx%

``````

I’m not a Pandas expert, I tried few things but nothing came close to what I need.

Thanks a lot for any help.

You can use:

``````(df.sort_values(by=['sessionSource', 'dateRange'],
ascending=[True, False])
.groupby('sessionSource', as_index=False)
.agg({'dateRange': 'first', 'activeUsers': lambda s: s.pct_change().dropna().mul(100)})
)
``````

Output:

``````     sessionSource dateRange  activeUsers
0    instagram.com  previous          inf
1  l.instagram.com  previous   654.545455
2     snapchat.com  previous   100.000000
``````

Assuming you literally just need the percent change between current and previous and current/previous are in the correct order, you can just group the data based on the source and get the percent change of the group
.Use the `pandas.Series.pct_change()` method on the grouped object and you should be good.

``````# sort values before to make sure the order is maintained
df = df.sort_values(by=["sessionSource", "dateRange"], ascending=False)
df['Growth']= (df.groupby('sessionSource')['activeUsers'].apply(pd.Series.pct_change))
#drop na from the unavailable results and convert to %
df["growth"] = (df["growth"].dropna()*100).round(2)

``````

For ex.(taken from the official documentation and applied on a series):

``````s = pd.Series([90, 91, 85])
s
0    90
1    91
2    85
dtype: int64

s.pct_change()
0         NaN
1    0.011111
2   -0.065934
dtype: float64
``````

EDIT
As @Omar suggested, I posted a small edit to the code that fully solved his problem(just added manual reordering + converting percentage points into percentages). The main gist is still `group_by` + `pct_change`

Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.