Python Monthly Change Calculation (Pandas)

Question:

Here is my data:

id date population
1 2021-5 21
2 2021-5 22
3 2021-5 23
4 2021-5 24
1 2021-4 17
2 2021-4 24
3 2021-4 18
4 2021-4 29
1 2021-3 20
2 2021-3 29
3 2021-3 17
4 2021-3 22

I want to calculate the monthly change in population for each id, so the result will be:

id date delta
1 5 .2353
1 4 -.15
2 5 -.0833
2 4 -.1724
3 5 .2778
3 4 .0588
4 5 -.1724
4 4 .3182

delta := (this month – last month) / last month

How should I approach this in pandas? I’m thinking of groupby but don’t know what to do next.

remember there might be more dates. but results is always

Asked By: Yehui He


Answers:

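Maybe you could try something like this:

data['date'] = pd.to_datetime(data['date'])
data = data.sort_values(['id', 'date'])
grouped = data.groupby('id')['population']
data['delta'] = grouped.diff() / grouped.shift()

With this approach the first row of each id is NaN, because it has no previous month to compare against, but the rest should work. The sort and the groupby matter: a plain diff() over the unsorted frame would subtract populations of different ids, and the difference has to be divided by last month's value (shift), not this month's.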
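A self-contained sketch of the diff-based idea, applied per id and dividing by the previous month's value as the question's formula requires. The frame is rebuilt from the question's data; writing the months zero-padded ("2021-05" rather than "2021-5") is my assumption, made so that plain string sorting is chronological.

```python
import pandas as pd

# Data from the question. Assumption: months are zero-padded so that
# sorting the strings also sorts them chronologically.
data = pd.DataFrame({
    "id": [1, 2, 3, 4] * 3,
    "date": ["2021-05"] * 4 + ["2021-04"] * 4 + ["2021-03"] * 4,
    "population": [21, 22, 23, 24, 17, 24, 18, 29, 20, 29, 17, 22],
})

# Oldest month first within each id, so "previous row" means "last month".
data = data.sort_values(["id", "date"])
grouped = data.groupby("id")["population"]

# (this month - last month) / last month
data["delta"] = grouped.diff() / grouped.shift()

# The earliest month of each id has nothing to compare against.
result = data.dropna(subset=["delta"])
print(result)
```

Without the groupby, diff() would also compare the last row of one id with the first row of the next.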

Answered By: Andre

Use GroupBy.pct_change after sorting by id and date (newest month first), then remove the rows where delta is missing:

df['date'] = pd.to_datetime(df['date'])
df = df.sort_values(['id','date'], ascending=[True, False])

df['delta'] = df.groupby('id')['population'].pct_change(-1)

df = df.dropna(subset=['delta'])
print (df)
   id       date  population     delta
0   1 2021-05-01          21  0.235294
4   1 2021-04-01          17 -0.150000
1   2 2021-05-01          22 -0.083333
5   2 2021-04-01          24 -0.172414
2   3 2021-05-01          23  0.277778
6   3 2021-04-01          18  0.058824
3   4 2021-05-01          24 -0.172414
7   4 2021-04-01          29  0.318182
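For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data, sorts oldest month first so the default pct_change() applies, and reduces the date to a month number as in the requested output. Splitting the "YYYY-M" strings by hand and the names parts and out are my own choices for illustration, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4] * 3,
    "date": ["2021-5"] * 4 + ["2021-4"] * 4 + ["2021-3"] * 4,
    "population": [21, 22, 23, 24, 17, 24, 18, 29, 20, 29, 17, 22],
})

# Split "YYYY-M" into integer year and month, then build real timestamps
# so that sorting is chronological (e.g. month 10 comes after month 5).
parts = df["date"].str.split("-", expand=True).astype(int)
df["date"] = pd.to_datetime(
    pd.DataFrame({"year": parts[0], "month": parts[1], "day": 1})
)

# Oldest month first within each id: pct_change() then compares each row
# with the previous month, i.e. (this month - last month) / last month.
df = df.sort_values(["id", "date"])
df["delta"] = df.groupby("id")["population"].pct_change()

out = (
    df.dropna(subset=["delta"])
      .sort_values(["id", "date"], ascending=[True, False])
      .assign(month=lambda d: d["date"].dt.month)
      .loc[:, ["id", "month", "delta"]]
)
print(out.to_string(index=False))
```

Sorting ascending with pct_change() and sorting descending with pct_change(-1) give the same deltas; only the row order differs.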
Answered By: jezrael

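Try this, with df sorted by id and then by date in ascending order, so that each window holds (last month, this month):

df.groupby('id')['population'].rolling(2).apply(lambda x: (x.iloc[1] - x.iloc[0]) / x.iloc[0]).dropna()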
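A runnable sketch of the rolling-window idea, assuming ascending order within each id so the window is (last month, this month) and the division is by last month. The zero-padded month strings and the droplevel step are my additions: a grouped rolling result carries id as an extra index level, which has to be dropped before the values line up with df again.

```python
import pandas as pd

# Data from the question. Assumption: months are zero-padded so that
# string sorting is chronological.
df = pd.DataFrame({
    "id": [1, 2, 3, 4] * 3,
    "date": ["2021-05"] * 4 + ["2021-04"] * 4 + ["2021-03"] * 4,
    "population": [21, 22, 23, 24, 17, 24, 18, 29, 20, 29, 17, 22],
})
df = df.sort_values(["id", "date"])

rolled = (
    df.groupby("id")["population"]
      .rolling(2)
      # x.iloc[0] is last month, x.iloc[1] is this month
      .apply(lambda x: (x.iloc[1] - x.iloc[0]) / x.iloc[0])
      .droplevel(0)  # drop the id level so the index matches df
)
df["delta"] = rolled
print(df.dropna(subset=["delta"]))
```

This gives the same numbers as pct_change, but calls a Python function for every window, so it is likely to be slower on large frames.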
Answered By: JMA