How to include start and end date in pandas groupby/Grouper?

Question:

I have a example DataFrame below.

import pandas as pd
df = pd.DataFrame({
    'date':["2022-01-01", "2022-02-01", "2022-03-01",
             "2022-04-01", "2022-05-01", "2022-06-01"],
    'amount': [2,4,3,5,6,4],
    'category': ['credit', 'credit', 'debit', 'credit', 'debit', 'credit']})

df.date = pd.to_datetime(df.date)

print(df)
       date  amount category
0 2022-01-01       2   credit
1 2022-02-01       4   credit
2 2022-03-01       3    debit
3 2022-04-01       5   credit
4 2022-05-01       6    debit
5 2022-05-01       4    debit
6 2022-06-01       4   credit

I want to get the average sum of amount of debit’s per month. However, when I try the groupby using grouper I get:

print(df[df['category']=='debit'].groupby(pd.Grouper(key="date", freq="MS")).sum())

            amount
date              
2022-03-01       3
2022-04-01       0
2022-05-01       10

My desired output has all the months from the original DataFrame but 0 in the months there are no debits

            amount
date  
2022-01-01       0
2022-02-01       0     
2022-03-01       3
2022-04-01       0
2022-05-01       10
2022-06-01       0

I have tried using "Origin" to set start date but no luck 🙁

Answers:

Use df.mask and convert DatetimeIndex as PeriodIndex

out = (df['amount'].mask(df['category'] == 'credit', other=0)
                   .groupby(df['date'].dt.to_period('M'))
                   .sum().reset_index())
# Output
      date  amount
0  2022-01       0
1  2022-02       0
2  2022-03       3
3  2022-04       0
4  2022-05      10
5  2022-06       0

Another way with pd.Grouper:

out = (df.assign(amount=df['amount'] * df['category'].eq('debit'))
         .groupby(pd.Grouper(key='date', freq='MS'))['amount']
         .sum().reset_index())
print(out)

# Output
        date  amount
0 2022-01-01       0
1 2022-02-01       0
2 2022-03-01       3
3 2022-04-01       0
4 2022-05-01      10
5 2022-06-01       0

The idea is to set amount to 0 when the category is "credit".

Answered By: Corralien