Conditional merging with pandas

Question:

I have a pandas dataframe as below which details additional calls into an area:

CommsDate Area Day0 Incremental Day1 Incremental Day2 Incremental
01/01/24 Sales 43 36 29
01/01/24 Service 85 74 66
02/01/24 Sales 56 42 31
02/01/24 Service 73 62 49
03/01/24 Sales 48 32 24
03/01/24 Service 67 58 46

I am trying to calculate by date how many calls were received, so calls received for Sales on 1st Jan would be Day0_incremental for that date (43), 2nd Jan would be Day0 for 2nd Jan plus Day1 for 1st Jan (36+56) and 3rd Jan would be Day0 for 3rd Jan plus Day1 for 2nd Jan plus Day2 for 1st Jan (48+42+29), resulting in the below dataframe:

CallDate Sales Service
01/01/24 43 85
02/01/24 92 147
03/01/24 119 195
04/01/24 63 107
05/01/24 24 46

I have managed to create a shell of a dataframe for the second table with no values under the area columns, but at a loss as to the next steps:

df['CommsDate'] = pd.to_datetime(df['CommsDate'], format='%d/%m/%Y')
areaUnique = df['Area'].unique().tolist()
from datetime import timedelta
CallDate = pd.date_range(start=min(df['CommsDate']), end=max(df['CommsDate'])+timedelta(days=6), freq='D')

data = {area: [] for area in areaUnique}

dfNew = pd.DataFrame(data)

dfNew['CallDate'] = CallDate

dfNew = dfNew.melt(id_vars=['CallDate'], var_name='Area')

dfNew = dfNew.pivot(index='CallDate', columns='Area', values='value')

dfNew = dfNew.reset_index()

dfNew = dfNew[['CallDate'] + areaUnique]

I have started to write a for loop, but I’ve only got this far:

for i in range(1,len(areaUnique)+1):
    dfNew.columns(i) = 
Asked By: HanRei

||

Answers:

You can pivot, shift and add:

df['CommsDate'] = pd.to_datetime(df['CommsDate'], dayfirst=True)
tmp = df.pivot(index='CommsDate', columns='Area')

out = (tmp['Day0 Incremental']
       .add(tmp['Day1 Incremental'].shift(freq='1d'), fill_value=0)
       .add(tmp['Day2 Incremental'].shift(freq='2d'), fill_value=0)
       .reset_index().rename_axis(columns=None)
      )

Or, programmatically with functools.reduce using the extracted number from the DayX … string:

from functools import reduce
import re

reg = re.compile(r'Day(d+)')

df['CommsDate'] = pd.to_datetime(df['CommsDate'], dayfirst=True)
tmp = df.pivot(index='CommsDate', columns='Area')

out = reduce(lambda a,b: a.add(b, fill_value=0),
             (tmp[d].shift(freq=f'{reg.search(d).group(1)}d') for d in
              tmp.columns.get_level_values(0).unique())
            ).reset_index().rename_axis(columns=None)

Output:

   CommsDate  Sales  Service
0 2024-01-01   43.0     85.0
1 2024-01-02   92.0    147.0
2 2024-01-03  119.0    195.0
3 2024-01-04   63.0    107.0
4 2024-01-05   24.0     46.0
Answered By: mozway
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.