python pandas resample monthly aggregated data to daily, then aggregate back to weekly

Question:

Here is my example data:

team sales month
a 100 1/1/2023
a 200 2/1/2023
b 600 1/1/2023
b 300 2/1/2023

load in pandas like so:

mydata = pd.DataFrame([
['team','sales','month'],
['a',   100,    '1/1/2023'],
['a',   200,    '2/1/2023'],
['b',   600,    '1/1/2023'],
['b',   300,    '2/1/2023']
])
mydata.columns = mydata.iloc[0]
mydata = mydata[1:]  
mydata['month'] = pd.to_datetime(mydata['month'])

My desired outcome for team "a" is this data aggregated by each week as starting on Monday, like this:

team sales Monday Week
a 22.58 1/2/2023
a 22.58 1/9/2023
a 22.58 1/16/2023
a 22.58 1/23/2023
a 42.17 1/30/2023
a 50 2/6/2023
a 50 2/13/2023
a 50 2/20/2023
a 14.29 2/27/2023

So the logic on the calculated sales per week is:
$100 of sales in January, so avg sales per day is 100/31 = 3.23 per day, * 7 days in a weeks = $22.58 for each week in January.
February is $200 over 28 days, so ($200/28)*7 = $50 a week in Feb.

The calculation on the week starting 1/30/2023 is a little more complicated. I need to carry the January rate the first 2 days of 1/30 and 1/31, then start summing the Feb rate for the following 5 days in Feb (until 2/5/2023). So it would be 5*(200/28)+2*(100/31) = 42.17

Is there a way to do this in Pandas? I believe the logic that may work is taking each monthly total, decomposing that into daily data with an average rate, then using pandas to aggregate back up to weekly data starting on Monday for each month, but I’m lost trying to chain together the date functions.

Asked By: barker

||

Answers:

I think you have miscalculation for team A for the week of 1/30/2023. It has no sales in Feb so its sales for the week should be 3.23 * 2 = 4.46.

Here’s one way to do that:

def get_weekly_sales(group: pd.DataFrame) -> pd.DataFrame:
    tmp = (
        # Put month to the index and convert it to monthly period
        group.set_index("month")[["sales"]]
        .to_period("M")
        # Calculate the average daily sales
        .assign(sales=lambda x: x["sales"] / x.index.days_in_month)
        # Up-sample the dataframe to daily
        .resample("1D")
        .ffill()
        # Sum by week
        .groupby(pd.Grouper(freq="W"))
        .sum()
    )
    # Clean up the index
    tmp.index = tmp.index.to_timestamp().rename("week_starting")

    return tmp

df.groupby("team").apply(get_weekly_sales)
Answered By: Code Different
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.