group by week in pandas

Question:

I have this data frame:

Name    Date      Quantity
Apple   07/11/17  20
orange  07/14/17  20
Apple   07/14/17  70
Orange  07/25/17  40
Apple   07/20/17  30

I want to aggregate this by Name and Date to get the sum of quantities.
Details:

Date: group by week; the result should be labeled with the beginning of the week (i.e. the Monday).

Quantity: sum the records that have the same Name and whose Date falls in the same week.

The desired output is given below:

Name    Date      Quantity
Apple   07/10/17  90
orange  07/10/17  20
Apple   07/17/17  30
orange  07/24/17  40
Asked By: Ram

Answers:

Let’s use groupby, resample with W-Mon, and sum:

df.groupby('Name').resample('W-Mon', on='Date').sum().reset_index().sort_values(by='Date')

Output:

     Name       Date  Quantity
0   Apple 2017-07-17        90
3  orange 2017-07-17        20
1   Apple 2017-07-24        30
2  Orange 2017-07-31        40
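
Note that the dates above are the Mondays that end each week, not the ones that start it. A minimal sketch of a variant that labels each week with its starting Monday, assuming the question's data and using set_index plus label='left', closed='left' (my choice of formulation, not part of the answer above):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Apple", "orange", "Apple", "Orange", "Apple"],
    "Date": ["07/11/17", "07/14/17", "07/14/17", "07/25/17", "07/20/17"],
    "Quantity": [20, 20, 70, 40, 30],
})
df["Date"] = pd.to_datetime(df["Date"], format="%m/%d/%y")

# closed="left" makes each bin run from Monday up to (not including) the
# next Monday; label="left" stamps the bin with its starting Monday.
weekly = (
    df.set_index("Date")
      .groupby("Name")["Quantity"]
      .resample("W-MON", label="left", closed="left")
      .sum()
      .reset_index()
      .sort_values("Date")
)
print(weekly)
```

"orange" and "Orange" stay separate groups here because the names are compared case-sensitively.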
Answered By: Scott Boston

First, convert the Date column with to_datetime and subtract one week, because W-MON labels each bin with the Monday that ends the week, while we want the Monday that starts it.

Then use groupby with Grouper and freq='W-MON', and aggregate with sum:

df['Date'] = pd.to_datetime(df['Date']) - pd.to_timedelta(7, unit='d')
# Parentheses are needed so the chained calls can span several lines
df = (df.groupby(['Name', pd.Grouper(key='Date', freq='W-MON')])['Quantity']
        .sum()
        .reset_index()
        .sort_values('Date'))
print(df)
     Name       Date  Quantity
0   Apple 2017-07-10        90
3  orange 2017-07-10        20
1   Apple 2017-07-17        30
2  Orange 2017-07-24        40
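
One edge case worth checking with the shift approach is a date that falls exactly on a Monday. The sketch below uses two made-up rows (a Monday and the following Tuesday, not from the question) and compares the shifted grouping with a left-closed, left-labeled Grouper, which is an alternative I am assuming here rather than something the answer proposes:

```python
import pandas as pd

# Hypothetical rows: 2017-07-17 is a Monday, 2017-07-18 the Tuesday after it.
df = pd.DataFrame({
    "Name": ["Apple", "Apple"],
    "Date": pd.to_datetime(["2017-07-17", "2017-07-18"]),
    "Quantity": [1, 2],
})

# Shift approach: subtract 7 days, then use the default right-closed bins.
shifted = df.assign(Date=df["Date"] - pd.to_timedelta(7, unit="D"))
by_shift = shifted.groupby(
    ["Name", pd.Grouper(key="Date", freq="W-MON")]
)["Quantity"].sum()

# Variant: no shift, bins closed and labeled on their starting Monday.
by_left = df.groupby(
    ["Name", pd.Grouper(key="Date", freq="W-MON", label="left", closed="left")]
)["Quantity"].sum()

print(by_shift)  # the Monday row lands in the week labeled 2017-07-10
print(by_left)   # both rows land in the week labeled 2017-07-17
```

With the shift, weeks effectively run Tuesday to Monday; if Monday dates should open a new week, the left-closed variant may be closer to what the question asks for.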
Answered By: jezrael

First convert the date column with to_datetime. The following groups by week, with weeks starting on Monday. It outputs the week number (you can change that by looking up other format codes at http://strftime.org/):

df.groupby(['name', df['date'].dt.strftime('%W')])['quantity'].sum()

Output:

name    date
apple   28      90
        29      30
orange  28      20
        30      40
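
If the data covers more than one year, the week number alone merges the same week of different years. A small sketch with made-up rows (the 2016 row is hypothetical) that puts the year into the key with '%Y-%W':

```python
import pandas as pd

# Hypothetical rows: the same week number in two different years.
df = pd.DataFrame({
    "name": ["apple", "apple"],
    "date": pd.to_datetime(["2016-07-12", "2017-07-11"]),
    "quantity": [5, 20],
})

# Week number only: both rows fall into week "28" and are summed together.
week_only = df.groupby(["name", df["date"].dt.strftime("%W")])["quantity"].sum()

# Year plus week number keeps the two years apart.
year_week = df.groupby(["name", df["date"].dt.strftime("%Y-%W")])["quantity"].sum()

print(week_only)
print(year_week)
```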
Answered By: LN_P

This groups every row on the previous Monday (if the date is already Monday, nothing is changed). This has the effect of grouping by week:

import pandas as pd, datetime as dt

# df = ...

df['WeekDate'] = df.apply(lambda row: row['Date'] - dt.timedelta(days=row['Date'].weekday()), axis=1)

perweek = df['WeekDate'].groupby(df['WeekDate']).count()

Example:

Date           WeekDate
2020-06-20     2020-06-15 <- monday
2020-06-21     2020-06-15
2020-06-24     2020-06-22 <- monday 
2020-06-25     2020-06-22
2020-06-26     2020-06-22
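
The same Monday can also be computed without a row-wise apply. This is a sketch of a vectorized equivalent, assuming Date is already a datetime column; the dates are the ones from the example above:

```python
import pandas as pd

df = pd.DataFrame({"Date": pd.to_datetime(
    ["2020-06-20", "2020-06-21", "2020-06-24", "2020-06-25", "2020-06-26"]
)})

# dt.weekday is 0 for Monday, so subtracting that many days
# moves every date back to the Monday of its week.
df["WeekDate"] = df["Date"] - pd.to_timedelta(df["Date"].dt.weekday, unit="D")

# Number of rows per week.
perweek = df.groupby("WeekDate").size()
print(perweek)
```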
Answered By: Basj

You have already received a lot of good answers and the question is quite old. However, some of the solutions use deprecated functions, and I ran into the same problem and found a different solution, so I think sharing it could be helpful to someone.

Given the dataframe you proposed:

Name    Date      Quantity
Apple   07/11/17  20
orange  07/14/17  20
Apple   07/14/17  70
Orange  07/25/17  40
Apple   07/20/17  30

We have to convert the values in ‘Date’ to pandas datetimes, since they are strings right now.
Then we can use the Series’ dt accessor, which lets us handle datetime-like series and extract information from them.

df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%y')

Having a datetime column allows us to use the dt accessor to extract the week number associated with each date.
So as not to lose any information, I prefer to add a new column with the week number.
Once the week number is retrieved, we can group by that week.

df['WeekNumber'] = df['Date'].dt.isocalendar().week
df.groupby(['Name', 'WeekNumber'])['Quantity'].sum()

Name    WeekNumber
Apple   28            90
        29            30
Orange  28            20
        30            40

Small problem: what if we consider different years?

Our data may span a range of years. In that case we cannot group on the week alone (otherwise we would mix data from one year into another), so it is useful to also extract the year column from isocalendar().

df['year'] = df['Date'].dt.isocalendar().year
df.groupby(['Name', 'WeekNumber', 'year'])['Quantity'].sum()

Name    WeekNumber  year    Quantity
Apple   28          2017    90
        29          2017    30
Orange  28          2017    20
        30          2017    40
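
The year should come from isocalendar() rather than from dt.year, because the two can disagree around New Year. A small sketch with two made-up dates (not from the question) to illustrate:

```python
import pandas as pd

# Hypothetical dates on either side of New Year, both in the same ISO week.
dates = pd.Series(pd.to_datetime(["2018-12-31", "2019-01-02"]))

iso = dates.dt.isocalendar()      # DataFrame with columns year, week, day
calendar_year = dates.dt.year     # plain calendar year

# 2018-12-31 is a Monday that belongs to ISO week 1 of 2019, so pairing the
# ISO week with the calendar year would split this week into two groups.
print(iso[["year", "week"]])
print(calendar_year)
```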
Answered By: AndreP

You can use the to_period method to get the date truncated to the first day of the week (or month if you use the period M):

df["Week"] = df["Date"].dt.to_period("W").dt.to_timestamp()
df.groupby(["Name", "Week"])["Quantity"].sum()
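
For completeness, here is a self-contained sketch of this approach on the question's data. Two things are my assumptions rather than part of the answer: the names are normalized so that "orange" and "Orange" count as one item, and dt.start_time is used to get the Monday that starts each weekly period:

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Apple", "orange", "Apple", "Orange", "Apple"],
    "Date": pd.to_datetime(
        ["07/11/17", "07/14/17", "07/14/17", "07/25/17", "07/20/17"],
        format="%m/%d/%y",
    ),
    "Quantity": [20, 20, 70, 40, 30],
})

# Assumption: differently cased names refer to the same item.
df["Name"] = df["Name"].str.capitalize()

# "W" periods run Monday to Sunday; start_time is the Monday at 00:00.
df["Week"] = df["Date"].dt.to_period("W").dt.start_time

weekly = df.groupby(["Name", "Week"], as_index=False)["Quantity"].sum()
print(weekly)
```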
Answered By: nbeuchat