Group a DataFrame by months and an additional column
Question:
I have the following DataFrame:
import pandas as pd

data = {
    'date':['02/01/2023', '03/01/2023', '12/01/2023', '16/01/2023', '23/01/2023', '03/02/2023', '14/02/2023', '17/02/2023', '17/02/2023', '20/02/2023'],
    'amount':[-2.6, -230.0, -9.32, -13.99, -12.99, -50.0, -5.84, -6.6, -11.95, -20.4],
    'concept':['FOOD', 'REPAIR', 'HEALTH', 'NO CLASSIFIED', 'NO CLASSIFIED', 'REPAIR', 'FOOD', 'NO CLASSIFIED', 'FOOD', 'HEALTH']
}
df = pd.DataFrame(data)
I need to group the information first by month and then by the concept of each item. I tried something like this:
df.groupby(['date','concept']).sum()
This works per individual day, but I need the same result grouped by whole month.
I also tried converting df.date to datetime values with df.date = pd.to_datetime(df.date, dayfirst=True), but I don’t know how to specify that the grouping should be by month.
The result I need would be something like this:
| date | concept | amount |
|---|---|---|
| Jan-23 | FOOD | -2.6 |
| | HEALTH | -9.32 |
| | NO CLASSIFIED | -26.98 |
| | REPAIR | -230 |
| Feb-23 | FOOD | -17.79 |
| | HEALTH | -20.4 |
| | NO CLASSIFIED | -6.6 |
| | REPAIR | -50 |
Answers:
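You can use the pandas .dt accessor (the standalone datetime module is not needed). First convert the date column to datetime, then create new year and month columns through .dt, and finally group by them and sum. Try the code below:
import pandas as pd
# Parse the day-first date strings into datetime values
df["date"] = pd.to_datetime(df["date"], dayfirst=True)
# Extract month and year through the .dt accessor
df["month"] = df["date"].dt.month
df["year"] = df["date"].dt.year
# Drop the original column so that only "amount" is summed
df.drop("date", axis=1, inplace=True)
result = df.groupby(["year", "month", "concept"]).sum()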
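Alternatively, starting again from the original df, you can group by a monthly period:
df["date"] = pd.to_datetime(df["date"], dayfirst=True)
# to_period('M') maps every date to its calendar month, e.g. 2023-01
df['month_year'] = df['date'].dt.to_period('M')
# Drop the original column so that only "amount" is summed
df.drop("date", axis=1, inplace=True)
result = df.groupby(["month_year", "concept"]).sum()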
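Neither snippet above produces the "Jan-23" style labels shown in the question. A minimal sketch of one way to get them, assuming an English/C locale for the month abbreviation, is to group by the monthly period without adding helper columns and then format the period with strftime. The round(2) call is an addition of this sketch, used only to hide floating-point noise in the sums.

```python
import pandas as pd

data = {
    "date": ["02/01/2023", "03/01/2023", "12/01/2023", "16/01/2023", "23/01/2023",
             "03/02/2023", "14/02/2023", "17/02/2023", "17/02/2023", "20/02/2023"],
    "amount": [-2.6, -230.0, -9.32, -13.99, -12.99, -50.0, -5.84, -6.6, -11.95, -20.4],
    "concept": ["FOOD", "REPAIR", "HEALTH", "NO CLASSIFIED", "NO CLASSIFIED",
                "REPAIR", "FOOD", "NO CLASSIFIED", "FOOD", "HEALTH"],
}
df = pd.DataFrame(data)
df["date"] = pd.to_datetime(df["date"], dayfirst=True)

# Group by calendar month (as a Period) and by concept; sum only "amount"
result = (
    df.groupby([df["date"].dt.to_period("M"), "concept"])["amount"]
    .sum()
    .round(2)  # assumption: two decimals are enough for these amounts
    .reset_index()
)

# Turn the monthly Period into a label such as "Jan-23" (locale dependent)
result["date"] = result["date"].dt.strftime("%b-%y")
print(result)
```

Formatting the label after grouping keeps the months in chronological order; grouping directly on the formatted strings would sort them alphabetically, placing Feb-23 before Jan-23.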