How To Create New Pandas Columns With Monthly Counts From DateTime

Question

I have a large dataframe of crime incidents, df, with four columns. INCIDENT_DATE has dtype datetime, and Type takes one of three values (Violent, Property, and non-index).

ID	Crime	INCIDENT_DATE	Type
XL123445	Aggrevated Assault	2018-12-29	Violent
XL123445	Simple Assault	2018-12-29	Violent
XL123445	Theft	2018-12-30	Property
TX56784	Theft	2018-04-28	Property
…	…
CA45678	Sexual Assault	1991-10-23	Violent
LA356890	Burglary	2018-12-21	Property

I want to create a new dataframe with the monthly counts (for each ID) of the Property and Violent types, plus a column for the total number of incidents for that ID during that month.

So I would want something like:

ID	Year_Month	Violent	Property	Total
XL123445	2018-08	19654	500	20154
TX56784	2011-07	17	15	32
…	…	…
CA45678	1992-06	100	100	200
LA356890	1993-05	0	50	50

I previously created a dataframe with a ‘Year_Month’ column that only held the aggregated count of crime incidents for each ID and month, ignoring ‘Type’. I did this with:

df1 = (df.value_counts(['ID', df['INCIDENT_DATE'].dt.to_period('M').rename('Year_Month')])
     .rename('Count').reset_index())

Is there a way I can carry over this same logic while creating the two additional columns, as desired?

Asked By: TheMaffGuy

Answer 1

IIUC, you were very close: add 'Type' to the keys passed to value_counts, unstack it into columns, then add a Total column:

df1 = df.value_counts([
    'ID', df['INCIDENT_DATE'].dt.to_period('M').rename('Year_Month'), 'Type',
]).unstack('Type', fill_value=0).rename_axis(None, axis=1)
df1 = df1.assign(Total=df1.sum(axis=1)).reset_index()

On your sample data:

>>> df1
         ID Year_Month  Property  Violent  Total
0   CA45678    1991-10         0        1      1
1  LA356890    2018-12         1        0      1
2   TX56784    2018-04         1        0      1
3  XL123445    2018-12         1        2      3
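
For anyone who wants to try this end to end, here is a minimal self-contained sketch along the same lines. It rebuilds only the rows visible in the question (the elided rows are left out) and uses groupby(...).size() in place of value_counts, which should give the same counts. The reindex step is an assumption on my part: it keeps only the Violent and Property columns, so Total is Violent + Property as in the desired output, and non-index incidents are not counted. The names year_month, counts and result are just illustrative.

```python
import pandas as pd

# Rows copied from the question's sample (spelling as posted; elided rows omitted).
df = pd.DataFrame({
    "ID": ["XL123445", "XL123445", "XL123445", "TX56784", "CA45678", "LA356890"],
    "Crime": ["Aggrevated Assault", "Simple Assault", "Theft", "Theft",
              "Sexual Assault", "Burglary"],
    "INCIDENT_DATE": pd.to_datetime([
        "2018-12-29", "2018-12-29", "2018-12-30",
        "2018-04-28", "1991-10-23", "2018-12-21",
    ]),
    "Type": ["Violent", "Violent", "Property", "Property", "Violent", "Property"],
})

# Monthly period for each incident, named so it becomes the 'Year_Month' level.
year_month = df["INCIDENT_DATE"].dt.to_period("M").rename("Year_Month")

# Count rows per (ID, month, Type), then pivot Type into columns.
counts = (
    df.groupby(["ID", year_month, "Type"]).size()
      .unstack("Type", fill_value=0)
      .rename_axis(None, axis=1)
)

# Assumption: keep only these two types (in this order) and create a
# zero-filled column if one of them never occurs in the data.
counts = counts.reindex(columns=["Violent", "Property"], fill_value=0)

# Total here is Violent + Property only.
counts["Total"] = counts.sum(axis=1)
result = counts.reset_index()
print(result)
```

If Total should also include non-index incidents, one option is to compute it before the reindex step instead of after.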

Answered By: Pierre D
