Time until next occurrence of value in pandas dataframe
Question:
I am trying to find the time (in days) until next occurrence of each specific value. For example, say I have the data below:
  column1  created_at
0       A  2018-09-03
1       B  2018-09-07
2       B  2018-09-08
3       A  2018-09-09
4       B  2018-09-12
The goal is to get the time difference between consecutive occurrences of each value in column1, in chronological order. At row 3 the column1 value is A and its creation date is 2018-09-09. The previous A was created on 2018-09-03, which is 6 days earlier.
Trying to get this:
  column1  created_at  time_diff
0       A  2018-09-03        NaN
1       B  2018-09-07        NaN
2       B  2018-09-08          1
3       A  2018-09-09          6
4       B  2018-09-12          4
Answers:
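Assuming the dataframe is named df:
import pandas as pd
df['created_at'] = pd.to_datetime(df['created_at'])
df['time_diff'] = df.groupby('column1')['created_at'].diff().dt.days
The to_datetime line converts created_at to datetime; you can skip it if the column already has a datetime dtype.
The groupby line gives you what you want: within each column1 value it subtracts the previous row's date from the current one and keeps the result in whole days. diff() works in row order, so this assumes the rows are already sorted by created_at.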
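For reference, here is a minimal, self-contained sketch of the same approach on the sample data from the question. The explicit sort_values call and the time_to_next column are my own additions for illustration, not part of the original question: the sort only matters if your rows are not already chronological, and time_to_next shows one possible way to get the gap to the following occurrence instead of the previous one.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "column1": ["A", "B", "B", "A", "B"],
    "created_at": ["2018-09-03", "2018-09-07", "2018-09-08",
                   "2018-09-09", "2018-09-12"],
})
df["created_at"] = pd.to_datetime(df["created_at"])

# Assumption: sort chronologically so diff() compares neighbouring dates
df = df.sort_values("created_at")

# Days since the previous occurrence of the same column1 value
df["time_diff"] = df.groupby("column1")["created_at"].diff().dt.days

# Days until the next occurrence (hypothetical extra column):
# diff(-1) is current minus next, so negate it to get a positive gap
df["time_to_next"] = -df.groupby("column1")["created_at"].diff(-1).dt.days

print(df)
```

The time_diff column should match the table in the question. The first occurrence of each value has no earlier row to compare with, so it stays NaN, and the column comes back as float for that reason.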
Tick this answer if this solves your problem.