How to get the total elapsed time for each group in a time-series pandas DataFrame?
Question:
I want to calculate the elapsed time in minutes for each group in my DataFrame.
For example, given the following DataFrame, I want to calculate the elapsed time since the first timestamp within each group, "foo" and "bar".
import pandas as pd

data = pd.DataFrame(
    {
        "name": ["foo", "foo", "foo", "bar", "bar", "bar"],
        "time": [
            pd.to_datetime("2021-06-01 00:00:00"),
            pd.to_datetime("2021-06-01 00:15:00"),
            pd.to_datetime("2021-06-01 00:30:00"),
            pd.to_datetime("2021-06-01 00:00:00"),
            pd.to_datetime("2021-06-01 01:00:00"),
            pd.to_datetime("2021-06-01 01:30:00"),
        ],
    }
)
Expected output:
name time elapsed_min
0 foo 2021-06-01 00:00:00 0
1 foo 2021-06-01 00:15:00 15
2 foo 2021-06-01 00:30:00 30
3 bar 2021-06-01 00:00:00 0
4 bar 2021-06-01 01:00:00 60
5 bar 2021-06-01 01:30:00 90
Answers:
You can use groupby with transform to get the earliest time in each group, subtract it from each row's time, and convert the difference to minutes:
# transform('min') broadcasts each group's earliest time back to every row of that group
data['elapsed_min'] = (data['time'].sub(data.groupby('name')['time'].transform('min'))
                       .dt.total_seconds().div(60))
print(data)
name time elapsed_min
0 foo 2021-06-01 00:00:00 0.0
1 foo 2021-06-01 00:15:00 15.0
2 foo 2021-06-01 00:30:00 30.0
3 bar 2021-06-01 00:00:00 0.0
4 bar 2021-06-01 01:00:00 60.0
5 bar 2021-06-01 01:30:00 90.0
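The expected output in the question shows whole minutes (0, 15, 30) rather than floats. As a minimal sketch of one way to get that, assuming every timestamp is non-null, the timedelta can be floor-divided by a one-minute pd.Timedelta instead of going through total_seconds(). The group_start name is only illustrative and is not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame(
    {
        "name": ["foo", "foo", "foo", "bar", "bar", "bar"],
        "time": pd.to_datetime(
            [
                "2021-06-01 00:00:00",
                "2021-06-01 00:15:00",
                "2021-06-01 00:30:00",
                "2021-06-01 00:00:00",
                "2021-06-01 01:00:00",
                "2021-06-01 01:30:00",
            ]
        ),
    }
)

# Earliest timestamp of each group, aligned to the original rows
# (illustrative helper name, not from the original answer).
group_start = data.groupby("name")["time"].transform("min")

# Floor-dividing a timedelta column by a one-minute Timedelta yields whole minutes.
# Assumption: no missing timestamps; NaT values would turn the result into floats.
data["elapsed_min"] = (data["time"] - group_start) // pd.Timedelta(minutes=1)

print(data)
```

Because the start of each group is taken with min rather than from the first row, this does not depend on the rows being sorted by time within a group.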