How to convert data into timeseries for column groups
Question:
I have data with timestamps. Users do tasks, and the timestamp of each task is recorded. Each user is identified by a ‘uid’. I want to convert this data into a time series with 10-minute granularity, built separately for each user. That is, the timestamps should run in chronological order for uid=1, then for uid=2, and so on.
From:
timestamp uid var
2020-01-01 10:00 1 10
2020-01-01 10:04 2 20
2020-01-01 20:02 2 15
2020-01-01 21:20 1 10
..
2020-01-15 23:12 1 5
To:
timestamp uid var
2020-01-01 10:00 1 10
2020-01-01 10:10 1 NaN
2020-01-01 10:20 1 NaN
...
2020-01-15 23:10 1 5
2020-01-01 10:00 2 20
2020-01-01 10:10 2 NaN
2020-01-01 10:20 2 NaN
...
Answers:
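Use groupby on the uid column, then resample each group at a 10-minute frequency ('10min'; older pandas versions also accept '10T'):
# Assumes df has a DatetimeIndex, e.g. df = df.set_index('timestamp')
(df.groupby('uid')['var']
   .resample('10min')      # '10T' is a deprecated alias in recent pandas
   .sum(min_count=1)       # empty bins become NaN instead of 0
   .reset_index(level=0))  # move uid from the index back to a column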
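To try the approach end to end, here is a minimal sketch built only from the sample rows shown in the question. The helper name to_user_timeseries is hypothetical, and the sketch assumes timestamp starts out as a regular column. It uses '10min' as the frequency string and sum(min_count=1) so that empty bins come out as NaN without overwriting genuine zeros.

```python
import pandas as pd

# Sample rows copied from the question (the elided rows are left out).
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2020-01-01 10:00",
                "2020-01-01 10:04",
                "2020-01-01 20:02",
                "2020-01-01 21:20",
                "2020-01-15 23:12",
            ]
        ),
        "uid": [1, 2, 2, 1, 1],
        "var": [10, 20, 15, 10, 5],
    }
)


def to_user_timeseries(frame, freq="10min"):
    """Hypothetical helper: one row per (uid, 10-minute bin)."""
    out = (
        frame.set_index("timestamp")  # resample needs a DatetimeIndex
        .groupby("uid")["var"]
        .resample(freq)               # bins are built per uid
        .sum(min_count=1)             # bins with no rows become NaN
        .reset_index()                # uid and timestamp back to columns
    )
    # Match the column order of the desired output.
    return out[["timestamp", "uid", "var"]]


result = to_user_timeseries(df)
print(result[result["uid"] == 2].head())
```

Each user's series runs from that user's first bin to that user's last bin, so users with different activity windows get series of different lengths.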