Check certain conditions looking back x hours (pandas)
Question:
I have some data like this:
import pandas as pd
dates = ["12/25/2021 07:47:01", "12/25/2021 08:02:32", "12/25/2021 13:57:40", "12/25/2021 14:17:11", "12/25/2021 17:23:01", "12/25/2021 23:48:55", "12/26/2021 08:22:32", "12/26/2021 11:11:11", "12/26/2021 14:53:40", "12/26/2021 16:07:07", "12/26/2021 23:56:07"]
is_manual = [0,0,0,0,1,1,0,0,0,0,1]
is_problem = [0,0,0,0,1,1,0,0,0,1,1]
df = pd.DataFrame({'dates': dates,
                   'manual_entry': is_manual,
                   'problem_entry': is_problem})
                  dates  manual_entry  problem_entry
0   12/25/2021 07:47:01             0              0
1   12/25/2021 08:02:32             0              0
2   12/25/2021 13:57:40             0              0
3   12/25/2021 14:17:11             0              0
4   12/25/2021 17:23:01             1              1
5   12/25/2021 23:48:55             1              1
6   12/26/2021 08:22:32             0              0
7   12/26/2021 11:11:11             0              0
8   12/26/2021 14:53:40             0              0
9   12/26/2021 16:07:07             0              1
10  12/26/2021 23:56:07             1              1
What I would like to do is take every row where problem_entry == 1 and check whether every row in the 24 hours prior to that row has manual_entry == 0.
I know you can create a rolling lookback window over a fixed number of rows, but my rows are not spaced at regular time intervals, so I am wondering how to look back 24 hours and determine whether the criterion above is met.
Thanks in advance
EDIT: Expected output:
                  dates  manual_entry  problem_entry
4   12/25/2021 17:23:01             1              1
10  12/26/2021 23:56:07             1              1
Answers:
Try the following. The sum of 'manual_entry' over a sliding 24-hour window (which includes the current row) is stored in a separate variable. If that sum equals 1 on a row whose own 'manual_entry' is 1, there were no other manual entries during the preceding day. The dataframe is then filtered to the rows where 'problem_entry' is 1 and the windowed sum is 1.
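# Parse the strings as datetimes so a time-based rolling window can be used
df['dates'] = pd.to_datetime(df['dates'])
# Sum of manual_entry over the trailing 24 hours (86400 s), current row included
a = (df.rolling("86400s", on='dates', min_periods=1).sum()).loc[:, 'manual_entry']
# Keep problem rows whose 24-hour window contains exactly one manual entry
print(df.loc[(df['problem_entry'] == 1) & (a == 1)])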
Output:
                 dates  manual_entry  problem_entry
4  2021-12-25 17:23:01             1              1
10 2021-12-26 23:56:07             1              1
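
Note that a windowed sum of 1 only means "no earlier manual entries" when the current row is itself a manual entry. A problem row with manual_entry == 0 and exactly one manual entry in its window would also pass that filter. If the criterion is meant to look strictly at the rows before the current one, a possible variant is to subtract the current row's own value from the windowed sum and require the remainder to be 0. This is a minimal sketch on the question's sample data, not a tested general solution. The names prior_manual and result are illustrative, and it assumes the rows are already sorted by date.

```python
import pandas as pd

dates = ["12/25/2021 07:47:01", "12/25/2021 08:02:32", "12/25/2021 13:57:40",
         "12/25/2021 14:17:11", "12/25/2021 17:23:01", "12/25/2021 23:48:55",
         "12/26/2021 08:22:32", "12/26/2021 11:11:11", "12/26/2021 14:53:40",
         "12/26/2021 16:07:07", "12/26/2021 23:56:07"]
is_manual = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1]
is_problem = [0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1]
df = pd.DataFrame({'dates': dates,
                   'manual_entry': is_manual,
                   'problem_entry': is_problem})
df['dates'] = pd.to_datetime(df['dates'])

# Trailing 24-hour sum of manual_entry; the window is (t - 24h, t],
# so it includes the current row but not a row exactly 24 hours earlier.
window_sum = df.rolling("86400s", on='dates', min_periods=1).sum()['manual_entry']

# Remove the current row's own value to count only the earlier rows.
prior_manual = window_sum - df['manual_entry']

# Problem rows with no manual entries in the preceding 24 hours.
result = df.loc[(df['problem_entry'] == 1) & (prior_manual == 0)]
print(result)
```

On the sample data this selects the same rows (index 4 and 10) as the filter above; the two versions differ only for problem rows that are not manual entries themselves.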