Find events in log that occur after a specific event

Question:

I have a log of events to analyze that looks like this:

+----+---------------------+----------+--------+
| id | timestamp           | record   | event  |
+====+=====================+==========+========+
| 1  | 2023-03-01 13:17:05 | record03 | Edit   |
+----+---------------------+----------+--------+
| 2  | 2023-03-02 02:57:49 | record02 | Edit   |
+----+---------------------+----------+--------+
| 3  | 2023-03-03 00:41:13 | record03 | Locked |
+----+---------------------+----------+--------+
| 4  | 2023-03-03 14:54:34 | record03 | View   |
+----+---------------------+----------+--------+
| 5  | 2023-03-04 07:29:55 | record03 | Edit   |
+----+---------------------+----------+--------+
| 6  | 2023-03-05 02:15:10 | record02 | Locked |
+----+---------------------+----------+--------+
| 7  | 2023-03-05 04:47:33 | record01 | View   |
+----+---------------------+----------+--------+
| 8  | 2023-03-05 15:39:04 | record02 | View   |
+----+---------------------+----------+--------+
| 9  | 2023-03-06 08:36:22 | record03 | View   |
+----+---------------------+----------+--------+
| 10 | 2023-03-06 18:37:28 | record02 | View   |
+----+---------------------+----------+--------+

What I’m looking for is any "Edit" event that occurs after a "Locked" event for a given record. For each record, any event that occurred prior to a "Locked" event can be ignored. Any "Edit" event that occurs after the "Locked" event must be reported.

For example, in the sample data above, only row 5 should be returned as it has an "Edit" event after a "Locked" event. If everything is working properly, there shouldn’t be any "Edit" events after a "Locked" event in the log. Any method that identifies row 5 among a list of results would be acceptable.

I’ve been trying to use groupby() and first(), but I’m struggling to figure out how to return the first occurrence of "Edit" for a given record after any occurrence of "Locked".

Thanks in advance for any tip!

Asked By: mrgou


Answers:

Code

# Flag the rows where the event is "Locked"
is_locked = df['event'] == 'Locked'

# Within each record, carry the flag forward from the first "Locked" row onward
# (this relies on df already being sorted by timestamp)
is_locked = is_locked.groupby(df['record']).cummax()

# Keep rows that are flagged and whose event is "Edit"
mask = df['event'].eq('Edit') & is_locked

# Select the matching rows
df[mask]

Result

   id            timestamp    record event
4   5  2023-03-04 07:29:55  record03  Edit
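
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample log from the question, sorts by timestamp (an assumption on my part, in case the real log is not already in chronological order), and applies the same cummax() idea. The names `locked_so_far`, `late_edits` and `first_late_edits` are only illustrative. The last step is optional and addresses the "first occurrence" part of the question by keeping one row per record.

```python
import pandas as pd

# Sample log copied from the table in the question
df = pd.DataFrame({
    "id": list(range(1, 11)),
    "timestamp": pd.to_datetime([
        "2023-03-01 13:17:05", "2023-03-02 02:57:49", "2023-03-03 00:41:13",
        "2023-03-03 14:54:34", "2023-03-04 07:29:55", "2023-03-05 02:15:10",
        "2023-03-05 04:47:33", "2023-03-05 15:39:04", "2023-03-06 08:36:22",
        "2023-03-06 18:37:28",
    ]),
    "record": [
        "record03", "record02", "record03", "record03", "record03",
        "record02", "record01", "record02", "record03", "record02",
    ],
    "event": [
        "Edit", "Edit", "Locked", "View", "Edit",
        "Locked", "View", "View", "View", "View",
    ],
})

# Assumption: "after" means later in time, so sort chronologically first
df = df.sort_values("timestamp")

# True on each "Locked" row, then carried forward within each record
locked_so_far = df["event"].eq("Locked").groupby(df["record"]).cummax()

# Every "Edit" that happens once its record has been locked
late_edits = df[df["event"].eq("Edit") & locked_so_far]

# Optional: only the first offending "Edit" per record
first_late_edits = late_edits.drop_duplicates("record")

print(late_edits)
```

On the sample data both `late_edits` and `first_late_edits` contain only the row with id 5. If a record can be unlocked again, this approach would need an extra "Unlocked" style event to reset the flag, which the question's data does not include.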
Answered By: Shubham Sharma