Pandas: filtering rows based on condition

Question:

Existing Dataframe :

Id           Date            Status
A         26-01-2022         begin
A         26-01-2022         failed
A         27-01-2022         begin
A         27-01-2022         in-process
A         27-01-2022         success
B         01-02-2022         in-process
B         01-02-2022         success
B         02-02-2022         begin
B         02-02-2022         failed

Expected Dataframe :

Id           Date            Status
A         27-01-2022         begin
A         27-01-2022         in-process
A         27-01-2022         success
B         01-02-2022         in-process
B         01-02-2022         success

Need to keep all the activity of the Id where the status for that Id ends with success for that particular date.

tried grouping Id and Date, but stuck with dropping the relevant rows

Asked By: Roshankumar

||

Answers:

You can use groupby and filter by last value of Status column

out = (df.groupby(['Id', 'Date'])
       .filter(lambda g: g['Status'].iloc[-1] == 'success'))
print(out)

  Id        Date      Status
2  A  27-01-2022       begin
3  A  27-01-2022  in-process
4  A  27-01-2022     success
5  B  01-02-2022  in-process
6  B  01-02-2022     success
Answered By: Ynjxsjmh

Use transform('last'):

df[df.groupby(['Id', 'Date']).Status.transform('last') == 'success']
Answered By: Nuri Taş
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.