Pandas Intersection after grouping based on common errors between each group

Question

I have the following dataframe:

I want to find intersections based on ID that consistently have errors in all the Run.
So, all IDs are repeating in all Runs. I tried to group data by Run first, then as per this similar question. I tried the code, but it doesn’t return an intersection.

 filter=all_data_df.groupby('Run')['Error'].transform('nunique') == all_data_df['Error'].nunique()
 df = all_data_df.loc[filter]

The results is same dataframe I started with.
How can this be fixed?

I am expecting to obtain

Where only ID 234534 consistently has errors.

Asked By: user_04248753498

||

Source

Answer 1

You can use boolean indexing to keep only the Ids that have errors in all runs :

# Below, I'm using `df` instead of `all_data_df`

consistent_errors = df["Error"].eq("Yes").groupby(df["Id"]).transform("all")

out = df.loc[consistent_errors]

Answered By: Timeless

Pandas Intersection after grouping based on common errors between each group

Question:

Answers: