pandas dataframe: remove all rows whose keys appear in another dataframe

Question:

I have a pandas dataframe like the one below:

dataframe 1 (name: df)

[image: df]

As you can see, each (A, B, C) combination has n values of X and V.

I then built an outlier dataframe like this:

df_outlier = df[(df["V"] > 150)]

Then I want to remove from df every (A, B, C) combination that appears in df_outlier.

For example, if df_outlier looks like this:

[image: df_outlier]

I want to remove the rows below from the original dataframe:

[image: rows of df to remove]

First, I tried the code below:

df_filtered = pd.merge(df, df_outlier, indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)

However, it only removes the exact rows that are in df_outlier, not all rows whose (A, B, C) combination appears in df_outlier.

Sorry for my poor English; please let me know if anything is hard to understand.

Asked By: hjsg1010

||

Answers:

Just select the key columns from df_outlier, so the merge matches on (A, B, C) only:

df_filtered = pd.merge(df, df_outlier[['A','B','C']], indicator=True, how = 'outer').query('_merge=="left_only"').drop(['_merge'],axis=1)
Answered By: BENY
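
For reference, here is a minimal runnable sketch of the same idea on made-up data. The original screenshots are not available, so the sample values, the `keys` list and the names `outlier_keys` and `df_filtered_alt` are assumptions for illustration only. It shows the key-only merge written as a left anti-join, plus one possible alternative that uses `groupby(...).transform("max")` instead of a merge.

```python
import pandas as pd

# Hypothetical sample data (the original images are not available).
df = pd.DataFrame({
    "A": [1, 1, 1, 2, 2, 3, 3],
    "B": ["x", "x", "x", "y", "y", "z", "z"],
    "C": [10, 10, 10, 20, 20, 30, 30],
    "X": [0, 1, 2, 0, 1, 0, 1],
    "V": [100, 200, 120, 90, 95, 160, 170],
})
keys = ["A", "B", "C"]

df_outlier = df[df["V"] > 150]

# Approach 1: anti-join on the key columns only.
# drop_duplicates() keeps one row per outlier group, so the merge
# cannot multiply rows of df when a group has several outliers.
outlier_keys = df_outlier[keys].drop_duplicates()
merged = df.merge(outlier_keys, on=keys, how="left", indicator=True)
df_filtered = merged[merged["_merge"] == "left_only"].drop(columns="_merge")

# Approach 2 (alternative, no merge): keep only the groups whose
# largest V does not exceed the threshold. This keeps df's own index.
group_max = df.groupby(keys)["V"].transform("max")
df_filtered_alt = df[group_max <= 150]

print(df_filtered)
print(df_filtered_alt)
```

With this sample data, both results contain only the two rows of the (2, "y", 20) group. The merge-based version works for any df_outlier, while the groupby version assumes the outlier rule is the `V > 150` threshold from the question.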