How to do left outer join exclusion in pandas

Question:

I have two dataframes, A and B, and I want to get those in A but not in B, just like the one right below the top left corner.

The one below the top left

Dataframe A has columns ['a','b' + others] and B has columns ['a','b' + others]. There are no NaN values. I tried the following:

1.

dfm = dfA.merge(dfB, on=['a','b'])
dfe = dfA[(~dfA['a'].isin(dfm['a']) | (~dfA['b'].isin(dfm['b'])

2.

dfm = dfA.merge(dfB, on=['a','b'])
dfe = dfA[(~dfA['a'].isin(dfm['a']) & (~dfA['b'].isin(dfm['b'])

3.

dfe = dfA[(~dfA['a'].isin(dfB['a']) | (~dfA['b'].isin(dfB['b'])

4.

dfe = dfA[(~dfA['a'].isin(dfB['a']) & (~dfA['b'].isin(dfB['b'])

but when I get len(dfm) and len(dfe), they don’t sum up to dfA (it’s off by a few numbers). I’ve tried doing this on dummy cases and #1 works, so maybe my dataset may have some peculiarities I am unable to reproduce.

What’s the right way to do this?

Asked By: irene

||

Answers:

I think it would go something like the examples in: Pandas left outer join multiple dataframes on multiple columns

dfe = pd.merge(dFA, dFB, how='left', on=['a','b'], indicator=True)
dfe[dfe['_merge'] == 'left_only']
Answered By: Justin

Check out this link

df = pd.merge(dfA, dfB, on=['a','b'], how="outer", indicator=True)
df = df[df['_merge'] == 'left_only']

One liner :

df = pd.merge(dfA, dfB, on=['a','b'], how="outer", indicator=True
              ).query('_merge=="left_only"')
Answered By: phi
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.