compare two data frames and get only non matching values with index and column names pandas dataframe python

Question:

df1-

ID Name  Number
0  AAA    123
1  BBB    456
2  CCC    789

df2-

ID Name  Number
0  AAA    123
1  BBB    456
2  CCC    **963**    <----- Non Matching value

want to compare above two data frames df1 and df2
want the result in below format: need only non matching value with column name.

expected output:
ID Number                        
2  963   

can anyone help me with the code
i am new to pandas, please help me out
Thanks you soo much…

Asked By: Amaresh puri

||

Answers:

You can use .merge() with indicator turned on and filter the result by the indicator, as follows:

df3 = df2.merge(df1, how='left', indicator=True)
df3[df3['_merge'] == 'left_only'][['ID', 'Number']]

Result:

   ID  Number
2   2     963

Edit

If you have multiple columns and would not like to specify the column names to highlight the differences, you can try:

df2[(df1 != df2)].dropna(how='all', axis=1).dropna(how='all', axis=0)

Demo

df1

   ID Name  Number1  Number2  Number3
0   0  AAA      123       12     1111
1   1  BBB      456       22     2222
2   2  CCC      789       32     3333


df2

   ID Name  Number1  Number2  Number3
0   0  AAA      123       12     1111
1   1  BBB      456       22     2255
2   2  CCC      963       32     3333


df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0)


   Number1  Number3
1      NaN   2255.0
2    963.0      NaN

You can see from the non_NaN values the differences. The ID is the index on the left.

Edit 2

If your numbers are all integers and you don’t want Pandas to show the integers as float type together with NaN values, you can use:

df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).fillna('').astype(str).replace(r'.0', '', regex=True)


  Number1 Number3
1            2255
2     963        

Or, simply use:

df2[df1 != df2].dropna(how='all', axis=1).dropna(how='all', axis=0).astype('Int64')


   Number1  Number3
1     <NA>     2255
2      963     <NA>
Answered By: SeaBean

You can use the following

df2[df1.Number != df2.Number][['ID', 'Number']]
Answered By: valentin

You can Extract the data whatever you want from the output, which has the details of all mismatches

enter image description here

Answered By: Arpan Saini
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.