Compare two Dataframe columns and retrive the mismatched columns in a new column

Question:

I am trying to compare two dataframes and create one column in the first dataframe says that mismatched columns. attaching the Dataframe images and expected output images below. Could I get some help here

enter image description here
attaching the expected outputs here.Need one column says that names of all mismatched columns

Asked By: Sujith M

||

Answers:

Assuming aligned indices, you can use:

df1['mismatch'] = df1.ne(df2).dot(df1.columns+',').str[:-1]

Alternatively:

out = (df1.ne(df2).reset_index().melt('index')
       .query('value').groupby('index')
       ['variable'].agg(','.join)
      )

df1['mismatch'] = out

Example output:

   A  B  C mismatch
0  1  2  3      A,B
1  4  5  6        A
2  7  8  9         

Used inputs:

# df1
   A  B  C
0  1  2  3
1  4  5  6
2  7  8  9

# df2
   A  B  C
0  0  0  3
1  0  5  6
2  7  8  9
Answered By: mozway

Applying two lambda loops first to check if cell values equal or not second to find out the column names with true value and join them.
Alternatively, instead of second lambda, You can even use .dot(). as @mozway has done.

Code:

df1.apply(lambda x: x!=df2.iloc[x.index][x.name]).apply(lambda x: ','.join(x.index[x]), axis=1)
Answered By: R. Baraiya
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.