pandas – filter rows with same value in many columns

Question:

I have a pandas DataFrame with many columns (100+ columns, but the exact number doesn’t matter).

Most rows have the same value in every column, but some rows contain more than one unique value.
For example, in the following table, rows 1 and 2 have the same value in all columns, while row 3 has more than one distinct value across its columns.

column 1  column 2  column 3  column 4  column n
A         A         A         A         A
A         A         A         A         A
C         A         B         A         A

I want to filter out the rows that have only one unique value across all of their columns. In the example above, I would keep only row 3.
I know how to filter rows based on values in specific columns using masks, but that doesn’t seem to work in this case.
Any ideas?

Asked By: Ofek Glick

Answers:

Looks like you want to filter based on nunique with boolean indexing:

out = df[df.nunique(axis=1).ne(1)]

Output:

  column 1 column 2 column 3 column 4 column n
2        C        A        B        A        A

Intermediate:

df.nunique(axis=1).ne(1)

0    False
1    False
2     True
dtype: bool
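
To try this end to end, here is a minimal, self-contained sketch that rebuilds the example table from the question and applies the same nunique mask. The variable names mask, alt_mask and alt_out are only illustrative, and the second approach (comparing every column against the first one) is an alternative that is not part of the answer above; it may be worth testing on wide frames, but that is an assumption to verify on your own data.

```python
import pandas as pd

# Rebuild the example table from the question (column names copied from it).
df = pd.DataFrame(
    [list("AAAAA"), list("AAAAA"), list("CABAA")],
    columns=["column 1", "column 2", "column 3", "column 4", "column n"],
)

# Count the distinct values in each row; rows whose count is not 1 are mixed.
mask = df.nunique(axis=1).ne(1)
out = df[mask]

# Alternative (not from the answer above): a row is mixed if any of its
# columns differs from the value in its first column.
alt_mask = df.ne(df.iloc[:, 0], axis=0).any(axis=1)
alt_out = df[alt_mask]

print(out)
```

Note that the two masks can disagree when the data contains missing values: nunique ignores NaN by default (dropna=True), while the element-wise comparison treats NaN as different from everything, including another NaN.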
Answered By: mozway