Boolean Indexing with multiple conditions


I have a pandas DataFrame (DF) where I need to filter out the rows that contain values == 0 for feature ‘a’ or feature ‘b’.

In order to inspect the values, I run the following:

DF1 = DF[DF['a'] == 0]

Which returns the right values. Similarly, by doing this:

DF2 = DF[DF['b'] == 0]

I can see the 0 values for feature ‘b’.

However, if I try to combine these two filters in a single line of code using the OR operator:

DF3 = DF[DF['a'] == 0 |  DF['b'] == 0]

I get this:

TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]

What’s happening here?

Asked By: user5730994



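The error comes from operator precedence rather than from the data types of your columns. In Python, | binds more tightly than ==, so your expression tries to evaluate 0 | DF['b'] first, and that fails on a float64 column. You do not need to convert ‘a’ or ‘b’ to another dtype; the solution that preserves the data type of your features is to wrap each comparison in parentheses: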

DF3 = DF[(DF['a'] == 0) | (DF['b'] == 0)]

A common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. These must be grouped by using parentheses.
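
As a minimal sketch of the three operators, here is a small made-up DataFrame (the values are hypothetical; only the column names ‘a’ and ‘b’ come from the question). The last part also shows, under the assumption that the columns are float64 as in the question, that the unparenthesized expression fails because Python evaluates 0 | DF['b'] before either comparison:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
DF = pd.DataFrame({"a": [0.0, 1.5, 2.0, 0.0],
                   "b": [3.0, 0.0, 4.0, 0.0]})

# OR: rows where 'a' is 0 or 'b' is 0 (index 0, 1, 3 here)
either_zero = DF[(DF["a"] == 0) | (DF["b"] == 0)]

# AND: rows where both 'a' and 'b' are 0 (index 3 here)
both_zero = DF[(DF["a"] == 0) & (DF["b"] == 0)]

# NOT: rows where neither 'a' nor 'b' is 0 (index 2 here)
neither_zero = DF[~((DF["a"] == 0) | (DF["b"] == 0))]

# Without parentheses, | is applied before ==, so pandas is asked to
# compute 0 | DF["b"] on a float64 column and raises a TypeError.
try:
    DF[DF["a"] == 0 | DF["b"] == 0]
    unparenthesized_failed = False
except TypeError:
    unparenthesized_failed = True
```

If the goal is to drop the zero rows rather than inspect them, the ~ version (neither_zero above) is the one to keep.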

Answered By: Luis Miguel