Boolean Indexing with multiple conditions

Question:

I have a Pandas DF where I need to filter out some rows that contain values == 0 for feature ‘a’ and feature ‘b’.

In order to inspect the values, I run the following:

DF1 = DF[DF['a'] == 0]

Which returns the right values. Similarly, by doing this:

DF2 = DF[DF['b'] == 0]

I can see the 0 values for feature ‘b’.

However, if I try to combine these two in a single line of code using the OR operator:

DF3 = DF[DF['a'] == 0 |  DF['b'] == 0]

I get this:

TypeError: cannot compare a dtyped [float64] array with a scalar of type [bool]

What’s happening here?

Asked By: user5730994


Answers:

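The error comes from operator precedence: in Python, | binds more tightly than ==, so the expression evaluates 0 | DF['b'] first instead of comparing each column with 0. That is why the message complains about the column's dtype and a scalar. Converting the columns to another type does not address this. The fix, which also preserves the data type of your features, is to wrap each comparison in parentheses: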

DF3 = DF[(DF['a'] == 0) | (DF['b'] == 0)]
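As a quick check of the corrected line, here is a minimal sketch on a small made-up DataFrame. The sample values are an assumption for illustration only; the column names 'a' and 'b' follow the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
DF = pd.DataFrame({"a": [0.0, 1.5, 2.0, 0.0],
                   "b": [3.0, 0.0, 4.0, 0.0]})

# Each comparison is parenthesized, so it produces a boolean Series
# before the two Series are combined element-wise with |.
DF3 = DF[(DF["a"] == 0) | (DF["b"] == 0)]

print(DF3)  # rows where 'a' or 'b' equals 0
```

With this data, the rows at index 0, 1 and 3 are selected, since each has a zero in at least one of the two columns.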

A common operation is the use of boolean vectors to filter the data. The operators are: | for or, & for and, and ~ for not. Each condition must be wrapped in parentheses, because | and & bind more tightly than comparison operators such as ==.
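The other two operators work the same way. The sketch below uses the same kind of made-up data (an assumption, not the asker's real DataFrame) to show & and ~, and then uses plain integers to show how Python parses the expression when the parentheses are missing.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
DF = pd.DataFrame({"a": [0.0, 1.5, 2.0, 0.0],
                   "b": [3.0, 0.0, 4.0, 0.0]})

# and: rows where both columns are 0
both_zero = DF[(DF["a"] == 0) & (DF["b"] == 0)]

# not: rows where neither column is 0 (i.e. the zero rows filtered out)
no_zeros = DF[~((DF["a"] == 0) | (DF["b"] == 0))]

# Precedence illustrated with plain integers:
# without parentheses this is parsed as the chain 0 == (0 | 1) == 0
unparenthesized = 0 == 0 | 1 == 0
parenthesized = (0 == 0) | (1 == 0)

print(unparenthesized, parenthesized)
```

The two integer expressions give different results even though they look alike, which is the same parsing issue behind the TypeError in the question.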

Answered By: Luis Miguel