How to perform element-wise Boolean operations on NumPy arrays
Question:
For example, I would like to create a mask that masks elements with value between 40 and 60:
foo = np.asanyarray(range(100))
mask = (foo < 40).__or__(foo > 60)
This just looks ugly. I can’t write
(foo < 40) or (foo > 60)
because I end up with:
ValueError Traceback (most recent call last)
...
----> 1 (foo < 40) or (foo > 60)
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Is there a canonical way of doing element-wise Boolean operations on NumPy arrays with good looking code?
Answers:
Try this:
mask = (foo < 40) | (foo > 60)
Note: the __or__ method of an object overloads the bitwise OR operator (|), not the Boolean or operator.
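To make the difference concrete, here is a minimal sketch that reuses foo from the question; the names error_message and mask are only illustrative. The parentheses around each comparison are needed because | binds more tightly than < and >.

```python
import numpy as np

foo = np.arange(100)

# `or` asks Python for the truth value of each whole array, which NumPy
# refuses to give for arrays with more than one element.
try:
    (foo < 40) or (foo > 60)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# `|` dispatches to ndarray.__or__, which works element by element.
mask = (foo < 40) | (foo > 60)

print(error_message)
print(mask.dtype, int(mask.sum()))
```

Without the parentheses, foo < 40 | foo > 60 is parsed as foo < (40 | foo) > 60, a chained comparison that runs into the same ValueError.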
If you are combining only Booleans, as in your example, you can use the bitwise OR operator | as suggested by Jcollado. But beware: this can give you strange results if you ever use non-Booleans, such as mask = (foo < 40) | override. You are fine only as long as override is guaranteed to contain nothing but False, True, 1, or 0.
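A small sketch of what "strange results" can look like. The override array below is hypothetical data made up for illustration, and converting it with astype(bool) is one possible fix, not the only one.

```python
import numpy as np

foo = np.arange(5)                      # [0 1 2 3 4]
override = np.array([0, 2, 0, 2, 0])    # hypothetical non-Boolean flags

# Bitwise OR of a Boolean array with an integer array yields integers,
# so the result is no longer a Boolean mask.
bitwise = (foo < 2) | override

# Indexing with an integer array is fancy indexing, not masking.
picked_by_position = foo[bitwise]

# Converting to Boolean first restores element-wise logical behaviour.
safe = (foo < 2) | override.astype(bool)
picked_by_mask = foo[safe]

# Bitwise AND has a similar trap: 1 and 2 are both truthy, yet 1 & 2 == 0.
and_trap = np.array([1]) & np.array([2])

print(bitwise, picked_by_position)
print(safe, picked_by_mask)
print(and_trap)
```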
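More general is the use of NumPy’s reduction functions np.any and np.all, applied along axis 0 to a list of conditions. They treat any nonzero value as True, so they also work with non-Boolean inputs. This snippet returns all values between 35 and 45 which are less than 40 or not a multiple of 3: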
import numpy as np
foo = np.arange(35, 46)
mask = np.any([(foo < 40), (foo % 3)], axis=0)
print(foo[mask])
OUTPUT: [35 36 37 38 39 40 41 43 44]
It is not as nice as with |, but nicer than the code in your question.
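The same pattern extends to any number of conditions, and np.all gives the element-wise AND. A rough sketch, where the third condition and the variable names are made up for illustration; np.logical_or.reduce is shown as an equivalent spelling of the np.any call.

```python
import numpy as np

foo = np.arange(35, 46)

# Three conditions stacked in a list; each is a Boolean array of the same shape.
conditions = [foo < 40, foo % 3 != 0, foo % 2 == 0]

any_mask = np.any(conditions, axis=0)        # True where at least one holds
all_mask = np.all(conditions, axis=0)        # True where every one holds
reduced = np.logical_or.reduce(conditions)   # same result as any_mask

print(foo[any_mask])
print(foo[all_mask])
```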
Note that you can use ~ for elementwise negation:
arr = np.array([False, True])
~arr
OUTPUT: array([ True, False], dtype=bool)
Also, & does elementwise and:
arr_1 = np.array([False, False, True, True])
arr_2 = np.array([False, True, False, True])
arr_1 & arr_2
OUTPUT: array([False, False, False, True], dtype=bool)
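Combining the two operators gives another way to write the mask from the question. This is only a sketch, and it assumes "between 40 and 60" is meant inclusively, which matches the original (foo < 40) | (foo > 60).

```python
import numpy as np

foo = np.arange(100)

# Elements inside the closed interval [40, 60].
inside = (foo >= 40) & (foo <= 60)

# Negating it selects everything outside the interval.
outside = ~inside

print(int(inside.sum()), int(outside.sum()))
```

Note that ~ only acts as a logical NOT on Boolean arrays; on integer arrays it flips bits, so ~np.array([1]) gives -2 rather than False.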
These also work with pandas Series:
import pandas as pd
ser_1 = pd.Series([False, False, True, True])
ser_2 = pd.Series([False, True, False, True])
ser_1 & ser_2
OUTPUT:
0    False
1    False
2    False
3     True
dtype: bool
You can use the NumPy logical operations. In your example:
np.logical_or(foo < 40, foo > 60)
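For reference, a short sketch of the related functions np.logical_and and np.logical_not next to np.logical_or. Unlike the bitwise operators, these interpret non-Boolean inputs by truthiness. The variable names are illustrative.

```python
import numpy as np

foo = np.arange(100)

either = np.logical_or(foo < 40, foo > 60)     # same as (foo < 40) | (foo > 60)
both = np.logical_and(foo >= 40, foo <= 60)    # same as (foo >= 40) & (foo <= 60)
neither = np.logical_not(either)               # same as ~either

# Non-Boolean inputs are treated as True when nonzero.
truthy = np.logical_and([1, 0, 2], [2, 5, 0])

print(int(either.sum()), int(both.sum()))
print(truthy)
```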