Why does the `in` operator return a false positive when used on numpy arrays?
Question:
My overall objective is to check whether each row of a big array exists in a small array.
Using `in`, testing numpy arrays sometimes results in false positives, whereas it returns the correct result for Python lists.
item = [1, 2]
small = [[0,2], [5, 0]]
item in small
# False
import numpy as np
item_array = np.array(item)
small_array = np.array(small)
item_array in small_array
# True
Why does `in` return a false positive when using numpy arrays?
For context, the following is my attempt to check membership of items from one array in another array:
big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]])
small_array = np.array([[0, 2], [5, 0]])
# false positive for last item
[row in small_array for row in big_array]
# [True, False, True, False, True]
Answers:
Let’s walk through the example `np.array([1, 2]) in small_array`.
NumPy compares the item against every row of the small array element by element. First, it checks whether `1` appears anywhere in the first column (index 0) of the small array. It does not. Then it checks whether `2` appears anywhere in the second column (index 1). It does! Because at least one of these element comparisons is True, the whole expression returns True.
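The check described above can be reproduced by hand. The sketch below assumes that NumPy evaluates `x in arr` roughly as `(arr == x).any()`; the variable names `elementwise`, `any_match` and `row_match` are only illustrative.

```python
import numpy as np

small_array = np.array([[0, 2], [5, 0]])
item_array = np.array([1, 2])

# Broadcasting compares item_array against every row, element by element.
elementwise = small_array == item_array
# [[False,  True],
#  [False, False]]

# `in` is True as soon as any single element matches.
any_match = bool(elementwise.any())

# A whole-row match instead requires every element of some row to agree.
row_match = bool(elementwise.all(axis=1).any())

print(any_match, row_match)  # True False
```

The single `True` in the second column is what produces the false positive; requiring `all` along each row before taking `any` removes it.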
So `np.array([i, 2]) in small_array` will always return True for any `i`.
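For the original goal of testing each row of `big_array` for membership in `small_array`, one possible approach is to compare all pairs of rows with broadcasting and require a full-row match. This is a minimal sketch rather than the only way to do it; `pairwise` and `row_in_small` are names introduced here for illustration.

```python
import numpy as np

big_array = np.array([[5, 0], [1, -2], [0, 2], [-1, 3], [1, 2]])
small_array = np.array([[0, 2], [5, 0]])

# Insert an axis so each row of big_array is compared with each row of small_array.
# Shapes: (5, 1, 2) == (2, 2) broadcasts to (5, 2, 2).
pairwise = big_array[:, None, :] == small_array

# A row is a member if all of its elements match at least one row of small_array.
row_in_small = pairwise.all(axis=2).any(axis=1)

print(row_in_small.tolist())  # [True, False, True, False, False]
```

The intermediate `pairwise` array has one entry per pair of rows and per column, so for a very large `big_array` it may be worth processing it in chunks, or converting the rows of `small_array` to a set of tuples and testing each row of `big_array` against that set.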