Array filtering with conditions
Question:
(I'm not sure the title is correct)
I have a numpy.array f as follows:
# id frame x y z
What I want to do is extract the trajectories for a specific id. For id==1, for instance, I use:
f_1 = f[ f[:,0]==1 ]
and get
array([[ 1. , 55. , 381.51 , -135.476 , 163.751 ],
[ 1. , 56. , 369.176 , -134.842 , 163.751 ],
[ 1. , 57. , 357.499 , -134.204 , 163.751 ],
[ 1. , 58. , 346.65 , -133.786 , 163.751 ],
[ 1. , 59. , 336.602 , -133.762 , 163.751 ],
[ 1. , 60. , 326.762 , -135.157 , 163.751 ],
[ 1. , 61. , 315.77 , -135.898 , 163.751 ],
[ 1. , 62. , 303.806 , -136.855 , 163.751 ],
[ 1. , 63. , 291.273 , -138.255 , 163.751 ],
[ 1. , 64. , 278.767 , -139.824 , 163.751 ],
[ 1. , 65. , 266.778 , -141.123 , 163.751 ],
[ 1. , 66. , 255.773 , -142.42 , 163.751 ],
[ 1. , 67. , 244.864 , -143.314 , 163.751 ]])
My problem is that I'm not sure I understand how it works. I would normally have expected something like:
f_1 = f[ f[:,0]==1, : ]
which also works and makes more sense to me (take all columns, but only those rows that fulfill the condition).
Can someone explain why this form also works and what exactly happens?
f_1 = f[ f[:,0]==1 ]
Answers:
For a 2D array, supplying only one index returns the row (with all its columns) at that index, so that:
np.all( a[0] == a[0,:] )
#True
When you do a[:,0]==1 (comparing the first column, as in the question) you get a boolean array with one entry per row, like:
b = a[:,0]==1
#array([True, True, False, False, True], dtype=bool)
You can use this array as a mask through boolean (fancy) indexing to obtain all the rows whose index has a corresponding True value in b. In this example, doing:
c = a[ b ]
will get the rows at indices [0,1,4]. The same result would be obtained by passing these indices directly, like c = a[ [0,1,4] ].
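As a minimal sketch of the points above, the snippet below uses a small made-up array a (its values are purely illustrative and not from the question). The mask is built from the first column, a[:,0]==1, so it has one entry per row. np.flatnonzero is used here only to recover the positions of the True values for the comparison.

```python
import numpy as np

# Hypothetical 5x3 array; the first column plays the role of an id.
a = np.array([[1, 10, 100],
              [1, 11, 101],
              [2, 12, 102],
              [3, 13, 103],
              [1, 14, 104]])

# A single index on a 2D array returns the whole row.
row_equal = np.array_equal(a[0], a[0, :])

# Comparing the first column with 1 yields one boolean per row.
b = a[:, 0] == 1

# Boolean mask indexing keeps the rows where b is True ...
c_mask = a[b]

# ... which matches integer-array indexing with the positions of the True values.
idx = np.flatnonzero(b)
c_idx = a[idx]

print(row_equal)
print(b)
print(idx)
print(np.array_equal(c_mask, c_idx))
```

With this particular array, idx comes out as the positions 0, 1 and 4, and both indexing forms return the same three rows.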
Quoting from the Tentative Numpy Tutorial:
…When fewer indices are provided than the number of axes, the missing indices are considered complete slices…
So f[f[:,0]==1] gets translated to f[f[:,0]==1,:] (or equivalently, to f[f[:,0]==1,...]), which are all the same thing from the programmer's perspective.
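A quick way to convince yourself is to compare the three forms directly. The sketch below assumes a small stand-in for f with the columns id, frame, x, y, z. The two id==1 rows reuse values from the question, while the other rows are made up for illustration.

```python
import numpy as np

# Hypothetical stand-in for f with columns: id, frame, x, y, z.
# Rows with id 2 and 3 are invented purely for this example.
f = np.array([[1., 55., 381.51,  -135.476, 163.751],
              [2., 55., 120.0,    -80.0,   163.751],
              [1., 56., 369.176, -134.842, 163.751],
              [3., 56.,  40.0,     10.0,   163.751]])

# One boolean per row: True where the id column equals 1.
mask = f[:, 0] == 1

f_short = f[mask]          # missing index treated as a complete slice
f_slice = f[mask, :]       # explicit complete slice over the columns
f_ellipsis = f[mask, ...]  # Ellipsis expands to the remaining axes

print(f_short.shape)
print(np.array_equal(f_short, f_slice))
print(np.array_equal(f_short, f_ellipsis))
```

All three expressions select the same two rows with all five columns, so the choice between them is a matter of readability.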