How to filter numpy array by list of indices?

Question:

I have a numpy array, filtered__rows, comprised of LAS data [x, y, z, intensity, classification]. I have created a cKDTree of points and have found nearest neighbors, query_ball_point, which is a list of indices for the point and its neighbors.

Is there a way to filter filtered__rows to create an array of only points whose index is in the list returned by query_ball_point?

Asked By: Barbarossa

Answers:

It looks like you just need basic integer array indexing:

import numpy as np

filter_indices = [1, 3, 5]
np.array([11, 13, 155, 22, 0xff, 32, 56, 88])[filter_indices]  # array([13, 22, 32])
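
Applied to the question, the same indexing selects whole rows of a 2-D array. The sketch below is only an illustration: the data values are made up, and neighbor_indices is a hard-coded stand-in for the list that query_ball_point would return for one point.

```python
import numpy as np

# Hypothetical stand-in for the LAS array; columns are
# [x, y, z, intensity, classification].
filtered__rows = np.array([
    [0.0, 0.0, 1.0, 10.0, 2.0],
    [0.1, 0.0, 1.1, 12.0, 2.0],
    [5.0, 5.0, 3.0, 40.0, 5.0],
    [0.0, 0.1, 0.9, 11.0, 2.0],
])

# Assumed result of a query_ball_point call for a single point.
neighbor_indices = [0, 1, 3]

# Indexing with a list of integers keeps only those rows, in that order.
neighbors = filtered__rows[neighbor_indices]
print(neighbors.shape)  # (3, 5)
```

Each selected row keeps all five columns, so the result can be passed on like the original array.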
Answered By: Joran Beasley

Do you know how that translates for multi-dimensional arrays?

It extends to multi-dimensional arrays by supplying one 1-D index array per dimension. For a 2-D array:

filter_indices = np.array([[1, 0], [0, 1]])
array = np.array([[0, 1], [1, 2]])
print(array[filter_indices[:, 0], filter_indices[:, 1]])

will give you:
[1 1]

The NumPy documentation (hosted on the SciPy site) explains what happens if you call:
print(array[filter_indices])

Docs – https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html
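
As a rough sketch of the difference, reusing the same two arrays: a single 2-D integer array indexes only the first axis, so every entry selects a whole row, while one index array per axis selects individual elements.

```python
import numpy as np

array = np.array([[0, 1], [1, 2]])
filter_indices = np.array([[1, 0], [0, 1]])

# A single 2-D integer array indexes only the first axis:
# each entry of filter_indices picks an entire row of array.
rows = array[filter_indices]
print(rows.shape)  # (2, 2, 2)

# One index array per axis pairs the entries up and picks single elements.
elements = array[filter_indices[:, 0], filter_indices[:, 1]]
print(elements)  # [1 1]
```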

Answered By: Leaderchicken

numpy.take can be useful and works well for multidimensional arrays.

import numpy as np

filter_indices = [1, 2]
array = np.array([[1, 2, 3, 4, 5], 
                  [10, 20, 30, 40, 50], 
                  [100, 200, 300, 400, 500]])

axis = 0
print(np.take(array, filter_indices, axis))
# [[ 10  20  30  40  50]
#  [100 200 300 400 500]]

axis = 1
print(np.take(array, filter_indices, axis))
# [[  2   3]
#  [ 20  30]
#  [200 300]]
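
For comparison, a short sketch showing that np.take along an axis should give the same result as integer array indexing on that axis (the variable names here are only illustrative):

```python
import numpy as np

array = np.array([[1, 2, 3, 4, 5],
                  [10, 20, 30, 40, 50],
                  [100, 200, 300, 400, 500]])
filter_indices = [1, 2]

# axis=0 selects rows, the same as array[filter_indices].
rows_take = np.take(array, filter_indices, axis=0)
rows_index = array[filter_indices]

# axis=1 selects columns, the same as array[:, filter_indices].
cols_take = np.take(array, filter_indices, axis=1)
cols_index = array[:, filter_indices]

print(np.array_equal(rows_take, rows_index))  # True
print(np.array_equal(cols_take, cols_index))  # True
```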
Answered By: Keunwoo Choi

Following the docs: https://docs.scipy.org/doc/numpy-1.13.0/user/basics.indexing.html
The following implementation should work for a NumPy ndarray with any number of dimensions and any shape.

First we need a multi-dimensional set of indexes and some example data:

import numpy as np
y = np.arange(35).reshape(5, 7)
print('y:', y)
indexlist = [[0, 1], [0, 2], [3, 3]]
print('indexlist:', indexlist)

To extract the intuitive result, the trick is to transpose the index list:

indexlisttranspose = np.array(indexlist).T.tolist()
print('indexlist.T:', indexlisttranspose)
print('y[indexlist.T]:', y[tuple(indexlisttranspose)])

This produces the following terminal output:

y: [[ 0  1  2  3  4  5  6]
 [ 7  8  9 10 11 12 13]
 [14 15 16 17 18 19 20]
 [21 22 23 24 25 26 27]
 [28 29 30 31 32 33 34]]
indexlist: [[0, 1], [0, 2], [3, 3]]
indexlist.T: [[0, 0, 3], [1, 2, 3]]
y[indexlist.T]: [ 1  2 24]

Wrapping the index in tuple(...) avoids a FutureWarning, which we can trigger by indexing with the plain list:

print ('y[indexlist.T]:', y[ indexlisttranspose ])
FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`.
In the future this will be interpreted as an array index,
`arr[np.array(seq)]`, which will result either in an error or a
different result.
    print ('y[indexlist.T]:', y[ indexlisttranspose ])
y[indexlist.T]: [ 1  2 24]
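
To see the two interpretations the warning mentions side by side, here is a small sketch that spells both out explicitly, so it does not depend on how a given NumPy version treats a bare nested list. As far as I know, NumPy 1.23 and later use the array interpretation for a plain list instead of warning.

```python
import numpy as np

y = np.arange(35).reshape(5, 7)
indexlisttranspose = [[0, 0, 3], [1, 2, 3]]

# Tuple: one index sequence per axis, so this picks y[0, 1], y[0, 2], y[3, 3].
picked = y[tuple(indexlisttranspose)]
print(picked)  # [ 1  2 24]

# Array: every entry indexes the first axis and selects a whole row of y.
rows = y[np.array(indexlisttranspose)]
print(rows.shape)  # (2, 3, 7)
```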
Answered By: D Adams

The fastest way to do this is X[tuple(index.T)], where X is the ndarray holding the elements and index is an ndarray with one row of coordinates per element to retrieve.
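
A minimal sketch of that one-liner on a 3-D array, with made-up values for X and index (each row of index is assumed to hold one full coordinate):

```python
import numpy as np

X = np.arange(24).reshape(2, 3, 4)

# Each row is one (i, j, k) coordinate into X.
index = np.array([[0, 0, 0],
                  [1, 2, 3],
                  [0, 1, 2]])

# index.T has one row per axis; tuple() turns it into one index array per axis.
values = X[tuple(index.T)]
print(values)  # [ 0 23  6]
```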

Answered By: larsaars