Indexing numpy multidimensional arrays

Question:

I need to access this numpy array, sometimes taking only the rows where the last column is 0 and sometimes only the rows where the last column is 1.

import numpy as np

y = np.array([[0,  0, 0, 0],
              [1,  2, 1, 1],
              [2, -6, 0, 1],
              [3,  4, 1, 0]])

I have to do this over and over, but would prefer to avoid creating duplicate arrays or recalculating each time. Is there some way that I can identify the indices concerned once and just reuse them? So that I can do this:

>>> print(y[LAST_COLUMN_IS_0])
[[0 0 0 0]
 [3 4 1 0]]

>>> print(y[LAST_COLUMN_IS_1])
[[ 1  2  1  1]
 [ 2 -6  0  1]]

P.S. The number of columns in the array never changes, it’s always going to have 4 columns.

Asked By: Zach

Answers:

You can use numpy’s boolean indexing to identify which rows you want to select, and numpy’s fancy indexing/slicing to select the whole row.

print(y[y[:, -1] == 0, :])
print(y[y[:, -1] == 1, :])

You can save y[:,-1] == 0 and y[:,-1] == 1 in variables and reuse them as often as you like, since they are just numpy arrays of booleans.
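As a minimal sketch of saving the masks, assuming the array from the question: the variable names last_column_is_0 and last_column_is_1 are made up here to mirror the question and are not part of numpy.

```python
import numpy as np

# The array from the question.
y = np.array([[0,  0, 0, 0],
              [1,  2, 1, 1],
              [2, -6, 0, 1],
              [3,  4, 1, 0]])

# Compute each boolean mask once (illustrative names, not a numpy API).
last_column_is_0 = y[:, -1] == 0
last_column_is_1 = y[:, -1] == 1

# Reuse the stored masks whenever the rows are needed.
rows_0 = y[last_column_is_0]
rows_1 = y[last_column_is_1]
print(rows_0)
print(rows_1)
```

Keep in mind that the masks reflect the values y had when they were computed, so they would need to be rebuilt if the last column changes. Indexing with a boolean mask also returns a copy of the selected rows rather than a view, so only the mask itself is shared between calls.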

(The y[:,-1] selects the whole of the last column, and the == equality check happens element-wise, resulting in an array of booleans.)
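The intermediate steps can be inspected one at a time. The sketch below assumes the array from the question. Its last step uses np.flatnonzero to turn the mask into integer row indices, which is an alternative not mentioned in the answer, in case explicit indices are preferred over a boolean mask.

```python
import numpy as np

# The array from the question.
y = np.array([[0,  0, 0, 0],
              [1,  2, 1, 1],
              [2, -6, 0, 1],
              [3,  4, 1, 0]])

last_column = y[:, -1]            # 1-D array holding the last column
mask = last_column == 0           # element-wise comparison gives booleans
row_indices = np.flatnonzero(mask)  # integer positions where mask is True

print(last_column)
print(mask)
print(row_indices)

# Selecting by integer indices gives the same rows as selecting by mask.
print(y[row_indices])
```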

Answered By: huon