Indexing with boolean arrays into multidimensional arrays using numpy

Question:

I am new to using numpy and one thing that I really don’t understand is indexing arrays.

In the tentative tutorial there is this example:

>>> from numpy import arange, array
>>> a = arange(12).reshape(3,4)
>>> b1 = array([False,True,True])             # first dim selection
>>> b2 = array([True,False,True,False])       # second dim selection
>>>
>>> a[b1,b2]                                  # a weird thing to do
array([ 4, 10])

I have no idea why it does that last thing. Can anyone explain that to me?

Thanks!

Asked By: mdlha


Answers:

Your array consists of:

0  1  2  3
4  5  6  7
8  9 10 11

One way of indexing it is with lists of integers, one list per dimension, specifying the row and the column of each element to pick:

>>> i1 = [1,2]
>>> i2 = [0,2]
>>> a[i1,i2]
array([ 4, 10])

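Meaning: the element at row 1, column 0 (which is 4) and the element at row 2, column 2 (which is 10). The two lists are paired up element by element; they do not select a whole block of rows and columns.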
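To make the pairing explicit, here is a small sketch that builds the same result by hand. The names paired and fancy are only illustrative and are not part of the original answer:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
i1 = [1, 2]  # row indices
i2 = [0, 2]  # column indices

# Pair the two lists element by element: (1, 0) and (2, 2)
paired = [int(a[r, c]) for r, c in zip(i1, i2)]

# Integer-array indexing does the same pairing in one step
fancy = a[i1, i2]

print(paired)          # [4, 10]
print(fancy.tolist())  # [4, 10]
```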

When you’re using boolean indices, you’re telling NumPy, position by position, which rows/columns to include and which to leave out:

>>> b1 = [False,True,True]       # 0:no,  1:yes, 2:yes       ==> [1,2]
>>> b2 = [True,False,True,False] # 0:yes, 1:no,  2:yes, 3:no ==> [0,2]

As you can see, this is equivalent to the i1 and i2 shown above. Hence, a[b1,b2] will have the same result.
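The conversion from boolean masks to integer positions can be checked directly with nonzero(). This is a minimal sketch; the variable names via_bool and via_int are only illustrative:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])
b2 = np.array([True, False, True, False])

# nonzero() returns a tuple with one index array per dimension;
# for a 1-D mask, element [0] holds the positions of the True values.
i1 = b1.nonzero()[0]
i2 = b2.nonzero()[0]

via_bool = a[b1, b2]  # indexing with the boolean masks
via_int = a[i1, i2]   # indexing with the equivalent integer arrays

print(i1.tolist(), i2.tolist())              # [1, 2] [0, 2]
print(via_bool.tolist(), via_int.tolist())   # [4, 10] [4, 10]
```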

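Note also that the operation above works because b1 and b2 have the same number of True values, so they turn into two integer arrays of the same length that can be paired up. More generally, the integer arrays have to be broadcastable against each other, so a mask with exactly one True value can also be combined with a mask of any length; other mismatches raise an error.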
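As a sketch of what happens when the counts differ, the mask b3 below is a made-up example with three True values. If what you actually want is every selected row crossed with every selected column, np.ix_ is one way to get that, and it accepts boolean masks:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)
b1 = np.array([False, True, True])        # 2 True values -> [1, 2]
b2 = np.array([True, False, True, False]) # 2 True values -> [0, 2]
b3 = np.array([True, True, True, False])  # 3 True values -> [0, 1, 2] (hypothetical mask)

# [1, 2] and [0, 1, 2] cannot be paired up, so NumPy raises IndexError
try:
    a[b1, b3]
    mismatch_failed = False
except IndexError:
    mismatch_failed = True

# np.ix_ builds an open mesh: rows 1 and 2 crossed with columns 0 and 2
block = a[np.ix_(b1, b2)]

print(mismatch_failed)   # True
print(block.tolist())    # [[4, 6], [8, 10]]
```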

Answered By: mgibsonbr