How do I use numpy.where()? What should I pass, and what does the result mean?

Question:

I tried reading the documentation for numpy.where(), but I’m still confused.

What should I pass for the condition, x and y values? When I pass only condition, what does the result mean and how can I use it? What about when I pass all three?

I found “How does python numpy.where() work?”, but it didn’t answer my question because it seems to be about the implementation rather than about how to use it. “Numpy where() on a 2D matrix” also didn’t explain things for me; I’m looking for a step-by-step explanation rather than a how-to guide for a specific case.

Please include examples with both 1D and 2D source data.

Answers:

After fiddling around for a while, I figured things out, and am posting them here hoping it will help others.

Intuitively, np.where is like asking “tell me where in this array the entries satisfy a given condition”.

>>> a = np.arange(5,10)
>>> np.where(a < 8)       # tell me where in a, entries are < 8
(array([0, 1, 2]),)       # answer: entries indexed by 0, 1, 2

It can also be used to get the entries in the array that satisfy the condition:

>>> a[np.where(a < 8)] 
array([5, 6, 7])          # selects from a entries 0, 1, 2
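
Since the one-argument result is a tuple of index arrays, it can be unpacked before use. Here is a minimal sketch (the names mask and idx are mine, only for illustration) showing that indexing with the unpacked index array or with the boolean mask directly selects the same entries:

```python
import numpy as np

a = np.arange(5, 10)          # array([5, 6, 7, 8, 9])
mask = a < 8                  # boolean array with the same shape as a

(idx,) = np.where(mask)       # unpack the 1-tuple to get the index array
print(idx)                    # [0 1 2]
print(a[idx])                 # [5 6 7]
print(a[mask])                # [5 6 7]
```

If only the values are needed, a[mask] is enough; np.where(mask) is useful when the positions themselves matter.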

When a is a 2d array, np.where() returns a tuple holding an array of row indices and an array of column indices:

>>> a = np.arange(4,10).reshape(2,3)
>>> a
array([[4, 5, 6],
       [7, 8, 9]])
>>> np.where(a > 8)
(array([1]), array([2]))

As in the 1d case, we can use np.where() to get entries in the 2d array that satisfy the condition:

>>> a[np.where(a > 8)] # selects from a the entry at row 1, column 2
array([9])
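
When several entries match, the i-th row index pairs with the i-th column index. Here is a rough sketch of turning the two arrays into (row, column) coordinates; the names rows, cols and coords are only for illustration, and the condition a > 6 is chosen so that more than one entry matches:

```python
import numpy as np

a = np.arange(4, 10).reshape(2, 3)   # [[4 5 6], [7 8 9]]
rows, cols = np.where(a > 6)         # one index array per dimension

# Pair the i-th row index with the i-th column index.
coords = list(zip(rows.tolist(), cols.tolist()))
print(coords)                        # [(1, 0), (1, 1), (1, 2)]
print(a[rows, cols])                 # [7 8 9]
```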


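Note: when a is 1d, np.where() still returns a tuple, but it holds only one index array, because there is only one dimension to index. In general the tuple has one array per dimension of a, which is why the 1d result above is printed as (array([0, 1, 2]),).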
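
A quick way to check this for any input is to compare the length of the returned tuple with the array's number of dimensions. A minimal sketch (the names one_d, two_d, res_1d and res_2d are only for illustration):

```python
import numpy as np

one_d = np.arange(5, 10)
two_d = np.arange(4, 10).reshape(2, 3)

res_1d = np.where(one_d < 8)
res_2d = np.where(two_d > 8)

# The tuple holds one index array per dimension of the input.
print(len(res_1d), one_d.ndim)   # 1 1
print(len(res_2d), two_d.ndim)   # 2 2
```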

Here is a little more fun. I’ve found that very often NumPy does exactly what I wish it would do – sometimes it’s faster for me to just try things than it is to read the docs. Actually a mixture of both is best.

I think your answer is fine (and it’s OK to accept it if you like). This is just “extra”.

import numpy as np

a = np.arange(4,10).reshape(2,3)

wh = np.where(a>7)
gt = a>7
x  = np.where(gt)

print("wh: ", wh)
print("gt: ", gt)
print("x:  ", x)

gives:

wh:  (array([1, 1]), array([1, 2]))
gt:  [[False False False]
      [False  True  True]]
x:   (array([1, 1]), array([1, 2]))

… and indexing a with each of them:

print("a[wh]: ", a[wh])
print("a[gt]  ", a[gt])
print("a[x]:  ", a[x])

gives:

a[wh]:  [8 9]
a[gt]   [8 9]
a[x]:   [8 9]
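
For the three-argument form that the question asks about: np.where(condition, x, y) does not return indices. It builds a new array, taking each element from x where the condition is True and from y where it is False, with x and y broadcast against the condition. Here is a minimal sketch reusing the same a; the names clipped and labels are only for illustration:

```python
import numpy as np

a = np.arange(4, 10).reshape(2, 3)   # [[4 5 6], [7 8 9]]

# Keep entries greater than 7, replace everything else with 0.
# x is the array itself; y is the scalar 0, broadcast to a's shape.
clipped = np.where(a > 7, a, 0)
print(clipped)
# [[0 0 0]
#  [0 8 9]]

# x and y can both be scalars, giving a labelled array with a's shape.
labels = np.where(a % 2 == 0, "even", "odd")
print(labels.tolist())
# [['even', 'odd', 'even'], ['odd', 'even', 'odd']]
```

The same call works unchanged on 1D input, since the choice is made element by element.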
Answered By: uhoh