Difference between nonzero(a), where(a) and argwhere(a). When to use which?

Question:

In NumPy, nonzero(a), where(a) and argwhere(a), with a being a NumPy array, all seem to return the non-zero indices of the array. What are the differences between these three calls?

  • On argwhere the documentation says:

    np.argwhere(a) is the same as np.transpose(np.nonzero(a)).

    Why have a whole function that just transposes the output of nonzero? When would that be so useful that it deserves a separate function?

  • What about the difference between where(a) and nonzero(a)? Wouldn’t they return the exact same result?

Answers:

I can’t comment on the usefulness of having a separate convenience function that transposes the result of another, but I can comment on where vs nonzero. In its simplest use case, where is indeed the same as nonzero.

>>> np.where(np.array([[0,4],[4,0]]))
(array([0, 1]), array([1, 0]))
>>> np.nonzero(np.array([[0,4],[4,0]]))
(array([0, 1]), array([1, 0]))

or

>>> a = np.array([[1, 2],[3, 4]])
>>> np.where(a == 3)
(array([1]), array([0]))
>>> np.nonzero(a == 3)
(array([1]), array([0]))
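
If you want to check this equivalence yourself, a minimal sketch along these lines should do it. It reuses the array a from the example above; the variable names are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# With a single argument, np.where(cond) returns the same tuple of
# index arrays (one array per dimension) as np.nonzero(cond).
rows_w, cols_w = np.where(a == 3)
rows_n, cols_n = np.nonzero(a == 3)

assert np.array_equal(rows_w, rows_n)
assert np.array_equal(cols_w, cols_n)

# The tuple can be used directly to index the original array.
print(a[np.where(a == 3)])  # [3]
```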

where differs from nonzero when you want to pick elements from array a where some condition is True and from array b where that condition is False.

>>> a = np.array([[6, 4],[0, -3]])
>>> b = np.array([[100, 200], [300, 400]])
>>> np.where(a > 0, a, b)
array([[  6,   4],
       [300, 400]])

Again, I can’t explain why they added the nonzero functionality to where, but this at least explains how the two are different.

EDIT: Fixed the first example… my logic was incorrect previously

Answered By: SethMMorton

nonzero and argwhere both give you information about where in the array the elements are True. where works the same as nonzero in the form you have posted, but it has a second form:

np.where(mask,a,b)

which can be roughly thought of as a numpy “ufunc” version of the conditional expression:

a[i] if mask[i] else b[i]

(with appropriate broadcasting of a and b).
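
To make the analogy concrete, here is a small sketch that compares np.where(mask, a, b) against the conditional expression written out element by element. The arrays and the scalar b are made up for illustration, and the explicit loop is only there to show the semantics, not as something you would use in practice.

```python
import numpy as np

mask = np.array([[True, False], [False, True]])
a = np.array([[1, 2], [3, 4]])
b = -1  # a scalar; it is broadcast against mask and a

result = np.where(mask, a, b)

# The same selection spelled out one element at a time (slow, illustration only).
expected = np.empty_like(a)
for i in np.ndindex(mask.shape):
    expected[i] = a[i] if mask[i] else b

assert np.array_equal(result, expected)
print(result)
```

Here result keeps the diagonal of a and fills the remaining positions with -1.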

As for having both nonzero and argwhere, they’re conceptually different. nonzero is structured to return an object which can be used for indexing. This can be lighter-weight than keeping an entire boolean mask around if the 0’s are sparse:

mask = a == 0  # entire array of bools
mask = np.nonzero(a == 0)  # tuple of index arrays, one entry per zero element

Now you can use that mask to index other arrays, etc. However, as it is, it’s not very convenient for reading off which indices correspond to 0 elements, because the coordinates are split across one array per dimension. That’s where argwhere comes in: it returns one row of coordinates per matching element.
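
As a rough illustration of the two shapes of output, the sketch below runs both functions on the 2x2 array from the first answer (an assumed input, chosen here only because it is small).

```python
import numpy as np

a = np.array([[0, 4], [4, 0]])

idx = np.nonzero(a)      # tuple of arrays, one per dimension
coords = np.argwhere(a)  # 2-D array, one row of coordinates per nonzero element

# The result of nonzero plugs straight into indexing.
values = a[idx]

# The result of argwhere reads as a list of coordinates.
for row, col in coords:
    print(f"a[{row}, {col}] = {a[row, col]}")

# The two results are transposes of each other.
assert np.array_equal(coords, np.transpose(idx))
```

So nonzero is the natural choice when the result feeds back into indexing, and argwhere when you want to loop over or inspect the coordinates themselves.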

Answered By: mgilson