# How to find zero elements in a sparse matrix

## Question:

I know that `scipy.sparse.find(A)` returns three arrays, `I`, `J`, and `V`, containing the rows, columns, and values of the nonzero elements respectively.

What I want is a way to do the same (except for the `V` array) for all zero elements, without having to iterate through the matrix, since it's too large.

Assuming you have a scipy sparse array and have imported `find`:

``````
from itertools import product
from scipy.sparse import find

I, J, _ = find(your_sparse_array)
nonzero = set(zip(I, J))  # a set allows fast, repeatable membership tests
nrows, ncols = your_sparse_array.shape
for a, b in product(range(nrows), range(ncols)):
    if (a, b) not in nonzero:
        print(a, b)
``````

Make a small sparse matrix with 10% density (10% of the elements are nonzero):

``````
In [1]: from scipy import sparse
In [2]: M = sparse.random(10,10,.1)
In [3]: M
Out[3]:
<10x10 sparse matrix of type '<class 'numpy.float64'>'
with 10 stored elements in COOrdinate format>
``````

The 10 nonzero values:

``````
In [5]: sparse.find(M)
Out[5]:
(array([6, 4, 1, 2, 3, 0, 1, 6, 9, 6], dtype=int32),
array([1, 2, 3, 3, 3, 4, 4, 4, 5, 8], dtype=int32),
array([ 0.91828586,  0.29763717,  0.12771201,  0.24986069,  0.14674883,
0.56018409,  0.28643427,  0.11654358,  0.8784731 ,  0.13253971]))
``````

If, out of the 100 elements of the matrix, 10 are nonzero, then 90 elements are zero. Do you really want the indices of all of those?

`where` or `nonzero` on the dense equivalent gives the same indices:

``````
In [6]: A = M.A # dense
In [7]: np.where(A)
Out[7]:
(array([0, 1, 1, 2, 3, 4, 6, 6, 6, 9], dtype=int32),
array([4, 3, 4, 3, 3, 2, 1, 4, 8, 5], dtype=int32))
``````

And the indices of the 90 zero values:

``````
In [8]: np.where(A==0)
Out[8]:
(array([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4, 4, 5, 5,
5, 5, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7,
7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 9, 9, 9], dtype=int32),
array([0, 1, 2, 3, 5, 6, 7, 8, 9, 0, 1, 2, 5, 6, 7, 8, 9, 0, 1, 2, 4, 5, 6,
7, 8, 9, 0, 1, 2, 4, 5, 6, 7, 8, 9, 0, 1, 3, 4, 5, 6, 7, 8, 9, 0, 1,
2, 3, 4, 5, 6, 7, 8, 9, 0, 2, 3, 5, 6, 7, 9, 0, 1, 2, 3, 4, 5, 6, 7,
8, 9, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2, 3, 4, 6, 7, 8, 9], dtype=int32))
``````

That’s 2 arrays of shape (90,), or 180 integers, as opposed to the 100 values in the dense array itself. If your sparse matrix is too large to convert to dense, it will be too large to produce all the zero indices (assuming reasonable sparsity).
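Before asking for the zero indices, it may help to estimate how much memory they would need. The sketch below is an illustration only; `count_zeros` is a hypothetical helper name, and the byte estimate assumes the two index arrays use NumPy's default index type (`np.intp`).

```python
import numpy as np
from scipy import sparse


def count_zeros(M):
    """Number of zero elements, computed without building a dense array."""
    n_total = M.shape[0] * M.shape[1]
    # count_nonzero() ignores explicitly stored zeros, unlike the nnz attribute
    return n_total - M.count_nonzero()


M = sparse.csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
n_zero = count_zeros(M)

# Two index arrays (rows and columns) are needed to locate every zero
approx_bytes = 2 * n_zero * np.dtype(np.intp).itemsize
print(n_zero, "zeros; the index arrays would need about", approx_bytes, "bytes")
```

If the estimate is larger than the dense matrix would be, materializing the full set of zero indices is unlikely to be practical.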

`print(M)` shows the same triplets as `find`. The attributes of the `coo` format also give the nonzero indices:

``````
In [13]: M.row
Out[13]: array([6, 6, 3, 4, 1, 6, 9, 2, 1, 0], dtype=int32)
In [14]: M.col
Out[14]: array([1, 4, 3, 2, 3, 8, 5, 3, 4, 4], dtype=int32)
``````

(Sometimes manipulation of a matrix can set values to 0 without removing them from the attributes. So `find/nonzero` takes an added step to remove those, if any.)
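A small sketch of that situation, using an example matrix of my own: one stored value is overwritten with zero through the `data` attribute, which leaves it in the stored structure until `eliminate_zeros()` is called.

```python
import numpy as np
from scipy import sparse

M = sparse.csr_matrix(np.array([[1, 2, 0], [0, 0, 3], [4, 0, 5]]))

# Overwrite the first stored value (element [0, 0]) with zero.
# The entry stays in M.data / M.indices as an explicit zero.
M.data[0] = 0

stored_before = M.nnz              # counts stored entries, including the explicit zero
true_nonzero = M.count_nonzero()   # counts only values that are != 0
rows, cols, vals = sparse.find(M)  # find() filters out the explicit zero

M.eliminate_zeros()                # remove explicit zeros in place
stored_after = M.nnz

print(stored_before, true_nonzero, len(rows), stored_after)
```

This is why reading `M.row` and `M.col` (or `indices` and `indptr`) directly can report a position that actually holds a zero, while `find` and `nonzero` do not.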

We could apply `find` to `M==0` as well – but sparse will give us a warning.

``````
In [15]: sparse.find(M==0)
/usr/local/lib/python3.5/dist-packages/scipy/sparse/compressed.py:213: SparseEfficiencyWarning: Comparing a sparse matrix with 0 using == is inefficient, try using != instead.
", try using != instead.", SparseEfficiencyWarning)
``````

It’s the same thing that I’ve been warning about – the large size of this set. The resulting arrays are the same as in Out[8].
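If the zero indices are still needed, one possible alternative that avoids both the dense float array and the `M==0` comparison is to take the complement of the nonzero positions in flattened index space. This is only a sketch: `zero_indices` is a hypothetical helper name, and it still allocates an integer array with one entry per matrix element, so the size caveat applies here too.

```python
import numpy as np
from scipy import sparse


def zero_indices(M):
    """Return (rows, cols) of the zero elements of a sparse matrix."""
    nrows, ncols = M.shape
    I, J, _ = sparse.find(M)  # positions of the true nonzeros
    flat_nonzero = np.ravel_multi_index((I, J), (nrows, ncols))
    # Every flat position that is not a nonzero position is a zero
    flat_zero = np.setdiff1d(np.arange(nrows * ncols), flat_nonzero)
    return np.unravel_index(flat_zero, (nrows, ncols))


M = sparse.csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
rows, cols = zero_indices(M)
print(rows, cols)
```

The result is in row-major order, the same order `np.where(A==0)` uses on the dense equivalent.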

Here is my solution to find the indices for the zero values:

``````
from scipy.sparse import csr_matrix

# csrm is an existing csr_matrix: densify it, flag the zeros as 1, and re-sparsify
csrm_reversed = csr_matrix((csrm.toarray() == 0) * 1)
csrm_reversed.nonzero()
``````

For example:

``````
from scipy.sparse import csr_matrix
csrm = csr_matrix([[1,2,0],[0,0,3],[4,0,5]])
csrm.nonzero()
``````

You will get the nonzero indices:

``````
(array([0, 0, 1, 2, 2], dtype=int32), array([0, 1, 2, 0, 2], dtype=int32))
``````

Then, to find the zero indices:

``````
csrm_reversed = csr_matrix((csrm.toarray() == 0) * 1)
csrm_reversed.nonzero()
``````

You will get:

``````
(array([0, 1, 1, 2], dtype=int32), array([2, 0, 1, 1], dtype=int32))
``````

The dense format of the matrix is:

``````
[[1, 2, 0],
[0, 0, 3],
[4, 0, 5]]
``````
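For a matrix that is too large to densify, a variation on this idea is to produce the zero columns one row at a time, so that only one row's worth of indices exists at once. This is a sketch under my own assumptions, not part of the solution above: `iter_zero_columns` is a hypothetical helper name, and it works on a CSR copy of the input.

```python
import numpy as np
from scipy import sparse


def iter_zero_columns(M):
    """Yield (row, zero_columns) one row at a time (hypothetical helper)."""
    M = sparse.csr_matrix(M, copy=True)  # copy so the caller's matrix is untouched
    M.sum_duplicates()                   # merge repeated entries, if any
    M.eliminate_zeros()                  # drop explicitly stored zeros
    all_cols = np.arange(M.shape[1])
    for r in range(M.shape[0]):
        # Column indices of the stored (nonzero) entries in row r
        stored = M.indices[M.indptr[r]:M.indptr[r + 1]]
        yield r, np.setdiff1d(all_cols, stored)


csrm = sparse.csr_matrix([[1, 2, 0], [0, 0, 3], [4, 0, 5]])
for r, cols in iter_zero_columns(csrm):
    print(r, cols)
```

The Python-level loop over rows is slower than a single vectorized call, so this trades speed for a smaller peak memory footprint.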