Check if two scipy.sparse.csr_matrix are equal

Question:

I want to check if two csr_matrix are equal.

If I do:

x.__eq__(y)

I get:

raise ValueError("The truth value of an array with more than one "
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all().

This, however, works well:

assert (z in x for z in y)

Is there a better way to do it, maybe using an optimized SciPy function instead?

Thanks so much

Asked By: omerbp

Answers:

Can we assume they are the same shape?

In [202]: a=sparse.csr_matrix([[0,1],[1,0]])
In [203]: b=sparse.csr_matrix([[0,1],[1,1]])
In [204]: (a!=b).nnz==0   
Out[204]: False

This checks the sparsity of the inequality array.

It will give you an efficiency warning if you try a==b (at least the 1st time you use it). That’s because it has to test all those zeros. It can’t take much advantage of the sparsity.

You need a relatively recent version of SciPy to use comparison operators like this. Were you trying to use x.__eq__(y) in an if expression, or did you get the error from that expression alone?

In general you probably want to check several parameters first: same shape, same nnz, same dtype. You need to be careful with floats.
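A minimal sketch of that idea, combining a shape check with the (a != b).nnz test from above. The helper name csr_equal is made up for illustration and is not a SciPy function. It deliberately skips the nnz and dtype pre-checks: nnz counts stored entries, so a matrix holding explicit zeros can have a different nnz and still be equal in value, and whether a dtype mismatch should count as "not equal" depends on your use case.

```python
from scipy import sparse


def csr_equal(a, b):
    """Hypothetical helper: True if a and b have the same shape and values."""
    # Different shapes can never be equal, and this check is essentially free.
    if a.shape != b.shape:
        return False
    # (a != b) is itself sparse and stores an entry only where the values differ.
    return (a != b).nnz == 0


a = sparse.csr_matrix([[0, 1], [1, 0]])
b = sparse.csr_matrix([[0, 1], [1, 1]])
c = sparse.csr_matrix([[0, 1, 0], [1, 0, 0]])

print(csr_equal(a, a.copy()))  # same shape, same values
print(csr_equal(a, b))         # same shape, one differing entry
print(csr_equal(a, c))         # different shape
```

This compares exact values, so for floats it has the same caveat as any == comparison.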

For dense arrays np.allclose is a good way of testing equality. And if the sparse arrays aren’t too large, that might work well too:

np.allclose(a.A, b.A)

allclose uses all(less_equal(abs(x-y), atol + rtol * abs(y))). You can use a-b, but I suspect that this too will give an efficiency warning.
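For larger matrices, a tolerance check can stay sparse by working on a-b directly. The sketch below is one possible approach, not np.allclose itself: csr_allclose is a hypothetical name, it applies an absolute tolerance only (no rtol term), and it assumes the matrices have at least one row and one column.

```python
from scipy import sparse


def csr_allclose(a, b, atol=1e-8):
    """Hypothetical helper: True if every entry of a and b differs by at most atol."""
    if a.shape != b.shape:
        return False
    # The difference is sparse too; abs() keeps it sparse.
    diff = abs(a - b)
    # max() on a sparse matrix also accounts for the implicit zeros.
    return bool(diff.max() <= atol)


a = sparse.csr_matrix([[0.0, 1.0], [1.0, 0.0]])

near = a.copy()
near.data = near.data + 1e-12  # tiny perturbation of the stored values

far = sparse.csr_matrix([[0.0, 1.0], [1.0, 1.0]])

print(csr_allclose(a, near))
print(csr_allclose(a, far))
```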

Answered By: hpaulj

SciPy and Numpy Hybrid Method

What worked best for my case was (using a generic code example):

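bool_answer = np.array_equal(sparse_matrix_1.todense(), sparse_matrix_2.todense())

You might need to pay attention to the equal_nan parameter of np.array_equal. Note that todense() builds full dense copies of both matrices, so this only suits matrices that fit in memory.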
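A small illustration of why that parameter matters, assuming a NumPy version recent enough to accept equal_nan in np.array_equal (1.19 or later). It uses toarray() instead of todense() so the result is a plain ndarray; the matrices here are made up for the example.

```python
import numpy as np
from scipy import sparse

# Two identical matrices that each store a NaN.
m1 = sparse.csr_matrix(np.array([[0.0, np.nan], [1.0, 0.0]]))
m2 = m1.copy()

# By default NaN != NaN, so identical matrices compare as unequal.
strict = np.array_equal(m1.toarray(), m2.toarray())

# With equal_nan=True, NaNs in matching positions count as equal.
nan_ok = np.array_equal(m1.toarray(), m2.toarray(), equal_nan=True)

print(strict, nan_ok)
```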

The following doc references helped me get there:
CSR Sparse Matrix Methods
CSC Sparse Matrix Methods
NumPy array_equal function
SciPy todense method

Answered By: Thom Ives