Scipy sparse matrices – purpose and usage of different implementations

Question:

Scipy has many different types of sparse matrices available. What are the most important differences between these types, and what is the difference in their intended usage?

I’m developing code in Python based on sample code [1] in Matlab. One section of the code uses sparse matrices, which seem to have a single (annoying) type in Matlab, and I’m trying to figure out which type I should use [2] in Python.


1: This is for a class. Most people are doing the project in Matlab, but I like to create unnecessary work and confusion — apparently.

2: This is an academic question: I have the code working properly with the ‘CSR’ format, but I’m interested in knowing what the optimal usages are.

Asked By: DilithiumMatrix


Answers:

Sorry if I’m not answering this completely enough, but hopefully I can provide some insight.

CSC (Compressed Sparse Column) and CSR (Compressed Sparse Row) are more compact and efficient, but difficult to construct “from scratch”. COO (Coordinate) and DOK (Dictionary of Keys) are easier to construct, and can then be converted to CSC or CSR via matrix.tocsc() or matrix.tocsr().
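As a rough sketch of that workflow (the matrix shape and values here are made up purely for illustration), you might build the matrix in COO or DOK form and convert it once construction is finished:

```python
import numpy as np
from scipy import sparse

# Hypothetical (row, col, value) triplets for a 3x4 matrix.
rows = np.array([0, 0, 1, 2])
cols = np.array([0, 3, 1, 2])
vals = np.array([1.0, 2.0, 3.0, 4.0])

# COO: built in one shot from the triplet arrays.
coo = sparse.coo_matrix((vals, (rows, cols)), shape=(3, 4))

# DOK: built incrementally, one entry at a time, like a dictionary.
dok = sparse.dok_matrix((3, 4), dtype=np.float64)
dok[0, 0] = 1.0
dok[0, 3] = 2.0
dok[1, 1] = 3.0
dok[2, 2] = 4.0

# Convert to a compressed format before doing arithmetic or slicing.
csr = coo.tocsr()
csc = dok.tocsc()
```

Both conversions here describe the same matrix; which construction format is more convenient depends on whether the entries are available all at once (COO) or arrive one by one (DOK).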

CSC is generally more efficient for accessing column vectors and for column operations, because it stores the matrix column by column: for each column, the row indices of its nonzero entries and their values.

CSR matrices are the opposite: they are stored row by row (for each row, the column indices of its nonzero entries and their values), and are more efficient for accessing row vectors and for row operations.
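To make the storage difference concrete, here is a small sketch (the example matrix is arbitrary) that inspects the underlying arrays of each format and slices along the direction each one favors:

```python
import numpy as np
from scipy import sparse

# Arbitrary small example matrix.
dense = np.array([[1, 0, 2],
                  [0, 0, 3],
                  [4, 5, 0]])

csr = sparse.csr_matrix(dense)
csc = sparse.csc_matrix(dense)

# CSR: indptr[i]:indptr[i+1] delimits row i; indices holds column numbers.
csr_indptr = csr.indptr.tolist()
csr_indices = csr.indices.tolist()
csr_data = csr.data.tolist()

# CSC: indptr[j]:indptr[j+1] delimits column j; indices holds row numbers.
csc_indptr = csc.indptr.tolist()
csc_indices = csc.indices.tolist()
csc_data = csc.data.tolist()

# Row slicing is the natural operation for CSR, column slicing for CSC.
row_1 = csr[1, :].toarray()
col_2 = csc[:, 2].toarray()
```

Note that the `data` arrays hold the same values in a different order: CSR lists them row by row, CSC column by column. Slicing the "wrong" way (columns of a CSR matrix, rows of a CSC matrix) still works, but it is typically slower on large matrices.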

Answered By: Will