How to transform numpy.matrix or array to scipy sparse matrix

Question:

For SciPy sparse matrix, one can use todense() or toarray() to transform to NumPy matrix or array. What are the functions to do the inverse?

I searched, but got no idea what keywords should be the right hit.

Asked By: Flake

||

Answers:

You can pass a numpy array or matrix as an argument when initializing a sparse matrix. For a CSR matrix, for example, you can do the following.

>>> import numpy as np
>>> from scipy import sparse
>>> A = np.array([[1,2,0],[0,0,3],[1,0,4]])
>>> B = np.matrix([[1,2,0],[0,0,3],[1,0,4]])

>>> A
array([[1, 2, 0],
       [0, 0, 3],
       [1, 0, 4]])

>>> sA = sparse.csr_matrix(A)   # Here's the initialization of the sparse matrix.
>>> sB = sparse.csr_matrix(B)

>>> sA
<3x3 sparse matrix of type '<type 'numpy.int32'>'
        with 5 stored elements in Compressed Sparse Row format>

>>> print sA
  (0, 0)        1
  (0, 1)        2
  (1, 2)        3
  (2, 0)        1
  (2, 2)        4
Answered By: David Alber

There are several sparse matrix classes in scipy.

bsr_matrix(arg1[, shape, dtype, copy, blocksize]) Block Sparse Row matrix
coo_matrix(arg1[, shape, dtype, copy]) A sparse matrix in COOrdinate format.
csc_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Column matrix
csr_matrix(arg1[, shape, dtype, copy]) Compressed Sparse Row matrix
dia_matrix(arg1[, shape, dtype, copy]) Sparse matrix with DIAgonal storage
dok_matrix(arg1[, shape, dtype, copy]) Dictionary Of Keys based sparse matrix.
lil_matrix(arg1[, shape, dtype, copy]) Row-based linked list sparse matrix

Any of them can do the conversion.

import numpy as np
from scipy import sparse
a=np.array([[1,0,1],[0,0,1]])
b=sparse.csr_matrix(a)
print(b)

(0, 0)  1
(0, 2)  1
(1, 2)  1

See http://docs.scipy.org/doc/scipy/reference/sparse.html#usage-information .

Answered By: cyborg

As for the inverse, the function is inv(A), but I won’t recommend using it, since for huge matrices it is very computationally costly and unstable. Instead, you should use an approximation to the inverse, or if you want to solve Ax = b you don’t really need A-1.

In Python, the Scipy library can be used to convert the 2-D NumPy matrix into a Sparse matrix. SciPy 2-D sparse matrix package for numeric data is scipy.sparse

The scipy.sparse package provides different Classes to create the following types of Sparse matrices from the 2-dimensional matrix:

  1. Block Sparse Row matrix
  2. A sparse matrix in COOrdinate format.
  3. Compressed Sparse Column matrix
  4. Compressed Sparse Row matrix
  5. Sparse matrix with DIAgonal storage
  6. Dictionary Of Keys based sparse matrix.
  7. Row-based list of lists sparse matrix
  8. This class provides a base class for all sparse matrices.

CSR (Compressed Sparse Row) or CSC (Compressed Sparse Column) formats support efficient access and matrix operations.

Example code to Convert Numpy matrix into Compressed Sparse Column(CSC) matrix & Compressed Sparse Row (CSR) matrix using Scipy classes:

import sys                 # Return the size of an object in bytes
import numpy as np         # To create 2 dimentional matrix
from scipy.sparse import csr_matrix, csc_matrix 
# csr_matrix: used to create compressed sparse row matrix from Matrix
# csc_matrix: used to create compressed sparse column matrix from Matrix

create a 2-D Numpy matrix

A = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
print("Dense matrix representation: n", A)
print("Memory utilised (bytes): ", sys.getsizeof(A))
print("Type of the object", type(A))

Print the matrix & other details:

Dense matrix representation: 
 [[1 0 0 0 0 0]
 [0 0 2 0 0 1]
 [0 0 0 2 0 0]]
Memory utilised (bytes):  184
Type of the object <class 'numpy.ndarray'>

Converting Matrix A to the Compressed sparse row matrix representation using csr_matrix Class:

S = csr_matrix(A)
print("Sparse 'row' matrix: n",S)
print("Memory utilised (bytes): ", sys.getsizeof(S))
print("Type of the object", type(S))

The output of print statements:

Sparse 'row' matrix:
(0, 0) 1
(1, 2) 2
(1, 5) 1
(2, 3) 2
Memory utilised (bytes): 56
Type of the object: <class 'scipy.sparse.csr.csc_matrix'>

Converting Matrix A to Compressed Sparse Column matrix representation using csc_matrix Class:

S = csc_matrix(A)
print("Sparse 'column' matrix: n",S)
print("Memory utilised (bytes): ", sys.getsizeof(S))
print("Type of the object", type(S))

The output of print statements:

Sparse 'column' matrix:
(0, 0) 1
(1, 2) 2
(2, 3) 2
(1, 5) 1
Memory utilised (bytes): 56
Type of the object: <class 'scipy.sparse.csc.csc_matrix'>

As it can be seen the size of the compressed matrices is 56 bytes and the original matrix size is 184 bytes.

For a more detailed explanation and code examples please refer to this article: https://limitlessdatascience.wordpress.com/2020/11/26/sparse-matrix-in-machine-learning/

Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.