# Calculate Matrix Rank using scipy

## Question:

I’d like to calculate the mathematical rank of a matrix using scipy. The most obvious function, `numpy.rank`, calculates the dimension of an array (i.e. scalars have dimension 0, vectors 1, matrices 2, etc.). I am aware that the `numpy.linalg.lstsq` function has this capability, but I was wondering if such a fundamental operation is built into the matrix class somewhere.

Here is an explicit example:

```
from numpy import matrix, rank
A = matrix([[1,3,7],[2,8,3],[7,8,1]])
print rank(A)
```

This gives `2`, the dimension, whereas I’m looking for an answer of `3`.

## Answers:

I don’t know about Numpy in particular, but that’s unlikely to be a built-in operation on a matrix; it involves fairly intensive numerical computations (and associated concerns about floating-point roundoff error and so forth) and threshold selections that may or may not be appropriate in a given context, and algorithm selection is important to computing it accurately and quickly.

Things that are built into the basic classes tend to be things that can be performed in a unique and straightforward manner, such as matrix multiplications at the most complex.

The **linear algebra functions are generally grouped in `numpy.linalg`**. (They’re also available from `scipy.linalg`, which has more functionality.) This allows polymorphism: the functions can accept any of the types that SciPy handles. So, yes, **the `numpy.linalg.lstsq` function** does what you’re asking: it returns the rank of the coefficient matrix as part of its result. Why is that insufficient?
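As a minimal sketch of reading the rank off `numpy.linalg.lstsq`: the function returns a tuple whose third element is the effective rank of the coefficient matrix. The right-hand side `b` here is a dummy vector introduced only for illustration, since `lstsq` requires one.

```python
import numpy as np

# The matrix from the question, as a plain float array.
A = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]], dtype=float)

# Dummy right-hand side (an assumption for illustration; only the rank is of interest).
b = np.zeros(3)

# lstsq returns (solution, residuals, rank, singular_values).
x, residuals, rank, singular_values = np.linalg.lstsq(A, b, rcond=None)
print(rank)
```

Solving a least-squares problem just to get the rank is roundabout, which is presumably why a dedicated function was asked for.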

**This answer is out of date.**

The answer is no: there is currently no function dedicated to calculating the matrix rank of an array/matrix in scipy. Adding one has been discussed before, but if it’s going to happen, I don’t believe it has yet.

Here is a rough code snippet for people who need to get this done in practice. Feel free to improve it.

```
import numpy as np
u, s, v = np.linalg.svd(A)  # s holds the singular values of A
rank = np.sum(s > 1e-10)    # count those above a fixed tolerance
```

If `numpy` does not offer a rank facility, why don’t you write your own?

An efficient way to compute the rank is via the Singular Value Decomposition – the rank of the matrix is equal to the number of non-zero singular values.

```
import numpy

def rank(A, eps=1e-12):
    # Rank = number of singular values whose magnitude exceeds eps
    u, s, vh = numpy.linalg.svd(A)
    return len([x for x in s if abs(x) > eps])
```

Notice that `eps` depends on your application – most would agree that 1e-12 corresponds to zero, but you may witness numerical instability even for eps=1e-9.

Using your example, the answer is three. If you change the second row to `[2, 6, 14]` (linearly dependent with row one) the answer is two (the “zero” singular value is 4.9960E-16).
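A fixed `eps` is an absolute threshold, so it can misjudge matrices whose entries are all very small or very large. One possible variation, sketched below, scales the tolerance by the largest singular value, the matrix size, and machine epsilon. The function name `svd_rank` and its `tol` parameter are hypothetical names chosen for this example, and the sketch assumes a non-empty 2-D input.

```python
import numpy as np

def svd_rank(A, tol=None):
    """Count singular values above a tolerance (hypothetical helper)."""
    A = np.asarray(A, dtype=float)
    # Only the singular values are needed, so skip computing U and Vh.
    s = np.linalg.svd(A, compute_uv=False)
    if tol is None:
        # Relative tolerance: largest singular value * largest dimension * machine eps.
        tol = s.max() * max(A.shape) * np.finfo(A.dtype).eps
    return int(np.sum(s > tol))

full = [[1, 3, 7], [2, 8, 3], [7, 8, 1]]        # matrix from the question
dependent = [[1, 3, 7], [2, 6, 14], [7, 8, 1]]  # second row = 2 * first row

rank_full = svd_rank(full)
rank_dependent = svd_rank(dependent)
# Scaling the whole matrix should not change its rank under a relative tolerance.
rank_tiny = svd_rank(1e-15 * np.array(full, dtype=float))
print(rank_full, rank_dependent, rank_tiny)
```

With an absolute threshold such as `1e-12`, the scaled-down matrix in the last call would be reported as rank zero, because every one of its singular values falls below the cutoff.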

Numpy provides `numpy.linalg.matrix_rank()`:

```
>>> import numpy
>>> numpy.__version__
'1.5.1'
>>> A = numpy.matrix([[1,3,7],[2,8,3],[7,8,1]])
>>> numpy.linalg.matrix_rank(A)
3
```
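For comparison, here is a small sketch of `matrix_rank` on plain arrays, including the rank-deficient variant of the question's matrix and the optional `tol` argument. The `tol=5.0` value is deliberately far too large and is used only to show that singular values below the threshold are not counted.

```python
import numpy as np

full = np.array([[1, 3, 7], [2, 8, 3], [7, 8, 1]])
deficient = np.array([[1, 3, 7], [2, 6, 14], [7, 8, 1]])  # second row = 2 * first row

# Default tolerance is derived from the largest singular value and machine eps.
rank_full = np.linalg.matrix_rank(full)
rank_deficient = np.linalg.matrix_rank(deficient)

# An explicit tolerance: singular values <= tol are treated as zero.
rank_loose = np.linalg.matrix_rank(full, tol=5.0)

print(rank_full, rank_deficient, rank_loose)
```

In most cases the default tolerance is a reasonable starting point, and `tol` is only worth setting when the noise level of the data is known.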

`scipy` now contains an efficient interpolative method for estimating the rank of a matrix/LinearOperator using random methods, which can often be accurate enough:

```
>>> from numpy import matrix
>>> A = matrix([[1,3,7],[2,8,3],[7,8,1]], dtype=float) # doesn't accept int
>>> import scipy.linalg.interpolative as sli
>>> sli.estimate_rank(A, eps=1e-10)
3
```