regression coefficient using numpy

Question:

I’m trying to find out the regression coefficient in multiple linear regression.I’m using numpy module for this.I have dependant and independent values.what I’ve tried is given below

import numpy as np
y = [5.4,6.3,6.5,6.2,8.1,7.9,6.7,6.8,4.9,5.8]
X = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0, 1.0, 1.0,   1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]

b = y * X.T * np.linalg.inv(X*X.T)
print(b)

but it is giving an error like

Traceback (most recent call last):
File "C:/Python27/proj/new5.py", line 14, in <module>
b = y * X.T * np.linalg.inv(X*X.T)
AttributeError: 'list' object has no attribute 'T'

please help me to do this.

Asked By: naz

||

Answers:

The array can not be initialized like that.

you should use np.array

y = np.array([5.4,6.3,6.5,6.2,8.1,7.9,6.7,6.8,4.9,5.8])

then the T attribute will be there.

For numpy array, you can not use * to multiply, coz * is for element-wise multiplication.

If you are multiplying with matrix, like y * X.T should be written as y.dot(X.T)

Read this page about the difference in use of array and matrix in numpy.

Link

=====================================================

So you can get the best solution using the pseudo inverse:

if the svd of X^T is:

X^T = U*S*V^T ([compact svd][1])

Then:

b = V*S^-1*U^T*y

Here b and y are both column vector.

if you want them to be row vector, then just take transpose on both sides.

Answered By: Min Lin

Try this:

y = np.matrix([[1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]])

x = np.matrix([
     [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
     [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
     [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
     [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
     ])

b = y * x.T * np.linalg.inv(x*x.T)

Result:

>>> b
matrix([[ 1.57044377, -0.0617812 ,  0.23596693,  0.24238522]])

I tried applying the above to your data, but I get singular matrix error:

    raise LinAlgError, 'Singular matrix'
numpy.linalg.linalg.LinAlgError: Singular matrix

So it appears that some variables in your X are perfectly correlated.

Answered By: Akavall
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.