Regression coefficient using numpy
Question:
I'm trying to find the regression coefficients in a multiple linear regression, using the numpy module. I have dependent and independent values. What I've tried is given below:
import numpy as np
y = [5.4,6.3,6.5,6.2,8.1,7.9,6.7,6.8,4.9,5.8]
X = [[0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0], [1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0], [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 1.0], [1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0], [1.0, 1.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]
b = y * X.T * np.linalg.inv(X*X.T)
print(b)
But it gives an error like this:
Traceback (most recent call last):
File "C:/Python27/proj/new5.py", line 14, in <module>
b = y * X.T * np.linalg.inv(X*X.T)
AttributeError: 'list' object has no attribute 'T'
Please help me to do this.
Answers:
A plain Python list has no T attribute, so the data cannot be set up like that.
You should use np.array:
y = np.array([5.4,6.3,6.5,6.2,8.1,7.9,6.7,6.8,4.9,5.8])
Then the T attribute will be available (do the same for X).
For a numpy array, you cannot use * for matrix multiplication, because * is element-wise multiplication.
A matrix product such as y * X.T should be written as y.dot(X.T).
Read this page about the difference in use of array and matrix in numpy.
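A minimal sketch of the two points above, using a small made-up 2x3 array rather than the data from the question: converting a list with np.array gives access to .T, and .dot performs a matrix product where * multiplies element by element.

```python
import numpy as np

# Small made-up data (not the question's X): 2 variables, 3 observations.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Arrays have a .T attribute; plain lists do not.
print(X.T.shape)

elementwise = X * X    # element-by-element product, same shape as X
gram = X.dot(X.T)      # matrix product X X^T, shape (2, 2)

print(elementwise)
print(gram)
```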
=====================================================
You can get the least-squares solution using the pseudo-inverse.
If the SVD of X^T is:
X^T = U*S*V^T (compact SVD)
then:
b = V*S^-1*U^T*y
Here b and y are both column vectors.
If you want them as row vectors, just take the transpose of both sides.
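The formula above can be sketched in numpy as follows. The data here is random and made up for illustration; A stands in for X^T (observations in rows), and the comparison with np.linalg.pinv is only a sanity check.

```python
import numpy as np

# Made-up example data: A plays the role of X^T (observations in rows).
rng = np.random.RandomState(0)
A = rng.rand(8, 3)     # 8 observations, 3 variables
y = rng.rand(8)

# Compact SVD: A = U * diag(s) * Vt
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# b = V * S^-1 * U^T * y (dividing by s applies S^-1)
b_svd = Vt.T.dot(U.T.dot(y) / s)

# np.linalg.pinv builds the pseudo-inverse from an SVD as well.
b_pinv = np.linalg.pinv(A).dot(y)

print(b_svd)
print(np.allclose(b_svd, b_pinv))
```

Dividing by s like this is only safe when no singular value is zero or close to it. np.linalg.pinv drops such values, which is why it is the safer choice for rank-deficient data.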
Try this:
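import numpy as np

# Each row of x is one variable (the first row of ones is the intercept);
# each column is one observation.
y = np.matrix([[1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4]])
x = np.matrix([
[1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1],
[4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
[4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
[4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4]
])
# Normal equations in row-vector form: b = y x^T (x x^T)^-1
b = y * x.T * np.linalg.inv(x*x.T)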
Result:
>>> b
matrix([[ 1.57044377, -0.0617812 , 0.23596693, 0.24238522]])
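As a cross-check, the same fit can be obtained without forming the inverse explicitly. This is a sketch that reuses the data above with plain arrays instead of np.matrix and assumes a numpy version in which np.linalg.lstsq accepts rcond=None.

```python
import numpy as np

y = np.array([1,2,3,4,3,4,5,4,5,5,4,5,4,5,4,5,6,5,4,5,4,3,4], dtype=float)
x = np.array([
    [1] * 23,  # intercept row of ones, one entry per observation
    [4,2,3,4,5,4,5,6,7,4,8,9,8,8,6,6,5,5,5,5,5,5,5],
    [4,1,2,3,4,5,6,7,5,8,7,8,7,8,7,8,7,7,7,7,7,6,5],
    [4,1,2,5,6,7,8,9,7,8,7,8,7,7,7,7,7,7,6,6,4,4,4],
], dtype=float)

# Normal equations, as in the answer: b = y x^T (x x^T)^-1
b_normal = y.dot(x.T).dot(np.linalg.inv(x.dot(x.T)))

# Same problem solved by least squares on the design matrix x^T
# (observations in rows); no explicit inverse is formed.
b_lstsq = np.linalg.lstsq(x.T, y, rcond=None)[0]

print(b_normal)
print(b_lstsq)
```

np.linalg.lstsq is generally the more numerically robust of the two, since inverting x*x.T amplifies rounding error when the variables are nearly collinear.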
I tried applying the above to your data, but I get a singular matrix error:
raise LinAlgError, 'Singular matrix'
numpy.linalg.linalg.LinAlgError: Singular matrix
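So it appears that some variables in your X are perfectly correlated (linearly dependent). With 17 variables and only 10 observations, X*X.T is a 17x17 matrix of rank at most 10, so it cannot be inverted.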
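A sketch that checks this on the question's data and then uses the pseudo-inverse, which returns the minimum-norm least-squares coefficients instead of raising. X and y are copied from the question; treating each row of X as one variable is an assumption based on the formula used there.

```python
import numpy as np

y = np.array([5.4, 6.3, 6.5, 6.2, 8.1, 7.9, 6.7, 6.8, 4.9, 5.8])
X = np.array([
    [0, 0, 1, 0, 0, 0, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 0, 0, 0, 1, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0, 0, 1, 0, 0, 1],
    [1, 0, 1, 1, 1, 1, 0, 1, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1, 1, 0, 1],
    [1, 1, 0, 0, 1, 1, 1, 1, 1, 1],
    [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
], dtype=float)

# 17 variables (rows) but only 10 observations (columns).
print(X.shape)

# Rank of the 17x17 product; it cannot exceed the 10 observations.
rank = np.linalg.matrix_rank(X.dot(X.T))
print(rank)

# Minimum-norm least-squares coefficients via the pseudo-inverse of X^T.
b = np.linalg.pinv(X.T).dot(y)
print(b)
```

With more variables than observations the coefficients are not uniquely determined; the pseudo-inverse just picks the solution with the smallest norm. Dropping redundant variables or collecting more observations is the more meaningful fix.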