Fit the data to multivariable linear regression in Python

Question:

I have the following data:

x1=[100, 100, 110, 110, 120, 120, 120, 130, 130, 130]
x2=[1, 2, 1, 2, 1, 2, 3, 1, 2, 3]
y=[113, 118, 127, 132, 136, 144, 138, 146, 156, 149]

And I want to fit a function having the form y=a0+a1*x1+a2*x2.

I managed to do it by defining the design matrix X, whose i-th row is [1, x1[i], x2[i]], and then computing [a0, a1, a2]' = inv(X'X) X'y, which gives a0=3.7161, a1=1.1013, a2=1.8516. However, I want to use linear_model.LinearRegression() as well.
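
For reference, the same normal-equations computation can be written directly in NumPy; this is a minimal sketch of the hand computation described above, assuming the data is entered as plain arrays:

import numpy as np

x1 = np.array([100, 100, 110, 110, 120, 120, 120, 130, 130, 130])
x2 = np.array([1, 2, 1, 2, 1, 2, 3, 1, 2, 3])
y = np.array([113, 118, 127, 132, 136, 144, 138, 146, 156, 149])

# Design matrix: a column of ones for a0, then the x1 and x2 columns
X = np.column_stack([np.ones(len(x1)), x1, x2])

# Normal equations: [a0, a1, a2]' = inv(X'X) X'y
coeffs = np.linalg.inv(X.T @ X) @ X.T @ y
print(coeffs)  # approximately [3.7161, 1.1013, 1.8516]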

I have a CSV file called multi_regress.csv that contains the following data:

x0,x1,x2,y
1.0,100.0,1.0,113.0
1.0,100.0,2.0,118.0
1.0,110.0,1.0,127.0
1.0,110.0,2.0,132.0
1.0,120.0,1.0,136.0
1.0,120.0,2.0,144.0
1.0,120.0,3.0,138.0
1.0,130.0,1.0,146.0
1.0,130.0,2.0,156.0
1.0,130.0,3.0,149.0

And I wrote the following code:

import pandas
from sklearn import linear_model

df = pandas.read_csv("multi_regress.csv")
X = df[['x0', 'x1', 'x2']]  # note: x0 is the column of ones
y = df['y']
regr = linear_model.LinearRegression()
regr.fit(X,y)
print(regr.coef_)

I got the following output:

[0.         1.10129032 1.8516129 ]

The first coefficient, a0, is not correct. Where is my mistake?

Asked By: Lee


Answers:

You’re looking for the intercept.

Try regr.intercept_ to get the value you want.

Alternatively, when you define your model, set fit_intercept=False, seeing as you've already added an intercept column to your data.
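
For example, here is a minimal sketch of both options, assuming the same multi_regress.csv as in the question:

import pandas
from sklearn import linear_model

df = pandas.read_csv("multi_regress.csv")
y = df['y']

# Option 1: keep the original fit and read the intercept separately
regr = linear_model.LinearRegression()
regr.fit(df[['x0', 'x1', 'x2']], y)
print(regr.intercept_)  # a0
print(regr.coef_)       # [0, a1, a2]; x0's weight is absorbed by the intercept

# Option 2: keep the x0 column of ones and disable the built-in intercept
regr2 = linear_model.LinearRegression(fit_intercept=False)
regr2.fit(df[['x0', 'x1', 'x2']], y)
print(regr2.coef_)      # [a0, a1, a2]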

Answered By: s_pike