Finding optimal weights in regression
Question:
I am new to linear regression and sklearn.
I have a problem where input feature x1 contains 101 ones, input feature x2 contains 100 ones followed by a zero, and the output y is 101 ones.
I am trying to find the optimal value of w1 and w2.
I tried writing the below code:
from sklearn.linear_model import LinearRegression
import numpy as np
x = np.ones(202, dtype='int').reshape(101, 2)
x[100, 1] = 0
y = np.ones(101, dtype='int')
model = LinearRegression().fit(x, y)
print(f"w1 and w2: {model.coef_}")
The output I am getting is w1 and w2: [0. 0.], which I am sure is wrong.
It would be great if someone could help me correct the code.
Edit
The value of input x looks like below:
array([[1, 1],
       [1, 1],
       ...,
       [1, 1],
       [1, 0]])
(101 rows in total: 100 rows of [1, 1] followed by one row of [1, 0])
Answers:
The LinearRegression model includes an intercept term by default, which you have not accounted for. The intercept, also known as the bias term, is the b in the equation solved by the linear regression:
y = w1 * x1 + w2 * x2 + b
Here, w1 and w2 are the model coefficients stored in model.coef_, while the intercept is stored in model.intercept_.
To illustrate this, if you run your code and print the value of the intercept, you will notice that it takes a value of 1 when the model coefficients are zero:
from sklearn.linear_model import LinearRegression
import numpy as np
# Data
x = np.ones(202, dtype='int').reshape(101, 2)
x[100, 1] = 0
y = np.ones(101, dtype='int')
# Fit
model = LinearRegression().fit(x, y)
print(f"The coefficients are: {model.coef_}")
print(f"The intercept is: {model.intercept_}")
This will output:
The coefficients are: [0. 0.]
The intercept is: 1.0
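This result is not a bug: with w1 = w2 = 0 and b = 1, every prediction equals 1, which matches y exactly, so the least-squares error is zero. A quick sketch to verify this (rebuilding the same data as above):

```python
from sklearn.linear_model import LinearRegression
import numpy as np

# Same data as in the question
x = np.ones(202, dtype='int').reshape(101, 2)
x[100, 1] = 0
y = np.ones(101, dtype='int')

model = LinearRegression().fit(x, y)
pred = model.predict(x)

# Coefficients are (near) zero, the intercept carries the whole fit,
# and every prediction matches the target exactly.
print(np.allclose(pred, y))  # True
```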
If you set fit_intercept to False, you will see that the model coefficients take on the expected values:
model = LinearRegression(fit_intercept=False).fit(x, y)
print(f"The coefficients are: {model.coef_}")
print(f"The intercept is: {model.intercept_}")
Output:
The coefficients are: [1.00000000e+00 6.38320294e-17]
The intercept is: 0.0
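As a further cross-check (not part of the original answer), you can solve the same least-squares problem directly with np.linalg.lstsq. Because the column x1 is constant, it already plays the role of an intercept, so the no-intercept solution w1 = 1, w2 = 0 reproduces y exactly:

```python
import numpy as np

x = np.ones(202, dtype='int').reshape(101, 2)
x[100, 1] = 0
y = np.ones(101, dtype='int')

# Solve min ||x @ w - y||^2 with no separate intercept column
w, residuals, rank, sv = np.linalg.lstsq(x, y, rcond=None)
print(w)  # approximately [1., 0.]
```

This matches the coefficients produced by the fit_intercept=False model, up to floating-point noise.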