why do we use fit() before predict() in LinearRegression

Question:

Why do we use fit() before calling a LinearRegression model's predict() method? I mean, in this book I didn't even give it a proper training dataset.

I mean, when we call fit(), doesn't it just fit the data? Why do we need to predict something that is already known?
This is just confusing.
For reference, if you know what this means, kindly teach me too.

Here is the reference. I just need to know why we use fit() and then predict(). Like, aren't we already giving housing_labels, the supposed outputs, to the model? Why are we predicting them afterward?

Is it for the sole purpose of checking whether our LinearRegression model is doing well or not, because we give it the outputs (housing_labels), then predict and compare the two? Why? Why don't we predict() on the test set directly?

Asked By: Asrar


Answers:

Before making a prediction with your linear model, you must fit it to the training data you have.

It makes no sense to predict before the model has seen any training samples.
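As a minimal sketch of that order (toy data standing in for the book's housing set, not its actual code):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X_train = np.array([[1.0], [2.0], [3.0], [4.0]])  # toy features
y_train = np.array([2.0, 4.0, 6.0, 8.0])          # toy labels

model = LinearRegression()
model.fit(X_train, y_train)     # learns coefficients from the training data
print(model.predict([[5.0]]))   # only now can the model predict -> ~[10.]
```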

Answered By: SystemSigma_

When you use the fit method, the model learns by trying to achieve the least possible error on your training data set. But you won't always (in fact, rarely) be able to reach zero error.

Thus you can use the predict method on the training data in order to obtain the training error. If the training/test/validation sets are well made, this can give you a feel for under-fitting (if the training error is too big, your model might not be suited to the problem, or you might not have enough training data) or over-fitting (if the training error is near zero, your model might be picking up on the noise present in the training data, for example). Using predict on data from the training set has no effect on the training of the model, though.

To really check if the model has trained well, you should predict on a test set (and maybe a validation set too if need be).
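A sketch of that workflow, assuming synthetic data in place of the book's housing set: predicting on the training data gives the training error, while predicting on the held-out test set estimates how well the model generalizes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 1))
y = 3.0 * X.ravel() + 5.0 + rng.normal(scale=2.0, size=200)  # noisy linear data

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)

train_mse = mean_squared_error(y_train, model.predict(X_train))  # training error
test_mse = mean_squared_error(y_test, model.predict(X_test))     # generalization error
print(f"train MSE: {train_mse:.2f}, test MSE: {test_mse:.2f}")
```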

Again, fit makes your model learn, while predict only applies what the model has learned. Thus it makes little sense to use predict before fit.

When talking about LinearRegression, the fit method is the one that determines the values of the coefficients a and b in the regressor's equation y = ax + b, according to the training data. Once you have these coefficients, you can use the equation to predict the y value for any x. And if you want an idea of its performance, you'll try x values for which you already know the y values (i.e. data in the training set).
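For illustration (a toy dataset, not the book's example), after fit the learned a and b are exposed as coef_ and intercept_, and predict simply applies y = ax + b:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 5.0, 7.0])           # exactly y = 2*x + 1

model = LinearRegression().fit(X, y)
a, b = model.coef_[0], model.intercept_
print(a, b)                                   # ~2.0 and ~1.0
print(model.predict([[10.0]]))                # applies y = a*x + b -> ~[21.]
```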

Answered By: GregoirePelegrin