Mean squared error (MSE) problem with calculation in dataframe

Question:

I am working with mean_squared_error from sklearn.metrics and I want to use the MSE function on a DataFrame.

I want to calculate the MSE element-wise between two columns from two different DataFrames.

When I try this:

mean_squared_error(train_df_test.iloc[:,:1], ideal_df_test.iloc[:,:1])

I simply get a single value back. I would have expected one individual value for each row of the column.

Like it works here:

train_df_test.iloc[:,:1] * ideal_df_test.iloc[:,:1]

Output:

| x      | y1 |
| ------ | -- |
| -20.00 | 10 |
| -19.99 | 20 |

What am I doing wrong here?

Asked By: Stefan


Answers:

The MSE takes the errors of your model, namely the differences between the real values and the predicted ones, squares them, and then takes the average.

It is correct to obtain a scalar value as output rather than a vector of values.

This is the formula:

$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2$$
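
For instance, here is a minimal sketch (the column values are made up purely for illustration) showing that mean_squared_error collapses the two columns into a single number:

import pandas as pd
from sklearn.metrics import mean_squared_error

# hypothetical columns, only to illustrate the scalar output
train_df_test = pd.DataFrame({"y1": [10.0, 20.0, 30.0]})
ideal_df_test = pd.DataFrame({"y1": [12.0, 18.0, 33.0]})

mse = mean_squared_error(train_df_test.iloc[:, :1], ideal_df_test.iloc[:, :1])
print(mse)  # a single float: (4 + 4 + 9) / 3 ≈ 5.67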

If you want a vector of squared errors as output (I am not sure what the use case would be), you can use the following function, which takes two columns as input:

def squared_errors(col_true, col_pred):
    # element-wise squared error, one value per row
    return (col_true - col_pred)**2
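
For example, applied to the same hypothetical columns as in the sketch above, it returns one squared error per row instead of a single aggregate:

per_row = squared_errors(train_df_test.iloc[:, 0], ideal_df_test.iloc[:, 0])
print(per_row)
# 0    4.0
# 1    4.0
# 2    9.0
# Name: y1, dtype: float64
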
Answered By: Matteo