Python/Scikit-Learn – Can't handle mix of multiclass and continuous

Question:

I’m trying to fit an SGDRegressor to my data and then check the accuracy. The fitting works fine, but then the predictions are not in the same datatype(?) as the original target data, and I get the error

ValueError: Can't handle mix of multiclass and continuous

When calling print "Accuracy:", ms.accuracy_score(y_test,predictions).

The data looks like this (just 200 thousand + rows):

Product_id/Date/product_group1/Price/Net price/Purchase price/Hour/Quantity/product_group2
0   107 12/31/2012  10  300 236 220 10  1   108

The code is as follows:

from sklearn.preprocessing import StandardScaler
import numpy as np
from sklearn.linear_model import SGDRegressor
import numpy as np
from sklearn import metrics as ms

msk = np.random.rand(len(beers)) < 0.8

train = beers[msk]
test = beers[~msk]

X = train [['Price', 'Net price', 'Purchase price','Hour','Product_id','product_group2']]
y = train[['Quantity']]
y = y.as_matrix().ravel()

X_test = test [['Price', 'Net price', 'Purchase price','Hour','Product_id','product_group2']]
y_test = test[['Quantity']]
y_test = y_test.as_matrix().ravel()

clf = SGDRegressor(n_iter=2000)
clf.fit(X, y)
predictions = clf.predict(X_test)
print "Accuracy:", ms.accuracy_score(y_test,predictions)

What should I do differently? Thank you!

Asked By: lte__

||

Answers:

Accuracy is a classification metric. You can’t use it with a regression. See the documentation for info on the various metrics.

Answered By: BrenBarn

Accuracy score is only for classification problems. For regression problems you can use: R2 Score, MSE (Mean Squared Error), RMSE (Root Mean Squared Error).

Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.