What are the parameters for sklearn's score function?

Question:

I recently looked at a bunch of sklearn tutorials, which were all similar in that they scored the goodness of fit by:

clf.fit(X_train, y_train)
clf.score(X_test, y_test)

And it’ll spit out:

0.92345...

or some other score.

I am curious as to the parameters of the clf.score function or how it scores the model. I looked all over the internet, but can’t seem to find documentation for it. Does anyone know?

Asked By: tooty44


Answers:

I am not sure I understood your question correctly. To compute an error or a similarity measure, most scoring functions receive an array of reference values (y_true) and an array of values predicted by your model (y_score) as their main parameters, but they may also accept additional parameters specific to the metric. Scoring functions usually do not need the X values.

I would suggest looking into the source code of the scoring functions to understand how they work.

Here is a list of scoring functions in scikit-learn.
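
For illustration, here is a minimal sketch of calling metric functions directly on made-up reference and predicted labels; note that no X values are involved:

from sklearn.metrics import accuracy_score, mean_squared_error

# reference labels and the labels a hypothetical model predicted
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

print(accuracy_score(y_true, y_pred))      # 0.8 (fraction of matches)
print(mean_squared_error(y_true, y_pred))  # 0.2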

Answered By: newtover

It takes a feature matrix X_test and the expected target values y_test. Predictions for X_test are compared with y_test, and either the mean accuracy (for classifiers) or the R² score (for regression estimators) is returned.

This is stated very explicitly in the docstrings for score methods. The one for classification reads

Returns the mean accuracy on the given test data and labels.

Parameters
----------
X : array-like, shape = (n_samples, n_features)
    Test samples.

y : array-like, shape = (n_samples,)
    True labels for X.

sample_weight : array-like, shape = [n_samples], optional
    Sample weights.

Returns
-------
score : float
    Mean accuracy of self.predict(X) wrt. y.

and the one for regression is similar.
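
As a sketch of the classifier case (iris data and logistic regression are used here purely for illustration), score() is equivalent to predicting on X and computing the mean accuracy yourself:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(*load_iris(return_X_y=True), random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# score() predicts on X_test internally and compares the predictions with y_test
print(clf.score(X_test, y_test))
print(accuracy_score(y_test, clf.predict(X_test)))  # same value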

Answered By: Fred Foo

This is classifier-dependent. Each classifier provides its own scoring function.

Estimator score method: Estimators have a score method providing a
default evaluation criterion for the problem they are designed to
solve. This is not discussed on this page, but in each estimator’s
documentation.

Apart from the documentation given in one of the other answers, the only additional thing you can do is read what parameters your estimator's score method accepts. For example, the SVM classifier SVC has the following signature: score(X, y, sample_weight=None)
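
For example, a small sketch with SVC on made-up data, including the optional sample_weight argument:

import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

clf = SVC().fit(X, y)
print(clf.score(X, y))                              # mean accuracy
print(clf.score(X, y, sample_weight=[1, 2, 2, 1]))  # weighted mean accuracy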

Answered By: Salvador Dali

Syntax:
sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None)

In multilabel classification, this function computes subset accuracy: the set of labels predicted for a sample must exactly match the corresponding set of labels in y_true.

Parameters:
y_true : 1d array-like, or label indicator array / sparse matrix
Ground truth (correct) labels.

y_pred : 1d array-like, or label indicator array / sparse matrix
Predicted labels, as returned by a classifier.

normalize : bool, optional (default=True)
If False, return the number of correctly classified samples. Otherwise, return the fraction of correctly classified samples.

sample_weight : array-like of shape = [n_samples], optional
Sample weights.

Returns:
score : float
If normalize == True, return the fraction of correctly classified samples (float), else returns the number of correctly classified samples (int).

The best performance is 1 with normalize == True and the number of samples with normalize == False.
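
A quick illustration of the normalize flag with made-up labels:

from sklearn.metrics import accuracy_score

y_true = [0, 1, 2, 3]
y_pred = [0, 2, 1, 3]

print(accuracy_score(y_true, y_pred))                   # 0.5 (fraction correct)
print(accuracy_score(y_true, y_pred, normalize=False))  # 2 (number correct)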

For more information you can refer to:
https://scikit-learn.org/stable/modules/model_evaluation.html#accuracy-score

Answered By: Hammad Basit

Here is the way the score is calculated for Regressor:

score(self, X, y, sample_weight=None)
Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.

From sklearn documentation.

https://scikit-learn.org/stable/modules/generated/sklearn.dummy.DummyRegressor.html#sklearn.dummy.DummyRegressor.score
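
As an illustration of the last point, a DummyRegressor that always predicts the mean of y gets an R^2 score of exactly 0.0 (made-up data below):

import numpy as np
from sklearn.dummy import DummyRegressor

X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([2.0, 4.0, 6.0, 8.0])

# always predicts y.mean(), so the residual and total sums of squares are equal
dummy = DummyRegressor(strategy="mean").fit(X, y)
print(dummy.score(X, y))  # 0.0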

Answered By: Eli Safra

Scikit-learn's model.score(X, y) is based on the coefficient of determination, R^2. It is a simple method that you call as model.score(X_test, y_test). It doesn't require a y_predicted value to be supplied externally; it calculates y_predicted internally and uses it in the computation.

This is how it is done:

u = ((y_test - y_predicted) ** 2).sum()
v = ((y_test - y_test.mean()) ** 2).sum()
score = 1 - (u / v)

and you get the score! Hope that helps.
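
Here is a rough, self-contained check of that formula (linear regression on made-up data; y_predicted comes from the model itself):

import numpy as np
from sklearn.linear_model import LinearRegression

X_test = np.array([[1.0], [2.0], [3.0], [4.0]])
y_test = np.array([1.5, 3.9, 6.1, 8.2])

# for illustration the model is fitted and scored on the same small data set
model = LinearRegression().fit(X_test, y_test)
y_predicted = model.predict(X_test)

u = ((y_test - y_predicted) ** 2).sum()
v = ((y_test - y_test.mean()) ** 2).sum()
print(1 - u / v)                    # manual R^2
print(model.score(X_test, y_test))  # same value from score()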

Answered By: Siddhesh Bhosale