Can any sklearn module return average precision and recall scores for negative class in k-fold cross validation?

Question:

I am trying to get the mean precision and recall for BOTH the positive and the negative class in 10-fold cross-validation. My model is a binary classifier.

I ran the code below, but unfortunately it only returns the mean precision and recall for the positive class. How can I tell the algorithm to also return the mean precision and recall scores for the negative class?

import numpy as np
from sklearn.metrics import make_scorer, accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import cross_validate

scoring = {'accuracy' : make_scorer(accuracy_score), 
           'precision' : make_scorer(precision_score),
           'recall' : make_scorer(recall_score), 
           'f1_score' : make_scorer(f1_score)}

results = cross_validate(model_unbalanced_data_10_times_weight, X, Y, cv=10, scoring=scoring)

np.mean(results['test_precision'])
np.mean(results['test_recall'])

I've also tried printing the classification report with classification_report(y_test, predictions), which produced the output in the screenshot below. However, I believe the precision/recall scores in that report are based on a single train/test split only, not averaged over the 10 folds (correct me if I am wrong).

[Screenshot: classification report with per-class precision, recall, f1-score and support]

Asked By: Stanleyrr


Answers:

Based on our discussion above, I believe that computing out-of-fold predictions for every CV fold and running classification_report on them is the right way to go. The results then take the number of CV folds into account:

>>> from sklearn.metrics import classification_report
>>> from sklearn.datasets import load_iris
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.model_selection import cross_val_predict
>>> 
>>> iris = load_iris()
>>> 
>>> rf_clf = RandomForestClassifier()
>>> 
>>> preds = cross_val_predict(estimator=rf_clf,
...                           X=iris["data"],
...                           y=iris["target"],
...                           cv=15)
>>> 
>>> print(classification_report(iris["target"], preds))
              precision    recall  f1-score   support

           0       1.00      1.00      1.00        50
           1       0.92      0.94      0.93        50
           2       0.94      0.92      0.93        50

    accuracy                           0.95       150
   macro avg       0.95      0.95      0.95       150
weighted avg       0.95      0.95      0.95       150
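
For the binary case in the question, the same pattern applies directly. A minimal sketch, reusing model_unbalanced_data_10_times_weight, X and Y from the question:

from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_predict

# Pool the out-of-fold predictions from all 10 folds, then print
# precision/recall/f1 for class 0 and class 1 in one report
preds = cross_val_predict(model_unbalanced_data_10_times_weight, X, Y, cv=10)
print(classification_report(Y, preds))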
Answered By: anddt

Maybe cross_val_predict is not the way to go, because, as stated in its documentation: "Passing these predictions into an evaluation metric may not be a valid way to measure generalization performance. Results can differ from cross_validate and cross_val_score unless all test sets have equal size and the metric decomposes over samples."
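
To see why the two can disagree, here is a quick sketch reusing the iris setup from the answer above: average the per-fold scores, then compare against a single score computed on the pooled out-of-fold predictions.

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import cross_val_predict, cross_val_score

iris = load_iris()
clf = RandomForestClassifier(random_state=0)

# Per-fold scores, averaged afterwards (what cross_validate/cross_val_score report)
fold_scores = cross_val_score(clf, iris["data"], iris["target"],
                              cv=15, scoring="f1_macro")

# One score computed on the pooled out-of-fold predictions
preds = cross_val_predict(clf, iris["data"], iris["target"], cv=15)
pooled_score = f1_score(iris["target"], preds, average="macro")

print(fold_scores.mean(), pooled_score)  # the two can differ slightly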

I would suggest the following instead. Note that pos_label=0 is what selects the negative class; I also restrict iris to two classes so the task is binary, as in the question:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, precision_score, recall_score
from sklearn.model_selection import cross_validate

# Restrict iris to classes 0 and 1 so the task is binary
iris = load_iris()
mask = iris["target"] != 2
X, y = iris["data"][mask], iris["target"][mask]

rf_clf = RandomForestClassifier()

# pos_label=0 makes the scorer treat class 0 (the negative class) as the positive label
scoring = {'precision_positive_class': make_scorer(precision_score),
           'precision_negative_class': make_scorer(precision_score, pos_label=0),
           'recall_positive_class': make_scorer(recall_score),
           'recall_negative_class': make_scorer(recall_score, pos_label=0)}

results = cross_validate(rf_clf, X=X, y=y, cv=10,
                         scoring=scoring,
                         return_train_score=True)

# Averages over the 10 folds for the NEGATIVE class
np.mean(results['test_precision_negative_class'])
np.mean(results['test_recall_negative_class'])
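
To inspect all fold-averaged metrics at once, a small sketch on top of the results dict returned above:

# Each entry in `results` is an array with one score per fold;
# report the mean and standard deviation across the 10 folds
for name, scores in results.items():
    if name.startswith('test_'):
        print(f"{name}: {scores.mean():.3f} +/- {scores.std():.3f}")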
Answered By: Hellen