# scikit-learn: How do I define the thresholds for the ROC curve?

## Question:

When plotting the ROC curve (or deriving the AUC) in `scikit-learn`, how can one *specify* arbitrary thresholds for `roc_curve`, rather than having the function calculate them internally and return them?

```
from sklearn.metrics import roc_curve
fpr, tpr, thresholds = roc_curve(y_true, y_pred)
```

A related question was asked at Scikit – How to define thresholds for plotting roc curve, but the accepted answer there indicates that the OP’s intent was different from what the question’s wording suggests.

Thanks!

## Answers:

What you get from the classifier are scores, not just a class prediction. `roc_curve` will give you a set of thresholds with the associated false positive rates and true positive rates.

If you want your own threshold, just use it:

```
y_class = y_pred > threshold
```

Then you can display a confusion matrix comparing this new `y_class` to `y_true`.

And if you want several thresholds, do the same for each of them, and compute the true and false positive rates from each confusion matrix.
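Putting those steps together, a minimal sketch might look like this (the labels and scores below are made up for illustration; `y_pred` stands in for classifier scores such as the positive-class column of `predict_proba`):

```
import numpy as np
from sklearn.metrics import confusion_matrix

# Illustrative data: true labels and classifier scores
y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_pred = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5])

for threshold in [0.3, 0.5, 0.7]:
    # Apply your own threshold to turn scores into class predictions
    y_class = (y_pred > threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_class).ravel()
    tpr = tp / (tp + fn)  # true positive rate (recall)
    fpr = fp / (fp + tn)  # false positive rate
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Each (FPR, TPR) pair you get this way is one point on the ROC curve, evaluated at a threshold of your choosing.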

It’s quite simple: the ROC curve shows the model’s outputs across different thresholds. You ultimately choose the best threshold for your model to make predictions, but the ROC curve shows how robust/good your model is over the whole range of thresholds. There is a good explanation of how it works here: https://www.dataschool.io/roc-curves-and-auc-explained/
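To make that concrete, here is a small sketch (using the same made-up scores as above) showing that `roc_curve` returns the threshold array alongside the rates, so you can inspect exactly which threshold produces each (FPR, TPR) point on the curve:

```
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1, 1, 0])
y_pred = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7, 0.6, 0.5])

fpr, tpr, thresholds = roc_curve(y_true, y_pred)
# One row per candidate threshold; thresholds are sorted in decreasing order
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f} -> FPR={f:.2f}, TPR={t:.2f}")

# The AUC summarizes the whole curve in a single number
print("AUC:", roc_auc_score(y_true, y_pred))
```

Note that `roc_curve` only returns thresholds at which the curve actually changes (the distinct score values), which is why you cannot pass it your own threshold grid; for arbitrary thresholds, threshold the scores yourself as shown in the earlier answer.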