scikit-learn: How do I define the thresholds for the ROC curve?
Question:
When plotting the ROC curve (or deriving the AUC) in scikit-learn, how can one specify arbitrary thresholds for roc_curve, rather than having the function calculate them internally and return them?
from sklearn.metrics import roc_curve
fpr, tpr, thresholds = roc_curve(y_true, y_pred)
A related question was asked at Scikit – How to define thresholds for plotting roc curve, but the accepted answer there indicates that the OP's intent was different from what the question's wording suggested.
Thanks!
Answers:
What you get from the classifier are scores, not just a class prediction. roc_curve will give you a set of thresholds with the associated false positive and true positive rates.
If you want your own threshold, just apply it:
y_class = y_pred > threshold
Then you can display a confusion matrix comparing this new y_class to y_true.
And if you want several thresholds, do the same for each of them and compute a confusion matrix for each, from which you can read off the true and false positive rates.
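For example, here is a minimal sketch of that approach. The y_true and y_score arrays are made-up illustration data; in practice y_score would come from something like predict_proba(X)[:, 1] or decision_function(X):
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 0, 1, 1, 0, 1])                # ground-truth labels (illustrative)
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])  # classifier scores (illustrative)

for threshold in [0.3, 0.5, 0.7]:                    # your own arbitrary thresholds
    y_class = (y_score > threshold).astype(int)
    # For binary labels, confusion_matrix returns [[tn, fp], [fn, tp]]
    tn, fp, fn, tp = confusion_matrix(y_true, y_class).ravel()
    tpr = tp / (tp + fn)                              # true positive rate (recall)
    fpr = fp / (fp + tn)                              # false positive rate
    print(f"threshold={threshold}: TPR={tpr:.2f}, FPR={fpr:.2f}")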
It's quite simple. The ROC curve shows you the model's output across different thresholds. You always choose the best threshold for your model to produce forecasts, but the ROC curve shows you how robust/good your model is across the full range of thresholds. There is a good explanation of how this works here: https://www.dataschool.io/roc-curves-and-auc-explained/
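For instance, one common (but by no means the only) way to pick a single operating threshold from roc_curve's output is Youden's J statistic (tpr - fpr). A rough sketch, reusing the made-up data from above:
import numpy as np
from sklearn.metrics import roc_curve

y_true = np.array([0, 0, 1, 1, 0, 1])                # illustrative data
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.7])

fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = np.argmax(tpr - fpr)                           # index maximizing Youden's J
print(f"threshold={thresholds[best]}: TPR={tpr[best]:.2f}, FPR={fpr[best]:.2f}")
Whether Youden's J is the right criterion depends on your application; cost-sensitive settings often weight false positives and false negatives differently.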