Can sklearn.model_selection.GridSearchCV use a custom threshold?

Question:

My goal is to do threshold tuning before parameter tuning. The idea is simple: in an imbalanced dataset where class 1 is the minority, the decision threshold should be lower than 0.5, so that the model predicts more instances as class 1 instead of class 0.

I therefore believe that setting the threshold first could improve the model's predictive power more than the reverse order (parameter tuning, then threshold tuning).

The problem is that I can't find a parameter in GridSearchCV for changing the threshold.

Asked By: kidfrom


Answers:

You can’t directly change the threshold used by predict (which presumably gets called by your scorer), but you can provide a custom scoring method that thresholds the predicted probabilities itself. See the User Guide. Here I think you’d want something like:

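from sklearn.metrics import fbeta_score, make_scorer

def f2_score_at_thresh(y_true, y_pos_prob, threshold):
    # Predict the positive class wherever its probability exceeds the threshold
    y_pred = y_pos_prob > threshold
    return fbeta_score(y_true, y_pred, beta=2, ...)

# needs_proba=True feeds predict_proba output to the scorer; newer scikit-learn
# releases (1.4+) replace it with response_method="predict_proba"
my_scorer = make_scorer(f2_score_at_thresh, needs_proba=True, threshold=0.2)

GridSearchCV(..., scoring=my_scorer)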
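
For a complete picture, here is a minimal runnable sketch of the same idea. Several things in it are assumptions made only for illustration: the synthetic dataset, the LogisticRegression estimator, the C grid, and the helper name make_f2_scorer are not from the question. To sidestep the version-dependent make_scorer argument, it passes a plain callable with the (estimator, X, y) signature, which GridSearchCV also accepts as scoring.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import fbeta_score
from sklearn.model_selection import GridSearchCV


def make_f2_scorer(threshold):
    """Hypothetical helper: build an (estimator, X, y) scorer for one threshold."""
    def scorer(estimator, X, y):
        # Column 1 of predict_proba is the probability of estimator.classes_[1],
        # which is the positive class when the labels are 0 and 1.
        y_pos_prob = estimator.predict_proba(X)[:, 1]
        y_pred = (y_pos_prob > threshold).astype(int)
        return fbeta_score(y, y_pred, beta=2, zero_division=0)
    return scorer


# Synthetic imbalanced data (roughly 10% positives), purely for illustration.
X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0, 10.0]},  # assumed grid
    scoring=make_f2_scorer(threshold=0.2),
    cv=3,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Note that the threshold only affects how candidates are scored during the search. Calling search.predict afterwards still uses the estimator's default cutoff, so the same threshold has to be applied to search.predict_proba(X)[:, 1] at prediction time.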
Answered By: Ben Reiniger