ValueError while computing Mean Poisson Deviance with the sklearn package

Question:

I am trying to compute the mean Poisson deviance for the predictions that I got from a random forest regression using the metric implemented in sklearn. However, I got this error:

ValueError: Mean Tweedie deviance error with power=2 can only be used on strictly positive y and y_pred.

After reading the sklearn documentation for this function, I know that it requires:

  • y_true >= 0
  • y_pred > 0

So I checked if my data met the requirements, and it seems like it does:

np.any(y_pred<0)
>>> False
np.any(y_pred==0)
>>> False
np.any(y_test<0)
>>> False

Here’s an overview of my data in case it helps:

y_pred[0:100]
>>> array([3.5937205 , 3.7193375 , 4.09343503, 4.3736175 , 2.95546795,
       4.04508544, 4.37377352, 3.59788064, 3.626048  , 4.38476427,
       3.66636431, 4.23287305, 4.33319475, 4.3501116 , 3.61961913,
       3.59461904, 3.60071618, 3.96581378, 4.19052013, 4.35723725,
       4.22033789, 4.07398282, 3.4375113 , 2.9226799 , 4.09534177,
       3.59309893, 3.96228398, 3.62929845, 4.16205635, 4.22933483,
       3.14524591, 3.30359978, 4.74198205, 3.32823315, 3.32921776,
       3.60518305, 4.75992   , 3.80171509, 3.30871596, 4.24122775,
       4.16740865, 4.05255728, 4.33714336, 3.58182892, 3.62881694,
       3.26689612, 3.30889384, 4.15076878, 3.63137294, 3.52888806,
       3.52889078, 3.61343248, 4.03201396, 3.63873444, 4.70430122,
       4.13515025, 3.53062101, 3.62593269, 4.36520532, 4.45868711,
       3.16152143, 3.61939638, 3.626048  , 3.61343248, 4.38715328,
       3.139312  , 3.99510801, 3.66807372, 3.08710737, 3.13308799,
       3.31435806, 4.14397136, 3.59442154, 4.86767821, 4.1714686 ,
       4.16953191, 4.16626847, 3.52720347, 4.99668654, 3.26219536,
       3.10700253, 4.60788712, 3.97407225, 4.17907077, 3.5771338 ,
       5.85575543, 4.35914091, 4.11046269, 3.33230884, 3.61872964,
       3.95371359, 3.2868769 , 3.44595121, 4.13779498, 4.50941441,
       3.66820605, 3.28609709, 4.3873206 , 3.60516187, 3.29150883])

y_test[0:100]
>>> array([ 1.,  8.,  5.,  5.,  0.,  4.,  5.,  5.,  1., 12.,  4.,  1.,  4.,
        6.,  6.,  7.,  8.,  3.,  4.,  1.,  7.,  5.,  3.,  3., 10., 10.,
        4.,  6.,  0.,  0.,  1.,  2.,  2.,  4.,  2.,  3.,  5.,  0.,  3.,
        3.,  3.,  9.,  0.,  7.,  0.,  2.,  1.,  8.,  0.,  4.,  2.,  4.,
        4.,  2.,  1.,  0.,  2.,  8.,  7.,  0.,  3.,  3.,  4.,  3.,  3.,
        6.,  3., 11.,  6., 10.,  2.,  2., 10.,  3.,  4.,  5.,  8.,  4.,
        9.,  0.,  2.,  2.,  6.,  4.,  0.,  1.,  5.,  4.,  2.,  1.,  7.,
        4.,  0.,  4.,  4.,  0.,  1.,  4.,  0.,  3.])

What is causing this error?

Asked By: pypau


Answers:

The error comes from the mean Tweedie deviance with power=2, which is the Gamma deviance, not the Poisson deviance. The Poisson deviance corresponds to power=1 and does accept zeros in y_true.

From the sklearn.metrics.mean_tweedie_deviance docs:

power = 2 : Gamma distribution. Requires: y_true > 0 and y_pred > 0.

So with power=2 the targets must be strictly positive (y_true > 0), not just non-negative.
I ran the metric on your y_pred and y_test values and got the same error: y_test contains zeros, which violates this requirement.
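
As a rough illustration of the difference, here is a minimal sketch using a few made-up values (not your full arrays; the names y_true and y_hat are just for this example). It assumes a scikit-learn version that provides mean_poisson_deviance and mean_tweedie_deviance in sklearn.metrics:

```python
import numpy as np
from sklearn.metrics import mean_poisson_deviance, mean_tweedie_deviance

# Illustrative values only: y_true contains a zero, y_hat is strictly positive.
y_true = np.array([1.0, 8.0, 5.0, 0.0, 4.0])
y_hat = np.array([3.59, 3.72, 4.09, 2.96, 4.05])

# power=1 (Poisson) only needs y_true >= 0, so the zero is accepted.
poisson = mean_poisson_deviance(y_true, y_hat)
same = mean_tweedie_deviance(y_true, y_hat, power=1)

# power=2 (Gamma) needs y_true > 0, so the zero raises the ValueError.
try:
    mean_tweedie_deviance(y_true, y_hat, power=2)
    gamma_failed = False
except ValueError as exc:
    gamma_failed = True
    print(exc)
```

If the Poisson deviance is what you want, calling mean_poisson_deviance (or mean_tweedie_deviance with power=1) on your data should avoid the error. If you really need the Gamma deviance, the zeros in y_test would have to be handled first.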

Answered By: Sauron