Getting NaNs in Power Transform inverse transformation

Question:

When I perform the inverse transform operation, I get some NaN values back.

Steps I took:

  1. Power transformed each feature column and saved each fitted transformer in a dictionary:
     {col1: transformer,
      col2: transformer2,
      ...,
      yCol: transformerY}
  2. After training the model and getting its predictions, I applied the inverse transform from transformerY and received some NaNs. Why is this happening, and how do I mitigate it?

Thanks!

Asked By: Boolean Autocrat

||

Answers:

After a lot of analysis, I figured out that the NaNs came from values that fall outside the domain of the inverse transform function. Looking at the implementation of the power transformer, this happens when a value passed to the inverse transform (here, a model prediction) is larger or smaller than anything the forward transform could have produced with the fitted lambda, so no real-valued original exists for it.
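To illustrate the domain problem, here is a minimal NumPy sketch of the Box-Cox inverse written by hand. The function name inverse_box_cox and the lambda value 0.3 are made up for illustration, and this sketch ignores the standardization step that a library transformer may apply before inverting.

```python
import numpy as np

def inverse_box_cox(y, lmbda):
    """Invert the Box-Cox transform y = (x**lmbda - 1) / lmbda (log(x) when lmbda == 0)."""
    y = np.asarray(y, dtype=float)
    if lmbda == 0:
        return np.exp(y)
    base = lmbda * y + 1.0
    # A negative base raised to a non-integer power has no real value,
    # so NumPy returns NaN for those entries (the warning is silenced here).
    with np.errstate(invalid="ignore"):
        return np.power(base, 1.0 / lmbda)

lmbda = 0.3  # illustrative value, not taken from the question
y_t = np.array([-4.0, -3.0, 0.0, 2.0])  # pretend these are model predictions
x = inverse_box_cox(y_t, lmbda)

# x[0] is NaN because 0.3 * -4.0 + 1 = -0.2 is negative;
# the other entries have a positive base and invert normally.
print(x)
```

With a positive lambda, anything below -1 / lmbda in transformed space cannot be inverted, which is the kind of value a model can easily predict.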

For my issue, I set max and min thresholds to bound my data, but this may not apply to you. Alternatively, you can use a transformation that is better suited to the range and distribution of your data. For example, if the data is skewed, you could try a log transform instead of a power transform; its inverse, the exponential, is defined for every real number.
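One way to apply such thresholds is to clip the predictions to the range of transformed targets seen during fitting before inverting. This is only a sketch under assumptions: it assumes scikit-learn's PowerTransformer (the question does not name the library), uses synthetic gamma-distributed targets, and the preds_t array stands in for real model output.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(0)
# Synthetic, strictly positive, skewed target column (assumption for the demo)
y_train = rng.gamma(shape=2.0, scale=1.0, size=(500, 1))

transformer_y = PowerTransformer(method="box-cox")
y_train_t = transformer_y.fit_transform(y_train)

# Range of transformed targets the transformer actually produced
lo, hi = y_train_t.min(), y_train_t.max()

# Hypothetical predictions in transformed space, two of them far out of range
preds_t = np.array([[lo - 5.0], [0.0], [hi + 5.0]])

# Clip to the observed range so every value has a real-valued inverse
preds_clipped = np.clip(preds_t, lo, hi)
preds = transformer_y.inverse_transform(preds_clipped)
print(preds.ravel())
```

The trade-off is that clipped predictions can never fall outside the minimum and maximum of the training targets, so this only makes sense if that bound is acceptable for your problem.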

It is also a good idea to check the distribution of the original data to ensure that it suits the transformation method you are using. For example, a power transform is monotonic, so it cannot merge multiple modes into one, and it may fit poorly when the data is very heavily skewed.

Finally, if you insist on using a power transform, consider trying both methods, Yeo-Johnson and Box-Cox, to see which behaves better on your data. Note that Box-Cox requires strictly positive data.
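A quick way to compare the two methods, again assuming scikit-learn's PowerTransformer and using synthetic data that includes negative values:

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

rng = np.random.default_rng(1)
y = rng.normal(loc=0.0, scale=1.0, size=(200, 1))  # synthetic data with negatives

# Yeo-Johnson accepts negative, zero, and positive values
yj = PowerTransformer(method="yeo-johnson").fit(y)
roundtrip = yj.inverse_transform(yj.transform(y))

# Box-Cox refuses data that is not strictly positive
try:
    PowerTransformer(method="box-cox").fit(y)
    box_cox_ok = True
except ValueError:
    box_cox_ok = False

print("Yeo-Johnson round trip matches:", np.allclose(roundtrip, y))
print("Box-Cox fit succeeded:", box_cox_ok)
```

Keep in mind that switching methods changes which values are invertible but does not remove the limit entirely, so out-of-range predictions can still come back as NaN with either method.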

Answered By: Boolean Autocrat