TransformedTargetRegressor save and load error

Question:

I’m defining a custom regressor using TransformedTargetRegressor, adding it to a pipeline, and saving the model to a joblib file. However, when I try to load the model back, I get an error

module '__main__' has no attribute 'transform_targets'

where transform_targets is one of the functions defined for the regressor:

def transform_targets(targets):
    targets = (targets - min_t)/(max_t - min_t)
    return targets

def inv_transform_targets(outputs):
    outputs = outputs*(max_t - min_t) + min_t
    return outputs

# Define the model 

mlp_model = MLPRegressor(activation = 'relu', validation_fraction = 0.2, hidden_layer_sizes=(1000, ))
full_model = TransformedTargetRegressor(regressor = mlp_model, func = transform_targets,
                                 inverse_func = inv_transform_targets)

# Incorporate feature scaling via pipeline

pipeline = make_pipeline(MinMaxScaler(), full_model)
nn_model = pipeline.fit(X_train,y_train)

# Fit the model which uses the transformed target regressor + maxmin pipeline

nn_model.fit(X_train,y_train)

from joblib import dump, load
dump(nn_model, 'fitness_nn_C1.joblib')

The model works fine and predicts well, and it saves with no errors, but it will not load back. If I save it with pickle instead, loading returns a similar error

AttributeError: Can't get attribute 'transform_targets' on <module '__main__'>

Does anyone know how to save a model that includes a TransformedTargetRegressor in a single file that can then be reloaded successfully? I realise that I could dump the parameters/functions associated with transforming the targets to a separate file, but that is exactly what I want to avoid.

Edit:

The current workaround is to use MinMaxScaler as the transformer, or any other transformer from sklearn.preprocessing, but I still don’t know whether it’s possible to include the custom functions in this workflow.
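For reference, that workaround can be sketched as follows (a minimal sketch, assuming the same model setup as above; note the semantics differ slightly, because the scaler learns the min/max from the training targets rather than using the fixed min_t / max_t constants):

```python
from sklearn.compose import TransformedTargetRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Scale the targets with a transformer *object* instead of the
# module-level functions. A MinMaxScaler instance holds no references
# to functions in __main__, so the fitted pipeline pickles cleanly.
mlp_model = MLPRegressor(activation='relu', validation_fraction=0.2,
                         hidden_layer_sizes=(1000,))
full_model = TransformedTargetRegressor(regressor=mlp_model,
                                        transformer=MinMaxScaler())
pipeline = make_pipeline(MinMaxScaler(), full_model)

# After pipeline.fit(X_train, y_train), dump(pipeline, 'fitness_nn_C1.joblib')
# and load() round-trip without the AttributeError.
```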

Asked By: Natalia


Answers:

The problem is that when you try to load the file back, pickle cannot resolve transform_targets: joblib/pickle store functions by reference (module and name), not by value, so the function bodies were never dumped in the first place. You can use dill to serialize the functions by value. Put the items you want to save into a list, serialize that list with dill, and then write the resulting bytes with joblib, as shown below:

from sklearn.neural_network import MLPRegressor
from sklearn.compose import TransformedTargetRegressor
from sklearn.pipeline import make_pipeline
from sklearn.datasets import make_friedman1
from sklearn.preprocessing import MinMaxScaler
import dill
X, y = make_friedman1(n_samples=50, n_features=10, random_state=0)

min_t = 10
max_t = 300
def transform_targets(targets):
    targets = (targets - min_t)/(max_t - min_t)
    return targets

def inv_transform_targets(outputs):
    outputs = outputs*(max_t - min_t) + min_t
    return outputs

# Define the model 

mlp_model = MLPRegressor(activation = 'relu', validation_fraction = 0.2, hidden_layer_sizes=(1000, ))
full_model = TransformedTargetRegressor(regressor = mlp_model, func = transform_targets,
                                 inverse_func = inv_transform_targets)

# Incorporate feature scaling via pipeline

pipeline = make_pipeline(MinMaxScaler(), full_model)
nn_model = pipeline.fit(X,y)

# Fit the model which uses the transformed target regressor + maxmin pipeline

nn_model.fit(X,y)

to_save = [transform_targets, inv_transform_targets, nn_model]
r = dill.dumps(to_save)

from joblib import dump, load
dump(r, 'fitness_nn_C1.joblib')

And now you can load it as shown below:

from joblib import dump, load
import dill
Q = load('fitness_nn_C1.joblib')
T = dill.loads(Q)

T will look like this:

[<function __main__.transform_targets(targets)>,
 <function __main__.inv_transform_targets(outputs)>,
 Pipeline(memory=None,
          steps=[('minmaxscaler', MinMaxScaler(copy=True, feature_range=(0, 1))),
                 ('transformedtargetregressor',
                  TransformedTargetRegressor(check_inverse=True,
                                             func=<function transform_targets at 0x000001F486D27048>,
                                             inverse_func=<function inv_transform_targets at 0x000001F4882E6C80>,
                                             regressor=MLPRegressor(activation='relu',
                                                                    alpha=0.0001,
                                                                    batch_size='a...
                                                                    beta_2=0.999,
                                                                    early_stopping=False,
                                                                    epsilon=1e-08,
                                                                    hidden_layer_sizes=(1000,),
                                                                    learning_rate='constant',
                                                                    learning_rate_init=0.001,
                                                                    max_iter=200,
                                                                    momentum=0.9,
                                                                    n_iter_no_change=10,
                                                                    nesterovs_momentum=True,
                                                                    power_t=0.5,
                                                                    random_state=None,
                                                                    shuffle=True,
                                                                    solver='adam',
                                                                    tol=0.0001,
                                                                    validation_fraction=0.2,
                                                                    verbose=False,
                                                                    warm_start=False),
                                             transformer=None))],
          verbose=False)]
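An alternative that avoids dill entirely: since pickle (and therefore joblib) stores functions by qualified name, loading succeeds as long as transform_targets and inv_transform_targets live in an importable module rather than in __main__. A minimal, stdlib-only sketch of this idea (the module name target_transforms is hypothetical, and the module is written to a temp directory only for illustration; in practice the functions would live in a regular .py file inside your project):

```python
import os
import pickle
import sys
import tempfile

# Write the target-transform functions to an importable module.
module_src = """
min_t, max_t = 10, 300

def transform_targets(targets):
    return (targets - min_t) / (max_t - min_t)

def inv_transform_targets(outputs):
    return outputs * (max_t - min_t) + min_t
"""
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "target_transforms.py"), "w") as f:
    f.write(module_src)
sys.path.insert(0, tmpdir)

import target_transforms

# pickle records the function as "target_transforms.transform_targets",
# so any session that can import that module can load it back.
blob = pickle.dumps(target_transforms.transform_targets)
restored = pickle.loads(blob)
print(restored(300))  # -> 1.0
```

A model built with functions from such a module should then dump and load with plain joblib, with no separate file for the functions beyond the module itself.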