Log pickle files as part of an MLflow run

Question:

I am running an MLflow experiment, and as part of it I would like to log a few artifacts as Python pickle files.

Example: I am trying out different categorical encoders and want to log the encoder objects as pickle files.

Is there a way to achieve this?

Asked By: Naga Budigam

||

Answers:

There are two functions for this:

  1. log_artifact – to log a local file or directory as an artifact
  2. log_artifacts – to log the contents of a local directory

So it would be as simple as:

import mlflow

with mlflow.start_run():
    # encoder.pickle must already exist on local disk
    mlflow.log_artifact("encoder.pickle")
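
Since log_artifact expects a file that already exists on disk, the encoder has to be pickled first. A minimal sketch of that round trip is below. The helper names (dump_pickle, load_pickle, log_pickle, fetch_pickle) are hypothetical and not part of MLflow; only mlflow.log_artifact and mlflow.artifacts.download_artifacts are library calls, and the latter assumes a reasonably recent MLflow release.

```python
import pickle
import tempfile
from pathlib import Path


def dump_pickle(obj, path):
    """Serialize obj to path with pickle."""
    with open(path, "wb") as f:
        pickle.dump(obj, f)


def load_pickle(path):
    """Deserialize and return the object stored at path."""
    with open(path, "rb") as f:
        return pickle.load(f)


def log_pickle(obj, filename, artifact_path=None):
    """Pickle obj to a temporary file and log it to the active MLflow run."""
    import mlflow  # imported lazily: the two helpers above need only the standard library

    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = Path(tmp_dir) / filename
        dump_pickle(obj, local_path)
        # The file is copied into the run's artifact store before the temp dir is removed.
        mlflow.log_artifact(str(local_path), artifact_path=artifact_path)


def fetch_pickle(run_id, artifact_file):
    """Download a logged pickle from the given run and return the unpickled object."""
    import mlflow.artifacts

    local_path = mlflow.artifacts.download_artifacts(
        run_id=run_id, artifact_path=artifact_file
    )
    return load_pickle(local_path)
```

Inside a `with mlflow.start_run() as run:` block you would call `log_pickle(encoder, "encoder.pickle")`, and later `fetch_pickle(run.info.run_id, "encoder.pickle")` to get the encoder back. Only unpickle files from runs you trust, since pickle can execute arbitrary code on load.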

To use that pickled file at inference time, you will need a custom MLflow model, something like this:

import mlflow.pyfunc

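class my_model(mlflow.pyfunc.PythonModel):
    def __init__(self, encoders, model):
        self.encoders = encoders  # fitted encoder objects
        # Assumption: a fitted estimator is passed in; the original snippet
        # called predict on self.ctx, which was never defined.
        self.model = model

    def predict(self, context, model_input):
        _X = ...  # do encoding using self.encoders
        return str(self.model.predict([_X])[0])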
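
An alternative to passing the encoders into __init__ is to attach the pickle file to the model through the artifacts argument of mlflow.pyfunc.log_model and unpickle it in load_context. The sketch below assumes the encoder exposes a scikit-learn style transform method and that your MLflow version accepts the artifact_path keyword; load_pickled_artifact, log_encoder_model and EncoderModel are hypothetical names. The class is defined inside the function only so that MLflow is imported when the function is called.

```python
import pickle


def load_pickled_artifact(context, key):
    """Unpickle the file that MLflow exposes under context.artifacts[key]."""
    with open(context.artifacts[key], "rb") as f:
        return pickle.load(f)


def log_encoder_model(encoder_pickle_path, artifact_path="encoder_model"):
    """Log a pyfunc model whose only artifact is the pickled encoder."""
    import mlflow.pyfunc  # imported lazily so the helper above needs only the stdlib

    class EncoderModel(mlflow.pyfunc.PythonModel):
        def load_context(self, context):
            # Runs once when the model is loaded via mlflow.pyfunc.load_model.
            self.encoder = load_pickled_artifact(context, "encoder")

        def predict(self, context, model_input):
            # Assumes a scikit-learn style encoder with a transform method.
            return self.encoder.transform(model_input)

    return mlflow.pyfunc.log_model(
        artifact_path=artifact_path,
        python_model=EncoderModel(),
        # MLflow copies this local file into the model directory and hands its
        # resolved location back through context.artifacts["encoder"].
        artifacts={"encoder": encoder_pickle_path},
    )
```

With this layout the pickle file travels with the logged model, so whoever loads the model does not need to locate the file separately.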
Answered By: Alex Ott

Thank you Alex for providing the relevant documentation.

Here is how I do it:

Saving the encoder

from sklearn.preprocessing import OneHotEncoder
import mlflow.pyfunc

encoder = OneHotEncoder()
encoder.fit(X_train)

class EncoderWrapper(mlflow.pyfunc.PythonModel):
    def __init__(self, encoder):
        self.encoder = encoder

    def predict(self, context, model_input):
        return self.encoder.transform(model_input)

# Wrap the encoder
encoder_wrapped = EncoderWrapper(encoder)

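# Save the wrapped encoder to a local directory
encoder_path = ...
mlflow.pyfunc.save_model(python_model=encoder_wrapped, path=encoder_path)
# Also log it to the active MLflow run (artifact_path is relative to the run's artifact root)
mlflow.pyfunc.log_model(python_model=encoder_wrapped, artifact_path=encoder_path)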

Loading the encoder

encoder_path = ...
encoder = mlflow.pyfunc.load_model(encoder_path)
# The loaded pyfunc model exposes predict(), which calls EncoderWrapper.predict
X_test_encoded = encoder.predict(X_test)
Answered By: BecayeSoft