Calling the sklearn2pmml() function in Python 3.8 throws RuntimeError
Question:
I'm trying to save my scikit-learn logistic regression model as PMML, but I get a RuntimeError.
My code:
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline
from sklearn.linear_model import LogisticRegression
pipe_pmml = PMMLPipeline(steps=[
    ('mapper', mapper),
    ('estimator', LogisticRegression(C=0.01,
                                     penalty='l1',
                                     solver='liblinear',
                                     random_state=1)),
])
pipe_pmml.fit(X_small, y)
sklearn2pmml(pipe_pmml, pmml_filename, with_repr=True)
with error:
Standard output is empty
Standard error:
Exception in thread "main" net.razorvine.pickle.InvalidOpcodeException: invalid pickle opcode: 0
at net.razorvine.pickle.Unpickler.dispatch(Unpickler.java:366)
at org.jpmml.python.CustomUnpickler.dispatch(CustomUnpickler.java:31)
at org.jpmml.python.PickleUtil$1.dispatch(PickleUtil.java:64)
at net.razorvine.pickle.Unpickler.load(Unpickler.java:109)
at org.jpmml.python.PickleUtil.unpickle(PickleUtil.java:85)
at com.sklearn2pmml.Main.run(Main.java:78)
at com.sklearn2pmml.Main.main(Main.java:6
Here, mapper is a DataFrameMapper from sklearn_pandas.
Does anybody have an idea what is going wrong?
Installed package versions:
- sklearn==0.0
- scikit-learn==1.1.2
- sklearn-pandas==2.2.0
- sklearn2pmml==0.86.3
Answers:
Solution: Downgrade joblib to 1.1.0
Joblib 1.2.0 generates pickle-like files, which contain extra padding for array memory alignment purposes: joblib/joblib#563
This extra padding is what is causing standard pickle data format readers to fail.
The SkLearn2PMML package version 0.87.0 (and newer) should be able to deal with both standard (Python pickle, Joblib 1.1.0) and non-standard (Joblib 1.2.0) pickle files.
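To check whether an environment falls into the failing combination described above, a small version comparison can help. This is only a sketch: the helper names parse_version and is_affected are hypothetical, and it assumes that every Joblib release from 1.2.0 onwards writes the padded format (the answer only states this for 1.2.0).

```python
import re


def parse_version(version):
    """Turn a string such as '1.2.0' into a tuple such as (1, 2, 0).

    Parsing stops at the first component that does not start with a digit,
    so suffixes like '.post1' or 'rc1' are ignored.
    """
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def is_affected(joblib_version, sklearn2pmml_version):
    """Return True if this pair of versions is expected to hit the error.

    Assumption: Joblib >= 1.2.0 writes padded pickle files, and
    SkLearn2PMML < 0.87.0 cannot read them.
    """
    padded_pickles = parse_version(joblib_version) >= (1, 2, 0)
    old_converter = parse_version(sklearn2pmml_version) < (0, 87, 0)
    return padded_pickles and old_converter


# The SkLearn2PMML version is from the question; Joblib 1.2.0 is the
# release the answer identifies as the cause.
print(is_affected("1.2.0", "0.86.3"))
```

In a real environment the two version strings could be read with importlib.metadata.version("joblib") and importlib.metadata.version("sklearn2pmml"), available in Python 3.8 and newer. If the check returns True, either pin joblib to 1.1.0 or upgrade sklearn2pmml to 0.87.0 or newer, as the answer suggests.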