sklearn: print DecisionTreeRegressor's tree from IterativeImputer
Question:
I have an IterativeImputer that uses a DecisionTreeRegressor as its estimator, and I want to print its tree with the export_text method:
import pandas as pd
from sklearn import tree
from sklearn.experimental import enable_iterative_imputer # noqa
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor
regressor = DecisionTreeRegressor(criterion="squared_error",
                                  max_depth=None,
                                  min_samples_split=2,
                                  min_samples_leaf=1,
                                  random_state=0)
iterative_imputer = IterativeImputer(
    estimator=regressor,
    sample_posterior=False,
    max_iter=10,
    initial_strategy='mean',
    imputation_order='roman',
    verbose=2,
    random_state=0)
iterative_imputer.fit(df)
print(tree.export_text(iterative_imputer.estimator))
But I’m getting an error:
sklearn.exceptions.NotFittedError: This DecisionTreeRegressor instance is not fitted yet. Call 'fit' with appropriate arguments before using this estimator.
What am I doing wrong?
Answers:
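The error occurs because iterative_imputer.estimator is never fitted itself. IterativeImputer clones it before every fit (once per feature in each iteration), so the object you passed in only serves as the unfitted template that all the fitted estimators are copied from.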
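To see why the template stays unfitted, here is a minimal sketch that calls sklearn.base.clone directly. It is not the imputer's own code, only an illustration of the same effect; the names template and fitted_copy and the toy training data are made up for the example.

```python
from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.tree import DecisionTreeRegressor
from sklearn.utils.validation import check_is_fitted

# "template" plays the role of the estimator passed to IterativeImputer.
template = DecisionTreeRegressor(random_state=0)

# clone() returns a new, unfitted estimator with the same parameters;
# fitting that copy does not touch the template.
fitted_copy = clone(template).fit([[0.0], [1.0]], [0.0, 1.0])

check_is_fitted(fitted_copy)  # passes silently: the copy is fitted

try:
    check_is_fitted(template)
    template_is_fitted = True
except NotFittedError:
    template_is_fitted = False

print(template_is_fitted)  # False
```

Calling export_text on the template fails for the same reason: it was never fitted.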
After fitting, the fitted estimators are stored as a list of _ImputerTriplet objects under the imputation_sequence_ attribute. Each triplet holds (feat_idx, neighbor_feat_idx, estimator). They can be accessed (scikit-learn==1.2.0) with:
import numpy as np
from sklearn.experimental import enable_iterative_imputer
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor
from sklearn.tree import export_text
regressor = DecisionTreeRegressor(random_state=0)
iterative_imputer = IterativeImputer(
    estimator=regressor,
    max_iter=10,
    imputation_order='roman',
    random_state=0,
)
iterative_imputer.fit([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]])
for _, _, estimator in iterative_imputer.imputation_sequence_:
    print(export_text(estimator))
|--- feature_1 <= 7.50
| |--- feature_0 <= 2.75
| | |--- value: [7.00]
| |--- feature_0 > 2.75
| | |--- value: [4.00]
|--- feature_1 > 7.50
| |--- value: [10.00]
...
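Note that feature_0 and feature_1 in this output are positions within the predictor columns each tree was trained on, not positions in the original data. Each entry of imputation_sequence_ is a (feat_idx, neighbor_feat_idx, estimator) triplet, so the other two fields can be used to label the trees. A possible sketch follows; the column names in columns are hypothetical, and the reports list exists only to collect the results.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor, export_text

# Hypothetical column names, used only to make the printed trees readable.
columns = ["a", "b", "c"]

imputer = IterativeImputer(
    estimator=DecisionTreeRegressor(random_state=0),
    max_iter=10,
    imputation_order="roman",
    random_state=0,
)
imputer.fit([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]])

reports = []
for feat_idx, neighbor_feat_idx, estimator in imputer.imputation_sequence_:
    # feat_idx is the column this tree predicts;
    # neighbor_feat_idx lists the columns it was trained on, in order.
    names = [columns[i] for i in neighbor_feat_idx]
    text = export_text(estimator, feature_names=names)
    reports.append((columns[feat_idx], text))
    print(f"target column: {columns[feat_idx]}")
    print(text)
```

The list contains one tree per feature for every iteration that actually ran (imputer.n_iter_), so the trees from the last iteration sit at the end of the list.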