sklearn: print DecisionTreeRegressor's tree from IterativeImputer

Question:

I have an IterativeImputer that uses a DecisionTreeRegressor as its estimator, and I want to print its tree with the export_text function:

import pandas as pd
from sklearn import tree
from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor

regressor = DecisionTreeRegressor(criterion="squared_error", 
                                  max_depth=None, 
                                  min_samples_split=2,
                                  min_samples_leaf=1, 
                                  random_state=0)
iterative_imputer = IterativeImputer(
    estimator=regressor,
    sample_posterior=False,
    max_iter=10,
    initial_strategy='mean',
    imputation_order='roman',
    verbose=2,
    random_state=0)
iterative_imputer.fit(df)
print(tree.export_text(iterative_imputer.estimator))

But I’m getting an error:

sklearn.exceptions.NotFittedError: This DecisionTreeRegressor
instance is not fitted yet. Call 'fit' with appropriate arguments
before using this estimator.

What am I doing wrong?

Asked By: Daemon2017

Answers:

The error occurs because IterativeImputer never fits the iterative_imputer.estimator object itself. It clones that object before every fit, so the original stays unfitted and serves only as the template from which all the fitted estimators are copied.
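To see the cloning behaviour in isolation, here is a minimal sketch that uses sklearn.base.clone directly. It does not reproduce IterativeImputer's internals; the names template and fitted_copy and the toy data are made up for illustration.

```python
import numpy as np
from sklearn.base import clone
from sklearn.exceptions import NotFittedError
from sklearn.tree import DecisionTreeRegressor
from sklearn.utils.validation import check_is_fitted

# Hypothetical names: `template` plays the role of iterative_imputer.estimator.
template = DecisionTreeRegressor(random_state=0)

# clone() returns a new, unfitted estimator with the same parameters.
fitted_copy = clone(template)
fitted_copy.fit(np.array([[0.0], [1.0], [2.0]]), np.array([0.0, 1.0, 2.0]))

# The copy is fitted, so this check passes silently.
check_is_fitted(fitted_copy)

# The template was never fitted, so the same check raises NotFittedError.
try:
    check_is_fitted(template)
    template_is_fitted = True
except NotFittedError:
    template_is_fitted = False

print(template_is_fitted)  # False
```

Fitting the copy leaves the template untouched, which is why export_text fails on iterative_imputer.estimator.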

After fitting, the fitted clones are stored in the imputation_sequence_ attribute as a list of _ImputerTriplet objects, each a (feat_idx, neighbor_feat_idx, estimator) tuple. They can be accessed (scikit-learn==1.2.0) with:

import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor
from sklearn.tree import export_text

regressor = DecisionTreeRegressor(random_state=0)
iterative_imputer = IterativeImputer(
    estimator=regressor,
    max_iter=10,
    imputation_order='roman',
    random_state=0,
)

iterative_imputer.fit([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]])

for _, _, estimator in iterative_imputer.imputation_sequence_:
    print(export_text(estimator))

Output:

|--- feature_1 <= 7.50
|   |--- feature_0 <= 2.75
|   |   |--- value: [7.00]
|   |--- feature_0 >  2.75
|   |   |--- value: [4.00]
|--- feature_1 >  7.50
|   |--- value: [10.00]

...
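If it helps to know which column each tree imputes, the triplets can also be read by field name. This is only a sketch: _ImputerTriplet is a private namedtuple, so its fields (feat_idx, neighbor_feat_idx, estimator) may change between releases, and the column labels "a", "b", "c" are made up for this example.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.tree import DecisionTreeRegressor, export_text

X = np.array([[7, 2, 3], [4, np.nan, 6], [10, 5, 9]], dtype=float)
column_names = ["a", "b", "c"]  # hypothetical labels for the three columns

imputer = IterativeImputer(
    estimator=DecisionTreeRegressor(random_state=0),
    max_iter=10,
    imputation_order="roman",
    random_state=0,
)
imputer.fit(X)

reports = []
for triplet in imputer.imputation_sequence_:
    # feat_idx is the column being imputed; neighbor_feat_idx lists the
    # columns used as predictors, in the order the tree sees them.
    target = column_names[triplet.feat_idx]
    predictors = [column_names[i] for i in triplet.neighbor_feat_idx]
    text = export_text(triplet.estimator, feature_names=predictors)
    reports.append((target, text))
    print(f"Tree imputing column {target!r} from {predictors}:")
    print(text)
```

Passing feature_names makes the printed splits refer to the original columns rather than feature_0 and feature_1, which are positions within each tree's own predictor set.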
Answered By: Alexander L. Hayes