AttributeError: 'CalibratedClassifierCV' object has no attribute 'coef_'

Question:

I’m using scikit-learn’s linear SVM classifier, LinearSVC.

I don’t use it directly; I wrap it in CalibratedClassifierCV to get probabilities at prediction time, like this:

model = CalibratedClassifierCV(LinearSVC(random_state=0))

After fitting the model, I tried to read coef_ to print the top features, following the post "Visualising Top Features in Linear SVM with Scikit Learn and Matplotlib", but I got this error:

coef = classifier.coef_.ravel()
AttributeError: 'CalibratedClassifierCV' object has no attribute 'coef_'

How can I get coef_ when the classifier is wrapped in a calibrator? I’m not tied to this approach, so if there is another way to get the feature importances, it would be welcome.

Asked By: Minions

Answers:

coef_ is not an attribute of CalibratedClassifierCV; it is an attribute of the base estimator, which is a LinearSVC in your case. You can reach the fitted base estimators through calibrated_classifiers_, a list of fitted models with one entry per cross-validation fold (so its length depends on your cv value). Here is some sample code you can adapt.

from sklearn import datasets
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

iris = datasets.load_iris()
# cv=3 is set explicitly to match the output below; the default cv differs between versions
model = CalibratedClassifierCV(LinearSVC(random_state=0), cv=3)
model.fit(iris.data, iris.target)
model.calibrated_classifiers_

[<sklearn.calibration._CalibratedClassifier at 0x7f15d0c57550>,
 <sklearn.calibration._CalibratedClassifier at 0x7f15d0c57c18>,
 <sklearn.calibration._CalibratedClassifier at 0x7f15d0aec080>]

In this case cv is three, so three models are built. I would simply loop through them and take the average of their coefficients.

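# Average coef_ over the per-fold fitted estimators
coef_avg = 0
for clf in model.calibrated_classifiers_:
    # newer scikit-learn releases name this attribute `estimator` instead
    coef_avg = coef_avg + clf.base_estimator.coef_
coef_avg = coef_avg / len(model.calibrated_classifiers_)
coef_avg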

array([[ 0.16464871,  0.45680981, -0.77801375, -0.4170196 ],
       [ 0.1238834 , -0.89117967,  0.35451826, -0.89231957],
       [-0.83826029, -0.9237139 ,  1.30772955,  1.67592916]])
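Since the original goal was to print the top features, here is a minimal sketch that averages the per-fold coefficients and ranks the features for one class. It rests on a few assumptions that are not part of the answer above: the helper name fitted_estimator is made up for illustration, the attribute lookup tries `estimator` first and falls back to `base_estimator` because the name differs between scikit-learn releases, and max_iter=10000 is only there to quiet convergence warnings.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC


def fitted_estimator(calibrated):
    """Hypothetical helper: return the fitted LinearSVC inside one calibrated classifier."""
    # Assumption: the attribute is `estimator` in newer releases and
    # `base_estimator` in older ones, so try both.
    if hasattr(calibrated, "estimator"):
        return calibrated.estimator
    return calibrated.base_estimator


iris = load_iris()
model = CalibratedClassifierCV(LinearSVC(random_state=0, max_iter=10000), cv=3)
model.fit(iris.data, iris.target)

# One coefficient matrix per fold, each of shape (n_classes, n_features)
coefs = [fitted_estimator(c).coef_ for c in model.calibrated_classifiers_]
coef_avg = np.mean(coefs, axis=0)

# Rank the features for class 0 by the absolute value of the averaged weight
order = np.argsort(np.abs(coef_avg[0]))[::-1]
top_features = [iris.feature_names[i] for i in order]
print(top_features)
```

Averaging across folds is a heuristic: each fold's model was trained on a different subset, so the averaged weights summarise the ensemble rather than describe any single fitted classifier.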

Note: Starting from scikit-learn version 0.24, the CalibratedClassifierCV constructor exposes an ensemble argument. If it is set to False (and cv is not set to "prefit"), CalibratedClassifierCV exposes only one calibrated classifier, whose base estimator is trained on all of the training data. This means we no longer need to loop over calibrated_classifiers_ and average: the list holds a single element that carries the coefficients.

model = CalibratedClassifierCV(LinearSVC(random_state=0), ensemble=False)
model.fit(iris.data, iris.target)
model.calibrated_classifiers_

# Returns a list with one element, [<sklearn.calibration._CalibratedClassifier at 0x7f15d0c57550>]

(using an example above, given by Parthasarathy)
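With ensemble=False, reading the coefficients might look like the sketch below. The same assumption applies here: the inner attribute is looked up as `estimator` first and `base_estimator` second, because the name depends on the installed scikit-learn release; max_iter=10000 is added only to avoid convergence warnings.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

iris = load_iris()
model = CalibratedClassifierCV(
    LinearSVC(random_state=0, max_iter=10000), ensemble=False
)
model.fit(iris.data, iris.target)

# With ensemble=False the list contains exactly one calibrated classifier
(calibrated,) = model.calibrated_classifiers_

# Assumption: attribute name varies by release, so try both
if hasattr(calibrated, "estimator"):
    inner = calibrated.estimator
else:
    inner = calibrated.base_estimator

coef = inner.coef_  # shape (n_classes, n_features), no averaging needed
print(coef.shape)
```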

Answered By: Tomasz Bartkowiak