Pass a dict to a scikit-learn estimator

Question:

I am trying to pass model parameters as a dict to a Scikit-learn estimator and am having no luck. It just seems to nest my dict into one of the parameters. For instance:

params = {
 'copy_X': True, 
 'fit_intercept': False, 
 'normalize': True
}

lr = LinearRegression(params)

Gives me:

LinearRegression(copy_X=True,
         fit_intercept={'copy_X': True, 'fit_intercept': False,'normalize': True},
     normalize=False)

Additionally, I created a function to iterate over the dict and can create a string like:

'copy_X=True, fit_intercept=True, normalize=False'

This was equally unsuccessful. Does anyone have any advice here? The only restriction I have is that the data will be coming to me as a dict (well, actually a JSON object loaded with json.loads).

Thanks.

Asked By: GMarsh


Answers:

I got it. I used setattr on an already constructed estimator, like this:

lr = LinearRegression()
for k, v in params.items():
    setattr(lr, k, v)
Answered By: GMarsh

fit_intercept is the first argument of the LinearRegression object

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html

That explains why your dictionary ends up in that argument: it is passed positionally, so it is bound to the first parameter. The other arguments, copy_X and normalize, are also optional and receive nothing, so they keep their default values.
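To make the positional binding concrete, here is a minimal sketch using a stand-in class. ToyRegression is hypothetical and only mimics the argument order described above (fit_intercept first); it is not the scikit-learn class, and recent scikit-learn releases may reject a positional argument outright instead of binding it.

```python
class ToyRegression:
    """Hypothetical stand-in that mimics the argument order discussed above."""

    def __init__(self, fit_intercept=True, normalize=False, copy_X=True):
        self.fit_intercept = fit_intercept
        self.normalize = normalize
        self.copy_X = copy_X


params = {"copy_X": True, "fit_intercept": False, "normalize": True}

# Passed positionally: the whole dict is bound to the first parameter.
positional = ToyRegression(params)

# Unpacked with **: each key is matched to the parameter of the same name.
unpacked = ToyRegression(**params)

print(positional.fit_intercept)  # the entire dict
print(unpacked.fit_intercept)    # the value stored under 'fit_intercept'
```

The same rule applies to any Python callable: a bare dict is one positional argument, while **params spreads it into keyword arguments.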

You could also do:

params = {
 'copy_X': True, 
 'fit_intercept': False, 
 'normalize': True
}

lr = LinearRegression(copy_X = params['copy_X'], 
                      fit_intercept = params['fit_intercept'], 
                      normalize = params['normalize'])
Answered By: David Zemens

The best solution to initialise your estimator with the right parameters would be to unpack your dictionary:

lr = LinearRegression(**params)

If for some reason you need to set some parameters afterwards, you could use:

lr.set_params(**params)

This has an advantage over using setattr: it lets scikit-learn validate the parameters, for example by rejecting names the estimator does not define.
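As a rough end-to-end sketch, assuming the parameters arrive as a JSON string: the payload below is made up for illustration, and it leaves out normalize because, as far as I know, recent scikit-learn releases no longer accept that parameter for LinearRegression. The name not_a_parameter is likewise invented to show the difference between set_params and setattr.

```python
import json

from sklearn.linear_model import LinearRegression

# Hypothetical JSON payload standing in for the data described in the question.
payload = '{"copy_X": true, "fit_intercept": false}'
params = json.loads(payload)  # -> a plain Python dict

# Unpack the dict into keyword arguments at construction time.
lr = LinearRegression(**params)

# Change a parameter later through the estimator API.
lr.set_params(fit_intercept=True)

# set_params rejects names the estimator does not define.
try:
    lr.set_params(not_a_parameter=1)
    rejected = False
except ValueError:
    rejected = True

# setattr performs no such check; it just attaches a new attribute.
setattr(lr, "not_a_parameter", 1)

print(lr.get_params())
print(rejected)
```

With setattr, a misspelled key would go unnoticed, which is the main reason to prefer unpacking or set_params when the dict comes from an external source.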

Answered By: ldirer