Setting an exact number of iterations for logistic regression in Python
Question:
I’m creating a model to perform Logistic regression on a dataset using Python. This is my code:
from sklearn import linear_model
my_classifier2 = linear_model.LogisticRegression(solver='lbfgs', max_iter=10000)
Now, according to the scikit-learn documentation, max_iter is the maximum number of iterations taken for the solvers to converge. How do I specify that I want exactly ‘N’ iterations?
Any kind of help would be really appreciated.
Answers:
I’m not sure, but do you want to know the optimal number of iterations for your model? If so, you are better off using GridSearchCV, which can tune hyperparameters like max_iter.
Briefly,
- Split your data into train/test sets with train_test_split or KFold, both of which can be imported from sklearn
- Set your parameter grid, for instance para = [{'max_iter': [1, 10, 100, 1000]}]
- Instantiate the search, for example clf = GridSearchCV(LogisticRegression(), param_grid=para, cv=5, scoring='accuracy')
- Fit it on your training data: clf.fit(x_train, y_train) (a full sketch putting these steps together follows this list)
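Here is a minimal sketch of those four steps end to end. The make_classification data, the train/test split, and the specific grid values are only placeholders for your own dataset and search space:

# Minimal sketch of the GridSearchCV workflow above (placeholder data).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Candidate max_iter values to scan
para = [{'max_iter': [1, 10, 100, 1000]}]

# Note: pass an *instance* of the estimator, not the class
clf = GridSearchCV(LogisticRegression(solver='lbfgs'), param_grid=para,
                   cv=5, scoring='accuracy')
clf.fit(x_train, y_train)

print(clf.best_params_)           # e.g. {'max_iter': 100}
print(clf.score(x_test, y_test))  # accuracy on the held-out test data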
You can also search for the best number of iterations with RandomizedSearchCV or BayesianOptimization.
Regarding grid-searching the max_iter parameter: fitted LogisticRegression models have an attribute n_iter_, so you can discover the exact max_iter needed for a given sample size and set of features:
n_iter_: ndarray of shape (n_classes,) or (1, )
Actual number of iterations for all classes. If binary or multinomial, it
returns only 1 element. For liblinear solver, only the maximum number of
iteration across all classes is given.
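For example, a minimal sketch (with placeholder data) that fits once with a generous max_iter and then reads n_iter_ to see how many iterations the solver actually needed:

# Fit with a generous max_iter, then inspect the iterations actually used.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

model = LogisticRegression(solver='lbfgs', max_iter=10000)
model.fit(X, y)

print(model.n_iter_)  # e.g. array([57]) -- iterations the solver actually needed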
Scanning very fine intervals, like 1 by 1, is a waste of resources that would be better spent on more important LogisticRegression parameters, such as the combination of the solver itself, its regularization penalty, and the inverse of the regularization strength C, which together contribute to faster convergence within a given max_iter.
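As an illustration, a hedged sketch of such a grid; each dict only pairs a solver with penalties it supports, and the C values are arbitrary placeholders:

# Tune solver, penalty and C together instead of fine-grained max_iter steps.
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

param_grid = [
    {'solver': ['lbfgs'],     'penalty': ['l2'],       'C': [0.01, 0.1, 1, 10]},
    {'solver': ['liblinear'], 'penalty': ['l1', 'l2'], 'C': [0.01, 0.1, 1, 10]},
]

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid,
                      cv=5, scoring='accuracy')
# search.fit(x_train, y_train)  # using the train split from the earlier sketch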
Setting a very high max_iter can also be a waste of resources if you haven’t first done at least minimal feature preprocessing: feature scaling, and possibly imputation, outlier clipping, and dimensionality reduction (e.g. PCA).
Things can get worse: a tuned max_iter may be fine for a given sample size but not for a bigger one, for instance when you are building a cross-validated learning curve, which by the way is imperative for optimal machine learning.
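A minimal sketch of such a learning curve (placeholder data; a max_iter tuned on a small subset may stop converging at the larger training sizes):

# Cross-validated learning curve: larger training fractions may need more iterations.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)

train_sizes, train_scores, valid_scores = learning_curve(
    LogisticRegression(solver='lbfgs', max_iter=100),  # max_iter fixed up front
    X, y, train_sizes=np.linspace(0.1, 1.0, 5), cv=5)

print(train_sizes)               # absolute training-set sizes used
print(valid_scores.mean(axis=1)) # mean cross-validation score at each size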
It gets even worse if you increase the sample size in a pipeline that generates feature vectors, such as n-grams (NLP): more rows will generate more (sparse) features for the LogisticRegression classifier.
I think it’s important to check whether different solvers converge or not for a given sample size, set of generated features, and max_iter.
Methods that speed up convergence, and may eventually remove the need to increase max_iter, are:
- Feature scaling
- Dimensionality Reduction (e.g. PCA) of scaled features
There’s a nice sklearn example demonstrating the importance of feature scaling.
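Along those lines, a minimal sketch comparing the iterations needed with and without scaling plus PCA; the wine dataset is just a convenient built-in stand-in for your own data:

# Compare n_iter_ on raw features vs. scaled (and PCA-reduced) features.
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_wine(return_X_y=True)

raw = LogisticRegression(solver='lbfgs', max_iter=10000).fit(X, y)

scaled = make_pipeline(StandardScaler(), PCA(n_components=10),
                       LogisticRegression(solver='lbfgs', max_iter=10000)).fit(X, y)

print(raw.n_iter_)                                       # iterations on raw features
print(scaled.named_steps['logisticregression'].n_iter_)  # usually far fewer after scaling + PCA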