How to tune GaussianNB?
Question:
Trying to fit data with GaussianNB() gives me a low accuracy score.
I’d like to try grid search, but it seems that the parameters sigma and theta cannot be set. Is there any way to tune GaussianNB?
Answers:
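Naive Bayes doesn’t have any hyperparameters to tune in the usual sense: sigma and theta are estimated from the training data during fitting, not set by the user.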
As of scikit-learn version 0.20, GaussianNB().get_params().keys() returns 'priors' and 'var_smoothing'.
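To make the distinction between settable hyperparameters and the learned quantities from the question concrete, here is a minimal sketch. The iris dataset is only an illustrative assumption and is not part of the original answers. The fitted means are read from theta_, and the variances are looked up under var_ with a fallback to the older sigma_ name.

```python
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

# Illustrative data (an assumption; any numeric X, y would do).
X, y = load_iris(return_X_y=True)

model = GaussianNB()

# Hyperparameters: set before fitting, so these are what a grid search can vary.
param_names = sorted(model.get_params().keys())
print(param_names)

model.fit(X, y)

# Learned parameters: estimated from the data during fit, not set by the user.
means = model.theta_  # per-class feature means, shape (n_classes, n_features)
# Newer scikit-learn releases expose the variances as var_; older ones used sigma_.
variances = model.var_ if hasattr(model, "var_") else model.sigma_
print(means.shape, variances.shape)
```

Only the names printed first can appear as keys in a GridSearchCV parameter grid.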
A grid search would look like:
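from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('clf', GaussianNB())
])
# Pipeline parameters are addressed as '<step name>__<parameter name>'
parameters = {
    'clf__priors': [None],
    'clf__var_smoothing': [1e-8, 1e-9]
}
cv = GridSearchCV(pipeline, param_grid=parameters)
cv.fit(X_train, y_train)  # X_train, y_train: your training data
y_pred_gnb = cv.predict(X_test)  # predicts with the best model found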
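For context on what such a grid is varying: var_smoothing adds a fraction of the largest feature variance to every variance estimate for numerical stability. The sketch below checks this on synthetic data by reading the fitted epsilon_ attribute. The use of make_classification and its settings are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

# Synthetic data, purely illustrative.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

var_smoothing = 1e-9
model = GaussianNB(var_smoothing=var_smoothing).fit(X, y)

# The absolute amount added to each variance is var_smoothing times the
# largest per-feature variance of the training data.
expected_epsilon = var_smoothing * np.var(X, axis=0).max()
print(model.epsilon_, expected_epsilon)
```

Larger values of var_smoothing therefore widen every Gaussian, which is why a log-spaced grid is a common way to search it.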
You can tune the var_smoothing parameter like this:
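import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

nb_classifier = GaussianNB()
# 100 candidate values, log-spaced from 1 down to 1e-9
params_NB = {'var_smoothing': np.logspace(0, -9, num=100)}
# cv_method must be defined beforehand: an integer number of folds
# or any cross-validation splitter
gs_NB = GridSearchCV(estimator=nb_classifier,
                     param_grid=params_NB,
                     cv=cv_method,
                     verbose=1,
                     scoring='accuracy')
gs_NB.fit(x_train, y_train)
gs_NB.best_params_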
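A self-contained variant of this search might look like the following sketch. The iris dataset, the train/test split, the StratifiedKFold splitter used for cv_method, and the smaller grid of 20 values are all assumptions made so the example runs on its own. They are not choices from the original answer.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.naive_bayes import GaussianNB

# Illustrative data and split (assumptions, not from the original answer).
X, y = load_iris(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# One possible choice for cv_method: shuffled, stratified 5-fold CV.
cv_method = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Log-spaced candidates from 1 down to 1e-9 (20 values to keep it quick).
params_NB = {"var_smoothing": np.logspace(0, -9, num=20)}

gs_NB = GridSearchCV(
    estimator=GaussianNB(),
    param_grid=params_NB,
    cv=cv_method,
    scoring="accuracy",
)
gs_NB.fit(x_train, y_train)

print(gs_NB.best_params_)           # best var_smoothing found
print(gs_NB.best_score_)            # mean cross-validated accuracy
print(gs_NB.score(x_test, y_test))  # accuracy on the held-out split
```

Comparing best_score_ with the held-out score gives a rough check that the tuned value is not just fitting the cross-validation folds.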
In an sklearn pipeline it may look as follows:
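from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

pipe = Pipeline(steps=[
    ('pca', PCA()),
    ('estimator', GaussianNB()),
])
# Pipeline parameters are addressed as '<step name>__<parameter name>'
parameters = {'estimator__var_smoothing': [1e-11, 1e-10, 1e-9]}
Bayes = GridSearchCV(pipe, parameters, scoring='accuracy', cv=10).fit(X_train, y_train)
print(Bayes.best_estimator_)
print('best score:')
print(Bayes.best_score_)
predictions = Bayes.best_estimator_.predict(X_test)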
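Because GaussianNB has so few settings of its own, it can help to search the preprocessing steps in the same grid. The sketch below is one way to do that, assuming the iris dataset and an extra pca__n_components grid. Both are illustrative additions, not part of the original answer.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

# Illustrative data and split (assumptions).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

pipe = Pipeline(steps=[
    ("pca", PCA()),
    ("estimator", GaussianNB()),
])

# Each key is '<step name>__<parameter name>', so both steps can be tuned.
parameters = {
    "pca__n_components": [2, 3, 4],
    "estimator__var_smoothing": [1e-11, 1e-10, 1e-9],
}

search = GridSearchCV(pipe, parameters, scoring="accuracy", cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
test_accuracy = search.score(X_test, y_test)  # uses the refitted best pipeline
print(test_accuracy)
```

With the default refit=True, calling predict or score on the search object uses the best pipeline refitted on all of the training data.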