How to tune GaussianNB?

Question:

Trying to fit data with GaussianNB() gives me a low accuracy score.

I’d like to try a grid search, but it seems that the parameters sigma and theta cannot be set. Is there any way to tune GaussianNB?

Asked By: vlad

Answers:

Naive Bayes doesn’t have any hyperparameters to tune.

Answered By: Matheus Schaly

As of scikit-learn version 0.20,

GaussianNB().get_params().keys()

returns ‘priors’ and ‘var_smoothing’.
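
To check which parameters are tunable on an installed version, a minimal sketch along these lines should work. It assumes only that scikit-learn 0.20 or later is installed; the variable name params is illustrative.

```python
from sklearn.naive_bayes import GaussianNB

# get_params() returns a dict mapping each constructor argument to its current value.
params = GaussianNB().get_params()

print(sorted(params))           # names of the tunable parameters
print(params["var_smoothing"])  # default value of var_smoothing
```

Any key printed here can be used as a key in a GridSearchCV parameter grid.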

A grid search would look like:

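from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

pipeline = Pipeline([
    ('clf', GaussianNB())
])

# Distinct candidate values around the default of 1e-9
parameters = {
    'clf__priors': [None],
    'clf__var_smoothing': [1e-10, 1e-9, 1e-8]
}

cv = GridSearchCV(pipeline, param_grid=parameters)

# X_train, y_train and X_test are assumed to be defined already
cv.fit(X_train, y_train)
y_pred_gnb = cv.predict(X_test)

Answered By: Helen Batson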

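You can tune the ‘var_smoothing’ parameter like this:

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB

nb_classifier = GaussianNB()

# Assumed here: 5-fold cross-validation; any cross-validation technique works
cv_method = 5

# 100 candidates spaced logarithmically from 1 down to 1e-9
params_NB = {'var_smoothing': np.logspace(0, -9, num=100)}
gs_NB = GridSearchCV(estimator=nb_classifier,
                     param_grid=params_NB,
                     cv=cv_method,
                     verbose=1,
                     scoring='accuracy')
# x_train and y_train are assumed to be defined already
gs_NB.fit(x_train, y_train)

gs_NB.best_params_

Answered By: ana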
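
On the sigma and theta mentioned in the question: these are quantities the model estimates during fit (per-class variances and means), not settings, which is why a grid search cannot set them. What var_smoothing controls is a constant added to every variance for numerical stability. The sketch below illustrates this on synthetic data; the data and the candidate values are assumptions chosen only for the demonstration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

# Synthetic data (an assumption): three features with very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([1.0, 10.0, 0.1])
y = (X[:, 0] > 0).astype(int)

for vs in (1e-9, 1e-3, 1.0):
    model = GaussianNB(var_smoothing=vs).fit(X, y)
    # epsilon_ is the value added to the variances: var_smoothing times
    # the largest per-feature variance in the training data.
    expected = vs * X.var(axis=0).max()
    print(vs, model.epsilon_, np.isclose(model.epsilon_, expected))

# theta_ holds the per-class feature means learned from the data.
print(model.theta_.shape)
```

Because the added constant scales with the largest feature variance, features with very different scales can make the result sensitive to var_smoothing, which is one reason a logarithmic grid is a sensible choice.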

In an sklearn pipeline it may look as follows:

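from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

pipe = Pipeline(steps=[
    ('pca', PCA()),
    ('estimator', GaussianNB()),
])

# Pipeline parameters are addressed as <step name>__<parameter name>
parameters = {'estimator__var_smoothing': [1e-11, 1e-10, 1e-9]}
# X_train, y_train and X_test are assumed to be defined already
Bayes = GridSearchCV(pipe, parameters, scoring='accuracy', cv=10).fit(X_train, y_train)
print(Bayes.best_estimator_)
print('best score:')
print(Bayes.best_score_)
predictions = Bayes.best_estimator_.predict(X_test)

Answered By: Pavel Fedotov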
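
Since a pipeline exposes every step's parameters under the step__parameter naming scheme, the PCA step can be tuned together with var_smoothing. The following is a self-contained sketch, not a recommended configuration: the synthetic data from make_classification, the grid values, and the names search and test_accuracy are all assumptions standing in for your own data and choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import Pipeline

# Synthetic stand-in data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=400, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

pipe = Pipeline(steps=[("pca", PCA()), ("estimator", GaussianNB())])

# Keys follow the <step name>__<parameter name> convention.
param_grid = {
    "pca__n_components": [2, 5, 10],
    "estimator__var_smoothing": np.logspace(-11, -1, 6),
}
search = GridSearchCV(pipe, param_grid, scoring="accuracy", cv=5)
search.fit(X_train, y_train)

print(search.best_params_)
# score() evaluates the refitted best pipeline on held-out data.
test_accuracy = search.score(X_test, y_test)
print(test_accuracy)
```

Comparing the held-out accuracy with that of an untuned GaussianNB() is a quick way to see whether tuning helps on a given dataset; if it barely changes, the low score may have more to do with the features than with the smoothing.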