HashingVectorizer and MultinomialNB are not working together

Question:

I am trying to write a Twitter sentiment analysis program with scikit-learn in Python 2.7, on Ubuntu 14.04.

In the vectorizing step, I want to use HashingVectorizer(). When I test classifier accuracy, it works fine with the LinearSVC, NuSVC, GaussianNB, BernoulliNB and LogisticRegression classifiers, but MultinomialNB raises this error:

Traceback (most recent call last):
  File "/media/test.py", line 310, in <module>
    classifier_rbf.fit(train_vectors, y_trainTweets)
  File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 552, in fit
    self._count(X, Y)
  File "/home/.local/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 655, in _count
    raise ValueError("Input X must be non-negative")
ValueError: Input X must be non-negative
[Finished in 16.4s with exit code 1] 

Here is the code block related to this error:

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

# Vectorize the tweets
vectorizer = HashingVectorizer()
train_vectors = vectorizer.fit_transform(x_trainTweets)
test_vectors = vectorizer.transform(x_testTweets)

# Train and predict
classifier_rbf = MultinomialNB()
classifier_rbf.fit(train_vectors, y_trainTweets)
prediction_rbf = classifier_rbf.predict(test_vectors)

Why is this happening and how can I solve it?

Asked By: ehsan badakhshan


Answers:

You need to set the non_negative argument to True when initialising your vectorizer:

vectorizer = HashingVectorizer(non_negative=True)
Answered By: silentser
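
For reference, a minimal self-contained sketch of this fix (the sample tweets and labels are made up for illustration; the non_negative parameter only exists in older scikit-learn releases):

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

tweets = ["great day", "awful service", "love it", "hate this"]  # made-up sample
labels = ["pos", "neg", "pos", "neg"]

# non_negative=True keeps all hashed feature values >= 0,
# which is what MultinomialNB needs (older scikit-learn only)
vectorizer = HashingVectorizer(non_negative=True)
X = vectorizer.fit_transform(tweets)

clf = MultinomialNB().fit(X, labels)
print(clf.predict(vectorizer.transform(["great service"])))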

If the non_negative argument isn't available (as in my version of scikit-learn), try:

vectorizer = HashingVectorizer(alternate_sign=False)

The non_negative argument has been replaced by alternate_sign, so where you would have set non_negative=True, set alternate_sign=False instead. It achieves the same result.

Answered By: Shuvo Alok
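
A similar self-contained sketch for the newer API, again using made-up sample data, which also checks that the hashed matrix has no negative entries once alternate_sign=False is set:

from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.naive_bayes import MultinomialNB

tweets = ["great day", "awful service", "love it", "hate this"]  # made-up sample
labels = ["pos", "neg", "pos", "neg"]

# alternate_sign=False disables the sign flipping, so every hashed
# feature value is >= 0 and MultinomialNB accepts the matrix
vectorizer = HashingVectorizer(alternate_sign=False)
X = vectorizer.fit_transform(tweets)

print(X.min() >= 0)          # True: no negative entries left
MultinomialNB().fit(X, labels)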