python machine learning evaluation metric

Question:

I applied an SVM (RBF kernel) model to a pandas DataFrame with 8528 rows (8528 samples) to classify them into 4 classes. The train-test split is 50-50. At the end I get these results:

          precision    recall  f1-score   support

       A       0.00      0.00      0.00       386
       N       0.60      1.00      0.75      2563
       O       0.00      0.00      0.00      1180
       ~       0.00      0.00      0.00       135

accuracy                           0.60      4264

Does anybody know why the scores of all classes except "N" are equal to 0.00? (The results are approximately the same on every run, even with different train-test split percentages, for example 80-20 or 70-30.)

Asked By: jasmine


Answers:

This looks like a classic case of the model collapsing to the majority class (which is closer to underfitting than to overfitting): your model always returns the class "N".

Once you take this into account, the numbers make sense, e.g.:

Precision(N) = #(correctly classified as N) / #(everything classified as N) = 2563 / 4264 ≈ 0.60
Recall(N) = #(correctly classified as N) / #(all samples that really are N) = 2563 / 2563 = 1.00
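
As a rough check, the arithmetic above can be reproduced in a few lines of plain Python. This is only a sketch: the class counts are copied from the "support" column in the question, and the prediction list assumes the model really does answer "N" for every sample. The helper name precision_recall is made up for this illustration.

```python
# Class counts taken from the "support" column of the report in the question.
support = {"A": 386, "N": 2563, "O": 1180, "~": 135}

# Rebuild the true labels from the counts.
y_true = [label for label, n in support.items() for _ in range(n)]
# Assumption: the model predicts "N" for every single sample.
y_pred = ["N"] * len(y_true)


def precision_recall(y_true, y_pred, label):
    """Per-class precision and recall, using 0.0 when a denominator is zero."""
    true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)  # everything classified as `label`
    actual = sum(t == label for t in y_true)     # everything that really is `label`
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


for label in support:
    precision, recall = precision_recall(y_true, y_pred, label)
    print(f"{label}: precision={precision:.2f} recall={recall:.2f}")
```

Under that assumption, "N" comes out at precision 0.60 and recall 1.00, and every other class at 0.00 for both, which matches the report in the question.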

This is probably happening because your dataset is very imbalanced: about 60% of your data is "N" (2563 of the 4264 test samples).
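
To see how strong the imbalance is, it helps to look at the class shares and at the accuracy you would get by always guessing the most frequent class. The sketch below again rebuilds the labels from the "support" column in the question; with your real data you would count the actual label column instead.

```python
from collections import Counter

# Test-set labels rebuilt from the "support" column of the report in the question.
support = {"A": 386, "N": 2563, "O": 1180, "~": 135}
labels = [label for label, n in support.items() for _ in range(n)]

counts = Counter(labels)
total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}

# Accuracy of a "model" that always predicts the most frequent class.
majority_label, majority_count = counts.most_common(1)[0]
baseline_accuracy = majority_count / total

for label, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{label}: {share:.1%}")
print(f"always predicting '{majority_label}' gives accuracy {baseline_accuracy:.2f}")
```

If your classifier's accuracy is the same as this baseline, it has not learned anything beyond the class frequencies. On a DataFrame, df["label"].value_counts(normalize=True) gives the same shares (the column name "label" is only a placeholder). Things that may be worth trying are scaling the features before fitting and passing class_weight="balanced" to sklearn.svm.SVC, though whether either helps depends on your data.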

Answered By: Wormfan