Do the value counts of 1's and 0's in my dataframe matter when using a Decision Tree Classifier?

Question:

I am using a Decision Tree Classifier. The target column in my data is ‘TARGET’, which consists of 0’s and 1’s:

    TARGET
    0    282686
    1     24825
    dtype: int64

After training on 0.75 of the whole data, the model outputs 0 for every prediction, and the accuracy_score for the training, validation, and test sets is >0.90.
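
For context, the two counts in the question are enough to work out what a model that always outputs 0 would score. The snippet below is only a rough check, using plain Python and the numbers copied from the question. The variable names are illustrative.

```python
# Class counts copied from the value counts shown in the question.
counts = {0: 282686, 1: 24825}

total = sum(counts.values())
majority_class = max(counts, key=counts.get)

# Accuracy of a model that ignores the features and always predicts
# the most frequent class.
baseline_accuracy = counts[majority_class] / total
minority_share = counts[1] / total

print(f"always-predict-{majority_class} accuracy: {baseline_accuracy:.4f}")
print(f"share of 1's in the data: {minority_share:.4f}")
```

With these counts the all-zeros baseline lands at roughly 0.92, so an accuracy above 0.90 is what a model predicting only 0 would be expected to report.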

Asked By: Karan Dhar


Answers:

Evaluation:

Accuracy on the test set should normally be a little lower than accuracy on the training set. The model has already seen every pattern in the training data during fitting, whereas the test data is new to it.
A simple way to evaluate the model is to compute the training and test accuracy and compare them.

Results

  1. If train accuracy < test accuracy, there is a problem, so check everything. The problem most likely lies in the train/test split. Using a stratified split will help here, or try a different subset of the data.
  2. If train accuracy > test accuracy, the setup is most likely right, and you can work on optimizing the model.
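
The comparison and the stratified split described above could be sketched as follows with scikit-learn. The data here is synthetic stand-in data (an assumption, not the asker's dataframe) with about 8% of rows labelled 1. The `max_depth=5` setting and the variable names are likewise illustrative choices.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption): about 8% of the rows are labelled 1.
rng = np.random.default_rng(0)
n_rows = 5000
X = rng.normal(size=(n_rows, 5))
y = (rng.random(n_rows) < 0.08).astype(int)

# stratify=y keeps the share of 1's (almost) identical in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.75, stratify=y, random_state=0
)

clf = DecisionTreeClassifier(max_depth=5, random_state=0)
clf.fit(X_train, y_train)

# Compute accuracy on both parts so they can be compared.
train_acc = accuracy_score(y_train, clf.predict(X_train))
test_acc = accuracy_score(y_test, clf.predict(X_test))

print("share of 1's in train:", y_train.mean())
print("share of 1's in test: ", y_test.mean())
print("train accuracy:", train_acc)
print("test accuracy: ", test_acc)
```

Printing the class shares next to the accuracies makes it easy to confirm that the split preserved the class balance before reading anything into the two accuracy values.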
Answered By: Mohammed Shammeer