Sudden drop in validation accuracy during training

Question:

While training my neural network, there was a sudden drop in validation accuracy during the 8th epoch. What does this mean?

Train for 281 steps, validate for 24 steps
Epoch 1/10
281/281 [==============================] - 106s 378ms/step - loss: 1.5758 - accuracy: 0.8089 - val_loss: 1.8909 - val_accuracy: 0.4766
Epoch 2/10
281/281 [==============================] - 99s 353ms/step - loss: 1.5057 - accuracy: 0.8715 - val_loss: 1.7364 - val_accuracy: 0.6276
Epoch 3/10
281/281 [==============================] - 99s 353ms/step - loss: 1.4829 - accuracy: 0.8929 - val_loss: 1.5347 - val_accuracy: 0.8398
Epoch 4/10
281/281 [==============================] - 99s 353ms/step - loss: 1.4445 - accuracy: 0.9301 - val_loss: 1.5551 - val_accuracy: 0.8047
Epoch 5/10
281/281 [==============================] - 99s 353ms/step - loss: 1.4331 - accuracy: 0.9412 - val_loss: 1.5043 - val_accuracy: 0.8659
Epoch 6/10
281/281 [==============================] - 97s 344ms/step - loss: 1.4100 - accuracy: 0.9639 - val_loss: 1.5562 - val_accuracy: 0.8151
Epoch 7/10
281/281 [==============================] - 96s 342ms/step - loss: 1.4140 - accuracy: 0.9585 - val_loss: 1.4935 - val_accuracy: 0.8737
Epoch 8/10
281/281 [==============================] - 96s 341ms/step - loss: 1.4173 - accuracy: 0.9567 - val_loss: 1.7569 - val_accuracy: 0.6055
Epoch 9/10
281/281 [==============================] - 96s 340ms/step - loss: 1.4241 - accuracy: 0.9490 - val_loss: 1.4756 - val_accuracy: 0.9023
Epoch 10/10
281/281 [==============================] - 96s 340ms/step - loss: 1.4067 - accuracy: 0.9662 - val_loss: 1.4167 - val_accuracy: 0.9648

Asked By: Alex Lee


Answers:

Sudden drops in validation loss and training loss occur because of batch training; in essence, convergence would be smooth only if we trained on the entire dataset at once, not on mini-batches. Therefore, it is normal to see such drops, both in training and in validation (the toy sketch below illustrates the sampling noise).
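To see where that noise comes from, here is a toy numpy sketch (the per-sample losses are synthetic, purely for illustration): the loss over the full dataset is a single fixed number, while estimates from random mini-batches scatter around it.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic per-sample losses for a hypothetical dataset of 10,000 examples.
    per_sample_loss = rng.exponential(scale=1.5, size=10_000)

    # The full-dataset loss is one fixed number...
    print(f"full-dataset loss: {per_sample_loss.mean():.3f}")

    # ...but mini-batch estimates of it fluctuate from step to step.
    for step in range(5):
        batch = rng.choice(per_sample_loss, size=32, replace=False)
        print(f"batch {step}: mean loss {batch.mean():.3f}")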

  • val_loss: 1.4935 – val_accuracy: 0.8737 (Previous epoch)
  • val_loss: 1.7569 – val_accuracy: 0.6055 (Epoch with drop)
  • val_loss: 1.4756 – val_accuracy: 0.9023 (Next epoch)

If you look at the validation loss, it merely increased by 0.26; however, this translated into a drop of roughly 27 percentage points in accuracy. This happens because your model is not certain when it makes predictions (at least at this stage of training).

Imagine that you have a binary classification model (apples vs. oranges). Suppose that, for every validation image whose ground truth is an apple, the network is 51% confident the image is an apple. Since Keras, behind the curtains, uses a default confidence threshold of 50%, all of these predictions count as correct and you get good accuracy.
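For reference, this threshold is explicit in Keras's BinaryAccuracy metric; a quick check (using made-up 51%/49% confidences) shows the same prediction counting as correct or wrong depending on which side of 0.5 it falls:

    import tensorflow as tf

    # Keras's binary accuracy thresholds probabilities at 0.5 by default.
    metric = tf.keras.metrics.BinaryAccuracy(threshold=0.5)
    metric.update_state([1.0], [0.51])  # 51% confident "apple" -> correct
    print(metric.result().numpy())      # 1.0

    metric.reset_state()
    metric.update_state([1.0], [0.49])  # 49% confident -> counted as wrong
    print(metric.result().numpy())      # 0.0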

However, then comes the ‘problematic’ epoch. After another epoch of training has shifted the weights of your neural network, predicting on your validation dataset now yields a confidence of 48-49% for each ground-truth apple. Since the threshold is still 50%, every one of those predictions flips to incorrect, and the accuracy is much poorer than in the previous epoch.

As you can now infer from that explanation, this situation affects the accuracy much more than the loss. It barely moves the loss, because the difference between a 49% and a 51% confidence is not very significant when computing it (in your case, the validation loss changed by only 0.26). After all, even in the ‘previous epoch’, when the model correctly predicted an apple, the network was not extremely confident: it yielded only 51% confidence for an apple, not 95% for instance.
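To make this concrete, here is a minimal numpy sketch of the apples scenario (the 100-image validation set and the exact 51%/49% confidences are made-up numbers): the binary cross-entropy barely moves, while the thresholded accuracy collapses from 100% to 0%.

    import numpy as np

    # Hypothetical validation set: 100 images, all apples (label 1).
    y_true = np.ones(100)

    def binary_cross_entropy(y_true, y_pred):
        # Mean binary cross-entropy, the quantity the loss tracks.
        return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

    def binary_accuracy(y_true, y_pred, threshold=0.5):
        # Predictions above the threshold count as class 1 (apple).
        return np.mean((y_pred > threshold) == y_true.astype(bool))

    p_prev = np.full(100, 0.51)  # "previous epoch": just above the threshold
    p_drop = np.full(100, 0.49)  # "problematic epoch": just below it

    print(binary_cross_entropy(y_true, p_prev), binary_accuracy(y_true, p_prev))
    # ~0.673 loss, 1.0 accuracy
    print(binary_cross_entropy(y_true, p_drop), binary_accuracy(y_true, p_drop))
    # ~0.713 loss, 0.0 accuracy

A loss change of about 0.04 flips the accuracy from 100% to 0%; the same mechanism, at multi-class scale, is what happened in your 8th epoch.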

Answered By: Timbus Calin