Multi Label Classification – Incorrect training hyperparameters?

Question:

I am working on a multi-label image classification problem, using TensorFlow, Keras and Python 3.9.

I have built a dataset containing one .csv file with image names and their respective multi-hot encoded labels (one binary column per class), like so:

Dataset Sample

I also have an image folder with the associated image files. There are around 17,000 images, and each one can carry any subset of 29 possible labels. The dataset is fairly well balanced. These labels refer to the visual components found in an image; for example, the following image belongs to classes [02, 23, 05]:

Image Sample

  • 2 – Human Beings
  • 5 – Plants
  • 23 – Arms, Armour

This method of image labelling is widely used for trademark imagery and is known as the Vienna Classification.
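
For context, I load the images and labels roughly like this (the file, folder, and column names below are placeholders, not my exact schema):

import pandas as pd
import tensorflow as tf

IMG_DIR = "images/"                     # assumed image folder
df = pd.read_csv("labels.csv")          # assumed file name

paths = (IMG_DIR + df["image_name"]).to_numpy()               # assumed column name
labels = df.drop(columns=["image_name"]).to_numpy("float32")  # 29 multi-hot columns

def load_image(path, label):
    img = tf.io.decode_jpeg(tf.io.read_file(path), channels=3)
    img = tf.image.resize(img, (224, 224)) / 255.0   # resize + scale RGB to [0, 1]
    return img, label

ds = tf.data.Dataset.from_tensor_slices((paths, labels))
ds = ds.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
ds = ds.shuffle(1024).batch(64).prefetch(tf.data.AUTOTUNE)
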
Now, my goal is to perform predictions on similar images. For this, I am fine-tuning a VGG19 network with a custom prediction layer defined as follows:

prediction_layer = tf.keras.layers.Dense(29, activation=tf.keras.activations.sigmoid)

All images are properly resized to (224, 224, 3) and their RGB values scaled to [0, 1]. My network summary looks like this:

Model: "model_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_11 (InputLayer)       [(None, 224, 224, 3)]     0         
                                                                 
 tf.__operators__.getitem_1   (None, 224, 224, 3)      0         
 (SlicingOpLambda)                                               
                                                                 
 tf.nn.bias_add_1 (TFOpLambd  (None, 224, 224, 3)      0         
 a)                                                              
                                                                 
 vgg19 (Functional)          (None, 7, 7, 512)         20024384  
                                                                 
 global_average_pooling2d_3   (None, 512)              0         
 (GlobalAveragePooling2D)                                        
                                                                 
 dense_12 (Dense)            (None, 29)                14877     
                                                                 
=================================================================
Total params: 20,039,261
Trainable params: 14,877
Non-trainable params: 20,024,384
_________________________________________________________________
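
For completeness, the model is assembled roughly as follows (a sketch reconstructed to match the summary above; the preprocess_input call is what shows up as the SlicingOpLambda/TFOpLambda layers):

import tensorflow as tf

base = tf.keras.applications.VGG19(include_top=False,
                                   weights="imagenet",
                                   input_shape=(224, 224, 3))
base.trainable = False   # freeze the backbone: 20,024,384 non-trainable params

inputs = tf.keras.Input(shape=(224, 224, 3))
# preprocess_input appears in the summary as the SlicingOpLambda /
# TFOpLambda layers; note that it expects RGB values in [0, 255]
x = tf.keras.applications.vgg19.preprocess_input(inputs)
x = base(x, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(29, activation="sigmoid")(x)   # 512*29 + 29 = 14,877
model = tf.keras.Model(inputs, outputs)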

The problem I am facing concerns the actual training of the network. I am using Adam and the binary_crossentropy loss function, which I believe is adequate for multi-label problems. However, after around 5 hours of training, I am fairly disappointed with the accuracy it's achieving:

Epoch 10/10
239/239 [==============================] - 1480s 6s/step - loss: 0.1670 - accuracy: 0.1969 - val_loss: 0.1656 - val_accuracy: 0.1922
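
For reference, my compile and fit calls are essentially the following (the dataset variable names are placeholders, and the learning rate is the Adam default):

model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss="binary_crossentropy",
              metrics=["accuracy"])

history = model.fit(train_ds,                 # placeholder training split
                    validation_data=val_ds,   # placeholder validation split
                    epochs=10)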

I am somewhat familiar with multi-class classification, but this is my first attempt at solving a multi-label problem. Am I doing something wrong before training, is VGG19 not well suited to this task, or did I get my hyperparameters wrong?

Asked By: Johnny


Answers:

Multi-label problems are evaluated differently. Check out this answer. A low accuracy score may not mean much here: a prediction for one sample only counts as correct if the entire 29-element vector is correct, which is hard to achieve. For your example, that target vector is:

[0,1,0,0,1,0,0,0,0...,1,0,0,0,0,0,0]

I recommend evaluating your model with binary accuracy, F1-score, Hamming loss, or coverage, depending on which aspect of the prediction matters most in your context.
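
Here is a quick sketch of those metrics with scikit-learn (y_true and y_prob are random stand-ins for your ground-truth matrix and the model's sigmoid outputs):

import numpy as np
from sklearn.metrics import f1_score, hamming_loss, coverage_error

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=(100, 29))   # stand-in ground truth
y_prob = rng.random((100, 29))                # stand-in for model.predict(val_ds)
y_pred = (y_prob >= 0.5).astype(int)          # threshold the sigmoid outputs

binary_acc = (y_pred == y_true).mean()                 # fraction of correct label decisions
exact_match = (y_pred == y_true).all(axis=1).mean()    # all 29 labels must match
f1 = f1_score(y_true, y_pred, average="micro")
hamming = hamming_loss(y_true, y_pred)                 # = 1 - binary accuracy
coverage = coverage_error(y_true, y_prob)              # ranking-based, uses raw scores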

Answered By: Viktor