How can I improve my dataset to increase mAP in the YOLOv4 object detection framework?


I want to use the YOLOv4 object detector to detect LED matrices like the one in the attached picture. The goal of my project is to perform automated RoI detection for these types of LED matrices, mainly in vehicular scenarios.

Unfortunately, this type of object is not very common, and I could not find a way to produce a good dataset for training. I've tried training YOLOv4 with different cfg parameters, but two things keep happening:

  1. Overfitting
  2. The algorithm does not converge and no detection is performed.

Do you have any tips on how I can improve my dataset? I'm also attaching the code I used to train the detector, executed on Google Colab.

Note: I am using tiny-yolo-v4 for training due to its s

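from google.colab import drive

# Mount Google Drive; the symlink below assumes this mount point
drive.mount('/content/gdrive')

# Quote the source path because "My Drive" contains a space
!ln -s "/content/gdrive/My Drive/" /mydrive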

%cd /mydrive/yolov4

!git clone

%cd darknet/
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile
!sed -i 's/LIBSO=0/LIBSO=1/' Makefile

# Build darknet with the options enabled above
!make


# run file, used to create train.txt and test.txt from annotated images
!ls data/

# Here we use transfer learning. Instead of training a model from scratch, we start from pre-trained weights that cover the first convolutional layers (yolov4.conv.137 for YOLOv4, yolov4-tiny.conv.29 for YOLOv4-tiny). The weights file has to be downloaded before training.

!chmod +x ./darknet

#!./darknet detector train data/ cfg/yolov4-custom.cfg yolov4.conv.137 -dont_show -map
!./darknet detector train data/ cfg/yolov4-custom.cfg yolov4-tiny.conv.29 -dont_show -map


Note that the image you meant to attach is not displayed.

Several factors can help you improve your dataset and, with it, your mAP when training YOLOv4.

Improving your Dataset:

  • It is important to include multiple images of the same state in your dataset.

  • If you have a lot of classes, the more images you have for each class, the better and more accurate the results will be.

  • If your average loss is low but your mean Average Precision is low as well, check each individual test image and compare it to the images in your training set. Do you have enough training images similar to your test images?

  • Make sure about 10% of your dataset consists of background images with no labels at all. This way, the model learns that an image does not always have to contain a detection. (This is not necessary for stationary cameras/images, where at least 1 detection is expected.)
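To make the last bullet concrete, here is a minimal sketch. It relies on the Darknet convention that an image whose annotation .txt file is empty counts as a background (negative) sample. The function names and the folder layout are my own assumptions, not part of Darknet, so adapt them to your setup.

```python
from pathlib import Path

IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def add_empty_labels(background_dir):
    """Create an empty .txt label next to every image in background_dir.

    Only point this at a folder that contains background images exclusively.
    Existing label files are left untouched.
    """
    created = []
    # sorted() materialises the listing before any new files are created
    for image_path in sorted(Path(background_dir).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue
        label_path = image_path.with_suffix(".txt")
        if not label_path.exists():
            label_path.touch()
            created.append(label_path)
    return created


def background_fraction(image_dirs):
    """Return the share of images whose label file exists and is empty."""
    total = 0
    empty = 0
    for image_dir in image_dirs:
        for image_path in Path(image_dir).iterdir():
            if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
                continue
            total += 1
            label_path = image_path.with_suffix(".txt")
            if label_path.exists() and label_path.read_text().strip() == "":
                empty += 1
    return empty / total if total else 0.0
```

Calling `background_fraction(["data/obj", "data/background"])` (hypothetical folder names) should return a value near 0.1 if you follow the 10% guideline.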

Network Size:

I am not an expert on this topic, so please do refer to Stephane's FAQ.

If you train your neural network at a size of, say, 416×416, the network will resize every image to that size. If your images are larger than that, the objects in them become a lot smaller too. YOLOv4 should be fine with objects of 16×16 pixels and above. So if any of the objects in your images were really small to begin with, that will most likely affect your mAP% negatively.
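To see what the resize does to a given object, here is a rough arithmetic sketch. It assumes the whole image is simply stretched to the network size (no letterboxing), and the function names and the example numbers are made up for illustration.

```python
def object_size_at_network_input(image_size, object_size, network_size=(416, 416)):
    """Scale an object's pixel size from the original image to the network input.

    Assumes the image is stretched to network_size without letterboxing.
    All sizes are (width, height) in pixels.
    """
    image_w, image_h = image_size
    object_w, object_h = object_size
    net_w, net_h = network_size
    return object_w * net_w / image_w, object_h * net_h / image_h


def is_large_enough(scaled_size, min_side=16):
    """Apply the 16x16 pixel rule of thumb mentioned above."""
    scaled_w, scaled_h = scaled_size
    return scaled_w >= min_side and scaled_h >= min_side


# Hypothetical case: a 40x40 px LED matrix in a 1920x1080 frame
scaled = object_size_at_network_input((1920, 1080), (40, 40))
print(scaled, is_large_enough(scaled))
```

In this made-up case the matrix shrinks to roughly 8.7×15.4 pixels, which is below the 16×16 guideline, so a larger network size or cropped/zoomed training images would be worth considering.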

What is recommended: about 1000 images per class, with many of the images showing the same objects repeatedly from several angles or in different cases.
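To check how far you are from that number, something like the following could count labelled images per class. It is only a sketch: it assumes YOLO-format label files (one `class x_center y_center width height` line per box) sitting in a single folder, and both function names are hypothetical.

```python
from collections import Counter
from pathlib import Path


def images_per_class(label_dir):
    """Count, per class id, the label files that contain at least one box of that class."""
    counts = Counter()
    for label_path in Path(label_dir).glob("*.txt"):
        class_ids = set()
        for line in label_path.read_text().splitlines():
            fields = line.split()
            # Lines that do not have exactly five fields are ignored
            if len(fields) == 5:
                class_ids.add(int(fields[0]))
        counts.update(class_ids)
    return counts


def classes_below_target(counts, target=1000):
    """Return {class_id: images still missing} for classes under the target."""
    return {cid: target - n for cid, n in counts.items() if n < target}
```

Files with no boxes (background images) simply contribute nothing to the counts.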

Answered By: TheAlphaDubstep

@theAlphaDubstep pointed out important factors for increasing your model's performance. I would also recommend using augmentation to increase the number of instances in your dataset. There are several Python libraries and online tools available for this, and you can also use TensorFlow to perform data augmentation.
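As a small illustration of what augmentation involves for detection data, here is a plain-Python sketch of a horizontal flip. The point is that the bounding boxes have to be transformed together with the pixels. It assumes YOLO-style normalised boxes; the function name and the list-of-rows image representation are only for illustration, and a real pipeline would use an image library instead.

```python
def hflip_sample(pixels, boxes):
    """Mirror an image left-to-right and adjust its YOLO boxes to match.

    pixels: list of rows, each row a list of pixel values.
    boxes:  list of (class_id, x_center, y_center, width, height),
            with coordinates normalised to the range 0..1.
    """
    flipped_pixels = [list(reversed(row)) for row in pixels]
    # Only the x centre changes; y centre, width and height stay the same
    flipped_boxes = [
        (class_id, 1.0 - x_center, y_center, width, height)
        for class_id, x_center, y_center, width, height in boxes
    ]
    return flipped_pixels, flipped_boxes


# Tiny made-up example: a 2x3 "image" with one box left of centre
image = [[1, 2, 3], [4, 5, 6]]
labels = [(0, 0.25, 0.5, 0.1, 0.2)]
flipped_image, flipped_labels = hflip_sample(image, labels)
```

After the flip the box centre moves from x = 0.25 to x = 0.75, while its size and vertical position are unchanged.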

Answered By: jansary