How can I evaluate multiple checkpoints with the TF2 Object Detection API?

Question:

I’ve successfully trained a model for around 16k steps, which produced quite a few checkpoints that are saved in my training folder. I want to make sure that I am not running into overfitting issues, so I would like to evaluate every single checkpoint with my testing data.

I am using the following command from the official TensorFlow 2 Object Detection repository:

PIPELINE_CONFIG_PATH={path to pipeline config file}
MODEL_DIR={path to model directory}
CHECKPOINT_DIR=${MODEL_DIR}
python object_detection/model_main_tf2.py \
    --pipeline_config_path=${PIPELINE_CONFIG_PATH} \
    --model_dir=${MODEL_DIR} \
    --checkpoint_dir=${CHECKPOINT_DIR} \
    --alsologtostderr

MODEL_DIR and CHECKPOINT_DIR are both pointing to my training folder.

The issue I am experiencing now is that this only evaluates the latest checkpoint, but I’d like to evaluate all of them at once.

Ideally, I would like TensorBoard to show the val_accuracy (mAP) of the different checkpoints as a graph. It already does this, but only for the one checkpoint.

Asked By: Top Snek

||

Answers:

As of February 2022:

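The validation process is supposed to run at the same time as the training process, so that whenever a new checkpoint is saved, the validation process immediately loads it and starts validating. If it is started only after training has finished, it finds just the latest checkpoint, which is why only that one gets evaluated.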

Please see my other answer in this regard.
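
As a rough illustration of running the two processes side by side, here is a minimal sketch. It reuses the flags from the command in the question. The function names start_training and start_evaluation are hypothetical helpers, not part of the Object Detection API, and hiding the GPUs from the evaluation process with CUDA_VISIBLE_DEVICES=-1 is an assumption about the setup (a single GPU that training already occupies), so drop it if you have a spare device.

```shell
#!/usr/bin/env bash
# Sketch only: training and evaluation as two separate processes.
# PIPELINE_CONFIG_PATH and MODEL_DIR must be set by the caller.
# start_training / start_evaluation are hypothetical helper names.

start_training() {
    # Without --checkpoint_dir the script trains and writes checkpoints to MODEL_DIR.
    python object_detection/model_main_tf2.py \
        --pipeline_config_path="${PIPELINE_CONFIG_PATH:?set PIPELINE_CONFIG_PATH}" \
        --model_dir="${MODEL_DIR:?set MODEL_DIR}" \
        --alsologtostderr
}

start_evaluation() {
    # With --checkpoint_dir the script runs in evaluation mode and watches
    # that directory for checkpoints written by the training process.
    # CUDA_VISIBLE_DEVICES=-1 keeps this process on the CPU (assumption:
    # the GPU is already fully used by training).
    CUDA_VISIBLE_DEVICES=-1 python object_detection/model_main_tf2.py \
        --pipeline_config_path="${PIPELINE_CONFIG_PATH:?set PIPELINE_CONFIG_PATH}" \
        --model_dir="${MODEL_DIR:?set MODEL_DIR}" \
        --checkpoint_dir="${MODEL_DIR}" \
        --alsologtostderr
}
```

After setting the two variables and sourcing the file, run `start_training & start_evaluation & wait` in one shell, or call each function in its own terminal. TensorBoard pointed at MODEL_DIR should then get one evaluation point per checkpoint that the evaluation process sees while training is running.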

Answered By: JoyfulPanda