Convolutional Autoencoder for classification problem

Question:

I am following Datacamp’s tutorial on using convolutional autoencoders for classification here. I understand from the tutorial that we only need the autoencoder’s head (i.e. the encoder part), with a fully-connected layer stacked on top, to do the classification.

After stacking, the resulting network (convolutional autoencoder plus classifier head) is trained twice. The first time, the encoder layers’ trainable attribute is set to False:

for layer in full_model.layers[0:19]:
    layer.trainable = False

It is then set back to True and the network is re-trained:

for layer in full_model.layers[0:19]:
    layer.trainable = True

I cannot understand why we are doing this twice. Does anyone with experience working with conv-nets or autoencoders know why?

Asked By: user12587364


Answers:

It’s because the first 19 layers are already trained in this line:

autoencoder_train = autoencoder.fit(train_X, train_ground, batch_size=batch_size, epochs=epochs, verbose=1, validation_data=(valid_X, valid_ground))

The point of an autoencoder is dimensionality reduction. Say you have 1000 features and you want to reduce them to 100. You train an autoencoder consisting of an encoder followed by a decoder; the goal is that the encoded features (the encoder’s output) can be decoded back into the original features.
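
As a rough sketch (not the tutorial’s exact architecture; the 28x28x1 input shape and layer sizes here are assumptions), a convolutional autoencoder in Keras looks like this:

from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D
from tensorflow.keras.models import Model

# Encoder: compresses the input into a smaller representation
input_img = Input(shape=(28, 28, 1))
x = Conv2D(32, (3, 3), activation='relu', padding='same')(input_img)
x = MaxPooling2D((2, 2))(x)
encoded = Conv2D(64, (3, 3), activation='relu', padding='same')(x)

# Decoder: tries to reconstruct the original image from the encoded features
x = Conv2D(64, (3, 3), activation='relu', padding='same')(encoded)
x = UpSampling2D((2, 2))(x)
decoded = Conv2D(1, (3, 3), activation='sigmoid', padding='same')(x)

autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='mean_squared_error')
# autoencoder.fit(train_X, train_ground, ...)  # train_ground is just the input images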

After the autoencoder is trained, the decoder part is thrown away, and a fully connected classification layer is added on top of the encoder instead, so that a classification network is trained on the reduced set of features. This is why the encoder layers’ trainable attribute is set to False: only the fully connected classification layer is trained, which speeds up training.
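
A minimal sketch of that step, continuing from the encoder above (num_classes and the layer sizes are assumptions; the tutorial’s own model is deeper, which is why it freezes layers[0:19]):

from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.models import Model

# Stack a fully connected classification head on top of the trained encoder;
# the decoder layers are simply not reused.
num_classes = 10
x = Flatten()(encoded)
x = Dense(128, activation='relu')(x)
out = Dense(num_classes, activation='softmax')(x)
full_model = Model(input_img, out)

# Freeze the encoder layers (here the first 4 layers, including the input layer)
# so that only the new classification head is trained at first
for layer in full_model.layers[:4]:
    layer.trainable = False

full_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# full_model.fit(train_X, train_label_one_hot, ...)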

The encoder layers’ trainable attribute is then set back to True in order to fine-tune the entire network. This phase is slower because the gradients are back-propagated through the whole network rather than just the classification head.
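
Sketching the second phase (note that in Keras the model has to be re-compiled for the trainable change to take effect):

# Un-freeze the encoder and fine-tune the whole network end to end
for layer in full_model.layers[:4]:
    layer.trainable = True

full_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
# full_model.fit(train_X, train_label_one_hot, ...)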

Answered By: Armin Primadi

They freeze the autoencoder’s layers first because the newly stacked classification layers need to be initialized, i.e. trained enough to "catch up" with the pre-trained encoder weights.

If you skip this step, the randomly initialized layers would "untrain" your autoencoder’s layers during the backward pass, because the gradients flowing between the two parts of the network can be large. Eventually you would end up at a similar point, but you would have lost the time savings of using a pre-trained network.
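
As a quick sanity check (not from the original answer), you can print which layers are frozen in each phase to confirm that the pre-trained encoder really is protected during the first phase:

# Lists every layer with its trainable flag; the encoder layers should show
# False during the first training phase and True during fine-tuning
for i, layer in enumerate(full_model.layers):
    print(i, layer.name, layer.trainable)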

Answered By: Eduardo H. Ramirez