Meaning of output shapes of ResNet9 model layers

Question:

I have a ResNet9 model, implemented in PyTorch, which I am using for multi-class image classification with 6 classes. Using the following code with the torchsummary library, I can print a summary of the model, shown in the attached image:

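from torchsummary import summary

INPUT_SHAPE = (3, 256, 256)  # input shape of my image: (channels, height, width)

# summary() prints the table itself, so no print() wrapper is needed
summary(model.cuda(), INPUT_SHAPE)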

However, I am quite confused about the -1 values in all layers of the ResNet9 model. Also, for the Conv2d-1 layer, I am confused about the 64 in the output shape [-1, 64, 256, 256], as I believe the n_channels value of the input image is 3. Can anyone please help me understand the output shape values? Thanks!

Asked By: Slwd-wave540

Answers:

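Image libraries usually store the color channels as the last dimension, which would give (256, 256, 3), but PyTorch convolution layers expect the channels first, so (3, 256, 256) is the layout this model needs. The model itself also expects a leading dimension for the batch size, although the input size passed to torchsummary is given without it.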

Answered By: Tomáš Řetický

Your INPUT_SHAPE is (3, 256, 256) in channel-first format and would be (256, 256, 3) in channel-last format.
PyTorch models accept input in channel-first format, so (3, 256, 256) is the correct shape for you.
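
To make the two layouts concrete, here is a small sketch that only manipulates shape tuples, with no tensors involved. The helper names `to_channel_first` and `add_batch_dim` are made up for this illustration; on a real tensor the corresponding steps would be `tensor.permute(2, 0, 1)` and `tensor.unsqueeze(0)`.

```python
def to_channel_first(shape_hwc):
    """Reorder a (height, width, channels) shape into (channels, height, width)."""
    height, width, channels = shape_hwc
    return (channels, height, width)


def add_batch_dim(shape, batch_size):
    """Prepend the batch dimension that the model's forward pass expects."""
    return (batch_size, *shape)


channel_last = (256, 256, 3)                    # e.g. an image loaded as a NumPy array
channel_first = to_channel_first(channel_last)  # layout PyTorch conv layers expect
batched = add_batch_dim(channel_first, 8)       # batch size 8 is an arbitrary choice

print(channel_first)  # (3, 256, 256)
print(batched)        # (8, 3, 256, 256)
```

The first tuple is what goes into INPUT_SHAPE; the second is what the model actually receives for a batch of 8 images.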

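As for the output shape [-1, 64, 256, 256]: this is the shape of the first convolution layer's output, not of your input. That layer has 64 filters, so it turns the 3 input channels into 64 output channels, each a 256x256 feature map.
The -1 is a placeholder for the batch size, which is not fixed by the model; the actual value is whatever batch_size you set in your DataLoader.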
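
The shape in the summary can be reproduced by hand. The sketch below applies the output-size formula from the torch.nn.Conv2d documentation to plain tuples. The function name `conv2d_output_shape` is hypothetical, and the 3x3 kernel with padding 1 is an assumption: the question does not show the ResNet9 code, only that the first layer keeps the 256x256 size.

```python
def conv2d_output_shape(in_shape, out_channels, kernel_size,
                        stride=1, padding=0, dilation=1):
    """Output shape of a 2D convolution for a (batch, channels, height, width) input.

    Assumes a square kernel and the same stride, padding and dilation
    on both spatial axes.
    """
    batch, _in_channels, height, width = in_shape

    def out_size(size):
        # Formula from the torch.nn.Conv2d documentation (floor division).
        return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

    # The channel count of the output is the number of filters,
    # independent of the number of input channels.
    return (batch, out_channels, out_size(height), out_size(width))


# -1 is used here, as in the summary, as a placeholder for "any batch size".
input_shape = (-1, 3, 256, 256)

# Assumed hyperparameters: 64 filters, 3x3 kernel, padding 1.
first_conv = conv2d_output_shape(input_shape, out_channels=64,
                                 kernel_size=3, padding=1)
print(first_conv)  # (-1, 64, 256, 256)
```

The batch entry passes through unchanged, the channel entry becomes the number of filters, and the spatial size depends only on kernel size, stride, padding and dilation.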

Answered By: Prajot Kuvalekar