Global Max Pooling in PyTorch: RuntimeError: mat1 and mat2 shapes cannot be multiplied (128×2048 and 128×1024)

Question:

In the model I’m building, I’m trying to improve performance by replacing the Flatten layer with global max pooling.

To check that the shapes are in order, I ran a single random sample through the net:

import torch
import torch.nn as nn
import torch.nn.functional as F

test = torch.rand((1, 3, 224, 224))     # [N, C, H, W]

foo = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(2)
        )

foo2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.Conv2d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(64),
            nn.MaxPool2d(2)
        )

foo3 = nn.Sequential(
            nn.Conv2d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(128),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(128),
            nn.MaxPool2d(2)
        )

l1 = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(128,  1024),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(1024, 10)
        )

r1 = foo(test)
print(r1.shape)    # torch.Size([1, 32, 112, 112])
r2 = foo2(r1)
print(r2.shape)    # torch.Size([1, 64, 56, 56])
r3 = foo3(r2)
print(r3.shape)    # torch.Size([1, 128, 28, 28])

# apply global max pooling and reshape the result to [N, C]
flat = F.adaptive_max_pool2d(r3, (1, 1))
ff = flat.reshape(flat.size(0), -1)

print(ff.shape)    # torch.Size([1, 128])
res = l1(ff)
print(res.shape)   # torch.Size([1, 10])

Here everything works as expected.

My model class has these same layers, with the forward method defined like so:

    def forward(self, batch: torch.Tensor) -> torch.Tensor:
        r1 = self.conv1(batch)
        r2 = self.conv2(r1)
        r3 = self.conv3(r2)
        
        tmp = F.adaptive_max_pool2d(r3, (1, 1))
        flat = r3.view(tmp.size(0), -1)
        
        out = self.linear(flat)
        
        return out

Unfortunately, when I try to run the actual images (from the Fashion MNIST dataset) through the net, I get this error: mat1 and mat2 shapes cannot be multiplied (128×2048 and 128×1024)

My batch size is 128, but I don’t understand where the 2048 might be coming from: none of my layers should output anything of that shape.

The full error message is as follows:

RuntimeError                              Traceback (most recent call last)
/root/fashion_mnist.ipynb Cell 7 in <cell line: 1>()
----> 1 runner.train_model(epochs=80, batch_size=128, criterion=loss_fn, optimizer=optim)

/root/fashion_mnist.ipynb Cell 7 in RunModel.train_model(self, epochs, batch_size, criterion, optimizer, device)
    113 t_ep = datetime.now()
    115 # run train routine
--> 116 train_loss, train_acc = self._run_train(train_loader, criterion, optimizer)   
    117 self.train_losses[ep] = train_loss
    118 self.train_accuracies[ep] = train_acc

/root/fashion_mnist.ipynb Cell 7 in RunModel._run_train(self, train_data, criterion, optimizer)
    141 inputs, targets = inputs.cuda(), targets.cuda()
    142 optimizer.zero_grad()
--> 144 outputs: torch.Tensor = self.model(inputs)
    145 loss: torch.Tensor = criterion(outputs, targets)          
    147 loss.backward()

File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/module.py:1186, in Module._call_impl(self, *input, **kwargs)
   1182 # If we don't have any hooks, we want to skip the rest of the logic in
   1183 # this function, and just call forward.
   1184 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1185         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1186     return forward_call(*input, **kwargs)
...
File /opt/conda/lib/python3.8/site-packages/torch/nn/modules/linear.py:114, in Linear.forward(self, input)
    113 def forward(self, input: Tensor) -> Tensor:
--> 114     return F.linear(input, self.weight, self.bias)

RuntimeError: mat1 and mat2 shapes cannot be multiplied (128x2048 and 128x1024)

Any ideas what’s happening here?

The notebook is available here:
https://colab.research.google.com/drive/1QGpSpUCbuDz-dktmLCv_YpG6LZjYZ1TM?usp=sharing

Asked By: pavel


Answers:

The error comes from a typo in your forward method: you call view() on r3, the unpooled feature map, instead of on tmp, the globally pooled one, so the pooling result is discarded. With the 32×32 inputs from your notebook, r3 has shape [128, 128, 4, 4], and flattening it gives [128, 2048]; that is where the 2048 comes from (128 channels × 4 × 4 = 2048).
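
A minimal shape check that reproduces the mismatch (a sketch, using the [128, 128, 4, 4] shape from above):

import torch
import torch.nn.functional as F

r3 = torch.rand(128, 128, 4, 4)           # conv3 output for a batch of 128
tmp = F.adaptive_max_pool2d(r3, (1, 1))   # [128, 128, 1, 1]

wrong = r3.view(tmp.size(0), -1)          # [128, 2048] -- flattens the unpooled map
right = tmp.view(tmp.size(0), -1)         # [128, 128]  -- flattens the pooled map
print(wrong.shape, right.shape)           # torch.Size([128, 2048]) torch.Size([128, 128])

The cleaner fix is to use nn.Flatten() in the layers instead of calling view() in forward. Your linear block then becomes: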

    self.linear = nn.Sequential(
        nn.Flatten(),
        nn.Dropout(0.5),
        nn.Linear(128,  1024),
        nn.ReLU(),
        nn.Dropout(0.2),
        nn.Linear(1024, 10)
    )

And your forward function should look like this:

def forward(self, batch: torch.Tensor) -> torch.Tensor:
    r1 = self.conv1(batch)
    r2 = self.conv2(r1)
    r3 = self.conv3(r2)

    # global max pooling: [N, 128, 4, 4] -> [N, 128, 1, 1];
    # the nn.Flatten() inside self.linear reshapes it to [N, 128]
    tmp = F.adaptive_max_pool2d(r3, (1, 1))

    out = self.linear(tmp)

    return out

I have tested it on Colab and it works fine.

Here is a summary output:

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 32, 32]             896
              ReLU-2           [-1, 32, 32, 32]               0
       BatchNorm2d-3           [-1, 32, 32, 32]              64
            Conv2d-4           [-1, 32, 32, 32]           9,248
              ReLU-5           [-1, 32, 32, 32]               0
       BatchNorm2d-6           [-1, 32, 32, 32]              64
         MaxPool2d-7           [-1, 32, 16, 16]               0
            Conv2d-8           [-1, 64, 16, 16]          18,496
              ReLU-9           [-1, 64, 16, 16]               0
      BatchNorm2d-10           [-1, 64, 16, 16]             128
           Conv2d-11           [-1, 64, 16, 16]          36,928
             ReLU-12           [-1, 64, 16, 16]               0
      BatchNorm2d-13           [-1, 64, 16, 16]             128
        MaxPool2d-14             [-1, 64, 8, 8]               0
           Conv2d-15            [-1, 128, 8, 8]          73,856
             ReLU-16            [-1, 128, 8, 8]               0
      BatchNorm2d-17            [-1, 128, 8, 8]             256
           Conv2d-18            [-1, 128, 8, 8]         147,584
             ReLU-19            [-1, 128, 8, 8]               0
      BatchNorm2d-20            [-1, 128, 8, 8]             256
        MaxPool2d-21            [-1, 128, 4, 4]               0
          Flatten-22                  [-1, 128]               0
          Dropout-23                  [-1, 128]               0
           Linear-24                 [-1, 1024]         132,096
             ReLU-25                 [-1, 1024]               0
          Dropout-26                 [-1, 1024]               0
           Linear-27                   [-1, 10]          10,250
================================================================
Total params: 430,250
Trainable params: 430,250
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 2.76
Params size (MB): 1.64
Estimated Total Size (MB): 4.41
----------------------------------------------------------------

Trainer Output:

Epoch 1/80 completed in 0:00:32.994402. Train_loss:  1.0680, train accuracy:  0.6225 Test loss:  1.0435, test accuracy:  0.6271
Epoch 2/80 completed in 0:00:32.939861. Train_loss:  0.9726, train accuracy:  0.6578 Test loss:  0.9616, test accuracy:  0.6662
Epoch 3/80 completed in 0:00:32.811203. Train_loss:  0.9015, train accuracy:  0.6851 Test loss:  0.9015, test accuracy:  0.6883
Epoch 4/80 completed in 0:00:32.836747. Train_loss:  0.8361, train accuracy:  0.7119 Test loss:  0.8336, test accuracy:  0.7173
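
As a side note, the global pooling can also live inside the model itself via nn.AdaptiveMaxPool2d, so forward needs no functional call at all (a sketch of the same classifier head):

    self.linear = nn.Sequential(
        nn.AdaptiveMaxPool2d((1, 1)),   # global max pooling: [N, 128, H, W] -> [N, 128, 1, 1]
        nn.Flatten(),                   # [N, 128, 1, 1] -> [N, 128]
        nn.Dropout(0.5),
        nn.Linear(128, 1024),
        nn.ReLU(),
        nn.Dropout(0.2),
        nn.Linear(1024, 10)
    )
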
Answered By: rafathasan