"RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead"?
Question:
I am trying to use a pre-trained model. Here’s where the problem occurs.
Isn’t the model supposed to take in a simple color image? Why is it expecting a 4-dimensional input?
RuntimeError Traceback (most recent call last)
<ipython-input-51-d7abe3ef1355> in <module>()
33
34 # Forward pass the data through the model
---> 35 output = model(data)
36 init_pred = output.max(1, keepdim=True)[1] # get the index of the max log-probability
37
5 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/conv.py in forward(self, input)
336 _pair(0), self.dilation, self.groups)
337 return F.conv2d(input, self.weight, self.bias, self.stride,
--> 338 self.padding, self.dilation, self.groups)
339
340
RuntimeError: Expected 4-dimensional input for 4-dimensional weight 32 3 3, but got 3-dimensional input of size [3, 224, 224] instead
The model is defined as follows:
inception = models.inception_v3()
model = inception.to(device)
Answers:
As Usman Ali wrote in his comment, PyTorch (and most other DL toolboxes) expects a batch of images as input. Thus you need to call
output = model(data[None, ...])
which inserts a singleton “batch” dimension at the front of your input data.
Please also note that the model you are using expects a different input size (3x299x299), not 3x224x224.
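A minimal sketch of the fix above. It assumes a random tensor in place of a real image and a single Conv2d layer as a stand-in for the full Inception v3 network; both are illustrative assumptions, not part of the question's code. It shows that data[None, ...] and data.unsqueeze(0) produce the same 4-dimensional input:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for one preprocessed RGB image: (channels, height, width).
data = torch.rand(3, 299, 299)

# Indexing with None inserts a new leading axis of size 1: the "batch" dimension.
batched = data[None, ...]

# unsqueeze(0) is an equivalent way to write the same operation.
same_batched = data.unsqueeze(0)

# Illustrative layer only (not the real model): 3 input channels, 32 filters, 3x3 kernel.
conv = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3)

with torch.no_grad():
    out = conv(batched)

print(batched.shape)  # torch.Size([1, 3, 299, 299])
print(out.shape)      # torch.Size([1, 32, 297, 297])
```

The output keeps the leading batch axis of size 1, so a prediction for the single image can be read from index 0 of the result.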
From the PyTorch documentation on convolutional layers, Conv2d layers expect input with the shape
(n_samples, channels, height, width) # e.g., (1000, 1, 224, 224)
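Passing grayscale images in their usual format, (224, 224) per image or (1000, 224, 224) for a batch, won’t work because the channel dimension is missing.
To get the right shape, you will need to add a channel dimension at index 1 of the batch. You can do it as follows: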
x = np.expand_dims(x, 1) # if numpy array
tensor = tensor.unsqueeze(1) # if torch tensor
The unsqueeze() method adds a dimension of size 1 at the specified index. The result would have shape:
(1000, 1, 224, 224)
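As a rough check of the shapes described above, the sketch below uses a hypothetical batch of 8 blank grayscale images (a smaller batch than the 1000 in the example, to keep memory use low) and adds the channel axis both ways:

```python
import numpy as np
import torch

# Hypothetical batch of 8 grayscale images, each 224x224, with no channel axis.
x = np.zeros((8, 224, 224), dtype=np.float32)

# NumPy: insert a channel axis of size 1 at index 1.
x_with_channel = np.expand_dims(x, 1)

# PyTorch: unsqueeze(1) does the same for a tensor.
tensor = torch.from_numpy(x)
tensor_with_channel = tensor.unsqueeze(1)

print(x_with_channel.shape)       # (8, 1, 224, 224)
print(tensor_with_channel.shape)  # torch.Size([8, 1, 224, 224])
```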
As the model expects a batch of images, we need to pass a 4-dimensional tensor, which can be done as follows:
Method-1: output = model(data[0:1])
Method-2: output = model(data[0].unsqueeze(0))
Either method sends only the first image of the batch.
Similarly, for the i-th image we can do:
Method-1: output = model(data[i:i+1])
Method-2: output = model(data[i].unsqueeze(0))
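The two methods differ only in how they keep the batch axis. A small sketch, assuming a hypothetical random batch of 4 RGB images, shows that plain indexing drops the axis while slicing keeps it, and that both methods yield the same tensor:

```python
import torch

# Hypothetical batch of 4 RGB images: (batch, channels, height, width).
data = torch.rand(4, 3, 224, 224)
i = 2

# Method-1: slicing with i:i+1 keeps the batch axis (size 1).
by_slice = data[i:i + 1]

# Method-2: integer indexing drops the batch axis, unsqueeze(0) restores it.
by_unsqueeze = data[i].unsqueeze(0)

print(data[i].shape)                        # torch.Size([3, 224, 224])
print(by_slice.shape)                       # torch.Size([1, 3, 224, 224])
print(torch.equal(by_slice, by_unsqueeze))  # True
```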