ValueError: Inputs have incompatible shapes

Question:

I have the following code:

def fcn8_decoder(convs, n_classes):
  # features from the encoder stage
  f3, f4, f5 = convs

  # number of filters
  n = 512

  # add convolutional layers on top of the CNN extractor.
  o = tf.keras.layers.Conv2D(n , (7 , 7) , activation='relu' , padding='same', name="conv6", data_format=IMAGE_ORDERING)(f5)
  o = tf.keras.layers.Dropout(0.5)(o)

  o = tf.keras.layers.Conv2D(n , (1 , 1) , activation='relu' , padding='same', name="conv7", data_format=IMAGE_ORDERING)(o)
  o = tf.keras.layers.Dropout(0.5)(o)

  o = tf.keras.layers.Conv2D(n_classes,  (1, 1), activation='relu' , padding='same', data_format=IMAGE_ORDERING)(o)

    
  ### START CODE HERE ###

  # Upsample `o` above and crop any extra pixels introduced
  o = tf.keras.layers.Conv2DTranspose(n_classes , kernel_size=(4,4) ,  strides=(2,2) , use_bias=False)(o)
  o = tf.keras.layers.Cropping2D(cropping=(1,1))(o)

  # load the pool 4 prediction and do a 1x1 convolution to reshape it to the same shape of `o` above
  o2 = f4
  o2 = ( tf.keras.layers.Conv2D(n_classes , ( 1 , 1 ) , activation='relu' , padding='same', data_format=IMAGE_ORDERING))(o2)

  # add the results of the upsampling and pool 4 prediction
  o = tf.keras.layers.Add()([o, o2])

  # upsample the resulting tensor of the operation you just did
  o = (tf.keras.layers.Conv2DTranspose( n_classes , kernel_size=(4,4) ,  strides=(2,2) , use_bias=False))(o)
  o = tf.keras.layers.Cropping2D(cropping=(1, 1))(o)

  # load the pool 3 prediction and do a 1x1 convolution to reshape it to the same shape of `o` above
  o2 = f3
  o2 = tf.keras.layers.Conv2D(n_classes , ( 1 , 1 ) , activation='relu' , padding='same', data_format=IMAGE_ORDERING)(o2)

  # add the results of the upsampling and pool 3 prediction
  o = tf.keras.layers.Add()([o, o2])

  # upsample up to the size of the original image
  o = tf.keras.layers.Conv2DTranspose(n_classes , kernel_size=(8,8) ,  strides=(8,8) , use_bias=False )(o)
  o = tf.keras.layers.Cropping2D(((0, 0), (0, 96-84)))(o)

  # append a sigmoid activation
  o = (tf.keras.layers.Activation('sigmoid'))(o)
  ### END CODE HERE ###

  return o

# TEST CODE

test_convs, test_img_input = FCN8()
test_fcn8_decoder = fcn8_decoder(test_convs, 11)

print(test_fcn8_decoder.shape)

del test_convs, test_img_input, test_fcn8_decoder

You can view the complete code here.

I am getting the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-14-cff468b82c6a> in <module>
      2 
      3 test_convs, test_img_input = FCN8()
----> 4 test_fcn8_decoder = fcn8_decoder(test_convs, 11)
      5 
      6 print(test_fcn8_decoder.shape)

2 frames
/usr/local/lib/python3.8/dist-packages/keras/layers/merging/base_merge.py in _compute_elemwise_op_output_shape(self, shape1, shape2)
     71       else:
     72         if i != j:
---> 73           raise ValueError(
     74               'Inputs have incompatible shapes. '
     75               f'Received shapes {shape1} and {shape2}')

ValueError: Inputs have incompatible shapes. Received shapes (4, 4, 11) and (4, 5, 11)

What am I doing wrong here?

Asked By: Paul Reiners


Answers:

Your model implementation looks correct. The problem lies with the input shape: both spatial dimensions must be divisible by 32, as in (224, 224, 3) or (64, 64, 3). Non-square inputs are fine as long as the width and height are each divisible by 32; for example, (64, 96, 3) would work.

In the code that errors, you are using the shape (64, 84, 3), but 84 is not divisible by 32.

Alternatively, you can zero-pad the inputs to a shape that is divisible by 32. Your code includes a line to do that, but as another answer notes, you never use the output of the zero-padding layer. Also, instead of hard-coding 96, I suggest zero-padding to the next multiple of 32 above the input size: an input width of 84 would be padded to 96, and a width of 200 would be padded to 224.
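
For reference, here is a minimal sketch of that padding step. The helper name is mine, and it assumes a channels-last 4-D input with fixed spatial dimensions:

import tensorflow as tf

def pad_to_multiple_of_32(x):
    # x: (batch, height, width, channels); pad bottom/right with zeros
    pad_h = -x.shape[1] % 32  # rows needed to reach the next multiple of 32
    pad_w = -x.shape[2] % 32  # columns needed (84 -> 12, giving width 96)
    return tf.keras.layers.ZeroPadding2D(((0, pad_h), (0, pad_w)))(x)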

Explanation:

The downsampling path of FCN-8 uses 5 convolutional blocks, each of which halves the spatial dimensions, so the input size must be divisible by 2**5 = 32. (Related to https://stackoverflow.com/a/74847194/5666087.)
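
You can trace the shapes to see exactly where the (4, 4) versus (4, 5) mismatch in your traceback comes from. A quick sketch, assuming the 64x84 input from your notebook:

h, w = 64, 84
for block in range(1, 6):
    h, w = h // 2, w // 2  # each max-pool halves (and floors) the size
    print(f"after pool{block}: ({h}, {w})")
# pool3: (8, 10), pool4: (4, 5), pool5: (2, 2)
# upsampling pool5 by 2 and cropping 1 pixel per side gives (4, 4),
# which cannot be added to the (4, 5) pool4 prediction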

Working examples:

test_convs, test_img_input = FCN8(input_height=64, input_width=64)
test_fcn8_decoder = fcn8_decoder(convs=test_convs, n_classes=11)

test_convs, test_img_input = FCN8(input_height=224, input_width=224)
test_fcn8_decoder = fcn8_decoder(convs=test_convs, n_classes=11)

test_convs, test_img_input = FCN8(input_height=128, input_width=128)
test_fcn8_decoder = fcn8_decoder(convs=test_convs, n_classes=11)

Answered By: jkr

What is probably causing the problem is that your inputs are not square (the height and width differ), so in the Add layer you end up summing tensors that, after some convolutions, have different shapes, one of them being more rectangular than the other.

I managed to find kernel sizes that result in tensors with compatible shapes. I also removed the padding.

Here is what I changed:

This:

o2 = f4
o2 = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o2)

turns into this:

o2 = f4
o2 = tf.keras.layers.Conv2D(n_classes, (1, 2), activation='relu', data_format=IMAGE_ORDERING)(o2)

and this:

o2 = f3
o2 = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o2)

turns into this:

o2 = f3
o2 = tf.keras.layers.Conv2D(n_classes, (1, 3), activation='relu', data_format=IMAGE_ORDERING)(o2)
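
With the padding argument removed, Conv2D defaults to padding='valid', so a (1, k) kernel trims k - 1 columns: out_width = in_width - k + 1. A quick standalone check of that arithmetic (the shape here assumes the pool 4 feature map of the 64x84 test input):

import tensorflow as tf

x = tf.zeros((1, 4, 5, 512))               # pool4-sized feature map
y = tf.keras.layers.Conv2D(11, (1, 2))(x)  # padding defaults to 'valid'
print(y.shape)                             # (1, 4, 4, 11): width 5 -> 4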

Here is the full code:

def fcn8_decoder(convs, n_classes):
    # features from the encoder stage
    f3, f4, f5 = convs

    # number of filters
    n = 512

    # add convolutional layers on top of the CNN extractor
    o = tf.keras.layers.Conv2D(n, (7, 7), activation='relu', padding='same', name="conv6", data_format=IMAGE_ORDERING)(f5)
    o = tf.keras.layers.Dropout(0.5)(o)

    o = tf.keras.layers.Conv2D(n, (1, 1), activation='relu', padding='same', name="conv7", data_format=IMAGE_ORDERING)(o)
    o = tf.keras.layers.Dropout(0.5)(o)

    o = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o)

    ### START CODE HERE ###

    # upsample `o` above and crop any extra pixels introduced
    o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(4, 4), strides=(2, 2), use_bias=False)(o)
    o = tf.keras.layers.Cropping2D(cropping=(1, 1))(o)

    # load the pool 4 prediction and do a (1, 2) valid convolution so its width matches `o` above
    o2 = f4
    o2 = tf.keras.layers.Conv2D(n_classes, (1, 2), activation='relu', data_format=IMAGE_ORDERING)(o2)

    # add the results of the upsampling and pool 4 prediction
    o = tf.keras.layers.Add()([o, o2])

    # upsample the resulting tensor of the operation you just did
    o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(4, 4), strides=(2, 2), use_bias=False)(o)
    o = tf.keras.layers.Cropping2D(cropping=(1, 1))(o)

    # load the pool 3 prediction and do a (1, 3) valid convolution so its width matches `o` above
    o2 = f3
    o2 = tf.keras.layers.Conv2D(n_classes, (1, 3), activation='relu', data_format=IMAGE_ORDERING)(o2)

    print(o2.shape, o.shape)  # debug: both branches should now have the same shape

    # add the results of the upsampling and pool 3 prediction
    o = tf.keras.layers.Add()([o, o2])

    # upsample up to the size of the original image
    o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(8, 8), strides=(8, 8), use_bias=False)(o)
    o = tf.keras.layers.Cropping2D(((0, 0), (0, 96 - 84)))(o)

    # append a sigmoid activation
    o = tf.keras.layers.Activation('sigmoid')(o)
    ### END CODE HERE ###

    return o
Answered By: Guinther Kovalski

After reviewing your Colab code, I can verify that the error comes from a typo.

In # Block 1 of your FCN8() function, you used img_input instead of x, which means that instead of passing the zero-padded tensor into the first conv block, you passed the raw input layer. That is what causes the error.

To solve it, replace img_input with x inside # Block 1.

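A sketch of the relevant lines; the exact filter counts and layer names in your FCN8() may differ, this only illustrates the swap:

# inside FCN8():
img_input = tf.keras.layers.Input(shape=(input_height, input_width, 3))
x = tf.keras.layers.ZeroPadding2D(padding=(0, 6))(img_input)  # e.g. pad width 84 -> 96

# Block 1: the first conv must consume the padded tensor `x`,
# not the raw `img_input`
x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu', padding='same')(x)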

Answered By: Prakash Dahal

There is a difference between 'input_height' and 'input_width', and consequently the shapes of f3, f4, and f5 are incompatible. This creates a problem when the tensors are added with tf.keras.layers.Add inside fcn8_decoder(), but it can be solved with appropriate cropping inside that function.
The code below reproduces the problem:

import tensorflow as tf

input1 = tf.keras.layers.Input(shape=(16,))
x1 = tf.keras.layers.Dense(8, activation='relu')(input1)
print(x1)
input2 = tf.keras.layers.Input(shape=(32,))
x2 = tf.keras.layers.Dense(8, activation='relu')(input2)
print(x2)
# deliberately add tensors of different shapes, (None, 8) and (None, 32), to trigger the error
added = tf.keras.layers.Add()([x1, input2])
print(added)
out = tf.keras.layers.Dense(4)(added)
model = tf.keras.models.Model(inputs=[input1, input2], outputs=out)

This code throws the following error:

ValueError: Inputs have incompatible shapes. Received shapes (8,) and (32,)
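
For contrast, continuing from the same snippet, adding tensors whose shapes do match works as expected:

added = tf.keras.layers.Add()([x1, x2])  # both are (None, 8), so the shapes match
out = tf.keras.layers.Dense(4)(added)
model = tf.keras.models.Model(inputs=[input1, input2], outputs=out)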

A possible solution to your problem is given below:

def fcn8_decoder(convs, n_classes):
  # features from the encoder stage
  f3, f4, f5 = convs

  # number of filters
  n = 512

  # add convolutional layers on top of the CNN extractor
  o = tf.keras.layers.Conv2D(n, (7, 7), activation='relu', padding='same', name="conv6", data_format=IMAGE_ORDERING)(f5)
  o = tf.keras.layers.Dropout(0.5)(o)

  o = tf.keras.layers.Conv2D(n, (1, 1), activation='relu', padding='same', name="conv7", data_format=IMAGE_ORDERING)(o)
  o = tf.keras.layers.Dropout(0.5)(o)

  o = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o)

  ### START CODE HERE ###

  # upsample `o` above and crop any extra pixels introduced
  o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(4, 4), padding='same', strides=(2, 2), use_bias=False)(o)
  o = tf.keras.layers.Cropping2D(cropping=(1, 1))(o)

  # load the pool 4 prediction, do a 1x1 convolution, and crop it to match the shape of `o` above
  o2 = f4
  o2 = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o2)
  o2 = tf.keras.layers.Cropping2D(cropping=(1, 2))(o2)
  print(o2)  # debug: confirm the cropped pool 4 branch shape

  # add the results of the upsampling and pool 4 prediction
  o = tf.keras.layers.Add()([o, o2])

  # upsample the resulting tensor of the operation you just did
  o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(4, 4), padding='same', strides=(2, 2), use_bias=False)(o)
  o = tf.keras.layers.Cropping2D(cropping=(1, 1))(o)

  # load the pool 3 prediction, do a 1x1 convolution, and crop it to match the shape of `o` above
  o2 = f3
  o2 = tf.keras.layers.Conv2D(n_classes, (1, 1), activation='relu', padding='same', data_format=IMAGE_ORDERING)(o2)
  o2 = tf.keras.layers.Cropping2D(cropping=(3, 4))(o2)

  # add the results of the upsampling and pool 3 prediction
  o = tf.keras.layers.Add()([o, o2])

  # upsample up to the size of the original image
  o = tf.keras.layers.Conv2DTranspose(n_classes, kernel_size=(8, 8), padding='same', strides=(8, 8), use_bias=False)(o)
  o = tf.keras.layers.Cropping2D(((0, 0), (0, 96 - 84)))(o)

  # append a sigmoid activation
  o = tf.keras.layers.Activation('sigmoid')(o)
  ### END CODE HERE ###

  return o

With this choice of cropping, the output shape of test_fcn8_decoder is as follows:

(None, 16, 4, 11)

The cropping values can be chosen according to your requirements. The tf.keras.layers.Cropping2D documentation gives additional information.
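
As a standalone check of how the symmetric cropping form behaves:

import tensorflow as tf

x = tf.zeros((1, 10, 12, 11))
y = tf.keras.layers.Cropping2D(cropping=(1, 2))(x)  # 1 row off top and bottom, 2 columns off each side
print(y.shape)  # (1, 8, 8, 11)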

Answered By: Ipvikukiepki-KQS