Replacing a pooling layer with a Conv2D layer in Keras

Question:

I have a neural network in Keras with two Conv2D layers, an average pooling layer, and a dense output layer.
I want to deploy the trained model on an FPGA later on, and the target architecture does not support MaxPooling or AveragePooling layers.
However, I read somewhere that you can essentially implement pooling with a Conv2D layer by choosing the right parameters, but I am unsure how to do it exactly.
I naively thought that a pooling layer (max or average or whatever) like this:

model.add(tf.keras.layers.AveragePooling2D(pool_size=(1, 3)))

would do roughly the same job as this:

model.add(tf.keras.layers.Conv2D(1, (1, 3),strides=(1,3),use_bias=False,padding='same',name='Conv3'))
model.add(tf.keras.layers.BatchNormalization())
model.add(tf.keras.layers.Activation('relu'))

Here I thought that choosing a single filter would be equivalent to telling the network to perform one operation (e.g. averaging or maxing, whichever fits best), and that the kernel size and stride should match those of the pooling layer.
However, the total parameter counts of my models are vastly different, and I fail to understand why the model with average pooling has 15,644 parameters while the Conv2D variant has only 2,604.
The model also performs a lot worse when I do it this way.

Asked By: darmstadt beste


Answers:

You could create a conv layer, set its weights so that it performs average pooling, and then mark the layer as not trainable.

Example code:

import numpy as np
from tensorflow.keras.layers import AveragePooling2D, Conv2D
from tensorflow.keras.models import Sequential

# Kernel shape is (kernel_h, kernel_w, in_channels, filters); it should be
# computed from the shape of the previous layer's output.
conv_pool_weights = np.zeros((2, 2, 4, 4))

# Output channel i looks only at input channel i, with every kernel entry
# set to 1 / (pool area) -- i.e. a per-channel 2x2 average.
for i in range(conv_pool_weights.shape[2]):
    conv_pool_weights[:, :, i, i] = 1./(conv_pool_weights.shape[0]*conv_pool_weights.shape[1])

conv_pool = Conv2D(4, kernel_size=(2, 2), strides=(2, 2),
                   input_shape=(16, 16, 4), use_bias=False)

model_conv = Sequential([conv_pool])

conv_pool.set_weights([conv_pool_weights])
conv_pool.trainable = False

model_pool = Sequential([
    AveragePooling2D(input_shape=(16, 16, 4))
])

model_conv.summary()
model_pool.summary()

Output:

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 conv2d (Conv2D)             (None, 8, 8, 4)           64        
                                                                 
=================================================================
Total params: 64
Trainable params: 0
Non-trainable params: 64
_________________________________________________________________
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 average_pooling2d (AverageP  (None, 8, 8, 4)          0         
 ooling2D)                                                       
                                                                 
=================================================================
Total params: 0
Trainable params: 0
Non-trainable params: 0
_________________________________________________________________

Test:

random_input = np.random.random((4, 16, 16, 4))

pred_1 = model_pool.predict(random_input)
pred_2 = model_conv.predict(random_input)

print(np.mean(np.abs(pred_1 - pred_2)))

Output:

1.1503289e-08

As we can see there is a small difference, but it is only floating-point error and is negligible.
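If your FPGA toolchain happens to support depthwise convolutions, the same frozen average pooling can be expressed more compactly with a `DepthwiseConv2D` layer, which applies one kernel per channel instead of a full `in_channels x filters` kernel bank. This is just a sketch of the same idea under that assumption:

```python
import numpy as np
from tensorflow.keras.layers import AveragePooling2D, DepthwiseConv2D
from tensorflow.keras.models import Sequential

# One 2x2 kernel per input channel; the depthwise kernel shape is
# (kernel_h, kernel_w, in_channels, depth_multiplier) = (2, 2, 4, 1).
dw_pool = DepthwiseConv2D(kernel_size=(2, 2), strides=(2, 2),
                          use_bias=False, input_shape=(16, 16, 4))
model_dw = Sequential([dw_pool])

# Every weight is 1 / (pool area), so each output pixel is the average
# of its 2x2 input window, channel by channel.
dw_pool.set_weights([np.full((2, 2, 4, 1), 0.25)])
dw_pool.trainable = False

model_pool = Sequential([AveragePooling2D(input_shape=(16, 16, 4))])

x = np.random.random((4, 16, 16, 4)).astype(np.float32)
print(np.max(np.abs(model_dw.predict(x) - model_pool.predict(x))))  # tiny
```

This uses 16 weights instead of 64 for the 4-channel case, which may matter if kernel memory on the FPGA is tight.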

Answered By: maciek97x