What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?
Question:
What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of TensorFlow?
In my opinion, 'VALID' means there will be no zero padding outside the edges when we do max pooling.
According to A guide to convolution arithmetic for deep learning, there is no padding in the pooling operator, i.e. just use 'VALID' in TensorFlow.
But what is 'SAME' padding for max pooling in TensorFlow?
Answers:
The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:

For the SAME padding, the output height and width are computed as:

out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))

And for the VALID padding, the output height and width are computed as:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
I’ll give an example to make it clearer:

- x: input image of shape [2, 3], 1 channel
- valid_pad: max pool with 2×2 kernel, stride 2 and VALID padding.
- same_pad: max pool with 2×2 kernel, stride 2 and SAME padding (this is the classic way to go)

The output shapes are:

- valid_pad: here, no padding, so the output shape is [1, 1]
- same_pad: here, we pad the image to the shape [2, 4] (with -inf, and then apply max pool), so the output shape is [1, 2]
x = tf.constant([[1., 2., 3.],
[4., 5., 6.]])
x = tf.reshape(x, [1, 2, 3, 1]) # give a shape accepted by tf.nn.max_pool
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
valid_pad.get_shape() == [1, 1, 1, 1] # valid_pad is [5.]
same_pad.get_shape() == [1, 1, 2, 1] # same_pad is [5., 6.]
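To see concretely where the extra output column comes from, the SAME pooling above can be reproduced by hand in plain Python (no TensorFlow needed): pad the 2×3 input on the right with -inf, so the padding can never win the max, then take 2×2 maxima.

```python
import math

x = [[1., 2., 3.],
     [4., 5., 6.]]

# SAME: pad the width from 3 to 4 with -inf so the padding never wins the max
padded = [row + [-math.inf] for row in x]

# 2x2 max pool with stride 2 over the padded 2x4 input
same = [max(padded[0][c], padded[0][c + 1], padded[1][c], padded[1][c + 1])
        for c in range(0, 4, 2)]
print(same)   # [5.0, 6.0] -- matches same_pad above

# VALID: no padding, only one full 2x2 window fits in the 2x3 input
valid = [max(x[0][0], x[0][1], x[1][0], x[1][1])]
print(valid)  # [5.0] -- matches valid_pad above
```

The -inf here is conceptual: for max pooling, SAME padding must not affect the result, which is why the padded cells behave like -inf rather than literal zeros.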
If you like ASCII art:

"VALID" = without padding:

   inputs:      1  2  3  4  5  6  7  8  9  10 11 (12 13)
               |________________|                dropped
                              |_________________|

"SAME" = with zero padding:

               pad|                                      |pad
   inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
               |________________|
                              |_________________|
                                             |________________|
In this example:

- Input width = 13
- Filter width = 6
- Stride = 5

Notes:

- "VALID" only ever drops the rightmost columns (or bottommost rows).
- "SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it will add the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
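The window positions in the ASCII art above can be double-checked with a few lines of arithmetic (plain Python, using the output-size and padding formulas from the TensorFlow docs):

```python
import math

in_width, filter_width, stride = 13, 6, 5

# VALID: windows start at 0, 5, ... as long as they fit entirely inside
valid_starts = list(range(0, in_width - filter_width + 1, stride))
print(valid_starts)  # [0, 5] -> two windows; the last two columns are dropped

# SAME: output size is ceil(in / stride); pad the minimum needed, extra on the right
out_width = math.ceil(in_width / stride)
pad_total = max((out_width - 1) * stride + filter_width - in_width, 0)
pad_left = pad_total // 2
pad_right = pad_total - pad_left
print(out_width, pad_left, pad_right)  # 3 1 2 -- one zero on the left, two on the right
```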
Edit:

About the name:

- With "SAME" padding, if you use a stride of 1, the layer’s outputs will have the same spatial dimensions as its inputs.
- With "VALID" padding, there are no "made-up" padding inputs. The layer only uses valid input data.
Based on the explanation here and following up on Tristan’s answer, I usually use these quick functions for sanity checks.
import numpy as np

# a function to help us stay clean
def getPaddings(pad_along_height, pad_along_width):
    # if even.. easy..
    if pad_along_height % 2 == 0:
        pad_top = pad_along_height // 2
        pad_bottom = pad_top
    # if odd
    else:
        pad_top = int(np.floor(pad_along_height / 2))
        pad_bottom = int(np.floor(pad_along_height / 2)) + 1
    # check if width padding is odd or even
    # if even.. easy..
    if pad_along_width % 2 == 0:
        pad_left = pad_along_width // 2
        pad_right = pad_left
    # if odd
    else:
        pad_left = int(np.floor(pad_along_width / 2))
        pad_right = int(np.floor(pad_along_width / 2)) + 1
    return pad_top, pad_bottom, pad_left, pad_right

# strides: [image index, y, x, depth]
# padding: 'SAME' or 'VALID'
# bottom and right sides always get the one additional padded pixel (if padding is odd)
def getOutputDim(inputWidth, inputHeight, filterWidth, filterHeight, strides, padding):
    if padding == 'SAME':
        out_height = np.ceil(float(inputHeight) / float(strides[1]))
        out_width = np.ceil(float(inputWidth) / float(strides[2]))
        pad_along_height = ((out_height - 1) * strides[1] + filterHeight - inputHeight)
        pad_along_width = ((out_width - 1) * strides[2] + filterWidth - inputWidth)
        # now get padding
        pad_top, pad_bottom, pad_left, pad_right = getPaddings(pad_along_height, pad_along_width)
        print('output height', out_height)
        print('output width', out_width)
        print('total pad along height', pad_along_height)
        print('total pad along width', pad_along_width)
        print('pad at top', pad_top)
        print('pad at bottom', pad_bottom)
        print('pad at left', pad_left)
        print('pad at right', pad_right)
    elif padding == 'VALID':
        out_height = np.ceil(float(inputHeight - filterHeight + 1) / float(strides[1]))
        out_width = np.ceil(float(inputWidth - filterWidth + 1) / float(strides[2]))
        print('output height', out_height)
        print('output width', out_width)
        print('no padding')

# use like so
getOutputDim(80, 80, 4, 4, [1, 1, 1, 1], 'SAME')
When the stride is 1 (more typical with convolution than pooling), we can think of the following distinction:

- "SAME": output size is the same as input size. This requires the filter window to slip outside the input map, hence the need to pad.
- "VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.
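The shrink-by-(filter_size - 1) claim is easy to verify with a toy 1-D valid convolution in plain Python:

```python
signal = list(range(10))  # input length n = 10
kernel = [1, 1, 1]        # filter size k = 3

# VALID, stride 1: slide the kernel only over fully-covered positions
valid = [sum(signal[i + j] * kernel[j] for j in range(len(kernel)))
         for i in range(len(signal) - len(kernel) + 1)]
print(len(valid))  # 8, i.e. 10 - (3 - 1): the output shrank by filter_size - 1
```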
There are three choices of padding: valid (no padding), same (or half), full. You can find explanations (in Theano) here:
http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

- Valid or no padding:
Valid padding involves no zero padding, so it covers only the valid input, not including artificially generated zeros. The length of the output is ((the length of input) - (k-1)) for kernel size k if the stride s=1.
- Same or half padding:
Same padding makes the size of the outputs the same as that of the inputs when s=1. If s=1, the number of zeros padded is (k-1).
- Full padding:
Full padding means that the kernel runs over the whole input, so at the ends the kernel may meet only one input value and zeros elsewhere. The number of zeros padded is 2(k-1) if s=1. The length of the output is ((the length of input) + (k-1)) if s=1.

Therefore, the number of paddings: (valid) <= (same) <= (full)
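The three stride-1 output lengths can be sketched as one small helper (pad here is the total number of zeros added across both ends, per the paragraph above):

```python
def out_len(n, k, padding):
    # total zeros added across both ends, stride 1
    pad = {'valid': 0, 'same': k - 1, 'full': 2 * (k - 1)}[padding]
    return n + pad - (k - 1)

n, k = 10, 3
print(out_len(n, k, 'valid'))  # 8  = n - (k - 1)
print(out_len(n, k, 'same'))   # 10 = n
print(out_len(n, k, 'full'))   # 12 = n + (k - 1)
```

This also makes the ordering (valid) <= (same) <= (full) immediate.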
Padding is an operation to increase the size of the input data. In the case of 1-dimensional data you just append/prepend the array with a constant; in 2 dimensions you surround the matrix with these constants; in n dimensions you surround your n-dimensional hypercube with the constant. In most cases this constant is zero, and it is then called zero-padding.

Here is an example of zero-padding with p=1 applied to a 2-d tensor:

You can use arbitrary padding for your kernel, but some padding values are used more frequently than others:

- VALID padding. The easiest case: it means no padding at all. Just leave your data the same as it was.
- SAME padding, sometimes called HALF padding. It is called SAME because for a convolution with stride=1 (or for pooling) it should produce output of the same size as the input. It is called HALF because for a kernel of size k, the padding on each side is floor(k/2).
- FULL padding is the maximum padding which does not result in a convolution over just padded elements. For a kernel of size k, this padding is equal to k - 1.
To use arbitrary padding in TF, you can use tf.pad()
I am quoting this answer from the official TensorFlow docs: https://www.tensorflow.org/api_guides/python/nn#Convolution
For the ‘SAME’ padding, the output height and width are computed as:
out_height = ceil(float(in_height) / float(strides[1]))
out_width = ceil(float(in_width) / float(strides[2]))
and the padding on the top and left are computed as:
pad_along_height = max((out_height - 1) * strides[1] +
filter_height - in_height, 0)
pad_along_width = max((out_width - 1) * strides[2] +
filter_width - in_width, 0)
pad_top = pad_along_height // 2
pad_bottom = pad_along_height  pad_top
pad_left = pad_along_width // 2
pad_right = pad_along_width  pad_left
For the ‘VALID’ padding, the output height and width are computed as:
out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))
out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))
and the padding values are always zero.
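Those quoted formulas can be transcribed directly into a small helper (the function name is mine, but the arithmetic is exactly the quoted docs); running it on the 2×3 max-pool example from earlier reproduces the single zero column on the right:

```python
import math

def same_padding(in_height, in_width, filter_height, filter_width, strides):
    # output size and explicit pad amounts for SAME, per the TF docs formulas
    out_height = math.ceil(in_height / strides[1])
    out_width = math.ceil(in_width / strides[2])
    pad_along_height = max((out_height - 1) * strides[1] + filter_height - in_height, 0)
    pad_along_width = max((out_width - 1) * strides[2] + filter_width - in_width, 0)
    pad_top = pad_along_height // 2
    pad_bottom = pad_along_height - pad_top
    pad_left = pad_along_width // 2
    pad_right = pad_along_width - pad_left
    return (out_height, out_width), (pad_top, pad_bottom, pad_left, pad_right)

# 2x3 input, 2x2 filter, stride 2 (the max-pool example from above)
print(same_padding(2, 3, 2, 2, [1, 2, 2, 1]))
# ((1, 2), (0, 0, 0, 1)) -- one padded column, on the right
```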
Quick Explanation

VALID: Don’t apply any padding, i.e., assume that all dimensions are valid so that the input image fully gets covered by the filter and stride you specified.

SAME: Apply padding to the input (if needed) so that the input image gets fully covered by the filter and stride you specified. For stride 1, this will ensure that the output image size is the same as the input.

Notes

- This applies to conv layers as well as max pool layers in the same way.
- The term "valid" is a bit of a misnomer, because things don’t become "invalid" if you drop part of the image. Sometimes you might even want that. This should probably have been called NO_PADDING instead.
- The term "same" is a misnomer too, because it only makes sense for a stride of 1, when the output dimension is the same as the input dimension. For a stride of 2, output dimensions will be half, for example. This should probably have been called AUTO_PADDING instead.
- In SAME (i.e. auto-pad mode), TensorFlow will try to spread the padding evenly on both left and right.
- In VALID (i.e. no-padding mode), TensorFlow will drop right and/or bottom cells if your filter and stride don’t fully cover the input image.
VALID padding: this is with no zero padding at all. Hope there is no confusion.
x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='VALID')
print(valid_pad.get_shape())  # output -> (1, 2, 1, 1)
SAME padding: This is kind of tricky to understand at first, because we have to consider two conditions separately, as mentioned in the official docs.

Let’s take the input as i, the output as o, the padding as p, the stride as s and the kernel size as k (only a single dimension is considered).

Case 01: i mod s == 0: p = max(k - s, 0)

Case 02: i mod s != 0: p = max(k - (i mod s), 0)

p is calculated such that it is the minimum value that can be taken for padding. Since the value of p is known, the value of o can be found using the formula o = floor((i + p - k) / s) + 1, which works out to o = ceil(i / s).

Let’s work out this example:

x = tf.constant([[1., 2., 3.], [4., 5., 6.],[ 7., 8., 9.], [ 7., 8., 9.]])
x = tf.reshape(x, [1, 4, 3, 1])
same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding='SAME')
print(same_pad.get_shape())  # output -> (1, 2, 2, 1)

Here the dimensions of x are (4, 3). If the horizontal direction is taken (i = 3, k = 2, s = 2): 3 mod 2 = 1 != 0, so p = max(2 - 1, 0) = 1 and o = floor((3 + 1 - 2) / 2) + 1 = 2.

If the vertical direction is taken (i = 4): 4 mod 2 = 0, so p = max(2 - 2, 0) = 0 and o = floor((4 + 0 - 2) / 2) + 1 = 2.

Hope this will help you understand how SAME padding actually works in TF.
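The two-case rule above can be checked mechanically with a short helper (plain Python; the formulas follow TensorFlow's documented SAME behaviour):

```python
import math

def same_pad_1d(i, k, s):
    # i: input size, k: kernel size, s: stride (TensorFlow's SAME rule, one dimension)
    if i % s == 0:
        p = max(k - s, 0)
    else:
        p = max(k - (i % s), 0)
    o = math.ceil(i / s)
    return p, o

print(same_pad_1d(3, 2, 2))  # horizontal: (1, 2) -> pad 1 column, output 2
print(same_pad_1d(4, 2, 2))  # vertical:   (0, 2) -> no padding, output 2
```

Together the two directions give the output shape (1, 2, 2, 1) seen in the example.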
Padding on/off. Determines the effective size of your input.
VALID:
No padding. Convolution etc. ops are only performed at locations that are “valid”, i.e. not too close to the borders of your tensor.
With a kernel of 3×3 and image of 10×10, you would be performing convolution on the 8×8 area inside the borders.
SAME:
Padding is provided. Whenever your operation references a neighborhood (no matter how big), zero values are provided when that neighborhood extends outside the original tensor to allow that operation to work also on border values.
With a kernel of 3×3 and image of 10×10, you would be performing convolution on the full 10×10 area.
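As a quick sanity check of the 8×8 vs 10×10 figures (plain Python, stride 1, per dimension):

```python
n, k = 10, 3               # 10x10 image, 3x3 kernel

valid_out = n - k + 1      # VALID: the kernel must stay fully inside the image
same_out = n               # SAME: zero padding keeps the full size at stride 1
print(valid_out, same_out) # 8 10
```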
To sum up, ‘valid’ padding means no padding. The output size of the convolutional layer shrinks depending on the input size & kernel size.

On the contrary, ‘same’ padding means using padding. When the stride is set to 1, the output size of the convolutional layer maintains the input size by appending a certain number of zero-valued border pixels around the input data when calculating the convolution.
Hope this intuitive description helps.
Tensorflow 2.0 Compatible Answer: Detailed explanations have been provided above about "Valid" and "Same" padding.

However, I will specify the different pooling functions and their respective commands in Tensorflow 2.x (>= 2.0), for the benefit of the community.

Functions in 1.x:

Max Pooling => tf.nn.max_pool, tf.keras.layers.MaxPool2D

Average Pooling => None in tf.nn, tf.keras.layers.AveragePooling2D

Functions in 2.x:

Max Pooling => tf.nn.max_pool if used in 2.x, and tf.compat.v1.nn.max_pool_v2 or tf.compat.v2.nn.max_pool if migrated from 1.x to 2.x.

tf.keras.layers.MaxPool2D if used in 2.x, and tf.compat.v1.keras.layers.MaxPool2D or tf.compat.v1.keras.layers.MaxPooling2D or tf.compat.v2.keras.layers.MaxPool2D or tf.compat.v2.keras.layers.MaxPooling2D if migrated from 1.x to 2.x.

Average Pooling => tf.nn.avg_pool2d or tf.keras.layers.AveragePooling2D if used in TF 2.x, and tf.compat.v1.nn.avg_pool_v2 or tf.compat.v2.nn.avg_pool or tf.compat.v1.keras.layers.AveragePooling2D or tf.compat.v1.keras.layers.AvgPool2D or tf.compat.v2.keras.layers.AveragePooling2D or tf.compat.v2.keras.layers.AvgPool2D if migrated from 1.x to 2.x.

For more information about migration from Tensorflow 1.x to 2.x, please refer to this Migration Guide.
Complementing YvesgereY’s great answer, I found this visualization extremely helpful:
Padding ‘valid‘ is the first figure. The filter window stays inside the image.
Padding ‘same‘ is the third figure. The output is the same size.
Found it on this article
Visualization credits: vdumoulin@GitHub
Valid padding is no padding.
Same padding is padding in such a way that the output has the same size as the input.
In the TensorFlow function tf.nn.max_pool, the padding parameter determines how the input tensor is padded before the max pooling operation is applied. The ‘VALID’ padding option means that no padding will be applied to the input tensor, and the output tensor will have dimensions that are smaller than the input tensor. For example, if the input tensor has dimensions [batch_size, height, width, channels], the max pooling window has dimensions [pool_height, pool_width], and the stride is 1, then the output tensor will have dimensions [batch_size, (height - pool_height + 1), (width - pool_width + 1), channels].
The ‘SAME’ padding option, on the other hand, means that the input tensor will be padded with zeros in such a way that the output tensor will have the same dimensions as the input tensor. The amount of padding applied to the input tensor will depend on the dimensions of the max pooling window and the stride size. For example, if the input tensor has dimensions [batch_size, height, width, channels] and the max pooling window has dimensions [pool_height, pool_width], and the stride is set to 1, then the output tensor will also have dimensions [batch_size, height, width, channels], and the input tensor will be padded with zeros on the top, bottom, left, and right sides as needed.
In summary, the ‘VALID’ padding option means that no padding will be applied to the input tensor, and the output tensor will have dimensions that are smaller than the input tensor. The ‘SAME’ padding option means that the input tensor will be padded with zeros as needed to ensure that the output tensor has the same dimensions as the input tensor.