How does downsample work in ResNet in PyTorch code?
Question:
In this PyTorch ResNet code example they define downsample as a variable on line 44, and on line 58 they use it as a function. How does this downsample work here, both from a CNN point of view and from a Python code point of view?
Code example: PyTorch ResNet
I searched for whether downsample is a built-in PyTorch function, but it is not.
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1, norm_layer=None):
        super(BasicBlock, self).__init__()
        if norm_layer is None:
            norm_layer = nn.BatchNorm2d
        if groups != 1:
            raise ValueError('BasicBlock only supports groups=1')
        # Both self.conv1 and self.downsample layers downsample the input when stride != 1
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = norm_layer(planes)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = norm_layer(planes)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            identity = self.downsample(x)

        out += identity
        out = self.relu(out)

        return out
Answers:
I believe that in this context it can be average pooling or max pooling. Both reduce the spatial dimensions while preserving most of the properties of the input.
If you look into the original ResNet paper (http://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf), they use strided convolutions to downsample the image. The main path is downsampled automatically by these strided convolutions, as is done in your code. The residual path uses either (a) an identity mapping padded with zero entries, which adds no additional parameters, or (b) a 1×1 convolution with the same stride.
The second option could look as follows:

if downsample:
    self.downsample = conv1x1(inplanes, planes, stride)
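To see why a 1×1 convolution with the same stride makes the shortcut line up with the main path, here is a small arithmetic sketch using the standard convolution output-size formula, floor((n + 2p - k) / s) + 1. The input size 56 is an illustrative assumption, not a value from the question:

```python
def conv_out_size(n, kernel, stride, padding):
    """Spatial output size of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1

# Main path of BasicBlock with stride=2: conv3x3 uses padding=1.
main = conv_out_size(56, kernel=3, stride=2, padding=1)

# Shortcut with a 1x1 convolution, the same stride, and padding=0.
shortcut = conv_out_size(56, kernel=1, stride=2, padding=0)

print(main, shortcut)  # both 28, so out += identity is shape-compatible
```

Because both paths use the same stride, they shrink the feature map by the same factor, and the element-wise addition works.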
In this ResNet example, when we define the BasicBlock class we pass downsample as a constructor parameter.

def __init__(self, inplanes, planes, stride=1, downsample=None, groups=1, norm_layer=None):

If we pass nothing to the class, then downsample = None and, as a result, the identity is not changed. When we pass some convolution layer as the downsample constructor argument, it will downsample the identity via that layer so the addition can be performed successfully. This happens in the code as follows:

if self.downsample is not None:
    identity = self.downsample(x)
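From the Python point of view, self.downsample is simply an optional callable stored on the instance; nn.Module objects are callable, so self.downsample(x) works for any layer you pass in. A minimal torch-free stand-in illustrating just that mechanic (the Block class and the *2 / //2 operations are illustrative placeholders, not real layers):

```python
class Block:
    def __init__(self, downsample=None):
        # None, or any callable (an nn.Module instance would also qualify)
        self.downsample = downsample

    def forward(self, x):
        identity = x
        out = x * 2                        # stands in for conv/bn/relu on the main path
        if self.downsample is not None:
            identity = self.downsample(x)  # transform the shortcut only when provided
        return out + identity

print(Block().forward(10))                             # 30: identity unchanged
print(Block(downsample=lambda x: x // 2).forward(10))  # 25: identity transformed
```

The real BasicBlock works the same way: the constructor just stores whatever callable it is given, and forward invokes it only when it is not None.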
In addition to what Thomas Pinetz said: in the ResNet-50 architecture, this is the downsampling step:

downsample = nn.Sequential(
    conv1x1(self.inplanes, planes * block.expansion, stride),
    norm_layer(planes * block.expansion),
)

Note: a 1×1 convolution followed by batch normalization.
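As an arithmetic sketch of what that nn.Sequential must achieve: for the Bottleneck block used in ResNet-50 (expansion = 4), the shortcut has to grow to planes * expansion channels while shrinking spatially by the stride. The concrete numbers below (inplanes=256, planes=128, a 56×56 input) are an assumed stage-transition example, not values taken from the question:

```python
def shortcut_shape(inplanes, planes, expansion, stride, spatial):
    """Shape produced by the 1x1 downsample conv: channels grow to
    planes * expansion, and the spatial size shrinks by the stride."""
    out_channels = planes * expansion
    out_spatial = (spatial - 1) // stride + 1  # 1x1 kernel, padding 0
    return out_channels, out_spatial

# Assumed transition into a deeper ResNet-50 stage (Bottleneck, expansion=4):
# 256 input channels at 56x56 must become 128 * 4 = 512 channels at 28x28.
print(shortcut_shape(256, 128, 4, 2, 56))  # (512, 28)
```

Only when the shortcut's output matches the main path's output shape can the residual addition proceed, which is exactly why the 1×1 convolution carries both the channel expansion and the stride.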