How to get rid of checkerboard artifacts

Question:

I am using a fully convolutional autoencoder to color black and white images, however, the output has a checkerboard pattern and I want to get rid of it. The checkerboard artifacts I have seen so far allways have been far smaller than mine and the usual way to get rid of them is replacing all unpooling operations with bilinear upsampling (I have been told that).

But I can not simply replace the unpooling operation because I work with different sized images, thus the unpooling operation is needed, else the output tensor could have a different size than the original.

TLDR:

How can I get rid of these checkerboard-artifacts without replacing the unpooling operations?

class AE(nn.Module):
    def __init__(self):
        super(AE, self).__init__()
        self.leaky_reLU = nn.LeakyReLU(0.2)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=1, return_indices=True)
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2, padding=1)
        self.softmax = nn.Softmax2d()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv5 = nn.Conv2d(in_channels=512, out_channels=1024, kernel_size=3, stride=1, padding=1)
        self.conv6 = nn.ConvTranspose2d(in_channels=1024, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv7 = nn.ConvTranspose2d(in_channels=512, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv8 = nn.ConvTranspose2d(in_channels=256, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.conv9 = nn.ConvTranspose2d(in_channels=128, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv10 = nn.ConvTranspose2d(in_channels=64, out_channels=2, kernel_size=3, stride=1, padding=1)

    def forward(self, x):

        # encoder
        x = self.conv1(x)
        x = self.leaky_reLU(x)
        size1 = x.size()
        x, indices1 = self.pool(x)

        x = self.conv2(x)
        x = self.leaky_reLU(x)
        size2 = x.size()
        x, indices2 = self.pool(x)

        x = self.conv3(x)
        x = self.leaky_reLU(x)
        size3 = x.size()
        x, indices3 = self.pool(x)

        x = self.conv4(x)
        x = self.leaky_reLU(x)
        size4 = x.size()
        x, indices4 = self.pool(x)

        ######################
        x = self.conv5(x)
        x = self.leaky_reLU(x)

        x = self.conv6(x)
        x = self.leaky_reLU(x)
        ######################

        # decoder
        x = self.unpool(x, indices4, output_size=size4)
        x = self.conv7(x)
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices3, output_size=size3)
        x = self.conv8(x)
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices2, output_size=size2)
        x = self.conv9(x)
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices1, output_size=size1)
        x = self.conv10(x)
        x = self.softmax(x)

        return x

NNs output

Asked By: Stefan

||

Answers:

Instead of using an upconv layer such as nn.ConvTranspose2d, you can use interpolation in the decoder part to go back to your initial format, such as torch.nn.functional.interpolate. It will prevent you from having checkerboards artifacts.

If you want learnable weights in the decoder, you should also use a conv layer such as nn.Conv2d after each interpolation.

Answered By: Gregoire Fournier

This pattern you have because of deconvolution (nn.ConvTranspose2d). The article explains it in detail.

You may try Upsample as alternative. This will not provide the checkerboard pattern.

Works like this:

import torch 
input = torch.arange(1, 5, dtype=torch.float32).view(1, 1, 2, 2)
m = torch.nn.Upsample(scale_factor=2, mode='nearest')
m(input)

However you will not be able to learn anything with Upsample. It is just a transform.
So it is a trade. There are many papers online how to deal with the checkerboard pattern for different problems.

The idea is to train your network so the checkerboard pattern is gone.

Answered By: prosti

Skip connection is commonly used in Encoder-Decoder architecture and it helps to produce accurate result by passing appearance information from shallow layer of encoder (discriminator) to corresponding deeper layer of decoder (generator). Unet is the widely used Encoder-Decoder type architecture. Linknet is also very popular and it differs with Unet in the way of fusing appearance information of encoder layer with the decoder layer. In case of Unet, incoming features (from encoder) are concatenated in the corresponding decoder layer. On the other hand, Linknet performs addition and that why Linknet requires fewer number of operations in a single forward pass and significantly faster than the Unet.

Your each convolution block in Decoder might looks like following:
enter image description here

Additionally, i’m attaching a figure bellow depicting architecture of Unet and LinkNet. Hope using skip connection will help.

enter image description here

Answered By: Kaushik Roy

As Kaushik Roy has specified, Skip-Connections are the way to go!

class AE(nn.Module):
    def __init__(self):
        super(AE, self).__init__()
        self.leaky_reLU = nn.LeakyReLU(0.2)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2, padding=1, return_indices=True)
        self.unpool = nn.MaxUnpool2d(kernel_size=2, stride=2, padding=1)
        self.softmax = nn.Softmax2d()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.conv3 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv4 = nn.Conv2d(in_channels=256, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv5 = nn.Conv2d(in_channels=512, out_channels=1024, kernel_size=3, stride=1, padding=1)
        self.conv6 = nn.Conv2d(in_channels=1024, out_channels=512, kernel_size=3, stride=1, padding=1)
        self.conv7 = nn.Conv2d(in_channels=1024, out_channels=256, kernel_size=3, stride=1, padding=1)
        self.conv8 = nn.Conv2d(in_channels=512, out_channels=128, kernel_size=3, stride=1, padding=1)
        self.conv9 = nn.Conv2d(in_channels=256, out_channels=64, kernel_size=3, stride=1, padding=1)
        self.conv10 = nn.Conv2d(in_channels=128, out_channels=2, kernel_size=3, stride=1, padding=1)

    def forward(self, x):

        # encoder
        x = self.conv1(x)
        out1 = self.leaky_reLU(x)
        x = out1
        size1 = x.size()
        x, indices1 = self.pool(x)

        x = self.conv2(x)
        out2 = self.leaky_reLU(x)
        x = out2
        size2 = x.size()
        x, indices2 = self.pool(x)

        x = self.conv3(x)
        out3 = self.leaky_reLU(x)
        x = out3
        size3 = x.size()
        x, indices3 = self.pool(x)

        x = self.conv4(x)
        out4 = self.leaky_reLU(x)
        x = out4
        size4 = x.size()
        x, indices4 = self.pool(x)

        ######################
        x = self.conv5(x)
        x = self.leaky_reLU(x)

        x = self.conv6(x)
        x = self.leaky_reLU(x)
        ######################

        # decoder
        x = self.unpool(x, indices4, output_size=size4)
        x = self.conv7(torch.cat((x, out4), 1))
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices3, output_size=size3)
        x = self.conv8(torch.cat((x, out3), 1))
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices2, output_size=size2)
        x = self.conv9(torch.cat((x, out2), 1))
        x = self.leaky_reLU(x)

        x = self.unpool(x, indices1, output_size=size1)
        x = self.conv10(torch.cat((x, out1), 1))
        x = self.softmax(x)

        return x

This answer was posted as an edit to the question How to get rid of checkerboard artifacts by the OP Stefan under CC BY-SA 4.0.

Answered By: vvvvv