PyTorch transforms.Compose usage for pairs of images in segmentation tasks

Question:

I’m trying to use transforms.Compose() in my segmentation task, but I’m not sure how to apply (almost) the same random transforms to both the image and the mask.

In my segmentation task I have the raw picture and the corresponding mask, and I’d like to generate more randomly transformed image pairs for training purposes. That is, if I apply some transform to a raw picture, the same transformation should also be applied to its mask, and then the pair can go into my CNN. My transform is something like:

train_transform = transforms.Compose([
            transforms.Resize(512), # resize, the smaller edge will be matched.
            transforms.RandomHorizontalFlip(p=0.5),
            transforms.RandomVerticalFlip(p=0.5),
            transforms.RandomRotation(90),
            transforms.RandomResizedCrop(320,scale=(0.3, 1.0)),
            AddGaussianNoise(0., 1.),
            transforms.ToTensor(), # convert a PIL image or ndarray to tensor. 
            transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) # normalize to ImageNet mean and std
])

mask_transform = transforms.Compose([
            transforms.Resize(512), # resize, the smaller edge will be matched.
            transforms.RandomHorizontalFlip(p=0.5),
            transforms.RandomVerticalFlip(p=0.5),
            transforms.RandomRotation(90),
            transforms.RandomResizedCrop(320,scale=(0.3, 1.0)),
            ## AddGaussianNoise deliberately omitted for the mask
            transforms.ToTensor(), # convert a PIL image or ndarray to tensor. 
            transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225)) # normalize to ImageNet mean and std
])

Notice that in the first code block I added a class that injects random noise into the raw-image transformation; it is not in mask_transform, because I want the mask to follow the raw image’s geometric transformations but skip the random noise. So how can these two transformations happen as a pair (with the same random choices)?
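
(For reference, AddGaussianNoise is not a built-in torchvision transform; a common hand-rolled version, assumed here, operates on tensors and would therefore normally be placed after ToTensor():)

import torch

class AddGaussianNoise:
    # Assumed implementation: adds N(mean, std) noise to a tensor image.
    def __init__(self, mean=0., std=1.):
        self.mean = mean
        self.std = std

    def __call__(self, tensor):
        return tensor + torch.randn(tensor.size()) * self.std + self.mean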

Asked By: kikyo91


Answers:

This seems to have an answer here: How to apply same transform on a pair of picture.

Basically, you can use the torchvision functional API to get hold of the randomly generated parameters of a random transform such as RandomCrop, and then call torchvision.transforms.functional.crop() on both images with the same parameter values. It is a bit verbose but gets the job done, and you can skip some transforms for some images as needed.
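
For instance, a minimal sketch of that pattern with RandomCrop (the helper function is my own illustration, not code from the linked thread):

import torchvision.transforms as transforms
import torchvision.transforms.functional as TF

def paired_random_crop(image, mask, size=(320, 320)):
    # Sample the crop location once...
    i, j, h, w = transforms.RandomCrop.get_params(image, output_size=size)
    # ...then apply exactly the same crop to both inputs.
    return TF.crop(image, i, j, h, w), TF.crop(mask, i, j, h, w)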

Another option I’ve seen elsewhere is to re-seed the random generator with the same seed before each call, forcing the same random transformations to be generated twice. Such implementations strike me as hacky, and they tend to break across PyTorch versions (e.g., should you re-seed np.random, random, or torch.manual_seed()?).
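
For reference, that re-seeding pattern looks roughly like this (a sketch only; which generator a given transform draws from has changed across versions, hence the fragility):

import random
import torch

seed = torch.randint(0, 2**31, (1,)).item()

# Seed both generators and transform the image...
random.seed(seed)
torch.manual_seed(seed)
image_t = train_transform(image)

# ...then re-seed identically and transform the mask. This only stays in
# sync as long as both pipelines draw random numbers in the same order.
random.seed(seed)
torch.manual_seed(seed)
mask_t = mask_transform(mask)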

Answered By: Sabyasachi Ghosh

Sabyasachi’s answer was really helpful for me, and I was able to use it to transform my images in PyTorch. Since this use of torchvision.transforms.functional is not the most obvious, I’m adding my solution as an example; it uses torchvision.transforms.functional together with skimage.filters, and many more transform functions are available here: https://scikit-image.org/docs/dev/api/skimage.filters.html#skimage.filters.unsharp_mask.

import random

import numpy as np
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
import torchvision.transforms.functional as TF
from skimage.filters import gaussian, unsharp_mask

def transformer(image, mask):
    # image and mask are PIL Image objects of the same size.
    img_w, img_h = image.size
    
    # Random horizontal flipping
    if random.random() > 0.5:
        image = TF.hflip(image)
        mask = TF.hflip(mask)

    # Random vertical flipping
    if random.random() > 0.5:
        image = TF.vflip(image)
        mask = TF.vflip(mask)
  
    # Random affine: sample the parameters once, then apply the identical
    # affine to both the image and the mask (TF.affine defaults to nearest
    # interpolation, which avoids mixing mask labels).
    affine_param = transforms.RandomAffine.get_params(
        degrees=[-180, 180], translate=[0.3, 0.3],
        img_size=[img_w, img_h], scale_ranges=[1, 1.3],
        shears=[2, 2])
    image = TF.affine(image,
                      affine_param[0], affine_param[1],
                      affine_param[2], affine_param[3])
    mask = TF.affine(mask,
                     affine_param[0], affine_param[1],
                     affine_param[2], affine_param[3])

    # Switch to numpy arrays for the skimage-based filters below.
    image = np.array(image)
    mask = np.array(mask)
    
    # Random Gaussian blur -- only for the image
    if random.random() < 0.25:
        sigma_param = random.uniform(0.01, 1)
        image = gaussian(image, sigma=sigma_param)
    
    # Random Gaussian noise -- only for the image
    if random.random() < 0.25:
        factor_param = random.uniform(0.01, 0.5)
        # randn(*image.shape) matches both grayscale (H, W) and RGB (H, W, 3)
        image = image + factor_param * image.std() * np.random.randn(*image.shape)
    
    # Unsharp mask filter -- only for the image
    if random.random() < 0.25:
        radius_param = random.uniform(0, 5)
        amount_param = random.uniform(0.5, 2)
        image = unsharp_mask(image, radius = radius_param, amount=amount_param)
    
    # Visualize the augmented pair (debugging aid; remove for actual training).
    f, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(image)
    ax[1].imshow(mask)

    return image, mask
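
A sketch of how this function can be wired into a Dataset (the dataset class and path handling here are placeholders, not from the original post):

import numpy as np
import torch
from torch.utils.data import Dataset
from PIL import Image

class PairedSegmentationDataset(Dataset):
    # Placeholder dataset: image_paths/mask_paths are parallel lists of files.
    def __init__(self, image_paths, mask_paths):
        self.image_paths = image_paths
        self.mask_paths = mask_paths

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        image = Image.open(self.image_paths[idx]).convert("RGB")
        mask = Image.open(self.mask_paths[idx])
        image, mask = transformer(image, mask)  # the paired augmentation above
        # transformer returns numpy arrays (H, W, C) and (H, W); convert to
        # CHW tensors (dtype/scale normalization is left out of this sketch).
        image = torch.from_numpy(image.astype(np.float32)).permute(2, 0, 1)
        mask = torch.from_numpy(mask.astype(np.int64))
        return image, mask
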
Answered By: kikyo91

I think I have a simple solution:
if the images are concatenated into one tensor, the transformations are applied to all of them identically:

import torch
import torchvision.transforms as T

# Create two fake images (identical for test purposes):
image = torch.randn((3, 128, 128))
target = image.clone()

# This is the trick (concatenate the images):
both_images = torch.cat((image.unsqueeze(0), target.unsqueeze(0)), 0)

# Apply the transformations to both images simultaneously:
transformed_images = T.RandomRotation(180)(both_images)

# Get the transformed images:
image_trans = transformed_images[0]
target_trans = transformed_images[1]

# Compare the transformed images:
torch.all(image_trans == target_trans).item()

>> True
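
The same trick extends to an image/mask pair with different channel counts by concatenating along the channel dimension instead; a sketch, assuming a (3, H, W) image and a (1, H, W) mask:

# Single-channel mask alongside a 3-channel image:
image = torch.randn((3, 128, 128))
mask = torch.randint(0, 2, (1, 128, 128)).float()

# Concatenate along the channel dimension -> (4, H, W):
both = torch.cat((image, mask), dim=0)

# One call transforms both with identical geometry:
both = T.RandomRotation(180)(both)

# Split back apart:
image_trans, mask_trans = both[:3], both[3:]

Note that RandomRotation defaults to nearest interpolation, which avoids mixing mask labels; transforms that interpolate bilinearly need more care with masks.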
Answered By: Mario Galindo