How to load a dataset starting from a list of images in PyTorch

Question:

I have a service that receives images in a binary format from another service (let’s call it service B):

import io

from PIL import Image

img_list = []
img_bin = get_image_from_service_B()
image = Image.open(io.BytesIO(img_bin)) # Convert bytes to image using PIL

Each image that PIL converts successfully is also appended to a list of images:

img_list.append(image)

Once I have enough images, I want to load my list of images with PyTorch as if it were a dataset:

if len(img_list) == 500:
    # Load the dataset and do a transform operation on the data
    ...

In a previous version of the software, the requirement was simply to retrieve the images from a folder, so loading them all was straightforward:

from torch.utils.data import DataLoader
from torchvision import datasets

my_dataset = datasets.ImageFolder("path/to/images/folder/", transform=transform)
dataset_iterator = DataLoader(my_dataset, batch_size=1)

Now my issue is how to perform the transform and load the dataset from a list.

Asked By: Tajinder Singh


Answers:

You can simply write a custom dataset:

import torch

class MyDataset(torch.utils.data.Dataset):
    def __init__(self, img_list, augmentations):
        super().__init__()
        self.img_list = img_list            # list of PIL images already in memory
        self.augmentations = augmentations  # transform applied to each image

    def __len__(self):
        return len(self.img_list)

    def __getitem__(self, idx):
        img = self.img_list[idx]
        return self.augmentations(img)

You can now plug this custom dataset into DataLoader and you are done.

Answered By: Shai