Conversion to .lmdb format of PNG Images

Question:

I ran the Python script below to convert my PNG images to .lmdb files:

import sys
import os
import os.path
import glob
import pickle
import lmdb
import cv2

sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
from utils.progress_bar import ProgressBar

# configurations
img_folder = "/content/drive/MyDrive/sub-dataset-2/divk/*"  # glob matching pattern
lmdb_save_path = "/content/drive/MyDrive/sub-dataset-2/trainHR_lmdb/" #'/mnt/SSD/xtwang/BasicSR_datasets/DIV2K800/DIV2K800.lmdb'  # must end with .lmdb

img_list = sorted(glob.glob(img_folder))
dataset = []
data_size = 0

print('Read images...')
pbar = ProgressBar(len(img_list))
for i, v in enumerate(img_list):
    pbar.update('Read {}'.format(v))
    img = cv2.imread(v, cv2.IMREAD_UNCHANGED)
    dataset.append(img)
    data_size += img.nbytes
env = lmdb.open(lmdb_save_path, map_size=data_size * 10)
print('Finish reading {} images.\nWrite lmdb...'.format(len(img_list)))

pbar = ProgressBar(len(img_list))
with env.begin(write=True) as txn:  # txn is a Transaction object
    for i, v in enumerate(img_list):
        pbar.update('Write {}'.format(v))
        base_name = os.path.splitext(os.path.basename(v))[0]
        key = base_name.encode('ascii')
        data = dataset[i]
        if dataset[i].ndim == 2:
            H, W = dataset[i].shape
            C = 1
        else:
            H, W, C = dataset[i].shape
        meta_key = (base_name + '.meta').encode('ascii')
        meta = '{:d}, {:d}, {:d}'.format(H, W, C)
        # The encode is only essential in Python 3
        txn.put(key, data)
        txn.put(meta_key, meta.encode('ascii'))
print('Finish writing lmdb.')

# create keys cache
keys_cache_file = os.path.join(lmdb_save_path, '_keys_cache.p')
env = lmdb.open(lmdb_save_path, readonly=True, lock=False, readahead=False, meminit=False)
with env.begin(write=False) as txn:
    print('Create lmdb keys cache: {}'.format(keys_cache_file))
    keys = [key.decode('ascii') for key, _ in txn.cursor()]
    pickle.dump(keys, open(keys_cache_file, "wb"))
print('Finish creating lmdb keys cache.')

After I ran the script in Google Colab, I got this error:

 Read images...
[                                                  ] 0/9, elapsed: 0s, ETA:
Start...
[█████---------------------------------------------] 1/9, 246723.8 task/s, elapsed: 0s, ETA:     0s
Read /content/drive/MyDrive/sub-dataset-2/divk/0843.png
[███████████---------------------------------------] 2/9, 44.3 task/s, elapsed: 0s, ETA:     0s
Read /content/drive/MyDrive/sub-dataset-2/divk/0868.png
libpng error: Read Error
Traceback (most recent call last):
  File "/content/rs-esrgan/scripts/create_lmdb.py", line 26, in <module>
    data_size += img.nbytes
AttributeError: 'NoneType' object has no attribute 'nbytes'

How can I run the script successfully?

Asked By: Stud17


Answers:

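When using imread, you need to check its return value. cv2.imread does not raise an exception when it fails; it returns None. That happens if the path is wrong and the file cannot be found, if permission problems prevent the file from being read, or if the image data is malformed and decoding fails. In your log, the "libpng error: Read Error" printed after 0868.png suggests that this file could not be read or decoded, so imread returned None and img.nbytes raised the AttributeError.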

Example:

for i, v in enumerate(img_list):
    pbar.update('Read {}'.format(v))
    img = cv2.imread(v, cv2.IMREAD_UNCHANGED)
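    # Compare with None explicitly: "not img" raises a ValueError for a
    # NumPy array with more than one element.
    if img is None:
        print(f"Error! Image {v} could not be read! Ignoring")
        continue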
    dataset.append(img)
    data_size += img.nbytes

Answered By: ypnos
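
One thing to watch when skipping files this way: the write loop in the question iterates over img_list and indexes dataset[i], so once an image is skipped the two lists no longer line up. A minimal sketch of one way to keep them in sync is shown here. The helper name read_images and its read_fn parameter are hypothetical and not part of the original script; read_fn stands in for any reader that returns None on failure, as cv2.imread does.

```python
def read_images(paths, read_fn):
    """Return (kept_paths, images, total_bytes), skipping unreadable files.

    read_fn is any callable that returns None on failure (hypothetical
    parameter; cv2.imread behaves this way).
    """
    kept_paths, images, total_bytes = [], [], 0
    for path in paths:
        img = read_fn(path)
        if img is None:  # failure is signalled by None, not by an exception
            print(f"Skipping unreadable image: {path}")
            continue
        # Append path and image together so both lists share the same indices.
        kept_paths.append(path)
        images.append(img)
        total_bytes += img.nbytes
    return kept_paths, images, total_bytes


# Possible usage with the script from the question (requires OpenCV):
# import cv2
# img_list, dataset, data_size = read_images(
#     img_list, lambda p: cv2.imread(p, cv2.IMREAD_UNCHANGED))
```

Rebinding img_list to the filtered paths means the later loops (enumerate(img_list) with dataset[i]) only see images that were actually loaded, and data_size only counts those images.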