How to check dimensions of all images in a directory using python?

Question:

I need to check the dimensions of images in a directory. Currently it has ~700 images.
I just need to check the sizes, and if the size does not match a given dimension, it will be moved to a different folder. How do I get started?

Asked By: john2x


Answers:

You can use the Python Imaging Library (aka PIL) to read the image headers and query the dimensions.

One way to approach it would be to write a function that takes a filename and returns the dimensions (using PIL). Then use the os.walk function (os.path.walk in Python 2) to traverse all the files in the directory, applying this function. Collecting the results, you can build a dictionary mapping filename -> dimensions, then use a list comprehension (or itertools) to select those that do not match the required size.
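
A minimal sketch of the approach described above, assuming Pillow is installed and using os.walk (the Python 3 replacement for os.path.walk). The names get_dimensions, collect_dimensions and mismatched are made up for illustration, and the reader argument exists only so the size lookup can be swapped out.

```python
import os


def get_dimensions(path):
    """Return (width, height) of one image file, read with Pillow."""
    # Imported here so the rest of the sketch loads even without Pillow.
    from PIL import Image
    with Image.open(path) as img:
        return img.size


def collect_dimensions(root, reader=get_dimensions):
    """Walk `root` and build a dict mapping file path -> (width, height)."""
    sizes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                sizes[full_path] = reader(full_path)
            except OSError:
                # Unreadable file or not an image: skip it.
                continue
    return sizes


def mismatched(sizes, required):
    """List the paths whose dimensions differ from `required`."""
    return [path for path, size in sizes.items() if size != required]
```

For example, mismatched(collect_dimensions('/path/to/images'), (800, 600)) would list every readable image that is not 800x600; the path and the size are placeholders.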

Answered By: gavinb

One common way is to use PIL, the Python Imaging Library, to get the dimensions:

from PIL import Image
import os.path

filename = os.path.join('path', 'to', 'image', 'file')
img = Image.open(filename)
print(img.size)

Then you need to loop over the files in your directory, check the dimensions against your required dimensions, and move those files that do not match.
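
The loop described above could look roughly like the sketch below. It assumes Pillow is installed; move_mismatched, read_size and the reader argument are hypothetical names chosen for this example, not part of any library.

```python
import os
import shutil


def read_size(path):
    """Return (width, height) of an image file, read with Pillow."""
    # Imported here so the rest of the sketch loads even without Pillow.
    from PIL import Image
    with Image.open(path) as img:
        return img.size


def move_mismatched(src_dir, dest_dir, required, reader=read_size):
    """Move files in src_dir whose size differs from `required` to dest_dir.

    Returns the names of the files that were moved.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # skip sub-directories
        try:
            size = reader(src)
        except OSError:
            continue  # unreadable or not an image: leave it in place
        if size != required:
            shutil.move(src, os.path.join(dest_dir, name))
            moved.append(name)
    return moved
```

Files that cannot be read as images are left where they are, so nothing is moved by mistake.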

Answered By: mhawke

If you don’t need the rest of PIL and just want image dimensions of PNG, JPEG and GIF then this small function (BSD license) does the job nicely:

http://code.google.com/p/bfg-pages/source/browse/trunk/pages/getimageinfo.py

import StringIO
import struct

def getImageInfo(data):
    data = str(data)
    size = len(data)
    height = -1
    width = -1
    content_type = ''

    # handle GIFs
    if (size >= 10) and data[:6] in ('GIF87a', 'GIF89a'):
        # Check to see if content_type is correct
        content_type = 'image/gif'
        w, h = struct.unpack("<HH", data[6:10])
        width = int(w)
        height = int(h)

    # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
    # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
    # and finally the 4-byte width, height
    elif ((size >= 24) and data.startswith('\211PNG\r\n\032\n')
          and (data[12:16] == 'IHDR')):
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[16:24])
        width = int(w)
        height = int(h)

    # Maybe this is for an older PNG version.
    elif (size >= 16) and data.startswith('\211PNG\r\n\032\n'):
        # Check to see if we have the right content type
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[8:16])
        width = int(w)
        height = int(h)

    # handle JPEGs
    elif (size >= 2) and data.startswith('\377\330'):
        content_type = 'image/jpeg'
        jpeg = StringIO.StringIO(data)
        jpeg.read(2)
        b = jpeg.read(1)
        try:
            while (b and ord(b) != 0xDA):
                while (ord(b) != 0xFF): b = jpeg.read(1)
                while (ord(b) == 0xFF): b = jpeg.read(1)
                if (ord(b) >= 0xC0 and ord(b) <= 0xC3):
                    jpeg.read(3)
                    h, w = struct.unpack(">HH", jpeg.read(4))
                    break
                else:
                    jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0])-2)
                b = jpeg.read(1)
            width = int(w)
            height = int(h)
        except struct.error:
            pass
        except ValueError:
            pass

    return content_type, width, height
Answered By: John Slade

Here is a script that does what you need:

#!/usr/bin/env python

"""
Get information about images in a folder.
"""

from os import listdir
from os.path import isfile, join

from PIL import Image


def print_data(data):
    """
    Parameters
    ----------
    data : dict
    """
    for k, v in data.items():
        print("%s:\t%s" % (k, v))
    print("Min width: %i" % data["min_width"])
    print("Max width: %i" % data["max_width"])
    print("Min height: %i" % data["min_height"])
    print("Max height: %i" % data["max_height"])


def main(path):
    """
    Parameters
    ----------
    path : str
        Path where to look for image files.
    """
    onlyfiles = [f for f in listdir(path) if isfile(join(path, f))]

    # Filter files by extension
    onlyfiles = [f for f in onlyfiles if f.endswith(".jpg")]

    data = {}
    data["images_count"] = len(onlyfiles)
    data["min_width"] = 10 ** 100  # No image will be bigger than that
    data["max_width"] = 0
    data["min_height"] = 10 ** 100  # No image will be bigger than that
    data["max_height"] = 0

    for filename in onlyfiles:
        im = Image.open(join(path, filename))
        width, height = im.size
        data["min_width"] = min(width, data["min_width"])
        data["max_width"] = max(width, data["max_width"])
        data["min_height"] = min(height, data["min_height"])
        data["max_height"] = max(height, data["max_height"])

    print_data(data)


if __name__ == "__main__":
    main(path=".")
Answered By: Martin Thoma

The answers above are good, and they helped me write another simple answer to this question.

Since the answers above contain only scripts, readers have to run them to check whether they work. So I decided to solve the problem interactively, in the Python shell.

I think it will be clear to you. I am using Python 2.7.12 and have installed the Pillow library to get PIL for accessing images. I have a lot of jpg images and 1 png image in my current directory.

Now let’s move on to the Python shell.

>>> #Date of creation : 3 March 2017
>>> #Python version   : 2.7.12
>>>
>>> import os         #Importing os module
>>> import glob       #Importing glob module to list the same type of image files like jpg/png(here)
>>> 
>>> for extension in ["jpg", 'png']:
...     print "List of all " + extension + " files in current directory:-"
...     i = 1
...     for imgfile in glob.glob("*."+extension):
...         print i,") ",imgfile
...         i += 1
...     print "\n"
... 
List of all jpg files in current directory:-
1 )  002-tower-babel.jpg
2 )  1454906.jpg
3 )  69151278-great-hd-wallpapers.jpg
4 )  amazing-ancient-wallpaper.jpg
5 )  Ancient-Rome.jpg
6 )  babel_full.jpg
7 )  Cuba-is-wonderfull.jpg
8 )  Cute-Polar-Bear-Images-07775.jpg
9 )  Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg
10 )  Hard-work-without-a-lh.jpg
11 )  jpeg422jfif.jpg
12 )  moscow-park.jpg
13 )  moscow_city_night_winter_58404_1920x1080.jpg
14 )  Photo1569.jpg
15 )  Pineapple-HD-Photos-03691.jpg
16 )  Roman_forum_cropped.jpg
17 )  socrates.jpg
18 )  socrates_statement1.jpg
19 )  steve-jobs.jpg
20 )  The_Great_Wall_of_China_at_Jinshanling-edit.jpg
21 )  torenvanbabel_grt.jpg
22 )  tower_of_babel4.jpg
23 )  valckenborch_babel_1595_grt.jpg
24 )  Wall-of-China-17.jpg


List of all png files in current directory:-
1 )  gergo-hungary.png


>>> #So let's display all the resolutions with the filename
... from PIL import Image   #Importing Python Imaging library(PIL)
>>> for extension in ["jpg", 'png']:
...     i = 1
...     for imgfile in glob.glob("*." + extension):
...         img = Image.open(imgfile)
...         print i,") ",imgfile,", resolution: ",img.size[0],"x",img.size[1]
...         i += 1
...     print "\n"
... 
1 )  002-tower-babel.jpg , resolution:  1024 x 768
2 )  1454906.jpg , resolution:  1920 x 1080
3 )  69151278-great-hd-wallpapers.jpg , resolution:  5120 x 2880
4 )  amazing-ancient-wallpaper.jpg , resolution:  1920 x 1080
5 )  Ancient-Rome.jpg , resolution:  1000 x 667
6 )  babel_full.jpg , resolution:  1464 x 1142
7 )  Cuba-is-wonderfull.jpg , resolution:  1366 x 768
8 )  Cute-Polar-Bear-Images-07775.jpg , resolution:  1600 x 1067
9 )  Cute-Polar-Bear-Widescreen-Wallpapers-07781.jpg , resolution:  2300 x 1610
10 )  Hard-work-without-a-lh.jpg , resolution:  650 x 346
11 )  jpeg422jfif.jpg , resolution:  2048 x 1536
12 )  moscow-park.jpg , resolution:  1920 x 1200
13 )  moscow_city_night_winter_58404_1920x1080.jpg , resolution:  1920 x 1080
14 )  Photo1569.jpg , resolution:  480 x 640
15 )  Pineapple-HD-Photos-03691.jpg , resolution:  2365 x 1774
16 )  Roman_forum_cropped.jpg , resolution:  4420 x 1572
17 )  socrates.jpg , resolution:  852 x 480
18 )  socrates_statement1.jpg , resolution:  1280 x 720
19 )  steve-jobs.jpg , resolution:  1920 x 1080
20 )  The_Great_Wall_of_China_at_Jinshanling-edit.jpg , resolution:  4288 x 2848
21 )  torenvanbabel_grt.jpg , resolution:  1100 x 805
22 )  tower_of_babel4.jpg , resolution:  1707 x 956
23 )  valckenborch_babel_1595_grt.jpg , resolution:  1100 x 748
24 )  Wall-of-China-17.jpg , resolution:  1920 x 1200


1 )  gergo-hungary.png , resolution:  1236 x 928


>>> 
Answered By: hygull
import os
from PIL import Image 

folder_images = "/tmp/photos"
size_images = dict()

for dirpath, _, filenames in os.walk(folder_images):
    for path_image in filenames:
        image = os.path.abspath(os.path.join(dirpath, path_image))
        with Image.open(image) as img:
            width, height = img.size
            size_images[path_image] = {'width': width, 'height': height}
print(size_images)

Set folder_images to the directory that contains the images.
size_images is a dictionary mapping each file name to its size, in this format.

Example:

{'image_name.jpg' : {'width': 100, 'height': 100} }
Answered By: Elinaldo Monteiro

You can also use the cv2 (OpenCV) library to check the dimensions of images.

import cv2

# read image
img = cv2.imread('boarding_pass.png', cv2.IMREAD_UNCHANGED)

# get dimensions of image
dimensions = img.shape

# height, width, number of channels in image
height = img.shape[0]
width = img.shape[1]
# grayscale images have a 2-element shape, so count them as 1 channel
channels = img.shape[2] if img.ndim == 3 else 1

print('Image Dimension    : ',dimensions)
print('Image Height       : ',height)
print('Image Width        : ',width)
print('Number of Channels : ',channels)
Answered By: Deepak Kumrawat

If you are using IPython or a Jupyter notebook, the function below works well. It relies on the file command available in the Linux terminal. Its advantages:

  • Fast: suitable when the folder contains thousands of images and you need to know the distribution of image sizes
  • Doesn’t need to load the images into memory, which saves RAM
def get_image_size_faster(file_dir, ext='png'):
        """
        Function to retrieve image size without loading the image at all

        params:
        file_dir = path of the folder containing image files
        ext = file extension to match; it selects dim_index, the index of the
              image dimensions in the comma-separated `file $file_path` output
                    For PNG : -3 # Downloads/test.png: PNG image data, 4032 x 3024, 8-bit/color RGB, non-interlaced
                    For JPEG/JPG : -2 # Downloads/test.jpg: JPEG image data,..., baseline, precision 8, 2252x1400, components 3
                    For GIF : -1 # Downloads/test.gif: GIF image data, version 89a, 498 x 373
        """
        dim_index_map = {
            'png' : -3,
            'jpg' : -2,
            'jpeg': -2,
            'gif' : -1
        }

        dim_index = dim_index_map[ext]

        files_regex = "{file_dir}/*.{ext}".format(file_dir=file_dir, ext=ext)
        outputs = !file $files_regex
        dims = [tuple(map(int, x.split(',')[dim_index].strip().split('x'))) for x in outputs]
        return dims

A plain Python script can do the same thing with the subprocess package, which yields the same result.
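
A rough sketch of such a subprocess-based alternative is shown below. It assumes the file command is on the PATH and that its output has the comma-separated layout shown in the docstring above, which can vary between file versions and image variants. The names parse_file_line and get_image_sizes are invented for this example.

```python
import glob
import os
import subprocess

# Index of the "W x H" field in the comma-separated output of `file`,
# taken from the function above.
DIM_INDEX = {"png": -3, "jpg": -2, "jpeg": -2, "gif": -1}


def parse_file_line(line, ext):
    """Extract (width, height) from one line of `file` output."""
    field = line.split(",")[DIM_INDEX[ext]]
    width, height = (int(part) for part in field.strip().split("x"))
    return width, height


def get_image_sizes(file_dir, ext="png"):
    """Run `file` once on every *.ext in file_dir and parse the sizes."""
    paths = sorted(glob.glob(os.path.join(file_dir, "*." + ext)))
    if not paths:
        return []
    result = subprocess.run(
        ["file"] + paths, capture_output=True, text=True, check=True
    )
    return [parse_file_line(line, ext) for line in result.stdout.splitlines()]
```

Calling file once with all paths keeps the process overhead low compared with starting one process per image.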

Answered By: Mainak Chain

I tried to use @JohnTESlade’s answer but had problems with the bytes/string conversion, so I corrected it, applied a few PEPs, and added support for EMF files, which I needed.

import struct
from io import BytesIO
from typing import Tuple


def get_image_info(data: bytes) -> Tuple[str, int, int]:
    size = len(data)
    height = -1
    width = -1
    content_type = ''

    # handle GIFs
    if (size >= 10) and data[:6] in (b'GIF87a', b'GIF89a'):
        # Check to see if content_type is correct
        content_type = 'image/gif'
        w, h = struct.unpack("<HH", data[6:10])
        width = int(w)
        height = int(h)

    # See PNG 2. Edition spec (http://www.w3.org/TR/PNG/)
    # Bytes 0-7 are below, 4-byte chunk length, then 'IHDR'
    # and finally the 4-byte width, height
    elif ((size >= 24) and data[0:8] == b'\211PNG\r\n\032\n'
          and (data[12:16] == b'IHDR')):
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[16:24])
        width = int(w)
        height = int(h)

    # Maybe this is for an older PNG version.
    elif (size >= 16) and data[0:8] == b'\211PNG\r\n\032\n':
        # Check to see if we have the right content type
        content_type = 'image/png'
        w, h = struct.unpack(">LL", data[8:16])
        width = int(w)
        height = int(h)

    # handle JPEGs
    elif (size >= 2) and data[0:2] == b'\377\330':
        content_type = 'image/jpeg'
        jpeg = BytesIO(data)
        jpeg.read(2)
        b = jpeg.read(1)
        w, h = -1, -1
        try:
            while b and ord(b) != 0xDA:
                while ord(b) != 0xFF:
                    b = jpeg.read(1)
                while ord(b) == 0xFF:
                    b = jpeg.read(1)
                if 0xC0 <= ord(b) <= 0xC3:
                    jpeg.read(3)
                    h, w = struct.unpack(">HH", jpeg.read(4))
                    break
                else:
                    jpeg.read(int(struct.unpack(">H", jpeg.read(2))[0]) - 2)
                b = jpeg.read(1)
            width = int(w)
            height = int(h)
        except struct.error:
            pass
        except ValueError:
            pass

    # Maybe this will work for most EMF types.
    elif (size >= 40) and data[0:4] == b'\01\00\00\00':
        # Check to see if we have the right content type
        content_type = 'image/x-emf'
        x, y, r, b = struct.unpack("<LLLL", data[24:40])
        width = int(r - x)
        height = int(b - y)

    return content_type, width, height

Answered By: dmcontador

This is based on Elinaldo Monteiro’s answer, except it puts all the file names and dimensions into a list of dicts.

import os
import pandas as pd
from PIL import Image 

folder_images = "/tmp/photos"
size_images = []

for dirpath, _, filenames in os.walk(folder_images):
    for path_image in filenames:
        image = os.path.abspath(os.path.join(dirpath, path_image))
        with Image.open(image) as img:
            width, height = img.size
            size_images.append(
                {
                    'image': path_image,
                    'width': width,
                    'height': height
                }
            )
df = pd.DataFrame(size_images)  # one row per image, built once after the walk
print(size_images)

Set folder_images to the target directory.
size_images is a list containing every file’s name and dimensions.

Example:

[{'image':'image_name.jpg', 'width':100, 'height':100}]
Answered By: want2mtb