How to check if a file is a valid image file?

Question

I am currently using PIL.

from PIL import Image
try:
    im=Image.open(filename)
    # do stuff
except IOError:
    # filename is not an image file
    pass

However, while this covers most cases, some image files such as xcf, svg and psd are not detected. Psd files throw an OverflowError exception.

Is there some way I could include them as well?

Asked By: Sujoy


Answer 1

Many file formats begin with a magic number in their first few bytes. You could check for this in addition to the exception handling above.
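
A minimal sketch of this idea, using only the standard library. The signature table is deliberately incomplete and the name sniff_image_type is an assumption made for illustration. Note that text-based formats such as svg have no fixed magic number, so they cannot be recognised this way.

```python
# Illustrative, incomplete table of well-known image signatures.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
    b"8BPS": "psd",
    b"gimp xcf": "xcf",
}


def sniff_image_type(filename):
    """Return a format name if the file starts with a known signature, else None."""
    with open(filename, "rb") as f:
        header = f.read(16)  # the longest signature above is 8 bytes
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None
```

A match only says that the header looks right; it does not show that the rest of the file is intact.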

Answered By: Brian R. Bondy

Answer 2

On Linux, you could use python-magic which uses libmagic to identify file formats.

AFAIK, libmagic looks into the file and tries to tell you more about it than just the format, such as bitmap dimensions, format version, etc. So you might see this as a superficial test for "validity".

For other definitions of "valid" you might have to write your own tests.

Answered By: fmarc

Answer 3

In addition to what Brian is suggesting you could use PIL’s verify method to check if the file is broken.

im.verify()

Attempts to determine if the file is
broken, without actually decoding the
image data. If this method finds any
problems, it raises suitable
exceptions. This method only works on
a newly opened image; if the image has
already been loaded, the result is
undefined. Also, if you need to load
the image after using this method, you
must reopen the image file.
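
A minimal sketch of how verify() might be wrapped, assuming Pillow is installed. The function name passes_verify is illustrative, and catching every exception is a simplification. Because the image must be reopened after verification, the sketch opens the file only for the check and leaves any later use to a fresh Image.open call.

```python
from PIL import Image


def passes_verify(filename):
    """Return True if Pillow can open the file and verify() raises nothing."""
    try:
        # verify() only works on a newly opened image, so open it just for this.
        with Image.open(filename) as im:
            im.verify()
        return True
    except Exception:
        # Pillow raises different exception types depending on the format.
        return False
```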

Answered By: Nadia Alramli

Answer 4

You could use the Python bindings to libmagic, python-magic and then check the mime types. This won’t tell you if the files are corrupted or intact but it should be able to determine what type of image it is.

Answered By: Kamil Kisiel

Answer 5

I have just found the built-in imghdr module. From the Python documentation:

The imghdr module determines the type
of image contained in a file or byte
stream.

This is how it works:

>>> import imghdr
>>> imghdr.what('/tmp/bass')
'gif'

Using a module is much better than reimplementing similar functionality.

UPDATE: imghdr is deprecated as of Python 3.11 and was removed in Python 3.13.

Answered By: Nadia Alramli

Answer 6

Update

I also implemented the following solution in my Python script here on GitHub.

I also verified that damaged files (jpg) are frequently not ‘broken’ images, i.e., a damaged picture file sometimes remains a legitimate picture file: the original image is lost or altered, but you can still load it with no errors. File truncation, however, always causes errors.

End Update

You can use the Python Pillow (PIL) module, with most image formats, to check whether a file is a valid and intact image file.

If you also want to detect broken images, @Nadia Alramli correctly suggests the im.verify() method, but this does not detect all possible image defects. For example, im.verify() does not detect truncated images (which most viewers load with a greyed area).

Pillow is able to detect these types of defects too, but you have to apply an image manipulation or an image decode/recode in order to trigger the check. I suggest using this code:

from PIL import Image

try:
  im = Image.open(filename)
  im.verify()  # also run verify(); it may catch other types of defects
  im.close()   # the image must be reopened after verify()
  im = Image.open(filename)
  im.transpose(Image.FLIP_LEFT_RIGHT)  # the manipulation triggers the decode check
  im.close()
except Exception:
  # manage exceptions here
  pass

In case of image defects this code will raise an exception.
Please consider that im.verify is about 100 times faster than performing the image manipulation (and I think that flip is one of the cheaper transformations).
With this code you can verify a set of images at about 10 MB/s with standard Pillow or 40 MB/s with the Pillow-SIMD module (modern 2.5 GHz x86_64 CPU).
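
As an alternative way to trigger a full decode, Pillow's load() method on an opened image reads all pixel data without applying a transformation. The sketch below swaps it in for the transpose call. The function name decodes_fully is an assumption made for illustration, and catching every exception is a simplification.

```python
from PIL import Image


def decodes_fully(filename):
    """Return True if Pillow can open the file and decode all of its pixel data."""
    try:
        with Image.open(filename) as im:
            im.load()  # forces a full decode of the pixel data
        return True
    except Exception:
        # Covers unidentified files as well as decode errors such as truncation.
        return False
```

Unless Pillow has been configured to tolerate truncated files, a file that ends in the middle of its image data should fail this check.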

For other formats such as xcf, you can use Wand, the ImageMagick wrapper (see the Wand documentation for usage and installation). The code is as follows:

import wand.image

im = wand.image.Image(filename=filename)
im.flip()  # apply a cheap transformation, as in the Pillow example
im.close()

However, from my experiments, Wand does not detect truncated images; I think it loads the missing parts as a greyed area without warning.

I read that ImageMagick has an external command, identify, that could do the job, but I have not found a way to invoke it programmatically and I have not tested this route.
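
One way an external command like this could be invoked from Python is through the standard subprocess module. The sketch below is untested against ImageMagick itself. It assumes that an identify executable is on the PATH and that it exits with a non-zero status for files it cannot read. The function name identify_ok and the command parameter are illustrative; some installations may need a different command, for example ("magick", "identify"). Whether this route catches truncated files is not established here.

```python
import subprocess


def identify_ok(filename, command=("identify",)):
    """Return True if the external command exits with status 0 for this file."""
    try:
        result = subprocess.run(
            [*command, filename],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
    except FileNotFoundError:
        # The command is not installed or not on the PATH.
        return False
    return result.returncode == 0
```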

I suggest always performing a preliminary check that the file size is not zero (or very small), which is very cheap:

import os

statfile = os.stat(filename)
filesize = statfile.st_size
if filesize == 0:
  # manage the 'faulty image' case here
  pass

Answered By: Fabiano Tarlao

Answer 7

In addition to the PIL image check, you can also add a file name extension check like this:

filename.lower().endswith(('.png', '.jpg', '.jpeg', '.tiff', '.bmp', '.gif'))

Note that this only checks whether the file name has a valid image extension; it does not actually open the image to see if it is a valid image. That is why you also need to use PIL or one of the libraries suggested in the other answers.
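
The same check can be written as a small reusable helper. This is only a sketch; the names IMAGE_EXTENSIONS and has_image_extension are illustrative assumptions, and the extension set mirrors the tuple above.

```python
from pathlib import Path

# Same extensions as in the one-liner above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tiff", ".bmp", ".gif"}


def has_image_extension(filename):
    """Return True if the final suffix of the name is a known image extension."""
    return Path(filename).suffix.lower() in IMAGE_EXTENSIONS
```

Only the last suffix is considered, so a name like photo.jpg.txt is rejected.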

Answered By: tsveti_iko

Answer 8

import os

extensions = (".jpg", ".png", ".jpeg")
# path is assumed to be the directory to scan
for dirpath, dirs, files in os.walk(path):
    for file in files:
        print(dirpath)
        if file.lower().endswith(extensions):
            print("Valid", file)
        else:
            print("Invalid", file)

Answered By: rObinradOO

Answer 9

One option is to use the filetype package.

Installation

python -m pip install filetype

Advantages

Fast: Does its work by loading only the first few bytes of your file (a check on the magic number)
Supports different MIME types: Images, Videos, Fonts, Audio, Archives.

Example

filetype >= 1.0.7

import filetype

filename = "/path/to/file.jpg"

if filetype.is_image(filename):
    print(f"{filename} is a valid image...")
elif filetype.is_video(filename):
    print(f"{filename} is a valid video...")

filetype <= 1.0.6

import filetype

filename = "/path/to/file.jpg"

if filetype.image(filename):
    print(f"{filename} is a valid image...")
elif filetype.video(filename):
    print(f"{filename} is a valid video...")

Additional information on the official repo: https://github.com/h2non/filetype.py

Answered By: Alex Fortin

Answer 10

Adapting from Fabiano and Tiago’s answer.

from PIL import Image

def check_img(filename):
    try:
        im = Image.open(filename)
        im.verify()
        im.close()
        im = Image.open(filename) 
        im.transpose(Image.FLIP_LEFT_RIGHT)
        im.close()
        return True
    except Exception:
        print(filename,'corrupted')
        return False

if not check_img('/dir/image'):
    print('do something')

Answered By: durranaik

Answer 11

The file extension can be used to check for image files as follows.

import os

# folderPath is assumed to be the directory to scan
for f in os.listdir(folderPath):
    if f.lower().endswith((".jpg", ".bmp")):
        filePath = os.path.join(folderPath, f)

Answered By: Nuwan madhusanka
