Call exiftool from a python script?

Question:

I’m looking to use exiftool to scan the EXIF tags from my photos and videos. It’s a Perl executable. What’s the best way to interface with it? Are there any Python libraries to do this already? Or should I directly call the executable and parse the output? (The latter seems dirty.) Thanks.

The reason I ask is that I am currently using pyexiv2, which does not support videos. Perl’s exiftool has very broad support for images and videos, and I’d like to use it.

Asked By: ensnare


Answers:

To avoid launching a new process for each image, you should start exiftool using the -stay_open flag. You can then send commands to the process via stdin, and read the output on stdout. ExifTool supports JSON output, which is probably the best option for reading the metadata.

Here’s a simple class that launches an exiftool process and features an execute() method to send commands to that process. I also included get_metadata() to read the metadata in JSON format:

import subprocess
import os
import json

class ExifTool(object):

    sentinel = "{ready}\n"

    def __init__(self, executable="/usr/bin/exiftool"):
        self.executable = executable

    def __enter__(self):
        # Start one long-lived exiftool process that reads its arguments from stdin.
        self.process = subprocess.Popen(
            [self.executable, "-stay_open", "True",  "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Tell exiftool to leave -stay_open mode so the process exits.
        self.process.stdin.write("-stay_open\nFalse\n")
        self.process.stdin.flush()

    def execute(self, *args):
        # One argument per line; "-execute" tells exiftool to run the command.
        args = args + ("-execute\n",)
        self.process.stdin.write(str.join("\n", args))
        self.process.stdin.flush()
        output = ""
        fd = self.process.stdout.fileno()
        # exiftool prints "{ready}" once the command has finished.
        while not output.endswith(self.sentinel):
            output += os.read(fd, 4096)
        return output[:-len(self.sentinel)]

    def get_metadata(self, *filenames):
        return json.loads(self.execute("-G", "-j", "-n", *filenames))

This class is written as a context manager to ensure the process is exited when you are done. You can use it like this:

with ExifTool() as e:
    metadata = e.get_metadata(*filenames)
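
As a rough illustration of what comes back: with -j, exiftool emits a JSON list with one object per file, and with -G the keys are prefixed by their group (for example "EXIF:Model"), next to an ungrouped "SourceFile" entry. The sketch below indexes such a list by file name. The helper name index_by_source and the sample values are made up for illustration and are not output from a real run.

```python
def index_by_source(metadata):
    """Map each file's path ("SourceFile") to its dictionary of tags."""
    return {entry["SourceFile"]: entry for entry in metadata}


# Hypothetical sample shaped like the result of get_metadata();
# real output contains many more tags per file.
sample = [
    {"SourceFile": "a.jpg", "EXIF:Model": "Camera A", "EXIF:ISO": 100},
    {"SourceFile": "b.mp4", "QuickTime:Duration": 12.5},
]

by_file = index_by_source(sample)
print(by_file["a.jpg"].get("EXIF:Model"))         # Camera A
print(by_file["b.mp4"].get("EXIF:Model", "n/a"))  # n/a
```

Using .get() with a default is convenient here because videos and photos rarely carry the same set of tags.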

EDIT for Python 3:
To get this to work in Python 3, two small changes are needed. The first is an additional argument to subprocess.Popen, so that stdin accepts str instead of bytes:

self.process = subprocess.Popen(
         [self.executable, "-stay_open", "True",  "-@", "-"],
         universal_newlines=True,
         stdin=subprocess.PIPE, stdout=subprocess.PIPE)

The second is that you have to decode the bytes returned by os.read():

output += os.read(fd, 4096).decode('utf-8')

EDIT for Windows: To get this working on Windows, the sentinel needs to be changed to "{ready}\r\n", i.e.

sentinel = "{ready}\r\n"

Otherwise the program will hang, because the while loop inside execute() never terminates.
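
For reference, the Python 3 and Windows edits can be folded into one version. This is only a sketch, not the original code: it keeps the pipes in binary mode, accumulates bytes, and accepts either line ending after {ready}. The helper name split_ready is made up for this example, and the default executable assumes exiftool is on PATH.

```python
import json
import os
import subprocess


def split_ready(raw):
    """Return the bytes before a trailing {ready} marker, or None if absent."""
    for sentinel in (b"{ready}\r\n", b"{ready}\n"):
        if raw.endswith(sentinel):
            return raw[:-len(sentinel)]
    return None


class ExifTool:
    def __init__(self, executable="exiftool"):
        # Assumption: exiftool is on PATH; pass a full path otherwise.
        self.executable = executable
        self.process = None

    def __enter__(self):
        self.process = subprocess.Popen(
            [self.executable, "-stay_open", "True", "-@", "-"],
            stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Leave -stay_open mode and wait for the process to finish.
        self.process.stdin.write(b"-stay_open\nFalse\n")
        self.process.stdin.flush()
        self.process.wait()

    def execute(self, *args):
        command = "\n".join(args + ("-execute\n",))
        self.process.stdin.write(command.encode("utf-8"))
        self.process.stdin.flush()
        fd = self.process.stdout.fileno()
        raw = b""
        while True:
            chunk = os.read(fd, 4096)
            if not chunk:
                # An empty read means exiftool closed its end of the pipe.
                raise RuntimeError("exiftool closed its output unexpectedly")
            raw += chunk
            payload = split_ready(raw)
            if payload is not None:
                # Decode once at the end so a multi-byte character split
                # across two reads cannot break decoding.
                return payload.decode("utf-8")

    def get_metadata(self, *filenames):
        return json.loads(self.execute("-G", "-j", "-n", *filenames))
```

Usage is the same as before: with ExifTool() as e: metadata = e.get_metadata(*filenames).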

Answered By: Sven Marnach

Using the class from the answer above as a reference, I got the following error with Python 3.8.10 and IPython:

AttributeError                            Traceback (most recent call last)
     56
     57 e = ExifTool()
---> 58 e.load_metadata_lookup('/u02/RECOVERY/')

in load_metadata_lookup(self, locDir)
     51                         '\n FILELOC > ', FileLoc, '\n')
     52
---> 53                 self.get_metadata(FileLoc)
     54
     55

in get_metadata(self, FileLoc)
     38
     39     def get_metadata(self, FileLoc):
---> 40         return json.loads(self.execute("-G", "-j", "-n", FileLoc))
     41
     42

in execute(self, *args)
     28     def execute(self, *args):
     29         args = args + ("-execute\n",)
---> 30         self.process.stdin.write(str.join("\n", args))
     31         self.process.stdin.flush()
     32         output = ""

AttributeError: 'ExifTool' object has no attribute 'process'

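The error occurs because the class was used without a with statement, so __enter__() never ran and self.process was never created. With some modifications it works. The code below is adapted from Sven Marnach’s answer
[https://stackoverflow.com/users/279627/sven-marnach]: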

#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
from __future__ import absolute_import

import sys
import os
import subprocess
import json

headers_infos = """
.:.
.:. box33 | systems | platform |
.:. [   Renan Moura     ]
.:. [   ver.: 9.1.2-b   ]
.:.
"""

class ExifTool(object):
    sentinel = "{ready}\n"

    def __init__(self):
        self.executable         = "/usr/bin/exiftool"
        self.metadata_lookup    = {}
        self.process            = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Only needed if a process was actually started by execute().
        if self.process is not None:
            self.process.stdin.write("-stay_open\nFalse\n")
            self.process.stdin.flush()

    def execute(self, *args):
        # Start exiftool on first use and reuse it afterwards, instead of
        # spawning a new process for every call.
        if self.process is None:
            self.process = subprocess.Popen([self.executable, "-stay_open", "True",  "-@", "-"],
                universal_newlines  = True                          ,
                stdin               = subprocess.PIPE               ,
                stdout              = subprocess.PIPE               ,
                stderr              = subprocess.STDOUT  # warnings share the stdout pipe
            )

        args = (args + ("-execute\n",))

        self.process.stdin.write(str.join("\n", args))
        self.process.stdin.flush()

        output  = ""
        fd      = self.process.stdout.fileno()

        while not output.endswith(self.sentinel):
            output += os.read(fd, 4096).decode('utf-8')

        return output[:-len(self.sentinel)]

    def get_metadata(self, *FileLoc):
        return json.loads(self.execute("-G", "-j", "-n", *FileLoc))

    def load_metadata_lookup(self, locDir):
        self.metadata_lookup = {}
        for dirname, dirnames, filenames in os.walk(locDir):
            for filename in filenames:
                FileLoc = os.path.join(dirname, filename)
                print('\n FILENAME    > ', filename,
                      '\n DIRNAMES    > ', dirnames,
                      '\n DIRNAME     > ', dirname,
                      '\n FILELOC     > ', FileLoc, '\n')

                self.metadata_lookup = self.get_metadata(FileLoc)
                print(json.dumps(self.metadata_lookup, indent=3))

e = ExifTool()
e.load_metadata_lookup('/u02/RECOVERY/')

NOTE: this code walks the "/u02/RECOVERY/" directory and runs exiftool on every file it finds.
Hope this helps.

This still turns up in searches, but using -stay_open etc. is barely faster than just starting a subprocess individually for each file. exiftool will handle any number of files in a single call, so to process a bunch of files and extract specific metadata fields, this runs about 8 times faster on my laptop:

import json
import subprocess

def meta_for_batch(fieldlist, filelist):
    # Run exiftool once for the whole batch; -j requests JSON output.
    batchup = ['exiftool', '-j'] + fieldlist + filelist
    res = subprocess.run(batchup, capture_output=True)
    if res.returncode != 0:
        # stdout and stderr are bytes because text mode was not requested.
        errdata = ('STDERR:\n' + res.stderr.decode() +
                   '\nSTDOUT:\n' + res.stdout.decode())
        raise ValueError(errdata)
    return json.loads(res.stdout)

Windows may have a problem with long lists of files, so you may need to pass them using an argfile. On Linux I have run this with 600 files, and it took 11.5 seconds.
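
One way to do the argfile variant, as a sketch: exiftool's -@ option reads arguments from a file, one per line, so the file names can be written to a temporary file instead of being put on the command line. The helper names write_argfile and meta_for_batch_argfile are made up for this example. It assumes exiftool is on PATH and that no file name starts with "-" or "#", which an argfile would interpret as an option or a comment.

```python
import json
import os
import subprocess
import tempfile


def write_argfile(filelist):
    """Write one file name per line to a temporary argfile; return its path."""
    with tempfile.NamedTemporaryFile(
            "w", suffix=".args", delete=False, encoding="utf-8") as fh:
        for name in filelist:
            fh.write(name + "\n")
    return fh.name


def meta_for_batch_argfile(fieldlist, filelist, executable="exiftool"):
    """Like meta_for_batch(), but passes the file names through -@ ARGFILE."""
    argfile = write_argfile(filelist)
    try:
        res = subprocess.run(
            [executable, "-j"] + fieldlist + ["-@", argfile],
            capture_output=True)
    finally:
        # Remove the temporary argfile whether or not exiftool succeeded.
        os.remove(argfile)
    if res.returncode != 0:
        raise ValueError(res.stderr.decode(errors="replace"))
    return json.loads(res.stdout)
```

The command line then stays short no matter how many files are in the batch, since only the argfile path is passed.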

For example:

reclist = meta_for_batch(
    ['-FocalLength', '-ImageHeight', '-ISO',
     '-Model', '-CameraTemperature', '-SerialNumber',
     '-ExposureTime', '-LensSerialNumber',
     '-ImageWidth', '-LensModel', '-SubSecCreateDate'],
    ['_23A1000.CR3', '_23A1002.CR3', '_23A1003.CR3'])

Run exiftool with -args on a file to get the field names.
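
A small sketch of collecting those field names from Python, assuming the -args output consists of lines of the form -TagName=value. The function names tag_names_from_args and list_tag_names are made up for this example, and the sample text is hypothetical rather than taken from a real file.

```python
import subprocess


def tag_names_from_args(text):
    """Extract tag names from lines of the form "-TagName=value"."""
    names = []
    for line in text.splitlines():
        if line.startswith("-") and "=" in line:
            names.append(line[1:].split("=", 1)[0])
    return names


def list_tag_names(path, executable="exiftool"):
    """Run exiftool -args on one file and return the tag names it reports."""
    res = subprocess.run([executable, "-args", path],
                         capture_output=True, text=True, check=True)
    return tag_names_from_args(res.stdout)


# Hypothetical sample in the -args format, not taken from a real file.
sample = "-Model=Camera A\n-ISO=100\n-ExposureTime=1/200\n"
print(tag_names_from_args(sample))  # ['Model', 'ISO', 'ExposureTime']
```

Prefixing each returned name with "-" gives entries that can go straight into the fieldlist argument of meta_for_batch().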

Answered By: pootle