Python tarfile module overwrites existing files during extraction – how to disable it?

Question:

Is there a way to prevent tarfile.extractall (API) from overwriting existing files? By “prevent” I mean ideally raising an exception when an overwrite is about to happen. The current behavior is to silently overwrite the files.

Asked By: Sridhar Ratnakumar

Answers:

You could check the result of TarFile.getnames() against the existing files and raise your own error.
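A minimal sketch of that check, assuming the goal is to refuse the whole extraction if any archive member already exists under the destination. The helper name extract_no_overwrite and the choice of FileExistsError are illustrative and not part of the tarfile API. It uses getmembers() rather than getnames() so that directories can be skipped, since re-creating an existing directory does not overwrite anything.

```python
import os
import tarfile


def extract_no_overwrite(archive, dest="."):
    """Extract `archive` into `dest`, refusing to overwrite existing files.

    Hypothetical helper: raises FileExistsError before anything is written
    if any non-directory member already exists under `dest`.
    """
    with tarfile.open(archive) as tar:
        # Collect every non-directory member that is already on disk.
        clashes = [
            member.name
            for member in tar.getmembers()
            if not member.isdir()
            and os.path.lexists(os.path.join(dest, member.name))
        ]
        if clashes:
            raise FileExistsError(
                "refusing to overwrite: " + ", ".join(clashes)
            )
        tar.extractall(dest)
```

Note that the check and the extraction are two separate steps, so a file created in between would still be overwritten.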

Answered By: SilentGhost

Have you tried setting the errorlevel attribute of the TarFile object to 2? That causes all non-fatal errors to be raised as exceptions as well. I’m assuming an overwrite falls into that category.

Answered By: Karl Bielefeldt

I have a similar situation: I only want to extract if the archive has not already been fully extracted. I use the following function to check whether an archive has already been extracted to extract_dir:

from pathlib import Path
import tarfile

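def is_extracted(archive, extract_dir):
    # Open in a context manager so the archive is closed after reading.
    with tarfile.open(archive) as tar:
        filenames = tar.getnames()
    # True only if every member of the archive already exists on disk.
    return all((Path(extract_dir) / f).exists() for f in filenames)

Answered By: James Hirschorn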
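One way the is_extracted check might be used, as a rough sketch: extract_once is a hypothetical wrapper name, and is_extracted is repeated here only so the block runs on its own.

```python
from pathlib import Path
import tarfile


def is_extracted(archive, extract_dir):
    # True only if every member of the archive already exists on disk.
    with tarfile.open(archive) as tar:
        filenames = tar.getnames()
    return all((Path(extract_dir) / f).exists() for f in filenames)


def extract_once(archive, extract_dir):
    """Hypothetical wrapper: extract only if something is missing.

    Returns True if extraction ran, False if it was skipped.
    """
    if is_extracted(archive, extract_dir):
        return False
    with tarfile.open(archive) as tar:
        tar.extractall(extract_dir)
    return True
```

If the archive was only partially extracted before, this still re-extracts everything and overwrites the members that were already present.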