Elegant way in Python to make sure a string is suitable as a filename?

Question:

I want to use a user-provided string as a filename for exporting, but have to make sure that the string is permissible on my system as a filename. From my side it would be OK to replace any forbidden character with e.g. ‘_’.

Here I found a list of forbidden characters for filenames.

It should be easy enough to use str.replace(), but I was wondering whether something already exists that does this, ideally taking into account which OS I am on.

Asked By: Matthias Arras

Answers:

pathvalidate is a Python library to sanitize/validate a string such as filenames/file-paths/etc.

This library provides both utilities for validation of paths:

import sys
from pathvalidate import ValidationError, validate_filename

try:
    validate_filename("fi:l*e/p\"a?t>h|.t<xt")
except ValidationError as e:
    print("{}\n".format(e), file=sys.stderr)

And utilities for sanitizing paths:

from pathvalidate import sanitize_filename

fname = "fi:l*e/p\"a?t>h|.t<xt"
print("{} -> {}".format(fname, sanitize_filename(fname)))

Answered By: kmaork

A better solution may be for you to store the files locally using generated filenames that are guaranteed to be unique and file system safe (any UUID generator would do, for example). Maintain a simple database that maps between the original filename and the UUID for later use.
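
As a minimal sketch of this idea, assuming an in-memory dict stands in for the "simple database" (the function name store_export and the index argument are hypothetical, chosen only for illustration):

```python
import uuid
from pathlib import Path


def store_export(original_name: str, data: bytes, directory: Path, index: dict) -> Path:
    """Write data under a generated UUID name and remember the original name."""
    # uuid4().hex is 32 lowercase hex characters, so it contains nothing
    # that common file systems forbid in a file name.
    safe_name = uuid.uuid4().hex
    target = directory / safe_name
    target.write_bytes(data)
    # Assumption: a dict plays the role of the mapping database here.
    index[safe_name] = original_name
    return target
```

The user-provided string never touches the file system in this approach; it is only stored as a value in the mapping, so no sanitizing is needed. A real application would likely persist the mapping (for example in SQLite) rather than keep it in memory.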

Answered By: jarmod

Depending on your use case, it might be easier to whitelist the characters that are allowed in a filename instead of attempting to construct a blacklist.

A canonical way would be to check that each character of the prospective filename is contained in the POSIX portable filename character set.

https://www.ibm.com/docs/en/zos/2.1.0?topic=locales-posix-portable-file-name-character-set

Uppercase A to Z
Lowercase a to z
Numbers 0 to 9
Period (.)
Underscore (_)
Hyphen (-)

Based on this, you can check a filename like so:

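ok = ".-_0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
filename = "report_2024.txt"  # example input
for character in filename:
    # Raise explicitly: assert statements are skipped when Python runs with -O
    if character not in ok:
        raise ValueError("forbidden character: {!r}".format(character))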
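
Since the question asks for replacing forbidden characters rather than rejecting the string, the same whitelist can also drive a substitution. The following is only a sketch under that assumption; the function name to_portable_filename and its replacement parameter are made up for this example:

```python
import re

# Matches any character outside the POSIX portable filename character set.
_NOT_PORTABLE = re.compile(r"[^A-Za-z0-9._-]")


def to_portable_filename(name: str, replacement: str = "_") -> str:
    """Replace every character outside the portable set with `replacement`."""
    cleaned = _NOT_PORTABLE.sub(replacement, name)
    # "", "." and ".." pass the character check but are not usable file names.
    if cleaned in ("", ".", ".."):
        raise ValueError("cannot derive a file name from {!r}".format(name))
    return cleaned


print(to_portable_filename('fi:l*e/p"a?t>h|.t<xt'))  # fi_l_e_p_a_t_h_.t_xt
```

Note that this only looks at individual characters. It does not handle length limits or names that are reserved on Windows (such as CON or NUL), so a library that knows about the target platform may still be the safer choice.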

Answered By: Markus Hirsimäki