In Python, how can I get the correctly-cased path for a file?
Question:
Windows uses case-insensitive file names, so I can open the same file with any of these:
r"c:\windows\system32\desktop.ini"
r"C:\WINdows\System32\DESKTOP.ini"
r"C:\WiNdOwS\SyStEm32\DeSkToP.iNi"
etc. Given any of these paths, how can I find the true case? I want them all to produce:
r"C:\Windows\System32\desktop.ini"
os.path.normcase doesn’t do it; it simply lowercases everything. os.path.abspath returns an absolute path, but each of these is already absolute, so it doesn’t change any of them. os.path.realpath is only used to resolve symbolic links, which Windows doesn’t have, so it’s the same as abspath on Windows.
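To make the behaviour described above concrete, here is a small sketch. It assumes the ntpath module (the Windows flavour of os.path, importable on any platform) so that it behaves the same everywhere, and the sample path is just the one from the question:

```python
import ntpath  # Windows flavour of os.path; importable on any platform

path = r"C:\WINdows\System32\DESKTOP.ini"

# normcase only lowercases the string; it never looks at the disk.
lowered = ntpath.normcase(path)

# abspath leaves an already-absolute, already-normalized path unchanged.
absolute = ntpath.abspath(path)

print(lowered)   # c:\windows\system32\desktop.ini
print(absolute)  # C:\WINdows\System32\DESKTOP.ini
```

Neither call recovers the casing stored on disk, which is the gap the question is about.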
Is there a straightforward way to do this?
Answers:
Since the definition of “true case” on NTFS (or VFAT) filesystems is truly bizarre, it seems the best way would be to walk the path and match against os.listdir().
Yes, this seems like a contrived solution but so are NTFS paths. I don’t have a DOS machine to test this on.
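A minimal sketch of that walk-and-match idea, assuming the path exists and every directory on it is readable. The name match_case is hypothetical, components that cannot be matched are kept as given, and the drive letter itself is not corrected:

```python
import os

def match_case(path):
    """Return `path` with each component spelled the way os.listdir() reports it.

    Hypothetical helper for illustration; unmatched components are kept as given.
    """
    drive, rest = os.path.splitdrive(os.path.abspath(path))
    current = drive + os.sep
    for part in rest.split(os.sep):
        if not part:
            continue  # skip the empty strings produced by leading separators
        try:
            names = os.listdir(current)
        except OSError:
            names = []  # unreadable or missing directory: keep the part as given
        # Prefer an exact match, then fall back to a case-insensitive one.
        if part not in names:
            lowered = part.lower()
            part = next((n for n in names if n.lower() == lowered), part)
        current = os.path.join(current, part)
    return current
```

This lists one directory per path component, so it is usually much cheaper than walking a whole drive.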
I would use os.walk, but I think that for disks with many directories it may be time-consuming:
import os
fname = r"g:\miCHal\ZzZ.tXt"
if not os.path.exists(fname):
    print('No such file')
else:
    d, f = os.path.split(fname)
    dl = d.lower()
    fl = f.lower()
    # Walk the whole drive until the directory matches case-insensitively.
    for root, dirs, files in os.walk('g:\\'):
        if root.lower() == dl:
            fn = [n for n in files if n.lower() == fl][0]
            print(os.path.join(root, fn))
            break
This python-win32 thread has an answer that doesn’t require third-party packages or walking the tree:
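import ctypes
def getLongPathName(path):
    # 260 is MAX_PATH, the classic Windows path length limit.
    buf = ctypes.create_unicode_buffer(260)
    GetLongPathName = ctypes.windll.kernel32.GetLongPathNameW
    rv = GetLongPathName(path, buf, 260)
    # 0 means the call failed; a value above 260 means the buffer was too small.
    if rv == 0 or rv > 260:
        return path
    else:
        return buf.value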
Ned’s GetLongPathName answer doesn’t quite work (at least not for me). You need to call GetLongPathName on the return value of GetShortPathName. Using pywin32 for brevity (a ctypes solution would look similar to Ned’s):
>>> import win32api
>>> win32api.GetLongPathName(win32api.GetShortPathName('stopservices.vbs'))
'StopServices.vbs'
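For reference, a ctypes version of that short-then-long round trip might look like the sketch below. The function name and the size parameter are hypothetical, the path is returned unchanged if either call fails or the buffer is too small, and on non-Windows platforms the function simply returns its input:

```python
import ctypes
import os

def long_name_via_short_name(path, size=260):
    """Round-trip `path` through its short (8.3) form; hypothetical helper."""
    if os.name != "nt":
        return path  # kernel32 is only available on Windows
    kernel32 = ctypes.windll.kernel32
    short_buf = ctypes.create_unicode_buffer(size)
    rv = kernel32.GetShortPathNameW(path, short_buf, size)
    if rv == 0 or rv > size:
        return path  # call failed or buffer too small
    long_buf = ctypes.create_unicode_buffer(size)
    rv = kernel32.GetLongPathNameW(short_buf.value, long_buf, size)
    if rv == 0 or rv > size:
        return path
    return long_buf.value
```

Both API calls fail for a path that does not exist, so a missing file comes back exactly as it was passed in.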
Here’s a simple, stdlib-only solution:
import glob
def get_actual_filename(name):
    name = "%s[%s]" % (name[:-1], name[-1])
    return glob.glob(name)[0]
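The trick above wraps the last character in brackets so that glob treats the name as a pattern and reads the real directory entry instead of echoing the input. A slightly more defensive sketch of the same idea follows; the None return for a missing file and the use of glob.escape are additions of mine, not part of the original answer:

```python
import glob

def get_actual_filename(name):
    """Return the on-disk spelling of the last path component, or None.

    Sketch only: like the original, it corrects just the final component.
    """
    # Escape any literal '*', '?' or '[' earlier in the path, then turn the
    # last character into a one-character class, e.g. 'desktop.in[i]'.
    pattern = glob.escape(name[:-1]) + "[" + name[-1] + "]"
    matches = glob.glob(pattern)
    return matches[0] if matches else None
```

On a case-sensitive filesystem the pattern matches only the exact spelling, so the function returns either the input or None there.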
Ethan’s answer corrects only the file name, not the names of the subfolders on the path.
Here is my attempt:
import glob
def get_actual_filename(name):
    dirs = name.split('\\')
    # disk letter
    test_name = [dirs[0].upper()]
    for d in dirs[1:]:
        test_name += ["%s[%s]" % (d[:-1], d[-1])]
    res = glob.glob('\\'.join(test_name))
    if not res:
        # File not found
        return None
    return res[0]
I prefer the approach of Ethan and xvorsx. As far as I know, the following also does no harm on other platforms:
import os.path
from glob import glob
def get_actual_filename(name):
    sep = os.path.sep
    parts = os.path.normpath(name).split(sep)
    dirs = parts[0:-1]
    filename = parts[-1]
    if dirs[0] == os.path.splitdrive(name)[0]:
        test_name = [dirs[0].upper()]
    else:
        test_name = [sep + dirs[0]]
    for d in dirs[1:]:
        test_name += ["%s[%s]" % (d[:-1], d[-1])]
    path = glob(sep.join(test_name))[0]
    res = glob(sep.join((path, filename)))
    if not res:
        # File not found
        return None
    return res[0]
Based on a couple of the listdir/walk examples above, but with support for UNC paths:
import os
def get_actual_filename(path):
    orig_path = path
    path = os.path.normpath(path)
    # Build root to start searching from. Different for UNC paths.
    if path.startswith(r'\\'):
        path = path.lstrip(r'\\')
        path_split = path.split('\\')
        # listdir doesn't work on just the machine name
        if len(path_split) < 3:
            return orig_path
        test_path = r'\\{}\{}'.format(path_split[0], path_split[1])
        start = 2
    else:
        path_split = path.split('\\')
        test_path = path_split[0] + '\\'
        start = 1
    for i in range(start, len(path_split)):
        part = path_split[i]
        if os.path.isdir(test_path):
            for name in os.listdir(test_path):
                if name.lower() == part.lower():
                    part = name
                    break
            test_path = os.path.join(test_path, part)
        else:
            return orig_path
    return test_path
I was just struggling with the same problem. I’m not sure, but I think the previous answers do not cover all cases. My actual problem was that the drive letter casing was different from the one seen by the system. Here is my solution, which also checks for the correct drive letter casing (using win32api):
import os
import win32api
def get_case_sensitive_path(path):
    """
    Get the case-sensitive path from a case-insensitive path.

    Returns:
        The real absolute path.

    Exceptions:
        ValueError if the path doesn't exist.

    Important note on Windows: when starting the command line using
    letter cases different from the actual casing of the files / directories,
    the interpreter will use the invalid cases in the path (e.g. os.getcwd()
    returns a path whose casing differs from the actual one).
    When using tools that are case-sensitive, this will cause a problem.
    The code below is used to get a path with exactly the same casing as the
    actual one.

    See http://stackoverflow.com/questions/2113822/python-getting-filename-case-as-stored-in-windows
    """
    drive, path = os.path.splitdrive(os.path.abspath(path))
    path = path.lstrip(os.sep)
    path = path.rstrip(os.sep)
    folders = []
    # Make sure the drive letter is also in the correct casing.
    drives = win32api.GetLogicalDriveStrings()
    drives = drives.split("