In Python, how can I get the correctly-cased path for a file?

Question

Windows uses case-insensitive file names, so I can open the same file with any of these:

r"c:\windows\system32\desktop.ini"
r"C:\WINdows\System32\DESKTOP.ini"
r"C:\WiNdOwS\SyStEm32\DeSkToP.iNi"

etc. Given any of these paths, how can I find the true case? I want them all to produce:

r"C:\Windows\System32\desktop.ini"

os.path.normcase doesn’t do it; it simply lowercases everything. os.path.abspath returns an absolute path, but each of these is already absolute, so it doesn’t change any of them. os.path.realpath is only used to resolve symbolic links, which Windows doesn’t have, so it’s the same as abspath on Windows.
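
The normcase behaviour described above can be checked on any platform with the ntpath module, which is the Windows flavour of os.path. This is only an illustrative sketch: normpath stands in for abspath here, on the assumption that the two agree for a path that is already absolute, so the example does not depend on the current directory.

```python
import ntpath  # Windows implementation of os.path, importable everywhere

path = r"C:\WINdows\System32\DESKTOP.ini"

# normcase lowercases the whole path (and turns "/" into "\").
lowered = ntpath.normcase(path)

# normpath only tidies separators and "." / ".." parts; casing is untouched.
tidied = ntpath.normpath(path)

print(lowered)
print(tidied)
```

Neither call consults the file system, which is why neither can recover the casing stored on disk.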

Is there a straightforward way to do this?

Asked By: Ned Batchelder

Answer 1

Since the definition of “true case” on NTFS (or VFAT) filesystems is truly bizarre, it seems the best way would be to walk the path and match against os.listdir().

Yes, this seems like a contrived solution but so are NTFS paths. I don’t have a DOS machine to test this on.
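
A minimal sketch of that idea, using only os.listdir(). The function name is made up for illustration, and the sketch assumes drive-letter or POSIX-style absolute paths rather than UNC shares:

```python
import os

def true_case_path(path):
    """Rebuild `path` component by component using names from os.listdir()."""
    drive, rest = os.path.splitdrive(os.path.abspath(path))
    result = drive.upper() + os.sep
    for part in rest.split(os.sep):
        if not part:
            continue  # skip empty pieces left by the leading separator
        wanted = part.lower()
        # Take the first directory entry that matches when case is ignored.
        match = next((n for n in os.listdir(result) if n.lower() == wanted), None)
        if match is None:
            return None  # no entry with that name, whatever the case
        result = os.path.join(result, match)
    return result
```

Each component costs one directory listing, so the work grows with the depth of the path rather than with the size of the disk.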

Answered By: msw

Answer 2

I would use os.walk, but I think that for disks with many directories it may be time-consuming:

import os

fname = r"g:\miCHal\ZzZ.tXt"
if not os.path.exists(fname):
    print('No such file')
else:
    d, f = os.path.split(fname)
    dl = d.lower()
    fl = f.lower()
    # Walk the whole drive until a directory matches case-insensitively.
    for root, dirs, files in os.walk('g:\\'):
        if root.lower() == dl:
            fn = [n for n in files if n.lower() == fl][0]
            print(os.path.join(root, fn))
            break

Answered By: Michał Niklas

Answer 3

This python-win32 thread has an answer that doesn’t require third-party packages or walking the tree:

import ctypes

def getLongPathName(path):
    buf = ctypes.create_unicode_buffer(260)
    GetLongPathName = ctypes.windll.kernel32.GetLongPathNameW
    rv = GetLongPathName(path, buf, 260)
    if rv == 0 or rv > 260:
        return path
    else:
        return buf.value

Answered By: Ned Batchelder

Answer 4

Ned’s GetLongPathName answer doesn’t quite work (at least not for me). You need to call GetLongPathName on the return value of GetShortPathName. Using pywin32 for brevity (a ctypes solution would look similar to Ned’s):

>>> import win32api
>>> win32api.GetLongPathName(win32api.GetShortPathName('stopservices.vbs'))
'StopServices.vbs'
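
A ctypes version of the same round trip might look like the sketch below. It is untested guesswork modelled on Ned’s code: the function name is hypothetical, the 260-character buffer is carried over from his answer, and on anything other than Windows it simply returns its input so the snippet stays importable.

```python
import ctypes
import os

def long_path_via_short(path, size=260):
    """Return GetLongPathName(GetShortPathName(path)) on Windows, else `path`."""
    if os.name != "nt":
        return path  # the kernel32 calls below exist only on Windows
    kernel32 = ctypes.windll.kernel32

    # Step 1: long (possibly mis-cased) path -> short 8.3 path.
    short_buf = ctypes.create_unicode_buffer(size)
    rv = kernel32.GetShortPathNameW(path, short_buf, size)
    if rv == 0 or rv > size:
        return path  # call failed or buffer too small

    # Step 2: short path -> long path as stored on disk.
    long_buf = ctypes.create_unicode_buffer(size)
    rv = kernel32.GetLongPathNameW(short_buf.value, long_buf, size)
    if rv == 0 or rv > size:
        return path
    return long_buf.value
```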

Answered By: Paul Moore

Answer 5

Here’s a simple, stdlib only, solution:

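import glob

def get_actual_filename(name):
    # Wrapping the last character in [] turns the name into a glob pattern,
    # so glob reads the directory listing and returns the on-disk casing.
    name = "%s[%s]" % (name[:-1], name[-1])
    return glob.glob(name)[0]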

Answered By: Ethan Furman

Answer 6

Ethan’s answer corrects only the file name, not the names of the subfolders on the path.
Here is my attempt:

import glob

def get_actual_filename(name):
    dirs = name.split('\\')
    # disk letter
    test_name = [dirs[0].upper()]
    for d in dirs[1:]:
        test_name += ["%s[%s]" % (d[:-1], d[-1])]
    res = glob.glob('\\'.join(test_name))
    if not res:
        #File not found
        return None
    return res[0]

Answered By: xvorsx

Answer 7

I prefer the approach of Ethan and xvorsx. AFAIK, the following also does no harm on other platforms:

import os.path
from glob import glob

def get_actual_filename(name):
    sep = os.path.sep
    parts = os.path.normpath(name).split(sep)
    dirs = parts[0:-1]
    filename = parts[-1]
    if dirs[0] == os.path.splitdrive(name)[0]:
        test_name = [dirs[0].upper()]
    else:
        test_name = [sep + dirs[0]]
    for d in dirs[1:]:
        test_name += ["%s[%s]" % (d[:-1], d[-1])]
    matches = glob(sep.join(test_name))
    if not matches:
        #Directory not found
        return None
    res = glob(sep.join((matches[0], filename)))
    if not res:
        #File not found
        return None
    return res[0]

Answered By: Dobedani

Answer 8

Based on a couple of the listdir/walk examples above, but with support for UNC paths:

import os

def get_actual_filename(path):
    orig_path = path
    path = os.path.normpath(path)

    # Build root to start searching from.  Different for unc paths.
    if path.startswith(r'\\'):
        path = path.lstrip(r'\\')
        path_split = path.split('\\')
        # listdir doesn't work on just the machine name
        if len(path_split) < 3:
            return orig_path
        test_path = r'\\{}\{}'.format(path_split[0], path_split[1])
        start = 2
    else:
        path_split = path.split('\\')
        test_path = path_split[0] + '\\'
        start = 1

    for i in range(start, len(path_split)):
        part = path_split[i]
        if os.path.isdir(test_path):
            for name in os.listdir(test_path):
                if name.lower() == part.lower():
                    part = name
                    break
            test_path = os.path.join(test_path, part)
        else:
            return orig_path
    return test_path

Answered By: Brendan Abel

Answer 9

I was just struggling with the same problem. I’m not sure, but I think the previous answers do not cover all cases. My actual problem was that the drive letter casing was different than the one seen by the system. Here is my solution that also checks for the correct drive letter casing (using win32api):

  import os
  import win32api

  def get_case_sensitive_path(path):
      """
      Get the case-sensitive path for a case-insensitive path.
      
      Returns:
         The real absolute path.
         
      Exceptions:
         ValueError if the path doesn't exist.
      
      Important note on Windows: when the command line is started using
      letter cases different from the actual casing of the files / directories,
      the interpreter will use the wrong cases in paths (e.g. os.getcwd()
      returns a path whose casing differs from the actual one).
      This causes problems with tools that are case-sensitive.
      The code below returns the path with exactly the same casing as the
      actual one.
      See http://stackoverflow.com/questions/2113822/python-getting-filename-case-as-stored-in-windows
      """
      drive, path = os.path.splitdrive(os.path.abspath(path))
      path = path.lstrip(os.sep)
      path = path.rstrip(os.sep)
      folders = []
      
      # Make sure the drive letter is also in the correct casing.
      drives = win32api.GetLogicalDriveStrings()
      # The drive strings are separated by NUL characters.
      drives = drives.split("\000")[:-1]
      # Get the list of the form C:, d:, E: etc.
      drives = [d.replace("\\", "") for d in drives]
      # Now get a lower case version for comparison.
      drives_l = [d.lower() for d in drives]
      # Find the index of matching item.
      idx = drives_l.index(drive.lower())
      # Get the drive letter with the correct casing.
      drive = drives[idx]

      # Divide path into components.
      while 1:
          path, folder = os.path.split(path)
          if folder != "":
              folders.append(folder)
          else:
              if path != "":
                  folders.append(path)
              break

      # Restore their original order.
      folders.reverse()

      if len(folders) > 0:
          retval = drive + os.sep

          for folder in folders:
              found = False
              for item in os.listdir(retval):
                  if item.lower() == folder.lower():
                      found = True
                      retval = os.path.join(retval, item)
                      break
              if not found:
                  raise ValueError("Path not found: '{0}'".format(retval))

      else:
          retval = drive + os.sep

      return retval

Answered By: lutecki

Answer 10

This one unifies, shortens and fixes several approaches:
Standard library only; converts all path parts (except the drive letter); handles relative or absolute paths, with or without a drive letter; tolerant:

import glob
import re

def casedpath(path):
    r = glob.glob(re.sub(r'([^:/\\])(?=[/\\]|$)', r'[\1]', path))
    return r and r[0] or path

And this one handles UNC paths in addition:

import os

def casedpath_unc(path):
    # os.path.splitunc is gone since Python 3.7; splitdrive also splits UNC shares.
    unc, p = os.path.splitdrive(path)
    r = glob.glob(unc + re.sub(r'([^:/\\])(?=[/\\]|$)', r'[\1]', p))
    return r and r[0] or path
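
To see what the substitution in these two functions hands to glob, it may help to print the pattern on its own. The helper name below is hypothetical; the regular expression is the one used above.

```python
import re

def glob_pattern_for(path):
    """Wrap the last character of each path component in [] (illustrative helper)."""
    return re.sub(r'([^:/\\])(?=[/\\]|$)', r'[\1]', path)

print(glob_pattern_for(r"c:\windows\system32\desktop.ini"))
# prints: c:\window[s]\system3[2]\desktop.in[i]
```

Because every component now contains a bracket expression, glob has to list each directory and match names against it, and on Windows that match ignores case while the names returned are the ones stored on disk.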

Answered By: kxr

Answer 11

In Python 3 you can use pathlib’s resolve():

>>> from pathlib import Path

>>> str(Path(r"C:\WiNdOwS\SyStEm32\DeSkToP.iNi").resolve())
'C:\\Windows\\System32\\desktop.ini'

Answered By: TheAl_T

Answer 12

I was looking for an even simpler version than the “glob trick”, so I made this, which only uses os.listdir().

import os

def casedPath(path):
    path = os.path.normpath(path).lower()
    parts = path.split(os.sep)
    result = parts[0].upper()
    # check that root actually exists
    if not os.path.exists(result):
        return
    for part in parts[1:]:
        # The trailing separator makes a bare drive such as "C:" mean its
        # root rather than the current directory on that drive.
        actual = next((item for item in os.listdir(result + os.sep) if item.lower() == part), None)
        if actual is None:
            # path doesn't exist
            return
        result += os.sep + actual
    return result

edit: it works fine by the way. Not sure that returning None when path doesn’t exist is expected, but I needed this behaviour. It could raise an error instead, I guess.

Answered By: Paul
