Say I have two absolute paths. I need to check whether the location referred to by one of the paths is a descendant of the other. If it is, I need to find the relative path of the descendant from the ancestor. What’s a good way to implement this in Python? Is there any library that I can benefit from?
>>> import os
>>> os.path.commonprefix(['/usr/var/log', '/usr/var/security'])
'/usr/var'
>>> os.path.commonprefix(['/tmp', '/usr/var'])  # No common prefix: the root is the common prefix
'/'
You can thus test whether the common prefix is one of the paths, i.e. if one of the paths is a common ancestor:
paths = […, …, …]
common_prefix = os.path.commonprefix(paths)
if common_prefix in paths:
    …
You can then find the relative paths:
relative_paths = [os.path.relpath(path, common_prefix) for path in paths]
You can even handle more than two paths with this method, and test whether all the paths are below one of them.
PS: depending on what your paths look like, you might want to perform some normalization first (this is useful in situations where one does not know whether they always end with ‘/’ or not, or if some of the paths are relative). Relevant functions include os.path.abspath() and os.path.normpath().
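A minimal sketch of that normalization step, assuming POSIX-style paths. The helper name normalize and the sample paths are made up for illustration:

```python
import os.path

def normalize(path):
    """Return an absolute path with redundant separators, '.' and '..' collapsed."""
    # abspath() resolves relative paths against the current directory
    # and applies normpath() to the result, which also drops a trailing slash.
    return os.path.abspath(path)

paths = ['/usr/var/log/', '/usr/var/../var/log', '/usr//var/log']
normalized = [normalize(p) for p in paths]
```

On a POSIX system all three inputs should normalize to the same string, so they can then be compared directly.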
PPS: as Peter Briggs mentioned in the comments, the simple approach described above can fail:
>>> os.path.commonprefix(['/usr/var', '/usr/var2/log'])
'/usr/var'
/usr/var is a common prefix of the two strings, but it is not a common ancestor directory of the two paths. Forcing all paths to end with ‘/’ before calling commonprefix() solves this (specific) problem.
PPPS: as bluenote10 mentioned, adding a slash does not solve the general problem. Here is his followup question: How to circumvent the fallacy of Python's os.path.commonprefix?
PPPPS: starting with Python 3.4, we have pathlib, a module that provides a saner path manipulation environment. I guess that the common prefix of a set of paths can be obtained by getting all the prefixes of each path (with the PurePath.parents property), taking the intersection of all these parent sets, and selecting the longest common prefix.
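The idea of intersecting the parent sets could be sketched as follows. This is only one possible reading of it: the function name common_ancestor is hypothetical, the inputs are assumed to be a non-empty list of absolute POSIX paths, and each path is counted as its own ancestor so that an ancestor among the inputs can be returned.

```python
from pathlib import PurePosixPath

def common_ancestor(paths):
    """Return the deepest path that equals or contains every path in `paths`."""
    ancestor_sets = []
    for p in map(PurePosixPath, paths):
        # Include the path itself, so that '/usr/var' is the result
        # for ['/usr/var', '/usr/var/log'].
        ancestor_sets.append({p, *p.parents})
    shared = set.intersection(*ancestor_sets)
    # The longest shared prefix is the one with the most components.
    return max(shared, key=lambda q: len(q.parts))

result = common_ancestor(['/usr/var', '/usr/var2/log'])
```

Because whole components are compared, /usr/var is not reported as an ancestor of /usr/var2/log here.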
PPPPPS: Python 3.5 introduced a proper solution to this question: os.path.commonpath(), which returns a valid path.
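For the original question, os.path.commonpath() and os.path.relpath() might be combined along these lines. This is a sketch that assumes Python 3.5+ and POSIX-style paths; the function name relative_if_descendant is made up for illustration.

```python
import os.path

def relative_if_descendant(descendant, ancestor):
    """Return descendant relative to ancestor, or None if it is not below it."""
    descendant = os.path.abspath(descendant)
    ancestor = os.path.abspath(ancestor)
    # commonpath() compares whole components, so '/usr/var' is not
    # treated as an ancestor of '/usr/var2/log'.
    if os.path.commonpath([descendant, ancestor]) != ancestor:
        return None
    return os.path.relpath(descendant, ancestor)
```

Note that a path is treated as a descendant of itself here, in which case the result is '.'.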
os.path.relpath(path[, start]): Return a relative filepath to path either from the current directory or from an optional start point.
>>> from os.path import relpath
>>> relpath('/usr/var/log/', '/usr/var')
'log'
>>> relpath('/usr/var/log/', '/usr/var/sad/')
'../log'
So, if the relative path starts with '..', it means that the first path is not a descendant of the second path.
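Wrapped into a function, that test might look like the sketch below. The name is_descendant is hypothetical, and POSIX-style paths are assumed. It compares the first path component with '..' instead of using a plain string prefix test, which is a small variation on the idea above.

```python
import os

def is_descendant(path, start):
    """Return True if path is start itself or lies somewhere below it."""
    rel = os.path.relpath(path, start)
    # Compare the first component rather than using startswith('..'),
    # so that a child directory named e.g. '..cache' is not misread.
    first = rel.split(os.sep)[0]
    return first != os.pardir
```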
In Python 3 you can use PurePath.relative_to():
Python 3.5.1 (default, Jan 22 2016, 08:54:32)
>>> from pathlib import Path
>>> Path('/usr/var/log').relative_to('/usr/var/log/')
PosixPath('.')
>>> Path('/usr/var/log').relative_to('/usr/var/')
PosixPath('log')
>>> Path('/usr/var/log').relative_to('/etc/')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Cellar/python3/3.5.1/Frameworks/Python.framework/Versions/3.5/lib/python3.5/pathlib.py", line 851, in relative_to
    .format(str(self), str(formatted)))
ValueError: '/usr/var/log' does not start with '/etc'
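Since relative_to() signals a non-descendant with a ValueError, the check and the relative path can be obtained in one step. A minimal sketch, where the function name relative_or_none is made up and PurePosixPath is used so that the behaviour does not depend on the platform:

```python
from pathlib import PurePosixPath

def relative_or_none(path, ancestor):
    """Return path relative to ancestor, or None if ancestor does not contain it."""
    try:
        return PurePosixPath(path).relative_to(ancestor)
    except ValueError:
        # relative_to() raises ValueError when path is not under ancestor.
        return None
```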
Another option is
>>> import os
>>> print(os.path.relpath('/usr/var/log/', '/usr/var'))
log
Edit: See jme’s answer for the best way with Python 3.
Using pathlib, you have the following solution:
Let’s say we want to check if son is a descendant of parent, and both are Path objects. We can get a list of the parts in the path with list(path.parts). Then, we just check that the beginning of the son’s list is equal to the list of segments of the parent.
>>> lparent = list(parent.parts)
>>> lson = list(son.parts)
>>> if lson[:len(lparent)] == lparent:
...     pass  # parent is a parent of son :)
If you want to get the remaining part, you can just join the segments of lson that come after the parent’s segments. It’s a string, but you can of course pass it to the constructor of another Path object.
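Put together, the parts-based check might look like this sketch. The example paths are made up, and joining the leftover segments with '/' is an assumption about how the remaining part is built (it fits POSIX paths only).

```python
from pathlib import PurePosixPath

parent = PurePosixPath('/usr/var')
son = PurePosixPath('/usr/var/log/nginx')

lparent = list(parent.parts)   # ['/', 'usr', 'var']
lson = list(son.parts)         # ['/', 'usr', 'var', 'log', 'nginx']

remainder = None
if lson[:len(lparent)] == lparent:
    # Join the segments that follow the parent's segments into a string,
    # then turn that string back into a path object.
    rest = '/'.join(lson[len(lparent):])
    remainder = PurePosixPath(rest)
```

If son is not below parent, remainder simply stays None.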
Pure Python2 w/o dep:
import os
import sys

def relpath(cwd, path):
    """Create a relative path for path from cwd, if possible"""
    if sys.platform == "win32":
        # Compare case-insensitively on Windows
        cwd = cwd.lower()
        path = path.lower()
    _cwd = os.path.abspath(cwd).split(os.path.sep)
    _path = os.path.abspath(path).split(os.path.sep)
    # Index of the last leading component shared by both paths
    eq_until_pos = None
    for i in xrange(min(len(_cwd), len(_path))):
        if _cwd[i] == _path[i]:
            eq_until_pos = i
        else:
            break
    if eq_until_pos is None:
        return path
    # One ".." per remaining component of cwd, then the rest of path
    newpath = [".." for i in xrange(len(_cwd[eq_until_pos+1:]))]
    newpath.extend(_path[eq_until_pos+1:])
    return os.path.join(*newpath) if newpath else "."
A write-up of jme’s suggestion, using pathlib, in Python 3.
from pathlib import Path
parent = Path(r'/a/b')
son = Path(r'/a/b/c/d')
if parent in son.parents or parent == son:
    print(son.relative_to(parent))  # relative_to() returns a Path object equivalent to 'c/d'