normalize non-existing path using pathlib only
Question:
python has recently added the pathlib module (which i like a lot!).
there is just one thing i’m struggling with: is it possible to normalize a path to a file or directory that does not exist? i can do that perfectly well with os.path.normpath
. but wouldn’t it be absurd to have to use something other than the library that should take care of path related stuff?
the functionality i would like to have is this:
from os.path import normpath
from pathlib import Path
pth = Path('/tmp/some_directory/../i_do_not_exist.txt')
pth = Path(normpath(str(pth)))
# -> /tmp/i_do_not_exist.txt
but without having to resort to os.path
and without having to type-cast to str
and back to Path
. also pth.resolve()
does not work for non-existing files.
is there a simple way to do that with just pathlib
?
Answers:
As of Python 3.5: No, there’s not.
PEP 0428 states:
Path resolution
The resolve() method makes a path absolute, resolving
any symlink on the way (like the POSIX realpath() call). It is the
only operation which will remove ” .. ” path components. On Windows,
this method will also take care to return the canonical path (with the
right casing).
Since resolve()
is the only operation to remove the “..” components, and it fails when the file doesn’t exist, there won’t be a simple means using just pathlib
.
Also, the pathlib documentation gives a hint as to why:
Spurious slashes and single dots are collapsed, but double dots (‘..’)
are not, since this would change the meaning of a path in the face of
symbolic links:
PurePath('foo//bar')
produces PurePosixPath('foo/bar')
PurePath('foo/./bar')
produces PurePosixPath('foo/bar')
PurePath('foo/../bar')
produces PurePosixPath('foo/../bar')
(a naïve approach would make PurePosixPath(‘foo/../bar’) equivalent to PurePosixPath(‘bar’), which is wrong if foo is a symbolic link to another directory)
All that said, you could create a 0 byte file at the location of your path, and then it’d be possible to resolve the path (thus eliminating the ..). I’m not sure that’s any simpler than your normpath approach, though.
If this fits you usecase(e.g. ifle’s directory already exists) you might try to resolve
path’s parent and then re-append file name, e.g.:
from pathlib import Path
p = Path()/'hello.there'
print(p.parent.resolve()/p.name)
is it possible to normalize a path to a file or directory that does not exist?
Starting from 3.6, it’s the default behavior. See https://docs.python.org/3.6/library/pathlib.html#pathlib.Path.resolve
Path.resolve(strict=False)
…
If strict
is False
, the path is resolved as far as possible and any remainder is appended without checking whether it exists
Old question, but here is another solution in particular if you want POSIX paths across the board (like nix paths on Windows too).
I found pathlib resolve()
to be broken as of Python 3.10, and this method is not currently exposed by PurePosixPath
.
What I found worked was to use posixpath.normpath()
. Also found PurePosixPath.joinpath()
to be broken. I.E. It will not join ".." with "myfile.txt" as expected. It will return just "myfile.txt". But posixpath.join()
works perfectly; will return "../myfile.txt".
Note this is in path strings, but easily back to pathlib.Path(my_posix_path)
et al for an OOP container.
And easily transpose to Windows platform paths too by just constructing this way, as the module takes care of the platform independence for you.
Might be the solution for others with Python file path woes..
python has recently added the pathlib module (which i like a lot!).
there is just one thing i’m struggling with: is it possible to normalize a path to a file or directory that does not exist? i can do that perfectly well with os.path.normpath
. but wouldn’t it be absurd to have to use something other than the library that should take care of path related stuff?
the functionality i would like to have is this:
from os.path import normpath
from pathlib import Path
pth = Path('/tmp/some_directory/../i_do_not_exist.txt')
pth = Path(normpath(str(pth)))
# -> /tmp/i_do_not_exist.txt
but without having to resort to os.path
and without having to type-cast to str
and back to Path
. also pth.resolve()
does not work for non-existing files.
is there a simple way to do that with just pathlib
?
As of Python 3.5: No, there’s not.
PEP 0428 states:
Path resolution
The resolve() method makes a path absolute, resolving
any symlink on the way (like the POSIX realpath() call). It is the
only operation which will remove ” .. ” path components. On Windows,
this method will also take care to return the canonical path (with the
right casing).
Since resolve()
is the only operation to remove the “..” components, and it fails when the file doesn’t exist, there won’t be a simple means using just pathlib
.
Also, the pathlib documentation gives a hint as to why:
Spurious slashes and single dots are collapsed, but double dots (‘..’)
are not, since this would change the meaning of a path in the face of
symbolic links:
PurePath('foo//bar')
producesPurePosixPath('foo/bar')
PurePath('foo/./bar')
producesPurePosixPath('foo/bar')
PurePath('foo/../bar')
producesPurePosixPath('foo/../bar')
(a naïve approach would make PurePosixPath(‘foo/../bar’) equivalent to PurePosixPath(‘bar’), which is wrong if foo is a symbolic link to another directory)
All that said, you could create a 0 byte file at the location of your path, and then it’d be possible to resolve the path (thus eliminating the ..). I’m not sure that’s any simpler than your normpath approach, though.
If this fits you usecase(e.g. ifle’s directory already exists) you might try to resolve
path’s parent and then re-append file name, e.g.:
from pathlib import Path
p = Path()/'hello.there'
print(p.parent.resolve()/p.name)
is it possible to normalize a path to a file or directory that does not exist?
Starting from 3.6, it’s the default behavior. See https://docs.python.org/3.6/library/pathlib.html#pathlib.Path.resolve
Path.resolve(strict=False)
…
Ifstrict
isFalse
, the path is resolved as far as possible and any remainder is appended without checking whether it exists
Old question, but here is another solution in particular if you want POSIX paths across the board (like nix paths on Windows too).
I found pathlib resolve()
to be broken as of Python 3.10, and this method is not currently exposed by PurePosixPath
.
What I found worked was to use posixpath.normpath()
. Also found PurePosixPath.joinpath()
to be broken. I.E. It will not join ".." with "myfile.txt" as expected. It will return just "myfile.txt". But posixpath.join()
works perfectly; will return "../myfile.txt".
Note this is in path strings, but easily back to pathlib.Path(my_posix_path)
et al for an OOP container.
And easily transpose to Windows platform paths too by just constructing this way, as the module takes care of the platform independence for you.
Might be the solution for others with Python file path woes..