How can I replace (or strip) an extension from a filename in Python?
Question:
Is there a built-in function in Python that would replace (or remove, whatever) the extension of a filename (if it has one)?
Example:
print(replace_extension('/home/user/somefile.txt', '.jpg'))
In my example, /home/user/somefile.txt would become /home/user/somefile.jpg.
I don’t know if it matters, but I need this for a SCons module I’m writing. (So perhaps there is some SCons-specific function I can use?)
I’d like something clean. Doing a simple string replacement of all occurrences of .txt within the string is obviously not clean. (This would fail if my filename is somefile.txt.txt.txt.)
Answers:
Try os.path.splitext; it should do what you want.
import os
print(os.path.splitext('/home/user/somefile.txt')[0] + '.jpg')  # /home/user/somefile.jpg
os.path.splitext('/home/user/somefile.txt')  # returns ('/home/user/somefile', '.txt')
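As a minimal sketch, the one-liner above could be wrapped into the replace_extension function the question asks for. The function name comes from the question; appending the new extension when the path has none is an assumption about the wanted behaviour.

```python
import os


def replace_extension(path, new_ext):
    """Swap the last extension of path for new_ext ('' strips it)."""
    # splitext only inspects the final path component, so dots in
    # parent directories are left alone.
    root, _old_ext = os.path.splitext(path)
    return root + new_ext


print(replace_extension('/home/user/somefile.txt', '.jpg'))  # /home/user/somefile.jpg
```

Passing an empty string as new_ext strips the extension instead of replacing it.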
As @jethro said, splitext is the neat way to do it. But in this case, it’s pretty easy to split it yourself, since the extension must be the part of the filename coming after the final period:
filename = '/home/user/somefile.txt'
print(filename.rsplit(".", 1)[0])
# '/home/user/somefile'
The rsplit tells Python to perform the string splits starting from the right of the string, and the 1 says to perform at most one split (so that e.g. 'foo.bar.baz' -> ['foo.bar', 'baz']). Since rsplit will always return a non-empty list, we may safely index 0 into it to get the filename minus the extension.
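One thing worth checking before relying on rsplit: it works on the whole string, so it assumes the last period really belongs to the filename. The sketch below compares it with os.path.splitext on a path whose directory contains a dot; both helper names are made up for illustration.

```python
import os


def strip_ext_rsplit(filename):
    # Cuts at the last "." anywhere in the string.
    return filename.rsplit(".", 1)[0]


def strip_ext_splitext(filename):
    # Cuts only at a "." inside the final path component.
    return os.path.splitext(filename)[0]


print(strip_ext_rsplit('/home/user/somefile.txt'))    # /home/user/somefile
print(strip_ext_rsplit('/home/user.d/somefile'))      # /home/user
print(strip_ext_splitext('/home/user.d/somefile'))    # /home/user.d/somefile
```

So the hand-rolled split is fine when every input is known to have an extension, and splitext is the safer choice otherwise.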
Another way to do it is to use the str.rpartition(sep) method. For example:
filename = '/home/user/somefile.txt'
(prefix, sep, suffix) = filename.rpartition('.')
new_filename = prefix + '.jpg'
print(new_filename)
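Note that rpartition returns ('', '', filename) when the separator is not found, so prefix is empty for a name without any period. A possible guard, as a sketch only: replace_ext_rpartition is a hypothetical name, and the check for "/" assumes POSIX-style path separators.

```python
def replace_ext_rpartition(filename, new_ext):
    prefix, sep, suffix = filename.rpartition(".")
    # No "." at all, or the last "." sits in a parent directory:
    # treat the file as having no extension and just append.
    if not sep or "/" in suffix:
        return filename + new_ext
    return prefix + new_ext


print(replace_ext_rpartition('/home/user/somefile.txt', '.jpg'))  # /home/user/somefile.jpg
print(replace_ext_rpartition('somefile', '.jpg'))                 # somefile.jpg
```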
I prefer the following one-liner approach using str.rsplit():
my_filename.rsplit('.', 1)[0] + '.jpg'
Example:
>>> my_filename = '/home/user/somefile.txt'
>>> my_filename.rsplit('.', 1)
['/home/user/somefile', 'txt']
For Python >= 3.4:
from pathlib import Path
filename = '/home/user/somefile.txt'
p = Path(filename)
new_filename = p.parent.joinpath(p.stem + '.jpg') # PosixPath('/home/user/somefile.jpg')
new_filename_str = str(new_filename) # '/home/user/somefile.jpg'
Expanding on AnaPana’s answer, here is how to remove or replace an extension using pathlib (Python >= 3.4):
>>> from pathlib import Path
>>> filename = Path('/some/path/somefile.txt')
>>> filename_wo_ext = filename.with_suffix('')
>>> filename_replace_ext = filename.with_suffix('.jpg')
>>> print(filename)
/some/path/somefile.txt
>>> print(filename_wo_ext)
/some/path/somefile
>>> print(filename_replace_ext)
/some/path/somefile.jpg
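A few details of with_suffix are easy to miss, so here is a small sketch of them: it touches only the last suffix, it appends when there is none, and it rejects a suffix that does not start with a dot. PurePosixPath is used instead of Path only so the printed separators are the same on every platform; the example paths are made up.

```python
from pathlib import PurePosixPath

p = PurePosixPath('/some/path/archive.tar.gz')

print(p.suffix)               # .gz
print(p.suffixes)             # ['.tar', '.gz']
print(p.with_suffix('.jpg'))  # /some/path/archive.tar.jpg

# A path without an extension simply gets the new one appended.
no_ext = PurePosixPath('/some/path/README')
print(no_ext.with_suffix('.jpg'))  # /some/path/README.jpg

# The new suffix must start with a dot (or be empty).
try:
    p.with_suffix('jpg')
except ValueError as exc:
    print('rejected:', exc)
```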
Handling multiple extensions
In the case where you have multiple extensions, using pathlib and str.replace works a treat:
Remove/strip extensions
>>> from pathlib import Path
>>> p = Path("/path/to/myfile.tar.gz")
>>> extensions = "".join(p.suffixes)
# any python version
>>> str(p).replace(extensions, "")
'/path/to/myfile'
# python>=3.9
>>> str(p).removesuffix(extensions)
'/path/to/myfile'
Replace extensions
>>> p = Path("/path/to/myfile.tar.gz")
>>> extensions = "".join(p.suffixes)
>>> new_ext = ".jpg"
>>> str(p).replace(extensions, new_ext)
'/path/to/myfile.jpg'
If you also want a pathlib object output then you can obviously wrap the line in Path():
>>> Path(str(p).replace("".join(p.suffixes), ""))
PosixPath('/path/to/myfile')
Wrapping it all up in a function
from pathlib import Path
from typing import Union
PathLike = Union[str, Path]
def replace_ext(path: PathLike, new_ext: str = "") -> Path:
    # Join every suffix (e.g. ".tar" + ".gz") and swap the combined string for new_ext.
    extensions = "".join(Path(path).suffixes)
    return Path(str(path).replace(extensions, new_ext))
p = Path("/path/to/myfile.tar.gz")
new_ext = ".jpg"
assert replace_ext(p, new_ext) == Path("/path/to/myfile.jpg")
assert replace_ext(str(p), new_ext) == Path("/path/to/myfile.jpg")
assert replace_ext(p) == Path("/path/to/myfile")
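One caveat with the str.replace approach: it replaces every occurrence of the joined suffixes anywhere in the path, and when the path has no extension the search string is empty. A hedged alternative, assuming only the file name should ever change, is to slice the suffixes off the end of the name. replace_all_suffixes is a hypothetical helper name, not part of pathlib.

```python
from pathlib import Path


def replace_all_suffixes(path, new_ext=""):
    p = Path(path)
    extensions = "".join(p.suffixes)
    # Drop the joined suffixes from the end of the file name only.
    base = p.name[: len(p.name) - len(extensions)]
    return p.with_name(base + new_ext)


# str.replace also rewrites the matching directory name:
print('/data.tar.gz/backup.tar.gz'.replace('.tar.gz', ''))  # /data/backup

# Slicing the name leaves the directory untouched.
result = replace_all_suffixes('/data.tar.gz/backup.tar.gz', '.jpg')
print(result == Path('/data.tar.gz/backup.jpg'))  # True
```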
TLDR:
The best way to replace all extensions, in my opinion, is the following:
import pathlib
p = pathlib.Path('/path/to.my/file.foo.bar.baz.quz')
print(p.with_name(p.name.split('.')[0]).with_suffix('.jpg'))
Longer Answer:
The best way to do this will depend on your version of Python and how many extensions you need to handle. That said, I’m surprised nobody has mentioned pathlib’s with_name. I’m also concerned that some answers here don’t handle a . in the parent directories. Here are several ways to accomplish extension replacement.
Using Path Objects
Replace Up to One Extension
import pathlib
p = pathlib.Path('/path/to.my/file.foo')
print(p.with_suffix('.jpg'))
Replace Up to Two Extensions
import pathlib
p = pathlib.Path('/path/to.my/file.foo.bar')
print(p.with_name(p.stem).with_suffix('.jpg'))
Replace All Extensions
Using pathlib’s with_name (best solution, in my opinion):
import pathlib
p = pathlib.Path('/path/to.my/file.foo.bar.baz.quz')
print(p.with_name(p.name.split('.')[0]).with_suffix('.jpg'))
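One edge case to keep in mind with name.split('.')[0]: for a name that starts with a dot, such as .bashrc, the first component is an empty string. The sketch below keeps leading dots so the base name is never empty. replace_all_extensions is a hypothetical helper, and treating hidden files this way is an assumption about the desired behaviour.

```python
from pathlib import PurePosixPath


def replace_all_extensions(path, new_ext):
    p = PurePosixPath(path)
    # Separate any leading dots (hidden files) from the rest of the name.
    stripped = p.name.lstrip(".")
    leading = p.name[: len(p.name) - len(stripped)]
    # Everything before the first remaining "." is the base name.
    base = leading + stripped.split(".")[0]
    return p.with_name(base + new_ext)


print(replace_all_extensions('/path/to.my/file.foo.bar.baz.quz', '.jpg'))  # /path/to.my/file.jpg
print(replace_all_extensions('/home/user/.config.yaml.bak', '.jpg'))       # /home/user/.config.jpg
```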
Using functools.reduce and pathlib’s with_suffix:
import pathlib
import functools
p = pathlib.Path('/path/to.my/file.foo.bar.baz.quz')
print(functools.reduce(lambda v, _: v.with_suffix(''), p.suffixes, p).with_suffix('.jpg'))
print(functools.reduce(lambda v, e: v.with_suffix(e), ['' for _ in p.suffixes] + ['.jpg'], p))
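If the reduce calls are hard to read, the same idea can be written as a plain loop: keep stripping the last suffix until none is left, then add the new one. This is only a sketch of an equivalent formulation; strip_all_suffixes is a made-up name, and PurePosixPath is used so the output does not depend on the platform.

```python
from pathlib import PurePosixPath


def strip_all_suffixes(p):
    # Each pass removes exactly one trailing suffix.
    while p.suffix:
        p = p.with_suffix("")
    return p


p = PurePosixPath('/path/to.my/file.foo.bar.baz.quz')
print(strip_all_suffixes(p))                      # /path/to.my/file
print(strip_all_suffixes(p).with_suffix('.jpg'))  # /path/to.my/file.jpg
```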
Python 3.9+ Using pathlib and str.removesuffix:
import pathlib
p = pathlib.Path('/path/to.my/file.foo.bar.baz.quz')
print(pathlib.Path(str(p).removesuffix(''.join(p.suffixes))).with_suffix('.jpg'))
Without Using Path Objects (Strings Only)
In general, I think solutions using pathlib are cleaner, but not everybody can do that. If you’re still using python 2, I’m sorry. If you don’t have the pathlib package for python2, I’m really sorry.
Replace All Extensions
Python 2.7 compatible using os.path:
import os
ps = '/path/to.my/file.foo.bar.baz.quz'
print(os.path.join(os.path.dirname(ps), os.path.basename(ps).split('.')[0] + '.jpg'))
Python 3.9+ Using removesuffix and os.path (if you have python3.9, why aren’t you using pathlib?):
import os
ps = '/path/to.my/file.foo.bar.baz.quz'
# Take everything after the first "." of the file name so that all extensions are removed.
print(ps.removesuffix(os.path.basename(ps).split('.', 1)[-1]) + 'jpg')