Remove substring only at the end of string
Question:
I have a bunch of strings, some of which end in ' rec'. I want to remove that only if those are the last 4 characters.
So in other words I have
somestring = 'this is some string rec'
and I want it to become
somestring = 'this is some string'
What is the Python way to approach this?
Answers:
def rchop(s, suffix):
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

somestring = 'this is some string rec'
rchop(somestring, ' rec')  # returns 'this is some string'
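The `suffix and` guard in this answer is easy to overlook. Slicing with `-0` is the same as slicing with `0`, so without the guard an empty suffix would wipe out the whole string. A small sketch of that edge case follows; `rchop_unguarded` is a hypothetical name introduced only for contrast.

```python
def rchop(s, suffix):
    # Guarded version, as in the answer above.
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s


def rchop_unguarded(s, suffix):
    # Hypothetical variant without the guard, shown for contrast only.
    if s.endswith(suffix):
        return s[:-len(suffix)]
    return s


print(repr(rchop('this is some string rec', ' rec')))  # 'this is some string'
print(repr(rchop('record', ' rec')))                   # 'record' (suffix not at the end)
print(repr(rchop('abc', '')))                          # 'abc'
print(repr(rchop_unguarded('abc', '')))                # '' because 'abc'[:-0] == 'abc'[:0]
```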
You could use a regular expression as well:
from re import sub

s = r"this is some string rec"  # named s rather than str to avoid shadowing the built-in
regex = r"(.*)\srec$"
print(sub(regex, r"\1", s))
Since you have to get len(trailing) anyway (where trailing is the string you want to remove IF it’s trailing), I’d recommend avoiding the slight duplication of work that .endswith would cause in this case. Of course, the proof of the code is in the timing, so let’s do some measurement (naming the functions after the respondents proposing them):
import re

astring = 'this is some string rec'
trailing = ' rec'

def andrew(astring=astring, trailing=trailing):
    regex = r'(.*)%s$' % re.escape(trailing)
    return re.sub(regex, r'\1', astring)

def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring

def jack1(astring=astring, trailing=trailing):
    regex = r'%s$' % re.escape(trailing)
    return re.sub(regex, '', astring)

def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring
Say we’ve named this Python file a.py and it’s in the current directory; now:
$ python2.6 -mtimeit -s'import a' 'a.andrew()'
100000 loops, best of 3: 19 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack0()'
1000000 loops, best of 3: 0.564 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.jack1()'
100000 loops, best of 3: 9.83 usec per loop
$ python2.6 -mtimeit -s'import a' 'a.alex()'
1000000 loops, best of 3: 0.479 usec per loop
As you see, the RE-based solutions are “hopelessly outclassed” (as often happens when one “overkills” a problem, possibly one of the reasons REs have such a bad rep in the Python community!-), though the suggestion in @Jack’s comment is way better than @Andrew’s original. The string-based solutions, as expected, shine, with my endswith-avoiding one having a minuscule advantage over @Jack’s (being just 15% faster). So, both pure-string ideas are good (as well as both being concise and clear); I prefer my variant a little bit only because I am, by character, a frugal (some might say, stingy;-) person… “waste not, want not”!-)
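The measurements above were taken with Python 2.6 from the command line. To repeat the comparison of the two string-based functions on a current interpreter, a minimal sketch using the standard `timeit` module might look like this. The loop count is an arbitrary choice, and absolute numbers will differ by machine and Python version.

```python
import timeit

astring = 'this is some string rec'
trailing = ' rec'


def jack0(astring=astring, trailing=trailing):
    if astring.endswith(trailing):
        return astring[:-len(trailing)]
    return astring


def alex(astring=astring, trailing=trailing):
    thelen = len(trailing)
    if astring[-thelen:] == trailing:
        return astring[:-thelen]
    return astring


LOOPS = 100_000  # arbitrary loop count chosen for this sketch

for func in (jack0, alex):
    # repeat() returns one total time per run; the minimum is the least noisy.
    best = min(timeit.repeat(func, number=LOOPS, repeat=3))
    print(f'{func.__name__}: {best / LOOPS * 1e6:.3f} usec per loop')
```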
As a kind of one-liner, using a generator expression and join:
test = """somestring='this is some string rec'
this is some string in the end word rec
This has not the word."""
match = 'rec'
print('\n'.join((line[:-len(match)] if line.endswith(match) else line)
                for line in test.splitlines()))
""" Output:
somestring='this is some string rec'
this is some string in the end word
This has not the word.
"""
If speed is not important, use regex:
import re
somestring='this is some string rec'
somestring = re.sub(' rec$', '', somestring)
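Two details of the regex approach are worth a quick illustration: `$` also matches just before a newline at the very end of the string, and a suffix containing regex metacharacters has to be escaped. The sketch below uses a hypothetical helper name, `strip_suffix_re`, and anchors with `\Z`, which matches only at the true end of the string.

```python
import re


def strip_suffix_re(text, suffix):
    # re.escape protects suffixes that contain regex metacharacters;
    # \Z anchors at the very end of the string only.
    return re.sub(re.escape(suffix) + r'\Z', '', text)


# With '$', the suffix is removed even though a newline follows it.
print(repr(re.sub(' rec$', '', 'this is some string rec\n')))      # 'this is some string\n'
# With '\Z', a string ending in a newline is left unchanged.
print(repr(strip_suffix_re('this is some string rec\n', ' rec')))  # 'this is some string rec\n'
print(repr(strip_suffix_re('this is some string rec', ' rec')))    # 'this is some string'
# Metacharacters in the suffix are treated literally.
print(repr(strip_suffix_re('total (a+b)', ' (a+b)')))              # 'total'
```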
Using more_itertools, we can rstrip strings that pass a predicate.
Installation
> pip install more_itertools
Code
import more_itertools as mit
iterable = "this is some string rec".split()
" ".join(mit.rstrip(iterable, pred=lambda x: x in {"rec", " "}))
# 'this is some string'
Here we pass all trailing items we wish to strip from the end.
See also the more_itertools docs for details.
use:
somestring.rsplit(' rec')[0]
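One caveat with the `rsplit` approach: it splits on every occurrence of the separator, not only a trailing one, so it also truncates strings where ' rec' appears in the middle. The sketch below shows that behavior and one possible way to keep `rsplit` while checking the end of the string first; `chop_with_rsplit` is a hypothetical helper, not part of the original answer.

```python
trailing_case = 'this is some string rec'
middle_case = 'this rec is not at the end'
repeated_case = 'first rec second rec'

print(repr(trailing_case.rsplit(' rec')[0]))  # 'this is some string'
print(repr(middle_case.rsplit(' rec')[0]))    # 'this' (everything after ' rec' is lost)
print(repr(repeated_case.rsplit(' rec')[0]))  # 'first' (split at both occurrences)


def chop_with_rsplit(s, suffix):
    # Hypothetical helper: split only when the suffix really is at the end,
    # and at most once from the right. The `suffix and` check also avoids
    # the ValueError that rsplit raises for an empty separator.
    return s.rsplit(suffix, 1)[0] if suffix and s.endswith(suffix) else s


print(repr(chop_with_rsplit(middle_case, ' rec')))    # 'this rec is not at the end'
print(repr(chop_with_rsplit(repeated_case, ' rec')))  # 'first rec second'
```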
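Taking inspiration from @David Foster’s answer, I would do
def _remove_suffix(text, suffix):
    # Return text unchanged when either argument is None or the suffix is
    # empty (text[:-0] would otherwise produce an empty string).
    if text is not None and suffix:
        return text[:-len(suffix)] if text.endswith(suffix) else text
    return text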
Reference: Python string slicing
Here is a one-liner version of Jack Kelly’s answer along with its sibling:
def rchop(s, sub):
    # `sub and` keeps the empty-suffix guard from the original answer.
    return s[:-len(sub)] if sub and s.endswith(sub) else s

def lchop(s, sub):
    return s[len(sub):] if s.startswith(sub) else s
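def remove_trailing_string(content, trailing):
    """
    Strip trailing component `trailing` from `content` if it exists.

    A `content` that consists solely of `trailing` is returned unchanged.
    """
    # `trailing and` guards against an empty suffix, where content[:-0] is ''.
    if trailing and content.endswith(trailing) and content != trailing:
        return content[:-len(trailing)]
    return content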
Starting in Python 3.9, you can use removesuffix:
'this is some string rec'.removesuffix(' rec')
# 'this is some string'
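For code that may also run on interpreters older than 3.9, one possible approach is a small wrapper that prefers the built-in method and otherwise falls back to the endswith-and-slice idiom. The name `remove_suffix` is a hypothetical helper for this sketch. The last line shows why `rstrip` is not a substitute: it strips a set of characters rather than a substring.

```python
def remove_suffix(text, suffix):
    # Hypothetical compatibility helper, not part of the standard library.
    if hasattr(str, 'removesuffix'):
        # Python 3.9+: delegate to the built-in method.
        return text.removesuffix(suffix)
    # Older versions: fall back to the endswith-and-slice approach.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


print(repr(remove_suffix('this is some string rec', ' rec')))  # 'this is some string'
print(repr(remove_suffix('record', ' rec')))                   # 'record'
print(repr(remove_suffix('abc', '')))                          # 'abc'
# rstrip removes any trailing run of the characters ' ', 'r', 'e', 'c':
print(repr('soccer rec'.rstrip(' rec')))                       # 'so'
```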