How to subtract strings in Python

Question:

Basically, if I have a string 'AJ' and another string 'AJYF', I would like to be able to write 'AJYF'-'AJ' and get 'YF'.

I tried this but got a syntax error.

Just on a side note, the subtractor will always be shorter than the string it is subtracted from. Also, the subtractor will always match the beginning of the string it is subtracted from. For instance, if I have ‘GTYF’ and I want to subtract a string of length 3 from it, that string has to be ‘GTY’.

If it is possible, the full function I am trying to do is convert a string to a list based on how long each item in the list is supposed to be. Is there any way of doing that?

Asked By: jay a

Answers:

I think what you want is this:

a = 'AJYF'
b = a.replace('AJ', '')
print(b)    # prints YF
a = 'GTYF'
b = a.replace('GTY', '')
print(b)    # prints F
Answered By: Tom Barron

An easy solution is:

>>> string1 = 'AJYF'
>>> string2 = 'AJ'
>>> if string2 in string1:
...     string1.replace(string2, '')
...
'YF'
Answered By: Shubham Namdeo

replace removes every occurrence of the second string, which is not what you want if it is present at several positions:

s1 = 'AJYFAJYF'
s2 = 'AJ'
if s1.startswith(s2):
    s3 = s1.replace(s2, '')
s3
# 'YFYF'

You can add an extra argument to replace to indicate that you want only one replacement to happen:

if s1.startswith(s2):
    s3 = s1.replace(s2, '', 1)
s3
# 'YFAJYF'

Or you could use the re module:

import re
if s1.startswith(s2):
    s3 = re.sub('^' + s2, '', s1)
s3
# 'YFAJYF'

The '^' ensures that s2 is substituted only at the start of s1.
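
One caveat with building the pattern by concatenation: if s2 contains characters that are special in regular expressions (such as '.', '+' or '('), they are interpreted as pattern syntax rather than literal text. A minimal sketch, assuming the same intent as above, that escapes s2 with re.escape first (strip_prefix_re is a hypothetical helper name, not part of the original answer):

```python
import re

def strip_prefix_re(s1, s2):
    """Remove s2 from the start of s1, treating s2 as literal text."""
    # re.escape makes regex metacharacters in s2 match themselves
    return re.sub('^' + re.escape(s2), '', s1)

print(strip_prefix_re('AJYFAJYF', 'AJ'))   # YFAJYF
print(strip_prefix_re('a.b.c', 'a.'))      # b.c
# Without escaping, the '.' in 'a.' matches any character:
print(re.sub('^' + 'a.', '', 'axb'))       # b
```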

Yet another approach, suggested in the comments, would be to take out the first len(s2) characters from s1:

if s1.startswith(s2):
    s3 = s1[len(s2):] 
s3
# 'YFAJYF'
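
The slicing approach can be wrapped in a small function so that the result is defined even when s2 is not a prefix. This is only a sketch: the name subtract_prefix and the choice to return s1 unchanged on a mismatch are assumptions, not part of the original answer. On Python 3.9 and later, the built-in str.removeprefix method behaves the same way.

```python
def subtract_prefix(s1, s2):
    """Return s1 without the leading s2, or s1 unchanged if s2 is not a prefix."""
    if s1.startswith(s2):
        return s1[len(s2):]
    return s1

print(subtract_prefix('AJYFAJYF', 'AJ'))  # YFAJYF
print(subtract_prefix('GTYF', 'GTY'))     # F
print(subtract_prefix('GTYF', 'XY'))      # GTYF
```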

Some tests using the %timeit magic in IPython (Python 2.7.12, IPython 5.1.0) suggest that this last approach is the fastest:

In [1]: s1 = 'AJYFAJYF'

In [2]: s2 = 'AJ'

In [3]: %timeit s3 = s1[len(s2):]
The slowest run took 24.47 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.7 ns per loop

In [4]: %timeit s3 = s1[len(s2):]
The slowest run took 32.58 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.8 ns per loop

In [5]: %timeit s3 = s1[len(s2):]
The slowest run took 21.81 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 87.4 ns per loop

In [6]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 17.64 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 230 ns per loop

In [7]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 17.79 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 228 ns per loop

In [8]: %timeit s3 = s1.replace(s2, '', 1)
The slowest run took 16.27 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 234 ns per loop

In [9]: import re

In [10]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 82.02 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.85 µs per loop

In [11]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 12.82 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.86 µs per loop

In [12]: %timeit s3 = re.sub('^' + s2, '', s1)
The slowest run took 13.08 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 1.84 µs per loop
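
Outside IPython, a similar comparison can be sketched with the standard timeit module. The absolute numbers depend on the machine and Python version, so none are shown here. The loop count is an arbitrary choice, and wrapping each statement in a lambda adds a small, roughly constant call overhead to every measurement.

```python
import re
import timeit

s1 = 'AJYFAJYF'
s2 = 'AJ'

# Each candidate is a zero-argument callable so timeit can call it repeatedly.
candidates = {
    'slice': lambda: s1[len(s2):],
    'replace': lambda: s1.replace(s2, '', 1),
    're.sub': lambda: re.sub('^' + s2, '', s1),
}

timings = {}
for name, func in candidates.items():
    # Total seconds for 100000 calls of func
    timings[name] = timeit.timeit(func, number=100000)
    print('%-8s %.4f s' % (name, timings[name]))
```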
Answered By: bli

If you insist on using the '-' operator, use a class that overrides the __sub__ dunder method, combined with one of the solutions provided above:

class String(object):
    def __init__(self, string):
        self.string = string

    def __sub__(self, other):
        if self.string.startswith(other.string):
            return self.string[len(other.string):]

    def __str__(self):
        return self.string


sub1 = String('AJYF') - String('AJ')
sub2 = String('GTYF') - String('GTY')
print(sub1)
print(sub2)

It prints:

YF
F
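
One limitation of the class above is that the subtraction returns a plain str (or None when other is not a prefix), so results cannot be subtracted again. A possible variant, sketched here under the assumption that subclassing str is acceptable, returns the same type and raises an error on a mismatch. The class name SubStr and the choice of ValueError are illustrative only:

```python
class SubStr(str):
    """A str subclass where a - b removes the prefix b from a."""

    def __sub__(self, other):
        if not isinstance(other, str):
            return NotImplemented
        if not self.startswith(other):
            raise ValueError('%r does not start with %r' % (str(self), str(other)))
        # Wrap the result so further subtractions keep working
        return SubStr(self[len(other):])


print(SubStr('AJYF') - 'AJ')            # YF
print(SubStr('GTYFAB') - 'GT' - 'YF')   # AB
```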
Answered By: moctarjallo

This gives the distinct characters that appear in only one of the two strings:

print(set(string1) ^ set(string2))
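
Note that set(string1) ^ set(string2) is the symmetric difference of the two character sets, so it discards order and repeated characters and is not a prefix removal. A short illustration, using sorted() only to make the output order predictable:

```python
string1 = 'AJYF'
string2 = 'AJ'
print(sorted(set(string1) ^ set(string2)))   # ['F', 'Y']

# Repeated characters and their original order are lost:
string3 = 'AJYFAJYF'
print(sorted(set(string3) ^ set(string2)))   # ['F', 'Y']
```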

Answered By: Jyothi Ram