Removing substring from string in Python

Question:

I want to write a function that takes 2 inputs: a string and a substring, then the function will remove that part of the substring from the string.

def remove_substring(s, substr):
   """(str, str) -> NoneType
Returns string without string substr

remove_substring_from_string("Im in cs", "cs")
Im in
    """
    other_s = ''
for substr in s:
    if substr in s:
        continue

How do I continue on from here? Assuming my logic is sound.

Asked By: Michel Hijazin

||

Answers:

Avoiding the use of Python functions.

Method 1

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    i = 0
    while i < len(s) - len(substr) + 1:
        # Check if substring starts at i
        if s[i:i+len(substr)] == substr:
            break   
        i += 1
    else:
        # break not hit, so substr not found
        return s
    
    # break hit
    return s[:i] + s[i+len(substr):]

Method 2

If the range function can be used, the above can be written more compactly as follows.

def remove_substring_from_string(s, substr):
    '''
    find start index in s of substring
    remove it by skipping over it
    '''
    for i in range(len(s) - len(substr) + 1):
        if s[i:i+len(substr)] == substr:
            break
    else:
        # break not hit, so substr not found
        return s
    
    return s[:i] + s[i+len(substr):]

Test

print(remove_substring_from_string("I have nothing to declare except my genuis", " except my genuis"))
# Output: I have nothing to declare'
Answered By: DarrylG

This approach is based on the KMP algorithm:

def KMP(s):
    n = len(s)
    pi = [0 for _ in range(n)]

    for i in range(1, n):
        j = pi[i - 1]
        while j > 0 and s[i] != s[j]:
            j = pi[j - 1]
        
        if s[i] == s[j]:
            j += 1
        
        pi[i] = j
    
    return pi

# Removes all occurences of t in s
def remove_substring_from_string(s, t):
    n = len(s)
    m = len(t)
    
    # Calculate the prefix function using KMP
    pi = KMP(t + 'x00' + s)[m + 1:]
    r = ""

    i = 0
    while i + m - 1 < n: # Before the remaining string is smaller than the substring
        if pi[i + m - 1] == m: # If the substring is here, skip it
            i += m
        else: # Otherwise, add the current character and move to the next
            r += s[i]
            i += 1
    
    # Add the remaining string
    r += s[i:]
    return r

It runs in O(|s| + |t|), but it has a few downsides:

  • The code is long and unintuitive.
  • It requires that there is no null (x00) in the input strings.
  • Its constant factors are pretty bad for short s and t.
  • It doesn’t handle overlapping strings how you might want it: remove_substring_from_string("aaa", "aa") will return "a". The only guarantee made is that t in remove_substring_from_string(s, t) is False for any two strings s and t.

A C++ example and further explanation for the KMP algorithm can be found here. The remove_substring_from_string function then only checks if the entire substring is matched at each position and if so, skips over the substring.

Answered By: dranjohn

I would do this using re.

import re


def remove_substring(s, substr):
    # type: (str, str) -> str
    return re.subn(substr, '', s)[0]
remove_substring('I am in cs', 'cs')
# 'I am in '

remove_substring('This also removes multiple substr that are found. Even if that substr is repeated like substrsubstrsubstr', 'substr')
# 'This also removes multiple  that are found. Even if that  is repeated like '
Answered By: Sam Bourne
def remove_substring(s, substr):
while s != "":
    if substr in s:
        s = s.replace(substr, "")
    else:
        return s

if s == "":
    return "Empty String"

The idea here is that we replace all occurrences of substr within s, by replacing the first instance of substr and then looping until we are done.

Answered By: Farooq
Categories: questions Tags:
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.