Replacing characters of a string by a given string

Question

Given the string 'www__ww_www_', I need to replace each '_' character, in order, with the next character from the string '1234'. The result should be 'www12ww3www4'.

TEXT = 'aio__oo_ecc_'
INSERT = '1234'

insert = list(INSERT)
ret = ''

for char in TEXT:
    if char == '_':
        ret += insert[0]
        insert.pop(0)
    else:
        ret += char

print(ret)
# aio12oo3ecc4

What is the right way to do this? My approach seems very inefficient.

Asked By: thithien


Answer 1

Consider splitting the pattern string by the underscore and zipping it with the string of inserts:

from itertools import zip_longest, chain

TEXT = 'aio__oo_ecc_a'  # '_a' added to illustrate the need for zip_longest
INSERT = '1234'
''.join(chain.from_iterable(zip_longest(TEXT.split('_'), INSERT, fillvalue='')))
# 'aio12oo3ecc4a'

zip_longest is used instead of the “normal” zip to make sure the last fragment of the pattern, if any, is not lost.

A step-by-step exploration:

pieces = TEXT.split('_')
# ['aio', '', 'oo', 'ecc', 'a']
mix = zip_longest(pieces, INSERT, fillvalue='')
# [('aio', '1'), ('', '2'), ('oo', '3'), ('ecc', '4'), ('a', '')]
flat_mix = chain.from_iterable(mix)
# ['aio', '1', '', '2', 'oo', '3', 'ecc', '4', 'a', '']
result = ''.join(flat_mix)
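
One thing to watch with this approach: splitting yields one more piece than there are underscores, so if INSERT has more characters than there are underscores, zip_longest appends the leftovers at the end. A minimal sketch that guards against that, assuming extra insert characters should simply be ignored (the function name fill_by_split and its placeholder parameter are made up for illustration):

```python
from itertools import chain, zip_longest


def fill_by_split(text, insert, placeholder='_'):
    """Interleave the pieces of `text` with characters from `insert`."""
    pieces = text.split(placeholder)
    # There are len(pieces) - 1 placeholders, so use at most that many inserts.
    usable = insert[:len(pieces) - 1]
    return ''.join(chain.from_iterable(zip_longest(pieces, usable, fillvalue='')))


print(fill_by_split('aio__oo_ecc_a', '1234'))  # aio12oo3ecc4a
print(fill_by_split('a_b', '1234'))            # a1b
```

Note that if there are fewer insert characters than underscores, the remaining underscores are dropped rather than kept, because the pieces are joined with an empty fill value.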

Speed comparison:

This solution: 1.32 µs ± 9.08 ns per loop
Iterator + ternary + list comprehension: 1.77 µs ± 20.8 ns per loop
Original solution: 2 µs ± 13.2 ns per loop
The loop + regex solution: 3.66 µs ± 103 ns per loop

Answered By: DYZ

Answer 2

As pointed out in the comments, you can use str.replace directly:

for c in INSERT:
    TEXT = TEXT.replace('_', c, 1)
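
A caveat with replacing one occurrence per pass: each pass searches again from the start of the string, so if an inserted character is itself an underscore, the next pass overwrites it. A small sketch showing this, with the loop wrapped in a hypothetical helper (fill_by_replace is not from the original code) so that TEXT is not reassigned:

```python
def fill_by_replace(text, insert, placeholder='_'):
    """Replace the first remaining placeholder once per insert character."""
    for c in insert:
        text = text.replace(placeholder, c, 1)
    return text


print(fill_by_replace('aio__oo_ecc_', '1234'))  # aio12oo3ecc4

# The first pass replaces '_' with '_', and the second pass then
# overwrites that same position instead of moving on to the next one.
print(fill_by_replace('a_b_', '_x'))  # axb_
```

For inserts that never contain the placeholder, as in the question, this is not an issue.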

You can also use a regex replacement for that:

import re
for c in INSERT:
    TEXT = re.sub('_', c, TEXT, count=1)  # count passed by keyword; positional use is deprecated since Python 3.13

See: https://docs.python.org/3/library/re.html

Answered By: goutnet

Answer 3

You can loop over TEXT with a list comprehension that uses a conditional expression to take either the next item from an iterator over INSERT or the current character of TEXT:

>>> TEXT = 'aio__oo_ecc_'
>>> INSERT = '1234'
>>> it = iter(INSERT)
>>> "".join([next(it) if x == "_" else x for x in TEXT])
'aio12oo3ecc4'

The benefits include avoiding Shlemiel the Painter’s Algorithm with ret += char. Also, pop(0) requires the whole list to be shifted forward, so it’s linear (better would be reversing INSERT and using pop()).
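
For reference, the original loop with both of those points addressed might look roughly like this. It is only a sketch: the function name fill_with_stack is made up, and it raises IndexError if INSERT runs out before the underscores do.

```python
def fill_with_stack(text, insert, placeholder='_'):
    """Original loop, but with O(1) pops and a single join at the end."""
    # Reverse once so the next character to insert sits at the end of the
    # list, where pop() is O(1) instead of the O(n) cost of pop(0).
    stack = list(reversed(insert))
    parts = []  # collect pieces and join once instead of repeated +=
    for char in text:
        parts.append(stack.pop() if char == placeholder else char)
    return ''.join(parts)


print(fill_with_stack('aio__oo_ecc_', '1234'))  # aio12oo3ecc4
```

A collections.deque with popleft() would avoid the reversal and is also O(1) per pop.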

In response to some of the comments here, list comprehensions tend to be faster than generators when the whole iterable will be consumed on the spot.
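
If you want to check this on your own machine, a rough timeit comparison could look like the following. The function names are placeholders and the absolute numbers will vary by Python version and hardware, so treat it as a sketch rather than a benchmark:

```python
import timeit

TEXT = 'aio__oo_ecc_'
INSERT = '1234'


def with_list():
    it = iter(INSERT)
    # The list is built first, then handed to join.
    return ''.join([next(it) if x == '_' else x for x in TEXT])


def with_generator():
    it = iter(INSERT)
    # join consumes the generator expression item by item.
    return ''.join(next(it) if x == '_' else x for x in TEXT)


list_time = timeit.timeit(with_list, number=100_000)
gen_time = timeit.timeit(with_generator, number=100_000)
print(f'list comprehension:   {list_time:.3f} s')
print(f'generator expression: {gen_time:.3f} s')
```

Both variants produce the same string; only the timing differs.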

Answered By: ggorlen

Answer 4

You can use an iterator in a replacement function for re.sub:

import re
TEXT = 'aio__oo_ecc_'
INSERT = '1234'
i = iter(INSERT)
print(re.sub('_', lambda _: next(i), TEXT))

This outputs:

aio12oo3ecc4
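
If INSERT can be shorter than the number of underscores, next(i) raises StopIteration once the iterator is exhausted. One way to handle that, assuming you want leftover underscores kept as they are, is to give next() a default. The helper name fill_with_sub below is hypothetical:

```python
import re


def fill_with_sub(text, insert, placeholder='_'):
    """Replace placeholders in order; keep them once `insert` runs out."""
    it = iter(insert)
    # next(it, placeholder) returns the placeholder itself when `it` is empty.
    return re.sub(re.escape(placeholder), lambda m: next(it, placeholder), text)


print(fill_with_sub('aio__oo_ecc_', '1234'))  # aio12oo3ecc4
print(fill_with_sub('aio__oo_ecc_', '12'))    # aio12oo_ecc_
```

re.escape is only there so that a placeholder such as '.' or '*' is matched literally.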

Answered By: blhsing
