Replacing characters of a string by a given string

Question:

Given this string 'www__ww_www_'

I need to replace all the '_' characters with characters from the following string '1234'. The result should be 'www12ww3www4'.

TEXT = 'aio__oo_ecc_'
INSERT = '1234'

insert = list(INSERT)
ret = ''

for char in TEXT:
    if char == '_':
        ret += insert[0]
        insert.pop(0)
    else:
        ret += char

print (ret)
>> aio12oo3ecc4

What is the right way to do this? Because this seems like the most inefficient way.

Asked By: thithien

||

Answers:

Consider splitting the pattern string by the underscore and zipping it with the string of inserts:

TEXT = 'aio__oo_ecc_a' # '_a' added to illustrate the need for zip_longest
from itertools import zip_longest, chain
''.join(chain.from_iterable(zip_longest(TEXT.split('_'), INSERT, fillvalue='')))
#'aio12oo3ecc4a'

zip_longest is used instead of the “normal” zip to make sure the last fragment of the pattern, if any, is not lost.

A step-by-step exploration:

pieces = TEXT.split('_')
# ['aio', '', 'oo', 'ecc', 'a']
mix = zip_longest(pieces, INSERT, fillvalue='')
# [('aio', '1'), ('', '2'), ('oo', '3'), ('ecc', '4'), ('a', '')]
flat_mix = chain.from_iterable(mix)
# ['aio', '1', '', '2', 'oo', '3', 'ecc', '4', 'a', '']
result = ''.join(flat_mix)

Speed comparison:

  1. This solution: 1.32 µs ± 9.08 ns per loop
  2. Iterator + ternary + list comprehension: 1.77 µs ± 20.8 ns per loop
  3. Original solution: 2 µs ± 13.2 ns per loop
  4. The loop + regex solution: 3.66 µs ± 103 ns per loop
Answered By: DYZ

As pointed in the comments, you can use the str.replace directly:

for c in INSERT:
    TEXT = TEXT.replace('_', c, 1)

You can use also the regex replace for that:

import re
for c in INSERT:
    TEXT = re.sub('_', c, TEXT, 1)

see here: https://docs.python.org/3/library/re.html

Answered By: goutnet

You can loop over the TEXT using a list comprehension that uses a ternary to select from an INSERT iterator or from the current element in TEXT:

>>> TEXT = 'aio__oo_ecc_'
>>> INSERT = '1234'
>>> it = iter(INSERT)
>>> "".join([next(it) if x == "_" else x for x in TEXT])
'aio12oo3ecc4'

The benefits include avoiding Shlemiel the Painter’s Algorithm with ret += char. Also, pop(0) requires the whole list to be shifted forward, so it’s linear (better would be reversing INSERT and using pop()).

In response to some of the comments here, list comprehensions tend to be faster than generators when the whole iterable will be consumed on the spot.

Answered By: ggorlen

You can use an iterator in a replacement function for re.sub:

import re
TEXT = 'aio__oo_ecc_'
INSERT = '1234'
i = iter(INSERT)
print(re.sub('_', lambda _: next(i), TEXT))

This outputs:

aio12oo3ecc4
Answered By: blhsing
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.