Find and replace string values in list

Question:

I got this list:

words = ['how', 'much', 'is[br]', 'the', 'fish[br]', 'no', 'really']

What I would like is to replace [br] with some fantastic value similar to <br /> and thus getting a new list:

words = ['how', 'much', 'is<br />', 'the', 'fish<br />', 'no', 'really']
Asked By: Eric Herlitz

||

Answers:

You can use, for example:

words = [word.replace('[br]','<br />') for word in words]
Answered By: houbysoft
words = [w.replace('[br]', '<br />') for w in words]

This is called a list comprehension.

Answered By: sberry

Beside list comprehension, you can try map

>>> map(lambda x: str.replace(x, "[br]", "<br/>"), words)
['how', 'much', 'is<br/>', 'the', 'fish<br/>', 'no', 'really']
Answered By: Anthony Kong

In case you’re wondering about the performance of the different approaches, here are some timings:

In [1]: words = [str(i) for i in range(10000)]

In [2]: %timeit replaced = [w.replace('1', '<1>') for w in words]
100 loops, best of 3: 2.98 ms per loop

In [3]: %timeit replaced = map(lambda x: str.replace(x, '1', '<1>'), words)
100 loops, best of 3: 5.09 ms per loop

In [4]: %timeit replaced = map(lambda x: x.replace('1', '<1>'), words)
100 loops, best of 3: 4.39 ms per loop

In [5]: import re

In [6]: r = re.compile('1')

In [7]: %timeit replaced = [r.sub('<1>', w) for w in words]
100 loops, best of 3: 6.15 ms per loop

as you can see for such simple patterns the accepted list comprehension is the fastest, but look at the following:

In [8]: %timeit replaced = [w.replace('1', '<1>').replace('324', '<324>').replace('567', '<567>') for w in words]
100 loops, best of 3: 8.25 ms per loop

In [9]: r = re.compile('(1|324|567)')

In [10]: %timeit replaced = [r.sub('<1>', w) for w in words]
100 loops, best of 3: 7.87 ms per loop

This shows that for more complicated substitutions a pre-compiled reg-exp (as in 9-10) can be (much) faster. It really depends on your problem and the shortest part of the reg-exp.

Answered By: Jörn Hees

An example with for loop (I prefer List Comprehensions).

a, b = '[br]', '<br />'
for i, v in enumerate(words):
    if a in v:
        words[i] = v.replace(a, b)
print(words)
# ['how', 'much', 'is<br/>', 'the', 'fish<br/>', 'no', 'really']
Answered By: Waket Zheng

If performance is important, including an if-else clause improves performance (by about 5% for a list of 1mil strings, no negligible really).

replaced = [w.replace('[br]','<br />') if '[br]' in w else w for w in words]

map() implementation can be improved by calling replace via operator.methodcaller() (by about 20%) but still slower than list comprehension (as of Python 3.9).

from operator import methodcaller
list(map(methodcaller('replace', '[br]', '<br />'), words))

If it suffices to modify the strings in-place, a loop implementation may be the fastest.

for i, w in enumerate(words):
    if '[br]' in w:
        words[i] = w.replace('[br]', '<br />')
Answered By: cottontail
Categories: questions Tags: , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.