Replace all words from word list with another string in python
Question:
I have a user-entered string, and I want to search it and replace every occurrence of any word from a list with my replacement string.
import re
prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"]
# word[1] contains the user entered message
themessage = str(word[1])
# would like to implement a foreach loop here but not sure how to do it in python
for themessage in prohibitedwords:
    themessage = re.sub(prohibitedWords, "(I'm an idiot)", themessage)
print themessage
The above code doesn't work; I'm sure I don't understand how Python for loops work.
Answers:
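Try this:
prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame","BrainSlug","SwiftRage","Kreygasm","ArsonNoSexy","GingerPower","Poooound","TooSpicy"]
themessage = str(word[1])
# the loop variable must not reuse the name `word`, which holds the user input
for prohibited in prohibitedWords:
    # str.replace returns a new string, so assign the result back
    themessage = themessage.replace(prohibited, "(I'm an idiot)")
print(themessage)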
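The loop above can be wrapped in a small function, which makes it easier to try on its own. This is a minimal sketch written for Python 3; the function name `censor`, its parameters, and the sample message are assumptions for illustration and do not come from the question.

```python
def censor(message, prohibited_words, replacement="(I'm an idiot)"):
    """Replace every occurrence of each prohibited word, one word at a time."""
    for bad_word in prohibited_words:
        # Strings are immutable, so replace() returns a new string each time.
        message = message.replace(bad_word, replacement)
    return message


prohibited = ["MVGame", "Kappa", "DansGame"]  # shortened list for the example
print(censor("Kappa that was close DansGame", prohibited))
# (I'm an idiot) that was close (I'm an idiot)
```

The `for bad_word in prohibited_words` line is Python's equivalent of a foreach loop: the name after `for` receives each list element in turn, so it should be a fresh name rather than the variable that holds the message.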
You can do that with a single call to sub:
big_regex = re.compile('|'.join(map(re.escape, prohibitedWords)))
the_message = big_regex.sub("repl-string", str(word[1]))
Example:
>>> import re
>>> prohibitedWords = ['Some', 'Random', 'Words']
>>> big_regex = re.compile('|'.join(map(re.escape, prohibitedWords)))
>>> the_message = big_regex.sub("<replaced>", 'this message contains Some really Random Words')
>>> the_message
'this message contains <replaced> really <replaced> <replaced>'
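Two edge cases are worth keeping in mind with the joined pattern, sketched below under stated assumptions. The helper names `build_replacer` and `replace_words` are hypothetical, and sorting the words longest-first is an addition that the answer itself does not use; it only matters when one list entry is a prefix of another.

```python
import re


def build_replacer(words):
    """Compile one pattern that matches any of the given words literally."""
    if not words:
        # An empty alternation would match the empty string at every position.
        return None
    # Alternation tries branches left to right, so put longer words first;
    # otherwise "Kappa" would match before "KappaPride" gets a chance.
    ordered = sorted(words, key=len, reverse=True)
    return re.compile("|".join(map(re.escape, ordered)))


def replace_words(message, words, replacement):
    pattern = build_replacer(words)
    return message if pattern is None else pattern.sub(replacement, message)


print(replace_words("Kappa and KappaPride", ["Kappa", "KappaPride"], "<replaced>"))
# <replaced> and <replaced>
```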
Note that using str.replace may lead to subtle bugs:
>>> words = ['random', 'words']
>>> text = 'a sample message with random words'
>>> for word in words:
...     text = text.replace(word, 'swords')
...
>>> text
'a sample message with sswords swords'
while using re.sub gives the correct result:
>>> big_regex = re.compile('|'.join(map(re.escape, words)))
>>> big_regex.sub("swords", 'a sample message with random words')
'a sample message with swords swords'
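The difference comes from how many passes are made over the text. The sketch below records the intermediate string after each str.replace pass, using the same `words` and sample text as above; the names `steps` and `single_pass` are introduced only for this illustration.

```python
import re

words = ['random', 'words']
original = 'a sample message with random words'

# Sequential str.replace: each pass sees the output of the previous pass,
# so text inserted by an earlier replacement can be replaced again.
text = original
steps = []
for w in words:
    text = text.replace(w, 'swords')
    steps.append(text)

for step in steps:
    print(step)
# a sample message with swords words
# a sample message with sswords swords

# A single regex pass never rescans the text it has just inserted.
single_pass = re.sub('|'.join(map(re.escape, words)), 'swords', original)
print(single_pass)
# a sample message with swords swords
```

In the second pass, the 'words' inside the freshly inserted 'swords' is matched again, which produces 'sswords'.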
As thg435 points out, if you want to replace words and not every substring you can add the word boundaries to the regex:
big_regex = re.compile(r'\b%s\b' % r'\b|\b'.join(map(re.escape, words)))
This would replace 'random' in 'random words' but not in 'pseudorandom words'.
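A short check of the word-boundary behaviour, as a sketch. It wraps the alternation in a non-capturing group, `\b(?:...)\b`, which is an equivalent way of writing the pattern chosen here for readability rather than the exact form given in the answer.

```python
import re

words = ['random', 'words']

# \b matches between a word character and a non-word character.
# The raw string keeps the backslash from being read as a string escape.
pattern = re.compile(r'\b(?:%s)\b' % '|'.join(map(re.escape, words)))

print(pattern.sub('swords', 'random words'))
# swords swords
print(pattern.sub('swords', 'pseudorandom words'))
# pseudorandom swords
```

Because `\b` only matches next to a word character, a list entry that starts or ends with punctuation may need a different boundary test.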
Code:
prohibitedWords = ["MVGame","Kappa","DatSheffy","DansGame",
                   "BrainSlug","SwiftRage","Kreygasm",
                   "ArsonNoSexy","GingerPower","Poooound","TooSpicy"]
themessage = 'Brain'
self_criticism = '(I`m an idiot)'
# note: this replaces themessage inside each list entry,
# not the list entries inside a message
final_message = [i.replace(themessage, self_criticism) for i in prohibitedWords]
print(final_message)
Result:
['MVGame', 'Kappa', 'DatSheffy', 'DansGame', '(I`m an idiot)Slug', 'SwiftRage',
 'Kreygasm', 'ArsonNoSexy', 'GingerPower', 'Poooound', 'TooSpicy']
Based on Bakariu’s answer, a simpler way to use re.sub would be like this:
import re
words = ['random', 'words']
text = 'a sample message with random words'
# the pattern is the words joined with "|" (alternation)
new_sentence = re.sub("random|words", "swords", text)
The output is "a sample message with swords swords"
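Writing the pattern by hand works for these two words, but it has to be kept in sync with the list, and it is only safe when no word contains regex metacharacters. A minimal sketch of the difference follows; the 'v1.0' strings and the variable names are made up for illustration.

```python
import re

words = ['random', 'words']
text = 'a sample message with random words'

# Derive the pattern from the list so the two cannot drift apart.
pattern = '|'.join(map(re.escape, words))
print(pattern)
# random|words
new_sentence = re.sub(pattern, 'swords', text)
print(new_sentence)
# a sample message with swords swords

# Without escaping, "." matches any character, not just a literal dot.
unescaped = re.sub('v1.0', 'X', 'v1.0 and v1x0')
escaped = re.sub(re.escape('v1.0'), 'X', 'v1.0 and v1x0')
print(unescaped)
# X and X
print(escaped)
# X and v1x0
```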