A list of string replacements in Python
Question:
Is there a far shorter way to write the following code?
my_string = my_string.replace('A', '1')
my_string = my_string.replace('B', '2')
my_string = my_string.replace('C', '3')
my_string = my_string.replace('D', '4')
my_string = my_string.replace('E', '5')
Note that I don’t need those exact values replaced; I’m simply looking for a way to turn 5+ lines into fewer than 5.
Answers:
replaceDict = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5'}
for key, replacement in replaceDict.items():
    my_string = my_string.replace(key, replacement)
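If the goal is purely fewer lines, the same sequential replacement can be folded into a single expression with functools.reduce. This is only a sketch: the sample input string is made up for illustration, and the replacements are still applied one after another, exactly as in the loop.

```python
from functools import reduce

replacements = [('A', '1'), ('B', '2'), ('C', '3'), ('D', '4'), ('E', '5')]
my_string = "ABCDE and more"  # sample input, made up for illustration

# Apply each (old, new) pair in turn, feeding the result into the next replace.
my_string = reduce(lambda s, pair: s.replace(*pair), replacements, my_string)
print(my_string)  # 12345 and more
```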
Looks like a good opportunity to use a loop:
mapping = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5'}
for k, v in mapping.items():  # items() works on Python 2 and 3; iteritems() is Python 2 only
    my_string = my_string.replace(k, v)
A faster approach if you don’t mind the parentheses would be:
mapping = [('A', '1'), ('B', '2'), ('C', '3'), ('D', '4'), ('E', '5')]
for k, v in mapping:
    my_string = my_string.replace(k, v)
Also look into str.translate(). For Unicode strings (every str in Python 3) it replaces characters according to a mapping you provide; for Python 2 byte strings it must instead be told what to replace each character from chr(0) to chr(255) with.
You can easily use str.maketrans() (string.maketrans() in Python 2) to create the mapping to pass to str.translate():
trans = str.maketrans("ABCDE", "12345")
my_string = my_string.translate(trans)
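On Python 3, where maketrans is a static method of str, the table can also be built from a single dictionary, which allows replacement strings longer than one character. A minimal sketch, assuming Python 3; the mapping and input below are made up for illustration:

```python
# Assumes Python 3. Keys must be single characters;
# values may be strings of any length.
table = str.maketrans({'A': '1', 'B': 'two'})

result = "ABCDE".translate(table)
print(result)  # 1twoCDE
```

Characters that are not in the table (here C, D and E) pass through unchanged.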
If you want to get the wrong answer, slowly, then use str.replace in a loop. (Though it does work in this case, where there is no overlap among the patterns and replacements.)
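To see how the loop goes wrong once patterns and replacements overlap, here is a small sketch that swaps two words. The input is chosen for illustration; the second replacement undoes part of the first because each pass sees the output of the previous one.

```python
subs = [('hi', 'bye'), ('bye', 'hi')]
text = 'hi and bye'

# Sequential replacement: every pass operates on the already-modified string.
for old, new in subs:
    text = text.replace(old, new)

print(text)  # hi and hi  (a simultaneous swap would give 'bye and hi')
```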
For the general case with possible overlaps or a long subject string, use re.sub:
import re
def multisub(subs, subject):
    "Simultaneously perform all substitutions on the subject string."
    pattern = '|'.join('(%s)' % re.escape(p) for p, s in subs)
    substs = [s for p, s in subs]
    replace = lambda m: substs[m.lastindex - 1]
    return re.sub(pattern, replace, subject)
>>> multisub([('hi', 'bye'), ('bye', 'hi')], 'hi and bye')
'bye and hi'
For the special case of 1-character patterns and 1- or 0-character replacements, use str.translate with a table built by str.maketrans (string.maketrans in Python 2).
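The 0-character case (deleting characters) can be sketched with the optional third argument of maketrans. This assumes Python 3; the sample string is made up for illustration.

```python
# Assumes Python 3. The first two arguments map characters one-to-one;
# the third lists characters to delete.
table = str.maketrans("ABC", "123", "DE")

result = "ABCDE and FADE".translate(table)
print(result)  # 123 and F1
```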
One way I do it is with an associative array (a dictionary). Here is an example of the replacements I use, via a regular expression, when getting a file ready for LaTeX.
import re
def escapeTexString(string):  # Returns TeX-friendly string
    rep = {  # define desired replacements in this dictionary (mapping)
        '&': r'\&',
        '%': r'\%',
        '#': r'\#',
        '_': r'\_',
        '{': r'\{',  # REGEX Special
        '}': r'\}',  # REGEX Special
        '~': r'\char"007E{}',  # LaTeX Special
        '$': r'\$',  # REGEX Special
        '\\': r'\char"005C{}',  # REGEX/LaTeX Special (a lone backslash must be written '\\')
        '^': r'\char"005E{}',  # REGEX/LaTeX Special
        '"': r'\char"FF02{}'
    }
    # use these two lines to do the replacement (could be shortened to one line)
    pattern = re.compile("|".join(map(re.escape, rep.keys())))  # Create single pattern object (key to simultaneous replacement)
    new_string = pattern.sub(lambda match: rep[match.group(0)], string)
    return new_string
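One caveat with the single-pattern approach: regex alternation tries alternatives from left to right, so if one key is a prefix of another (say 'a' and 'ab'), the shorter key can win. A hedged sketch of one common workaround is to sort the keys longest first. The function name replace_all and the sample mapping are hypothetical and not part of the answer above.

```python
import re

def replace_all(text, rep):
    """Replace every key of rep found in text with its value, in one pass."""
    if not rep:
        return text  # an empty pattern would match everywhere
    # Longest keys first, so 'ab' is tried before 'a'.
    keys = sorted(rep, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return pattern.sub(lambda m: rep[m.group(0)], text)

result = replace_all("ab a b", {"a": "1", "ab": "2", "b": "3"})
print(result)  # 2 1 3
```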
I think it could be a little more efficient:
mapping = { 'A':'1', 'B':'2', 'C':'3', 'D':'4', 'E':'5'}
my_string = "".join([mapping[c] if c in mapping else c for c in my_string])
I suggest benchmarking with “timeit” on realistic inputs, since the result depends on the length of “my_string”.
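A benchmark along those lines might be sketched as follows. The function names, the sample input and the repetition count are all assumptions chosen for illustration; timings depend heavily on the machine and on the real data, so no numbers are shown here.

```python
import timeit

mapping = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5'}
table = str.maketrans(mapping)  # Python 3: build a translation table from the dict
sample = "ABCDE xyz " * 1000    # hypothetical input; substitute your real data

def with_loop(s):
    # One str.replace call per mapping entry.
    for k, v in mapping.items():
        s = s.replace(k, v)
    return s

def with_join(s):
    # Character-by-character lookup, rebuilt with join.
    return "".join(mapping.get(c, c) for c in s)

def with_translate(s):
    # Single pass using the precomputed table.
    return s.translate(table)

for fn in (with_loop, with_join, with_translate):
    seconds = timeit.timeit(lambda: fn(sample), number=100)
    print(f"{fn.__name__}: {seconds:.4f} s")
```

All three functions return the same result for this mapping, so the comparison is only about speed.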
You can do it in one line using Pandas.
import pandas as pd
my_string = "A B C test"
my_string = pd.DataFrame([my_string])[0].replace(["A", "B", "C", "D", "E"], ['1', '2', '3', '4', '5'], regex=True)[0]
print(my_string)  # prints: 1 2 3 test