Python: Replacing backslashes to avoid escape sequences in string
Question:
I'm trying to replace the single backslashes I get within a string with double backslashes, because sometimes the "backslash + character" combination creates an escape sequence. I have tried various approaches (mostly from other Stack Overflow questions), but nothing has given me the correct result so far.
Example:
s = "\aa, \bb, \cc, \dd"
string.replace(s,"\\","\\\\")
replaces the first a and b with special characters (I can't get the exact result to paste here):
@a,@b,\\cc,\\dd
print s.encode("string_escape")
produces
\x07a,\x08b,\\cc,\\dd
(same for "unicode-escape")
using this function
escape_dict={'\a':r'\a',
             '\b':r'\b',
             '\c':r'\c',
             '\f':r'\f',
             '\n':r'\n',
             '\r':r'\r',
             '\t':r'\t',
             '\v':r'\v',
             '\'':r'\'',
             '\"':r'\"',
             '\0':r'\0',
             '\1':r'\1',
             '\2':r'\2',
             '\3':r'\3',
             '\4':r'\4',
             '\5':r'\5',
             '\6':r'\6',
             '\7':r'\7',
             '\8':r'\8',
             '\9':r'\9'}
def raw(text):
    """Returns a raw string representation of text"""
    new_string=''
    for char in text:
        try: new_string+=escape_dict[char]
        except KeyError: new_string+=char
    return new_string
produces
\7a,\bb,\cc,\dd
and using this function
import re
import codecs
ESCAPE_SEQUENCE_RE = re.compile(r'''
    ( \\U........      # 8-digit hex escapes
    | \\u....          # 4-digit hex escapes
    | \\x..            # 2-digit hex escapes
    | \\[0-7]{1,3}     # Octal escapes
    | \\N\{[^}]+\}     # Unicode characters by name
    | \\[\\'"abfnrtv]  # Single-character escapes
    )''', re.UNICODE | re.VERBOSE)
def decode_escapes(s):
    def decode_match(match):
        return codecs.decode(match.group(0), 'unicode-escape')
    return ESCAPE_SEQUENCE_RE.sub(decode_match, s)
returns the string with special characters again
@a,@b,\\cc,\\dd
The actual strings I need to convert would be something like "GroupAGroup2Layer1"
Answers:
In general I agree with Klaus's comment, though that's not always a possibility.
The quick answer is that you can write the literal as a raw string: r'\aa, \bb, \cc, \dd'.
I found more information here.
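To illustrate why the raw-string prefix helps, here is a minimal Python 3 sketch (the variable names are only illustrative). Escape sequences are resolved when the literal is parsed, so the regular string never contains a backslash in front of a or b, and there is nothing left for a later replace to find.

```python
# Regular literal: \a and \b become control characters at parse time.
# \c and \d are not valid escapes, so they are spelled \\c and \\d here
# to get the same string without triggering a warning in Python 3.
regular = "\aa, \bb, \\cc, \\dd"

# Raw literal: every backslash is kept as an ordinary character.
raw = r"\aa, \bb, \cc, \dd"

print(repr(regular))        # '\x07a, \x08b, \\cc, \\dd'
print(repr(raw))            # '\\aa, \\bb, \\cc, \\dd'
print(regular.count("\\"))  # 2
print(raw.count("\\"))      # 4
```

The raw prefix only affects literals written in source code. A string read from a file or from user input already keeps its backslashes and needs no conversion.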
The less happy answer, if that isn't a possibility, is to replace each resulting control character individually, like this:
s = '\aa, \bb, \cc, \dd'
string.replace(s,"\x07","\\a")
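The same idea can be extended to several control characters at once. The following is a rough Python 3 sketch, not a definitive solution: the names CONTROL_TO_ESCAPE and restore_backslashes are hypothetical, and it uses a dictionary lookup per character rather than the Python 2 string.replace function shown above.

```python
# Hypothetical mapping from control characters produced by accidental
# escapes back to their two-character backslash spelling.
CONTROL_TO_ESCAPE = {
    "\a": r"\a",
    "\b": r"\b",
    "\f": r"\f",
    "\n": r"\n",
    "\r": r"\r",
    "\t": r"\t",
    "\v": r"\v",
}


def restore_backslashes(text):
    """Replace each known control character with backslash + letter."""
    return "".join(CONTROL_TO_ESCAPE.get(ch, ch) for ch in text)


# \\c and \\d are written with doubled backslashes to avoid
# invalid-escape warnings; the resulting string is the same.
s = "\aa, \bb, \\cc, \\dd"
restored = restore_backslashes(s)
print(restored)  # \aa, \bb, \cc, \dd
```

This cannot distinguish an intentional newline or tab from an accidental one, and it does not cover octal or hex escapes such as \1 or \x41, so it only suits data where control characters are never legitimate.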