Python: Replacing backslashes to avoid escape sequences in string

Question:

I´m trying to replace the single backslashes i get within a string with double backslashes, because sometimes the the “backslash+character” combination creates an escape sequence. I have tried various ways (mostly from other stackoverflow questions), but nothing gets me the correct results so far.

Example s = "aa, bb, cc, dd"


string.replace(s,"\","\\")

replaces the first a and b with special characters (can´t get pasting the exact result here to work?):

@a,@b,\cc,\dd

print s.encode("string_escape")

produces

x07a,x08b,\cc,\dd

(same for “unicode-escape”)


using this function

escape_dict={'a':r'a',
           'b':r'b',
           'c':r'c',
           'f':r'f',
           'n':r'n',
           'r':r'r',
           't':r't',
           'v':r'v',
           ''':r''',
           '"':r'"',
           '':r'',
           '1':r'1',
           '2':r'2',
           '3':r'3',
           '4':r'4',
           '5':r'5',
           '6':r'6',
           '7':r'7',
           '8':r'8',
           '9':r'9'}

def raw(text):
    """Returns a raw string representation of text"""
    new_string=''
    for char in text:
        try: new_string+=escape_dict[char]
        except KeyError: new_string+=char
    return new_string

produces

7a,bb,cc,dd

and using this function

import re
import codecs

ESCAPE_SEQUENCE_RE = re.compile(r'''
    ( \U........      # 8-digit hex escapes
    | \u....          # 4-digit hex escapes
    | \x..            # 2-digit hex escapes
    | \[0-7]{1,3}     # Octal escapes
    | \N{[^}]+}     # Unicode characters by name
    | \[\'"abfnrtv]  # Single-character escapes
    )''', re.UNICODE | re.VERBOSE)

def decode_escapes(s):
    def decode_match(match):
        return codecs.decode(match.group(0), 'unicode-escape')

    return ESCAPE_SEQUENCE_RE.sub(decode_match, s)

returns the string with special characters again

 @a,@b,\cc,\dd

The actual strings i need to convert would be something like "GroupAGroup2Layer1"

Asked By: rr5577

||

Answers:

In general I agree with Klaus’s comment. Though that’s not always a possibility.

The quick answer is that you can do this: r’aa, bb, cc, dd’.

I found more information here.

The less happy answer if that isn’t a possibility is that you do your replacements as such:

s = 'aa, bb, cc, dd'
string.replace(s,"x07","\a")
Answered By: unflores
Categories: questions Tags: , , ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.