Conversion of strings like \\uXXXX in Python
Question:
I receive a string like this from a third-party service:
>>> s
'\\u0e4f\\u032f\\u0361\\u0e4f'
I know that this string actually contains literal sequences of a single backslash, a lowercase u, and four hex digits. How can I convert the string so that each '\\u0e4f' is replaced by '\u0e4f' (i.e. '๏'), and so on? The result for this example input should be '๏̯͡๏'.
Answers:
In Python 2.x:
>>> u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
u'\u0e4f\u032f\u0361\u0e4f'
>>> print u'\\u0e4f\\u032f\\u0361\\u0e4f'.decode('unicode-escape')
๏̯͡๏
There’s an interesting list of encodings supported by the .encode() and .decode() methods. The special-purpose ones in the second table include unicode_escape.
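As a small illustration of that codec (written for Python 3, and not part of the original answer), the sketch below suggests that the spellings unicode_escape and unicode-escape name the same codec and that it works in both directions. The variable names are only illustrative.

```python
import codecs

# Raw bytes: a backslash, a 'u', and four hex digits (6 bytes in total).
raw = b"\\u0e4f"

# Both spellings resolve to the same codec.
with_underscore = codecs.decode(raw, "unicode_escape")
with_hyphen = codecs.decode(raw, "unicode-escape")

# Encoding goes the other way: the character becomes an ASCII escape again.
round_trip = with_underscore.encode("unicode_escape")
```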
Python 3:
bytes("\\u0e4f\\u032f\\u0361\\u0e4f", "ascii").decode("unicode-escape")
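One caveat: encoding to ASCII raises UnicodeEncodeError if the input already contains non-ASCII characters alongside the escapes. A minimal sketch of a more tolerant variant follows, assuming Python 3; the helper name decode_unicode_escapes is made up for illustration. It encodes with Latin-1 and the backslashreplace error handler, so any character outside Latin-1 is itself turned into an escape before decoding.

```python
def decode_unicode_escapes(text):
    """Turn literal backslash escapes such as \\u0e4f into the characters they name."""
    # unicode_escape treats non-escape bytes as Latin-1, so encode with Latin-1;
    # backslashreplace converts anything outside Latin-1 into an escape first.
    raw = text.encode("latin-1", "backslashreplace")
    return raw.decode("unicode_escape")


s = "\\u0e4f\\u032f\\u0361\\u0e4f"   # 24 characters: four literal escapes
decoded = decode_unicode_escapes(s)  # 4 characters
print(decoded)
```

Note that this interprets every backslash escape in the input (for example \n or \x41), not only the \uXXXX form, so it is only appropriate when the whole string is known to be escaped this way.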