Conversion of strings like \uXXXX in python

Question:

I receive a string like this from a third-party service:

>>> s
'\u0e4f\u032f\u0361\u0e4f'

I know that this string actually contains sequences of a single backslash, lowercase u etc. How can I convert the string such that the '\u0e4f' is replaced by 'u0e4f' (i.e. '๏'), etc.? The result for this example input should be '๏̯͡๏'.

Asked By: Soid

||

Answers:

In 2.x:

>>> u'\u0e4f\u032f\u0361\u0e4f'.decode('unicode-escape')
u'u0e4fu032fu0361u0e4f'
>>> print u'\u0e4f\u032f\u0361\u0e4f'.decode('unicode-escape')
๏̯͡๏

There’s an interesting list of encodings supported by .encode() and .decode() methods. Those magic ones in the second table include the unicode_escape.

Answered By: Tuttle

Python3:

bytes("\u0e4f\u032f\u0361\u0e4f", "ascii").decode("unicode-escape")
Answered By: Ilya Kharlamov
Categories: questions Tags: ,
Answers are sorted by their score. The answer accepted by the question owner as the best is marked with
at the top-right corner.