Convert octal representation of UTF-8
Question:
I have a variable like this:
>>> s = '\\320\\227\\320\\264\\320\\260\\320\\275\\320\\270\\320\\265 \\320\\261\\321\\213\\320\\262\\321\\210\\320\\265\\320\\271'
>>> print(s)
\320\227\320\264\320\260\320\275\320\270\320\265 \320\261\321\213\320\262\321\210\320\265\320\271
This contains the octal escape representations of the UTF-8 encoding of the string “Здание бывшей” (octal 320 227 = hex D0 97 = UTF-8 for “З”). How can I decode this string to “Здание бывшей”?
Answers:
This is a bit of a hack.
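s = '\\320\\227\\320\\264\\320\\260\\320\\275\\320\\270\\320\\265 \\320\\261\\321\\213\\320\\262\\321\\210\\320\\265\\320\\271'
# Split on the literal backslash, drop the empty first piece, and parse each piece as an octal byte value.
b = bytes([int(i, 8) for i in s.split("\\")[1:]])
print(b.decode("utf8"))
yields: Зданиебывшей (the space is lost, because int() ignores the trailing whitespace in "265 ")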
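The split-based approach only handles strings made up entirely of escapes; other characters, such as the space here, are dropped or make int() fail. A minimal sketch of a variant that leaves non-escaped characters alone is given here, assuming every escape is a backslash followed by exactly three octal digits (000 to 377). The helper name decode_octal_escapes is made up for illustration.

```python
import re

s = '\\320\\227\\320\\264\\320\\260\\320\\275\\320\\270\\320\\265 \\320\\261\\321\\213\\320\\262\\321\\210\\320\\265\\320\\271'

def decode_octal_escapes(text: str) -> str:
    """Hypothetical helper: turn literal \\ooo escapes into bytes, then decode as UTF-8."""
    # Encode the literal text first so non-escaped characters keep their own bytes.
    raw = text.encode("utf-8")
    # Replace each backslash + three octal digits with the single byte it names.
    raw = re.sub(rb"\\([0-3][0-7]{2})", lambda m: bytes([int(m.group(1), 8)]), raw)
    return raw.decode("utf-8")

print(decode_octal_escapes(s))  # Здание бывшей
```

Working on bytes rather than str means the space passes through the substitution unchanged, and it only becomes text again in the final decode.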
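Or use the codecs module.
import codecs
# escape_decode returns a (bytes, length consumed) tuple; it is not part of the documented codecs API.
b2 = codecs.escape_decode(s)[0]
print(b2.decode("utf8"))
This yields Здание бывшей, with the space between the two words preserved.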
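Since codecs.escape_decode is undocumented, a similar result can probably be had with documented codecs only. The following is a sketch, assuming the input string is pure ASCII (backslash escapes plus plain characters); it round-trips through unicode_escape and latin-1 to recover the raw bytes.

```python
s = '\\320\\227\\320\\264\\320\\260\\320\\275\\320\\270\\320\\265 \\320\\261\\321\\213\\320\\262\\321\\210\\320\\265\\320\\271'

# unicode_escape turns each \ooo escape into the code point with that value (U+0000 to U+00FF).
# latin-1 then maps each of those code points back to the single byte with the same value.
raw = s.encode("ascii").decode("unicode_escape").encode("latin-1")

# The recovered bytes are the UTF-8 encoding of the original text.
text = raw.decode("utf-8")
print(text)  # Здание бывшей
```

If the input can contain non-ASCII characters, the first encode("ascii") step raises UnicodeEncodeError, so this variant fits only the all-ASCII case shown in the question.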