How to decode UTF-8 octal escapes stored in a Python string
Question:
I have a string that contains UTF-8 bytes written as octal escapes:
s = '\346\235\216\346\265\267\347\216\211'
I need to decode it, but the only way I have found is:
result = eval(bytes(f"b'{s}'", encoding="utf8")).decode('utf-8')
Using eval() is not safe, so is there a better way?
Answers:
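If s is already a bytes object, you can do decoded_string = s.decode("utf8"). A Python 3 str has no decode() method, so this does not work on the str shown in the question.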
Use ast.literal_eval(). It is safe because it only evaluates Python literals and cannot run arbitrary code. You also don't need to call bytes(), since evaluating a b'...' literal already returns a bytes object.
result = ast.literal_eval(f"b'{s}'").decode('utf-8')
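For reference, a self-contained version of the ast.literal_eval() approach. This is a sketch under assumptions: s is taken to hold literal backslash characters (written here as a raw string), and decode_octal_utf8 is a hypothetical helper name. The isinstance check is there because input containing a quote character could make the expression evaluate to something other than bytes.

```python
import ast

# Assumption: s holds literal backslash characters (a raw string here).
s = r'\346\235\216\346\265\267\347\216\211'


def decode_octal_utf8(text):
    """Hypothetical helper: parse text as a bytes-literal body, then decode as UTF-8."""
    # literal_eval only accepts Python literals, so no arbitrary code can run.
    raw = ast.literal_eval(f"b'{text}'")
    # A stray quote in text could turn the expression into e.g. a tuple.
    if not isinstance(raw, bytes):
        raise ValueError("input did not evaluate to a bytes literal")
    return raw.decode('utf-8')


result = decode_octal_utf8(s)
print(result)  # 李海玉
```

Malformed input raises ValueError or SyntaxError instead of executing anything, which is the main gain over eval().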
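This might be what you are hoping for. Note that unicode-escape alone is not enough: it turns each escape into a single character below U+0100, so the result still has to go back to bytes via Latin-1 before it is decoded as UTF-8. Assuming s holds literal backslashes:
s.encode('ascii').decode('unicode-escape').encode('latin-1').decode('utf-8')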
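The codec route can be unpacked step by step. A minimal sketch, assuming s contains literal backslash characters and otherwise only ASCII text; unescape_utf8 is a hypothetical helper name, not a library function.

```python
# Assumption: s holds literal backslash characters (a raw string here).
s = r'\346\235\216\346\265\267\347\216\211'


def unescape_utf8(text):
    """Hypothetical helper: turn octal escapes into bytes, then decode as UTF-8."""
    # Step 1: interpret the escapes; each one becomes a code point below 256.
    # encode('ascii') raises UnicodeEncodeError if text has non-ASCII characters.
    chars = text.encode('ascii').decode('unicode_escape')
    # Step 2: Latin-1 maps code points 0-255 back to identical byte values.
    raw = chars.encode('latin-1')
    # Step 3: decode the recovered bytes as UTF-8.
    return raw.decode('utf-8')


print(unescape_utf8(s))  # 李海玉
```

If s was instead written as a normal (non-raw) literal, Python has already processed the escapes, and s.encode('latin-1').decode('utf-8') on its own should be sufficient.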