Decode Hex String in Python 3

Question:

In Python 2, converting the hexadecimal form of a string into the corresponding unicode was straightforward:

comments.decode("hex")

where the variable ‘comments’ is a part of a line in a file (the rest of the line does not need to be converted, as it is represented only in ASCII).

Now in Python 3, however, this doesn’t work (I assume because of the bytes/string vs. string/unicode switch). I feel like there should be a one-liner in Python 3 to do the same thing, rather than reading the entire line as a series of bytes (which I don’t want to do) and then converting each part of the line separately. If it’s possible, I’d like to read the entire line as a unicode string (because the rest of the line is in unicode) and only convert this one part from a hexadecimal representation.

Asked By: chimeracoder


Answers:

Something like:

>>> bytes.fromhex('4a4b4c').decode('utf-8')
'JKL'

Just put the actual encoding you are using.
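
For the case in the question, where only one part of an already-decoded line is hex, a minimal sketch might look like this. The tab-separated layout, the field index, and the helper name decode_hex_field are assumptions for illustration; the question does not say how the line is structured.

```python
def decode_hex_field(line, index, sep="\t", encoding="utf-8"):
    """Split `line` and decode only the hex-encoded field at `index`.

    The separator, index, and encoding are assumptions; adjust to the real file.
    """
    fields = line.rstrip("\n").split(sep)
    # bytes.fromhex turns the hex digits into raw bytes;
    # .decode then turns those bytes into a str.
    fields[index] = bytes.fromhex(fields[index]).decode(encoding)
    return fields


print(decode_hex_field("42\talice\t68656c6c6f20776f726c64\n", 2))
# ['42', 'alice', 'hello world']
```

The rest of the line stays a normal str the whole time; only the one field goes through bytes.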

Answered By: unbeli

import codecs

decode_hex = codecs.getdecoder("hex_codec")

# for a list of hex strings
msgs = [decode_hex(msg)[0] for msg in msgs]

# for a string
string = decode_hex(string)[0]
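
Note that in Python 3 this decoder returns a (bytes, length consumed) tuple, so the [0] element is bytes rather than str, and one more .decode() call is needed to get text. A small sketch, assuming the data is UTF-8 (the variable names are only illustrative):

```python
import codecs

decode_hex = codecs.getdecoder("hex_codec")

# The decoder returns a tuple; the first element is the decoded bytes.
raw, consumed = decode_hex(b"4a4b4c")
# Assumes the bytes are UTF-8; use the encoding your data really has.
text = raw.decode("utf-8")

print(raw, text)
```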

Answered By: Niklas

The answers from @unbeli and @Niklas are good, but @unbeli’s answer raises UnicodeDecodeError for hex strings whose bytes are not valid in the chosen encoding, and it is desirable to do the decoding without importing an extra module (codecs). The following should work (but will not be very efficient for large strings):

>>> result = bytes.fromhex((lambda s: ("%s%s00" * (len(s)//2)) % tuple(s))('4a82fdfeff00')).decode('utf-16-le')
>>> result == '\x4a\x82\xfd\xfe\xff\x00'
True

Basically, it works around bytes that are not valid UTF-8 by padding each byte with a zero byte and decoding as UTF-16-LE, so every byte becomes the code point with the same value.
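
Mapping each byte to the code point with the same value is also what the latin-1 (ISO-8859-1) codec does for all 256 byte values, so decoding as latin-1 should give the same result without the padding step. A sketch comparing the two on the same input:

```python
hex_string = "4a82fdfeff00"

# Padding approach: append 00 after every byte, then decode as UTF-16-LE.
padded = ("%s%s00" * (len(hex_string) // 2)) % tuple(hex_string)
via_utf16 = bytes.fromhex(padded).decode("utf-16-le")

# latin-1 maps byte value n directly to code point n, so it cannot fail.
via_latin1 = bytes.fromhex(hex_string).decode("latin-1")

print(via_utf16 == via_latin1)
# True
```

Either way the result is only meaningful text if the original bytes really were meant as one character per byte.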

Answered By: HackerBoss