Convert hex-string to integer with Python
Question:
Note that the problem is not hex to decimal, but a string of hex values to an integer.
Say I've got a string from a hexdump (e.g. '6c 02 00 00'), so I need to convert that into actual hex first, and then get the integer it represents (this particular one would be 620 as both an int16 and an int32).
I tried a lot of things but only confused myself more. Is there a quick way to do such a conversion in Python (preferably 3.x)?
Answers:
Update: From Python 3.7 on, bytes.fromhex ignores all ASCII whitespace in its input (earlier Python 3 versions skipped only plain spaces), so the straightforward thing to do is to parse the string into a bytes object and then read that as an integer:
In [10]: int.from_bytes(bytes.fromhex("6c 02 00 00"), byteorder="little")
Out[10]: 620
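As a minimal sketch of that approach, the two calls can be wrapped in a small helper. The name hexdump_to_int and its parameters are hypothetical, chosen only for illustration; the helper does nothing beyond calling bytes.fromhex and int.from_bytes:

```python
def hexdump_to_int(text, byteorder="little", signed=False):
    """Parse hex byte pairs such as '6c 02 00 00' into an int.

    Hypothetical helper for illustration; it only wraps two built-in calls.
    """
    raw = bytes.fromhex(text)  # spaces between byte pairs are skipped
    return int.from_bytes(raw, byteorder=byteorder, signed=signed)


print(hexdump_to_int("6c 02 00 00"))         # 620
print(hexdump_to_int("6c 02"))               # 620, read as a 16-bit value
print(hexdump_to_int("6c 02 00 00", "big"))  # 1812070400
```

Because int.from_bytes takes the width from the length of the bytes object, the same call covers the int16 and int32 cases from the question.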
Original answer:
Not only is that a string, it is also in little-endian order, meaning that just removing the spaces and calling int(xx, 16) will not work. Nor are the byte values available as 4 actual numbers in the 0-255 range (in which case struct.unpack would work directly).
I think a nice approach is to swap the components back into "human readable" order, and then use the int call:
number = int("".join("6c 02 00 00".split()[::-1]), 16)
What happens there: the first part of the expression is the split call. It breaks the string at the spaces and returns a list of four strings with two digits each. The [::-1] slice comes next. It means roughly "give me the elements of the preceding sequence, starting at the end and going back one element at a time", which is a common Python idiom for reversing a sequence.
This reversed sequence is passed to "".join(...), which concatenates the elements using the empty string as the separator; the result of this call is "0000026c". With this value, we just call Python's int class, which accepts an optional second parameter giving the base in which the number in the first argument should be interpreted.
>>> int("".join("6c 02 00 00".split()[::-1]), 16)
620
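The same expression can be unrolled to show each intermediate value described above. This is only a sketch; the variable names are made up for illustration:

```python
dump = "6c 02 00 00"

pairs = dump.split()                 # ['6c', '02', '00', '00']
reversed_pairs = pairs[::-1]         # ['00', '00', '02', '6c']
hex_text = "".join(reversed_pairs)   # '0000026c'
number = int(hex_text, 16)           # 620

print(pairs, reversed_pairs, hex_text, number)
```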
Another option is to cumulatively add the conversion of each pair of digits, shifted according to the weight of its position. This can also be done in a single expression using reduce, though a 4-line Python for loop would be more readable:
>>> from functools import reduce #not needed in Python2.x
>>> reduce(lambda x, y: x + (int(y[1], 16)<<(8 * y[0]) ), enumerate("6c 02 00 00".split()), 0)
620
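For comparison, the for loop mentioned above might look like the following sketch (variable names are illustrative):

```python
dump = "6c 02 00 00"

total = 0
for position, pair in enumerate(dump.split()):
    # Byte 0 is the least significant, so byte n is shifted left by 8*n bits.
    total += int(pair, 16) << (8 * position)

print(total)  # 620
```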
Update: The OP just said he does not actually have the spaces in the string. In that case, one can use just about the same methods, but taking the digits two at a time instead of using the split() call:
reduce(lambda x, y: x + (int(y[1], 16)<<(8 * y[0]//2) ), ((i, a[i:i+2]) for i in range(0, len(a), 2)) , 0)
(where a is the variable holding your digits, of course).
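Written as a loop, the same idea is perhaps easier to follow. In this sketch a is given a sample value, and the shift is computed as 8 * (i // 2), which for even i equals the 8 * i // 2 used in the one-liner:

```python
a = "6c020000"  # sample hex digits with no separators

total = 0
for i in range(0, len(a), 2):
    pair = a[i:i + 2]      # two hex digits make one byte
    shift = 8 * (i // 2)   # byte index times 8 bits
    total += int(pair, 16) << shift

print(total)  # 620
```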
Or, convert it to an actual 4-byte number in memory using the hex codec, and unpack the number with struct. This may be more semantically correct for your code:
import codecs
import struct
struct.unpack("<I", codecs.decode("6c020000", "hex") )[0]
So the approach here is to turn each pair of digits into an actual byte in memory, in the bytes object returned by the codecs.decode call, and then have struct read the 4 bytes in that buffer as a single 32-bit integer.
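Since the question mentions reading the value as either an int16 or an int32, here is a small sketch of both using the same decoded buffer. The "<H" format (unsigned 16-bit) and the slice of the first two bytes are additions for illustration, not part of the original answer:

```python
import codecs
import struct

raw = codecs.decode("6c020000", "hex")  # 4 raw bytes: 6c 02 00 00

as_uint32 = struct.unpack("<I", raw)[0]      # all four bytes
as_uint16 = struct.unpack("<H", raw[:2])[0]  # first two bytes only

print(as_uint32, as_uint16)  # 620 620
```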
You can use unhexlify() to convert the hex string to its binary form, and then use struct.unpack() to decode the little-endian value into an int:
>>> from struct import unpack
>>> from binascii import unhexlify
>>> n = unpack('<i', unhexlify('6c 02 00 00'.replace(' ','')))[0]
>>> n
620
The format string '<i' means little-endian signed integer. You can substitute '<I' or '<L' for an unsigned int or unsigned long (both 4 bytes).
If the data does not contain spaces, this simplifies to:
>>> n = unpack('<i', unhexlify('6c020000'))[0]
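To illustrate the difference between the signed and unsigned format codes mentioned above, here is a short sketch using a different sample value, 'ffffffff', chosen because its top bit is set:

```python
from binascii import unhexlify
from struct import unpack

raw = unhexlify("ffffffff")  # sample value with the sign bit set

signed_value = unpack("<i", raw)[0]    # interpreted as two's complement
unsigned_value = unpack("<I", raw)[0]  # interpreted as a plain magnitude

print(signed_value, unsigned_value)  # -1 4294967295
```

For values below 2**31, such as 620, both format codes give the same result.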