Python – Numpy – Converting a numpy array of hex strings to integers
Question:
I have a NumPy array of hex strings (e.g. ['9', 'A', 'B']) and want to convert them all to integers between 0 and 255. The only way I know how to do this is to use a for loop and append to a separate NumPy array.
import numpy as np
hexArray = np.array(['9', 'A', 'B'])
intArray = np.array([])
for value in hexArray:
    intArray = np.append(intArray, [int(value, 16)])
print(intArray) # output: [ 9. 10. 11.]
Is there a better way to do this?
Answers:
Using a list comprehension:
array1 = [int(value, 16) for value in hexArray]
print(array1)
output:
[9, 10, 11]
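Note that the comprehension returns a plain Python list. If a NumPy integer array is wanted, as in the question, one option is to wrap the result in np.array with an explicit dtype. This is a minimal sketch, assuming every value fits in the range 0 to 255; the names hex_array and int_array are only illustrative.

```python
import numpy as np

hex_array = np.array(['9', 'A', 'B'])

# Parse each string as base 16, then store the results as unsigned 8-bit integers.
# Assumption: every parsed value lies in 0..255, so uint8 is wide enough.
int_array = np.array([int(value, 16) for value in hex_array], dtype=np.uint8)

print(int_array)        # [ 9 10 11]
print(int_array.dtype)  # uint8
```

Unlike the np.append loop in the question, this builds the array once and keeps an integer dtype instead of floats.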
Alternative using map:
import functools
list(map(functools.partial(int, base=16), hexArray))
[9, 10, 11]
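A vectorized way using the array's view functionality (as written, it relies on single-character, uppercase hex strings and a little-endian byte layout):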
In [65]: v = hexArray.view(np.uint8)[::4]
In [66]: np.where(v>64,v-55,v-48)
Out[66]: array([ 9, 10, 11], dtype=uint8)
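Why this works: each element of a '<U1' array is stored as one 4-byte code point, so the uint8 view exposes four bytes per character and [::4] keeps the low byte of each. Subtracting 48 maps '0'-'9' to 0-9, and subtracting 55 maps 'A'-'F' to 10-15. The sketch below is a variation on that idea, not the code timed in this answer: it views each code point as a whole uint32 (so it does not depend on byte order) and also accepts lowercase digits. It assumes exactly one valid hex digit per element.

```python
import numpy as np

hex_array = np.array(['9', 'a', 'B', 'f'])  # dtype '<U1': one 4-byte code point each

# Reinterpret every code point as one integer; cast to int64 so the
# subtractions below cannot wrap around in unsigned arithmetic.
codes = hex_array.view(np.uint32).astype(np.int64)

# ASCII ranges: '0'-'9' are 48-57, 'A'-'F' are 65-70, 'a'-'f' are 97-102.
digits = np.where(
    codes >= 97, codes - 87,
    np.where(codes >= 65, codes - 55, codes - 48),
).astype(np.uint8)

print(digits)  # [ 9 10 11 15]
```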
Timings
Setup with the given sample scaled up by 1000x:
In [75]: hexArray = np.array(['9', 'A', 'B'])
In [76]: hexArray = np.tile(hexArray,1000)
# @tianlinhe's soln
In [77]: %timeit [int(value, 16) for value in hexArray]
1.08 ms ± 5.67 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# @FBruzzesi soln
In [78]: %timeit list(map(functools.partial(int, base=16), hexArray))
1.5 ms ± 40.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
# From this post
In [79]: %%timeit
...: v = hexArray.view(np.uint8)[::4]
...: np.where(v>64,v-55,v-48)
15.9 µs ± 294 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Try this. It uses a list comprehension to convert each hexadecimal string to an integer:
intArray = [int(hexNum, 16) for hexNum in list(hexArray)]
Here is another good one:
# np.frompyfunc can be used, for example, to add broadcasting to a built-in Python function
int_array = np.frompyfunc(int, 2, 1)  # int takes 2 inputs (string, base) and returns 1 output
int_array(hexArray, 16).astype(np.uint32)
If you want to know more about it: https://numpy.org/doc/stable/reference/generated/numpy.frompyfunc.html?highlight=frompyfunc#numpy.frompyfunc
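One detail worth knowing is that a function built with np.frompyfunc always returns an object-dtype array, which is why the astype call is needed. A small sketch on the sample from the question (hex_to_int is just an illustrative name):

```python
import numpy as np

# Wrap the built-in int(string, base) so it broadcasts over arrays.
hex_to_int = np.frompyfunc(int, 2, 1)

raw = hex_to_int(np.array(['9', 'A', 'B']), 16)
print(raw.dtype)              # object

# Convert the Python ints to a fixed-width integer dtype.
result = raw.astype(np.uint32)
print(result)                 # [ 9 10 11]
```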
Check out the speed:
import numpy as np
import functools
hexArray = np.array(['ffaa', 'aa91', 'b1f6'])
hexArray = np.tile(hexArray,1000)
def x_test(hexArray):
    v = hexArray.view(np.uint32)[::4]
    return np.where(v > 64, v - 55, v - 48)
int_array = np.frompyfunc(int, 2, 1)
%timeit -n 100 int_array(hexArray,16).astype(np.uint32)
%timeit -n 100 np.fromiter(map(functools.partial(int, base=16), hexArray),dtype=np.uint32)
%timeit -n 100 [int(value, 16) for value in hexArray]
%timeit -n 100 x_test(hexArray)
print(f'\n\n{int_array(hexArray,16).astype(np.uint32)=}\n{np.fromiter(map(functools.partial(int, base=16), hexArray),dtype=np.uint32)=}\n{[int(value, 16) for value in hexArray][:10]=}\n{x_test(hexArray)=}')
460 µs ± 2.42 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.25 ms ± 2.66 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.11 ms ± 6.56 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
16.8 µs ± 165 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)
int_array(hexArray,16).astype(np.uint32)=array([65450, 43665, 45558, ..., 65450, 43665, 45558], dtype=uint32)
np.fromiter(map(functools.partial(int, base=16), hexArray),dtype=np.uint32)=array([65450, 43665, 45558, ..., 65450, 43665, 45558], dtype=uint32)
[int(value, 16) for value in hexArray][:10]=[65450, 43665, 45558, 65450, 43665, 45558, 65450, 43665, 45558, 65450]
x_test(hexArray)=array([47, 42, 43, ..., 47, 42, 43], dtype=uint32)
Divakar's answer is the fastest but, unfortunately, it does not work for multi-digit hex numbers (at least for me): in the output above, x_test returns 47, 42, 43 instead of 65450, 43665, 45558.
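For what it is worth, the view idea can be extended to multi-digit strings by decoding every character and weighting the digits by powers of 16. The following is only a sketch under stated assumptions, not Divakar's code: it assumes a contiguous 1-D array in which every string has exactly the same number of valid hex characters (shorter strings are null-padded in memory and would decode incorrectly). The names width, weights and values are illustrative.

```python
import numpy as np

hex_array = np.array(['ffaa', 'aa91', 'b1f6'])

# Each character occupies 4 bytes, so itemsize // 4 is the string width.
width = hex_array.dtype.itemsize // 4

# One row per string, one column per character code (int64 avoids unsigned wrap-around).
codes = hex_array.view(np.uint32).reshape(len(hex_array), width).astype(np.int64)

# Map ASCII codes to digit values: '0'-'9', 'A'-'F' and 'a'-'f'.
digits = np.where(
    codes >= 97, codes - 87,
    np.where(codes >= 65, codes - 55, codes - 48),
)

# Positional weights 16**(width-1), ..., 16, 1 combine the digits of each row.
weights = 16 ** np.arange(width - 1, -1, -1)
values = digits @ weights

print(values)  # [65450 43665 45558]
```

The result matches the int(value, 16) outputs shown above for this sample; no timing is claimed for it here.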