Convert float to string in Numba (Python, NumPy array)
Question:
I am running a @nb.njit function within which I am trying to put a float, converted to a string, into a string array.
import numpy as np
import numba as nb

@nb.njit(nogil=True)
def func():
    my_array = np.empty(6, dtype=np.dtype("U20"))
    my_array[0] = np.str(2.35646)
    return my_array

if __name__ == '__main__':
    a = func()
    print(a)
I am getting the following error:
numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<class 'str'>) with argument(s) of type(s): (float64)
Which function am I supposed to use to convert a float to a string within Numba?
Answers:
The numpy.str function is not supported so far. A list of all the supported NumPy functions is available on Numba’s website.
The built-in str is not supported either. This can be checked on the supported Python features page.
The only way to do what you are trying to do would be to write a function that converts a float to a string using only the Python and NumPy features supported by Numba.
Before going in this direction, I would nevertheless reconsider whether the floats really need to be converted to strings inside the jitted code. The conversion may not be very efficient, and its overhead may cancel out the benefit of jitting the function.
Of course, this is hard to tell without knowing more about the project.
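One way to act on this advice, shown as a minimal sketch and not as the only option: keep the jitted function purely numeric and do the string conversion afterwards in ordinary Python/NumPy. The function name compute and the no-op njit stand-in (used only when Numba is not installed) are assumptions made for this illustration.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: a no-op stand-in so it still runs without Numba.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(nogil=True)
def compute():
    # Only numeric work happens inside the jitted function.
    out = np.zeros(6)
    out[0] = 2.35646
    return out


values = compute()
# The float-to-string conversion is done outside the jitted code.
as_text = values.astype("U20")
print(as_text)
```

This avoids the unsupported str() call entirely, at the cost of building the string array in interpreted code.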
I wanted a float to string conversion that delivers a string like "23.45" with two fraction digits.
My solution is this. Maybe it helps someone.
import math

def floatToString(floatnumber: float) -> str:
    # Work on the absolute value so that floor() also behaves for negative input.
    sign = "-" if floatnumber < 0 else ""
    floatnumber = abs(floatnumber)
    whole = math.floor(floatnumber)
    # Truncate (not round) the fractional part to two digits.
    frac = math.floor((floatnumber % 1) * 100.0)
    # Zero-pad so that 23.05 gives "23.05" rather than "23.5".
    frac_str = str(frac)
    if frac < 10:
        frac_str = "0" + frac_str
    return sign + str(whole) + "." + frac_str
Be aware of rounding issues, though; it was enough for me.
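For reference, here is a hedged variant of the same idea that rounds to the nearest digit and takes the number of fraction digits as a parameter. It relies on str() accepting integers inside jitted code, which only holds for later Numba versions. The name float_to_fixed and the no-op njit stand-in (used only when Numba is not installed) are assumptions made for this sketch.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: a no-op stand-in so it still runs without Numba.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def float_to_fixed(value, digits):
    # Handle the sign separately so the arithmetic below sees a non-negative value.
    sign = ""
    if value < 0.0:
        sign = "-"
        value = -value
    # Build 10 ** digits with integer arithmetic.
    scale = 1
    for _ in range(digits):
        scale *= 10
    # Round half up to the requested number of fraction digits.
    scaled = int(math.floor(value * scale + 0.5))
    whole = scaled // scale
    frac = scaled % scale
    # Left-pad the fraction with zeros, e.g. 5 -> "05" for two digits.
    frac_str = str(frac)
    while len(frac_str) < digits:
        frac_str = "0" + frac_str
    return sign + str(whole) + "." + frac_str


print(float_to_fixed(23.456, 2))
print(float_to_fixed(23.05, 2))
```

The usual floating-point caveats still apply: a value that sits exactly on a rounding boundary in decimal may be stored slightly below or above it in binary, and very large values will overflow the scaled integer.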
While this does not address the float-to-str conversion, I think it is worth mentioning that with later versions of Numba you can use str() on integers:
import numba as nb

@nb.njit
def int2str(value):
    return str(value)

n = 10
print([int2str(x) for x in range(-n, n)])
# ['-10', '-9', '-8', '-7', '-6', '-5', '-4', '-3', '-2', '-1', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

n = 1000
%timeit [str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 391 µs per loop
%timeit [int2str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 813 µs per loop
This is slower than pure Python, but it may come in handy when the conversion has to happen inside njit()-accelerated code.
The same functionality is not implemented yet for floats.
A relatively generic approach for floats is provided below:
import math
import numba as nb

@nb.njit
def cut_trail(f_str):
    cut = 0
    for c in f_str[::-1]:
        if c == "0":
            cut += 1
        else:
            break
    if cut == 0:
        for c in f_str[::-1]:
            if c == "9":
                cut += 1
            else:
                cut -= 1
                break
    if cut > 0:
        f_str = f_str[:-cut]
    if f_str == "":
        f_str = "0"
    return f_str

@nb.njit
def float2str(value):
    if value == 0.0:
        return "0.0"
    elif value < 0.0:
        return "-" + float2str(-value)
    else:
        max_digits = 16
        min_digits = -4
        e10 = math.floor(math.log10(value)) if value != 0.0 else 0
        if min_digits < e10 < max_digits:
            i_part = math.floor(value)
            f_part = math.floor((1 + value % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits - e10])
            return i_str + "." + f_str
        else:
            m10 = value / 10.0 ** e10
            exp_str_len = 4
            i_part = math.floor(m10)
            f_part = math.floor((1 + m10 % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits])
            e_str = str(e10)
            if e10 >= 0:
                e_str = "+" + e_str
            return i_str + "." + f_str + "e" + e_str
This is less precise and comparatively slower (by a ~3x factor) than pure Python:
numbers = 0.0, 1.0, 1.000001, -2000.000014, 1234567890.12345678901234567890, 1234567890.12345678901234567890e10, 1234567890.12345678901234567890e-30, 1.234e-200, 1.234e200
k = 32
for number in numbers:
    print(f"{number!r:{k}} {str(number)!r:{k}} {float2str(number)!r:{k}}")
# 0.0 '0.0' '0.0'
# 1.0 '1.0' '1.0'
# 1.000001 '1.000001' '1.000001'
# -2000.000014 '-2000.000014' '-2000.0000139'
# 1234567890.1234567 '1234567890.1234567' '1234567890.123456'
# 1.2345678901234567e+19 '1.2345678901234567e+19' '1.234567890123456e+19'
# 1.2345678901234568e-21 '1.2345678901234568e-21' '1.234567890123457e-21'
# 1.234e-200 '1.234e-200' '1.234e-200'
# 1.234e+200 '1.234e+200' '1.2339e+200'

%timeit -n 10 -r 10 [float2str(x) for x in numbers]
# 10 loops, best of 10: 18.1 µs per loop
%timeit -n 10 -r 10 [str(x) for x in numbers]
# 10 loops, best of 10: 6.5 µs per loop
but it may be used as a workaround until str() gets implemented for float arguments natively in Numba.
Note that the edge cases are handled relatively coarsely, and an error in the last 1-2 digits is relatively frequent.
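One edge case worth guarding against explicitly is non-finite input, since the approach above takes the base-10 logarithm of the value and NaN or infinity do not have a meaningful decimal exponent. The helper below is only a sketch of such a guard. The name special_float_str, its convention of returning an empty string for finite input, and the no-op njit stand-in (used only when Numba is not installed) are assumptions made for this illustration.

```python
import math

try:
    from numba import njit
except ImportError:
    # Assumption for this sketch: a no-op stand-in so it still runs without Numba.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit
def special_float_str(value):
    # Return "nan", "inf" or "-inf" for non-finite input, and "" otherwise.
    if math.isnan(value):
        return "nan"
    if math.isinf(value):
        if value > 0.0:
            return "inf"
        return "-inf"
    return ""


print(special_float_str(float("nan")))
print(special_float_str(float("-inf")))
```

A conversion routine could call this first and return the result whenever it is non-empty, falling through to the digit-based logic only for finite values.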