convert float to string numba python numpy array

Question:

I am running an @nb.njit function in which I am trying to store a float in a string array.

import numpy as np
import numba as nb

@nb.njit(nogil=True)
def func():
    my_array = np.empty(6, dtype=np.dtype("U20"))
    my_array[0] = np.str(2.35646)
    return my_array


if __name__ == '__main__':
    a = func()
    print(a)

I am getting the following error:

numba.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Invalid use of Function(<class 'str'>) with argument(s) of type(s): (float64)

Which function should I use to convert a float to a string within Numba?

Asked By: Chapo

Answers:

The numpy.str function is not supported so far. A list of all the supported numpy functions is available on Numba’s website.

The built-in str is not supported either. This can be checked on the supported Python features page.

The only way to do this would be to write your own function that converts a float to a string, using only the Python and NumPy features that Numba supports.

Before going in this direction, I would nevertheless reconsider whether you need to convert floats into strings inside the jitted code at all. The conversion may not be very efficient, and its overhead could cancel out the benefit of jitting the function.

Of course, this is hard to tell without knowing more about the project.
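
One way to follow this advice, sketched under assumptions: keep the jitted function purely numeric and do the string conversion in ordinary Python afterwards. The names compute and as_strings are hypothetical, and the try/except fallback is there only so that the sketch still runs (without acceleration) where Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (an assumption for portability): a no-op decorator.
    def njit(*args, **kwargs):
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(nogil=True)
def compute():
    # Keep the jitted part purely numeric: no strings in here.
    values = np.empty(6, dtype=np.float64)
    for i in range(values.size):
        values[i] = 2.35646 * (i + 1)
    return values


def as_strings(values):
    # Plain Python / NumPy: convert the floats to a fixed-width string array.
    return values.astype("U20")


a = as_strings(compute())
print(a)
```

This assumes that 20 characters are wide enough for the values involved; longer representations would be truncated by the U20 dtype.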

Answered By: Jacques Gaudin

I wanted a float to string conversion that delivers a string like "23.45" with two fraction digits.
My solution is this. Maybe it helps someone.

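    import math

    def floatToString(floatnumber: float) -> str:
        # Handle the sign separately, because math.floor rounds negative
        # numbers away from zero.
        sign: str = "-" if floatnumber < 0 else ""
        floatnumber = abs(floatnumber)
        whole: int = math.floor(floatnumber)
        # Two fraction digits, truncated rather than rounded.
        frac: int = math.floor((floatnumber % 1) * 100.0)
        # Zero-pad single-digit fractions: frac == 5 must print as "05".
        fracString: str = str(frac)
        if frac < 10:
            fracString = "0" + fracString
        return sign + str(whole) + "." + fracString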

Be aware of rounding issues, though; it was good enough for me.
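
If rounding to the nearest hundredth is preferred, one possible variant is to do the arithmetic on integers. This is a minimal sketch in plain Python, not tested under Numba: the name float_to_str_2dp is hypothetical, and running it inside an @nb.njit function assumes a Numba version in which str() accepts integers.

```python
import math


def float_to_str_2dp(value):
    # Work on the magnitude in hundredths, rounded to the nearest integer
    # (ties are subject to the binary representation of the input).
    scaled = math.floor(abs(value) * 100.0 + 0.5)
    whole = scaled // 100
    frac = scaled % 100
    # Only emit a sign if something non-zero survives the rounding.
    sign = "-" if value < 0 and scaled > 0 else ""
    # Zero-pad the fraction to exactly two digits.
    frac_str = str(frac)
    if frac < 10:
        frac_str = "0" + frac_str
    return sign + str(whole) + "." + frac_str


print(float_to_str_2dp(23.456))  # 23.46
print(float_to_str_2dp(-0.5))    # -0.50
```

Working in hundredths keeps the two parts consistent: a carry out of the fraction, as for 3.999, lands in the whole part automatically.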

Answered By: Sandra Peters

While this does not address the float-to-str conversion, I think it is worth mentioning that with later versions of Numba you can use str() on integers:

import numba as nb


@nb.njit
def int2str(value):
    return str(value)


n = 10
print([int2str(x) for x in range(-n, n)])
# ['-10', '-9', '-8', '-7', '-6', '-5', '-4', '-3', '-2', '-1', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

n = 1000
%timeit [str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 391 µs per loop
%timeit [int2str(x) for x in range(-n, n)]
# 1000 loops, best of 5: 813 µs per loop

This is slower than pure Python, but it may come in handy for keeping code inside an njit()-compiled function.

The same functionality is not implemented yet for floats.

Answered By: norok2

A relatively generic approach is the following:

import math
import numba as nb


@nb.njit
def cut_trail(f_str):
    cut = 0
    for c in f_str[::-1]:
        if c == "0":
            cut += 1
        else:
            break
    if cut == 0:
        for c in f_str[::-1]:
            if c == "9":
                cut += 1
            else:
                cut -= 1
                break
    if cut > 0:
        f_str = f_str[:-cut]
    if f_str == "":
        f_str = "0"
    return f_str


@nb.njit
def float2str(value):
    if value == 0.0:
        return "0.0"
    elif value < 0.0:
        return "-" + float2str(-value)
    else:
        max_digits = 16
        min_digits = -4
        e10 = math.floor(math.log10(value)) if value != 0.0 else 0
        if min_digits < e10 < max_digits:
            i_part = math.floor(value)
            f_part = math.floor((1 + value % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits - e10])
            return i_str + "." + f_str
        else:
            m10 = value / 10.0 ** e10  # mantissa, roughly in [1, 10)
            i_part = math.floor(m10)
            f_part = math.floor((1 + m10 % 1) * 10.0 ** max_digits)
            i_str = str(i_part)
            f_str = cut_trail(str(f_part)[1:max_digits])
            e_str = str(e10)
            if e10 >= 0:
                e_str = "+" + e_str
            return i_str + "." + f_str + "e" + e_str

This is less precise and comparatively slower (by a ~3x factor) than pure Python:

numbers = 0.0, 1.0, 1.000001, -2000.000014, 1234567890.12345678901234567890, 1234567890.12345678901234567890e10, 1234567890.12345678901234567890e-30, 1.234e-200, 1.234e200
k = 32
for number in numbers:
    print(f"{number!r:{k}}  {str(number)!r:{k}}  {float2str(number)!r:{k}}")
# 0.0                               '0.0'                             '0.0'                           
# 1.0                               '1.0'                             '1.0'                           
# 1.000001                          '1.000001'                        '1.000001'                      
# -2000.000014                      '-2000.000014'                    '-2000.0000139'                 
# 1234567890.1234567                '1234567890.1234567'              '1234567890.123456'             
# 1.2345678901234567e+19            '1.2345678901234567e+19'          '1.234567890123456e+19'         
# 1.2345678901234568e-21            '1.2345678901234568e-21'          '1.234567890123457e-21'         
# 1.234e-200                        '1.234e-200'                      '1.234e-200'                    
# 1.234e+200                        '1.234e+200'                      '1.2339e+200' 

%timeit -n 10 -r 10 [float2str(x) for x in numbers]
# 10 loops, best of 10: 18.1 µs per loop
%timeit -n 10 -r 10 [str(x) for x in numbers]
# 10 loops, best of 10: 6.5 µs per loop

Still, it may be used as a workaround until str() is implemented natively for float arguments in Numba.

Note that the edge cases are handled relatively coarsely, and an error in the last 1-2 digits is relatively frequent.
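
The error in the last digits is at least partly explained by double-precision spacing. The following is a small sketch in plain Python, assuming Python 3.9 or later for math.ulp; the sample values are taken from the test numbers above.

```python
import math

max_digits = 16  # same value as in float2str above

# (1 + fraction) * 10**16 lies in [1e16, 2e16), which is above 2**53,
# so neighbouring doubles in that range are at least 2 apart and the
# last decimal digit of the scaled value cannot be trusted.
scaled = (1 + 0.000014) * 10.0 ** max_digits
print(scaled > 2 ** 53)
print(math.ulp(scaled))

# The input itself is also coarse: around 2000, neighbouring doubles
# are 2**-42 (about 2.3e-13) apart, so only about 12 decimal places
# after the point carry information.
print(math.ulp(2000.000014))
```

Lowering max_digits would trade a shorter output for fewer spurious trailing digits.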

Answered By: norok2