Inconvenience of using single precision floats in numpy

Question:

Writing code that uses single precision (float32) in numpy is cumbersome.

First, the declaration of a single precision float is too long. We have to type every variable as follows.

a = np.float32(5)

But some other languages use a simpler representation.

a = 5.f

Second, arithmetic operations are also inconvenient.

b = np.int32(5)+np.float32(5)

I expected the type of b to be numpy.float32, but it is numpy.float64.

Of course,

b = np.add(np.int32(5), np.float32(5), dtype=np.float32)

returns what I want. But it is too verbose to replace every operation this way.

Is there any simpler way to use single precision in numpy?

Asked By: Hyuck Kang


Answers:

The problem is that NumPy promotes types when an operation mixes different types. float32 only stays float32 if the other numeric operand has a dtype of:

  • float32 or less
  • int16 or less
  • uint16 or less

If the other operand has any other dtype, the result will be float64 (or complex if the other operand is complex). Since the dtypes listed above aren’t the most common ones, almost any operation using the standard operators +, -, /, *, … (especially when the other operand is a Python integer or float) will promote your float32 to float64.
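
A quick check of a few combinations shows the rule in action (the dtype shown is what NumPy reports for the result):

import numpy as np

(np.float32(1) + np.int16(1)).dtype    # dtype('float32')
(np.float32(1) + np.uint16(1)).dtype   # dtype('float32')
(np.float32(1) + np.int32(1)).dtype    # dtype('float64')
(np.float32(1) + np.float64(1)).dtype  # dtype('float64')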

Unfortunately there’s not much you can do to avoid that. In many cases it’s actually fine that NumPy does this, because:

  • Most architectures can process double precision floats as fast as single precision ones. Scalar arithmetic is actually fastest on plain Python floats and roughly equally slow on NumPy scalars of either precision, as the timings below show:
import numpy as np
a32 = np.float32(1)
a64 = np.float64(1)
a = 1.
%timeit [a32 + a32 for _ in range(20000)]  # 100 loops, best of 3: 4.58 ms per loop
%timeit [a64 + a64 for _ in range(20000)]  # 100 loops, best of 3: 4.83 ms per loop
%timeit [a + a for _ in range(20000)]      # 100 loops, best of 3: 2.72 ms per loop
  • Python objects carry so much per-object overhead that the extra memory of a scalar double precision float is almost negligible:
import sys
import numpy as np
    
sys.getsizeof(np.float32(1))  # 28
sys.getsizeof(np.float64(1))  # 32
sys.getsizeof(1.)             # 24  # that's also a double on my computer!

However, it makes sense to use single precision floats if you have huge arrays and would otherwise run into memory problems, or if you interact with other libraries that expect single precision floats (machine learning, GPUs, …).
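
The memory argument is easy to quantify: a float32 array needs exactly half the bytes of a float64 array of the same shape.

import numpy as np

a64 = np.ones(10_000_000)                    # float64 is the default
a32 = np.ones(10_000_000, dtype=np.float32)

a64.nbytes  # 80000000  -> ~80 MB
a32.nbytes  # 40000000  -> ~40 MB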

But as mentioned above, you’ll almost always be fighting the coercion rules, which exist to keep you from running into unexpected problems.

The int32 + float32 case is actually a great example! You expect the result to be float32 – but there is a problem: you can’t represent every int32 as a float32:

np.iinfo(np.int32(1))             # iinfo(min=-2147483648, max=2147483647, dtype=int32)
int(np.float32(2147483647))       # 2147483648
np.int32(np.float32(2147483647))  # -2147483648

Yes, just converting the value to a single precision float and back to an integer changed its value. That’s why NumPy uses double precision here: so you don’t end up with an unexpected result, and why you have to explicitly force NumPy to do something that could be wrong (from the general user’s perspective).
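
For comparison, float64 has a 53-bit mantissa, so it can represent every int32 exactly and the round trip is lossless:

int(np.float64(2147483647))       # 2147483647
np.int32(np.float64(2147483647))  # 2147483647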


Since there’s (as far as I know) no way to restrict type promotion in NumPy, you have to invent your own workaround.

For example, you could create a class that wraps a NumPy array and uses the special methods to implement the operators as ufunc calls with an explicit dtype:

import numpy as np

class Arr32:
    """Wraps an array and keeps the results of +, -, *, / in float32."""

    def __init__(self, arr):
        self.arr = arr

    def _binary_op(self, ufunc, other):
        # Unwrap the other operand if it's also an Arr32, then force the
        # ufunc to produce a float32 result regardless of the input dtypes.
        if isinstance(other, Arr32):
            other = other.arr
        return self.__class__(ufunc(self.arr, other, dtype=np.float32))

    def __add__(self, other):
        return self._binary_op(np.add, other)

    def __sub__(self, other):
        return self._binary_op(np.subtract, other)

    def __mul__(self, other):
        return self._binary_op(np.multiply, other)

    def __truediv__(self, other):
        return self._binary_op(np.divide, other)

But that only implements a small subset of the NumPy functionality and will quickly result in lots of code, plus edge cases that are easy to forget. There might be smarter ways nowadays using __array_ufunc__ or __array_function__, but I haven’t used these myself so I cannot comment on the amount of work or their suitability.
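
For illustration only, here is a minimal, untested sketch of what the __array_ufunc__ route might look like – the class name and the set of forced ufuncs are my own choices, and real code would also have to handle out=, reductions, comparisons and many more edge cases:

import numpy as np

# Only these arithmetic ufuncs get a forced float32 result; everything
# else (comparisons, reductions, ...) follows the normal rules.
FORCE_F32 = {np.add, np.subtract, np.multiply, np.true_divide}

class Float32Array(np.ndarray):
    def __new__(cls, arr):
        return np.asarray(arr, dtype=np.float32).view(cls)

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Unwrap our subclass so the ufunc call below doesn't recurse
        # back into this method.
        inputs = tuple(np.asarray(i) if isinstance(i, Float32Array) else i
                       for i in inputs)
        if ufunc in FORCE_F32 and method == '__call__':
            kwargs.setdefault('dtype', np.float32)
        result = getattr(ufunc, method)(*inputs, **kwargs)
        if isinstance(result, np.ndarray):
            result = result.view(Float32Array)
        return result

arr = Float32Array([1, 2, 3])
(arr + np.int64(5)).dtype  # dtype('float32')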

So my preferred solution would be to create helper functions for the operations that are needed:

import numpy as np

def arr32(a):
    return np.float32(a)

def add32(a1, a2):
    return np.add(a1, a2, dtype=np.float32)

def sub32(a1, a2):
    return np.subtract(a1, a2, dtype=np.float32)

def mul32(a1, a2):
    return np.multiply(a1, a2, dtype=np.float32)

def div32(a1, a2):
    return np.divide(a1, a2, dtype=np.float32)
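
These compose naturally; a hypothetical session:

>>> mul32(div32(np.float32([1, 2, 3]), 2), np.int32(3)).dtype
dtype('float32')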

Or use only in-place operations, because these won’t promote the type:

>>> import numpy as np

>>> arr = np.float32([1,2,3])
>>> arr += 2
>>> arr *= 3
>>> arr
array([ 9., 12., 15.], dtype=float32)
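
Note that in-place operations avoid promotion by casting the other operand down instead (NumPy permits this because it is a "same kind" cast), which can silently lose precision:

>>> arr = np.float32([1, 2, 3])
>>> arr *= np.float64(1.5)   # the float64 operand is cast down to float32
>>> arr.dtype
dtype('float32')
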
Answered By: MSeifert