Much different result when using numba

Question

I have pure Python code here, apart from creating a NumPy array. The result I get is completely wrong when I use @jit, but when I remove the decorator it is fine. Could anyone give me any tips on why this is?

@jit
def grayFun(image: np.array) -> np.array:
    gray_image = np.empty_like(image)

    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            gray = gray_image[i][j][0]*0.21 + gray_image[i][j][1]*0.72 + gray_image[i][j][2]*0.07
            gray_image[i][j] = (gray,gray,gray)

    gray_image = gray_image.astype("uint8")
    return gray_image

Asked By: ili

Answer 1

This will return a grayscale image using your conversion formula. Usually you do not need to duplicate the channels; a grayscale image with shape (X,Y) can be used just like an image with shape (X,Y,3). Note that the result is a float array, so cast it with .astype("uint8") if you need integers.

def gray(image):
    return image[:, :, 0]*0.21 + image[:, :, 1]*0.72 + image[:, :, 2]*0.07
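If a downstream consumer does insist on three channels and uint8, the 2D result can be expanded afterwards. A minimal sketch, assuming an RGB uint8 input; `to_three_channels` and the random test image are hypothetical names and data introduced only for illustration:

```python
import numpy as np


def gray(image):
    # Weighted sum over the colour axis; the result is a 2D float64 array.
    return image[:, :, 0] * 0.21 + image[:, :, 1] * 0.72 + image[:, :, 2] * 0.07


def to_three_channels(gray_2d):
    # Hypothetical helper: cast to uint8, then repeat the single channel three times.
    return np.repeat(gray_2d.astype(np.uint8)[:, :, np.newaxis], 3, axis=2)


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(4, 5, 3), dtype=np.uint8)  # synthetic RGB image

g = gray(image)
g3 = to_three_channels(g)
print(g.shape, g.dtype)    # (4, 5) float64
print(g3.shape, g3.dtype)  # (4, 5, 3) uint8
```

Keeping the 2D form for as long as possible avoids writing three times as much data, which is likely part of why it is the faster variant.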

Answered By: Tim Roberts

Answer 2

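This should work just fine with numba. The wrong result comes from the loop reading from `gray_image`, which `np.empty_like` leaves uninitialized, instead of from `image`; the output is therefore whatever happened to be in that memory, with or without @jit. @TimRoberts’s answer is definitely fast, so you may just want to go with that implementation. But the biggest win is simply from vectorization. I’m sure others could find additional performance tweaks, but at this point I think we’ve whittled down most of the runtime and the issues: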

import numba
import numpy as np

# your implementation, but fixed so that `gray` is calculated from `image`
def grayFun(image: np.array) -> np.array:
    gray_image = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            gray = image[i][j][0]*0.21 + image[i][j][1]*0.72 + image[i][j][2]*0.07
            gray_image[i][j] = (gray,gray,gray)
    gray_image = gray_image.astype("uint8")
    return gray_image

# a vectorized numpy version of your implementation
def grayQuick(image: np.array) -> np.array:
    return np.tile(
        np.expand_dims(
            (image[:, :, 0]*0.21 + image[:, :, 1]*0.72 + image[:, :, 2]*0.07), -1
        ),
        (1,1, 3)
    ).astype(np.uint8)

# a numba implementation (note: prange only runs in parallel under
# @numba.jit(parallel=True); as written it behaves like range)
@numba.jit
def gray_numba(image: np.array) -> np.array:
    out = np.empty_like(image)
    for i in numba.prange(image.shape[0]):
        for j in numba.prange(image.shape[1]):
            gray = np.uint8(image[i, j, 0]*0.21 + image[i, j, 1]*0.72 + image[i, j, 2]*0.07)
            out[i, j, :] = gray
    return out

# a 2D solution leveraging @TimRoberts's speedup
def gray_2D(image):
    return image[:, :, 0]*0.21 + image[:, :, 1]*0.72 + image[:, :, 2]*0.07
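Since the original symptom was a wrong result rather than a slow one, it can help to check a fast variant against the plain loop on a small input before timing anything. The following is only a sketch: `gray_loop`, `gray_vectorized` and the random test image are hypothetical stand-ins for the functions above, not the exact code that was benchmarked.

```python
import numpy as np


def gray_loop(image):
    # Reference: explicit loops that read from `image`, not from the output buffer.
    out = np.empty_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            r, g, b = image[i, j]
            out[i, j] = np.uint8(r * 0.21 + g * 0.72 + b * 0.07)
    return out


def gray_vectorized(image):
    # Same formula applied to whole channels at once, then repeated to 3 channels.
    gray = image[:, :, 0] * 0.21 + image[:, :, 1] * 0.72 + image[:, :, 2] * 0.07
    return np.repeat(gray.astype(np.uint8)[:, :, np.newaxis], 3, axis=2)


rng = np.random.default_rng(42)
small = rng.integers(0, 256, size=(16, 24, 3), dtype=np.uint8)  # synthetic test image

same = np.array_equal(gray_loop(small), gray_vectorized(small))
print("loop and vectorized agree:", same)
```

A check like this would have flagged the original function straight away, because its output does not depend on the input image at all.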

I loaded a reasonably large image:

In [69]: img = matplotlib.image.imread(os.path.expanduser(
    ...:     "~/Desktop/Screen Shot.png"
    ...: ))
    ...: image = (img[:, :, :3] * 255).astype('uint8')
    ...: 

In [70]: image.shape
Out[70]: (1964, 3024, 3)

Now, running these four shows a modest speedup from numba over the tiled numpy version, while the fastest is the 2D solution:

In [71]: %%timeit
    ...: grey = grayFun(image)  # watch out - the whole %%timeit run takes ~21 minutes (7 runs of ~3 min)
    ...:
    ...:
2min 56s ± 1min 58s per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [72]: %%timeit
    ...: grey_np = grayQuick(image)
    ...:
    ...:
556 ms ± 25.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [73]: %%timeit
    ...: grey = gray_numba(image)
    ...:
    ...:
246 ms ± 19.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [74]: %%timeit
    ...: grey = gray_2D(image)
    ...:
    ...:
117 ms ± 10.4 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Note that numba is noticeably slower on the first call, because that is when the function gets compiled, so the vectorized numpy solutions will significantly outperform numba if you’re only doing this once. But if you’re going to call the function repeatedly within the same Python session, numba is a good option. You could of course use numba for the 2D result to get a further speedup, though I’m not sure whether it would outperform numpy.
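To make the first-call cost visible, and to sketch what a numba version of the 2D result might look like, something along these lines could be tried. It is untested against the image above; `gray_2d_numba` is a hypothetical name, the test image is synthetic, and the import fallback is only an assumption added so the snippet still runs (slowly, without compilation) when numba is not installed.

```python
import time

import numpy as np

try:
    from numba import njit, prange
except ImportError:
    # Assumption: fall back to plain Python so the sketch still runs without numba.
    prange = range

    def njit(*args, **kwargs):
        def wrap(func):
            return func
        return wrap


@njit(parallel=True)
def gray_2d_numba(image):
    # Hypothetical 2D kernel: one uint8 value per pixel, rows split across threads.
    height, width = image.shape[0], image.shape[1]
    out = np.empty((height, width), dtype=np.uint8)
    for i in prange(height):
        for j in range(width):
            out[i, j] = np.uint8(
                image[i, j, 0] * 0.21 + image[i, j, 1] * 0.72 + image[i, j, 2] * 0.07
            )
    return out


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)  # synthetic image

start = time.perf_counter()
first = gray_2d_numba(image)   # includes JIT compilation when numba is present
first_call = time.perf_counter() - start

start = time.perf_counter()
second = gray_2d_numba(image)  # reuses the already compiled function
second_call = time.perf_counter() - start

print(f"first call: {first_call:.4f} s, second call: {second_call:.4f} s")
```

Only the outer loop uses `prange` here, since numba parallelizes the outermost `prange` loop; whether this beats the plain numpy expression would need to be measured on the real image.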

Answered By: Michael Delgado
