Turn off list reflection in Numba

Question:

I’m trying to accelerate my code using Numba. One of the arguments I’m passing into the function is a mutable list of lists. When I try changing one of the sublists, I get this error:

Failed in nopython mode pipeline (step: nopython mode backend)
cannot reflect element of reflected container: reflected list(reflected list(int64))

I don’t actually care about reflecting changes I make to the native list into the original Python list. How do I go about telling Numba not to reflect the changes? The documentation is pretty vague regarding list reflection in Numba.

Thanks,

Asked By: Alec Tarashansky


Answers:

Quoting directly from the docs:

In nopython mode, Numba does not operate on Python objects. list are
compiled into an internal representation. Any list arguments must be
converted into this representation on the way in to nopython mode and
their contained elements must be restored in the original Python
objects via a process called reflection.

Reflection is required to maintain the same semantics as found in
regular Python code. However, the reflection process can be expensive
for large lists and it is not supported for lists that contain
reflected data types. Users cannot use list-of-list as an argument
because of this limitation.

Your best bet is to pass a 2D NumPy array of shape len(ll) x max(len(x) for x in ll), where ll is the list of lists and shorter rows are padded with zeros. I use something like the following to build it, and then pass arr and lengths to the njit-compiled function:

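import numpy as np

def make_2D_array(lis):
    """Function to get a zero-padded 2D array from a list of lists
    """
    n = len(lis)
    # true length of each inner list, needed later to skip the padding
    lengths = np.array([len(x) for x in lis])
    max_len = np.max(lengths)
    # np.zeros defaults to float64; rows shorter than max_len stay zero-padded
    arr = np.zeros((n, max_len))

    for i in range(n):
        arr[i, :lengths[i]] = lis[i]
    return arr, lengths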

HTH.

Answered By: Deepak Saini

If you need to pass a list of lists to Numba, use a NumPy array instead of the original Python list. Numba raises the reflection error because nested reflected lists are not supported. You can compare the two examples below:

This one raises the same error:

TypeError: Failed in nopython mode pipeline (step: nopython mode backend)
cannot reflect element of reflected container: reflected list(reflected list(int64))

import numba

list_of_list = [[1, 2], [34, 100]]


@numba.njit()
def test(list_of_list):
    if 1 in list_of_list[0]:
        return 'haha'

test(list_of_list)

The version that runs without error is:

from numba import njit
import numpy as np


@njit
def test():
    if 1 in set(np_list_of_list[0]):
        return 'haha'


if __name__ == '__main__':
    list_of_list = [[1, 2], [34, 100]]
    np_list_of_list = np.array(list_of_list)
    print(test())

Answered By: Yagmur SAHIN

There are several ways to handle such reflected data structures. Here I show approaches using the NumPy, Numba, and Awkward libraries. First, import them and create some sample data:

import random

import numpy as np
import numba as nb
import awkward as ak

list_uniform = [random.sample(range(80), 20) for i in range(1000000)]
list_nonuniform = [random.sample(range(80), random.randint(1, 40)) for i in range(1000000)]

1. Uniform structures:

If all inner lists have the same length, the best choice, as far as I know, is to:

  • Convert the list to a NumPy array so it can be passed to functions decorated with Numba's njit:
matrix_np_uniform = np.array(list_uniform, dtype=np.int32)

The approaches described next for nonuniform structures can be used for uniform structures, too.

2. Nonuniform structures:

If the list contains many inner lists of differing lengths (a nonuniform structure), the problem can be handled in the following ways:

  • Convert all the lists to Numba typed lists:
nb_list = nb.typed.List
matrix_nb = nb_list(nb_list(i) for i in list_nonuniform)  # --> ListType[ListType[int64]]
  • Convert the inner lists to NumPy arrays stored in a Numba typed list:
matrix_np = nb_list(np.array(i) for i in list_nonuniform)
  • Import the list in a Numba-compatible way using another library, e.g. Awkward. As far as I know, it does not need signatures because it stores the data types internally, it uses some broadcasting, and it can be used for ragged arrays/lists in a NumPy-like manner:
matrix_ak = ak.Array(list_nonuniform)

Benchmarks

Based on my limited experience with data of the size prepared above:

1. For uniform structures:

  • Time to convert the lists to Numba-compatible ones:
NumPy (~1.3 s) < Awkward (~3.8 s) <<< Numba (~1.5 min)
  • Function execution time:
NumPy (~21 ms) < Awkward (~31 ms) <<< Numba (~513 ms for nonuniform, ~503 ms here)

2. For nonuniform structures:

  • Time to convert the lists to Numba-compatible ones:
Awkward (~3.8 s) < NumPy (~5.3 s) <<< Numba (~1.5 min)
  • Function execution time:
Awkward (~40 ms) < NumPy (~100 ms) <<< Numba (~513 ms)

Answered By: Ali_Sh