Assigning ndarray in cython to new variable very slow? Or what is going on here?

Question:

I am fairly new to Cython and I am wondering why the following takes so long:

# test.pyx
import numpy as np
cimport numpy as np

cpdef test(a):
    cdef np.ndarray[dtype=int] b
    for i in range(10):
        b = a

import functools, timeit
import numpy as np
import test  # the compiled Cython module containing test()

a = np.array([1, 2, 3], dtype=int)
t = timeit.Timer(functools.partial(test.test, a))
print(t.timeit(1000000))
-> 0.5446977 seconds

If I comment out the cdef declaration, this is done in no time. If I declare a as np.ndarray in the function header, nothing changes. Also, id(a) == id(b), so no new objects are created.

Similar behaviour can be observed when calling a function that takes many ndarrays as arguments, e.g.

cpdef foo(np.ndarray a, np.ndarray b,np.ndarray c, ..... )

Can anybody help me? What am I missing here?

Edit:
I noticed the following:

This is slow:

cpdef foo(np.ndarray[dtype=int, ndim=1] a, np.ndarray[dtype=int, ndim=1] b, np.ndarray[dtype=int, ndim=1] c):
    return

This is faster:

def foo(np.ndarray[dtype=int, ndim=1] a, np.ndarray[dtype=int, ndim=1] b, np.ndarray[dtype=int, ndim=1] c):
    return

This is the fastest:

cpdef foo(a, b, c):
    return

The function foo() is called very frequently (many millions of times) in my project from many different locations and does some calculations with the three NumPy arrays (however, it doesn't change their content).

I basically need the speed that comes from knowing the data type inside the arrays, while also keeping the function-call overhead very low. What would be the most appropriate solution for this?

Asked By: Google Hupf


Answers:

With a typed b, the assignment b = a generates a bunch of type checking: it needs to identify whether a is actually an ndarray and make sure that it exports the buffer protocol with an appropriate element type. In exchange for this one-off cost you get fast indexing of single elements.

If you’re not doing indexing of single elements then typing as np.ndarray is literally pointless and you’re pessimizing your code. If you are doing this indexing then you can get significant optimizations.
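For concreteness, here is a minimal sketch of the kind of loop where the typed buffer actually pays off (the function name, the int64 dtype and the summing work are my own illustration, not from the question):

import numpy as np
cimport numpy as np

cpdef np.int64_t sum_typed(np.ndarray[np.int64_t, ndim=1] a):
    # The one-off type/buffer check happens once, at the function boundary;
    # every a[i] below then compiles to a direct C-level buffer read.
    cdef np.int64_t total = 0
    cdef Py_ssize_t i
    for i in range(a.shape[0]):
        total += a[i]
    return total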

If I comment out the cdef declaration this is done in no time.

This is often a sign that the C compiler has realized the entire function does nothing and optimized it out completely. And therefore your measurement may be meaningless.
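One way to sanity-check this (my own tweak, not something from the question) is to make the benchmarked function actually use b, e.g. by returning it, so the work cannot be optimized away:

import numpy as np
cimport numpy as np

cpdef test(a):
    cdef np.ndarray[dtype=int] b
    for i in range(10):
        b = a
    # Returning b means the assignments (and their checks) cannot be
    # discarded as dead code, so the timing measures something real.
    return b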

cpdef foo(np.ndarray a, np.ndarray b,np.ndarray c, ..... )

Just specifying the type as np.ndarray without specifying the element dtype usually gains you very little and is probably not worthwhile.


If you have a function that you're calling millions of times then it is likely that the input arrays come from somewhere and can be pre-typed, probably less frequently. For example, they might be produced by taking slices from a larger array.

The newer memoryview syntax (int[:]) is quick to slice, so for example if you already have a 2D memoryview (int[:,:] x) it's very quick to generate a 1D memoryview from it (e.g. x[:,0]), and it's quick to pass existing memoryviews into a cdef function with memoryview arguments. (Note that (a) I'm just unsure whether all of this applies to np.ndarray too, and (b) setting up a fresh memoryview is likely to cost about the same as an np.ndarray, so I'm only suggesting them because I know slicing is quick.)
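As a rough illustration of that pattern (the names and the per-row summing are mine; this is only a sketch, not a drop-in replacement for foo()):

cdef long process_row(long[:] row):
    cdef long total = 0
    cdef Py_ssize_t i
    for i in range(row.shape[0]):
        total += row[i]
    return total

cpdef long process_all(long[:, :] x):
    # x is typed once at this boundary; slicing x[j, :] and passing the
    # resulting memoryview into the cdef function avoids repeated
    # Python-level type checks inside the hot loop.
    cdef long total = 0
    cdef Py_ssize_t j
    for j in range(x.shape[0]):
        total += process_row(x[j, :])
    return total

(On most 64-bit Linux/macOS builds long corresponds to np.int64; on Windows it is 32-bit, so the element type in the memoryview declarations would need adjusting.)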

Therefore my main suggestion is to move the typing outwards to try to reduce the number of fresh initializations of these typed arrays. If that isn’t possible then I think you may be stuck.

Answered By: DavidW