Python: Element-wise multiplication of 3d with 3d arrays

Question:

I am having some problems implementing the following equation in a performant way using Python:

Equation

beta and gamma are Cartesian coordinates {x,y}, and b, m are index values that can be quite large (n=10000). I have a working version of the code, shown below for the simple case of l=2 and m,b = 4 (l and m always have the same length). I checked the code using timeit, and the bottleneck is the element-wise multiplication with an array of size (3,3) and the reshaping of the resulting array into shape (3m,3m).
Does anybody have an idea how to increase the performance? (I also noticed that my current version suffers from quite a large overhead for large values of l...)

import numpy as np 

g_l3 = np.array([[1, 4, 5],[2, 6, 7]])
g_l33  = g_l3.reshape(-1, 3, 1) * g_l3.reshape(-1, 1, 3)

A_lm = np.arange(1, 9, 1).reshape(2, 4)
B_lb = np.arange(7, 15, 1).reshape(2, 4)

AB_lmb = A_lm.reshape(-1, 4, 1) * B_lb.reshape(-1, 1, 4)

D_lmb33 = np.sum(g_l33.reshape(-1, 1, 1, 3, 3) * AB_lmb.reshape(-1, 4, 4, 1, 1), axis=0)
D = np.concatenate(np.concatenate(D_lmb33, axis=2), axis=0)

Asked By: Jan

||

Answers:

In [387]: %%timeit
     ...: g_l3 = np.array([[1, 4, 5],[2, 6, 7]])
     ...
     ...: D_lmb33 = np.sum(g_l33.reshape(-1, 1, 1, 3, 3) * AB_lmb.reshape(-1, 4,
     ...:  4, 1, 1), axis=0)
     ...: D = np.concatenate(np.concatenate(D_lmb33, axis=2), axis=0)
     ...: 
     ...: 
70.7 µs ± 226 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Examining the pieces, and rewriting the reshape with newaxis, which is visually clearer to me – though basically the same speed:

In [388]: g_l3.shape
Out[388]: (2, 3)
In [389]: g_l33.shape
Out[389]: (2, 3, 3)
In [390]: np.allclose(g_l33, g_l3[:,:,None]*g_l3[:,None,:])
Out[390]: True
In [391]: AB_lmb.shape
Out[391]: (2, 4, 4)
In [392]: np.allclose(AB_lmb, A_lm[:,:,None]*B_lb[:,None,:])
Out[392]: True

So these are the common outer products on the last dimension of 2d arrays.
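
For comparison, the same batched outer products can also be spelled with np.einsum. This is only an alternative notation for the two products above, reusing the question's small arrays; it is not claimed to be faster.

```python
import numpy as np

g_l3 = np.array([[1, 4, 5], [2, 6, 7]])
A_lm = np.arange(1, 9).reshape(2, 4)
B_lb = np.arange(7, 15).reshape(2, 4)

# Batched outer product via broadcasting: (l,3,1) * (l,1,3) -> (l,3,3)
g_l33 = g_l3[:, :, None] * g_l3[:, None, :]
# Same pattern for A and B: (l,4,1) * (l,1,4) -> (l,4,4)
AB_lmb = A_lm[:, :, None] * B_lb[:, None, :]

# The same products written as einsum; 'l' is kept as a batch index
g_l33_es = np.einsum('li,lj->lij', g_l3, g_l3)
AB_lmb_es = np.einsum('lm,lb->lmb', A_lm, B_lb)

print(np.array_equal(g_l33, g_l33_es), np.array_equal(AB_lmb, AB_lmb_es))
```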

And another outer,

In [393]: temp=g_l33.reshape(-1, 1, 1, 3, 3) * AB_lmb.reshape(-1, 4, 4, 1, 1)
In [394]: temp.shape
Out[394]: (2, 4, 4, 3, 3)
In [396]: np.allclose(temp, g_l33[:,None,None,:,:] * AB_lmb[:, :,:, None,None])
Out[396]: True

These probably could be combined into one expression, but that’s not necessary.
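
As a sketch of what such a combined expression might look like, all four factors can be broadcast against each other directly into the (l, m, b, 3, 3) layout. The arrays are the ones from the question; the single-expression form is my own illustration and builds the same large intermediate.

```python
import numpy as np

g_l3 = np.array([[1, 4, 5], [2, 6, 7]])
A_lm = np.arange(1, 9).reshape(2, 4)
B_lb = np.arange(7, 15).reshape(2, 4)

# Step-by-step version from the question, used as the reference
g_l33 = g_l3[:, :, None] * g_l3[:, None, :]
AB_lmb = A_lm[:, :, None] * B_lb[:, None, :]
temp_ref = g_l33[:, None, None, :, :] * AB_lmb[:, :, :, None, None]

# One expression: each factor gets its own axis in (l, m, b, i, j)
temp = (A_lm[:, :, None, None, None]      # (l, m, 1, 1, 1)
        * B_lb[:, None, :, None, None]    # (l, 1, b, 1, 1)
        * g_l3[:, None, None, :, None]    # (l, 1, 1, i, 1)
        * g_l3[:, None, None, None, :])   # (l, 1, 1, 1, j)

D_lmb33 = temp.sum(axis=0)                # sum over l -> (m, b, i, j)
print(temp.shape, D_lmb33.shape)
```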

D_lmb33 sums on the leading dimension:

In [405]: D_lmb33.shape
Out[405]: (4, 4, 3, 3)

The double concatenate can also be done with a transpose and reshape:

In [406]: np.allclose(D_lmb33.transpose(1,2,0,3).reshape(12,12),D)
Out[406]: True
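
To see why the transpose works, it helps to track where each element ends up: the double concatenate places D_lmb33[m, b, i, j] at row b*3+i and column m*3+j. The sketch below wraps this in a helper, blocks_to_matrix, which is a hypothetical name of mine, and checks it on a non-square test array so the axis order cannot match by accident.

```python
import numpy as np

def blocks_to_matrix(D_mb33):
    """Lay out an (m, b, p, q) array of p-by-q blocks as a (b*p, m*q) matrix.

    Block (m, b) lands at block-row b, block-column m, matching
    np.concatenate(np.concatenate(D_mb33, axis=2), axis=0).
    """
    m, b, p, q = D_mb33.shape
    # Reorder to (b, p, m, q) so rows are indexed by (b, p), columns by (m, q)
    return D_mb33.transpose(1, 2, 0, 3).reshape(b * p, m * q)

# Test array with distinct entries and m != b
D_mb33 = np.arange(2 * 5 * 3 * 3).reshape(2, 5, 3, 3)
D_cat = np.concatenate(np.concatenate(D_mb33, axis=2), axis=0)
D_tr = blocks_to_matrix(D_mb33)

print(D_cat.shape, np.array_equal(D_cat, D_tr))
```

Note that the reshape after a transpose generally has to copy the data, so this is a tidier spelling rather than a guaranteed speed-up.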

Overall your code appears to make efficient use of numpy. For a large leading dimension, the (N,4,4,3,3) intermediate array could be large and take time to build. With this broadcast-and-sum formulation, I don't see a way to do the sum earlier. Using numba or numexpr is another question.
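
One option that may be worth timing is to fold the sum over l into a contraction, so that the (N,4,4,3,3) intermediate is never materialised. The sketch below is my own illustration, not benchmarked at n=10000: it uses np.einsum with optimize=True, and an equivalent matrix product of two (l, 3m) arrays. The random test data, the sizes L and M, and the names Ag and Bg are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
L, M = 50, 4                          # assumed sizes for the check
g_l3 = rng.standard_normal((L, 3))
A_lm = rng.standard_normal((L, M))
B_lb = rng.standard_normal((L, M))

# Reference: broadcast to (l, m, b, 3, 3), sum over l, then lay out as (3M, 3M)
g_l33 = g_l3[:, :, None] * g_l3[:, None, :]
AB_lmb = A_lm[:, :, None] * B_lb[:, None, :]
D_lmb33 = np.sum(g_l33[:, None, None] * AB_lmb[:, :, :, None, None], axis=0)
D_ref = D_lmb33.transpose(1, 2, 0, 3).reshape(3 * M, 3 * M)

# Variant 1: one einsum, with the output axes already in (b, i, m, j) order
D_es = np.einsum('lm,lb,li,lj->bimj', A_lm, B_lb, g_l3, g_l3,
                 optimize=True).reshape(3 * M, 3 * M)

# Variant 2: build two (l, 3M) arrays and contract l with a matrix product
Bg = (B_lb[:, :, None] * g_l3[:, None, :]).reshape(L, 3 * M)  # columns b*3+i
Ag = (A_lm[:, :, None] * g_l3[:, None, :]).reshape(L, 3 * M)  # columns m*3+j
D_mm = Bg.T @ Ag

print(np.allclose(D_ref, D_es), np.allclose(D_ref, D_mm))
```

In the matrix-product variant the largest temporaries are (l, 3m) rather than (l, m, b, 3, 3), which is where any saving would come from; whether it is actually faster for a given l and m would need to be measured.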

Answered By: hpaulj