Speed performance improvement needed. Using nested for loops


I have a 2D array shaped (1002,1004). For this question it could be generated via:

a = numpy.arange( (1002 * 1004) ).reshape(1002, 1004)

What I do is generate two lists. The lists are generated via:

theta = (61/180.) * numpy.pi
x = numpy.arange(a.shape[0])             #(1002, )
y = numpy.arange(a.shape[1])             #(1004, )

max_y_for_angle = int(y[-1] - (x[-1] / numpy.tan(theta)))

The first list is given by:

x_list = numpy.linspace(0, x[-1], len(x))

Note that this list is identical to x. However, for illustration purposes and to give a clear picture I declared this ‘list’.

What I now want to do is create a y_list which is as long as x_list. I want to use these lists to determine the elements in my 2D array. After I determine and store the sum of the elements, I want to shift my y_list by one and determine the sum of the elements again. I want to do this for max_y_for_angle iterations. The code I have is:

sum_list = numpy.zeros(max_y_for_angle)
for idx in range(max_y_for_angle):
    y_list = numpy.linspace((len(x) / numpy.tan(theta)) + idx, y[0] + idx , len(x))
    elements = 0
    for i in range(len(x)):
        elements += a[int(x_list[i])][int(y_list[i])]
    sum_list[idx] = elements

This operation works. However, as one might imagine, it takes a lot of time because of the nested for loops, and the number of iterations does not help either. How can I speed things up? The operation currently takes about 1 s; I’m looking for something below 200 ms.

Is it maybe possible to return a list of the 2D array elements when the inputs are x_list and y_list? I tried the following but this does not work:


Thank you very much!

Asked By: The Dude



It’s possible to return an array of elements from a 2D array by doing a[x, y], where x and y are both integer arrays. This is called advanced indexing, or sometimes fancy indexing. In your question you mention lists a lot but never actually use any lists in your code; x_list and y_list are both arrays. Also, numpy multidimensional arrays are generally indexed as a[i, j], even when i and j are integer values.
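As a minimal illustration of how advanced indexing pairs the two index arrays element-wise, here is a small sketch. The 3x4 array and the names rows and cols are made up for the example and are not part of the question:

```python
import numpy

# Small illustrative array (values are arbitrary, not from the question).
a = numpy.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

rows = numpy.array([0, 1, 2])
cols = numpy.array([3, 1, 0])

# Advanced indexing pairs the index arrays element-wise:
# picked[k] == a[rows[k], cols[k]]
picked = a[rows, cols]
print(picked)        # [3 5 8]
print(picked.sum())  # 16
```

Note that the index arrays must have an integer dtype; float arrays such as the output of numpy.linspace need to be converted first, for example with .astype(int).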

Using fancy indexing, along with some cleanup of your code, produced this:

import numpy

def line_sums(a, theta):
    xsize, ysize = a.shape
    tan_theta = numpy.tan(theta)
    max_y_for_angle = int(ysize - 1 - ((xsize - 1) / tan_theta))

    x = numpy.arange(xsize)
    y_base = numpy.linspace(xsize / tan_theta, 0, xsize)
    y_base = y_base.astype(int)
    sum_list = numpy.zeros(max_y_for_angle)

    for idx in range(max_y_for_angle):
        sum_list[idx] = a[x, y_base + idx].sum()

    return sum_list

a = numpy.arange( (1002 * 1004) ).reshape(1002, 1004)
theta = (61/180.) * numpy.pi
sum_list = line_sums(a, theta)
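
If the remaining loop over idx is still too slow, one possible variation (a sketch, not something measured here) is to build all the shifted column indices at once with broadcasting and index the array a single time. The function name line_sums_broadcast and the helper names shifts and cols are my own; the index arithmetic is assumed to match line_sums above:

```python
import numpy

def line_sums_broadcast(a, theta):
    """Sketch: same sums as line_sums, without the Python-level loop."""
    xsize, ysize = a.shape
    tan_theta = numpy.tan(theta)
    max_y_for_angle = int(ysize - 1 - ((xsize - 1) / tan_theta))

    x = numpy.arange(xsize)
    y_base = numpy.linspace(xsize / tan_theta, 0, xsize).astype(int)
    shifts = numpy.arange(max_y_for_angle)

    # cols[idx, i] == y_base[i] + idx, shape (max_y_for_angle, xsize)
    cols = y_base[None, :] + shifts[:, None]

    # x (shape (xsize,)) broadcasts against cols, so the result has
    # shape (max_y_for_angle, xsize); summing over axis 1 gives one
    # total per shift.
    return a[x, cols].sum(axis=1)

a = numpy.arange(1002 * 1004).reshape(1002, 1004)
theta = (61 / 180.) * numpy.pi
sum_list = line_sums_broadcast(a, theta)
```

This trades the loop for a temporary array of max_y_for_angle * xsize elements, so whether it is actually faster depends on the array size and the machine; it is worth timing both versions. Unlike line_sums, it returns an array with the integer dtype of a rather than floats.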

Hope that helps.

Answered By: Bi Rico