Vectorize a for loop that searches for elements in another array

Question:

I have two arrays that look like this:

Array 1 (locs_array) is a 2D array containing pairs of values I call lon and lat e.g.

array([[-122.463425,   47.195741],
       [-122.498139,   47.190166]])

Array 2 (parents_array) is also a 2D array containing the following columns [id, centerlon, centerlat, upperlon, lowerlon, upperlat, lowerlat] e.g.:

array([[   1.        , -122.463425  ,   47.195741  , -122.46331271,
        -122.46353729,   47.19585367,   47.19562833],
       [   2.        , -122.498149  ,   47.190275  , -122.49803671,
        -122.49826129,   47.19038767,   47.19016233]])

For context, the first array is a list of lon/lat locations, while the second array defines a grid system by storing the center, upper and lower coordinates of each cell's lon and lat.

I'm basically trying to find the fastest way to determine which grid (in array 2) each individual location (in array 1) belongs to.

This is what I’m doing so far:

This is my function to find the parent for a given lon/lat pair:

def _findFirstParent(self, lon, lat, parents_array):
    parentID = -1
    a = np.where(parents_array[:,3] >= lon)[0] # indices with upperLon >= lon
    b = np.where(parents_array[:,4] <= lon)[0] # indices with lowerLon <= lon
    c = np.intersect1d(a, b) # lon matches

    d = np.where(parents_array[:,5] >= lat)[0] # indices with upperLat >= lat
    e = np.where(parents_array[:,6] <= lat)[0] # indices with lowerLat <= lat
    f = np.intersect1d(d, e) # lat matches

    g = np.intersect1d(c, f) # both match
    if len(g) > 0:
        parentID = g[0]
    return parentID

And I call this using the following code:

for i in range(len(locs_array)):
    each_lon = locs_array[i][0]
    each_lat = locs_array[i][1]
    parentID = self._findFirstParent(each_lon, each_lat, parents_array)

Array 1 (the locations array) contains over 100 million records, but I figure I can break it down into smaller chunks if needed.
Array 2 (the grids array) contains over a million records.

What options do I have to make this faster? Any help will be appreciated.

Asked By: OFJ


Answers:

One way to vectorize such a loop is to use broadcasting, which makes numpy compare every entry of locs_array with every entry of parents_array.

For example, if you are trying to find, for each integer in A, which integer in B is a multiple of it, then instead of iterating over the values of A, you can do

import numpy as np

A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([11,12,13,14,15,16,17,18,19,20])

res = B[None,:] % A[:,None] == 0

This gives a 2D array such that res[i,j] is True iff B[j] is a multiple of A[i]. Then np.argmax(B[None,:]%A[:,None] == 0, axis=1) gives, for each A[i], the index in B of its first multiple (assuming there is always one, which is the case here).
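To make the toy example concrete, here is a minimal runnable sketch of it. The names first_multiple_idx and first_multiple are mine, chosen only for illustration.

```python
import numpy as np

A = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
B = np.array([11, 12, 13, 14, 15, 16, 17, 18, 19, 20])

# A[:, None] has shape (10, 1) and B[None, :] has shape (1, 10),
# so the expression broadcasts to a (10, 10) boolean array.
res = B[None, :] % A[:, None] == 0

# argmax along axis 1 returns the position of the first True in each row,
# i.e. the index in B of the first multiple of A[i].
first_multiple_idx = np.argmax(res, axis=1)
first_multiple = B[first_multiple_idx]

print(first_multiple)  # [11 12 12 12 15 12 14 16 18 20]
```

Note that argmax also returns 0 for a row that contains no True at all, which is why the "there is always one" assumption matters.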

So, in your case

(
    (locs_array[None,:,0] <= parents_array[:,None,3])
    & (locs_array[None,:,0] >= parents_array[:,None,4])
    & (locs_array[None,:,1] <= parents_array[:,None,5])
    & (locs_array[None,:,1] >= parents_array[:,None,6])
)

is a 2D array whose value at index [i,j] is True iff locs_array[j] is within the boundaries of parents_array[i].

So

idx = np.argmax(
    (locs_array[None,:,0] <= parents_array[:,None,3])
    & (locs_array[None,:,0] >= parents_array[:,None,4])
    & (locs_array[None,:,1] <= parents_array[:,None,5])
    & (locs_array[None,:,1] >= parents_array[:,None,6]),
    axis=0,
)

is an array of len(locs_array) integers, each one being the row index of the entry of parents_array that contains the corresponding location.
In other words, idx[j]=i means that locs_array[j] is within the boundaries of parents_array[i], which, I believe, is what you want.
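As a quick check, here is a sketch that runs the expression on the two sample rows from the question. The intermediate names lon, lat, inside and parent_ids are mine. Since idx holds row indices rather than values from the id column, the last step looks the ids up in column 0.

```python
import numpy as np

# Sample data copied from the question.
locs_array = np.array([[-122.463425, 47.195741],
                       [-122.498139, 47.190166]])
parents_array = np.array([
    [1.0, -122.463425, 47.195741, -122.46331271, -122.46353729, 47.19585367, 47.19562833],
    [2.0, -122.498149, 47.190275, -122.49803671, -122.49826129, 47.19038767, 47.19016233],
])

lon = locs_array[None, :, 0]  # shape (1, n_locs)
lat = locs_array[None, :, 1]

# inside[i, j] is True iff location j lies within the bounds of grid i.
inside = (
    (lon <= parents_array[:, None, 3]) & (lon >= parents_array[:, None, 4])
    & (lat <= parents_array[:, None, 5]) & (lat >= parents_array[:, None, 6])
)

idx = np.argmax(inside, axis=0)                   # row index of the first matching grid
parent_ids = parents_array[idx, 0].astype(int)    # value of the id column for that row

print(idx)         # [0 1]
print(parent_ids)  # [1 2]
```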

Two warnings, though:

  • This assumes that there is always an answer. If entry j of locs_array is within the boundaries of no entry of parents_array, then idx[j] will be 0. So you either need to double-check every result that is 0 (because 0 can mean both that the location is inside parents_array[0] and that it is inside none), or, better, use a sentinel: let parents_array[0] be an impossible entry (for example, one whose upper bound is smaller than its lower bound, so that nothing can be inside it). Then an index of 0 always means that there is no match.
  • This is just a vectorization of your algorithm. I am pretty sure it is much faster than your implementation, because you are iterating in Python (with your for loop) while I am iterating in numpy (letting broadcasting do all the combinations). But it is still the exact same computation: compare every locs_array[j] with every parents_array[i]; numpy simply handles the loop for us. So, again, almost certainly much faster (I can't benchmark it, since you didn't provide a minimal reproducible example), but still the same algorithm. What was suggested in the comments is another algorithm (faster, with a different complexity) that takes advantage of the knowledge that parents_array is sorted, which allows, for example, a binary (dichotomic) search. But that is not as easy to vectorize.
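Both warnings can be handled in one place. The following is only a sketch under my own assumptions: the function name find_parents_chunked and the chunk_size parameter are hypothetical, and instead of the sentinel row it marks "no match" with -1 by checking whether any grid matched. It processes locs_array in chunks because the full mask has len(parents_array) × len(locs_array) entries; with the sizes given in the question (over 10^6 grids and over 10^8 locations) that is more than 10^14 booleans, which cannot be built at once.

```python
import numpy as np

def find_parents_chunked(locs_array, parents_array, chunk_size=1000):
    """Return, per location, the row index of the first matching grid, or -1 if none."""
    result = np.full(len(locs_array), -1, dtype=np.int64)

    # Grid bounds as column vectors of shape (n_parents, 1), computed once.
    upper_lon = parents_array[:, None, 3]
    lower_lon = parents_array[:, None, 4]
    upper_lat = parents_array[:, None, 5]
    lower_lat = parents_array[:, None, 6]

    for start in range(0, len(locs_array), chunk_size):
        chunk = locs_array[start:start + chunk_size]
        lon = chunk[None, :, 0]  # shape (1, n_chunk)
        lat = chunk[None, :, 1]

        # Boolean mask of shape (n_parents, n_chunk): same comparison as above.
        inside = ((lon <= upper_lon) & (lon >= lower_lon)
                  & (lat <= upper_lat) & (lat >= lower_lat))

        idx = np.argmax(inside, axis=0)   # first matching row (0 if there is none)
        found = inside.any(axis=0)        # distinguishes "row 0" from "no match"
        result[start:start + chunk_size] = np.where(found, idx, -1)

    return result


# Usage: the two sample locations from the question, plus one outside every grid.
locs_array = np.array([[-122.463425, 47.195741],
                       [-122.498139, 47.190166],
                       [0.0, 0.0]])
parents_array = np.array([
    [1.0, -122.463425, 47.195741, -122.46331271, -122.46353729, 47.19585367, 47.19562833],
    [2.0, -122.498149, 47.190275, -122.49803671, -122.49826129, 47.19038767, 47.19016233],
])

result = find_parents_chunked(locs_array, parents_array, chunk_size=2)
print(result)  # [ 0  1 -1]
```

The mask for one chunk still holds len(parents_array) × chunk_size booleans, so chunk_size has to be picked according to the available memory. This remains the same brute-force comparison, only split into pieces.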
Answered By: chrslg