Find index mapping between two numpy arrays

Question

Is there a nice way in numpy to get element-wise indexes of where each element in array1 is in array2?

An example:

import numpy as np

array1 = np.array([1, 3, 4])
array2 = np.arange(-2, 5, 1, dtype=int)  # np.int was removed in NumPy 1.24; use the builtin int

np.where(array1[0] == array2)
# (array([3]),)
np.where(array1[1] == array2)
# (array([5]),)
np.where(array1[2] == array2)
# (array([6]),)

I would like to do this in a single call:

np.where(array1 == array2)
# (array([3, 5, 6]),)

Is something like this possible? We are guaranteed that all entries in array1 can be found in array2.

Asked By: pingul

Answer 1

Approach #1: Use np.in1d to get a mask of the positions in array2 where matches occur, then np.where to get those index positions:

np.where(np.in1d(array2, array1))
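
One caveat worth illustrating: the mask is built over array2, so the indices come back in ascending array2 order rather than in the order of array1. The sketch below shuffles array1 (illustrative values, not from the question) and compares the result with the per-element lookup from the question. It uses np.isin, which newer NumPy releases recommend in place of np.in1d.

```python
import numpy as np

array1 = np.array([4, 1, 3])   # same values as in the question, shuffled
array2 = np.arange(-2, 5)      # [-2, -1, 0, 1, 2, 3, 4]

# Mask over array2, then the positions where the mask is True.
# These positions are always ascending, whatever the order of array1.
mask_idx = np.where(np.isin(array2, array1))[0]

# Per-element lookup, as in the question: one index per entry of array1.
per_element = np.array([np.where(array2 == v)[0][0] for v in array1])

print(mask_idx)      # [3 5 6]
print(per_element)   # [6 3 5]
```

If array1 is already sorted and has no duplicates, the two results coincide, as in the question's example.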

Approach #2: With np.searchsorted, which assumes array2 is sorted:

np.searchsorted(array2, array1)

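Note that if array2 is not sorted, we need to pass the optional sorter argument (the indices that sort array2, e.g. from np.argsort) and then index that sorter with the result to get positions in the original array2.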
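
A minimal sketch of the unsorted case, assuming (as in the question) that every entry of array1 occurs in array2. The array values and the names sorter, pos and idx are only illustrative.

```python
import numpy as np

array1 = np.array([1, 3, 4])
array2 = np.array([3, -2, 4, 0, 1, 2, -1])   # unsorted, illustrative values

# Indices that would sort array2.
sorter = np.argsort(array2)

# Positions of array1's values within the *sorted* view of array2.
pos = np.searchsorted(array2, array1, sorter=sorter)

# Map those positions back to indices in the original, unsorted array2.
idx = sorter[pos]

print(idx)           # [4 0 2]
print(array2[idx])   # [1 3 4]
```

Without the final sorter[pos] step, the result would refer to positions in the sorted order, not in array2 as stored.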

Sample run:

In [14]: array1
Out[14]: array([1, 3, 4])

In [15]: array2
Out[15]: array([-2, -1,  0,  1,  2,  3,  4])

In [16]: np.where(np.in1d(array2, array1))
Out[16]: (array([3, 5, 6]),)

In [17]: np.searchsorted(array2, array1)
Out[17]: array([3, 5, 6])

Runtime test:

In [62]: array1 = np.random.choice(10000,1000,replace=0)

In [63]: array2 = np.sort(np.random.choice(100000,10000,replace=0))

In [64]: %timeit np.where(np.in1d(array2, array1))
1000 loops, best of 3: 483 µs per loop

In [65]: %timeit np.searchsorted(array2, array1)
10000 loops, best of 3: 40 µs per loop

Answered By: Divakar

Answer 2

Here is a simpler way if your arrays are not too big.

np.equal.outer(array1,array2).argmax(axis=1)

If array1 has size N and array2 has size M, this creates a temporary boolean array of shape (N, M), so this method is not recommended when the arrays are so large that the temporary does not fit in memory.
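
A short sketch of this approach on the question's data. One thing to keep in mind: argmax returns 0 for a row that contains no True, so when the "every entry is present" guarantee might not hold, a check such as the found mask below (an illustrative name) helps catch silent mismatches.

```python
import numpy as np

array1 = np.array([1, 3, 4])
array2 = np.arange(-2, 5)      # [-2, -1, 0, 1, 2, 3, 4]

# Boolean table of shape (N, M): matches[i, j] is True where array1[i] == array2[j].
matches = np.equal.outer(array1, array2)

# Index of the first True in each row, i.e. the first match in array2.
idx = matches.argmax(axis=1)

# Rows without any match would also yield 0 from argmax, so check explicitly.
found = matches.any(axis=1)

print(matches.shape)   # (3, 7)
print(idx)             # [3 5 6]
print(found.all())     # True
```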

Answered By: syockit
