Applying a mathematical operation between rows of two NumPy arrays

Question:

Suppose I have two NumPy arrays, A (n1 x m) and B (n2 x m), and I want to apply a mathematical operation between each row of A and each row of B.

For example, say I want to compute the Euclidean distance between each row of A and each row of B and store the results in a new NumPy array C (n1 x n2).

The simple for-loop approach would be something like the following:

import numpy as np

# C[i, j] holds the distance between row i of A and row j of B
C = np.zeros((A.shape[0], B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i, j] = np.linalg.norm(A[i] - B[j])

However, this implementation is slow because it loops in Python over every pair of rows. How can I vectorize it to speed it up?

Asked By: MJ13


Answers:

You can add a new axis to each array and let broadcasting form every pairwise difference at once:

# n1 x m x n2
diff = A[:, :, None] - B[:, :, None].T

# n1 x n2 after summing across m
dists = np.sqrt((diff * diff).sum(1))
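
As a quick sanity check, the sketch below runs the broadcast version on small random inputs and compares it with the double loop from the question. The array sizes and the seed are arbitrary choices made for illustration; they do not come from the question.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for illustration only
A = rng.normal(size=(4, 3))     # n1 = 4, m = 3
B = rng.normal(size=(5, 3))     # n2 = 5, m = 3

# (n1, m, 1) minus (1, m, n2) broadcasts to (n1, m, n2)
diff = A[:, :, None] - B[:, :, None].T
dists = np.sqrt((diff * diff).sum(1))

# Reference result from the double loop in the question
C = np.zeros((A.shape[0], B.shape[0]))
for i in range(A.shape[0]):
    for j in range(B.shape[0]):
        C[i, j] = np.linalg.norm(A[i] - B[j])

print(diff.shape, dists.shape)  # (4, 3, 5) (4, 5)
print(np.allclose(dists, C))    # True
```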
Answered By: August
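
One thing to keep in mind with the broadcast approach is that it materializes an n1 x m x n2 intermediate array, which can become large. A possible alternative, sketched here and not part of the answer above, expands the squared distance as |a|^2 + |b|^2 - 2 a.b so that the largest array is only n1 x n2. The function name pairwise_euclidean is a hypothetical helper chosen for this example. This form can lose some precision when two rows are nearly identical.

```python
import numpy as np

def pairwise_euclidean(A, B):
    """Hypothetical helper: Euclidean distances between the rows of
    A (n1 x m) and the rows of B (n2 x m), returned as an n1 x n2 array."""
    sq_a = np.einsum("ij,ij->i", A, A)  # squared norm of each row of A, shape (n1,)
    sq_b = np.einsum("ij,ij->i", B, B)  # squared norm of each row of B, shape (n2,)
    # |a - b|^2 = |a|^2 + |b|^2 - 2 a.b, evaluated for all pairs at once
    sq = sq_a[:, None] + sq_b[None, :] - 2.0 * (A @ B.T)
    np.maximum(sq, 0.0, out=sq)  # clip tiny negatives caused by round-off
    return np.sqrt(sq)
```

If SciPy is available, scipy.spatial.distance.cdist(A, B) also computes pairwise Euclidean distances by default and may be the simpler choice.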