Numpy subtracting rows of a 2D array from another 2D array without for loop


I know that subtracting a row vector v of shape (1, 3072) from a 2D array A of shape (5000, 3072) works: because A and v have the same number of columns, v is broadcast across every row of A. But subtracting a stack of row vectors V, where each row of V has to be subtracted from the whole of A, cannot be done the same way.
I can’t figure out how to subtract V’s rows one by one from A without using a for loop.
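
To make the failure concrete, here is a minimal sketch with small stand-in shapes (the sizes 5, 2 and 3 are assumptions chosen for illustration, not the 5000 and 3072 from the question). It shows that a single row broadcasts, while a stack of rows with a different row count raises an error.

```python
import numpy as np

A = np.arange(15, dtype=float).reshape(5, 3)   # stand-in for the (5000, 3072) array
v = np.array([[1.0, 2.0, 3.0]])                # a single row vector, shape (1, 3)
V = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])                # a stack of row vectors, shape (2, 3)

# A single row is broadcast across every row of A.
diff_single = A - v
print(diff_single.shape)   # (5, 3)

# A stack of rows is not: the leading dimensions 5 and 2 are incompatible.
try:
    A - V
    broadcast_failed = False
except ValueError as exc:
    broadcast_failed = True
    print("broadcast error:", exc)
```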

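    import numpy as np

    def compute_distances_one_loop(V):
        # A is assumed to be defined elsewhere, as in the original post.
        num_test = V.shape[0]
        num_train = A.shape[0]
        dists = np.zeros((num_test, num_train))
        for i in range(num_test):
            # The loop body was missing from the post; this line is an assumed
            # reconstruction: distance from row i of V to every row of A.
            dists[i, :] = np.sqrt(np.sum(np.square(A - V[i]), axis=1))
        return dists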

Above is the semi-vectorized form of this problem. How do I get rid of that for loop?

For example


How do I get the matrix_g of shape (6, 3) without using a for loop?



If I understood your question correctly, you can make a big array (of the same size as A) by concatenating occurrences of V:

height_A, height_V = A.shape[0], V.shape[0]
occurrences, remainder = divmod(height_A, height_V)
mask = [V for i in range(occurrences)] + [V[:remainder]]
big_V = np.concatenate(mask)

Now you can safely compute A - big_V.

(I separated the steps to make it clearer, but you can easily combine them into a single statement:)

big_V = np.concatenate([V for i in range(A.shape[0]//V.shape[0])] + [V[:A.shape[0]%V.shape[0]]])
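
As a quick check of the tiling idea, the sketch below uses small assumed shapes (5 rows for A, 2 for V). Note that this approach subtracts the rows of V cyclically, so row i of A has row i % 2 of V subtracted from it; it does not subtract every row of V from every row of A.

```python
import numpy as np

A = np.zeros((5, 3))
V = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

# Two full copies of V plus one leftover row fill the 5 rows of A.
occurrences, remainder = divmod(A.shape[0], V.shape[0])
big_V = np.concatenate([V] * occurrences + [V[:remainder]])

print(big_V.shape)   # (5, 3)
result = A - big_V
```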


Edit: I now better understand what you need: subtracting EACH row of V from the whole of A. This is possible by adding a third dimension to both arrays to make use of broadcasting. In the picture that accompanied the original answer, A2 - V2 is represented by the array of green panes.

A2 = np.expand_dims(A, axis = 0)  # from shape (5000, 3072) to (1, 5000, 3072)
V2 = np.expand_dims(V, axis = 1)  # from shape (500, 3072) to (500, 1, 3072)
print (A2 - V2)   # broadcasting makes the resulting shape (500, 5000, 3072)

Example, with:

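A = np.ones((3,3))*6
V = np.array([[1,2,3],[4,5,6]])
A2 = np.expand_dims(A, axis = 0)  # shape (1, 3, 3)
V2 = np.expand_dims(V, axis = 1)  # shape (2, 1, 3)

print(A2 - V2)
# array([[[5., 4., 3.],
#         [5., 4., 3.],
#         [5., 4., 3.]],
#        [[2., 1., 0.],
#         [2., 1., 0.],
#         [2., 1., 0.]]])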
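
The same reshaping can also be written with `None` (an alias of `np.newaxis`) in an index expression. The sketch below uses small assumed shapes to show that both spellings give the same result, and that slice i of the output is A minus row i of V.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((4, 3))   # stand-in for the (5000, 3072) array
V = rng.random((2, 3))   # stand-in for the (500, 3072) array

# Two equivalent ways to add the extra axes before subtracting.
diff_expand = np.expand_dims(A, axis=0) - np.expand_dims(V, axis=1)
diff_index = A[None, :, :] - V[:, None, :]

print(diff_expand.shape)   # (2, 4, 3)
```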

And you can calculate the array of distances between rows of A and V:

D = np.sqrt(np.square(A2 - V2).sum(axis = 2))

# array([[7.07106781, 7.07106781, 7.07106781],
#        [2.23606798, 2.23606798, 2.23606798]])
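
One practical caveat: with the shapes from the question, the intermediate array A2 - V2 holds 500 × 5000 × 3072 values, which is roughly 61 GB in float64, so the broadcasted difference may not fit in memory. A commonly used alternative, not part of the answer above, expands the squared distance as |v|² - 2 v·a + |a|² so that only the (500, 5000) result is materialized. The sketch below is written under that assumption; the function name `pairwise_distances` is hypothetical.

```python
import numpy as np

def pairwise_distances(V, A):
    """Euclidean distance between every row of V and every row of A."""
    v_sq = np.sum(V ** 2, axis=1)[:, None]   # shape (num_test, 1)
    a_sq = np.sum(A ** 2, axis=1)[None, :]   # shape (1, num_train)
    cross = V @ A.T                          # shape (num_test, num_train)
    sq = v_sq - 2.0 * cross + a_sq
    # Rounding can leave tiny negative values where the true distance is 0.
    return np.sqrt(np.maximum(sq, 0.0))

# Same small example as above.
A = np.ones((3, 3)) * 6
V = np.array([[1, 2, 3], [4, 5, 6]])
D = pairwise_distances(V, A)
print(D)
```

On the small example this should agree with the broadcasted version up to floating-point rounding.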
Answered By: Swifty