What is the multiplication operator actually doing with numpy arrays?

Question:

I am learning NumPy and I am not really sure what the * operator is actually doing. It seems like some form of multiplication, but I am not sure how the result is determined. From ipython:

In [1]: import numpy as np

In [2]: a=np.array([[1,2,3]])

In [3]: b=np.array([[4],[5],[6]])

In [4]: a*b
Out[4]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [5]: b*a
Out[5]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [6]: b.dot(a)
Out[6]: 
array([[ 4,  8, 12],
       [ 5, 10, 15],
       [ 6, 12, 18]])

In [7]: a.dot(b)
Out[7]: array([[32]])

It seems like it is doing matrix multiplication, but only b multiplied by a, not the other way around. What is going on?

Asked By: Karel Bílek


Answers:

It’s a little bit complicated and has to do with the concept of broadcasting and the fact that NumPy's basic arithmetic operators work element-wise.

  1. a is a 2D array with 1 row and 3 columns, and b is a 2D array with 3 rows and 1 column.
  2. If you multiply them with a * b, NumPy works element by element (every basic arithmetic operator is element-wise; dot is the exception), so it must first broadcast the arrays so that their shapes match in every dimension.
  3. Since the first array is 1×3 and the second is 3×1, both can be broadcast to a 3×3 array according to the broadcasting rules. They will look like:
a = [[1, 2, 3],
     [1, 2, 3],
     [1, 2, 3]]

b = [[4, 4, 4],
     [5, 5, 5],
     [6, 6, 6]]

And now NumPy can multiply them element by element, giving you the result:

[[ 4,  8, 12],
 [ 5, 10, 15],
 [ 6, 12, 18]]
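
The stretching described above can be made visible with np.broadcast_to. This is only an illustrative sketch: NumPy does not need you to call it, and the names a_wide, b_wide and product are made up for this example.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

# Stretch both operands to the common shape (3, 3).
# These are read-only views; no data is copied.
a_wide = np.broadcast_to(a, (3, 3))
b_wide = np.broadcast_to(b, (3, 3))

print(a_wide)
print(b_wide)

# Multiplying the stretched arrays element by element
# gives the same result as a * b.
product = a_wide * b_wide
print(np.array_equal(product, a * b))
```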

When you call .dot, it performs standard matrix multiplication instead. See the NumPy docs for more.
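
To see why a * b happens to equal b.dot(a) for these particular shapes, it may help to compare the three operations side by side. This is a small sketch using the arrays from the question; the variable names elementwise, outer and inner are made up for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

elementwise = a * b   # broadcast to (3, 3), then multiply element by element
outer = b.dot(a)      # matrix product (3, 1) x (1, 3) -> (3, 3)
inner = a.dot(b)      # matrix product (1, 3) x (3, 1) -> (1, 1)

print(elementwise.shape, outer.shape, inner.shape)

# For a column times a row, entry [i][j] of the matrix product is
# b[i][0] * a[0][j], which is exactly what broadcasting computes.
print(np.array_equal(elementwise, outer))

# The @ operator is the matrix product as well.
print(np.array_equal(a @ b, inner))
```

So the match between a * b and b.dot(a) is a coincidence of the 1×3 and 3×1 shapes, not a sign that * does matrix multiplication.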

Answered By: Viktor Kerkez

* does elementwise multiplication.

Since the arrays are of different shapes, broadcasting rules will be applied.

In [5]: a.shape
Out[5]: (1, 3)

In [6]: b.shape
Out[6]: (3, 1)

In [7]: (a * b).shape
Out[7]: (3, 3)

The broadcasting rules are:

  1. All input arrays with ndim smaller than the input array of largest ndim, have 1’s prepended to their shapes (does not apply here).
  2. The size in each dimension of the output shape is the maximum of all the input sizes in that dimension.
  3. An input can be used in the calculation if its size in a particular dimension either matches the output size in that dimension, or has value exactly 1.
  4. If an input has a dimension size of 1 in its shape, the first data entry in that dimension will be used for all calculations along that dimension. In other words, the stepping machinery of the ufunc will simply not step along that dimension (the stride will be 0 for that dimension).

So the resulting shape must be (3, 3) (the maximum of a’s and b’s sizes in each dimension), and while performing the multiplication NumPy will not step through a’s first dimension or b’s second dimension (their sizes are 1).
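
The "not stepping" behaviour can be checked by looking at the strides of the broadcast views. This is a sketch, assuming np.broadcast_arrays is an acceptable stand-in for what the ufunc does internally; a_view and b_view are names made up for this example.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)

# The output shape is the per-dimension maximum of the input shapes.
print(np.broadcast(a, b).shape)

# broadcast_arrays returns views of a and b with the common shape.
a_view, b_view = np.broadcast_arrays(a, b)
print(a_view.shape, b_view.shape)

# A stride of 0 means "do not move in memory along this dimension":
# a is not stepped along its first dimension, b not along its second.
print(a_view.strides)
print(b_view.strides)
```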

The result’s [i][j] element is equal to the product of the broadcast a’s and b’s [i][j] elements.

(a * b)[0][0] == a[0][0] * b[0][0]
(a * b)[0][1] == a[0][1] * b[0][0]  # (not stepping through b's second dimension)
(a * b)[0][2] == a[0][2] * b[0][0]
(a * b)[1][0] == a[0][0] * b[1][0]  # (not stepping through a's first dimension)

etc.
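
The pattern in the lines above generalises to (a * b)[i][j] == a[0][j] * b[i][0]. A short loop can confirm it for every index; this is just a check written for illustration, with result and checks as made-up names.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # shape (1, 3)
b = np.array([[4], [5], [6]])    # shape (3, 1)
result = a * b

# a always contributes from its only row (index 0),
# b always contributes from its only column (index 0).
checks = [
    result[i][j] == a[0][j] * b[i][0]
    for i in range(3)
    for j in range(3)
]
print(all(checks))
```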

Answered By: Pavel Anossov