Why is add and assign (+=) behaving strangely with numpy.ndarrays?

Question:

Consider the following Python code:

import numpy

a = numpy.random.rand(3,4)
b = numpy.random.rand(3,4)

c = a
c += b

c/2. - (a + b)/2.

The result of the last line is not an array of zeros. However, if I do:

d = a
d = d + b

d/2. - (a + b)/2.

Then the result is an array of zeros, as expected. This looks strange to me. Can anybody please explain this behaviour? Is it wise to use +=, /=, ... for numpy arrays at all? Thank you!

(This is only a minimal example, I have to add up several arrays.)

Asked By: Marius


Answers:

The += operation works in place. This means it changes the contents of array a in your first example!

The assignment c = a makes c refer to exactly the same array object as a, so c += b also adds b to a.

The assignment d = a likewise makes d refer to a. But d = d + b computes d + b into a newly allocated array and then rebinds the name d to that new array, leaving a unchanged.
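The difference can be checked directly with an identity test. The following is only a small sketch; the array values and the variable names holding the results are chosen for illustration and are not part of the question:

```python
import numpy

a = numpy.ones((3, 4))
b = numpy.full((3, 4), 2.0)

c = a                  # c is just another name for the same array object
c += b                 # in place: writes into the data shared with a
same_after_inplace = c is a                # True
a_changed = bool((a == 3.0).all())         # True: a was modified as well

a = numpy.ones((3, 4)) # start again from a fresh array
d = a
d = d + b              # builds a new array and rebinds the name d to it
same_after_rebind = d is a                 # False
a_untouched = bool((a == 1.0).all())       # True: a still holds ones
shares = numpy.shares_memory(d, a)         # False: separate buffers
```

Checking `x is y` (or `numpy.shares_memory(x, y)` when views are involved) is a quick way to find out whether an in-place operation on one name will show up under another.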

As you can see, the differences are very important! For many algorithms you can exploit either one of these properties to gain efficiency, but caution is always necessary.
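As one possible illustration of the efficiency point, an in-place update or an explicit output buffer avoids allocating a new array for every result. This sketch assumes a preallocated buffer named `out`, which is a hypothetical name and not something from the question:

```python
import numpy

a = numpy.random.rand(3, 4)
b = numpy.random.rand(3, 4)
expected = a + b                     # reference result in a new array

out = numpy.empty_like(a)            # allocate the result buffer once
result = numpy.add(a, b, out=out)    # writes a + b into out, returns out

a_before = a.copy()
a += b                               # in place: reuses a's own memory
```

Here `result` is the same object as `out`, and after the last line `a` holds the sum while the original values survive only in `a_before`. Whether this is worth it depends on array size and on whether the old contents are still needed.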

See here for a tutorial and here for an in-depth Stack Overflow question.

Answered By: eickenberg

Because the line c = a only makes c point to a. It doesn’t copy a. Then c += b also adds to a.

To add up several arrays, you have to either do it directly, or use a sum function.

c = a + b
c = sum([a, b])
c = numpy.sum([a, b], axis=0)

Or copy the array first:

c = a.copy()
c += b
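For the case of adding up several arrays, the copy-first approach might look like the sketch below. The list `arrays` and its contents are made up for illustration:

```python
import numpy

# Hypothetical inputs: four 3x4 arrays filled with 1.0, 2.0, 3.0 and 4.0.
arrays = [numpy.full((3, 4), float(i)) for i in range(1, 5)]

total = arrays[0].copy()   # private copy, so arrays[0] is not modified
for arr in arrays[1:]:
    total += arr           # in place, but only on the private copy

reference = numpy.sum(arrays, axis=0)   # same result in a single call
```

Only `total` is changed by the loop, and every element of it ends up as 10.0, matching `reference`.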

Answered By: parchment

This is because when you do:

c = a

from then on, a and c are the same object. So after

c += b

you still have c is a: both names refer to the same array, which now contains the sum.

Answered By: behzad.nouri