Why does `/=` raise an error but not `x = x / y` with a read-only numpy array?

Question:

I’ve always thought x /= y is equivalent to x = x / y. But now I get an error when I use /= and not when I use x = x / y, so they clearly aren’t the same in Python.

The code is below (it is simple deep learning code in TensorFlow; see the code comments for details).

import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()

# x_train is a (60000, 28, 28) numpy matrix

x_train /= 1 # this will raise error "ValueError: output array is read-only"
x_train = x_train / 1 # but this will work fine
ValueError                                Traceback (most recent call last)
<ipython-input-44-fceb080f135a> in <module>()
      1 
----> 2 x_train /= 1

ValueError: output array is read-only

I want to know the difference between them. Why am I getting this error from /=?

Asked By: Peyman


Answers:

As explained in the question “Immutable numpy array?”, numpy arrays can be explicitly marked read-only.

When you run somearray /= value, you’re asking for that array to be modified in place. For typical mutable objects (like most numpy arrays), this changes the object itself, not just the one name bound to it, so every other reference to the same array (including ones held internally by the tensorflow library that provided it) sees the change.

By contrast, when you run somearray = somearray / value, you’re creating a new object, not modifying the old one, so there’s no conflict with somearray being marked read-only.
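
A minimal sketch of the difference, assuming a small array marked read-only by hand with setflags (the names arr and original and the values are made up for illustration; arr stands in for x_train):

```python
import numpy as np

# Hypothetical stand-in for x_train: a small float array marked read-only.
arr = np.array([2.0, 4.0, 6.0])
arr.setflags(write=False)

original = arr  # a second reference to the same object

# In-place division has to write into the existing buffer, so it fails.
try:
    arr /= 2
    inplace_failed = False
except ValueError as exc:
    inplace_failed = True
    print(exc)

# Plain division builds a new array and rebinds the name; the read-only
# original is left untouched.
arr = arr / 2
print(arr is original)      # False
print(arr.flags.writeable)  # True
print(original)             # [2. 4. 6.]
```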

The implementation of __itruediv__ could instead return a completely new object when the array is read-only, which would make /= on read-only arrays behave the way += behaves for integers. However, that would mean the operation silently allocates memory, which is potentially expensive. For numpy, it makes sense not to do expensive things unless the author knows they’re being done, so that people don’t write unnecessarily slow code by mistake. (Having the read-only flag significantly change the performance and memory usage characteristics of an object would be a fair bit of extra state for a developer to keep in their head in order to write correct code!)
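
If the in-place form is what you want, one option is to make the allocation explicit by taking a writable copy first. A minimal sketch, assuming a small hand-made read-only array (readonly and writable are hypothetical names; readonly stands in for the array tensorflow returns):

```python
import numpy as np

# Hypothetical stand-in for the read-only array returned by the library.
readonly = np.array([10.0, 20.0, 30.0])
readonly.setflags(write=False)

# copy() allocates a fresh, writable array, so the cost is visible in the code.
writable = readonly.copy()
writable /= 10  # in-place division now succeeds

print(writable.flags.writeable)  # True
print(readonly)                  # [10. 20. 30.]
```

The copy is paid for once, and every later in-place operation on writable reuses that buffer.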

Answered By: Charles Duffy

It has to do with Python operator semantics and the principle of least surprise.

  • The expression x = x + 1 is approximately equivalent to x = type(x).__add__(x, 1)
  • The expression x += 1 is approximately equivalent to x = type(x).__iadd__(x, 1)

By convention, __add__ never mutates the object it is invoked on. __iadd__ sometimes does and sometimes doesn’t. There are examples of both among Python’s built-ins:

  • int and str are immutable (they don’t even define __iadd__, so Python falls back to __add__), so += always produces a new object. The re-assignment step is very important in this case, since otherwise the name wouldn’t get bound to the new value.
  • list adds in-place. The reassignment is effectively a no-op in this case (but it still happens).
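
The two behaviours can be observed by keeping a second reference to the object before applying +=. This is a small sketch using only built-ins; the variable names are made up for illustration:

```python
# int: += produces a new object and rebinds the name.
n = 1000
n_alias = n
n += 1
print(n, n_alias)            # 1001 1000

# list: += extends the existing object in place.
items = [1, 2]
items_alias = items
items += [3]
print(items_alias)           # [1, 2, 3]
print(items is items_alias)  # True

# int has no __iadd__ at all, so Python falls back to __add__.
print(hasattr(int, "__iadd__"), hasattr(list, "__iadd__"))  # False True
```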

Numpy arrays are usually mutable. Operations like +=, -=, *=, /=, etc. are truly in-place. In fact, they are supported by the same ufuncs that support the regular operations (add, subtract, multiply, true_divide), using the out parameter.
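
As a rough illustration of that relationship, the sketch below compares a /= 2 with an explicit np.true_divide call that writes into its own input via out. The arrays a and b are made up for the example:

```python
import numpy as np

a = np.array([2.0, 4.0, 6.0])
b = a.copy()

a /= 2                       # augmented assignment
np.true_divide(b, 2, out=b)  # same ufunc, writing the result back into b

print(np.array_equal(a, b))  # True

# With a read-only output array, the explicit ufunc call fails the same way.
b.setflags(write=False)
try:
    np.true_divide(b, 2, out=b)
    out_failed = False
except ValueError:
    out_failed = True
print(out_failed)            # True
```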

The developers had a choice to make for read-only arrays: either change the semantics of __iadd__, or raise an error. The principle of least surprise dictated the latter: a function should not change its fundamental behavior under certain circumstances without telling you. It’s safe to assume that you used __iadd__ rather than __add__ because you wanted an in-place operation. The function notifies you when it can’t do that, because the alternative is to do something you specifically didn’t ask for.

Answered By: Mad Physicist