Is making in-place operations return the object a bad idea?

Question:

I’m talking mostly about Python here, but I suppose this probably holds for most languages. If I have a mutable object, is it a bad idea to make an in-place operation also return the object? It seems like most examples just modify the object and return None. For example, list.sort.

Asked By: asmeurer


Answers:

I suppose it depends on the use case. I don’t see why returning an object from an in-place operation would hurt, other than maybe you won’t use the result, but that’s not really a problem if you’re not being super-fastidious about pure functionalism. I like the call-chaining pattern, such as jQuery uses, so I appreciate it when functions return the object they’ve acted upon, in case I want to use it further.

Answered By: Peter Hull

Returning the modified object from the method that modified it can have some benefits, but is not recommended in Python. Returning self after a modification operation lets you perform method chaining on the object, which is a convenient way of executing several methods on the same object and a very common idiom in object-oriented programming. In turn, method chaining allows a straightforward implementation of fluent interfaces. It also makes some functional-programming idioms easier to express.
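As a rough sketch of what method chaining looks like when mutating methods return self, consider the following. TextBuilder and its methods are hypothetical names made up for this illustration; they do not come from any library.

```python
class TextBuilder:
    """Hypothetical builder, used only to illustrate method chaining."""

    def __init__(self):
        self._parts = []

    def add(self, text):
        self._parts.append(text)  # mutate in place...
        return self               # ...and return self so calls can be chained

    def upper(self):
        self._parts = [part.upper() for part in self._parts]
        return self

    def build(self):
        # A pure query: returns data and does not modify the builder.
        return " ".join(self._parts)


result = TextBuilder().add("hello").add("world").upper().build()
print(result)  # HELLO WORLD
```

Without the return self lines, each call would return None and the chain would fail on the second call with an AttributeError.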

To name a few examples: in Python, the Moka library uses method chaining. In Java, the StringBuilder class allows multiple append() invocations on the same object. In JavaScript, jQuery uses method chaining extensively. Smalltalk takes this idea to the next level: by default, all methods return self unless otherwise specified (therefore encouraging method chaining). Contrast this with Python, where a method without an explicit return statement returns None.

The use of this idiom is not common in Python, because Python’s built-in types and standard library generally follow the Command/Query Separation principle, which states that “every method should either be a command that performs an action, or a query that returns data to the caller, but not both”.
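The built-in list type gives a small illustration of that separation: list.sort() acts as the command and sorted() as the query. The variable names below are only for the example.

```python
numbers = [3, 1, 2]

# Command: list.sort() mutates the list in place and returns None.
command_result = numbers.sort()
print(command_result)  # None
print(numbers)         # [1, 2, 3]

# Query: sorted() leaves its argument untouched and returns a new list.
original = [3, 1, 2]
query_result = sorted(original)
print(query_result)    # [1, 2, 3]
print(original)        # [3, 1, 2]
```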

All things considered, whether it’s a good or bad idea to return self at the end is a matter of programming culture and convention, mixed with personal taste. As mentioned above, some programming languages encourage this (like Smalltalk) whereas others discourage it (like Python). Each point of view has advantages and disadvantages, open to heated discussions. If you’re a by-the-book Pythonista, it is better to refrain from returning self; just be aware that breaking this rule can sometimes be useful.

Answered By: Óscar López

Yes, it is a bad idea. The reason is that if in-place and non-in-place operations have apparently identical output, then programmers will frequently mix them up (list.sort() vs. sorted()), and that results in hard-to-detect errors.

In-place operations that return the object they modified do allow “method chaining”. However, this is bad practice, because you may accidentally bury functions with side effects in the middle of a chain.

To prevent errors like this, method chains should only have one method with side effects, and that method should be at the end of the chain. Functions before that in the chain should transform the input without side effects (for instance, navigating a tree, slicing a string, etc.). If in-place operations return the object itself, then sooner or later a programmer will use one in place of an alternative function that returns a copy and therefore has no side effects (again, list.sort() vs. sorted()), which may result in an error that is difficult to debug.
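A small sketch of the kind of mix-up described above. ChainableList and sort_inplace are hypothetical, written only to show what would happen if an in-place sort returned the object itself.

```python
class ChainableList(list):
    """Hypothetical list subclass whose in-place sort returns self."""

    def sort_inplace(self):
        self.sort()
        return self


scores = ChainableList([3, 1, 2])

# Reads as if it produced a sorted copy, but it also mutates `scores`.
ranked = scores.sort_inplace()

print(ranked is scores)  # True: both names refer to the same object
print(scores)            # [1, 2, 3]: the original order is gone
```

With the real list.sort(), the same mistake would assign None to ranked, so the error would surface the first time ranked is used instead of silently corrupting scores.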

This is the reason methods and functions in the Python standard library, as a rule, either return a new object or modify an object in place and return None, rather than modifying an object in place and also returning it. Other Python libraries like Django also follow this practice (see this very similar question about Django).

Answered By: Andrew Gorcester

The answers here about not returning from in-place operations messed me up for a bit until I came across this other SO post that links to the Python documentation (which I thought I read, but must have only skimmed). The documentation, in reference to in-place operators, says:

These methods should attempt to do the operation in-place (modifying self) and return the result (which could be, but does not have to be, self).

When my in-place operator method did not return self, the name on the left-hand side of the operation became None. In the example below, method one fails with a TypeError saying that vars() requires an object with a __dict__ attribute. Inspecting self at that point shows that it is None.

# Skipping type enforcement and such.
from copy import copy
import operator
def imported_utility(attrs):
    # Stand-in (an assumption) for some imported helper that takes a dict.
    return attrs
class A:
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def one(self, scalar):
        self *= scalar  # Rebinds the local name self to what __imul__ returns.
        return imported_utility(vars(self))
    def two(self, scalar):
        tmp = self * scalar
        return imported_utility(vars(tmp))
    def three(self, scalar):
        return imported_utility(vars(self * scalar))
    # ... addition, subtraction, etc.; as below.
    def __mul__(self, other):
        tmp = copy(self)
        tmp._inplace_operation(other, operator.imul)
        return tmp
    def __imul__(self, other):  # Fails: implicitly returns None.
        self._inplace_operation(other, operator.imul)
    # Returns None, which is what makes __imul__ fail.
    def _inplace_operation(self, other, op):
        self.a = op(self.a, other)
        self.b = op(self.b, other)

The * operator works (two and three), but *= (one) does not work until __imul__ returns self:

    def __imul__(self, other):
        return self._inplace_operation(other, operator.imul)
    def _inplace_operation(self, other, op):
        self.a = op(self.a, other)
        self.b = op(self.b, other)
        return self

I do not fully understand this behavior, but a follow-up comment on the referenced post says that, without returning self, the in-place method really does modify the object, but the name is then rebound to None. Python rebinds the name to whatever __imul__ returns, so unless self is returned, the name ends up bound to None. That behavior can be seen by keeping a separate reference to the object.
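A minimal sketch of that check, using a hypothetical class Broken whose __imul__ forgets to return self (it is a cut-down stand-in for the class A above, not the original code):

```python
class Broken:
    """Hypothetical class whose __imul__ forgets to return self."""

    def __init__(self, value):
        self.value = value

    def __imul__(self, other):
        self.value *= other  # mutates the object, then implicitly returns None


obj = Broken(2)
alias = obj         # second reference to the same object

obj *= 5            # the name obj is rebound to the return value of __imul__

print(obj)          # None: the name now refers to None
print(alias.value)  # 10: the object itself was modified in place
```

Adding return self at the end of __imul__ makes obj and alias refer to the same object again after the operation.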

Answered By: Kevin