Ensuring __init__ is only called once when class instance is created by constructor or __new__

Question:

I’m trying to understand how new instances of a Python class should be created when the creation process can either be via the constructor or via the __new__ method. In particular, I notice that when using the constructor, the __init__ method is automatically called after __new__, while when invoking __new__ directly the __init__ method is not automatically called. I can force __init__ to be called when __new__ is explicitly called by embedding a call to __init__ within __new__, but then __init__ ends up getting called twice when the class is created via the constructor.

For example, consider the following toy class, which stores one internal property, namely a list object called data: it is useful to think of this as the start of a vector class.

class MyClass(object):
    def __new__(cls, *args, **kwargs):
        obj = object.__new__(cls, *args, **kwargs)
        obj.__init__(*args, **kwargs)
        return obj

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return self.__new__(type(self), self.data[index])

    def __repr__(self):
        return repr(self.data)

A new instance of the class can be created either using the constructor (not actually sure if that is the right terminology in Python), something like

x = MyClass(range(10))

or via slicing, which you can see invokes a call to __new__ in the __getitem__ method.

x2 = x[0:2]

In the first instance, __init__ will be called twice (both via the explicit call within __new__ and then again automatically), and once in the second instance. Obviously I would only like __init__ to be invoked once in any case. Is there a standard way to do this in Python?

Note that in my example I could get rid of the __new__ method and redefine __getitem__ as

def __getitem__(self, index):
    return MyClass(self.data[index])

but then this would cause a problem if I later want to inherit from MyClass, because if I make a call like child_instance[0:2] I will get back an instance of MyClass, not the child class.

Asked By: Abiel


Answers:

There are a couple of things that shouldn’t be done:

  • Call __init__ from __new__
  • Call __new__ directly in a method

As you have already seen, both the __new__ and the __init__ methods are called automatically when creating an object of a given class. Calling them directly would break this mechanism (calling a parent's __init__ from inside another __init__ is allowed, though, as can be seen in the example below).

You can get the class of the object in any method by reading its __class__ attribute, as in the following example:

class MyClass(object):
    def __new__(cls, *args, **kwargs):
        # Customized __new__ implementation here
        obj = super(MyClass, cls).__new__(cls)
        return obj

    def __init__(self, data):
        super(MyClass, self).__init__()
        self.data = data

    def __getitem__(self, index):
        cls = self.__class__
        return cls(self.data[index])

    def __repr__(self):
        return repr(self.data)

x = MyClass(range(10))
x2 = x[0:2]
Answered By: jcollado

First, some basic facts about __new__ and __init__:

  • __new__ is a constructor.
  • __new__ typically returns an instance of cls, its first argument.
  • When __new__ returns an instance of cls, Python then calls __init__ on that instance.
  • __init__ is an initializer. It modifies the instance (self)
    returned by __new__. It does not need to return self.

When MyClass defines:

def __new__(cls, *args, **kwargs):
    obj = object.__new__(cls, *args, **kwargs)
    obj.__init__(*args, **kwargs)
    return obj

MyClass.__init__ gets called twice. Once from calling obj.__init__ explicitly, and a second time because __new__ returned obj, an instance of cls. (Since the first argument to object.__new__ is cls, the instance returned is an instance of MyClass so obj.__init__ calls MyClass.__init__, not object.__init__.)
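
A minimal sketch of the double call, assuming Python 3 (where object.__new__ must be given only the class once __new__ is overridden). The init_calls counter is a hypothetical addition, there only to make the calls visible:

```python
class MyClass(object):
    init_calls = 0  # hypothetical counter, only here to make the calls visible

    def __new__(cls, *args, **kwargs):
        # Pass only cls on: in Python 3, object.__new__ rejects extra
        # arguments when __new__ is overridden.
        obj = object.__new__(cls)
        obj.__init__(*args, **kwargs)  # first call (explicit)
        return obj  # an instance of cls, so Python calls __init__ again

    def __init__(self, data):
        type(self).init_calls += 1
        self.data = data


x = MyClass(list(range(10)))
print(MyClass.init_calls)  # 2: once explicitly, once automatically

MyClass.init_calls = 0
y = MyClass.__new__(MyClass, [1, 2])
print(MyClass.init_calls)  # 1: only the explicit call inside __new__
```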


The Python 2.2.3 release notes have an interesting comment that sheds light on when to use __new__ and when to use __init__:

The __new__ method is called with the class as its first argument; its
responsibility is to return a new instance of that class.

Compare this to __init__: __init__ is called with an instance as its
first argument, and it doesn’t return anything; its responsibility is
to initialize the instance.

All this is done so that immutable types can preserve their
immutability while allowing subclassing.

The immutable types (int, long, float, complex, str, unicode, and
tuple) have a dummy __init__, while the mutable types (dict, list,
file, and also super, classmethod, staticmethod, and property) have a
dummy __new__.

So, use __new__ to define immutable types, and use __init__ to define mutable types. While it is possible to define both, you should not need to do so.
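
To illustrate the immutable case, here is a small sketch, assuming Python 3. The Celsius class is a made-up example and not something from the release notes:

```python
class Celsius(float):
    """Hypothetical immutable type: a float clamped at absolute zero."""

    def __new__(cls, degrees):
        # The value must be fixed here; by the time __init__ runs,
        # the float has already been created and cannot be changed.
        return super().__new__(cls, max(degrees, -273.15))


t = Celsius(-500)
print(float(t))  # -273.15
print(isinstance(t, Celsius))  # True
```

Doing the clamping in __init__ instead would have no effect, because the float's value is already set by then.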


Thus, since MyClass is mutable, you should only define __init__:

class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)

x = MyClass(range(10))
x2 = x[0:2]
Answered By: unutbu

When you create an instance of a class with MyClass(args), the default instance creation sequence is as follows:

  1. new_instance = MyClass.__new__(MyClass, args) is invoked to get a new "blank" instance (the class itself is passed explicitly as the first argument)
  2. new_instance.__init__(args) is invoked (new_instance is the instance returned from the call to __new__ as above) to initialise the attributes of the new instance [1]
  3. new_instance is returned as the result of MyClass(args)

From this, it is clear that calling MyClass.__new__ yourself will not result in __init__ being called, so you’ll end up with an uninitialised instance. It’s equally clear that putting a call to __init__ into __new__ is not correct either, as then MyClass(args) will call __init__ twice.
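
The three steps can be written out roughly as follows. This is only a sketch of the protocol, assuming Python 3; create and Point are hypothetical names, and the real logic lives inside type.__call__:

```python
def create(cls, *args, **kwargs):
    """Hypothetical helper that mimics what calling cls(*args, **kwargs) does."""
    new_instance = cls.__new__(cls, *args, **kwargs)  # step 1
    if isinstance(new_instance, cls):  # see footnote [1]
        new_instance.__init__(*args, **kwargs)  # step 2
    return new_instance  # step 3


class Point(object):  # hypothetical example class
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = create(Point, 1, 2)
print(p.x, p.y)  # 1 2

blank = Point.__new__(Point)
print(hasattr(blank, "x"))  # False: __init__ never ran
```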

The source of your problem is this:

I’m trying to understand how new instances of a Python class should be
created when the creation process can either be via the constructor or
via the __new__ method

The creation process should not normally be via the __new__ method at all. __new__ is a part of the normal instance creation protocol, so you shouldn’t expect it to invoke the whole protocol for you.

One (bad) solution would be to implement this protocol by hand yourself; instead of:

def __getitem__(self, index):
    return self.__new__(type(self), self.data[index])

you could have:

def __getitem__(self, index):
    new_item = self.__new__(type(self), self.data[index])
    new_item.__init__(self.data[index])
    return new_item

But really, what you want to do is not mess with __new__ at all. The default __new__ is fine for your case, and the default instance creation protocol is fine for your case, so you should neither implement __new__ nor call it directly.

What you want is to create a new instance of the class the normal way, by calling the class. If there’s no inheritance going on and you don’t think there ever will be, simply replace self.__new__(type(self), self.data[index]) with MyClass(self.data[index]).

If you think there might one day be subclasses of MyClass that would want to create instances of the subclass through slicing rather than MyClass, then you need to dynamically get the class of self and invoke that. You already know how to do this, because you used it in your program! type(self) will return the type (class) of self, which you then can invoke exactly as you would invoke it directly through MyClass: type(self)(self.data[index]).
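
A quick check that this does what you want with inheritance, assuming Python 3; Child is a hypothetical subclass added just for the demonstration:

```python
class MyClass(object):
    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # type(self) is the runtime class, so subclasses get their own type back
        return type(self)(self.data[index])

    def __repr__(self):
        return repr(self.data)


class Child(MyClass):  # hypothetical subclass
    pass


c = Child(list(range(10)))
print(type(c[0:2]).__name__)  # Child
print(c[0:2])  # [0, 1]
```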


As an aside, the point of __new__ is when you want to customise the process of getting a "new" blank instance of a class before it is initialised. Almost all of the time, this is completely unnecessary and the default __new__ is fine.

You only need __new__ in two circumstances:

  1. You have an unusual "allocation" scheme, where you might return an existing instance rather than create a genuinely new one (the only way to actually create a new instance is to delegate to the ultimate default implementation of __new__ anyway).
  2. You’re implementing a subclass of an immutable builtin type. Since the immutable builtin types can’t be modified after creation (because they’re immutable), they must be initialised as they’re created rather than afterwards in __init__.

As a generalisation of point (1), you can make __new__ return whatever you like (not necessarily an instance of the class) to make invoking a class behave in some arbitrarily bizarre manner. This seems like it would almost always be more confusing than helpful, though.
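
For circumstance (1), a sketch of what such an allocation scheme might look like, assuming Python 3. Color and its _cache dictionary are hypothetical names used for illustration:

```python
class Color(object):
    """Hypothetical class that hands out one shared instance per name."""

    _cache = {}

    def __new__(cls, name):
        if name not in cls._cache:
            # Genuinely new instances still come from the default __new__.
            cls._cache[name] = super().__new__(cls)
        return cls._cache[name]

    def __init__(self, name):
        # Runs on every Color(name) call, even when the instance is reused.
        self.name = name


a = Color("red")
b = Color("red")
print(a is b)  # True
print(a is Color("blue"))  # False
```

Note that __init__ still runs each time the class is called, so a reused instance gets re-initialised; that is one reason schemes like this need care.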


[1] I believe the protocol is in fact slightly more complex; __init__ is only invoked on the value returned by __new__ if it’s an instance of the class that was invoked to start the process. However, it’s very unusual for this not to be the case.

Answered By: Ben