What exactly is __weakref__ in Python?

Question:

Surprisingly, there’s no explicit documentation for __weakref__. Weak references are explained here. __weakref__ is also shortly mentioned in the documentation of __slots__. But I could not find anything about __weakref__ itself.

What exactly is __weakref__?
– Is it just a member acting as a flag: If present, the object may be weakly-referenced?
– Or is it a function/variable that can be overridden/assigned to get a desired behavior? How?

Asked By: Michael

||

Answers:

Before we talk about what __weakref__ is and does we need to define the notion of weak reference(s).

In most texts and sources the weak reference is usually defined as a reference that does not protect the referenced object (referent) from being collected by a garbage collector.

But what exactly is, garbage collection?

Garbage collection is simply the process of freeing memory when it is not used/reached by any reference/pointer anymore. Python performs garbage collection via a technique called reference counting (and a cyclic garbage collector that is used to detect and break reference cycles). Using reference counting, GC collects the objects as soon as they become unreachable which happens when the number of references to the object is 0. (for more info read https://docs.python.org/3/reference/datamodel.html#objects-values-and-types)

Now, the way with which weak references perform the task of NOT protecting the object from being collected by GC, or better to say the way with which they cause an object to be collected by GC is that (in case of a GC that uses reference counting rather than tracing technique) they just don’t get to be counted as a reference.Otherwise, if counted, they will be called strong references.

Now, when we are defining an object in Python, since by default all references are strong references, in order to create weak references to an object you need to use weakref module. Whenever you create a weak reference to an object that reference will be reachable via __weakref__ attribute.

Here is an example:

In [51]: class A:
    ...:     def sample_method(self):
    ...:         return "I'm a sample method"
    ...: 
    ...: 

In [52]: a = A()

In [53]: a.__weakref__

In [54]: # There's no weak reference defined

In [55]: import weakref

In [56]: r1 = weakref.ref(a)

In [57]: a.__weakref__
Out[57]: <weakref at 0x7f515b7583b0; to 'A' at 0x7f51680f4ee0>

In [58]: r1 is a.__weakref__
Out[58]: True

# As you can see they both point to one object 
# which means that a.__weakref__ is precisely r1
# Also, using weakref.ref always one reference to 
# the object will be created (singleton)

In [60]: r2 = weakref.ref(a)
In [62]: r1 is r2
Out[62]: True
Answered By: Mazdak

Interestingly enough, the language documentation is somewhat non-enlightening on this topic:

Without a __weakref__ variable for each instance, classes defining __slots__ do not support weak references to its instances. If weak reference support is needed, then add '__weakref__' to the sequence of strings in the __slots__ declaration.

The C API documentation is more useful:

When a type’s __slots__ declaration contains a slot named __weakref__, that slot becomes the weak reference list head for instances of the type, and the slot’s offset is stored in the type’s tp_weaklistoffset.

Weak references form a stack. The top of that stack (the most recent weak reference to an object) is available via __weakref__. Weakrefs are re-used whenever possible, so the stack is typically either empty or contains a single element.

Example

When you first use weakref.ref(), you create a new weak reference stack for the target object. The top of this stack is the new weak reference and gets stored in the target object’s __weakref__:

>>> import weakref
>>> class A: pass
...
>>> a = A()
>>> b = weakref.ref(a)
>>> c = weakref.ref(a)
>>> c is b, b is a.__weakref__
True, True
>>> weakref.getweakrefs(a)
[<weakref at 0x10dbe5270; to 'A' at 0x10dbc2fd0>]

As you can see, c re-uses b. You can force Python to create a new weak reference by passing a callback argument:

>>> import weakref
>>> class A: pass
...
>>> def callback(ref): pass
...
>>> a = A()
>>> b = weakref.ref(a)
>>> c = weakref.ref(a, callback)
>>> c is b, b is a.__weakref__
False, True
>>> weakref.getweakrefs(a)
[<weakref at 0x10dbcfcc0; to 'A' at 0x10dbc2fd0>,
 <weakref at 0x10dbe5270; to 'A' at 0x10dbc2fd0>]

Now c is a new weak reference in the stack.

Answered By: dhke

__weakref__ is just an opaque object that references all the weak references to the current object. In actual fact it’s an instance of weakref (or sometimes weakproxy) which is both a weak reference to the object and part of a doubly linked list to all weak references for that object.

It’s just an implementation detail that allows the garbage collector to inform weak references that its referent has been collected, and to not allow access to its underlying pointer anymore.

The weak reference can’t rely on checking the reference count of the object it refers to. This is because that memory may have been reclaimed and is now being used by another object. Best case scenario the VM will crash, worst case the weak reference will allow access to an object it wasn’t originally referring to. This is why the garbage collector must inform the weak reference its referent is no longer valid.

See weakrefobject.h for the structure and C-API for this object. And the implementation detail is here

Answered By: Dunes