Why isn't the __del__ method called?

Question:

In Python 3's multiprocessing, the __del__ method is never called.

I’ve read other questions about circular references, but I can’t see how that applies to the multiprocessing case here.

There is a circular reference in foo. __del__ is called when foo is called directly, but when foo runs in a multiprocessing child, __del__ is never called.

import multiprocessing
import weakref


class Foo():
    def __init__(self):
        self.b = []

    def __del__(self):
        print ('del')


def foo():
    print ('call foo')
    f = Foo()
    a = [f]
    # a = [weakref.ref(f)]
    f.b.append(a)


# call foo in other process
p = multiprocessing.Process(target=foo)
p.start()
p.join()


# call foo
foo()

Output:
call foo
call foo
del

Why isn’t __del__ called in p?

Asked By: QuantumEnergy


Answers:

Forked Process objects terminate after running their task using os._exit(), which forcibly terminates the child process without the normal cleanup Python performs on exit. Cyclic garbage isn’t cleaned (because the process is terminated without giving the cyclic GC a chance to run), it’s just dropped on the floor, leaving the OS to clean up.

This is intentional, since exiting normally (invoking all normal cleanup procedures) would risk stuff like unflushed buffers getting flushed in both parent and child (doubling output), and other weirdness involved when a forked process inherits all the state of the parent but isn’t supposed to use it except when told to do so explicitly.
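
To verify that diagnosis, here is a minimal, self-contained sketch (Foo and foo are copied from the question; the foo_then_collect wrapper name is just for illustration, not a recommended fix): if the child explicitly runs the cyclic collector after foo returns, the cycle is found and 'del' does appear in the child.

import gc
import multiprocessing


class Foo:
    def __init__(self):
        self.b = []

    def __del__(self):
        print('del')


def foo():
    print('call foo')
    f = Foo()
    a = [f]
    f.b.append(a)  # f -> a -> f reference cycle


def foo_then_collect():
    foo()          # creates the cycle, then drops the last external references
    gc.collect()   # run the cyclic GC before os._exit() terminates the child


if __name__ == '__main__':
    p = multiprocessing.Process(target=foo_then_collect)
    p.start()
    p.join()       # the child now prints 'call foo' followed by 'del'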

You could write a wrapper function that would invoke the "real" function, then trigger a cycle collection before returning, but it’s hard to write correctly and quite brittle. An initial stab at it would be something like:

import gc
import traceback

def clear_cycles_after(func, *args, **kwargs):
    try:
        return func(*args, **kwargs)
    except BaseException as e:
        # Clear locals of all frames in the traceback
        traceback.clear_frames(e.__traceback__)  # Requires 3.4+
        raise  # Reraises original exception with locals cleaned from all frames
    finally:
        gc.collect()  # Now that we've cleaned the locals from any exception traceback,
                      # it should be possible to identify cyclic garbage
                      # and gc.collect() will clean it up

You’d use it by replacing:

p = multiprocessing.Process(target=foo)

with:

p = multiprocessing.Process(target=clear_cycles_after, args=(foo,))

I don’t really recommend this solution though. Ideally, if some cleanup (not related to process memory, which the OS cleans for you anyway) must occur in the child, you’d implement the context manager protocol on the relevant type(s) (contextlib.contextmanager can be used to provide such functionality for existing types you can’t modify directly) and create/control them with with statements, which would guarantee cleanup was performed deterministically, even in the presence of cyclic references, even on non-CPython interpreters (which aren’t reference counted, and therefore don’t perform deterministic cleanup without with statements even when there are no cyclic references). Anything less than with statements (or try/finally blocks with equivalent effect) is going to be some combination of brittle, non-portable, or non-functional.
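
As an aside, here is a rough sketch of the contextlib route mentioned above, for a type whose source you can’t edit (Unmodifiable and its cleanup method are made-up stand-ins, not a real API):

import contextlib


class Unmodifiable:
    # Stand-in for some third-party type you can't edit; assume it exposes
    # a cleanup method of some kind (the name 'cleanup' is invented here)
    def __init__(self):
        self.b = []

    def cleanup(self):
        print('del')
        del self.b[:]


@contextlib.contextmanager
def managed(obj):
    try:
        yield obj        # the body of the with block runs here
    finally:
        obj.cleanup()    # deterministic cleanup, even if the body raises


def foo():
    print('call foo')
    with managed(Unmodifiable()) as f:
        a = [f]
        f.b.append(a)    # the cycle exists inside the block...
    # ...but cleanup already ran by the time the block exits, so nothing
    # depends on the cyclic GC ever noticing it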

Using context management, your class and function would look like:

class Foo:
    def __init__(self):
        self.b = []

    def close(self):  # Convenient to have a way to manually clean up when needed
        print('del')
        del self.b[:]  # Clear contents of b to avoid cyclic references after cleanup

    # Optional: Provide __del__ as best effort in case user doesn't close or context manage
    __del__ = close
    
    # Define context management special methods in terms of shared close
    def __enter__(self):
        return self  # No-op when entering with block
    def __exit__(self, typ, exc, tb):
        self.close()


def foo():
    print ('call foo')
    with Foo() as f:  # Create and manage with with statement
        a = [f]
        f.b.append(a)
    # f's contents are cleaned here, so when foo returns, on CPython, a and f will be cleaned
    # since they're not part of a reference cycle anymore, and the actual Foo
    # object bound to f and a[0] will be removed deterministically

You’ll actually see multiple del outputs now in some cases, particularly when you include the optional __del__ = close line as a backup when the user fails to context manage (where close and/or __exit__ gets invoked, then __del__ gets invoked later), but there’s no harm there (the contained list just gets emptied twice).
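
For reference, a sketch of what the original driver lines would do with this version (assuming the Foo/foo just above; the second 'del' per call is CPython-specific, since it comes from reference counting once the cycle is gone):

import multiprocessing

if __name__ == '__main__':
    p = multiprocessing.Process(target=foo)
    p.start()
    p.join()   # child: 'call foo', then 'del' from __exit__ -> close,
               # then (on CPython) 'del' again from __del__ once the
               # no-longer-cyclic object is reference-counted away
    foo()      # parent: same sequence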

Answered By: ShadowRanger

The question isn’t actually why it isn’t called in the multiprocessing case, but why it is called in the other example. And the answer to that is that it isn’t called when you call foo; it’s called at the end of the program. Since the program is finished, Python knows that anything still referenced can be cleaned up anyway, so it cleans up the circular references.

If you add a print statement at the end of the script, or call this from the REPL, you can see that __del__ still isn’t called at your second foo call either, but only at the end of the script.
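
For example, with the question's Foo and foo left unchanged (the trailing print is just a marker):

foo()                    # prints: call foo
print('end of script')   # prints: end of script
# 'del' only appears after this, during interpreter shutdown, when the
# remaining cyclic garbage is finally cleaned up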

Given that Python cleans up circular references when the script ends, ShadowRanger’s answer explains why that doesn’t happen when the multiprocessing function is finished.

Answered By: Daniel H