Multiprocessing example giving AttributeError

Question:

I am trying to use multiprocessing in my code, so I thought I would start by learning from some examples. I used the first example found in this documentation.

from multiprocessing import Pool
def f(x):
    return x*x

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(f, [1, 2, 3]))

When I run the above code I get an AttributeError: can't get attribute 'f' on <module '__main__' (built-in)>. I do not know why I am getting this error. I am also using Python 3.5 if that helps.

Asked By: PiccolMan

||

Answers:

This problem seems to be a design feature of multiprocessing.Pool. See https://bugs.python.org/issue25053. For some reason Pool does not always work with objects that are not defined in an imported module, so you have to put your function in a separate file and import that file as a module.

File: defs.py

def f(x):
    return x*x

File: run.py

from multiprocessing import Pool
import defs

if __name__ == '__main__':
    with Pool(5) as p:
        print(p.map(defs.f, [1, 2, 3]))

If you use print or another built-in function instead of f, the example should work. If this is not a bug (according to the link), then the example given in the documentation is badly chosen.

Answered By: hr87

If you’re using Jupyter notebook (like the OP), then defining the function in a separate cell and executing that cell first fixes the problem. The accepted answer works too, but it’s more work. Merely defining the function earlier in the same cell, i.e. above the pool, isn’t enough. It has to be in a completely different notebook cell which is executed first.

Answered By: ASDFQWERTY

The multiprocessing module has a major limitation when it comes to IPython use:

Functionality within this package requires that the __main__ module be
importable by the children. […] This means that some examples, such
as the multiprocessing.pool.Pool examples will not work in the
interactive interpreter. [from the documentation]
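
To see why the children need to import __main__, it helps to look at what pickle, the default serializer, actually sends for a function. The sketch below is only an illustration and not part of the documentation quote: it uses json.dumps as a stand-in for any function that lives in an importable module, and a lambda as a stand-in for one that does not.

```python
import json
import pickle

# Pickling a function stores only where to find it (module and name), not its code.
payload = pickle.dumps(json.dumps)
stores_reference = b"json" in payload and b"dumps" in payload

# The receiving side imports the module and looks the name up again.
restored = pickle.loads(payload)
same_object = restored is json.dumps

# A function with no importable name, such as a lambda, cannot be pickled at all.
try:
    pickle.dumps(lambda x: x * x)
    lambda_pickled = True
except (pickle.PicklingError, AttributeError):
    lambda_pickled = False

print(stores_reference, same_object, lambda_pickled)
```

The payload holds only the module and function name, so the receiving process has to find the function again by importing that module. When the function lives in an interactive __main__ that the child cannot import, that lookup is the step that appears to fail.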

Fortunately, there is a fork of the multiprocessing module called multiprocess which uses dill instead of pickle for serialization and overcomes this issue conveniently.

Just install multiprocess and replace multiprocessing with multiprocess in your imports:

import multiprocess as mp

def f(x):
    return x*x

with mp.Pool(5) as pool:
    print(pool.map(f, [1, 2, 3, 4, 5]))

Of course, externalizing the code as suggested in this answer works as well, but I find it very inconvenient: That is not why (and how) I use IPython environments.

tl;dr: multiprocessing does not work in IPython environments out of the box; use its fork multiprocess instead.

Answered By: Michael Dorner

This answer is for those who get this error on Windows 10 in 2021.

I’ve researched this error a bit since I got it myself. I get this error when running any examples from the official Python 3 documentation on multiprocessing.

Test environment:

  • x86 Windows 10.0.19043.1165 + Python 3.9.2 – there is an error
  • x86 Windows 10.0.19043.1165 + Python 3.9.6 – there is an error
  • x86 Windows 10.0.19043.1110 + Python 3.9.6 – there is an error
  • ARM Windows 10.0.21354.1 + Python 3.9.6 – no error (version from DEV branch)
  • ARM macOS 11.5.2 + Python 3.9.6 – no errors
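
To make environment reports like the list above easier to compare, a small script can print the details that plausibly matter for this error. This is only a sketch: describe_environment is a hypothetical helper name, and the choice of fields is my assumption about what is relevant, not something confirmed by a bug report.

```python
import multiprocessing
import platform
import sys


def describe_environment():
    """Collect details that plausibly matter for this error (hypothetical helper)."""
    main_module = sys.modules["__main__"]
    return {
        "os": platform.platform(),
        "machine": platform.machine(),
        "python": platform.python_version(),
        # 'spawn' (the only option on Windows) starts a fresh interpreter
        # that must re-import __main__; 'fork' copies the parent process.
        "start_method": multiprocessing.get_start_method(),
        # Typically False in interactive sessions, where __main__ has no file.
        "main_has_file": hasattr(main_module, "__file__"),
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it in the same way as the failing example (script, interactive interpreter, or notebook) shows whether the two setups differ in start method or in how __main__ is loaded.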

I have no way to test other configurations. My guess is that the problem lies in Windows, since the bug does not occur in the developer version "10.0.21354.1", although this ARM version probably has x86 emulation.

Also note that this bug did not occur when Python 3.9.2 was released (February). I was working on the same computer the whole time, so I was surprised when previously working code stopped working and only the Windows version had changed.

I was unable to find a bug report describing a similar situation in the Python bug tracker (I probably did a poor search). The answer marked "Correct answer" refers to a different situation. The problem is easy to reproduce: try any example from the multiprocessing documentation on a freshly installed Windows 10 + Python 3.

Later, I will have the opportunity to check out Python 3.10 and the latest version of Windows 10.
I am also interested in this situation in the context of Windows 11.

If you have information about this error (link to the bug tracker or something similar), be sure to share it.

For now I have switched to Linux to continue working.

Answered By: AtachiShadow

Why not use joblib? Your code is equivalent to:

# pip install joblib

from joblib import Parallel, delayed


def f(x):
    return x*x

res = Parallel(
    n_jobs=5
)(
    delayed(f)(x) for x in [1, 2, 3]
)

Answered By: Wenmin Wu