Create multiprocessing pool outside main to avoid recursion

Question:

I have a .py file with functions that require multiprocessing, so I do something like this:

from multiprocessing import Pool

pool = Pool()  # created at module level, at import time

def function_():
    pool.map(...)

Then I import this file into the main one, but when I run function_ I get:

daemonic processes are not allowed to have children

This is usually because multiprocessing re-runs the file where it is used, which is why the pool normally has to be created inside if __name__ == "__main__" (see Python multiprocessing gets stuck). Is there a way to avoid this?
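For reference, the guard pattern mentioned above looks roughly like this (square is just an illustrative function):

from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    # The pool is only created in the parent process, not when a
    # child process re-imports this module.
    with Pool() as pool:
        print(pool.map(square, range(5)))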

Asked By: Alberto Sinigaglia


Answers:

Just refactor all the functions that require a multiprocessing pool so that they take the pool instance as a parameter. Then instantiate the pool in your main module and pass it to the functions imported from your package. Let’s say your module worker.py looks something like this:

def _func(x):
    return x * x

def process(pool):
    # The pool is injected by the caller, so importing this module
    # never creates any processes by itself.
    return pool.map(_func, range(10))

And your main module main.py might look like this:

#!/usr/bin/env python
import multiprocessing as mp

import worker


if __name__ == '__main__':

    # The pool is created in the main module and passed down to the
    # imported code.
    with mp.Pool(5) as pool:
        results = worker.process(pool)

    print(results)

Everything works just fine:

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
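
A nice property of this style is that the caller owns the pool’s lifetime, so the same pool can be reused across any number of calls. A minimal sketch building on the worker.py above:

import multiprocessing as mp

import worker

if __name__ == '__main__':
    with mp.Pool(5) as pool:
        # Both calls share the one pool created here.
        first = worker.process(pool)
        second = worker.process(pool)
    print(first, second)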
Answered By: constt

It’s not clear from what you have posted why pool needs to be a global. But if that is the case, you can add the following function definition to the imported module:

def create_pool():
    from multiprocessing import Pool
    # Bind the pool to a module-level name so other functions in this
    # module (e.g. function_) can use it.
    global pool
    pool = Pool()

Your main script simply imports this function and calls it before calling function_. If pool does not need to be global, just move the pool creation inside function_.
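A minimal sketch of that wiring, assuming the imported module is named mymodule and defines both create_pool and function_:

# main.py
import mymodule

if __name__ == '__main__':
    # Create the global pool first, then call the function that uses it.
    mymodule.create_pool()
    mymodule.function_()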

Answered By: Booboo