Can a process in Python multiprocessing start another process?

Question:

A long time ago, I ran into a problem with some library (I don’t recall which) where certain processes/threads (other than the main process/thread) could not create another thread.

I am currently using asyncio and multiprocessing for creating separate threads of execution. I have a background task (done in its own multiprocessing process) that needs to create about 60 other processes.

Can this be implemented using multiprocessing or not? If not, how can I bypass any relevant limitation in the ability to create processes?

Asked By: AS7RID


Answers:

As larsks said in the comments:

A process spawned using multiprocessing is just a regular process. It can go ahead and create additional child processes; there aren’t any multiprocessing-specific restrictions.
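
A minimal sketch of that statement, assuming a plain, non-daemonic multiprocessing.Process (the function names child, grandchild and run_nested are made up for this illustration): the child process starts a process of its own and both report back through a queue.

```python
import multiprocessing as mp


def grandchild(q):
    # Runs in a process that was started by another child process.
    q.put("grandchild ran")


def child(q):
    # Runs in a regular (non-daemonic) child process and starts its own child.
    p = mp.Process(target=grandchild, args=(q,))
    p.start()
    p.join()
    q.put("child ran")


def run_nested():
    q = mp.Queue()
    p = mp.Process(target=child, args=(q,))
    p.start()
    # Read both messages before joining so the queue is drained first.
    messages = [q.get(timeout=30), q.get(timeout=30)]
    p.join()
    return messages


# Required for Windows:
if __name__ == '__main__':
    print(run_nested())
```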

Answered By: AS7RID

You have not provided important details about the actual problem you had with "certain processes/threads (not the main process/threads) cannot create another thread." You have also not posted any code or described in any detail what you are trying to do. So the following is based purely on my speculation about what problem you may have encountered and why:

Yes, there are restrictions! According to the multiprocessing documentation, a daemonic process is not allowed to create child processes, and the worker processes of a multiprocessing.Pool are daemonic. I am assuming you are using a process pool with asyncio and the run_in_executor method. The workers of a concurrent.futures.ProcessPoolExecutor, which is what I use below, are not daemonic as far as I know, so a task running in one of them can start its own process. Here the child is created as a daemon process; that does not prevent the parent process from waiting for its completion:

import asyncio
import concurrent.futures
from multiprocessing import Process

def worker():
    p = Process(target=some_task, daemon=True)
    p.start()
    p.join()
    return 'Done!'

def some_task():
    pass

async def main():
    loop = asyncio.get_running_loop()

    # Run in a custom process pool of size 1
    with concurrent.futures.ProcessPoolExecutor(1) as pool:
        result = await loop.run_in_executor(
            pool, worker)
        print('result =', result)

# Required for Windows:
if __name__ == '__main__':
    asyncio.run(main())
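
If you are unsure whether the workers of a given pool are daemonic, you can ask them. The following is only a sketch (is_daemon and check_worker_daemon_flags are names I made up); it runs the same check in a multiprocessing.Pool worker and in a concurrent.futures.ProcessPoolExecutor worker. On the CPython versions I am aware of, the first flag is True and the second is False.

```python
import concurrent.futures
import multiprocessing


def is_daemon():
    # Report whether the process executing this function is daemonic.
    return multiprocessing.current_process().daemon


def check_worker_daemon_flags():
    # Ask a worker of a multiprocessing.Pool.
    with multiprocessing.Pool(1) as pool:
        pool_worker = pool.apply(is_daemon)
    # Ask a worker of a concurrent.futures.ProcessPoolExecutor.
    with concurrent.futures.ProcessPoolExecutor(1) as executor:
        executor_worker = executor.submit(is_daemon).result()
    return pool_worker, executor_worker


# Required for Windows:
if __name__ == '__main__':
    print(check_worker_daemon_flags())
```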

But you say you need to create many processes. If they are all heavily CPU-intensive and you will be running 60 of these in parallel, do you have enough cores to handle that?
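
One way to answer that question in code is to derive the pool size from the number of cores. This is a rough sketch, not a rule: pool_size is a hypothetical helper, and capping CPU-bound work at os.cpu_count() is my assumption about a sensible default.

```python
import os


def pool_size(n_tasks, cpu_bound=True):
    # os.cpu_count() can return None, so fall back to 1.
    cores = os.cpu_count() or 1
    if cpu_bound:
        # More processes than cores gains nothing for CPU-bound work.
        return min(n_tasks, cores)
    # Tasks that mostly wait on I/O can each get their own process.
    return n_tasks


if __name__ == '__main__':
    print('CPU-bound pool size:', pool_size(60))
    print('I/O-bound pool size:', pool_size(60, cpu_bound=False))
```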

What I would do is use a multithreading pool of size 1 whose purpose is to launch the 60 tasks into a multiprocessing pool of appropriate size (let’s assume 60 if you have enough cores or if the tasks being submitted frequently wait for I/O or network requests to complete):

import asyncio
import concurrent.futures

def worker():
    with concurrent.futures.ProcessPoolExecutor(60) as process_pool:
        results = list(process_pool.map(some_task, range(60)))
    return results

def some_task(x):
    return x * x

async def main():
    loop = asyncio.get_running_loop()

    # Run in a custom thread pool of size 1
    with concurrent.futures.ThreadPoolExecutor(1) as thread_pool:
        result = await loop.run_in_executor(
            thread_pool, worker)
        print('result =', result)

# Required for Windows:
if __name__ == '__main__':
    asyncio.run(main())

Prints:

result = [0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256, 289, 324, 361, 400, 441, 484, 529, 576, 625, 676, 729, 784, 841, 900, 961, 1024, 1089, 1156, 1225, 1296, 1369, 1444, 1521, 1600, 1681, 1764, 1849, 1936, 2025, 2116, 2209, 2304, 2401, 2500, 2601, 2704, 2809, 2916, 3025, 3136, 3249, 3364, 3481]

If worker also needed to do heavy CPU processing in addition to just creating a pool and running tasks within, then I would replace the multithreading pool with a multiprocessing pool. In effect, you would be using two multiprocessing pools.

async def main():
    loop = asyncio.get_running_loop()

    # Run in a custom process pool of size 1
    with concurrent.futures.ProcessPoolExecutor(1) as pool:
        result = await loop.run_in_executor(
            pool, worker)
        print('result =', result)
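
For completeness, here is a self-contained sketch of the two-pool variant. The parameters n_tasks and n_processes are additions of mine so that the example can run with a small inner pool; adjust them to your workload.

```python
import asyncio
import concurrent.futures


def some_task(x):
    return x * x


def worker(n_tasks, n_processes):
    # Runs in a worker of the outer process pool and creates the inner pool.
    with concurrent.futures.ProcessPoolExecutor(n_processes) as inner_pool:
        return list(inner_pool.map(some_task, range(n_tasks)))


async def main(n_tasks=8, n_processes=4):
    loop = asyncio.get_running_loop()

    # Outer process pool of size 1 that runs worker
    with concurrent.futures.ProcessPoolExecutor(1) as outer_pool:
        return await loop.run_in_executor(
            outer_pool, worker, n_tasks, n_processes)


# Required for Windows:
if __name__ == '__main__':
    print('result =', asyncio.run(main()))
```
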
Answered By: Booboo