Using Asyncio to create new Python Processes


I’m setting up a function to asynchronously start a new process that runs a very CPU-heavy function. Most of the documentation doesn’t cover this thoroughly, and what I’ve pieced together doesn’t seem to run asynchronously.

I have a function procManager which takes in a function, the args to pass into the function, and an object name for basic logging.

async def procManager(f,a,o):
    print(f"{o} started at {time.strftime('%X')}")
    p = Process(target=f, args=(a,))
    p_parent = os.getppid()   # parent process
    p_curr = os.getpid()     # current process
    print("parent process:", p_parent)
    print("current process:", p_curr)
    p.start()
    p.join()   # blocks until the child process has finished
    print(f"{o} finished at {time.strftime('%X')}")

I have this CPU-heavy function that runs Louvain community detection on a networkX graph. I pass it into procManager so that it runs in a new process.

def community(cg):
    start = timer()
    partition = c.best_partition(cg) #default louvain community detection
    v = {} #create dict to group nodes by community
    for key, value in sorted(partition.items()):
        v.setdefault(value, []).append(key)
    stop = timer()

The main function looks as such. I’m initializing 2 graphs A and B of 3000 and 1000 nodes respectively, with an average degree of 5. I’m using a Jupyter notebook to run this, so I use await main() instead of

A = nx.barabasi_albert_graph(3000,5)  
B = nx.barabasi_albert_graph(1000,5)  

async def main():
    task1 = asyncio.create_task(
        procManager(community, A, "A"))

    task2 = asyncio.create_task(
        procManager(community, B, "B"))

    print("async start")

await main()

What I’m trying to do is get A and B processed asynchronously (i.e. started at the same time) but in different processes. The current output looks like this: A and B each run in a new process, but the calls block, so B only starts once A has finished. I need to compute the communities for A and B asynchronously because the jobs will be triggered by a RabbitMQ stream and the responses need to be non-blocking.

async done
A started at 06:03:48
parent process: 5783
current process: 12121
A finished at 06:03:59
B started at 06:03:59
parent process: 5783
current process: 12121
B finished at 06:03:59

Hope you guys can help!

Asked By: Francis



In your case the problem is the join() method. It blocks until the process has finished. Also, you wouldn’t even need asyncio for that. Have a look at this quick example:

import time
from multiprocessing import Process

def procManager(f,a,o):
    print(f"{o} started at {time.strftime('%X')}")
    p = Process(target=f, args=(a,))
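    p.start()   # launch the child process; this call returns immediately
    # p.join()  # joining here would block until the child has finished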
    print(f"{o} finished at {time.strftime('%X')}") # This will occur immediately

def community(cg):
    for i in range(10):
        print("%s - %s" %(cg, i))

if __name__ == "__main__":  # needed on platforms that spawn child processes (e.g. Windows)
    procManager(community, "This is A", "A")
    procManager(community, "This is B", "B")

This should give you an idea on how to solve your problem. I hope it helps!
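If the caller still needs to know when the work is done, one option is to start every process first and join them only afterwards. The following is a minimal sketch of that idea, not the asker's actual code: sum_squares, worker and start_all are hypothetical names, and sum_squares merely stands in for a CPU-heavy function such as community.

```python
import time
from multiprocessing import Process

def sum_squares(n):
    # Hypothetical stand-in for a CPU-heavy function such as community()
    return sum(i * i for i in range(n))

def worker(label, n):
    print(f"{label} started at {time.strftime('%X')}")
    total = sum_squares(n)
    print(f"{label} finished at {time.strftime('%X')} (result {total})")

def start_all(jobs):
    # Start every process before waiting on any of them
    procs = [Process(target=worker, args=job) for job in jobs]
    for p in procs:
        p.start()
    return procs

if __name__ == "__main__":
    procs = start_all([("A", 3_000_000), ("B", 1_000_000)])
    for p in procs:
        p.join()   # wait only after both processes are already running
    print("all done")
```

Joining inside the loop that starts the processes would serialise the work again, which is the behaviour seen in the question.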

Answered By: nullchimp

With regard to asyncio, you need to use the asyncio.create_task method. The trick is that you should only pass it coroutines, i.e. calls to functions that you have declared with async def. To run several of them and wait for all of them to finish, use await asyncio.gather.

An example would be:

import asyncio

async def print_hello(name):
    print("Hello! {}".format(name))

name_list = ["billy", "bob", "buffalo bob"]

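async def main():
    # Pass all coroutines to gather() at once so they are scheduled together
    await asyncio.gather(*(print_hello(item) for item in name_list))

asyncio.run(main())  # in a Jupyter notebook, use "await main()" instead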

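The simplest way to create and run concurrent tasks with asyncio is the create_task method, as outlined here: Asyncio Docs. Note that tasks run inside the event loop's own thread, so on its own create_task does not start a new process.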
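To make the scheduling behaviour concrete, here is a small sketch in which two tasks overlap because each one yields to the event loop at an await. The worker function and the shared log list are hypothetical names used only for illustration, and asyncio.sleep stands in for real non-blocking I/O.

```python
import asyncio

async def worker(name, delay, log):
    log.append(f"{name} start")
    await asyncio.sleep(delay)   # yields control back to the event loop
    log.append(f"{name} end")

async def main():
    log = []
    # Both tasks are scheduled before either one is awaited
    t1 = asyncio.create_task(worker("A", 0.2, log))
    t2 = asyncio.create_task(worker("B", 0.1, log))
    await t1
    await t2
    return log

if __name__ == "__main__":
    print(asyncio.run(main()))
```

B finishes before A even though A was created first, because both were waiting at the same time. A CPU-bound function with no await inside it would not overlap this way; it would hold the event loop until it returned.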

Hope this helps!

Answered By: billy

One example of running a function in a process pool from asyncio:

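import asyncio
import concurrent.futures

async def main():
    loop = asyncio.get_running_loop()

    # Run the CPU-bound function in a worker process; the event loop stays free
    with concurrent.futures.ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(
            pool, cpu_bound_function, args)
        print('custom process pool', result)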

run_in_executor() works fine; however, it’s not the best way to use asyncio.
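Applied to the question, the same call can be combined with asyncio.gather so that several CPU-bound jobs are submitted to the pool before any result is awaited. This is only a sketch under assumptions: cpu_bound, run_job and run_all are hypothetical names, and cpu_bound stands in for something like community(graph).

```python
import asyncio
import concurrent.futures
import time

def cpu_bound(n):
    # Hypothetical stand-in for a CPU-heavy call such as community(graph)
    return sum(i * i for i in range(n))

async def run_job(pool, label, n):
    loop = asyncio.get_running_loop()
    print(f"{label} started at {time.strftime('%X')}")
    # The event loop is free to run other tasks while the pool worker computes
    result = await loop.run_in_executor(pool, cpu_bound, n)
    print(f"{label} finished at {time.strftime('%X')}")
    return label, result

async def run_all(pool, jobs):
    # gather() schedules every job before waiting on any of them
    pairs = await asyncio.gather(*(run_job(pool, label, n) for label, n in jobs))
    return dict(pairs)

async def main():
    with concurrent.futures.ProcessPoolExecutor() as pool:
        return await run_all(pool, [("A", 3_000_000), ("B", 1_000_000)])

if __name__ == "__main__":
    print(asyncio.run(main()))
```

In a Jupyter notebook, await main() would take the place of asyncio.run(main()). On platforms that spawn worker processes, a function defined in a notebook cell may not be picklable, so it may need to live in an importable module.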

Answered By: Pavel Mostovoy