Create task from within another running task

Question:

In Python I create two async tasks:

tasks = [
    asyncio.create_task(task1(queue)),
    asyncio.create_task(task2(queue)),
]
await asyncio.gather(*tasks)

Now, I have a need to create a third task "task3" within task1.

So I have:

async def task1(queue):
    # and here I need to create the "task3":
    asyncio.create_task(task3(queue))
    # and how can I schedule this?

So I wish to schedule task3 also, without hurting task1 and task2 (they shall stay running).

How am I supposed to do this?

Asked By: Daniel

||

Answers:

You can add a done callback to the new task and just let your task1 carry on. https://docs.python.org/3/library/asyncio-task.html#asyncio.Task.add_done_callback
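As a minimal sketch of that fire-and-forget idea: the callback receives the finished task as its only argument, so it can read the result from there. The body of task3, the `results` list and the name `on_task3_done` are placeholders made up for the illustration, not part of the question.

```python
import asyncio

results = []  # hypothetical sink, only here to make the effect visible


async def task3(queue):
    # Stand-in body: take one item off the queue and double it.
    item = await queue.get()
    return item * 2


def on_task3_done(task):
    # The event loop calls this with the finished Task as its only argument.
    if not task.cancelled() and task.exception() is None:
        results.append(task.result())


async def task1(queue):
    t3 = asyncio.create_task(task3(queue))
    t3.add_done_callback(on_task3_done)
    # task1 carries on here without waiting for t3.
    await queue.put(21)
    # Awaited at the end only so this demo finishes deterministically.
    await t3


async def main():
    results.clear()
    queue = asyncio.Queue()
    await task1(queue)
    # Done callbacks are scheduled by the loop, so yield once to be safe.
    await asyncio.sleep(0)
    return list(results)
```

Running it with `asyncio.run(main())` should leave the doubled value in the returned list.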

One issue can arise here, though, and it is buried in the docs: the asyncio event loop keeps only weak references to tasks, so a task that is not referenced anywhere else may be garbage-collected at any time, even before it is done.

So you can keep a registry (a set at module level will do) to hold references to your new tasks, and use the done callback so that each task removes itself from it:

task3_registry = set()
...

async def task1(queue):
    # and here I need to create the "task3":
    t3 = asyncio.create_task(task3(queue))
    task3_registry.add(t3)
    t3.add_done_callback(lambda task: task3_registry.remove(task))
    ...
...
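If you create background tasks in more than one place, the same pattern can be wrapped in a small helper. This is only a sketch: `spawn` and `background_tasks` are names invented here, and the task bodies are placeholders. It uses `set.discard` as the callback, which does not raise if the task is already gone from the set.

```python
import asyncio

background_tasks = set()  # strong references keep the tasks alive


def spawn(coro):
    # Hypothetical helper: create the task, register it, and make it
    # unregister itself once it finishes.
    task = asyncio.create_task(coro)
    background_tasks.add(task)
    task.add_done_callback(background_tasks.discard)
    return task


async def task3(queue):
    await queue.put("task3 was here")


async def task1(queue):
    spawn(task3(queue))
    # task1 continues; it does not wait for task3.
    return len(background_tasks)


async def main():
    queue = asyncio.Queue()
    registered = await task1(queue)
    message = await queue.get()  # task3 runs while main waits here
    await asyncio.sleep(0)       # let the done callback fire
    return registered, message, len(background_tasks)
```

The registry holds one task while task3 is pending and is empty again once its callback has run.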

Even with this, when the asyncio loop shuts down (if that happens), it may simply cancel any task3 instances that nobody has awaited. You can use the same registry to wait for all of them to complete before returning from your root coroutine:

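async def main():
    queue = asyncio.Queue()  # assumed: the queue shared by all the tasks
    tasks = [
        asyncio.create_task(task1(queue)),
        asyncio.create_task(task2(queue)),
    ]
    await asyncio.gather(*tasks)
    # gather() takes the awaitables as separate arguments, so unpack the set
    await asyncio.gather(*task3_registry)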
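Here is a runnable sketch of that shutdown step, assuming task1 starts several task3 instances and returns before they finish. The `finished` list and the sleep inside task3 are stand-ins so there is something to observe. The `while` loop is an extra precaution of mine, in case a task registers further tasks while main is waiting.

```python
import asyncio

task3_registry = set()
finished = []  # hypothetical log of completed task3 instances


async def task3(n):
    await asyncio.sleep(0.01)  # stand-in for real work
    finished.append(n)


async def task1():
    for n in range(3):
        t3 = asyncio.create_task(task3(n))
        task3_registry.add(t3)
        t3.add_done_callback(task3_registry.discard)
    # task1 returns while the task3 instances are still sleeping.


async def main():
    finished.clear()
    await task1()
    pending_at_exit = len(task3_registry)
    # Drain the registry; each task removes itself when it completes.
    while task3_registry:
        await asyncio.gather(*task3_registry)
    return pending_at_exit, sorted(finished)
```

Without the drain loop, `asyncio.run()` would cancel the three pending tasks as soon as main returned.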

Further answering:

If I have a t3_is_running = True variable in task1, can lambda in the add_done_callback change it to False?

If you need the variable inside the task1 coroutine, the callback can see and change it as a closure variable. That requires writing the callback with the def syntax instead of a lambda, because nonlocal is a statement and a lambda body can only be an expression (performance-wise the two are equivalent):

async def task1(queue):
    # and here I need to create the "task3":
    t3 = asyncio.create_task(task3(queue))
    task3_registry.add(t3)
    task3_is_running = True
    def done_callback(task):
        nonlocal task3_is_running
        task3_is_running = False
        task3_registry.remove(task)
    t3.add_done_callback(done_callback)
    
...
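To see the flag actually flip, here is a small self-contained variant in which task1 polls the closure variable while doing its own work. The sleeps are placeholders for real work, and the registry is left out because task1 itself holds a reference to `t3` for as long as it needs it.

```python
import asyncio


async def task3():
    await asyncio.sleep(0.01)  # stand-in for real work


async def task1():
    task3_is_running = True

    def done_callback(task):
        nonlocal task3_is_running
        task3_is_running = False

    t3 = asyncio.create_task(task3())
    t3.add_done_callback(done_callback)

    seen = [task3_is_running]        # task3 has not finished yet
    while task3_is_running:
        await asyncio.sleep(0.005)   # task1 does its own work here
    seen.append(task3_is_running)    # changed by the callback
    return seen
```

`asyncio.run(task1())` returns the flag's value before and after task3 completes.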

If you want to see that variable from the code in main, however, and not just from inside task1, the callback has to be able to see the "task1" task instance itself. If there is only a single task of each kind, a global (in Python, really a "module-level variable") is good enough; otherwise registries similar to task3_registry have to be created.
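For the single-task case, the module-level variant could look roughly like this. It is a sketch under the assumption that only one task3 ever runs at a time; `_task3_done` is a made-up name and the sleeps stand in for real work.

```python
import asyncio

task3_is_running = False  # module-level state, visible to every coroutine


def _task3_done(task):
    global task3_is_running
    task3_is_running = False


async def task3():
    await asyncio.sleep(0.01)  # stand-in for real work


async def task1():
    global task3_is_running
    t3 = asyncio.create_task(task3())
    task3_is_running = True
    t3.add_done_callback(_task3_done)
    await t3  # awaited here only to keep the demo deterministic


async def main():
    t1 = asyncio.create_task(task1())
    await asyncio.sleep(0)           # give task1 a chance to start task3
    while_running = task3_is_running
    await t1
    await asyncio.sleep(0)
    return while_running, task3_is_running
```

main reads the flag directly, once while task3 is in flight and once after it is done.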

If you don't want global state lying around, you can encapsulate your tasks in a class, even if you are not otherwise using object orientation. This lets one task change state for the others through attributes on the instance:

class TaskSet:
    def __init__(self):
        # __init__ can't be async, so only plain state is set up here
        self.task3_instance = None
        self.task3_is_running = False
        self.queue = ...

    def __await__(self):
        # Python calls this when an instance of the class is awaited.
        # It must return an iterator and can't use "await" itself,
        # so it delegates to the coroutine that does the real work.
        return self._run().__await__()

    async def _run(self):
        tasks = [
            asyncio.create_task(self.task1()),
            asyncio.create_task(self.task2()),
        ]
        await asyncio.gather(*tasks)
        if self.task3_instance:  # or use a set if more than one task3 might be running
            await self.task3_instance

    async def task1(self):
        ...
        self.task3_instance = asyncio.create_task(self.task3())
        self.task3_is_running = True  # not actually needed, as one could just
                                      # check self.task3_instance and maybe call self.task3_instance.done()

        self.task3_instance.add_done_callback(self.task3_done_callback)
        ...

    async def task2(self):
        ...

    async def task3(self):
        ...

    def task3_done_callback(self, task):
        # done callbacks receive the finished task as their argument
        self.task3_instance = None
        self.task3_is_running = False


async def main():
    await TaskSet()

# asyncio.run() needs a coroutine, so the instance is awaited inside main()
asyncio.run(main())
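For something you can run as-is, here is a self-contained sketch of the same class-based idea with the placeholder bodies filled in. The task bodies and the `log` attribute are invented for the demo. Instead of a done callback it derives `task3_is_running` from the task's `done()` method, which is the alternative hinted at in the comment inside task1.

```python
import asyncio


class TaskSet:
    def __init__(self):
        self.queue = None           # created in run(), inside the event loop
        self.task3_instance = None
        self.log = []               # hypothetical, only to observe the order

    def __await__(self):
        # Awaiting the instance delegates to the run() coroutine.
        return self.run().__await__()

    async def run(self):
        self.queue = asyncio.Queue()
        await asyncio.gather(
            asyncio.create_task(self.task1()),
            asyncio.create_task(self.task2()),
        )
        if self.task3_instance is not None:
            await self.task3_instance
        return self.log

    @property
    def task3_is_running(self):
        t3 = self.task3_instance
        return t3 is not None and not t3.done()

    async def task1(self):
        self.task3_instance = asyncio.create_task(self.task3())
        self.log.append(("task1 sees task3 running", self.task3_is_running))

    async def task2(self):
        await self.queue.put("hello")

    async def task3(self):
        item = await self.queue.get()
        self.log.append(("task3 got", item))


async def main():
    task_set = TaskSet()
    log = await task_set
    return log, task_set.task3_is_running
```

Keeping the task on the instance means any method, or the code that awaits the instance, can check on task3 without a global.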


Answered By: jsbueno